Wikipedia Predicts How Movies Will Perform at the Box Office

By Lisa Raffensperger
Aug 23, 2013 1:16 AM; Nov 20, 2019 3:40 AM



For all the hundreds of millions of dollars spent on producing movies every year, predicting how they'll perform at the box office is still more art than science. The best predictor, at the moment, is nothing more than the number of theaters carrying the film on opening weekend. But a new method that takes the pulse of the general public via Wikipedia activity can predict how a movie will fare up to a month before it hits theaters.

From Monitoring to Prediction

The Web has produced lots of interesting real-time analytics, everything from the current moods of Twitter users to the locations of every plane in the sky above you. But what's less well understood is how to use so-called "big data" for prediction. Some research teams have explored using Twitter or Google keyword volumes to predict stock market changes; Google Flu Trends uses similar data to predict where a viral outbreak might occur. For movies, though, these methods haven't worked as well. The closest researchers have come is using Twitter activity the night before a film release to gauge its subsequent earnings. The method was highly accurate for a small sample of movies studied. But more than 24 hours' advance warning would be more useful to the film moguls, marketers and critics who rely on these trends. For instance, a film exec might decide to change the movie's rollout strategy, or a marketer might decide some last-minute advertising is in order, based on predictions a few weeks before opening day.

A Better Model for Movies

For this retrospective study, researchers focused on films released in the United States in 2010. They found a total of 312 films with Wikipedia pages, and using the freely available data from Wikimedia Toolserver, they extracted three main data points:

  • number of pageviews from the time of the entry's creation until the film's release date

  • number of editors who modified the article

  • number of edits made to the article

For each film they also obtained first-weekend box office earnings via IMDb. By building a mathematical model using these factors alongside the number of opening theaters, the researchers were able to predict box office earnings with much greater precision than by using theater count alone. Their model matched up with real-world data with 77 percent accuracy, versus the 57 percent accuracy of theater count alone. What's more, these more accurate predictions could be made as much as a month before release date, the researchers report in PLOS One.

The method has some limitations. It is, for instance, much better at predicting the performance of blockbusters than B-movies, because a higher volume of data leads to more accurate predictions. But because everything has a Wikipedia page these days, the authors say use of Wikipedia activity to predict future outcomes could be applied to a wide range of products, from a new television series, to a new variety of soda, to whatever freaky flavor of potato chips they come up with next. So if you need to know whether the next superhero smash is going to live up to expectations, look no further than the Wikipedia buzz. Just, no spoilers, please.

Image by Visionstyler Press via Flickr
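For readers curious what a model like the one described above might look like in practice, here is a minimal sketch in Python. It assumes an ordinary least-squares regression, which is a guess at the general approach and not a reproduction of the researchers' method. Every number in it is synthetic: the helper `make_synthetic_films`, the hidden "buzz" factor, and all coefficients are invented for illustration. The fit is scored with R², one common goodness-of-fit measure; the article does not say which measure its 77 and 57 percent figures refer to, so the output should not be compared with them.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not mutated.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(features, targets):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + list(f) for f in features]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs, feature_row):
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], feature_row))


def r_squared(coefs, features, targets):
    """Share of the variance in targets explained by the fitted model."""
    mean_y = sum(targets) / len(targets)
    ss_res = sum((y - predict(coefs, f)) ** 2 for f, y in zip(features, targets))
    ss_tot = sum((y - mean_y) ** 2 for y in targets)
    return 1.0 - ss_res / ss_tot


def make_synthetic_films(n, seed=0):
    """Invented data, NOT the study's: log-scale predictors and revenue."""
    rng = random.Random(seed)
    films = []
    for _ in range(n):
        buzz = rng.gauss(0.0, 1.0)  # hypothetical hidden "public interest"
        views = 10.0 + 1.5 * buzz + rng.gauss(0.0, 0.5)
        editors = 3.0 + 0.8 * buzz + rng.gauss(0.0, 0.5)
        edits = 5.0 + 1.0 * buzz + rng.gauss(0.0, 0.5)
        theaters = 6.0 + 0.6 * buzz + rng.gauss(0.0, 0.8)
        revenue = 14.0 + 1.8 * buzz + 0.5 * (theaters - 6.0) + rng.gauss(0.0, 0.5)
        films.append(((views, editors, edits, theaters), revenue))
    return films


films = make_synthetic_films(300)
full_x = [f for f, _ in films]             # Wikipedia activity + theaters
theater_x = [(f[3],) for f, _ in films]    # theater count only
y = [rev for _, rev in films]

full_model = fit_ols(full_x, y)
theater_model = fit_ols(theater_x, y)
r2_full = r_squared(full_model, full_x, y)
r2_theaters = r_squared(theater_model, theater_x, y)
print(f"R^2, theaters only: {r2_theaters:.2f}")
print(f"R^2, theaters + Wikipedia activity: {r2_full:.2f}")
```

Because the theater-only model is a special case of the combined one, the combined fit can never score worse on the data it was trained on. A fairer comparison would score both models on films held out from fitting.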


Copyright © 2024 Kalmbach Media Co.