Predicting the Winner of the 86th Oscars Best Picture with Data

It’s that time of year when spring is just around the corner and the Academy Awards (aka the Oscars) are in bloom. The Academy of Motion Picture Arts and Sciences honors movies and stars whose work resonated with us in the past year. I’m not a huge fan of awards shows, but I do love to watch the Best Picture nominees to see what’s new and innovative in cinema, even if they’re not my normal viewing pleasure. One of the side diversions that the Academy Awards offer is to guess, bet, or try to predict who will win. I thought it’d be fun to apply some of the data analysis skills SHIFT uses every day to see what we might predict for this year’s Oscars.

I started by looking at how different forms of media, such as blogs, discussion forums and review sites, and traditional news media, predicted the winner out of the nine nominees in the Best Picture category last year.

85th Oscar Best Picture Predictions

Looking at last year’s data was interesting. We compared the nominees using a query of “film name” AND “Academy Award” AND “Best Picture” to find mentions and conversations specific to each film in the time leading up to the awards, from the day of nomination to just before the awards ceremony. In 2013, we see that Lincoln, Argo, and Zero Dark Thirty were neck and neck in all of the forms of media we measured. All nine films were up for the award, but these three clearly led the pack.

oscars data.001

The race for popularity was always among these three, as the popularity charts from Sysomos show.

oscars data.002

From the data above, forums and traditional news media predicted Argo as the winner, while blogs predicted that Lincoln would win. Argo actually won the award, so where did blogs go wrong? The reason is a flaw in the data collection method; let’s see if you can catch it while we move on to this year’s predictions.

86th Oscar Best Picture Predictions

We have, again, compared the 9 nominees this year using a query of “film name” AND “Academy Award” AND “Best Picture”. Do we think Her is set to win this year’s award for Best Picture? Nope. I’d bet a fiver that it isn’t the winner.

oscars data.003

As of today, the popularity trend for this year’s films shows three films being discussed with lots of overlap on the charts. Based on discussion on blogs, on forums, and in the news, it’s less clear this year who the favorite is.

oscars data.004

Why, despite the seemingly obvious data about “Her”, would I bet that “Her” will not get the Academy nod? The reason is interference. The word “Her”, even with our restrictive query to try to isolate discussions about the movie itself, is so broad, so vague, and so widely used that it will always interfere heavily with any attempt to mine conversations about it. Think about it for a second.

There are categories for leading and supporting actresses, and every article is bound to use the word “her” multiple times, even if the article isn’t referring to the film. Our query helped to reduce some of the interference, but we can’t remove its influence entirely.
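To make the interference concrete, here is a minimal Python sketch of how a plain keyword query can count an article that isn’t about the film. It is not how Sysomos evaluates queries; the `mentions` and `matches_query` helpers and the three sample snippets are made up for illustration, and the matching is simple case-insensitive whole-word search.

```python
import re

def mentions(text: str, phrase: str) -> bool:
    """Case-insensitive whole-word match; an optional trailing "s" lets
    "Academy Award" also match "Academy Awards"."""
    pattern = r"\b" + re.escape(phrase) + r"s?\b"
    return re.search(pattern, text, re.IGNORECASE) is not None

def matches_query(text: str, film: str) -> bool:
    """Approximate the query: film AND "Academy Award" AND "Best Picture"."""
    return all(mentions(text, p) for p in (film, "Academy Award", "Best Picture"))

# Hypothetical snippets, invented for illustration (not from the study data).
articles = [
    "Her is a contender for Best Picture at the Academy Awards.",
    "The actress said her film, Gravity, deserves the Academy Award for Best Picture.",
    "Gravity could take Best Picture at the Academy Awards.",
]

for film in ("Her", "Gravity"):
    hits = [a for a in articles if matches_query(a, film)]
    print(film, len(hits))
```

In this toy sample the query for “Her” counts two articles, but only the first is about the film. The second is about Gravity and gets counted simply because it contains the pronoun “her”.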

Now, based on everything you’ve read so far, can you deduce the issue with using the 2013 data set to predict the winner? Lincoln. The name is not as commonly used as the word “her”, but it’s common enough. Many students have likely written blog posts and papers while researching President Lincoln and mentioned the movie by name in passing, without discussing it in any meaningful way.

Question everything

Data is awesome. The right data is more important. Asking the right questions, whether they are actual questions or queries of a database, is extremely important in any analysis that you do.

There are many factors that can affect your results. I’ve learned to always question myself, the results, the techniques, and the source of the data. When doing this type of work, whether for clients or for yourself, you have to be willing to question everything. Otherwise, you are only scratching the surface and presenting bad results as news. That’s not honorable or smart; it’s just lazy.

What is our prediction for this year’s winner? We’ll pass on making one, but I mostly hope it’s American Hustle. Forget Christian Bale, JLaw rocks all the things.

Chel Wolverton
Senior Marketing Analyst

Photo via captivate.com.

SHIFT Communications engaged Marketwired Sysomos to track mentions of Academy Award nominees for 2013 and 2014 from blogs, forums, and traditional news sources. Twitter data was not included because 2013 data was no longer available. Queries followed the format “Picture Title” AND “Academy Awards” AND “Best Picture”. The timeframes for all data were from the date of nomination (sources: Wikipedia, Academy Awards website) to the present day for 2014 and an equivalent time period in 2013. SHIFT Communications is the sole investor in the study. Sysomos data is drawn from its own internal web spider engine and data connections to major newswires. A copy of the source data spreadsheet is available upon request.

Posted on February 28, 2014 in Culture, Data, Metrics, Research

About the Author

Chel works in the Integrated Services as a specialist who uses her knowledge of marketing technology, analytics, and their strategies to strengthen the agency. She spends her free time rucking, writing and/or gaming, creating art via canvas or photography and listening to JT and/or Black Lab. You're probably overly familiar with her love of Sherlock (BBC).