It’s that time of year when spring is just around the corner and the Academy Awards (aka the Oscars) are in bloom. The Academy of Motion Picture Arts and Sciences honors movies and stars whose work resonated with us in the past year. I’m not a huge fan of awards shows, but I do love to watch the Best Picture nominees to see what’s new and innovative in cinema, even if they’re not my normal viewing pleasure. One of the side diversions that the Academy Awards offer is to guess, bet, or try to predict who will win. I thought it’d be fun to apply some of the data analysis skills SHIFT uses every day to what we might predict for this year’s Oscars.
I started by looking at how different forms of media, such as blogs, discussion forums and review sites, and traditional news media, predicted the winner among the nine nominees in the Best Picture category last year.
85th Oscar Best Picture Predictions
Looking at last year’s data was interesting. We compared the nominees using a query of “film name” AND “Academy Award” AND “Best Picture” to find mentions and conversations specific to each film in the time leading up to the awards, from the day of nomination to just before the awards ceremony. In 2013, we see that Lincoln, Argo, and Zero Dark Thirty were neck and neck in all of the forms of media we measured. All nine films were up for the award, but these three clearly led the pack.
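For anyone who wants to try something similar, here is a minimal sketch of how a query in that format could be assembled for each film. It only builds the query strings; the function name `build_query` is made up for illustration, and the exact syntax a given monitoring tool accepts is an assumption you would need to check against that tool.

```python
def build_query(film_title: str) -> str:
    """Join the film title and the two award phrases with AND.

    Each phrase is wrapped in double quotes so it is treated as an
    exact phrase (an assumption about the search tool's syntax).
    """
    terms = [film_title, "Academy Award", "Best Picture"]
    return " AND ".join(f'"{term}"' for term in terms)


# The three 2013 front-runners named in the post.
front_runners = ["Lincoln", "Argo", "Zero Dark Thirty"]

queries = {film: build_query(film) for film in front_runners}

for film, query in queries.items():
    print(f"{film}: {query}")
```

Keeping the two award phrases fixed and swapping only the title means every nominee is measured against the same filter, so differences in volume come from the title term alone.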
The popularity charts from Sysomos show that the race was among these three throughout.
From the data above, forums and traditional news media predicted Argo as the winner, while blogs predicted that Lincoln would win. Argo actually won the award, so where did blogs go wrong? The reason is a flaw in the data collection method; let’s see if you can catch it while we move on to this year’s predictions.
86th Oscar Best Picture Predictions
We have, again, compared the 9 nominees this year using a query of “film name” AND “Academy Award” AND “Best Picture”. Do we think Her is set to win this year’s award for Best Picture? Nope. I’d bet a fiver that it isn’t the winner.
The popularity trend for this year’s films, as of today, shows three films being discussed, with lots of overlap on the charts. It’s less clear this year who the favorite is based on discussion on blogs, forums and in the news.
Why, despite the seemingly obvious data about “Her”, would I bet that “Her” will not get the Academy nod? The reason is interference. The word “Her”, even with our restrictive query to try to isolate discussions about the movie itself, is so broad, so vague, and so widely used, that it will always interfere heavily with any attempt to mine conversations about it. Think about it for a second.
There are categories for leading and supporting actresses, and every article is bound to use the word “her” multiple times, even if the article isn’t referring to the film. Our query helped to reduce some of the interference, but we can’t remove its influence entirely.
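To make the interference concrete, here is a toy sketch. The three sample posts are invented for illustration (they are not from the Sysomos data), and the matching rule, case-insensitive whole-phrase matching on all three terms, is an assumption about how such a query behaves, not a description of how Sysomos works.

```python
import re


def matches_query(text: str, film_title: str) -> bool:
    """Return True if the text contains all three query phrases.

    Matching is case-insensitive and on whole words, which is an
    assumed behavior for illustration only.
    """
    phrases = [film_title, "Academy Award", "Best Picture"]
    return all(
        re.search(r"\b" + re.escape(phrase) + r"\b", text, flags=re.IGNORECASE)
        is not None
        for phrase in phrases
    )


# Hypothetical posts, written for this example.
sample_posts = [
    # Actually about the film.
    "Her is nominated for the Academy Award for Best Picture.",
    # About some other film, but uses the pronoun "her".
    "The actress said her film deserves the Academy Award for Best Picture.",
    # No "her" at all.
    "Argo won the Academy Award for Best Picture.",
]

hits = [post for post in sample_posts if matches_query(post, "Her")]
print(f"{len(hits)} of {len(sample_posts)} posts match the query for Her")
```

Two posts match, but only one of them is about the movie. The same check with a distinctive title such as "Argo" matches only the post that names it.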
Now, based on everything you’ve read so far, can you deduce the issue with using the 2013 data set to predict the winner? Lincoln. The name isn’t as commonly used as the word “her”, but it’s common enough. Many students have likely written blog posts and papers while researching President Lincoln and mentioned the movie by name in passing, without discussing it in any meaningful way.
Data is awesome. The right data is more important. Asking the right questions, whether they are actual questions or queries of a database, is extremely important in any analysis that you do.
There are many factors that can affect your results. I’ve learned to always question myself, the results, the techniques, and the source of the data. When doing this type of work for clients or for yourself, you have to be willing to question everything. Otherwise, you are only scratching the surface and presenting bad results as news. That’s not honorable or smart; it’s just lazy.
What is our prediction for this year’s winner? We’ll pass on making one, but I mostly hope it’s American Hustle. Forget Christian Bale, JLaw rocks all the things.
Senior Marketing Analyst
Photo via captivate.com.
SHIFT Communications engaged Marketwired Sysomos to track mentions of Academy Award nominees for 2013 and 2014 from blogs, forums, and traditional news sources. Twitter data was not included because 2013 data was no longer available. Queries followed the format “Picture Title” AND “Academy Awards” AND “Best Picture”. The timeframes for all data were from the date of nomination (sources: Wikipedia, Academy Awards website) to the present day for 2014 and an equivalent time period in 2013. SHIFT Communications is the sole investor in the study. Sysomos data is drawn from its own internal web spider engine and data connections to major newswires. A copy of the source data spreadsheet is available upon request.