{PR}edict: Predictive Analytics and the Future of PR, Part 4

In the last post, we looked at a sample prediction that used Google Analytics™ data to forecast my blog’s website traffic. We used clean, compatible, well-chosen data and looked forward 365 days to see what my blog’s future performance might look like.

What if, however, we didn’t have textbook data at our fingertips? How might our predictions go awry? In this post, we’ll look at some common scenarios which confound our predictive skills.

Bad Data

The first and most common scenario in predictive analytics is flat-out bad data. If we have data which is poorly formed, broken, incompatible, etc. – and we don’t know it – then our predictions are likely to be very wrong.

Novice data analysts often assume that a data source, especially an internal one or one from a reputable source like Google Analytics™, is inherently clean. Nothing could be further from the truth. We should treat all data as suspect until we’ve had a chance to inspect it for quality.
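
As a rough illustration of what inspecting data for quality can look like, here is a minimal Python sketch. It assumes our traffic export has already been loaded as (date, sessions) pairs; the function name audit_daily_series and the sample rows are hypothetical, not part of any Google Analytics™ tooling. It checks only for duplicate dates, missing values, negative values, and calendar gaps, so a real audit would likely need more.

```python
from datetime import date, timedelta


def audit_daily_series(rows):
    """Return a list of human-readable problems found in (date, sessions) rows."""
    problems = []
    seen = set()
    for day, sessions in rows:
        if day in seen:
            problems.append(f"duplicate date: {day}")
        seen.add(day)
        if sessions is None:
            problems.append(f"missing value: {day}")
        elif sessions < 0:
            problems.append(f"negative value: {day}")
    # Walk the calendar from the first to the last date to find gaps.
    if seen:
        current, last = min(seen), max(seen)
        while current <= last:
            if current not in seen:
                problems.append(f"missing date: {current}")
            current += timedelta(days=1)
    return problems


# Hypothetical sample rows, invented purely for illustration.
sample = [
    (date(2017, 9, 1), 120),
    (date(2017, 9, 2), None),
    (date(2017, 9, 2), 95),
    (date(2017, 9, 4), -5),
]

for problem in audit_daily_series(sample):
    print(problem)
```

An empty result does not prove the data is good; it only means these particular checks found nothing.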

Black Swans

The second scenario where predictive analytics often fails is with black swans – events that are significant and impactful, but could not be foretold from existing data. While much business and marketing data is cyclical, seasonal, and predictable, we still encounter these events from time to time.

For example, no amount of predictive analytics could have correctly forecast the September 11 attacks or the attack on Pearl Harbor, yet these events changed the world.

Confounding Variables

A third circumstance in which predictive analytics often fails is with confounding variables. These situations arise from our failure to understand our data and the context in which it occurs. To use a classic data science and statistics example, suppose we’re modeling and predicting ice cream sales. We’ve got great sales data from the last 50 years, and we’re building our model based on it.

Yet, the next year we look back and we see our predictive forecasts were terribly wrong because the summer was unseasonably cold. If we only used our ice cream sales data and didn’t account for weather in the model at all, we missed the context of our data. There was a dependency we didn’t forecast that we should have known about; we certainly have weather data for the last 50 years and could have built models for a cold summer, an average summer, and a hot summer.
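
To make the ice cream example concrete, here is a small Python sketch. All of the numbers are invented for illustration, and a single straight-line fit on temperature is an assumption chosen for simplicity, not a recommended forecasting method. It compares a forecast built from sales history alone with forecasts for a cold, an average, and a hot summer.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    return y_bar - slope * x_bar, slope


# Hypothetical history: average summer temperature (°F) and ice cream sales (units).
temps = [68, 72, 75, 79, 82, 85]
sales = [405, 455, 478, 522, 548, 583]

# Forecast that ignores weather entirely: the average of past sales.
naive_forecast = mean(sales)

# Forecasts that account for weather, one per assumed scenario.
intercept, slope = fit_line(temps, sales)
scenarios = {"cold": 65, "average": mean(temps), "hot": 88}
forecasts = {name: intercept + slope * t for name, t in scenarios.items()}

print(f"sales-only forecast: {naive_forecast:.0f}")
for name, value in forecasts.items():
    print(f"{name} summer forecast: {value:.0f}")
```

In this toy data, the sales-only forecast matches the average-summer scenario and overshoots the cold one, which is the kind of miss described above.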

Insufficient Engineering

The fourth circumstance in which predictive analytics goes awry is insufficient engineering. This is specific to data science; feature engineering is the time we spend ensuring we’ve selected good data, prepared it properly, and trained our models appropriately.

For example, in my web analytics, if I’m attempting to forecast my traffic for the next year and I know anomalies are present, I should engineer them out. A random one-time Reddit hit is enough to skew a model, but if I had no part in creating the hit, if it was truly random, it has no place in the model.
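
One way to engineer out such an anomaly can be sketched in Python. The approach here, flagging values that sit far from the median relative to the median absolute deviation, is an assumption on my part rather than the only option, and the function name remove_spikes, the threshold of 5, and the traffic numbers are all hypothetical. Whether a flagged spike was truly random remains a human call.

```python
from statistics import median


def remove_spikes(values, threshold=5.0):
    """Replace values far from the median with the median itself.

    Distance is measured in units of the median absolute deviation (MAD).
    """
    mid = median(values)
    mad = median([abs(v - mid) for v in values])
    if mad == 0:
        # No spread to measure against, so leave the data untouched.
        return list(values)
    return [mid if abs(v - mid) / mad > threshold else v for v in values]


# Hypothetical daily sessions with one random, one-time traffic spike.
daily_sessions = [100, 104, 98, 101, 2500, 99, 103]

cleaned = remove_spikes(daily_sessions)
print(cleaned)
```

The median is used instead of the mean because a single large spike drags the mean toward itself, which would make the spike harder to detect.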

Mistaking Predictions for Insights

The final way predictive analytics goes wrong isn’t with the prediction, but what we do with it. All descriptive and diagnostic analytics, being based in mathematics and statistics, can only tell us what happened. Predictive analytics models, built with the same math and statistics, will only tell us what is likely to happen.

None of these analytics ever explain why something did happen or why it will happen. None of these mathematical models understand the humans often at the root of the data we’re studying.

Never mistake what for why. Our models help us plan and predict what is to come, but we still require human insight and judgement to determine whether the circumstances of the model remain appropriate – and if not, how to build more insightful models.

Next: The Future of PR

In the next post in this series, we’ll review where we’ve been and what the road ahead looks like as more machine learning and artificial intelligence find their way into the world of public relations. Stay tuned!

Christopher S. Penn
Vice President, Marketing Technology

Posted on September 25, 2017 in Analytics, Artificial Intelligence, Data-Driven PR, Machine Learning, Marketing, Marketing Technology, Predictive Analytics, Public Relations

About the Author

Christopher S. Penn is an authority on digital marketing and marketing technology. A recognized thought leader, author, and speaker, he has shaped three key fields in the marketing industry: Google Analytics adoption, data-driven marketing and PR, and email marketing. Known for his high-octane, here’s how to get it done approach, his expertise benefits companies such as Citrix Systems, McDonald’s, GoDaddy, McKesson, and many others. His latest work, Leading Innovation, teaches organizations how to implement and scale innovative practices to direct change. Christopher is a highly-sought keynote speaker thanks to his energetic, informative talks. In 2015, he delivered insightful, innovative talks on all aspects of marketing and analytics at over 30 events to critical acclaim. He is a founding member of IBM’s Watson Analytics Predictioneers, co-founder of the groundbreaking PodCamp Conference, and co-host of the Marketing Over Coffee marketing podcast. Christopher is a Google Analytics Certified Professional and a Google AdWords Certified Professional. He is the author of over two dozen marketing books including bestsellers such as Marketing White Belt: Basics for the Digital Marketer, Marketing Red Belt: Connecting With Your Creative Mind, and Marketing Blue Belt: From Data Zero to Marketing Hero.