Making Data-Driven Airbnb Reservations

There are two groups of people when it comes to decision making: those who make decisions based on fact and those who make decisions based on emotion. While selecting an Airbnb based on emotion might sound appropriate (“Let’s stay here. Look at the built-in bookshelves and natural light, this place is dreamy!”), there are real benefits to taking a data-driven approach when finding and booking an Airbnb.

To prove this, we visualized and analyzed Airbnb data from NYC, Boston and SF to uncover a few key benefits of using data and data visualization software, such as Tableau, to make everyday decisions.

Stop feeling overwhelmed by Airbnb options in NYC

“Everybody ought to have a lower East Side in their life.” — Irving Berlin

“Sometimes I feel like my only friend, is the city I live in, is beautiful Brooklyn.” — Mos Def

“Many places in the Bronx seem hidden in shadows, just as the Bronx itself is in Manhattan’s shadow. And dark stories develop best in dark shadows.” — S. J. Rozan

These are all great sentiments, but none of them is likely to help you decide which borough to stay in when visiting NYC, much less which Airbnb in that borough will be right for you.

Enter: data.

In the above (interactive!) visualization, every Airbnb listing in NYC as of June 12, 2017 – whether it be a four-story walk-up in the Bronx, a loft in Brooklyn or a spacious three-bedroom in Manhattan – is plotted and accounted for. Searchable attributes include room type, size, price, neighborhood and ratings.

Using data visualization is a great way to eyeball options, slice and dice listings based on specific needs and shortlist options in record time. Willing to stay anywhere in NYC but require a large Airbnb that can accommodate 12-14 people? Use the sliding bars to adjust your search and quickly narrow your options in less than a New York minute.
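For readers who would rather filter in code than with sliders, here is a minimal Python sketch of the same idea. The `Listing` fields, the `shortlist` helper and the sample rows are all made up for illustration; they are not the schema or the values of the data set behind the visualization.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    # Hypothetical fields; the real data set may name these differently.
    name: str
    borough: str
    room_type: str
    accommodates: int
    price: float  # nightly price in USD


def shortlist(listings, min_guests, max_price=None, room_type=None):
    """Return listings that fit the group, cheapest first."""
    matches = [
        item for item in listings
        if item.accommodates >= min_guests
        and (max_price is None or item.price <= max_price)
        and (room_type is None or item.room_type == room_type)
    ]
    return sorted(matches, key=lambda item: item.price)


# Invented sample rows, for illustration only.
sample = [
    Listing("Loft", "Brooklyn", "Entire home/apt", 4, 180.0),
    Listing("Townhouse", "Manhattan", "Entire home/apt", 14, 950.0),
    Listing("Brownstone", "Brooklyn", "Entire home/apt", 12, 620.0),
    Listing("Walk-up room", "Bronx", "Private room", 2, 55.0),
]

# Anywhere in NYC, as long as it sleeps at least 12 people.
big_enough = shortlist(sample, min_guests=12)
for item in big_enough:
    print(item.name, item.borough, item.price)
```

Each optional argument plays the role of one slider: leave it as `None` and that filter is simply switched off.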

Finding the right value Airbnb in Boston

Here’s an example of when making an emotion-based decision about which Airbnb to book could burn you. Say you’re a science fan visiting Boston and you’d like to stay near the Museum of Science in the West End. It turns out that overall satisfaction ratings for the West End come in below average, while prices for Airbnbs in this area are above average. Not a great value no matter which scientific formula you apply.

Alternatively, you could decide you absolutely will not stay in Hyde Park due to the historically high levels of crime in the area. But when we visualize the data, we see that overall satisfaction is quite high for this area while the average price per night comes in well below average. Nice.
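The value comparison described here can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the `neighborhood_value` helper is hypothetical, the ratings and prices below are invented to mirror the direction of the findings (they are not the real Boston figures), and the citywide average is taken across listings rather than across neighborhoods.

```python
from collections import defaultdict
from statistics import mean


def neighborhood_value(rows):
    """rows: iterable of (neighborhood, rating, nightly_price) tuples."""
    rows = list(rows)
    # Citywide baselines, averaged over every listing (an assumption).
    city_rating = mean(rating for _, rating, _ in rows)
    city_price = mean(price for _, _, price in rows)

    grouped = defaultdict(list)
    for hood, rating, price in rows:
        grouped[hood].append((rating, price))

    result = {}
    for hood, pairs in grouped.items():
        avg_rating = mean(rating for rating, _ in pairs)
        avg_price = mean(price for _, price in pairs)
        result[hood] = {
            "rating_vs_city": avg_rating - city_rating,
            "price_vs_city": avg_price - city_price,
            # "Good value" here means better rated AND cheaper than average.
            "good_value": avg_rating > city_rating and avg_price < city_price,
        }
    return result


# Invented numbers, for illustration only.
rows = [
    ("West End", 4.2, 260.0),
    ("West End", 4.3, 240.0),
    ("Hyde Park", 4.9, 80.0),
    ("Hyde Park", 4.8, 90.0),
    ("Back Bay", 4.6, 200.0),
]

value = neighborhood_value(rows)
for hood, stats in value.items():
    print(hood, stats["good_value"])
```

A neighborhood is flagged only when it beats the city on both rating and price, which is a deliberately strict definition of value; a weighted score would be a reasonable alternative.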

Statistical relevance of San Francisco listings

Data is great and all, but what happens when there’s no logical relationship among variables? Is the data still meaningful; can it still be used to inform decisions? We wanted to better understand individual neighborhood values to see if there was a statistically significant relationship between a neighborhood’s average overall satisfaction (rating) and the average price per night. To answer our question, we performed a regression analysis on our data to determine what that relationship between data points looks like. Here’s what we found for San Francisco:

When we hover over the regression analysis line in the above visualization, we see a low p-value and a low R-squared value. Here’s how The Minitab Blog interprets those values:


In regression analysis, you’d like your regression model to have significant variables and to produce a high R-squared value. This low P value / high R2 combination indicates that changes in the predictors are related to changes in the response variable and that your model explains a lot of the response variability. However, a low [R-square value can indicate] that even noisy, high-variability data can have a significant trend. The trend indicates that the predictor variable still provides information about the response even though data points fall further from the regression line.


Huzzah! This means the San Francisco Airbnb data set can be used to make predictions about new listings, even though the data falls a bit further from the regression line than we might like. Of note, San Francisco was the only city we analyzed that showed a statistically significant relationship between a neighborhood’s average satisfaction and its average price. Sorry, Boston and NYC.

Why is this? Is it because the Airbnb market in San Francisco is older than in some other cities? Or because there are more overall listings? Because of x, y or z? The great thing about data is it often gives us our next question to explore, which in turn leads to more data-driven decisions.

What other questions do these visualizations inspire you to ask and find answers to? Let us know in the comments!

Natalie Cullings, Senior Marketing Analyst
Emily Mong, Senior Marketing Analyst

