How Airbnb Guests See San Francisco: A Story In Data

Airbnb launched in San Francisco in August of 2008. Since then, it has grown to operate in more than 81,000 cities across 191 countries, with a total of six million listings¹. In the process, it has generated an immense amount of context-specific data about the cities in which it operates. Having lived in San Francisco for two years now, I’ve visited almost every neighborhood in the city at least once and am familiar with each one’s reputation. I was curious, however, to source opinions from people with an outsider’s perspective on the city and its different neighborhoods: guests at Airbnb listings. Leveraging data from Inside Airbnb², I analyzed ~330,000 reviews of ~7,000 listings in San Francisco. While these reviews often focus on the host and the residence itself, they also regularly comment on the surrounding areas.

As I investigated the dataset, information beyond the reviews became too interesting to ignore, so I carried out additional analyses. In addition to analyzing the review data, I close by presenting findings on questions like: “How fast is Airbnb growing in San Francisco?” and “What is the seasonality of bookings?”

The Data

I grabbed the San Francisco listings and reviews data from Inside Airbnb as .csv files and loaded them into Pandas DataFrames, then joined them on listing_id. The joined DataFrame provided 103 fields to work with, including the date of the review, id of the listing, neighborhood the listing is in, text of the review, and many other pieces of information. Then, I wrote and applied a series of scripts to clean and process the raw text data into a format I could more easily work with, including tokenizing reviews at the sentence and word levels and grouping reviews by neighborhood.
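The load-and-join step can be sketched as follows. This is a minimal illustration, not the actual processing code: the toy rows stand in for `pd.read_csv` calls on the Inside Airbnb files, and the column names (`listing_id`, `comments`, `neighbourhood`) follow that dataset's schema.

```python
import pandas as pd

# Toy rows standing in for pd.read_csv("listings.csv") / pd.read_csv("reviews.csv");
# reviews reference listings via listing_id, matching Inside Airbnb's schema.
listings = pd.DataFrame({
    "id": [101, 102],
    "neighbourhood": ["Mission District", "The Castro"],
})
reviews = pd.DataFrame({
    "listing_id": [101, 101, 102],
    "date": ["2018-05-01", "2018-06-02", "2018-06-03"],
    "comments": ["Great spot in a lively area.",
                 "Loved the murals nearby.",
                 "Charming street, friendly hosts."],
})

# Join each review to its listing's metadata (neighbourhood, etc.)
joined = reviews.merge(listings, left_on="listing_id", right_on="id")

# Group review texts by neighbourhood for per-neighbourhood processing
by_hood = joined.groupby("neighbourhood")["comments"].apply(list)
print(by_hood["Mission District"])
```

The same grouped structure feeds the per-neighborhood models trained later in the analysis.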

Working With Review Data

Method Selection

After cleaning and formatting the text data, I investigated a series of natural language processing methods to extract meaning from the reviews. As a first approximation, I started by constructing an n-gram model of the reviews but felt that it lacked the ability to capture context and isolate comments related to the neighborhood of the listing. I then came across Mikolov’s Distributed Representations of Words and Phrases and their Compositionality paper³, which details the construction of vector representations of words that capture syntactic and semantic word relationships. This model addressed my concerns about capturing the nuance of a review that discusses multiple aspects of a guest’s experience.

Implementation

I implemented the Word2vec method detailed by Mikolov using the open source Gensim library⁴. Since the corpus was not unbearably large, I was able to train all models on my local machine in a reasonable amount of time. Word2vec can be trained either as a continuous bag-of-words (CBOW) model or as a continuous skip-gram model. CBOW ignores word order within the context window and is slightly faster to train, while continuous skip-gram does a better job of handling infrequent words. In training my models, I opted for the latter.

Results: San Francisco Through the Lens of Airbnb Guests

I reduced the set of neighborhoods in a two-step process: first, I selected neighborhoods with more than 10,000 reviews (nine met this condition); then I picked five of the nine based on my assessment of how iconic or well-known they are. The five neighborhoods explored in this analysis, ranked by number of reviews, are:

  • Mission District (35,409 reviews)
  • Richmond District (21,016 reviews)
  • The Castro (17,524 reviews)
  • Potrero Hill (11,678 reviews)
  • Haight-Ashbury (10,074 reviews)

To evaluate what guests thought was unique about each neighborhood, I trained a continuous skip-gram model instance on reviews specific to each neighborhood. Then, using the word vectors generated by the model, I passed in the strings “unique” and “neighborhood” as positive parameters to elicit the ten most similar words, where similarity is defined as the “cosine similarity between a simple mean of the projection weight vectors” of the input strings.

Technical jargon aside, here are the ten most similar words used in association with the terms “unique” and “neighborhood” for each neighborhood:

What Is ‘Unique’ About Each Neighborhood?

  • Mission District: diverse, quirky, soul, dynamic, colorful, artsy, funky, bohemian, flavor, grungy (it’s worth noting that the 11th most similar word/phrase was “up-and-coming” — the reviews stretch back to 2009 when the Mission District was less well-known)
  • Richmond District: vibrant, suburb, artistic, serene, tranquil, quaint, community, homey, charming, nestled
  • The Castro: eclectic, architecture, hip, diverse, funky, vibe, quirky, colorful, artistic, history
  • Potrero Hill: authentic, architecture, vibe, impressive, unbeatable, hilltop, low-key, charm, stellar, upscale
  • Haight-Ashbury: quirky, eclectic, historical, vibrant, funky, hip, architecture, bohemian, historic, colorful

From my experience as a San Francisco resident, these descriptions are actually quite accurate! I was happy with the model’s ability to parse what is unique about each neighborhood from reviews.

What Is ‘Negative’ About Each Neighborhood?

Next, I was interested in figuring out what guests thought was “negative” about each of the neighborhoods examined. This turned out to be more difficult than identifying the “unique” aspects of each neighborhood, both because positive reviews greatly outnumbered negative ones and because the negative reviews often focused on some unsatisfactory aspect of the listing itself (e.g., no air conditioning). Still, the models generated interesting and relevant terms specific to each neighborhood. To get the terms below, I generated the top twenty similar terms for each neighborhood seeded with ‘negative’ and ‘neighborhood,’ then removed the terms that were not relevant to describing a neighborhood:

  • Mission District: edgy, shady, gentrifying, chaos, dicey
  • Richmond District: avenues, terrible, scarce, rough, unusual
  • The Castro: daunting, scary, quietness, disappointing, colder
  • Potrero Hill: challenging, daunting, roads, meter, slope
  • Haight-Ashbury: weird, rough, creepy, disappointing, reality

After a bit of filtering, the terms generated were interesting and could be considered relevant to each neighborhood.

Airbnb’s Growth in San Francisco Over Time

To measure the growth of Airbnb in San Francisco over time, I grouped reviews by month and counted unique reviewer_ids. I chose unique reviewers per month as the growth metric, rather than the number of reviews per month, to capture changes in the number of people using the platform in San Francisco on the demand side. Reviews must be left within two weeks of a stay and therefore closely correspond to actual booking activity in a given month. In the past, Airbnb has suggested that about 50% of guests leave a review of their stay, meaning that reviews are likely a reasonable proxy for the growth of the platform in San Francisco.
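The monthly metric boils down to a groupby on the review month with a distinct count of reviewers. A small sketch with toy rows in place of the real review data (column names follow Inside Airbnb's schema):

```python
import pandas as pd

# Toy review rows: reviewer 1 books twice in January, reviewers 2 and 3
# each book once in February.
reviews = pd.DataFrame({
    "date": ["2018-01-05", "2018-01-20", "2018-02-03", "2018-02-10"],
    "reviewer_id": [1, 1, 2, 3],
})
reviews["date"] = pd.to_datetime(reviews["date"])

# Unique reviewers per calendar month
monthly = reviews.groupby(reviews["date"].dt.to_period("M"))["reviewer_id"].nunique()
print(monthly)
```

Note that `nunique` is what makes this a count of distinct people rather than of reviews: January collapses to 1 despite having two reviews.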

Chart 1

Despite operating in San Francisco for roughly ten years, the platform continues to grow quickly! The graph also reveals seasonality in reviewers, which I explore further in a later section.

Year-Over-Year Growth by Month

Just how quickly has Airbnb been growing in San Francisco? While the platform experienced explosive growth in 2012–2014 (excluded from the chart below), it has maintained a high growth rate within San Francisco. In 2018, every month had 30%+ more unique reviewers than the same month a year prior did, and years before that had even higher increases.
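Given a monthly series, the year-over-year comparison is a percentage change over a 12-month lag. The numbers below are illustrative, not the actual reviewer counts:

```python
import pandas as pd

# Illustrative monthly unique-reviewer counts: flat at 100 through 2017,
# flat at 130 through 2018.
monthly = pd.Series(
    [100] * 12 + [130] * 12,
    index=pd.period_range("2017-01", periods=24, freq="M"),
)

# Compare each month to the same month twelve periods earlier
yoy = monthly.pct_change(12) * 100
print(yoy.loc["2018-01"])  # 30.0
```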

Chart 2

Understanding Seasonality in Reviewers

While the growth trend in Chart 1 reveals the presence of seasonality in reviewers, it’s tough to determine just how this fluctuates by season against the backdrop of a strong positive growth trend. To separate the two, I employed an STL decomposition⁵. Chart 3 shows the trend component and Chart 4 shows the seasonality component. The slightly negative values at the beginning of the trend estimation are offset by positive seasonal values; the values at the beginning of the time series are also very small compared to the rest of the dataset. Chart 5 shows the trend and seasonal components plotted with the actual values.

Chart 3

Chart 3 highlights the strong positive growth that Airbnb has experienced in San Francisco.

Chart 4

By separating out seasonality from trend, Chart 4 shows that there are annual peaks in reviewers between August and October and troughs in November and December.

Chart 5

Plotting the trend and seasonal values alongside the actual values, as in Chart 5, gives a rough visual check of how well the two components combined reproduce the actual time series.

Future Work

The work I’ve done here just scratches the surface of what is possible with the data available. A few interesting ideas that came to mind to extend the analysis include:

  • A comparative analysis between cities (growth rate, how reviewers talk about each city, etc.)
  • Understanding the impact of regulatory battles in cities like New York and Barcelona on listings and booking activity
  • Analyzing the growth of the recently launched “Airbnb Experiences”
  • Passing in different input strings to the Word2vec model to understand what guests liked/disliked about the listings themselves and the interactions they had with the hosts of the listing

…and many other ideas.

If you found this analysis interesting or have additional ideas about how to extend it, I’m always happy to chat! My DMs are open on Twitter and I can be reached at philjglazer@gmail.com.

The full analysis was conducted in a Jupyter Notebook. If there’s interest, I’ll push it to GitHub and make it public there.