Ryanair - a Love-hate Relationship
When we have lots of customer reviews, it’s difficult to see the wood for the trees. Here, I’ve used Python’s Natural Language Toolkit to identify certain themes in both the good and the bad reviews.
This SWD Challenge focuses on visualizing qualitative data.
Last week I flew to Barcelona with Ryanair. At the airport in both London and Barcelona, I noticed some passengers getting very angry with the gate agents.
At one stage there was even some shouting!
This got me thinking about what people hate about Ryanair, and whether there’s any love for them out there.
With a rating of only 1.3 stars on Trustpilot, there’s certainly no shortage of people who don’t like Ryanair - 6,626 to be precise!
But they also have 526 five-star reviews.
I had many questions:
- What do these people love/hate so much about Ryanair?
- Are there any themes which link the lovers and the haters?
- Which factors led to customers deciding to leave a 1-star review vs. a 5-star review?
I used Python’s Beautiful Soup package to parse Trustpilot’s HTML.
This let me scrape the review titles, text and ratings into a dataframe which I could then use for my analysis.
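For anyone curious what that parsing step can look like, here is a minimal sketch. It is not the exact code I ran: the HTML snippet, the tag names and the class names (`review`, `review-title`, `review-text`, `data-rating`) are placeholders I made up for illustration, and Trustpilot's real markup is different.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: these tags, classes and reviews are placeholders,
# not Trustpilot's actual HTML or real review data.
html = """
<article class="review" data-rating="1">
  <h2 class="review-title">Never again</h2>
  <p class="review-text">Charged extra for hand luggage at the gate.</p>
</article>
<article class="review" data-rating="5">
  <h2 class="review-title">Great value</h2>
  <p class="review-text">Cheap flight and we landed on time.</p>
</article>
"""

soup = BeautifulSoup(html, "html.parser")

# Collect one dict per review: title, body text and star rating.
rows = []
for card in soup.find_all("article", class_="review"):
    rows.append({
        "title": card.find("h2", class_="review-title").get_text(strip=True),
        "text": card.find("p", class_="review-text").get_text(strip=True),
        "rating": int(card["data-rating"]),
    })

print(rows)
```

A list of dicts like `rows` can be passed straight to `pandas.DataFrame(rows)` to get one row per review.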
Natural Language Toolkit (NLTK)
Python’s NLTK package is ideal for gaining insights from text strings such as customer reviews.
I used it to look at the frequencies of individual words as well as two-, three- and four-word combinations (n-grams) to see which themes emerged.
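To give a feel for the word-combination counting, here is a small sketch using NLTK's `ngrams` and `FreqDist`. The three reviews are invented for the example, and the regex tokenizer and the `tokenize` and `ngram_counts` helpers are my own simplifications for this sketch, not necessarily what the full analysis used.

```python
import re

from nltk import FreqDist
from nltk.util import ngrams

# Made-up reviews for illustration only; not real Trustpilot data.
reviews = [
    "Charged extra for hand luggage at the gate",
    "Had to pay extra for hand luggage again",
    "Cheap flights and the plane left on time",
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngram_counts(texts, n):
    """Count n-word combinations, one review at a time."""
    counts = FreqDist()
    for text in texts:
        counts.update(ngrams(tokenize(text), n))
    return counts


bigrams = ngram_counts(reviews, 2)
trigrams = ngram_counts(reviews, 3)

print(bigrams.most_common(3))
print(trigrams.most_common(2))
```

Counting each review separately keeps word combinations from spanning the end of one review and the start of the next. In this toy data, "hand luggage" and "for hand luggage" each appear twice, which is the kind of repeated phrase that hints at a theme.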
Here are the results of my analysis: