Analyze large quantities of reviews in minutes using AI21 Studio
Imagine that you have a platform for hotel reservations, similar to Hotels.com. On your platform, hotel visitors can leave written reviews and an overall star rating for the hotel. These reviews allow hotel owners to gain a general understanding of guest satisfaction through their overall score, but that’s just one part of the story. The quality of the hotel experience is a combination of many factors, including the room quality, the facilities, and the staff. Ideally, the owner would like to get a clear picture of their hotel’s strengths and weaknesses so they can improve on the aspects that are lacking and highlight the aspects that visitors find positive.
To extract these insights, some hotel owners sit down from time to time and read all of the reviews that were written about their hotel. This is a tedious task - one you, as a developer, would probably want to spare your users from. What if you could create a dashboard that highlights the areas guests mention positively or negatively in their reviews? As a result, your user - in this case, the hotel owner - can get a real-time snapshot of their strengths and weaknesses with just a glance at this dashboard.
Not so long ago, you would have needed to work pretty hard to create a solution like that (for example, using several classical methods). But with large language models (LLMs), it’s really easy. You can perform this analysis with high accuracy, even with no prior knowledge of natural language processing (NLP). If that sounds appealing to you, then read on. By the end of this post, you will be able to implement this feature in your platform or product.
This post will walk you through the process of building an NLP-powered dashboard for a hotel, including:
- Collecting reviews with the Hotels.com API on RapidAPI
- Pre-processing and filtering the reviews
- Analyzing each review with a few-shot prompt in AI21 Studio
- Aggregating the results and visualizing them in a dashboard
If you are new to large language models, we recommend first reading this post.
You can find the Hotels.com API on RapidAPI. This API has several endpoints; we need the reviews endpoint, which returns the reviews for a given hotel, page by page, with approximately 50 reviews per page. To retrieve all of the available reviews, we call the API while iterating through the pages. The following functions do just that:
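A minimal sketch of that retrieval loop is below. The endpoint URL, parameter names, and response fields ("reviews", "summary") are assumptions based on a typical RapidAPI listing; check the Hotels.com API page on RapidAPI for the exact names:

```python
import requests

RAPIDAPI_KEY = "YOUR_RAPIDAPI_KEY"  # your RapidAPI key, not your AI21 Studio key
# Endpoint path and parameter names are illustrative; verify them on RapidAPI.
REVIEWS_URL = "https://hotels4.p.rapidapi.com/reviews/list"

def get_reviews_page(hotel_id, page):
    """Fetch a single page (~50 reviews) for a hotel."""
    response = requests.get(
        REVIEWS_URL,
        headers={
            "X-RapidAPI-Key": RAPIDAPI_KEY,
            "X-RapidAPI-Host": "hotels4.p.rapidapi.com",
        },
        params={"id": hotel_id, "page": page, "loc": "en_US"},
    )
    response.raise_for_status()
    return response.json()

def extract_review_texts(page_json):
    """Pull the free-text summaries out of one page of API results."""
    return [r.get("summary", "") for r in page_json.get("reviews", [])]

def get_all_reviews(hotel_id):
    """Iterate through pages until a page comes back empty."""
    reviews, page = [], 1
    while True:
        page_reviews = extract_review_texts(get_reviews_page(hotel_id, page))
        if not page_reviews:
            break
        reviews.extend(page_reviews)
        page += 1
    return reviews
```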
Note that running this function requires an API key for the reviews endpoint, which you can obtain from RapidAPI. This is not your AI21 Studio API key. There are some edge cases that this function doesn’t cover, such as requesting a page number that doesn’t exist, but we’ll set them aside for the purposes of this blog post.
Throughout this process, we will use the Empire Hotel in New York as a running example. You can use an API endpoint to get the hotel ID, or you can find it in the hotel’s URL on Hotels.com:
Large language models are very powerful and can ingest text of all shapes and sizes, but as the old saying goes, “garbage in, garbage out”. If you feed the model a bad prompt, the results will be far from optimal.
In this case, you should keep the following points in mind:
- Very short reviews (for example, just "Great!") contain little real content for the model to work with.
- Reviews that are not in English will not match the pattern of English few-shot examples.
In order to remove reviews that are too short or not in English, you can apply some simple filters to all of the reviews:
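One way to sketch these filters, using a hypothetical minimum word count and a crude ASCII-ratio check as a stand-in for a proper language detector (a library such as langdetect would be more robust):

```python
MIN_WORDS = 5  # illustrative threshold; tune it for your data

def looks_english(text, threshold=0.9):
    """Crude English check: fraction of ASCII characters.
    A real language detector (e.g. the langdetect package) is more robust."""
    if not text:
        return False
    ascii_chars = sum(1 for ch in text if ord(ch) < 128)
    return ascii_chars / len(text) >= threshold

def filter_reviews(reviews):
    """Keep reviews that are long enough and (probably) in English."""
    return [
        r for r in reviews
        if len(r.split()) >= MIN_WORDS and looks_english(r)
    ]
```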
Note: Here, we have removed reviews that have no real content by applying a very basic filter. Although this is not mandatory, we recommend pre-processing your reviews further, for example by removing stray characters and extra whitespace.
This step is where you truly harness the power of AI21 Studio’s large language models!
You want the model to extract the topics and sentiments of each free text review into a structured JSON format. You can do this by leveraging a main strength of language models: when provided with text in plain English, the language model can identify patterns and generate text that follows the same pattern. By feeding the model a prompt with a few examples (this is called a few-shot prompt), it can identify the pattern and generate a reasonably good completion.
Obtaining these examples, however, requires you to manually go through several reviews, which we have done below. The resulting few-shot prompt is as follows (the reviews are as written by the platform’s users, with no grammar or spelling corrections):
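The examples in the original prompt come from real user reviews; the sketch below uses invented reviews purely to illustrate the structure: each example pairs a review with a JSON list of topic/sentiment objects, using the topics that appear later in the post.

```python
# Illustrative few-shot prompt: the reviews and their annotations here are
# invented stand-ins for the manually annotated examples in the post.
FEW_SHOT_PROMPT = """\
Extract the topics and sentiments from the following hotel reviews.

Review:
Room was clean and staff were super friendly, but the wifi kept dropping.
Extracted sentiment:
[{"topic": "Cleaning", "sentiment": "Positive"}, {"topic": "Service", "sentiment": "Positive"}, {"topic": "WiFi", "sentiment": "Negative"}]

Review:
Great location near the park. The AC was broken and the room felt dated.
Extracted sentiment:
[{"topic": "Location", "sentiment": "Positive"}, {"topic": "AC", "sentiment": "Negative"}, {"topic": "Room", "sentiment": "Negative"}]
"""
```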
Creating a good prompt is more than simply deciding on the pattern. The goal is to construct a prompt that triggers the model to generate the optimal completion (this is called prompt engineering). To achieve this, you should keep the following in mind:
- Use exactly the same format and labels in every example, so the pattern is unambiguous.
- Choose examples that represent the variety of reviews you expect: different topics, and both positive and negative sentiments.
- End the prompt at the exact point where you want the model to continue, right after the label that precedes the expected output.
Additionally, in this use-case, we recommend setting the temperature to 0, since the task calls for accuracy rather than creativity. Increasing the temperature produces more varied, creative completions, while lowering it makes the completions more deterministic and precise. Curious about temperature? See Step 4 in this post for more detail.
Happy with the prompt and want to start analyzing reviews? You can copy the few-shot prompt from the playground:
And use the following function to create the prompt for every review:
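A prompt-builder along these lines would do it; the "Review:"/"Extracted sentiment:" labels are placeholders for whatever labels your few-shot examples use:

```python
def build_prompt(few_shot_prefix, review):
    """Append a new review to the few-shot prefix, ending the prompt at the
    exact point where the model should continue with the extracted JSON."""
    return f"{few_shot_prefix}\nReview:\n{review.strip()}\nExtracted sentiment:\n"
```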
For every review, create the prompt and then call Jurassic-1 to perform the analysis (you can take the call from the playground, as illustrated above, or use the function from here).
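As a sketch, the REST call might look like the following. The endpoint path, payload fields, and response structure follow the AI21 Studio API as documented at the time of writing; verify them against the current docs before relying on them:

```python
import requests

AI21_API_KEY = "YOUR_AI21_STUDIO_API_KEY"

def extract_completion(response_json):
    """Pull the generated text out of an AI21 Studio response body."""
    return response_json["completions"][0]["data"]["text"]

def analyze_review(prompt):
    """Call Jurassic-1 with temperature 0, stopping at the end of the JSON line.
    Endpoint and payload are assumptions based on the AI21 Studio REST API."""
    response = requests.post(
        "https://api.ai21.com/studio/v1/j1-large/complete",
        headers={"Authorization": f"Bearer {AI21_API_KEY}"},
        json={
            "prompt": prompt,
            "numResults": 1,
            "maxTokens": 100,
            "temperature": 0,        # accuracy over creativity
            "stopSequences": ["\n"],  # stop after the one-line JSON output
        },
    )
    response.raise_for_status()
    return extract_completion(response.json())
```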
Once you have the list of topics and sentiments, you can create your dashboard.
First, gather all of the topics together, counting the "Positive" and "Negative" mentions of each topic. Since you already have the completion in JSON format, you can process it using standard packages. However, as the format may not be perfect, and you don’t want any failures in your automated process, you can add a simple try-except block: if a completion from the model isn’t in valid JSON format, you drop it. You can use the following function:
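A sketch of such a function, assuming each completion is a JSON list of {"topic": ..., "sentiment": ...} objects:

```python
import json
from collections import defaultdict

def count_topic_sentiments(completions):
    """Tally Positive/Negative mentions per topic across all completions.
    Completions that are not valid JSON are silently dropped."""
    counts = defaultdict(lambda: {"Positive": 0, "Negative": 0})
    for completion in completions:
        try:
            items = json.loads(completion)
        except json.JSONDecodeError:
            continue  # imperfect completion: drop it rather than fail
        for item in items:
            topic, sentiment = item.get("topic"), item.get("sentiment")
            if topic and sentiment in ("Positive", "Negative"):
                counts[topic][sentiment] += 1
    return dict(counts)
```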
At this stage, all that’s left to do is create the figure. With minor changes to this matplotlib example, you’ll have your dashboard:
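A possible adaptation, assuming per-topic counts in the shape produced above (a dict mapping each topic to its "Positive"/"Negative" counts):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_dashboard(counts):
    """Grouped bar chart of positive vs. negative mentions per topic,
    adapted from matplotlib's grouped-bar-chart example."""
    topics = sorted(counts)
    pos = [counts[t]["Positive"] for t in topics]
    neg = [counts[t]["Negative"] for t in topics]
    x = np.arange(len(topics))  # one group per topic
    width = 0.35                # width of each bar
    fig, ax = plt.subplots()
    ax.bar(x - width / 2, pos, width, label="Positive", color="tab:green")
    ax.bar(x + width / 2, neg, width, label="Negative", color="tab:red")
    ax.set_ylabel("Number of mentions")
    ax.set_title("Review topics by sentiment")
    ax.set_xticks(x)
    ax.set_xticklabels(topics, rotation=45, ha="right")
    ax.legend()
    fig.tight_layout()
    return fig
```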
You can see that the hotel is deemed excellent in Location and rather good in Service and Cleaning. However, it should invest more in the WiFi and AC, and perhaps do some renovations or upgrades to the rooms and facilities.
As a last step, and as a sanity check, you’ll probably want to validate your results without manually reading every review. One way to do that is to compare them with those of other hotel platforms, such as Booking.com. If you go to the Booking.com page for this hotel, where visitors are asked to rate hotels across numerous categories, you will find that the overall picture is very similar to your own analysis:
By following the steps laid out in this post, you have built a very useful feature that can be implemented on hotel and accommodation platforms. Thanks to large language models, analyzing pieces of text, such as reviews, has never been easier. A few simple tweaks, such as writing the examples in JSON format, can save a lot of time in post-processing, making the entire process faster and easier. You can find the full notebook in our dev-hub.
Are you interested in building your own feature? With a custom model, you can get even higher-quality results, tailored to your own data. You can find out more about that here.