Escape Room Ranking 1.0 – Building a Better Ranking System

June 30, 2019 escape rooms

Originally Published: 5/28; Last Update: 6/30

Introduction

We published the Top 100 Escape Rooms in the World ranking in March and received a lot of useful feedback.  We appreciate that feedback, and we listened!  The result is the development of a better ranking system.

When we created the previous ranking, we were aware of the limitations of using the weighted averages provided by Google, Yelp, and TripAdvisor.  We wanted a better solution that would allow us to automate the collection and processing of review data on a monthly basis.

Developing this process revealed several challenges in getting the cleanest data possible, and building the new system is taking much longer than expected.  We would therefore like to keep you apprised of our progress.  This article will be updated frequently, so please check back to see where we are!

Process for the Updated Ranking System

Here is the process we are using to implement the updated ranking system:

  1. Choosing the city database – Complete

First, we searched for a database of the most populated cities in the world.  The database we located contains more than 12,000 cities.  We would have preferred a list of cities that were geographically spaced apart to limit duplicate records, but we are not aware that such data exists.  Although it means a little extra work, we were fortunate to find this data!  FYI – the figure above displays some of these cities and population histograms.

  2. Queries for Review Links by City – Complete

Next, we coded queries to retrieve the URLs for the Google, Yelp, and TripAdvisor business listings.  Understanding the URL structure of each source was important for retrieving the data.  The resulting data set had many duplicate records, which occurred for several reasons.  For example, searches over overlapping areas return the same listings, so results for Littleton, Colorado could, in theory, overlap results for Denver, Colorado.  Other causes include variations in business names and inaccurate business locations.

  3. Cleaning/combining/verifying records – Complete

The next step is data combination, cleaning, and verification.  In addition to the examples listed above, language differences have created further challenges, especially in Russia and Ukraine.  Google searches often gave better results in Russian than in English.  Translating the English business titles often proved useful for finding the Google review pages of Russian businesses.

This step is largely manual.  Going through thousands of records to match the Google, Yelp, and TripAdvisor business URLs with each escape room business name and address has been very time consuming.  The US business listings seemed to be the easiest, but that may be due to our familiarity with the structure and language of these business names and addresses.

As with any internet query, some of the results are not relevant (e.g., not escape rooms).  We have been weeding these out along with businesses that have permanently closed.  We currently estimate that we will end up with more than 8,000 escape room locations.  We have ~10.4k records now, but many are marked for deletion, and we expect to find more duplicates as we uncover businesses that have moved locations.

Update 6/5.  We have completed verification and deleted the records marked for deletion, resulting in ~9,400 records.  Now we are marking duplicate URLs for Google, Yelp, and TripAdvisor to examine whether these records need to be revised.  In some cases these will be duplicate records for the same address, URLs that represent multiple locations, or records showing the former and present addresses of a business.
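As a rough illustration of the duplicate-URL marking described above, here is a minimal Python sketch.  It is not our production code: the record layout and the field names (`id`, `google_url`, `yelp_url`, `tripadvisor_url`) are hypothetical, and the URL normalization (trimming whitespace and a trailing slash) is a simplifying assumption.

```python
from collections import defaultdict

# Hypothetical field names; the real record layout may differ.
URL_FIELDS = ("google_url", "yelp_url", "tripadvisor_url")


def find_duplicate_urls(records, url_fields=URL_FIELDS):
    """Return {(field, url): [record ids]} for every URL used by 2+ records."""
    seen = defaultdict(list)
    for rec in records:
        for field in url_fields:
            url = rec.get(field)
            if url:
                # Crude normalization (an assumption): trim spaces and a trailing slash.
                seen[(field, url.strip().rstrip("/"))].append(rec["id"])
    # Keep only URLs that appear on more than one record.
    return {key: ids for key, ids in seen.items() if len(ids) > 1}


records = [
    {"id": 1, "google_url": "https://example.com/a/"},
    {"id": 2, "google_url": "https://example.com/a"},
    {"id": 3, "google_url": "https://example.com/b", "yelp_url": None},
]
print(find_duplicate_urls(records))
```

Each flagged group still needs a manual look, since a shared URL can mean a true duplicate, one listing covering multiple locations, or a business that moved.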

Update 6/7.  We have completed the URL checks and removed more duplicate records.  The resulting dataset is now at ~9,000 records.  Several records may still be associated with inactive businesses.  The best way to find these will be to determine the date of the last review.  We also double-checked that all business locations in our directory that came from members who purchased Gold Membership packages are being tracked.

  4. Develop and Run Software to Retrieve and Process Composite Review Ratings – Complete

Once developed, this code will retrieve the detailed review counts from the three sources and perform the calculations to provide a more accurate average review value.  In addition, we will be looking into whether we can determine when the last review was submitted.  Businesses that have not had a recent review may have temporarily or permanently closed their business.

Update 6/10 – We have developed and are running the software now.  We are also spot-checking some results along the way.

  5. Additional Checks for Completeness – Complete

Although we are making every effort to build a complete data set, additional checks may be needed.  More detailed or different queries may be required to find review links that the simple queries missed.  The Russian-language Google queries described above illustrate that alternate ways of querying these sources may be needed.  In some cases we found that querying with the business URL worked better than using the business name.  The difficulty here is that we won’t know for sure whether a listing exists for a given business.

Update 6/15: We have determined that the existing queries are adequate.

  6. Review Statistics & Choose Cut-Off Values – Complete

Once the software has run, we will spot-check the data for accuracy.  We will also look at the data to determine an appropriate cutoff for the number of reviews.  The intent is to choose a value where the data shows little or no correlation between the number of reviews and the average review value.

Update 6/15: We have examined the data and selected a cutoff of 125 reviews.  Values greater than 125 did reduce R^2, but the reduction was insignificant.  We want to choose a cutoff that does not prevent escape rooms in lower-population areas from competing in this ranking.  We have also determined that locations whose most recent review is more than 90 days old will be screened from the ranking.
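To illustrate the kind of check described above, the Python sketch below computes R^2 between review count and average rating for the locations that survive each candidate cutoff.  It assumes R^2 is the squared Pearson correlation and that the cutoff is inclusive; the function names and the sample numbers are made up for illustration and are not our actual data.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    if n < 2:
        return 0.0
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # no variation means no measurable correlation
    return (sxy * sxy) / (sxx * syy)


def r_squared_by_cutoff(locations, cutoffs):
    """locations: list of (review_count, average_rating) pairs.

    Returns {cutoff: R^2 over locations with review_count >= cutoff}.
    """
    result = {}
    for cutoff in cutoffs:
        kept = [(n, avg) for n, avg in locations if n >= cutoff]
        result[cutoff] = r_squared([n for n, _ in kept], [a for _, a in kept])
    return result


# Made-up sample data, for illustration only.
sample = [(10, 4.0), (20, 4.5), (200, 5.0), (300, 5.0), (400, 5.0)]
print(r_squared_by_cutoff(sample, [0, 125, 150]))
```

A cutoff is a reasonable candidate once raising it further no longer lowers R^2 by a meaningful amount.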

Update 6/23: After more careful consideration, we are using a cutoff of 150 reviews.
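Taken together, the two screening rules (at least 150 reviews, and a most recent review no more than 90 days old) can be sketched in Python as follows.  This is only an illustration: the function and constant names are hypothetical, and treating a location with exactly 150 reviews as eligible is an assumption.

```python
from datetime import date, timedelta

MIN_REVIEWS = 150            # review-count cutoff (inclusive here, by assumption)
MAX_DAYS_SINCE_REVIEW = 90   # locations with older last reviews are screened out


def is_eligible(total_reviews, last_review_date, as_of):
    """Return True if a location passes both screening rules on the date `as_of`."""
    recent = (as_of - last_review_date) <= timedelta(days=MAX_DAYS_SINCE_REVIEW)
    return total_reviews >= MIN_REVIEWS and recent


print(is_eligible(150, date(2019, 4, 1), date(2019, 6, 30)))   # last review exactly 90 days old
print(is_eligible(500, date(2019, 3, 31), date(2019, 6, 30)))  # last review 91 days old
```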

  7. Re-run & Provide Results – Complete

Depending on what the last two steps require, the software may need to be run again.  Once completed, we will format and post the data.

Update 6/23: We completed the spot-checks and are re-running this morning.  Then we will start formatting the tables for posting.

Update 6/28: We have posted the results.  If you compare the results against the previous ranking, you will see a significant change.  We believe this occurred due to one or more of the following: 1) The new process removed “rounding errors” from the previous weighted-average results, 2) More escape room businesses are now being tracked, and 3) Changes based on new reviews since the previous ranking.  Some businesses moved down (or off) the list, and some moved up.  For those that moved down, it’s important to note that the movement was most likely due to items 1 and 2 and is not an indication of poor performance.

As we update the ranking going forward, we will post each business’s rank from the previous ranking.  That way we can see the relative performance of a business (movement up or down the ranking) as compared to other businesses.
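A small Python sketch of how rank movement between two rankings could be computed is shown below.  The function name and the business names are hypothetical, and businesses that are new to the ranking are simply reported as `None`.

```python
def rank_changes(previous, current):
    """previous/current: dicts mapping business name -> rank (1 is best).

    Returns name -> places moved (positive = moved up, negative = moved down,
    None = not in the previous ranking).
    """
    changes = {}
    for name, rank in current.items():
        if name in previous:
            changes[name] = previous[name] - rank
        else:
            changes[name] = None
    return changes


# Made-up names and ranks, for illustration only.
previous = {"Room A": 1, "Room B": 2, "Room C": 3}
current = {"Room B": 1, "Room A": 2, "Room D": 3}
print(rank_changes(previous, current))
```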

Ranking Process

The ranking process is very simple, though pulling the data for 9,000 business locations takes time!  Every review counts.  Each 5-star review is worth 5 points, each 4-star review is worth 4 points, and so on.  Then it’s just the total points across all three sources divided by the total number of reviews.

Here is an example for Company “X” that gives an average of 4.970:

Company X      5-star  4-star  3-star  2-star  1-star  Total pts  # Reviews  Average
Google:           500       6       3       2       1       2538        512    4.957
Yelp:              10       0       0       1       0         52         11    4.727
TripAdvisor:      350       1       0       0       0       1754        351    4.997
Total pts:       4300      28       9       6       1       4344        874    4.970
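The same calculation can be written as a short Python sketch.  The function names and data layout are illustrative rather than our actual software; the star counts are the Company “X” numbers from the table above.

```python
def total_points(star_counts):
    """star_counts maps a star value (1-5) to the number of reviews with that value."""
    return sum(stars * count for stars, count in star_counts.items())


def composite_average(sources):
    """Pool every review from every source: total points / total reviews."""
    points = sum(total_points(counts) for counts in sources.values())
    reviews = sum(sum(counts.values()) for counts in sources.values())
    return points / reviews if reviews else None


company_x = {
    "Google": {5: 500, 4: 6, 3: 3, 2: 2, 1: 1},
    "Yelp": {5: 10, 4: 0, 3: 0, 2: 1, 1: 0},
    "TripAdvisor": {5: 350, 1: 0, 4: 1, 3: 0, 2: 0},
}

for name, counts in company_x.items():
    print(name, total_points(counts), sum(counts.values()))

print(f"{composite_average(company_x):.3f}")  # 4344 / 874 -> 4.970
```

Because every review is pooled before dividing, a source with few reviews (Yelp here) has little influence on the composite average.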

Future Enhancements

As always, we appreciate your feedback.  We hope to run the update monthly.  We are also looking into tracking the monthly results to show the change in review average over time.

One challenge we see is becoming aware of new escape room businesses.  If you know of any, please send the name and location of the business to contact@escapetheroomz.com.

Please Support Our Ranking Efforts

As you can see, we have worked hard and spent many hours developing and improving the ranking software to track ~9,000 businesses.  Please consider donating to our ETR Ranking Effort!  This will help us pay the bills while we continue to focus on promoting escape room businesses and providing this information to players.  We want to continue updating the World and US rankings monthly, and we need your support to do this!  Please donate using the button below.

Jeff
August 21, 2019

We have updated the software for better detection of businesses that have permanently closed.