Visualizing Video Games Throughout History

June 10, 2015

Call of Duty :: Frank Underwood // Frogger :: George Costanza // Shinobi :: Wu-Tang Clan // Super Nintendo & Sega Genesis :: ???

Video games have long infiltrated popular culture, and this article attempts to explain that reach. Inspired by FiveThirtyEight's great article "Designing the Best Board Game on the Planet", I set out to extend its methodology to video games. 538's research drew on a robust data set pulled from BoardGameGeek, an exhaustive collection of board games contributed by its passionate users. I looked for a similar user-contributed dataset and settled on the MobyGames video games database.

Created in 1999 by Jim Leonard to catalogue titles and connect games to their developers and fans, the database relies on its users to contribute ratings, images, and other details. Though management changes have led to some contributor vitriol in recent years, the site is on stable footing and recently surpassed 50,000 unique games.

Method

Using the R package ‘rvest’, I scraped the site’s complete game list from its HTML table, then used each game’s URL to scrape additional information from its profile page. To avoid issues with duplicate game titles, the URL served as a unique reference ID for each entry. For example, 007: The World is Not Enough has three separate pages, each containing details for the systems on which it was released. Each game’s profile follows a similar format, which made scraping effective, but some genres of games used atypical formatting and required a good deal of data cleansing. The guideline that 80% of the work with data involves cleaning is no exception here. We’re left with a database of 50,021 games. Check out the code I used to pull and clean the data on my GitHub page.
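The original scraper was written in R with rvest and lives in the author's repository. The following is only a minimal Python sketch of the keying idea described above: collect links from a game list and store them by URL, so that identically titled entries stay distinct. The HTML snippet, the URL paths, and the names `GameLinkParser` and `parse_game_list` are invented for illustration and do not reflect MobyGames' actual markup.

```python
from html.parser import HTMLParser

# Hypothetical markup, invented for illustration (not MobyGames' real HTML).
SAMPLE_HTML = """
<table>
  <tr><td><a href="/game/n64/007-the-world-is-not-enough">007: The World is Not Enough</a></td></tr>
  <tr><td><a href="/game/playstation/007-the-world-is-not-enough">007: The World is Not Enough</a></td></tr>
  <tr><td><a href="/game/gameboy-color/007-the-world-is-not-enough">007: The World is Not Enough</a></td></tr>
</table>
"""


class GameLinkParser(HTMLParser):
    """Collect a {url: title} mapping from every anchor tag in the page."""

    def __init__(self):
        super().__init__()
        self.games = {}
        self._href = None   # href of the anchor currently being read
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # The URL is the key, so duplicate titles never collide.
            self.games[self._href] = "".join(self._text).strip()
            self._href = None


def parse_game_list(html):
    """Return a dict mapping each game URL to its title."""
    parser = GameLinkParser()
    parser.feed(html)
    return parser.games


games = parse_game_list(SAMPLE_HTML)
```

Keying on the title instead would collapse these three rows into one, which is the problem the URL-based ID avoids.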

Let’s dive into what the numbers tell us.

Ratings

MobyGames aggregates two sources to formulate its video game ratings: users and critics. User ratings are on a 0 to 5 scale and reflect each individual's subjective rating of a particular game. Critic ratings come from reviews in magazines, entertainment websites, and other online video game forums. A weighted average of these individual critic scores forms the site’s “MobyRank” score. For this article I gave equal importance to user and critic ratings, simply averaging the two scores into one overall rating for each game. Of the 50,000+ games, a total of 9,224 had either user or critic ratings, spanning 1976 to 2015. In the scatterplot below you’ll see the combined MobyGames ratings by date of original publication.
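As a rough illustration of that averaging step, here is a small Python sketch. It rests on two assumptions the article does not spell out: that the 0 to 5 user score is rescaled to 0 to 100 (multiplied by 20) before averaging, and that a game with only one of the two scores simply keeps that score. The function name `combined_rating` is hypothetical.

```python
def combined_rating(user_score=None, critic_score=None):
    """Return a 0-100 rating, or None when neither source rated the game.

    user_score:   0-5 user rating, or None if missing
    critic_score: 0-100 critic score, or None if missing
    """
    scores = []
    if user_score is not None:
        # Assumption: rescale the 0-5 user scale to 0-100 before averaging.
        scores.append(user_score * 20.0)
    if critic_score is not None:
        scores.append(float(critic_score))
    if not scores:
        return None
    # Equal weight for whichever sources are present (assumption for the
    # single-source case).
    return sum(scores) / len(scores)
```

For example, a game with a user score of 4.5 and a critic score of 95 would come out at 92.5 under these assumptions.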

That looks pretty sharp, but in the spirit of video games, let's make it a bit more...8-bit.

As you can see, a video game's reception has changed very little by year of release, with only a mild increase in ratings since video games came into prominence. This is an interesting contrast with 538's article on board games, which described the possibility that recent years might be considered "the golden age of serious board gaming." I'd hypothesize that this effect is due to the platforms on which these entertainment options are consumed. Board games are easily compared; Settlers of Catan and Monopoly use the same technology, though one is deemed higher quality since its final product makes more innovative use of similar tools. Video games, by contrast, are shaped by advances in technology. You can't objectively compare Pong! to Mario Tennis; you have to consider the technology of the day. Keeping that in mind, below you'll see the top 11 games by average user/critic score:

| Title | Year | Publisher | Genre | Platform | Rating |
|---|---|---|---|---|---|
| Bayonetta 2 | 2014 | Nintendo of America Inc. | Action | Wii U | 94.5 |
| Metroid Prime Trilogy | 2009 | Nintendo of America Inc., Nintendo of Europe GmbH | Action | Wii, Wii U | 94.5 |
| Super Mario Galaxy 2 | 2010 | Nintendo Co., Ltd., Nintendo of America Inc., Nintendo of Europe GmbH | Action | Wii, Wii U | 94.5 |
| ESPN NFL 2K5 | 2004 | Global Star Software Inc., SEGA Europe Ltd., SEGA of America, Inc. | Sports | PlayStation 2, Xbox | 93.75 |
| World Series Baseball 2K3 | 2003 | SEGA of America, Inc. | Sports | PlayStation 2, Xbox | 93.75 |
| Final Fantasy X \| X-2 HD Remaster | 2013 | Square Enix Co., Ltd., Square Enix, Inc. | Role-Playing (RPG) | PlayStation 3, PS Vita | 93.5 |
| Grand Theft Auto V | 2013 | Rockstar Games, Inc. | Action, Racing / Driving | PlayStation 3, PlayStation 4, Windows, Xbox 360, Xbox One | 93.5 |
| Sins of a Solar Empire: Trinity | 2010 | 1C-SoftClub, rondomedia Marketing & Vertriebs GmbH, Snowball Studios, Stardock Entertainment, Inc. | Strategy | Windows | 93.5 |
| Super Mario Bros. | 1985 | Nintendo Co., Ltd., Nintendo of America Inc., Nintendo of Europe GmbH | Action | Arcade, Game Boy Advance, NES, Nintendo 3DS, Wii, Wii U | 93.33 |
| RalliSport Challenge 2 | 2004 | Microsoft Game Studios | Racing / Driving | Xbox | 93 |
| Vampire Chronicle for Matching Service | 2000 | Capcom Co., Ltd. | Action | Dreamcast | 93 |

Though the earlier graph showed that game ratings haven't changed much over the last 40 years, 10 of the top 11 games were released within the past 15 years. This may indicate a recency bias, publishers' improved understanding of the qualities that make games more highly regarded, or some other factor. Many of these top games offer so-called "replay value", as with RPGs, infinite universes, or sports franchises.

Themes

MobyGames uses 43 themes to delineate aspects of games, which are detailed in its online glossary. While some games have multiple themes, others have no listed theme; it’s unclear whether this is due to user-contribution error or the unique qualities of some games. Super Mario Bros. 3, one of the highest-regarded games ever created, doesn’t list a theme, and it doesn’t easily fall into any of the 43 listed themes either. Arcade? Puzzle-solving? Probably neither.

Of the 38,615 games listing at least one theme, the image below shows the share of games carrying each theme, expressed as a decimal fraction.

I was also curious which themes were most likely to occur together. Read the following interactive like this: "If I were to randomly select from the pool of Arcade games, what is the likelihood that it also has a _____ theme?" Click the Themes on the left side to reorganize the heatmap, and hover over the cells for further details.

These two graphics make it evident that Puzzle-Solving, Shooter, Sci-Fi, and Arcade themes occur at a higher rate than other themes.

Before moving on, here are a few interesting observations, from most to least sensible:

  • 68% of Adult-themed games have an Anime/Manga theme, while only 20% of Anime/Manga-themed games are Adult-themed.
  • Only 46% of Helicopter games are Flight-themed. How could that be?!? What are you doing in the other 54% of Helicopter games?
  • 3% of Rhythm/Music games are Shooter-themed. No word to report on the Samba de Amigo/Halo crossover though.
  • There is one Adult-themed Tank game. What could this possibly entail, you ask? Do yourself a favor and read the game description.
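The heatmap's question ("given one theme, how likely is another?") is a conditional proportion, which is why the first observation above is asymmetric. A minimal Python sketch of that calculation follows; it is not the author's R code, and the function name `theme_cooccurrence` and the toy data are made up purely to show the arithmetic.

```python
from collections import Counter
from itertools import permutations


def theme_cooccurrence(games):
    """Map (given, other) -> share of `given`-themed games that also list `other`.

    `games` is an iterable of theme collections, one per game.
    """
    theme_counts = Counter()  # number of games listing each theme
    pair_counts = Counter()   # number of games listing both themes of an ordered pair
    for themes in games:
        unique = set(themes)
        theme_counts.update(unique)
        pair_counts.update(permutations(unique, 2))
    return {
        (given, other): count / theme_counts[given]
        for (given, other), count in pair_counts.items()
    }


# Made-up toy data: 3 Adult games, 5 Anime/Manga games, 2 games with both.
toy_games = [
    {"Adult", "Anime/Manga"},
    {"Adult", "Anime/Manga"},
    {"Adult", "Tank"},
    {"Anime/Manga", "Sci-Fi"},
    {"Anime/Manga"},
    {"Anime/Manga"},
]

shares = theme_cooccurrence(toy_games)
```

In the toy data, two of the three Adult games are also Anime/Manga, while only two of the five Anime/Manga games are Adult, so the two directions of the same pair give different shares because the denominators differ.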

Genres

MobyGames uses 8 genres to categorize game types. The vast majority of games have at least one genre, and many have several. Shovelware games, which are packages of dissimilar games (think The 1000 Best Windows 95 Games Bundle), account for many of the titles missing a genre.

It's a bit unexpected that Action games have such a dominating share of the market. This is likely due to the genre's emphasis on reflex-based gameplay, which is a component of most of the other genres, especially racing and sports games.

As with the theme interactive, I was curious which themes were most likely to occur in a given genre of game. Read the following interactive like this: "If I were to randomly select from the pool of Educational games, what is the likelihood that it has a _____ theme?" Click the Genres on the left side to reorganize the heatmap, and hover over the cells for further details.

The data set is quite large and there is still much to be explored, so stay tuned to the site for additional posts!


Have feedback, questions, or want to see something else added? Check out the code I used to create this page or fork my repository to propose changes.