
Amazon’s Top 50 Bestselling Books Data Analysis

This post looks at Amazon’s Top 50 bestselling books from 2009 to 2019, along with some simple facts about them. The diverse range of authors, books, reviews, and prices makes this dataset intriguing. We wanted to analyze this data and display it in the visualizations below.

In the first visualization below, we look at the title of each book and the number of reviews it received. We wanted to start by getting a sense of how popular each book was, based on its review count. The review counts vary widely: the lowest is 37 reviews for “Divine Soul Mind Body Healing and Transmission System: The Divine Way to Heal You, Humanity, Mother Earth, and All…” and the highest is 87,841 for “Where the Crawdads Sing.”
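As a rough sketch of how the lowest and highest review counts could be pulled out of the data, the snippet below uses Python's built-in min and max. The two review counts quoted above are real; the middle row and the field names `title` and `reviews` are placeholders we made up for illustration, not values from the dataset.

```python
# Illustrative rows only. The field names and the placeholder row are
# assumptions; the 37 and 87,841 counts are the ones quoted in the post.
books = [
    {"title": "Divine Soul Mind Body Healing and Transmission System", "reviews": 37},
    {"title": "Placeholder Book (hypothetical)", "reviews": 5000},
    {"title": "Where the Crawdads Sing", "reviews": 87841},
]


def review_extremes(rows):
    """Return the rows with the fewest and the most reviews."""
    fewest = min(rows, key=lambda row: row["reviews"])
    most = max(rows, key=lambda row: row["reviews"])
    return fewest, most


fewest, most = review_extremes(books)
print(f"Fewest reviews: {fewest['title']} ({fewest['reviews']:,})")
print(f"Most reviews: {most['title']} ({most['reviews']:,})")
```

With the full dataset loaded into the same shape, the same two calls would return the least and most reviewed books.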

For the next visualization, we chose to analyze the top 10 of Amazon’s bestselling books from 2009 to 2019 and present them in a scatter plot. The x-axis shows the number of reviews and the y-axis shows the user rating. If you click on one of the squares in the visualization, you can see the book’s name, number of reviews, and rating. The visualization also includes a trend line for the top 10 books.
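For readers curious how a trend line like this can be computed, here is a minimal sketch that assumes a straight-line, ordinary least-squares fit of rating against review count. We do not know which method the visualization tool actually uses, and the function name `fit_trend_line` and the sample points are hypothetical, not taken from the dataset.

```python
def fit_trend_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two points and equal-length inputs")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Spread of x around its mean; zero means every x is identical.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical (reviews, rating) pairs, invented for illustration only.
reviews = [1000, 2000, 3000, 4000]
ratings = [4.4, 4.5, 4.6, 4.7]
slope, intercept = fit_trend_line(reviews, ratings)
print(f"rating = {slope:.6f} * reviews + {intercept:.3f}")
```

A positive slope would suggest that books with more reviews tend to have higher ratings; a slope near zero would suggest little linear relationship.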

For this visualization, we wanted to see whether genre and user rating were related, based on the top 30 bestselling Amazon books. Every book, whether fiction or non-fiction, had a rating between 4.4 and 4.8, with the exception of “Allegiant,” which had a rating of 3.9. This suggests that user rating and genre do not have any direct correlation. We also wanted to see how many of the top 30 bestselling Amazon books were fiction and how many were non-fiction. There are 12 fiction books (shown in purple) with an average rating of 4.5, and 18 non-fiction books (shown in blue) with an average rating of 4.6.
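The genre counts and average ratings can be reproduced with a simple group-and-average step. The sketch below is one way to do it in plain Python. Only the “Allegiant” rating comes from this post; the other rows, the field names, and the function name `summarize_by_genre` are placeholders we assumed for illustration.

```python
from collections import defaultdict

# Only Allegiant's 3.9 rating is from the post. Every other row is a
# hypothetical placeholder, and the field names are assumptions.
books = [
    {"title": "Allegiant", "genre": "Fiction", "rating": 3.9},
    {"title": "Placeholder Novel (hypothetical)", "genre": "Fiction", "rating": 4.7},
    {"title": "Placeholder Memoir (hypothetical)", "genre": "Non Fiction", "rating": 4.6},
    {"title": "Placeholder Guide (hypothetical)", "genre": "Non Fiction", "rating": 4.8},
]


def summarize_by_genre(rows):
    """Map each genre to (number of books, average rating rounded to 2 places)."""
    ratings = defaultdict(list)
    for row in rows:
        ratings[row["genre"]].append(row["rating"])
    return {
        genre: (len(values), round(sum(values) / len(values), 2))
        for genre, values in ratings.items()
    }


for genre, (count, average) in summarize_by_genre(books).items():
    print(f"{genre}: {count} books, average rating {average}")
```

Run on the real top 30 rows, this kind of summary would give the per-genre counts and averages reported above.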

Next, we have a simpler scatter plot. The authors’ names are displayed on the x-axis and the user ratings on the y-axis. We liked how the dots spread out, which lets you see the data a little more clearly. One thing to note is that no book received a perfect 5 user rating, which can be accounted for by the number of reviews each book received. Some of the most popular authors even got a fairly average rating compared to their peers. One of the lower ratings that surprised us was J.K. Rowling’s: she got a 3.5 rating on one of her books.

Below is a gauge visualization that displays the author and the price for the top 20 bestselling Amazon books between 2009 and 2019. Authors such as Stephen King, George R.R. Martin, and Jaycee Dugard had the highest prices for their books. One apparent trend is that the more popular the author, the higher the price of their books. The higher a book’s price, the further to the right the pointer goes. For some books in the dataset, the price was high enough that the pointer ended up underneath the gauge.

Below is another gauge visualization, this time displaying each author and their ratings. We found the gauge to be a simpler, cleaner, and more organized way of displaying each author’s rating.

Our last visualization is a violin plot of the top 30 bestselling Amazon books from 2009 to 2019. It splits the top 30 books into two genres, fiction and non-fiction, and shows the average rating for each book along with the total number of reviews it received. If you interact with the violin plot, you can see that the dots come in different colors. The color of each dot is based on the book’s rating. For example, under the fiction genre, “Allegiant” has a 3.9 rating, so its dot is a lighter color, whereas a book like “A Gentleman in Moscow” has a rating of 4.7, so its dot is a darker purple.

In conclusion, we chose this dataset because of its topic, Amazon’s Top 50 Bestselling Books from 2009-2019, and because of how much data there was to analyze. Leah and I also genuinely enjoyed and related to the data, since between us we have read a few of the books involved. Something we found interesting, and now understand better, is that none of the books received a perfect 5 rating. A handful came close with a 4.8 or 4.9, but those were the highest ratings. We considered this normal, given that these books made the list of the top 50 bestselling books on Amazon and given the number of reviews some of them had.

Thank you for reading my blog post. I hope you enjoyed learning more about Amazon’s Top 50 Bestselling Books from 2009-2019, analyzed by Leah Themistokleous and me!
