Originally Posted by hardcastle
My remaining question for the article is that most of the data comes from the top 7,000 best sellers. How hard is it for an author to get up in that bracket (for each category)? How much noise gets introduced into the data if you start increasing that number? Is it even possible?
Actually, on any given day some subcategories can be topped with single-digit sales, while others might need thousands. Amazon slices and dices its categories to help buyers find books, so there are lots of categories to slot books into.
This data set is a single time-slice snapshot.
Further slices will provide more snapshots and make time-based analysis possible.
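For anyone wondering what time-based analysis across snapshots might look like, here is a rough sketch. Everything in it is an assumption for illustration: the titles, dates, and rank numbers are made up, and the `rank_history` and `rank_change` helpers are hypothetical names, not anything from the actual data set or the article's method.

```python
from collections import defaultdict

# Hypothetical snapshots: snapshot date -> {title: sales rank}.
# All titles, dates, and ranks are invented for illustration only.
snapshots = {
    "2000-01-01": {"Book A": 120, "Book B": 6500},
    "2000-01-08": {"Book A": 95, "Book B": 7400, "Book C": 300},
}


def rank_history(snapshots):
    """Collect each title's (date, rank) pairs across snapshots, in date order."""
    history = defaultdict(list)
    for date in sorted(snapshots):  # ISO dates sort correctly as strings
        for title, rank in snapshots[date].items():
            history[title].append((date, rank))
    return dict(history)


def rank_change(history):
    """Last rank minus first rank for titles seen in at least two snapshots.

    A negative number means the title climbed (a lower rank is better).
    """
    return {
        title: points[-1][1] - points[0][1]
        for title, points in history.items()
        if len(points) >= 2
    }


history = rank_history(snapshots)
print(rank_change(history))
```

A title that shows up in only one slice (like "Book C" here) is left out of the change calculation, which is one reason a single snapshot can't say anything about movement in or out of the bracket.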