@Hitch - you've no need to worry, they're not commercial books. They're documents from .gov, .org, .edu, some media sites, etc - was going to say they're factual rather than fictional, but they do contain plenty of porkies.
Workload is shared between 5 of the 20-odd 'someones'. No covers, minimal front matter, mandatory DC metadata only, etc. Think Henry Ford's Model T line.
We push the text into algorithms that do analysis for research purposes - sort of big-data, but textual rather than numerical. Some people read the 'books', particularly the outliers. An awful lot of this stuff is produced in grain and missile silos - both being excellent echo chambers.
We've accumulated 70K+ 'books' over about 8 years. Average length is 30K words.
PS: goes without saying that we don't have to deal with the walled garden behemoths, and other purveyors of shackles, chains, stocks and A-frames.
BR
Last edited by BetterRed; 12-28-2015 at 03:02 AM.