Quote:
Originally Posted by BetterRed
@Hitch - you've no need to worry, they're not commercial books. They're documents from .gov, .org, .edu, some media sites, etc - was going to say they're factual rather than fictional, but they do contain plenty of porkies.
Workload is shared between 5 of the 20-odd 'someones'. No covers, minimal front matter, mandatory DC metadata only, etc. Think Henry Ford's Model T line.
We push the text into algorithms that do analysis for research purposes - sort of big-data, but textual rather than numerical. Some people read the 'books', particularly the outliers. An awful lot of this stuff is produced in grain and missile silos - both being excellent echo chambers.
We've accumulated ~70+K 'books' over about 8 years. Average length is 30K words.
PS: goes without saying that we don't have to deal with the walled garden behemoths, and other purveyors of shackles, chains, stocks and A-frames.
BR
Hi, Red:
LOL, I should have been clearer. I knew you'd done some .gov stuff, but I was actually delicately (too delicately, it seems) inquiring if you were running/part of a commercial firm--given that type of capacity.
Sorry--I should have PM'ed you or made my query clearer.
It's still very interesting to me. From a lot of aspects.
Hitch