Quote:
Originally Posted by screwballl
My best explanation for the less technical people: it is the equivalent of using a scanner into photoshop that scans each page with OCR to convert it to text, then use batch scripts to convert the entire deal into a single text file, then converted into whatever format by calibre. At a commercial level, it is much more simplistic than this, the higher end commercial scanners can convert the entire book into a single project with their software as it scans, and then converted once it is done.
From start to finish, the cost per ebook if converted from paper, is around $3-5 per book.
|
I'm not a coder; I don't write programs. But I have been dealing professionally with various OCR programs and print-to-etext conversion for over a decade. You vastly overrate the abilities of OCR programs.
They're good. They're not that good. Google's epubs are top of the line for what you can get from automated OCR--and they are riddled with typos, especially on title pages and chapter headings, which often have special fonts. And no automatic OCR program can deal with the weird names & other vocab in science fiction & paranormal stories.
Automated OCR has problems with the ends of physical pages; the program is guessing whether that's a paragraph break or not. Often, it guesses wrong. (Well, not quite true. Often, it assumes that a page break is a paragraph break, because it's got no way of knowing otherwise.)
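To make the page-break problem concrete, here's a rough sketch in Python of the kind of guess a program has to make when it stitches pages together. The function names and the rule itself (sentence-ending punctuation at the bottom of one page, a capital letter or opening quote at the top of the next) are my own illustrative assumptions, not how any particular OCR package works.

```python
import re

# Sentence-ending punctuation, optionally followed by closing quotes or brackets.
SENTENCE_END = re.compile(r"[.!?][\"')\]]*$")
# Characters that may open a new sentence besides a capital letter (assumed list).
OPENING_QUOTES = "\"'\u201c\u2018"


def looks_like_paragraph_break(prev_page, next_page):
    """Guess whether the break between two pages is also a paragraph break."""
    prev_tail = prev_page.rstrip()
    next_head = next_page.lstrip()
    if not prev_tail or not next_head:
        return True
    ends_sentence = bool(SENTENCE_END.search(prev_tail))
    starts_fresh = next_head[0].isupper() or next_head[0] in OPENING_QUOTES
    return ends_sentence and starts_fresh


def join_pages(pages):
    """Join page texts, inserting a blank line only where a paragraph break is guessed."""
    pages = [p.strip() for p in pages if p.strip()]
    if not pages:
        return ""
    out = [pages[0]]
    for prev, nxt in zip(pages, pages[1:]):
        out.append("\n\n" if looks_like_paragraph_break(prev, nxt) else " ")
        out.append(nxt)
    return "".join(out)
```

Note that this still guesses wrong whenever a sentence happens to end at the bottom of a page in the middle of a paragraph, which is exactly why someone has to check the result by hand.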
Quote:
For newer books in the past 15 years, if the original book or author submitted it via digital format like DOC, TXT, RTF, PDF or something from a computer, it would cost less than 25 cents per ebook.
|
Call it the last five years; books older than that probably weren't saved. (Five years ago, a 200GB drive was *expensive;* businesses that weren't in the data retention business purged everything they could, every few months.) The submitted version is not the final ready-for-print version, which could be in InDesign, Quark, PDF, PageMaker, or some other program. There's no standardization across the industry, and often the ready-for-print files contain atrociously useless metadata--sometimes the ISBN is in there, but more often, the document title is "Prisoneroftruth_draft3.doc Microsoft Word" and the author is "admin," or the first name of whichever person did the conversion to the print-ready program.
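For what it's worth, the obvious cases of junk metadata can be flagged automatically. Here's a minimal Python sketch; the list of generic author names and the file-name patterns are assumptions I made up for illustration, and it can't catch the case where the "author" is just the first name of whoever did the conversion.

```python
import re

# Assumed list of account names that are unlikely to be a real author.
GENERIC_AUTHORS = {"", "admin", "administrator", "user", "owner"}

# Assumed hints that a title is really a leftover file name.
FILENAME_HINT = re.compile(
    r"\.(docx?|rtf|txt|pdf|indd|qxd)\b|_draft\d*|microsoft word",
    re.IGNORECASE,
)


def metadata_problems(title, author):
    """Return a list of human-readable warnings for a title/author pair."""
    problems = []
    if not title.strip():
        problems.append("title is empty")
    elif FILENAME_HINT.search(title):
        problems.append("title looks like a file name")
    if author.strip().lower() in GENERIC_AUTHORS:
        problems.append("author is a generic account name")
    return problems
```

A check like this only tells you which files need a human to look up the real title and author; it doesn't fix anything by itself.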
The text may have been fine-tuned for printing in ways that won't allow easy conversion; if styles weren't used, formatting could be lost on export, and any odd characters might've been manually placed instead of being part of the in-box text.
If they have them in nice sharp print-ready PDFs, converting out of that may wind up putting a hard return at the end of every line of text, depending on what program made the PDFs.
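Cleaning up those hard returns can be partly scripted. The Python sketch below assumes the extracted text still has a blank line between paragraphs, which is not guaranteed, and it rejoins words hyphenated across a line break when the next line starts with a lowercase letter. That second rule is a guess that will occasionally eat a hyphen that was supposed to stay (a compound word split at its hyphen, say). The function name is hypothetical.

```python
import re


def unwrap_hard_returns(text):
    """Merge hard-wrapped lines; blank lines are treated as paragraph breaks."""
    paragraphs = re.split(r"\n\s*\n", text.strip())
    unwrapped = []
    for para in paragraphs:
        lines = [line.strip() for line in para.splitlines() if line.strip()]
        merged = ""
        for line in lines:
            if not merged:
                merged = line
            elif merged.endswith("-") and line[:1].islower():
                # Assume a word was hyphenated at the line end and rejoin it.
                merged = merged[:-1] + line
            else:
                merged += " " + line
        unwrapped.append(merged)
    return "\n\n".join(unwrapped)
```

If the PDF export didn't leave blank lines between paragraphs, this collapses the whole book into one paragraph, so the output still needs checking.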
Converting from digital files to ebook formats could be done, but each book would need some manual checking: metadata, basic formatting, chapter headers, copyright page formatting. If it had footnotes, those are likely to be a nightmare; some ebook formats (*cough* epub) have no footnote support.
Quote:
Of course some publishers or authors are requiring some ungodly amount of royalties per copy which is why some ebooks cost $30, and others cost $3.
|
They range from $3-30 because some authors are self-published and have no overhead charges (or have decided they can deal with $0.20 profit per book and hope to sell enough of them to make the year they spent writing it worthwhile); publisher-set prices are all over the place because some publishers think that every ebook sold is a lost hardcover sale and they've priced for that. And that's for fiction & pop nonfic; textbooks cost more because they've got a smaller market and cost much more to produce.
Quote:
1) Resistance to the ebook formats by various people in the publishing industry. Using scare tactics and generalities to scare people into continuing to purchase paper based books.
|
Agreed, yes. Although I'm not sure that "scare tactics" is the right term; it seems they're more trying to say "weird new expensive tech; experiment if you like, but don't forget about what you're familiar with, what you know works."
Quote:
2) They use cherry picked data and numbers to make it appear that they are making little to no money on any ebooks.
|
Oh yes, definitely. They're very cagey with any real data. So's Amazon, releasing comparative amounts ("sold more ebooks than new hardcovers!") without saying what prices were involved, and without mentioning numbers.
I'm not particularly concerned about publishers hiding data about ebook sales, because I know the market is such a tangled mess that it really wouldn't matter if we had accurate numbers.