12-11-2013, 02:41 AM | #16 | ||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
I now just get outraged when books are over $30! I can't imagine myself paying hundreds of dollars for any books any more. Especially with a lot of the technical fields I am interested in (programming, math, economics, physics), you can find perfectly good material FOR FREE. If I ever did go purchase a physical book on the topic, there is sure as hell no way I would go for the latest/"greatest" edition.
Quote:
In many cases, you cannot get a good quality scan! Either they paid a crappy scanning company to scan the book (as you can see, crappy/cheap solutions bring headaches later), the book itself is so old that it is degraded (water stains), the book is rare (so this is the only copy that you have), or someone wrote in the book (this one makes me want to pull my hair out! NEVER WRITE IN YOUR BOOKS OR YOU WILL SUFFER MY WRATH!).
For example, in one of the most egregious cases, ~50 out of 576 pages were marked BADLY, and a few more were lightly marked (I was able to fix those before OCR). It doesn't matter what amazing PDF reader you are using on your tablet, there is no way you can get that scan to look as good as that EPUB.
But yes, having a great scan goes a long way toward speeding up the OCR process and making it more accurate. It can chop a process that would take me a few hours down to less than an hour (this is with me double-checking the areas marked as "unsure" by the OCR).
Well, most of their stuff is in the "not great scan" category (mostly because the books are so old). They run it through OCR with no human intervention (I believe they use the FineReader engine (?)), and while it is "99.8%" accurate (or something like that), there are still a bunch of errors (which is why you pay for a human to look through it and fix it).
Last edited by Tex2002ans; 12-11-2013 at 02:55 AM.
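To put the "99.8%" figure in perspective, here is a rough back-of-the-envelope sketch. It assumes the accuracy is measured per character and that a page holds about 250 words of roughly 6 characters each (spaces included). Both are illustrative assumptions, not numbers from any OCR vendor; only the 576-page count and the quoted accuracy come from the post above.

```python
# Rough estimate of residual OCR errors. Every constant except the page count
# and the quoted accuracy is an assumption made purely for illustration.
PAGES = 576            # page count of the book mentioned in the post
WORDS_PER_PAGE = 250   # assumed typical page
CHARS_PER_WORD = 6     # assumed average, counting the trailing space
ACCURACY = 0.998       # the "99.8%" figure, read as per-character accuracy


def expected_errors(pages: int, accuracy: float) -> int:
    """Return the expected number of wrong characters in the whole book."""
    total_chars = pages * WORDS_PER_PAGE * CHARS_PER_WORD
    return round(total_chars * (1.0 - accuracy))


if __name__ == "__main__":
    errors = expected_errors(PAGES, ACCURACY)
    print(f"~{errors} wrong characters, about {errors / PAGES:.0f} per page")
```

Under those assumptions that is a few wrong characters on every page, which is why an unattended OCR pass still needs a human proofread.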
||
12-11-2013, 03:43 AM | #17 |
Bookmaker & Cat Slave
Posts: 11,462
Karma: 158448243
Join Date: Apr 2010
Location: Phoenix, AZ
Device: K2, iPad, KFire, PPW, Voyage, NookColor. 2 Droid, Oasis, Boox Note2
|
Hi, all:
There's a lot I'd love to reply to in this thread, but we're at "that time of year," and I'm so slammed I don't have time to breathe, much less write long posts. We had another 40 books walk in the door today that have to be done by the 20th...and on top of what's already here, all having arrived even later than usual this year, that's really pushing our buttons.
@Tex: I would never accept someone else's INDD file, and I don't know anyone, you excepted, it seems, who will. If I'm getting an INDD file from another designer, it means that s/he doesn't know how to make an ePUB from it, which means that 5 hours in, we're still going to be trying to figure out what "char-override 66" means in the CSS. Moreover, we almost never get all the INDD files when we do get a submission; the images are missing, the fonts are missing, you name it. It's always a mess, and it's always a file that's laid out in ways that aren't supported in ebooks. I simply gave up and won't take them any longer. Believe it or not, it's FASTER for us to OCR it with ABBYY and export it using our custom clips than it is to slog through 60, 70, 100 "character override styles" and figure out what the designer MEANT to say. Not to mention regexing everything into submission. Pah on that. WordPerfect? Sure. Fine.
But still, the point is, like the old joke, you can't get there from here. Even with MathML, you can't output the content (the equations) in any textual way that can be supported. Back to images, and thence we are no forrader, as they say.
Nobody in India is getting $0.50-$5.00 per page for a scanned book. Not for the scanning. They'll get closer to $0.50 per page for the completed book. That's in ePUB and MOBI formats, both. That pricing includes the scanning (if needed), OCR, A/B compare, HTML output, ePUB creation, and MOBI creation. It can get up to $1/page, but generally, that's where it tops out, and the Indians are now being underpriced by the Chinese, FWIW.
With regard to "PDFs" and how great they are: sure, on a massive tablet like the iPad, they're great, although I find trying to page through them really annoying no matter what reader I'm using. However, they are anything but great on smaller tablets, even the larger Kindle tablets or the Mini-Pad. There, they suck, because you are constantly pinch-zooming them and trying to read them and scrolling around, etc. So, it's different horses for different courses.
Believe me, we do a LOT of technical work (we did an 1800 page Medical Textbook that I often discuss with a lot of cursewords), and I'd be the first to agree that some things should stay in a print layout, to facilitate perfect vertical and horizontal alignment. Unfortunately, or fortunately, take your pick, many people, like you, want their books portable.
The only sellers for PDF are basically small bookstores online, Smashwords (and you can't even sell your original PDF there, mind you--it's a Calibre-conversion-created PDF), and your own websites. As many people who've sold from their own website will tell you, unless you're O'Reilly, that dog doesn't hunt. And speaking thereof: yeah, he offers multi-book format packages, and I don't think I know a soul who's bought one. Not a single person. They cost the earth.
I'm not opposed to using PDFs for technical books; I'm really not. And I turn business away all the time that walks in the door with a big, technical book that I do not think will convert well. Ditto some cookbooks, kids' books, etc. But that market's appetite is whetted for portable books that can be sold on larger retailers, like Amazon. As long as Amazon, B&N and iBooks won't sell PDFs, I just don't see that working, from a commercial standpoint.
And, lastly, making a print-layout PDF isn't a finger-snap. Even for plain fiction, it takes time to do correctly.
Doing a full-bore print layout for a highly technical book will cost a LOT of money, and the publisher has to feel that the result of that expenditure will be worth it. The average print layout house that will take that type of work (we don't, not for print) starts pricing at ~$5/page (250 words) and then goes UP from there, adding for each element (each formula, each equation), every blockquote, each pullquote, etc. When you start talking about 300 page texts, it can really add up. Hell, even CreateSpace, which is subsidized by Amazon and which can run at a loss, charges $679 to start a book with a "custom complex interior" and then adds $25/pop for each "table or chart." Start doing that math, add the cost of creating QUALITY ebooks on top of that, versus sales price, and royalty...and there you go. You're talking thousands in print layout costs--without even starting on the ebook versions. Publishing is a business, and the numbers have to make sense to the publishers.
That's all I have time for...I know I had a bunch of other things, but...like the Rabbit in AoW, I gotta go.
Hitch
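For anyone who wants to "start doing that math," here is a minimal sketch using only the prices quoted in the post above. The function names, the 300-page book, and the count of 40 tables are hypothetical, chosen for illustration. The per-element surcharges for equations, blockquotes, and pullquotes are not priced in the post, so they are left out, which makes the print-layout figure a floor rather than a real quote.

```python
# Prices quoted in the post; everything else here is an illustrative assumption.
PRINT_LAYOUT_PER_PAGE = 5.00    # "~$5/page" starting price, before per-element add-ons
CREATESPACE_BASE = 679.00       # "custom complex interior" starting fee
CREATESPACE_PER_TABLE = 25.00   # added for each "table or chart"
EBOOK_PER_PAGE_LOW = 0.50       # offshore conversion, typical per-page price
EBOOK_PER_PAGE_HIGH = 1.00      # offshore conversion, where pricing tops out


def print_layout_floor(pages: int) -> float:
    """Minimum print layout cost, ignoring the unpriced per-element surcharges."""
    return pages * PRINT_LAYOUT_PER_PAGE


def createspace_cost(tables_or_charts: int) -> float:
    """Base fee plus the per-table/chart charge."""
    return CREATESPACE_BASE + tables_or_charts * CREATESPACE_PER_TABLE


def ebook_conversion_range(pages: int) -> tuple[float, float]:
    """Low and high estimate for the offshore ePUB + MOBI conversion."""
    return (pages * EBOOK_PER_PAGE_LOW, pages * EBOOK_PER_PAGE_HIGH)


if __name__ == "__main__":
    pages, tables = 300, 40  # hypothetical technical book
    print("print layout, at least:", print_layout_floor(pages))
    print("CreateSpace interior:  ", createspace_cost(tables))
    print("ebook conversion range:", ebook_conversion_range(pages))
```

Even with the surcharges omitted, the print side of a 300-page book starts in the four figures before any ebook work is added.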
|
12-11-2013, 05:21 AM | #18 | |
Connoisseur
Posts: 54
Karma: 210
Join Date: Sep 2007
Device: iPad
|
Quote:
And I would very much rather look at a good quality PDF than the ePub. At least you make both available on the website. Again, I am saying: make the PDFs available as well as the ePub/mobi. You do, and that's great, but most publishers/distributors do not.
|
12-11-2013, 05:51 AM | #19 | ||||||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
Quote:
I do A/B comparison (PDF open in the left half of the screen, EPUB on the right), and I strip out all the classes that I see are not relevant (I actually strip everything down to pretty much headings + blockquotes + bold/italics). Then I plop in my "in-house CSS", and from there, I just go through and introduce spacing, no indentation, a few margins here and there... doesn't take long at all. (The last InDesign EPUB took me 5 hours, and most of that was me checking the book for actual typos/inconsistencies.)
Although you probably get a LOT more horrible documents than I do. (I must admit, I have maybe only done six or seven new books directly from InDesign output; some were super clean, others were pretty bad (but still better than PDF).) Also, your "in-house CSS" is probably a lot more complex than what I use. My mentality is bare minimum, for maximum portability and minimal chance of breaking on the multitude of present/future devices.
Side Note: Which reminds me, another thing that the cheap places do is just Input -> Output. Someone who cares about quality will spend a little time to point out ACTUAL typos/inconsistent usage. (For example, I point out hyphenation problems, forgetting to italicize a newspaper/journal, and missing accents in words. Indexes are usually rife with little errors. Check my site for an in-depth changelog of the hundreds/thousands of typos I have caught when making the EPUBs.)
Quote:
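The "strip everything down" step described above could be sketched roughly like this. It is a minimal illustration using Python's standard `html.parser`, not the actual workflow from the post: the `KEEP` whitelist, the class name `MarkupStripper`, and the function `strip_markup` are all hypothetical, and keeping `<p>` is an added assumption so the text stays in paragraphs. Anything not whitelisted (spans, divs, images) is dropped, and every attribute, including the exported `class` overrides, is discarded.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: headings, blockquotes, bold/italics, plus paragraphs.
KEEP = {"h1", "h2", "h3", "h4", "h5", "h6", "p",
        "blockquote", "b", "strong", "i", "em"}


class MarkupStripper(HTMLParser):
    """Re-emit only whitelisted tags, with all attributes removed."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self.out: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in KEEP:
            self.out.append(f"<{tag}>")   # attributes are dropped on purpose

    def handle_endtag(self, tag):
        if tag in KEEP:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        # Character references were decoded by the parser, so re-escape the text.
        self.out.append(escape(data, quote=False))


def strip_markup(html_text: str) -> str:
    """Return html_text reduced to the whitelisted tags and plain text."""
    parser = MarkupStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    messy = ('<p class="para-style-override-3">'
             '<span class="char-override-66">Hello</span> '
             '<i class="x">world</i></p>')
    print(strip_markup(messy))
```

A whitelist is used rather than a blacklist so that unknown export classes and wrapper elements fall away by default; the in-house CSS can then be applied to the bare tags.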
As I mentioned a few posts back, you will have someone who does something as simple as editing metadata, and thinks that the output PDF is exactly the same (it sure "looks" the same). Or you have people who use Word/InDesign/Quark and make their document LOOK good, but have zero clue about using Styles. So the "backend" of the file is HIDEOUS (not noticeable until you try to change formats/move things around).
And Hitch can probably explain the Word document horror stories (after Christmas time, it seems). You always get the dreaded person who PRESSES ENTER TWO TIMES to get a "double-spaced" document.
Quote:
That is how I handle cases where I pull HTML from a different source. I run the original PDF through a very rough OCR, and then code-compare what I generated against the HTML from the site. They usually catch mistakes that I missed, and I usually catch mistakes that they missed, so combined, I get a better EPUB in the end!
I am just ecstatic every time I run into a book in anything OTHER than PDF. ANYTHING is better than working backwards from PDFs. (Although I do like to have both versions available so I can pull higher-resolution images.)
Quote:
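The compare step described above (rough OCR text against text pulled from an HTML source) could look something like the following. This is only a sketch with Python's standard `difflib`; the post does not say which compare tool is actually used, and the function name `disagreements` and the sample strings are hypothetical. It compares word by word and lists only the places where the two versions differ, so a human can decide which side is right.

```python
import difflib


def disagreements(ocr_text: str, reference_text: str) -> list[tuple[str, str]]:
    """Return (ocr_words, reference_words) pairs wherever the two texts differ."""
    ocr_words = ocr_text.split()
    ref_words = reference_text.split()
    # autojunk=False keeps frequent words from being ignored in long texts.
    matcher = difflib.SequenceMatcher(a=ocr_words, b=ref_words, autojunk=False)
    found = []
    for op, a0, a1, b0, b1 in matcher.get_opcodes():
        if op != "equal":
            found.append((" ".join(ocr_words[a0:a1]),
                          " ".join(ref_words[b0:b1])))
    return found


if __name__ == "__main__":
    ocr = "The modem economy is not a closed system"       # OCR misread "modern"
    site = "The modern economy is not a closed sytem"      # site has its own typo
    for mine, theirs in disagreements(ocr, site):
        print(f"OCR: {mine!r}   site: {theirs!r}")
```

Neither side is treated as ground truth: each mismatch is just flagged, which matches the point that each source catches errors the other missed.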
Quote:
Side Note: Tome about the conversion process is complete! |
||||||
12-11-2013, 06:05 AM | #20 | |
Connoisseur
Posts: 54
Karma: 210
Join Date: Sep 2007
Device: iPad
|
@Hitch, Thanks for the informative response.
Quote:
I have actually purchased several of the O'Reilly eBook packages. |
|
12-11-2013, 06:16 AM | #21 | ||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
Luckily the EPUB satisfies nearly everyone, and those who dislike the quality of the (PERFECTLY FREE) PDF, well then, they can suffer and buy a physical version (although a used version might/might not be worse).
Quote:
And as the tools for InDesign/Quark export get better, I think we will see slightly better quality ebooks. (Although I sense you will still have a lot of this "designed for iBooks" type nonsense.) This will allow a lot of those typesetters who are not very familiar with HTML/coding to more easily auto-export cleaner code.
Just like I am one of the few who works from badly designed InDesign files. There are dozens of us... DOZENS!!!
Last edited by Tex2002ans; 12-11-2013 at 06:36 AM.
||
|