Quote:
Originally Posted by xristy
|
Fantastic, but the entire problem here is having to pay multiple times for multiple editions of a book. (one price for tablet, one price for PC, etc. etc.) The entire advantage of the EPUB is that you get ONE FILE that can scale and be run on larger or smaller devices.
Maybe if these publishers charged a ONE TIME fee, and then you can get access to ALL PDFs, that would be fantastic.
I assume that is sort of what happens when you pay for these online digital access bundles for math books (they will offer you a PDF for desktop, a PDF for tablet, HTML on their site, a Flash-based version, etc.)... but usually these things are rife with DRM. I was also never one for paying such outrageous fees for temporary nonsense! (I hate this whole idea of the "one-time" online use code as well.) You can pay an outrageous fee for the physical book (which you can use forever), or you can pay the outrageous fee minus $60 or so for some hideously locked-down digital version of the same book (which you can only use temporarily, or can't read wherever you want).
I also try to avoid anything that forces you towards one device (for example, I completely avoid using any sort of iBooks-specific code in my EPUBs). That will only bring trouble once those devices disappear (or, in the case of iBooks, they will most likely update and break any complex code you were dependent on).
Side Note: This sort of reminds me of the entire digital/film divide! I watched a documentary called "Side by Side":
http://sidebysidethemovie.com/
Film has been around for about a century, and even the oldest films can still be viewed using any film projector (think print books)... while digital movies have gone through so many different formats/storage mediums, and many of the devices used to read them have gone the way of the dodo! As a result, a lot of the purely digital movies/TV shows/culture has been lost down the drain as well! (think ebooks).
Quote:
Originally Posted by xristy
Maybe a few more years and ePub 3 with MathML will be properly supported - currently Chrome and IE do not support MathML (Can I Use MathML?)
|
Indeed... MathML support is in its infancy, and it should get better in the coming years.
Quote:
Originally Posted by xristy
|
Heh, that is actually what I was going to spend my time on this month, although going in the OPPOSITE direction. I am looking into EPUB -> LaTeX -> PDF. (Another reason why I am interested in vectorizing equations, and "HTMLizing" tables, so I can create high quality PDFs!).
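To give a rough idea of what the EPUB -> LaTeX step involves, here is a minimal sketch in Python. It is not my actual workflow: the tag mapping, the function names, and the choice of \chapter/\section are all assumptions for illustration, and it only handles a handful of inline/block tags (no tables, images, or equations).

```python
from html.parser import HTMLParser

# Hypothetical mapping from a few XHTML tags to (opening, closing) LaTeX text.
TAG_MAP = {
    "h1": ("\\chapter{", "}\n\n"),
    "h2": ("\\section{", "}\n\n"),
    "p": ("", "\n\n"),
    "em": ("\\emph{", "}"),
    "i": ("\\emph{", "}"),
    "strong": ("\\textbf{", "}"),
    "b": ("\\textbf{", "}"),
}

# Characters that LaTeX treats specially in running text.
LATEX_ESCAPES = {
    "\\": "\\textbackslash{}",
    "&": "\\&", "%": "\\%", "$": "\\$", "#": "\\#",
    "_": "\\_", "{": "\\{", "}": "\\}",
    "~": "\\textasciitilde{}", "^": "\\textasciicircum{}",
}


def escape_latex(text):
    """Escape LaTeX special characters one character at a time."""
    return "".join(LATEX_ESCAPES.get(ch, ch) for ch in text)


class XhtmlToLatex(HTMLParser):
    """Collects LaTeX output while walking the XHTML; unknown tags are dropped."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in TAG_MAP:
            self.out.append(TAG_MAP[tag][0])

    def handle_endtag(self, tag):
        if tag in TAG_MAP:
            self.out.append(TAG_MAP[tag][1])

    def handle_data(self, data):
        self.out.append(escape_latex(data))


def xhtml_to_latex(xhtml):
    """Convert a small XHTML fragment into a LaTeX body fragment."""
    parser = XhtmlToLatex()
    parser.feed(xhtml)
    parser.close()
    return "".join(parser.out).strip()


if __name__ == "__main__":
    print(xhtml_to_latex("<h1>Money &amp; Credit</h1><p>A <em>short</em> note.</p>"))
```

The cleaner the EPUB markup is going in, the less special-casing something like this needs, which is part of why I care about keeping the HTML tidy.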
Quote:
Originally Posted by xristy
Maybe if ePub 3 with MathML is supported then the hordes in India will be trained to re-write the equations using MathML - introducing more errors; or maybe someone will develop specialized math OCR to generate MathML.
|
You can only hope! But I don't even think that any sort of book with very complex equations would be very economical to convert from book scans (this book would have to be an older book that would still be expected to sell well, and still be relevant today, where the publisher doesn't have access to the original source).
Usually a lot of these older technical books have obsolete information, OR, they are already done better elsewhere, in an easier form. I am all about digitizing everything though... but for now, those would probably just have to be stuck as scanned PDFs.
MathOCR... now that would be something. OCR is an EXTREMELY hard problem, and adding all of these symbols on top/all over the place, I don't know how well it would work. I did a quick search and it seems like this might be one of the better solutions for that problem (but I expect it would still require a massive amount of human intervention):
http://www.inftyproject.org/en/softw...ml#InftyReader
Quote:
Originally Posted by xristy
I also observe that even if the OCR'd text layer is poor in the PDF, at least one still has the actual image of the text and that is not at all always preserved in the ePub / mobi versions. It is worth the minor issues with PDFs to have books preserved in a portable manner.
|
Yep yep. Hopefully my work is good enough that I have an extremely low error rate. (Another reason why you want to pay for quality conversion and avoid those cheap guys).
That is part of the reason why we release everything as PDF/EPUB (sell MOBIs on Amazon), and sell physical books on Amazon/our store/elsewhere.
For newer books:
- Physical
- PDF right out of InDesign (that is used for the Print book).
- EPUB
  - Exported right out of InDesign, then I go in and do my Regexfu/magic on it.
- MOBI (for sale on Amazon)
- HTML (every so often an entire chapter is posted on the site as a part of a "daily article")
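As an illustration of the sort of regex cleanup I mean for the EPUB export, here is a small Python sketch. It is only an example under assumptions: it assumes the export leaves auto-generated class names of the form CharOverride-N / ParaOverride-N, and the function name is made up for this post. My real cleanup is a pile of separate passes, not this one function.

```python
import re

# Assumed pattern: auto-generated override classes such as
# "CharOverride-3" or "ParaOverride-12".
OVERRIDE_CLASS = re.compile(r"\b(?:Char|Para)Override-\d+\b")
CLASS_ATTR = re.compile(r'\sclass="([^"]*)"')
BARE_SPAN = re.compile(r"<span>(.*?)</span>", re.DOTALL)


def strip_override_classes(xhtml):
    """Remove override classes, then unwrap spans left without attributes."""

    def clean(match):
        # Keep any class names that are not auto-generated overrides.
        kept = OVERRIDE_CLASS.sub("", match.group(1)).split()
        return ' class="{}"'.format(" ".join(kept)) if kept else ""

    xhtml = CLASS_ATTR.sub(clean, xhtml)
    # Caution: this simple unwrap is not safe for nested spans.
    return BARE_SPAN.sub(r"\1", xhtml)


if __name__ == "__main__":
    sample = '<p class="Body ParaOverride-1">Text <span class="CharOverride-2">here</span>.</p>'
    print(strip_override_classes(sample))
```

Regex on HTML is always a bit of a hack, so I would run something like this one pass at a time and diff the result before moving on.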
Our mentality is that digital books are COMPLEMENTARY goods to go along with the physical books (so we offer them for free). This gives MUCH larger exposure to the book than would otherwise occur, you can see EXACTLY what you will be getting if you purchase the physical book, and for us, we have found that our book sales have skyrocketed since digitizing/releasing the books.
I assume a for-profit publisher could do something similar (maybe a free digital book download along with the physical purchase, which I see some doing now! Or take something like Amazon Matchbook, where you get a huge discount (or free) on the digital version).
For older books:
- PDF of the original scans + OCR backend (just like archive.org)
- EPUB editions (when I get around to converting them).
- HTML (chapters posted right out of my EPUBs, again, most likely as a part of a "daily article")
- MOBI
- Physical (Reprints):
  - I don't deal with this, so I don't know many details. To my knowledge, the scans are just cleaned up (speckles removed, etc.), some new intro matter is added, and it is sent for sale.
Quote:
Originally Posted by xristy
Bottom line: Offer the PDFs for those who prefer to use them. The marginal cost of the PDFs is minimal.
|
I agree... for anything that has been made within the past two or three decades, the stuff is definitely in digital form SOMEWHERE! And if you already designed the damn digital files for print, why not make those available too?
But as I stated, so many of these source files are lost in the abyss!!! And it is disappointing that all this culture is locked in physical form (or crappy PDFs or scans). Which is why I am making it my goal to chip away at these books in my little corner/niche (non-fiction economics books).
Side Note: You also have publishers who don't want to release a lot of their backlist, because they believe it will compete with their new book sales. There are also a ton of out-of-print works which the publishers have zero intention of bringing back into print. There are a ton of books which we would like to reprint, but the copyright owner either can't be found (orphan works) or demands outrageous license fees.
I once asked, "if we ever do pay for the license fees, would they help us by giving us the source files?" I was laughed at. So yeah, even if you are going through all the legitimate channels, I doubt that many of these publishers would give you access to the source to make your life easier (although maybe if you worked at a big publisher, things might be different). So you would still be relegated to working backwards from a scan/PDF/OCR.
The license fees most likely make it unprofitable for us to even offer a reprint/digital edition, so we almost never do it. This is a huge problem when you look at all the hundreds of thousands/millions of out-of-print books which are in the same exact situation.