|03-28-2014, 03:33 AM||#1|
Join Date: Feb 2014
Device: IPAD, KF8 & Tablet
pdf to ePub
I need to convert a PDF to EPUB. Can anyone point me to a guide or a tool that converts PDF to EPUB with styles matching the PDF?
|03-28-2014, 06:15 AM||#4|
Join Date: Nov 2006
Device: PW2, iPad Retina Mini, iPhone 4, MS Surface Pro, Onyx T68, N7,
Please ask questions in the correct format. Moved to the "Workshop" forum.
|03-30-2014, 10:35 AM||#6|
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-300, PRS-T1
|03-30-2014, 02:55 PM||#7|
Join Date: Apr 2008
Location: Central Oregon Coast
First off, there are at least two different types of PDFs. One consists of nothing but images. To convert it you need to run it through an OCR (optical character recognition) program. The most effective ones are far from free.
The second type has the text already in it. You can use any one of a number of programs to extract it. Mobipocket Creator can do so.
The quality of the output depends on how much care the creator took with the PDF. Some contain just crude OCR output with many errors, which is good enough to search by, sort of, and that is why many are like that.
PDFs originally created from text can have great text, but since they are not required to store it linearly as in the original, chunks can end up misplaced, and images that were overlaid on the page in the original may not be in the extracted version.
Hence, PDF is the worst format to start from. But it is all many of us have.
|04-01-2014, 09:32 PM||#8|
Join Date: Jan 2012
Device: iPad, iPhone, Nook Simple Touch
As mrmikel pointed out, a PDF file basically consists of... at best, a series of strings, or at worst, a series of individual glyphs, along with font information and the location where each glyph or string should be drawn on the page. You don't have paragraphs, and you may or may not even have entire lines. This is why copy and paste from a PDF is notoriously error-prone.
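To make that concrete, here is a rough Python sketch of what any extractor has to do with such positioned pieces: guess which ones share a line, sort them left to right, and guess where the spaces go. The Fragment type, the rebuild_lines function, and the tolerance values are all made up for illustration and are not taken from any real PDF library; real converters use far more elaborate heuristics.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    x: float      # left edge of the drawn text
    y: float      # baseline position on the page
    width: float  # horizontal extent of the drawn text
    text: str     # a string, or possibly a single glyph


def rebuild_lines(fragments, y_tolerance=2.0, space_gap=1.5):
    """Group fragments into lines by baseline, then join them left to right.

    y_tolerance and space_gap are arbitrary guesses (in page units).
    """
    lines = []  # list of (baseline_y, [fragments on that line])
    # PDF y-coordinates grow upward, so the top of the page has the largest y.
    for frag in sorted(fragments, key=lambda f: -f.y):
        if lines and abs(lines[-1][0] - frag.y) <= y_tolerance:
            lines[-1][1].append(frag)
        else:
            lines.append((frag.y, [frag]))

    result = []
    for _, frags in lines:
        frags.sort(key=lambda f: f.x)
        parts = [frags[0].text]
        for prev, cur in zip(frags, frags[1:]):
            # The file need not contain a space character at all; a space
            # is often only implied by a horizontal gap between fragments.
            gap = cur.x - (prev.x + prev.width)
            if gap > space_gap:
                parts.append(" ")
            parts.append(cur.text)
        result.append("".join(parts))
    return result


# Fragments in the order they might appear in the file, not reading order.
sample = [
    Fragment(102, 700, 30, "world"),
    Fragment(72, 686, 20, "next"),
    Fragment(82, 700, 15, "llo"),
    Fragment(72, 700, 10, "He"),
]
print(rebuild_lines(sample))
```

Note that the word "Hello" arrives as two separate pieces and the space before "world" exists only as a gap. Pick the thresholds slightly wrong and you get merged words, split words, or scrambled lines.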
One of the most hilarious examples of PDF's inadequacy that I've seen involved Apple's developer documentation PDFs from a few years back. In some PDF readers (notably, Apple's Preview prior to about OS X v10.8), depending on how you selected text, you would sometimes select the words, but not the spaces between them. You can probably imagine how much fun that was.
Worse, depending on how the PDF was created, there's no guarantee that it contains the mapping information needed to convert glyph IDs back into Unicode code points. If it doesn't, then copying text from the PDF could return nothing, random garbage, or anything in between. So in that case, the question is more like asking how to retrieve your research paper from a photo of a Microsoft Word BSOD....
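A toy Python illustration of the mapping problem, in case it helps. In a real PDF the mapping, when present, lives in the font's ToUnicode CMap; here it is just a dictionary, and the glyph IDs, the decode_glyphs function, and the choice to emit U+FFFD for unknown glyphs are assumptions made up for this sketch.

```python
REPLACEMENT = "\ufffd"  # Unicode replacement character


def decode_glyphs(glyph_ids, to_unicode):
    """Turn glyph IDs into text using a glyph-ID-to-Unicode table.

    to_unicode is a dict {glyph_id: str}, or None if the PDF carries
    no mapping at all for this font.
    """
    if to_unicode is None:
        # The page still renders fine, but nothing says what the glyphs mean.
        return REPLACEMENT * len(glyph_ids)
    return "".join(to_unicode.get(g, REPLACEMENT) for g in glyph_ids)


glyphs = [36, 37, 38]                    # made-up glyph IDs
full_map = {36: "P", 37: "D", 38: "F"}   # complete mapping
partial_map = {36: "P"}                  # mapping with holes

print(decode_glyphs(glyphs, full_map))
print(decode_glyphs(glyphs, partial_map))
print(decode_glyphs(glyphs, None))
```

With the full table you get the text back; with a partial or missing one, the glyphs still draw correctly on screen but the extracted text is partly or wholly unrecoverable without OCR.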