pdf to ePub

qsipl · 03-28-2014, 04:33 AM

Hi

I need to convert a PDF to EPUB. Can anyone point me to a guide or a tool that converts PDF to EPUB with styles matching the PDF?

bobinatcat · 03-28-2014, 04:34 AM

Calibre. Easy as.

alanHd · 03-28-2014, 04:51 AM

I have better luck using Mobipocket creator.

HarryT · 03-28-2014, 07:15 AM

Please ask questions in the correct forum. Moved to the "Workshop" forum.

willus · 03-30-2014, 10:06 AM

@qsipl -- Welcome to MR. See this thread. Lots of good tips on PDF conversions.

Toxaris · 03-30-2014, 11:35 AM

Quote:

Originally Posted by qsipl

Hi

I need to convert a PDF to EPUB. Can anyone point me to a guide or a tool that converts PDF to EPUB with styles matching the PDF?

That tool does not exist at the moment. Apparently it is a holy grail to many. PDF is your worst format to start from.

mrmikel · 03-30-2014, 03:55 PM

First off, there are at least two different types of PDFs. One consists of nothing but page images. To convert it, you need to run it through an optical character recognition (OCR) program. The most effective ones are far from free.

The second type has the text already in it. You can use any one of a number of programs to extract it. Mobipocket Creator can do so.

The quality of the output depends on how much care the creator took with the PDF. Some are just crude OCR output with many errors, but good enough to search by, sort of, which is why many are like that.

PDFs created from text originally can have great text, but since they are not required to store it in the linear order of the original, chunks can end up misplaced, and images that were overlaid in the original may not be overlaid in the extracted result.
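To illustrate the misplaced-chunk problem, here is a minimal Python sketch of how an extractor might put chunks back into reading order using only their page positions. The `Chunk` type, the `reading_order` function, and the coordinates are made up for illustration; they are not taken from any real conversion tool, and a simple sort like this would still fail on multi-column pages.

```python
from typing import NamedTuple


class Chunk(NamedTuple):
    text: str
    x: float  # left edge, in points
    y: float  # baseline height; PDF y grows upward from the bottom of the page


def reading_order(chunks: list[Chunk], line_tolerance: float = 2.0) -> list[Chunk]:
    """Sort chunks top-to-bottom, then left-to-right within a line."""
    def key(c: Chunk) -> tuple[int, float]:
        # Bucket baselines so chunks on roughly the same line compare equal,
        # and negate so that higher y (nearer the top of the page) comes first.
        return (-round(c.y / line_tolerance), c.x)

    return sorted(chunks, key=key)


# The file stores the chunks in an arbitrary order (hypothetical data).
stored = [
    Chunk("world.", 140, 700),
    Chunk("Second line.", 72, 686),
    Chunk("Hello,", 72, 700),
]
print(" ".join(c.text for c in reading_order(stored)))
```

The point is only that the order has to be guessed from geometry, because the file itself does not promise one.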

Hence, PDF is the worst format to start from. But it is all many of us have.

dgatwood · 04-01-2014, 10:32 PM

Quote:

Originally Posted by Toxaris

That tool does not exist at the moment. Apparently it is a holy grail to many. PDF is your worst format to start from.

You're inclined to understatement. Essentially, the question is approximately like asking, "How can I zoom in and enhance like they do on CSI" or "how can I copy the text of my research paper from a photograph of the screen," and for precisely the same reason—you can't readily extract information that isn't there.

As mrmikel pointed out, a PDF file basically consists of... at best, a series of strings, or at worst, a series of individual glyphs, along with font information and the location where each glyph or string should be drawn on the page. You don't have paragraphs, and you may or may not even have entire lines. This is why copy and paste from a PDF is notoriously error-prone.
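As a rough sketch of what that means for an extractor, the Python below rebuilds one line of text from individually positioned glyphs and guesses where the spaces go from the horizontal gaps. The `Glyph` type, the `join_glyphs` function, and the `gap_ratio` threshold are hypothetical names and values chosen for illustration; real tools use more elaborate heuristics.

```python
from typing import NamedTuple


class Glyph(NamedTuple):
    char: str
    x: float      # left edge, in points
    width: float  # advance width, in points


def join_glyphs(glyphs: list[Glyph], gap_ratio: float = 0.3) -> str:
    """Rebuild one line, inserting a space wherever the horizontal gap is large."""
    out: list[str] = []
    prev = None
    for g in sorted(glyphs, key=lambda g: g.x):
        if prev is not None:
            gap = g.x - (prev.x + prev.width)
            # Assumption: a gap wider than gap_ratio times the previous
            # glyph's width is treated as a word break.
            if gap > gap_ratio * prev.width:
                out.append(" ")
        out.append(g.char)
        prev = g
    return "".join(out)


# Four glyphs with no space character stored anywhere (hypothetical data).
line = [Glyph("t", 0, 3), Glyph("o", 3, 5), Glyph("b", 11, 5), Glyph("e", 16, 5)]
print(join_glyphs(line))                 # threshold catches the gap
print(join_glyphs(line, gap_ratio=1.0))  # threshold too strict, words run together
```

With a threshold that is too strict, the words run together, which is one plausible way to end up with copied text that has no spaces.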

One of the most hilarious examples of PDF's inadequacy that I've seen involved Apple's developer documentation PDFs from a few years back. In some PDF readers (notably, Apple's Preview prior to about OS X v10.8), depending on how you selected text, you would sometimes select the words, but not the spaces between them. You can probably imagine how much fun that was.

Worse, depending on how the PDF was created, there's no guarantee that it contains the mapping information needed to convert glyph IDs back into Unicode code points. If it doesn't, then copying text from the PDF could return nothing, random garbage, or anything in between. So in that case, the question is more like asking how to retrieve your research paper from a photo of a Microsoft Word BSOD....
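A toy Python sketch of the glyph ID problem: text can only be recovered when a glyph-ID-to-character table is available (in a real PDF this role is played by the font's optional ToUnicode mapping). The `glyphs_to_text` function, the glyph IDs, and the table below are invented for illustration and do not correspond to any real font.

```python
REPLACEMENT = "\ufffd"  # Unicode replacement character, used for unknown glyphs


def glyphs_to_text(glyph_ids, to_unicode=None):
    """Translate glyph IDs to text using an optional glyph-ID -> character table."""
    if to_unicode is None:
        # No mapping in the file: the IDs are just font-specific numbers,
        # so there is nothing meaningful to return.
        return REPLACEMENT * len(glyph_ids)
    return "".join(to_unicode.get(gid, REPLACEMENT) for gid in glyph_ids)


ids = [43, 72, 79, 79, 82]                    # hypothetical glyph IDs
table = {43: "H", 72: "e", 79: "l", 82: "o"}  # hypothetical mapping

print(glyphs_to_text(ids, table))  # mapping present: readable text
print(glyphs_to_text(ids))         # mapping absent: replacement characters only
```

The page still renders fine in both cases, because drawing only needs the glyph shapes, not their meaning.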
