#1 |
Junior Member
Posts: 6
Karma: 10
Join Date: Dec 2010
Device: iPad
|
Bad PDF to ePub Conversion
Hi,
I have been trying to convert several PDFs to the ePub format (for iPad, all other settings default). The PDF page breaks are lost, and page numbers have been inserted into the resulting ePub where the old PDF page breaks used to be. However, those page numbers are NOT aligned with the ePub page breaks. So I now have different page breaks than the original PDF, and page numbers in the middle of pages/paragraphs. Am I missing something? Thx. |
#2 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
PDF conversion automatically removes all the original page breaks. To remove the page numbers, headers, footers, etc., you need to use the remove header/footer regex option under Structure detection. You need to write a regular expression to do this, because every PDF is a bit different with respect to page numbers/headers/footers.
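As a rough illustration of the kind of regular expression involved, here is a minimal sketch in plain Python. It is not the conversion tool's own code: the sample text, the `PAGE_NUMBER` pattern, and the `strip_page_numbers` helper are all assumptions made up for this example, and your own PDF will almost certainly need a different pattern.

```python
import re

# Hypothetical pattern: a line that holds nothing but a page number,
# optionally written as "Page 12" or "- 12 -". Adjust it to your PDF.
PAGE_NUMBER = re.compile(
    r"^[ \t]*(?:Page[ \t]+)?-?[ \t]*\d{1,4}[ \t]*-?[ \t]*$\n?",
    re.MULTILINE | re.IGNORECASE,
)

def strip_page_numbers(text: str) -> str:
    """Remove lines that consist solely of a page number."""
    return PAGE_NUMBER.sub("", text)

# Made-up sample with stray page numbers between lines of text.
sample = "It was a dark and stormy\n12\nnight in the city.\nPage 13\nThe end.\n"
cleaned = strip_page_numbers(sample)
print(cleaned)
```

Note that a pattern this loose would also delete a body line that happens to be just a number (a year on its own line, say), so check the result before trusting it.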
|
#3 | ||||
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Yes, but it's not your fault. PDF is based on PostScript - a great printer language, and a lousy ebook format.
Quote:
Quote:
Quote:
Quote:
Good luck. |
||||
#4 | |
Resident Curmudgeon
Posts: 79,575
Karma: 145863177
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
|
|
#5 |
Junior Member
Posts: 6
Karma: 10
Join Date: Dec 2010
Device: iPad
|
Thanks for all the great responses. I understand now...!
Thx. |
#6 | |
Groupie
Posts: 171
Karma: 400
Join Date: Jun 2009
Device: Sony PRS-700, Nook Color
|
Quote:
|
|
#7 | |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Quote:
|
|
#8 |
Groupie
Posts: 171
Karma: 400
Join Date: Jun 2009
Device: Sony PRS-700, Nook Color
|
Thanks. This really helps. Now the only problem I have is that after the conversion, words are run together every so often, as in the example below. Is there any way to keep it from doing this? I am having to go through TONS of words to separate them.
“Right. That’s it. I officially call an end to today. I’mgoing home and going to bed until it’s over.” I rolled overonto my side and propped myself up on one hand. A pairof shoes appeared next to me, attached to a woman’s legs.I followed the legs up to the rest of the person. He turned away, his voice as smooth and polished as it had been the first time I’d met him. “You will tell me with whom you are working, or I will break your body, corruptyour soul, and banish you to an eternity of torment.” Every inch of my body broke out into a terrified coldsweat as I frantically looked around the room, desperatefor some way to escape, or something I could do todistract Ariton long enough to get away |
#9 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Words running into each other is probably specific to that PDF, though it's not a totally uncommon problem. As Starson17 was saying, it's all PostScript: the PDF itself isn't aware of 'words' per se, just a bunch of letters. When those strings are converted to text, it's likely there is too little spacing, or some other odd bit of formatting, which prevents the conversion engine from recognizing the space and retaining it.
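To make the "too little spacing" point concrete, here is a toy sketch in Python of how a converter might guess at spaces from letter positions. This is an assumption-laden illustration, not how any particular conversion engine works: the `Glyph` class, the `layout` helper, the 5-unit letter width, and the `gap_threshold` value are all invented for the example.

```python
from dataclasses import dataclass

@dataclass
class Glyph:
    char: str
    x: float      # left edge of the letter on the page
    width: float  # how far the letter advances horizontally

def glyphs_to_text(glyphs, gap_threshold):
    """Join letters, adding a space only where the gap exceeds the threshold."""
    out = []
    prev_end = None
    for g in glyphs:
        if prev_end is not None and g.x - prev_end > gap_threshold:
            out.append(" ")
        out.append(g.char)
        prev_end = g.x + g.width
    return "".join(out)

def layout(words, gap):
    """Hypothetical page layout: letters 5 units wide, `gap` units between words."""
    glyphs, x = [], 0.0
    for word in words:
        for ch in word:
            glyphs.append(Glyph(ch, x, 5.0))
            x += 5.0
        x += gap
    return glyphs

words = ["rolled", "over", "onto"]
wide = glyphs_to_text(layout(words, gap=3.0), gap_threshold=1.5)
tight = glyphs_to_text(layout(words, gap=1.0), gap_threshold=1.5)
print(wide)   # word gaps are above the threshold, so spaces survive
print(tight)  # word gaps are below the threshold, so the words fuse
```

The same guess that works for one PDF fails for another with tighter word spacing, which is one plausible way run-together words end up in the output.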
|
#10 |
Groupie
Posts: 171
Karma: 400
Join Date: Jun 2009
Device: Sony PRS-700, Nook Color
|
OK thanks. On looking further at several of my pdf file conversions it does seem that the problem is indeed unique to the particular pdf files that I have, and is there BEFORE conversion to epub.
Sigh....pdf is rather a pain isn't it.... |
#11 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
PDF is a pain, to be sure. These problems are why any ePub conversion may need to be re-edited in Sigil or another program that can do search and replace and spell checking to fix words that are run together. I am working on a book from the US Army Center of Military History that has bad problems with words run together. It also suffers from captions put in both graphically and textually, and the fancy captions are overlaid on the bottom of pictures with a black haze. Fine to print, heck to convert.
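For anyone who would rather script part of that cleanup, here is a minimal sketch in Python of a dictionary-based pass that splits run-together words. It is only an illustration under stated assumptions: the tiny `VOCAB` set and the `split_fused` and `repair` helpers are made up for this example, and a real pass would need a full word list, punctuation handling, and a human check of every change.

```python
# Hypothetical, tiny word list; a real run needs a proper dictionary.
VOCAB = {"over", "onto", "pair", "of", "cold", "sweat", "to", "distract"}

def split_fused(token, vocab):
    """Split a token into two known words, or return it unchanged."""
    lower = token.lower()
    if lower in vocab:
        return token
    for i in range(1, len(token)):
        # Take the first split point where both halves are known words.
        if lower[:i] in vocab and lower[i:] in vocab:
            return token[:i] + " " + token[i:]
    return token

def repair(text, vocab=VOCAB):
    """Apply split_fused to every whitespace-separated token."""
    return " ".join(split_fused(t, vocab) for t in text.split())

fixed = repair("I rolled overonto my side")
print(fixed)
```

Taking the first valid split is a simplification: some fused strings can be split more than one way, so this can only suggest candidates for review and should not be run blindly over a whole book.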
|
||||
Thread | Thread Starter | Forum | Replies | Last Post |
PDF to EPUB Conversion | LuchoResto | General Discussions | 1 | 11-19-2010 04:54 PM |
pdf to epub conversion | Storyowner | Calibre | 3 | 11-03-2010 08:01 AM |
PDF to EPUB conversion | jfontana | Calibre | 2 | 03-17-2010 03:09 AM |
pdf to epub conversion | mediax | Sigil | 16 | 11-19-2009 03:48 PM |
Help with conversion from PDF to EPUB | Fizz | Calibre | 5 | 10-25-2009 11:48 AM |