#1 |
Junior Member
Posts: 1
Karma: 10
Join Date: Jul 2012
Device: Kobo Touch
|
PDF to EPUB conversion
This problem has probably been addressed elsewhere, and I suspect it has an easy solution, but I have very little experience with coding and at this point I'm stuck.
I used calibre to convert a PDF file to EPUB. The resulting file has a paragraph break (<p>) wherever a line of text ended in the PDF, which leaves a lot of blank lines throughout the ebook I am trying to read. I can delete them manually in Sigil, but going through the entire text that way would be very time-consuming. As the superfluous paragraph breaks are indistinguishable from the genuine ones, a simple find and replace in the code is not an option either. Is there an easy solution to this problem? |
#2 | |
Well trained by Cats
Posts: 30,972
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Think: what common pattern distinguishes most false line ends? A line ending in a lowercase letter or a comma, with the next line starting in lowercase. (Not perfect: breaks followed by quotes or by proper names (capitals) will be missed.)
search: ([a-z,])</p>\s+<p[^>]*>([a-z])
replace: \1 \2 |
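For anyone who would rather run this outside Sigil, here is a minimal Python sketch of the same idea. It is an illustration, not a tested tool: the function name and the sample markup are made up, and the opening tag is matched with `[^>]*` so that a match cannot run past the end of a single `<p ...>` tag.

```python
import re

# A paragraph that ends in a lowercase letter or comma, followed by a
# paragraph that starts with a lowercase letter, is treated as a false break.
# [^>]* keeps the match inside one opening <p ...> tag.
FALSE_BREAK = re.compile(r'([a-z,])</p>\s+<p[^>]*>([a-z])')


def merge_false_breaks(html: str) -> str:
    """Join paragraphs that look like they were split at a PDF line end."""
    return FALSE_BREAK.sub(r'\1 \2', html)


# Hypothetical sample resembling calibre's PDF output.
sample = (
    '<p class="calibre1">The quick brown fox jumped over</p>\n'
    '<p class="calibre1">the lazy dog, and then</p>\n'
    '<p class="calibre1">ran away.</p>\n'
    '<p class="calibre1">A new paragraph starts here.</p>'
)

merged = merge_false_breaks(sample)
print(merged)
```

The first three fragments are joined into one paragraph, while the break after "ran away." is kept because that line ends in a full stop. The same limitation applies as above: a false break followed by a quote mark or a capitalised name is left alone, so a manual pass is still needed.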
|
#3 |
frumious Bandersnatch
Posts: 7,546
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
Use a different converter, or different calibre settings, that do a better job of detecting paragraphs.
|
#4 |
Grand Sorcerer
Posts: 13,396
Karma: 78877538
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
|
@XayneP_G: Look at the line un-wrap setting in the Heuristic Processing options on the PDF conversion. Changing that might help with paragraph detection.
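If you prefer the command line, the same settings can be passed to calibre's `ebook-convert` tool. The sketch below only builds the command. It assumes the `--enable-heuristics` and `--unwrap-factor` options as I understand the calibre CLI (check `ebook-convert --help` on your install), and the file names and the value 0.3 are arbitrary placeholders.

```python
import shlex


def build_convert_command(src: str, dst: str, unwrap_factor: float) -> list[str]:
    """Build an ebook-convert command line (hypothetical helper, does not run it)."""
    # calibre documents the unwrap factor as a decimal between 0 and 1.
    if not 0 <= unwrap_factor <= 1:
        raise ValueError("unwrap_factor must be between 0 and 1")
    return [
        "ebook-convert", src, dst,
        "--enable-heuristics",                 # turn on Heuristic Processing
        "--unwrap-factor", str(unwrap_factor), # PDF input line un-wrap setting
    ]


cmd = build_convert_command("book.pdf", "book.epub", unwrap_factor=0.3)
print(shlex.join(cmd))
# To actually convert: subprocess.run(cmd, check=True)
```

Try a few different values and compare the results, since the best setting depends on how the PDF was laid out.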
|