MobileRead Forums > E-Book Software > Calibre > Editor

Old 09-06-2025, 06:56 PM   #1
Globe Trotsky
Junior Member
Posts: 8
Karma: 10
Join Date: Sep 2025
Device: Kindle 10th Generation Paperwhite
Calibre tidy line breaks

I have an annoying problem: even after converting books in the Calibre editor, some are riddled with random line breaks. Can someone advise me on how to fix this without manually going through the HTML editor? Is there a code snippet, regex, or plugin for it? TIA. Apologies if this should be in another thread.
Attached images: IMG_1250.jpg, IMG_1251.jpg
Old 09-06-2025, 08:41 PM   #2
JSWolf
Resident Curmudgeon
Posts: 80,685
Karma: 150249619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by Globe Trotsky View Post
I have an annoying problem: even after converting books in the Calibre editor, some are riddled with random line breaks. Can someone advise me on how to fix this without manually going through the HTML editor? Is there a code snippet, regex, or plugin for it? TIA. Apologies if this should be in another thread.
It looks like it could be a rather poor conversion from a PDF source.

You could use regex to combine lines that do not end in a punctuation mark. That would help a lot.

Read this thread. It gives a lot of good information, including links to regexes you can use: https://www.mobileread.com/forums/sh...d.php?t=357635
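For anyone who wants to try the "combine lines that do not end in a punctuation mark" idea outside the editor, here is a minimal Python sketch. It is not taken from the linked thread. It assumes every broken line sits in its own `<p>` element, and the `TERMINATORS` set and the function name `join_split_paragraphs` are my own choices for illustration. Work on a copy of the book.

```python
import re

# Characters that plausibly end a paragraph: sentence punctuation plus
# closing quotes and brackets. This set is an assumption; tune it per book.
TERMINATORS = ".!?:;\"'”’)]…"

# A paragraph boundary: the last text character before </p>, the closing
# tag, optional whitespace, and the next opening <p ...> tag.
# Paragraphs that end in an inline tag (e.g. </span></p>) are not matched.
BOUNDARY = re.compile(r"([^\s<>])\s*</p>\s*<p\b[^>]*>\s*")


def join_split_paragraphs(html: str) -> str:
    """Merge <p> elements whose text stops without terminal punctuation."""

    def merge(match: re.Match) -> str:
        last_char = match.group(1)
        if last_char in TERMINATORS:
            return match.group(0)  # looks like a real paragraph end: keep it
        return last_char + " "     # drop the tags, leave a single space

    return BOUNDARY.sub(merge, html)
```

Given `<p>The quick brown</p>` followed by `<p>fox jumped.</p>`, the two are merged into one paragraph, while a paragraph ending in a full stop is left alone. Headings and captions usually have no final punctuation either, so they will be glued to the following text unless they use a different tag; check the result before saving.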
Old 09-06-2025, 08:53 PM   #3
Sirtel
Grand Sorcerer
Posts: 13,970
Karma: 243829945
Join Date: Jan 2014
Location: Estonia
Device: Kobo Sage & Libra 2
A crappy conversion from PDF will always be a crappy conversion, unless you fix everything by hand, line by line. PDF is not a suitable format for conversion.

TL;DR Don't convert from PDF if you can do it in any other way.
Old 09-07-2025, 06:50 PM   #4
Globe Trotsky
Junior Member
Thanks

Quote:
Originally Posted by JSWolf View Post
It looks like it could be a rather poor conversion from a PDF source.

You could use regex to combine lines that do not end in a punctuation mark. That would help a lot.

Read this thread. It gives a lot of good information, including links to regexes you can use: https://www.mobileread.com/forums/sh...d.php?t=357635
Thanks, I will have a read. I'm fairly certain this was an EPUB to start with. I just put it through calibre's Heuristic processing, hoping that it would tidy everything up. Thanks for the link.
Old 09-07-2025, 06:52 PM   #5
Globe Trotsky
Junior Member
Thanks, as I said above, just checked and it was an EPUB, definitely not a PDF conversion.
Old 09-07-2025, 06:55 PM   #6
JSWolf
Resident Curmudgeon
Quote:
Originally Posted by Globe Trotsky View Post
Thanks, as I said above, just checked and it was an EPUB, definitely not a PDF conversion.
How do you know it was not a PDF conversion before you got the ePub? Where did the ePub come from?
Old 09-07-2025, 08:00 PM   #7
Karellen
Wizard
Posts: 1,684
Karma: 9500498
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
@Globe Trotsky

Careful how you answer the question "Where did the epub come from". It has no bearing on your original question of how to fix the problem.
There be wolves among the sheep.
Old 09-07-2025, 08:15 PM   #8
Sirtel
Grand Sorcerer
Quote:
Originally Posted by Karellen View Post
@Globe Trotsky

Careful how you answer the question "Where did the epub come from". It has no bearing on your original question of how to fix the problem.
There be wolves among the sheep.
If it's a bad PDF conversion (and it almost certainly is, by the looks of it), there's no fixing it.
Old 09-07-2025, 09:08 PM   #9
Karellen
Wizard
Quote:
Originally Posted by Sirtel View Post
If it's a bad PDF conversion (and it almost certainly is, by the looks of it), there's no fixing it.
Trying to fix it via conversions? Yeah, totally agree. Can't be done.

If you are prepared to read and fix at the same time, it can be done. Even easier if the original book is available as a reference.
A few regexes to catch the split sentences are easy enough. It's all the other annoyances (missing italics, replaced characters, broken words and spurious code) that are more difficult or impossible to fix in bulk.
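Since the approach here is to read and fix at the same time, a report of likely split sentences can be more useful than a blind replace. The following is a rough sketch under my own assumptions: the function name `split_candidates`, the 30-character context window and the "ends in a letter or comma, continues in lowercase" heuristic are illustrative, not an established recipe.

```python
import re

# A boundary between two paragraphs. Up to 30 characters of text are
# captured on each side so a human can judge whether to join them.
# The head is captured in a lookahead so it can also serve as the tail
# of the next boundary.
CANDIDATE = re.compile(r"([^<>]{1,30})</p>\s*<p\b[^>]*>(?=([^<>]{1,30}))")


def split_candidates(html: str) -> list[tuple[str, str]]:
    """Return (tail, head) pairs that look like one sentence cut in two."""
    hits = []
    for tail, head in CANDIDATE.findall(html):
        tail, head = tail.strip(), head.strip()
        if not tail or not head:
            continue
        # Heuristic: the previous paragraph ends in a letter or a comma
        # and the next one starts with a lowercase letter.
        if (tail[-1].isalpha() or tail[-1] == ",") and head[0].islower():
            hits.append((tail, head))
    return hits
```

Each pair shows the end of one paragraph and the start of the next, so the list can be worked through by hand. It does nothing for the other problems mentioned above; those still need eyes on the text.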

(no idea why an edit to my post caused a second post to appear)
Old 09-07-2025, 09:14 PM   #10
DNSB
Bibliophagist
Posts: 48,012
Karma: 174315100
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by Karellen View Post
Trying to fix it via conversions? Yeah, totally agree. Can't be done.

If you are prepared to read and fix at the same time, it can be done. Even easier if the original book is available as a reference.
A few regexes to catch the split sentences are easy enough. It's all the other annoyances (missing italics, replaced characters, broken words and spurious code) that are more difficult or impossible to fix in bulk.

(no idea why an edit to my post caused a second post to appear)
Not to mention the issue with some ligatures ("llama" comes out as "l ama", for instance).
Old 09-07-2025, 09:19 PM   #11
BetterRed
null operator (he/him)
Posts: 22,008
Karma: 30277294
Join Date: Mar 2012
Location: Sydney Australia
Device: none
It's almost certainly a low-quality OCR of an image scan of ink on paper (probably done with an ancient, and probably pirated, version of ABBYY) that was saved as a PDF.

Professionally created PDFs using Acrobat, DTP (InDesign, even Quark), or WP (MS Word, LO Writer, WordPerfect) don't break paragraphs like that.

One way to deal with it is to convert to TXT, open the file with a decent text editor (e.g. Vim, TextPad, Notepad++) and correct it using regex. Then open the corrected text file in one of the WP apps mentioned above, style the front matter, headings, bibliography, etc. as appropriate, save as DOCX, and get calibre to convert that to EPUB.
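If you would rather script the text clean-up step than do it in an editor, something along these lines might serve as a starting point. It is a sketch, not a tested workflow: it assumes the exported TXT has one broken fragment per line (blank lines in between are ignored), and the names `unbreak_lines` and `TERMINATORS` are hypothetical.

```python
# Characters treated as a genuine paragraph end. An assumption; adjust
# for the book's punctuation style.
TERMINATORS = ".!?:;\"'”’)]"


def unbreak_lines(text: str) -> str:
    """Rebuild paragraphs from text whose sentences were cut by stray breaks."""
    paragraphs = []
    current = ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # blank lines are ignored; only punctuation decides
        if not current:
            current = line
        elif current.endswith("-") and line[0].islower():
            # A word hyphenated across the break: rejoin without a space.
            current = current[:-1] + line
        elif current[-1] in TERMINATORS:
            # The previous fragment ended properly: start a new paragraph.
            paragraphs.append(current)
            current = line
        else:
            # No terminal punctuation: treat the break as spurious.
            current += " " + line
    if current:
        paragraphs.append(current)
    return "\n\n".join(paragraphs)
```

The output has one paragraph per block, separated by blank lines. Chapter titles and other lines without final punctuation get merged into the next paragraph, and genuinely hyphenated compounds split at a line end lose their hyphen, so both need a manual pass before the styling step.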

If you use WordPerfect you can save as EPUB directly, which does a better job of mapping its styling to the EPUB CSS.

There are also useful add-ins for Word: EPUB Tools (it is in the MR Workshop forum) and TransTools, which has an excellent Unbreaker tool.

BR
Old 09-09-2025, 10:05 AM   #12
JSWolf
Resident Curmudgeon
When I did a search for this book, this is what I found.

The Color of Christ
The Son of God and the Saga of Race in America
Edward J. Blum, Paul Harvey

https://www.perlego.com/book/538115/...in-america-pdf

It says it's an ePub eBook. But if it's really from a PDF, then that would explain why it's so messed up.

Last edited by JSWolf; 09-09-2025 at 10:09 AM.
Old 09-09-2025, 02:58 PM   #13
Globe Trotsky
Junior Member
EPUB

Quote:
Originally Posted by Karellen View Post
@Globe Trotsky

Careful how you answer the question "Where did the epub come from". It has no bearing on your original question of how to fix the problem.
There be wolves among the sheep.
It's definitely an EPUB, which was downloaded from Z-Library. Screenshot attached.
Attached image: Screen Shot 2025-09-09 at 19.55.40.jpg
Old 09-09-2025, 03:04 PM   #14
Globe Trotsky
Junior Member
Quote:
Originally Posted by JSWolf View Post
When I did a search for this book, this is what I found.

The Color of Christ
The Son of God and the Saga of Race in America
Edward J. Blum, Paul Harvey

https://www.perlego.com/book/538115/...in-america-pdf

It says it's an ePub eBook. But if it's really from a PDF, then that would explain why it's so messed up.
Definitely an EPUB, downloaded into calibre.
Attached image: Screen Shot 2025-09-09 at 19.55.40.jpg
Old 09-09-2025, 03:10 PM   #15
DNSB
Bibliophagist
For what it may be worth, I found the book on Amazon and Kobo and downloaded a sample from each. The samples had neither the page numbers nor the extraneous line breaks.

Kobo CA: https://www.kobo.com/ca/en/ebook/the-color-of-christ-1

Amazon CA: https://www.amazon.ca/Color-Christ-S.../dp/B009DH7YR8

That the screenshots posted by the OP were from a PDF to ePub conversion seems an inescapable conclusion. Since the source mentioned by the OP in a later message was Z-Library, a rather well-known pirate site, I'm out of this discussion.

Am I a nasty, suspicious person? Very likely. Decades spent in IT do that to a person.
Attached images: screen_001.png, screenshot_2025_09_09T12_11_41-0700.png, jolly_roger.png

Last edited by DNSB; 09-09-2025 at 03:31 PM. Reason: Added links to the book on Amazon and Kobo, added jolly roger to images