#1
Zealot
Posts: 136
Karma: 6368
Join Date: Nov 2018
Location: Italy
Device: Kindle P.white3/Kobo Clara 2E
How to repair broken lines?
Good morning.
I have a long text to export to EPUB, but unfortunately it often starts a new paragraph in the wrong place 🥲 So in the EPUB I get this:
Code:
<p>My sentence starts here, but it should not</p> <p>keep on here</p>

#2
Resident Curmudgeon
Posts: 83,531
Karma: 153646249
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
It looks like a PDF conversion. Do you happen to have the original source?

#3
Zealot
Posts: 136
Karma: 6368
Join Date: Nov 2018
Location: Italy
Device: Kindle P.white3/Kobo Clara 2E
Yes, I do, but do you have a solution?

#4
Resident Curmudgeon
Posts: 83,531
Karma: 153646249
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3

#5
Resident Curmudgeon
Posts: 83,531
Karma: 153646249
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
The problem is that while you can use regex to join broken lines so that sentences are no longer split, you cannot get the paragraphs right without doing it by hand.
How would you know where one paragraph ends and the next begins?

#6
Zealot
Posts: 136
Karma: 6368
Join Date: Nov 2018
Location: Italy
Device: Kindle P.white3/Kobo Clara 2E
Because the broken line restarts with a lowercase initial letter.

#7
Wizard
Posts: 2,264
Karma: 8891824
Join Date: Jun 2010
Device: Kobo Clara HD,Hisence Sero 7 Pro RIP, Nook STR, jetbook lite
See this thread for some suggestions:
https://www.mobileread.com/forums/sh...d.php?t=237181
This is what I use in the calibre editor.
Search:
Code:
([a-z])</p>\s*<p[^>]+>([a-z])
Replace:
Code:
\1 \2
P.S. Make a backup copy before you start.
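If you would rather script the same search and replace outside the calibre editor, here is a minimal Python sketch, not a tested tool. The function name `join_lowercase_breaks` is made up for illustration, and the pattern is loosened on purpose: it uses `<p\b[^>]*>` rather than `<p[^>]+>`, so that a bare `<p>` with no attributes (as in the example in the first post) is matched as well. That change is an assumption about your markup.

```python
import re

# Adapted from the calibre search above. Assumption: [^>]* (zero or more
# attribute characters) so that plain <p> tags match; \b keeps <pre> etc. out.
BROKEN = re.compile(r"([a-z])</p>\s*<p\b[^>]*>([a-z])")


def join_lowercase_breaks(html):
    """Merge a paragraph that ends in a lowercase letter with a following
    paragraph that starts with a lowercase letter, inserting one space."""
    previous = None
    # Repeat until nothing changes, so chains of very short fragments
    # (where one match consumes the next fragment's only letter) are joined.
    while previous != html:
        previous = html
        html = BROKEN.sub(r"\1 \2", html)
    return html


if __name__ == "__main__":
    sample = "<p>My sentence starts here, but it should not</p> <p>keep on here</p>"
    print(join_lowercase_breaks(sample))
```

As with the editor version, run it on a copy. It only looks at the two letters around the break, so a paragraph that really does end in a lowercase letter and is followed by one starting in lowercase would be merged too.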

#8
Zealot
Posts: 139
Karma: 2000
Join Date: Nov 2025
Device: none
You guys are lucky enough to have capitalization to tell whether a sentence is broken. Fixing this in Chinese is a real pain in the a$$.
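For text without letter case, one possible heuristic (my assumption, not something suggested in this thread) is to look at how the paragraph ends instead: if it does not end in sentence-final punctuation, treat the break as suspect. A rough Python sketch follows; the name `join_unterminated` and the `TERMINATORS` list are hypothetical and would need tuning for a real book.

```python
import re

# Characters assumed to legitimately end a paragraph; extend as needed.
TERMINATORS = "。！？…：；」』”’）》.!?"

# A paragraph whose last character is not a terminator, not whitespace and
# not '>' (i.e. not the end of another tag), followed by the next <p ...>.
NO_TERMINATOR = re.compile(
    r"([^" + re.escape(TERMINATORS) + r"\s>])</p>\s*<p\b[^>]*>"
)


def join_unterminated(html):
    """Join paragraphs that end without sentence-final punctuation.
    No space is inserted, since Chinese text does not separate words."""
    return NO_TERMINATOR.sub(r"\1", html)


if __name__ == "__main__":
    sample = "<p>这是一个被打断</p>\n<p>的句子。</p>\n<p>下一段。</p>"
    print(join_unterminated(sample))
```

Headings, list items and lines of verse usually end without punctuation as well, so this will merge some things it should not. The results still need checking by hand.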

#9
Resident Curmudgeon
Posts: 83,531
Karma: 153646249
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
And then there's the issue of where a paragraph ends or starts, which there is no way to automate.

#10
Resident Curmudgeon
Posts: 83,531
Karma: 153646249
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
This cannot be automated. You will have to do some manual fixing. Automation cannot find where the paragraphs begin or end. You may be able to fix most sentences, but you may not be able to fix every sentence. And you may not be able to fix the paragraphs.
This is a lose-lose situation unless you manually do some of the fixing.
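Since some manual checking is needed either way, one option is to list the suspicious breaks first and change nothing until they have been reviewed. The sketch below is only an illustration under my own assumptions: `list_candidates` is a hypothetical helper, and "suspicious" here means a paragraph ending in a lowercase letter, comma, semicolon or colon, followed by one starting in lowercase.

```python
import re

# Up to 30 characters of context on each side of a suspicious break.
CANDIDATE = re.compile(
    r"([^<>]{0,30}[a-z,;:])</p>\s*<p\b[^>]*>([a-z][^<>]{0,30})"
)


def list_candidates(html):
    """Return (end_of_paragraph, start_of_next) pairs for manual review.
    Nothing in the input is modified."""
    return [(m.group(1), m.group(2)) for m in CANDIDATE.finditer(html)]


if __name__ == "__main__":
    sample = "<p>My sentence starts here, but it should not</p> <p>keep on here</p>"
    for tail, head in list_candidates(sample):
        print(repr(tail), "|", repr(head))
```

Reading through such a list is usually quicker than reading the whole book, but it only shows the breaks this pattern considers suspicious. Real paragraph boundaries that were lost in conversion will not appear in it.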

#11
null operator (he/him)
Posts: 22,684
Karma: 33011292
Join Date: Mar 2012
Location: Sydney Australia
Device: none
If you have MS Word 2016 or later, install the TransTools add-on (payware), open the PDF in Word, save as DOCX, use the TransTools "Unbreaker" feature, then convert the DOCX to EPUB.
BR

#12
Resident Curmudgeon
Posts: 83,531
Karma: 153646249
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:

#13
null operator (he/him)
Posts: 22,684
Karma: 33011292
Join Date: Mar 2012
Location: Sydney Australia
Device: none
IIRC it has a limited-time free trial, so why don't you try it and judge for yourself?

#14
Zealot
Posts: 136
Karma: 6368
Join Date: Nov 2018
Location: Italy
Device: Kindle P.white3/Kobo Clara 2E
Thank you all. Finally, I decided to give in. Such a waste of time!

#15
Wizard
Posts: 1,828
Karma: 9501034
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
Quote:
I've come across this problem a few times before. Easy to fix... But there are a few follow-up edits afterwards. So do your major fixes, then read the book and fix the remaining errors as you come across them.
I use a simpler regex than @gbm. My first pass is the following, which will catch a good 90% of the splits...
Find... (\w)</p>\s+<p>(\w)
Replace... \1 \2
Then this will catch most of the remaining...
Find... ([;:,\-–—])</p>\s+<p>(\w)
Replace... \1 \2
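The same two passes can be sketched in Python for use outside an editor. This is an illustration rather than a finished tool, and the function name `two_pass_join` is hypothetical. Inside a character class a hyphen between two characters defines a range, so the sketch escapes it to make sure it is read as a literal hyphen.

```python
import re

# Pass 1: word character at the end of one paragraph, word character at the
# start of the next.
PASS_1 = re.compile(r"(\w)</p>\s+<p>(\w)")
# Pass 2: paragraph ends in mid-sentence punctuation (hyphen escaped).
PASS_2 = re.compile(r"([;:,\-–—])</p>\s+<p>(\w)")


def two_pass_join(html):
    """Apply both passes in order, joining the fragments with one space."""
    html = PASS_1.sub(r"\1 \2", html)
    return PASS_2.sub(r"\1 \2", html)


if __name__ == "__main__":
    sample = (
        "<p>It was late,</p>\n<p>and the rain</p>\n"
        "<p>kept falling.</p>\n<p>Nobody came.</p>"
    )
    print(two_pass_join(sample))
```

Note that `\w` also matches uppercase letters and digits, so an unpunctuated line followed by a paragraph starting with a capital would be joined as well, which is one reason the follow-up read-through is needed. Both patterns need at least one whitespace character between `</p>` and a bare `<p>`, so paragraphs carrying attributes are left alone.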