#1 |
Junior Member
Posts: 1
Karma: 10
Join Date: Jun 2018
Device: Kobo Aura 2nd Edition
|
Hey guys, I need help (since I don't know anything about Python or regex functions) with creating a function that removes the unnecessary paragraph breaks that occur when converting PDFs to EPUB.
I have tried using Find & Replace with a simple expression like: </p> <p class="calibre2">[a-z] since correct paragraphs start with a capital letter, but the problem is that I don't want it to select the matched lower-case letter. I tried something like: </p> <p class="calibre2">?([a-z]) but the matched lower-case letter still gets selected. Thanks in advance.
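For reference, one way to require a lower-case letter without selecting it is a lookahead, (?=[a-z]). Below is a minimal Python sketch of the idea; the sample markup is made up for illustration, and only the calibre2 class comes from the post. The same pattern will probably work in the editor's Find field too, assuming its regex engine supports lookahead.

```python
import re

# Hypothetical sample of PDF-converted markup (not from a real book).
html = ('<p class="calibre2">The quick brown</p> '
        '<p class="calibre2">fox jumped.</p> '
        '<p class="calibre2">Next paragraph.</p>')

# (?=[a-z]) is a lookahead: a lower-case letter must follow,
# but the letter itself is not part of the match.
pattern = re.compile(r'</p>\s*<p class="calibre2">(?=[a-z])')

# Replace the spurious paragraph break with a single space.
joined = pattern.sub(' ', html)
print(joined)
```

Only the break before "fox" is joined; the break before "Next" stays because a capital letter follows it.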
#2 |
Well trained by Cats
Posts: 31,054
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
I have a series of 'cleanups' I use.
Note: This is copied from Sigil's saved searches. Ignore the leading numbers and the lines with Name= (they describe what each search does). Each doubled backslash (\\) should be a single \, and \" is a plain ".
Code:
80\Name=Cleanup/Joins/Join to lower
80\Find="([[:alpha:],]\x201d*)</p>\\s*<p\\b[^>]*>([a-z\x201c])"
80\Replace=\\1 \\2
81\Name=Cleanup/Joins/Join to upper
81\Find="([[:alpha:],]\x201d*)</p>\\s*<p\\b[^>]*>([A-Z\x201c])"
81\Replace=\\1 \\2
87\Name=Cleanup/Joins/Honorifics
87\Find="(Mr|Mrs|Ms|Dr|Prof)\\.</p>\\s+<p class=\"calibre\\d+\">([A-Z])"
87\Replace=\\1. \\2
88\Name=Cleanup/Joins/de BR w/punct
88\Find="([[:punct:]])<br class=\"calibre4\" />\\s+(\"*[A-Za-z\x201c])"
88\Replace="\\1</p><p class=\"calibre4\">\\2"
Note: I kept it simple and just replace the captures. (Wishlist PI: Import Sigil saved searches)
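As a rough illustration of what the "Join to lower" search does, here is a sketch of it translated for Python's re module. It rests on a few assumptions: \\ in the saved-search file stands for a single backslash, \x201d and \x201c are the file's escapes for the curly quotes ” and “, and [[:alpha:]] is approximated by A-Za-z because re has no POSIX classes. The function name and sample text are made up for the example.

```python
import re

# "Join to lower", translated under the assumptions stated above.
# Group 1: the last letter or comma (plus any closing curly quotes) of a line.
# Group 2: the lower-case letter or opening curly quote starting the next one.
JOIN_TO_LOWER = re.compile(r'([A-Za-z,]\u201d*)</p>\s*<p\b[^>]*>([a-z\u201c])')


def join_broken_paragraphs(html: str) -> str:
    """Join paragraphs that were split mid-sentence (hypothetical helper)."""
    # Put both captures back with a single space between them,
    # mirroring the saved search's "\1 \2" replacement.
    return JOIN_TO_LOWER.sub(r'\1 \2', html)


sample = ('<p class="calibre2">It was a dark and stormy</p>\n'
          '<p class="calibre2">night, and the rain fell.</p>\n'
          '<p class="calibre2">Then it stopped.</p>')
result = join_broken_paragraphs(sample)
print(result)
```

Because both captures are written back in the replacement, it does not matter that the lower-case letter is part of the match. The paragraph ending in "fell." is left alone, since a full stop is neither a letter nor a comma.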
#3 |
Wizard
Posts: 1,166
Karma: 1410083
Join Date: Nov 2010
Location: Germany
Device: Sony PRS-650
|
It may be better to check your conversion preferences first. Your problem is a very common result of a wrong conversion setup for PDF.
In the PDF Input preferences, reduce the line unwrapping factor from its default of 0.45 to a value between 0.12 and 0.25. You will find that this reduces most of the problem to a minimum.
#4 |
Bibliophagist
Posts: 46,190
Karma: 168983734
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
#5 |
Book E d i t o r
Posts: 432
Karma: 288184
Join Date: May 2015
Device: Laptop
|
"Reduce the standard line unwrapping factor of 0.45 at PDF input preferences to a value between 0.25 to 0.12."
Enable Heuristics and change the line unwrap factor to 0.22. This will help to keep paragraphs together, so editing will be minimal.
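If you convert from the command line instead of the GUI, the same settings can be passed to calibre's ebook-convert tool. The sketch below only builds the command; the helper name and file names are hypothetical. It assumes the factor meant here is the PDF input option (--unwrap-factor), not the separate unwrap factor in the heuristics settings.

```python
import shlex


def build_convert_command(pdf_path, epub_path, unwrap_factor=0.22, heuristics=True):
    """Build an ebook-convert command line (hypothetical helper)."""
    # --unwrap-factor is the PDF input line unwrapping factor (default 0.45).
    cmd = ["ebook-convert", pdf_path, epub_path,
           "--unwrap-factor", str(unwrap_factor)]
    if heuristics:
        # --enable-heuristics turns on calibre's heuristic processing.
        cmd.append("--enable-heuristics")
    return cmd


cmd = build_convert_command("book.pdf", "book.epub")
print(shlex.join(cmd))
# To actually run the conversion (requires calibre to be installed):
# import subprocess; subprocess.run(cmd, check=True)
```

Trying a few values between 0.12 and 0.25 on the same PDF is an easy way to see which one leaves the fewest broken paragraphs.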