#1 |
A Hairy Wizard
Posts: 3,301
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
Regex for Marking Text?
Does anyone know of a way to restrict a find/replace to marked text....within the search parameter? I know you can manually mark the text with Ctrl + Shift + M, but I'm hoping to do it as part of a saved search group.
E.g.
Step one: find: <table(.*?)>(.*?)</table> and set as marked text
Step two: find (within marked text): <p>(.*?)</p> and replace: \1
Step three: Change the 'marked text' selector back to "All HTML Files"
Thanks, |
#2 |
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
|
If it's usually only one section that you want to update, simply highlight it, right-click it and select Mark Selected Text from the context menu.
All Find & Replace actions will only be applied to that section. |
#3 |
A Hairy Wizard
Posts: 3,301
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
Yes, I'm familiar with the manual method. In this particular instance I am trying to clean out all the <p> tags inside of a table....and there are over 200 tables.
My other option, of course, is to have a much more complex search, but I was worried about having too many capture groups (.*?)...isn't there a limit of like 5 or 6 within a given regex? |
#4 |
Enthusiast
Posts: 39
Karma: 59154
Join Date: May 2010
Location: Stuttgart, Germany
Device: Kobo H2O, PocketBook Touch HD, Tolino Vision 4
|
I don't have a solution for a Sigil regex, but if you want to try the calibre editor, you can use the function mode.
The function mode gives you the possibility of more advanced text processing. It should be perfect for your problem:
Open your book with the calibre editor.
Choose one of the HTML files in the left panel and press Ctrl+F.
Insert "<table[^>]*>.*?</table>" in the "Find:" field.
Choose the mode "Regex-function" and click create/edit.
In the new window you will see the basic structure of a function. In the upper field you can enter a name (maybe "clean_tables"), and you should replace the given basic function with:
Code:

    import regex

    def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
        # match.group() is the whole matched table; unwrap every <p>...</p> inside it
        return regex.sub(r'<p[^>]*>(.*?)</p>', r'\1', match.group())

Klecks |
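For anyone who wants to try the idea outside the editor first, here is a minimal sketch of the same two-level substitution using Python's standard `re` module instead of the `regex` module used above. The names `strip_p_tags` and `clean_tables` and the sample markup are made up for illustration. Two things differ from the function above and are my assumptions: `re.DOTALL` is set so tables and paragraphs that span several lines still match, and `<p\b` is used so tags like `<pre>` are not caught.

```python
import re

# Outer pattern: one whole table, including any line breaks inside it.
TABLE_RE = re.compile(r'<table[^>]*>.*?</table>', re.DOTALL)
# Inner pattern: a <p> element; group 1 is its content.
# \b stops the pattern from matching <pre>, <param>, and similar tags.
P_RE = re.compile(r'<p\b[^>]*>(.*?)</p>', re.DOTALL)


def strip_p_tags(match):
    """Replacement callback: return the matched table without <p> wrappers."""
    return P_RE.sub(r'\1', match.group())


def clean_tables(html):
    """Unwrap <p> elements inside tables and leave all other paragraphs alone."""
    return TABLE_RE.sub(strip_p_tags, html)


sample = (
    '<p>Intro</p>\n'
    '<table>\n'
    '<tr><td><p>Cell A</p></td><td><p class="x">Cell B</p></td></tr>\n'
    '</table>\n'
    '<p>Outro</p>'
)
cleaned = clean_tables(sample)
print(cleaned)
```

The paragraphs before and after the table keep their `<p>` tags because the inner substitution only ever sees the text of a table match. Nested tables would need more care, since the non-greedy outer pattern stops at the first `</table>`.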
#5 |
A Hairy Wizard
Posts: 3,301
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
Wow - that's awesome! I had played with regex functions a while back and knew they were powerful, but hadn't thought about them in a while. I guess that shows we can get stuck in our ruts and fail to think outside of the comfort box.
Thanks! |
#6 |
A Hairy Wizard
Posts: 3,301
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
I don't know the proprieties about sharing code between two open-source projects like Sigil & Calibre.
Is it a thing/possibility to implement the regex-functions capability within Sigil? |
#7 |
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
Search: <td>\s+<p>([^<]+)</p>\s+</td>
Replace: <td>\1</td>
[...]
Search: <td colspan="([0-9]+)">\s+<p>([^<]+)</p>\s+</td>
Replace: <th colspan="\1">\2</th>
You could adapt something similar for your specific case.
Last edited by Tex2002ans; 09-13-2018 at 05:20 PM. |
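To see what these two search/replace pairs do before running them on a book, they can be tried in plain Python. This is only a sketch: the `RULES` list, the `apply_rules` helper and the sample markup are hypothetical, and Python's `re` module stands in for Sigil's own regex engine, so edge cases may behave differently.

```python
import re

# (search, replace) pairs copied from the post above.
RULES = [
    (r'<td>\s+<p>([^<]+)</p>\s+</td>', r'<td>\1</td>'),
    (r'<td colspan="([0-9]+)">\s+<p>([^<]+)</p>\s+</td>',
     r'<th colspan="\1">\2</th>'),
]


def apply_rules(html, rules=RULES):
    """Run each search/replace pair over the text, in order."""
    for pattern, replacement in rules:
        html = re.sub(pattern, replacement, html)
    return html


sample = (
    '<td>\n  <p>Plain cell</p>\n</td>\n'
    '<td colspan="2">\n  <p>Wide cell</p>\n</td>'
)
result = apply_rules(sample)
print(result)
```

Note that `\s+` requires at least one whitespace character between the tags, and `[^<]+` skips any cell whose paragraph contains inline tags such as `<i>`, so cells like that are left untouched.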
#8 |
Grand Sorcerer
Posts: 28,352
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I'm guessing a Sigil plugin could be worked up that would allow similar functionality. You're basically creating a library of small custom python routines (using predefined parameters) to alter text matched by a regexp.
Last edited by DiapDealer; 09-13-2018 at 06:03 PM. |
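As a rough picture of what such a library could look like, here is a minimal sketch in plain Python. It is not Sigil plugin code and uses no Sigil API: the `ROUTINES` registry, the `routine` decorator, the `run` helper and the two sample routines are all hypothetical names invented for this example.

```python
import re

# Hypothetical registry: routine name -> function(match) -> replacement text.
ROUTINES = {}


def routine(name):
    """Decorator that files a replacement function under a name."""
    def register(func):
        ROUTINES[name] = func
        return func
    return register


@routine('title_case')
def title_case(match):
    # Title-case whatever text the search pattern matched.
    return match.group().title()


@routine('strip_p')
def strip_p(match):
    # Unwrap <p> elements inside the matched text.
    return re.sub(r'<p\b[^>]*>(.*?)</p>', r'\1', match.group(),
                  flags=re.DOTALL)


def run(name, pattern, text, flags=0):
    """Apply the named routine to every match of pattern in text."""
    return re.sub(pattern, ROUTINES[name], text, flags=flags)


headings = run('title_case', r'(?<=<h2>)[^<]+(?=</h2>)',
               '<h2>a hairy wizard</h2>')
tables = run('strip_p', r'<table[^>]*>.*?</table>',
             '<table><tr><td><p>x</p></td></tr></table><p>y</p>',
             flags=re.DOTALL)
print(headings)
print(tables)
```

Every routine takes the match object and returns the replacement string, which is the same contract `re.sub` expects from a callable replacement.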
#9 |
A Hairy Wizard
Posts: 3,301
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
I've been a 'programmer' my whole life...not in any way professionally... basic/fortran/pascal/vba and wrote a little web app with Google Apps Script this week... which forced me to learn JS over the last couple of days. I haven't looked at python at all.
If it's OK with Kovid and the Calibre community I could peruse their code and give it my best shot at plug-in-ifying it??? |
#10 |
A Hairy Wizard
Posts: 3,301
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
|
#11 |
Grand Sorcerer
Posts: 28,352
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
|
#12 |
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
|
IMHO, one of the best Python crash courses is Al Sweigart's Creative Commons licensed book Automate the Boring Stuff with Python, which is available online.
You also might want to check out the Beautiful Soup Python module, which is bundled with Sigil. BTW, the last topic in the Sigil plugin development thread contains minimal code for a very simple plugin that boldens "the" in all HTML files. |
#13 |
A Hairy Wizard
Posts: 3,301
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
Great - thanks!
That'll give me something to do while I'm sitting around waiting for Florence to get a move on... |
#14 |
A Hairy Wizard
Posts: 3,301
Karma: 20171571
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 15/11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
"I know kung-fu."
Cue training montage to get my fingers familiar with what's in my brain... |