#1 |
Custom User Title
Posts: 10,970
Karma: 75337983
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
|
What regular expression system does Calibre use?
I'm asking mostly so I can find an online regex tester that returns the same results as Calibre. Occasionally a regex that seems to verify okay returns unexpected results in Calibre (good thing I back up).
#2 |
Deviser
Posts: 2,265
Karma: 2090983
Join Date: Aug 2013
Location: Texas
Device: none
|
Python.
Try Pythex |
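Since the answer is Python, a pattern can also be checked locally in a Python shell before running it in Calibre. This is only a minimal sketch, assuming the standard `re` module behaves like Calibre's engine for the pattern in question; the sample text and pattern are made up for illustration. One common source of mismatch with other testers is that Python writes replacement groups as `\1` or `\g<1>`, not `$1`.

```python
import re

# Hypothetical sample text and pattern, purely for illustration.
text = "Chapter 1\nIt was a dark night.\nChapter 2\nThe end."
pattern = r"^Chapter (\d+)$"

# re.MULTILINE makes ^ and $ match at every line boundary,
# not only at the start and end of the whole string.
matches = re.findall(pattern, text, flags=re.MULTILINE)
print(matches)

# Python-style group reference in the replacement: \g<1> (or \1), not $1.
renamed = re.sub(pattern, r"Section \g<1>", text, flags=re.MULTILINE)
print(renamed)
```

If an online tester is set to a different flavour (PCRE, JavaScript, .NET), the same pattern or replacement string may behave differently there.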
#3 |
Custom User Title
Posts: 10,970
Karma: 75337983
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
|
Thank you!
|
#4 |
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
I like https://regex101.com/. The explanation of the parsing helps a lot when debugging a complex expression.
|
#5 | |
Still reading
Posts: 14,012
Karma: 105092227
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
|
I use KATE to test any regex before destroying something with a Perl program. Perl seems to exist purely as a wrapper for regex, and I find it head-wrecking for the "ordinary" things I've done in VB6, C, C++, Modula-2, Java etc. There is a Perl site that's good for regex help.

When I first moved completely from Windows to Linux I ran Notepad++ in WINE, as it's a good multi-document tabbed editor with regex and customisable syntax highlighting. Then I discovered KATE.

Perl, and regex in general, seems close to a "write-only" language. I find it hard to see what a regex might do.

I found that while LO Writer has good regex search, you can't use parameters in the Replace field. It also has some non-standard search symbols? It does seem to use a Python library for it, but I could be wrong. I do use it to find malformed dialogue punctuation, spaces where there should be none, and incorrectly terminated paragraphs, though that also finds titles, headings and preambles, as they aren't punctuated like body text.

So I've rarely used search & replace in Calibre. If a book from Gutenberg or wherever needs a lot of work, I'd export it converted to RTF, edit and save in ODT, and do a final extra Save As in DOCX for Calibre to import.

Last edited by Quoth; 06-21-2020 at 04:23 AM. |
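The kind of check described in this post (spaces where there should be none) can be sketched as a couple of Python regexes. This is only an illustration with made-up sample text and hypothetical variable names, not what LO Writer or Calibre runs internally:

```python
import re

# Made-up sample with a stray space before punctuation and a double space.
sample = 'He said , "Hello there ." Then he left  quickly.'

# One or more spaces directly before , . ; : ! or ?
space_before_punct = re.compile(r" +(?=[,.;:!?])")
# Two or more consecutive spaces.
multi_space = re.compile(r" {2,}")

cleaned = space_before_punct.sub("", sample)
cleaned = multi_space.sub(" ", cleaned)
print(cleaned)
```

Running the search on its own first (for example with `finditer`) before doing any replacement makes it easier to spot false positives such as headings.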
|
#6 |
Addict
Posts: 243
Karma: 291844
Join Date: Oct 2019
Device: Kobo Nia
|
|
#7 | |||||
Grand Sorcerer
Posts: 24,905
Karma: 47303824
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
|
|||||
#8 | |
Custom User Title
Posts: 10,970
Karma: 75337983
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
|
|
#9 |
Hedge Wizard
Posts: 802
Karma: 19999999
Join Date: May 2011
Location: UK/Philippines
Device: Kobo Touch, Nook Simple
|
Another Regex Option
I like the freeware 'Expresso' program. It will decode existing regexes and lets you create, test and store your own.
It is worth a look and can be used offline. |
#10 | |
Still reading
Posts: 14,012
Karma: 105092227
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
|
Since I'm getting source in docx or odt, it wouldn't make sense to edit in Calibre. If I get a PD book that just needs some CSS/HTML changes, I might edit in Calibre. If it has been very badly proofed from a scan of an ancient copy, or has really stupid formatting (or no formatting) of headings, images, index etc., I might export as RTF and Save As odt, then actually create/edit styles based on what I know works. If I were only "fixing" epubs I'd probably edit in Calibre. |
|
#11 |
Grand Sorcerer
Posts: 13,509
Karma: 78910112
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
|
I have fond memories from many years ago when I was first involved with Tivoli tools for systems management. The Distributed Monitoring component relied heavily on Perl scripts deployed to servers, and the Enterprise Console used Prolog. Those were the days.
#12 |
Bibliophagist
Posts: 46,168
Karma: 168983734
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Very fond memories here of using Forth. It was a fun language despite making it so easy to create write-only code. I must admit that I also loved my HP RPN calculators.
|
| Thread | Thread Starter | Forum | Replies | Last Post |
|---|---|---|---|---|
| Calibre book adding: Regular expression request... | Tweetygirl10111 | Calibre | 13 | 07-13-2017 09:14 PM |
| Calibre book adding: Regular expression request... | Spiffy | Calibre | 34 | 01-19-2016 01:03 PM |
| Calibre Portable insists on using default regular expression for metadata info | larryvega | Library Management | 2 | 03-08-2014 07:42 AM |
| Regular Expression Help | iKarampa | Calibre | 13 | 12-15-2010 07:17 AM |
| Regular expression help | krendk | Calibre | 4 | 12-04-2010 04:32 PM |