Old 12-08-2023, 12:48 AM   #5
democrite
Evangelist
Thank you, Tex2002ans. I'll take a closer look at what you mentioned.

As there are something like 6000+ occurrences, a plain regex doesn't help much, I think, particularly when I want to verify that each change is actually wanted. Sometimes the hyphen should stay and only the space should be removed, and sometimes the joined term doesn't match a dictionary.
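To make that distinction concrete, here is a rough sketch in plain Python. It is not calibre's code; the names `BROKEN`, `build_vocab` and `fix_broken_hyphens` are made up for illustration, and it assumes ASCII letters only and that the "dictionary" is simply the set of words that appear intact elsewhere in the same text.

```python
import re
from collections import Counter

# A word fragment, a hyphen, whitespace, then another fragment,
# e.g. "hyphen- ated" left behind by a line break in the PDF.
# Assumption: ASCII letters only; accented text needs a wider class.
BROKEN = re.compile(r"\b([A-Za-z]+)-\s+([A-Za-z]+)\b")


def build_vocab(text):
    """Count words (including hyphenated compounds) seen intact in the text."""
    words = re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", text)
    return Counter(w.lower() for w in words)


def fix_broken_hyphens(text, vocab):
    """Return (new_text, unresolved); unresolved matches are left untouched."""
    unresolved = []

    def repl(match):
        left, right = match.group(1), match.group(2)
        joined = left + right
        hyphenated = left + "-" + right
        if joined.lower() in vocab:
            return joined        # drop both hyphen and space
        if hyphenated.lower() in vocab:
            return hyphenated    # keep the hyphen, drop only the space
        unresolved.append(match.group(0))
        return match.group(0)    # unknown either way: leave for manual review

    return BROKEN.sub(repl, text), unresolved
```

Anything that lands in `unresolved` matches neither form, so those are the cases that would still need checking by hand.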

As for PDF, there are works that exist only as PDF, with no eBook edition, and they are important enough to be worth the trouble, as I don't just read them but study them for years.

With variations in the regex, the calibre method works well:

https://manual.calibre-ebook.com/fun...phenated-words

I then diff the changes, as I should be doing anyway. The results are decent. I haven't checked, but I think it uses the calibre dictionary rather than a dictionary built from terms in the EPUB, so there's a bit more to do, but it's OK for now.
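For the diff step outside the editor, something along these lines would do. It is only a sketch using Python's standard `difflib`; the helper name `review_diff` and the default file name `chapter.xhtml` are placeholders, not anything from calibre.

```python
import difflib


def review_diff(before, after, name="chapter.xhtml"):
    """Return a unified diff of two versions of a file's text as one string.

    `name` is only a label for the diff header (hypothetical file name).
    """
    diff = difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile=name + " (before)",
        tofile=name + " (after)",
    )
    return "".join(diff)
```

Printing the result for each changed file shows every join as a `-`/`+` pair of lines, and an empty string means nothing was changed.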