Old 03-04-2016, 11:02 AM   #16
theducks
Well trained by Cats
Quote:
Originally Posted by eschwartz View Post
We have a sticky thread you can post this in.




@theducks, would you mind fixing the thread title for that sticky? I think it predated Function-Replace mode.

"Saved Search" ==> "Saved Search/Regex Functions"

Done
Old 03-05-2016, 08:12 AM   #17
roger64
Wizard
Hi

This approach is very promising and interesting, as it makes it possible to correct some recurring OCR mistakes (after a dictionary check).

In French, we have a list of hundreds of words that deserve to be checked after OCR. With such an approach, of course, you can't avoid false positives, which means we should not correct these words automatically but just highlight them, to speed up manual checking later.

To give one example: "trame" is a correct but fairly rare word, and 98% of the time it should have been written "traine". It makes sense to highlight it.

However, I do not know how to write a single entry.

Could some kind soul write a code example of this calibre function that would let me highlight "trame"?
Old 03-05-2016, 08:55 AM   #18
kovidgoyal
creator of calibre
Simply modify the function to wrap the word in some markup instead of correcting it, so instead of replacing the word trame with its correction, replace it with <span class="mistake">trame</span>

Then you can use another search to jump to these words one by one and correct them or not, as you like.
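
A minimal sketch of this suggestion, assuming the function is used in the editor's Regex-function mode with a search expression that matches whole words. The `SUSPECT_WORDS` set is a hypothetical name introduced here for illustration; the `mistake` class is the one suggested above. Matches that are not in the set are returned unchanged.

```python
import re

# Hypothetical list of correct-but-suspicious words; "trame" is the example
# from this thread. Extend it with your own OCR trouble words.
SUSPECT_WORDS = {"trame"}


def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    word = match.group()
    if word.lower() in SUSPECT_WORDS:
        # Highlight instead of correcting, so the word can be reviewed later.
        return '<span class="mistake">%s</span>' % word
    # Leave every other match exactly as it was.
    return word


# Outside the editor, the same function can be tried with plain re.sub:
sample = "La trame du manteau"
highlighted = re.sub(
    r"\btrame\b",
    lambda m: replace(m, 1, "", None, None, {}, None),
    sample,
)
```

A follow-up search for something like `<span class="mistake">(.*?)</span>` would then step through the highlighted words one by one; replacing with `\1` removes the markup again once a word has been checked.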
Old 03-05-2016, 07:14 PM   #19
roger64
Wizard
@kovidgoyal

Thanks for your neat idea which is so simple that I understood it.

Now to try it.

Last edited by roger64; 03-05-2016 at 07:17 PM.
Old 03-11-2016, 07:02 AM   #20
Arjayem
Casual Member
Posting

If that's aimed at me, I'd be happy to oblige; "Regex Functions" seems an apt tag. But I'm new to this and the suggestion goes way over my head. Can you step through it?






Quote:
Originally Posted by eschwartz View Post
We have a sticky thread you can post this in.




@theducks, would you mind fixing the thread title for that sticky? I think it predated Function-Replace mode.

"Saved Search" ==> "Saved Search/Regex Functions"

Old 03-11-2016, 12:39 PM   #21
eschwartz
Ex-Helpdesk Junkie
YOU just repeat your post, in that sticky thread.

@theducks has already performed the moderator administration duties I asked for.
Old 03-11-2016, 12:50 PM   #22
BetterRed
null operator
Quote:
Originally Posted by eschwartz View Post
YOU just repeat your post, in that sticky thread.

@theducks has already performed the moderator administration duties I asked for.
Rather than the OP doing that, and to avoid having 'duplicate' posts, I suggest theducks merges this thread into the sticky "Saved Search/Regex Functions" thread.

BR
Old 03-11-2016, 01:14 PM   #23
theducks
Well trained by Cats
Quote:
Originally Posted by BetterRed View Post
Rather than the OP doing that, and to avoid having 'duplicate' posts, I suggest theducks merges this thread into the sticky "Saved Search/Regex Functions" thread.

BR
I am worried about breaking things
The forum (Mod actions) prompts are a bit
Old 03-11-2016, 02:34 PM   #24
BetterRed
null operator
Moderator Notice
Merge done
Old 05-08-2017, 10:54 PM   #25
nqk
Evangelist
Dear all,

Please help me with a regex function that moves the matched text to an "endnotes.html" file in the book. I picked up some code from this forum, which looks like this:
Code:
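def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Append each matched text to an external file, one match per line.
    # The raw string keeps the backslash in the Windows path literal.
    with open(r'D:\endnotes.txt', 'a') as endnotes:
        endnotes.write(match.group() + '\n')
    # Returning an empty string removes the matched text from the book.
    return ''
# Process the book's files in reading (spine) order.
replace.file_order = 'spine'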
This works: it moves all the matched text to the external endnotes.txt, and I then manually copy it into an endnotes.html file that I create in the book.

What I would like is to have the matches written directly to endnotes.html (or even have the file created if it is not there already).

Thanks.
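
A partial sketch toward this request, under stated assumptions: it collects the matches in the `data` dict that calibre passes to every call, and uses `replace.call_after_last_match = True` so that the function is called one extra time with `match` set to `None`, at which point it writes a complete HTML page in one go. It still writes to an external file (the `ENDNOTES_PATH` location and `PAGE_TEMPLATE` are hypothetical names chosen here), so adding that file to the book remains a manual step. It assumes Python 3.

```python
import os
import re
import tempfile

# Hypothetical output location; change it to wherever you want the file.
ENDNOTES_PATH = os.path.join(tempfile.gettempdir(), 'endnotes.html')

# Minimal XHTML wrapper around the collected notes (an assumption, not a
# calibre template).
PAGE_TEMPLATE = '''<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>Endnotes</title></head>
<body>
%s
</body>
</html>
'''


def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    if match is None:
        # Extra call after the last match: write everything collected so far.
        notes = data.get('notes', [])
        with open(ENDNOTES_PATH, 'w', encoding='utf-8') as f:
            f.write(PAGE_TEMPLATE % '\n'.join(notes))
        return ''
    # Normal call: remember the matched text and remove it from the book.
    data.setdefault('notes', []).append(match.group())
    return ''


# Ask for the extra final call, and process files in reading order.
replace.call_after_last_match = True
replace.file_order = 'spine'
```

Writing the file once at the end, rather than appending on every match, means a second run overwrites the page instead of duplicating the notes.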
Old 02-23-2018, 08:17 AM   #26
roger64
Wizard
Tagging selected foreign words

If we manage to wrap this
Code:
<span xml:lang="xx" lang="xx">foreign</span>
around selected foreign words, we may help improve TTS reading. Foreign words will then be pronounced with their native accent instead of in a haphazard and sometimes unintelligible way.

The prerequisite is a list.txt of the selected words, which can easily be selected and copied from the spell checker's panel of unrecognized words (to which they all belong). For the purpose of this thread, assume that such a list is available.

Help requested

I am looking for a function that I could run on an ePub in the calibre editor and that would work through this list, wrapping the above span around each occurrence of these foreign words.
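
One possible shape for such a function, as a sketch only. It assumes list.txt holds one word per line; `load_words`, `build_pattern` and `LANG` are hypothetical helpers named here for illustration. The idea is to turn the list into a single search expression (run outside the editor, then pasted into the search box) and let a very small replace function add the span.

```python
import re

# Language code to insert; "xx" is the placeholder used in the post above.
LANG = 'xx'


def load_words(path):
    # One word per line; blank lines are ignored.
    with open(path, encoding='utf-8') as f:
        return [line.strip() for line in f if line.strip()]


def build_pattern(words):
    # Longest entries first, so a long word wins over a shorter one that is
    # its prefix. re.escape protects any regex metacharacters in the words.
    ordered = sorted(set(words), key=lambda w: (-len(w), w))
    return r'\b(?:%s)\b' % '|'.join(re.escape(w) for w in ordered)


def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Wrap the matched foreign word in a language-tagged span.
    return '<span xml:lang="%s" lang="%s">%s</span>' % (LANG, LANG, match.group())


# Trying it outside the editor with two sample words:
pattern = build_pattern(['zeitgeist', 'schadenfreude'])
tagged = re.sub(
    pattern,
    lambda m: replace(m, 1, '', None, None, {}, None),
    'The zeitgeist favoured schadenfreude.',
)
```

Two caveats: a pattern like this also matches the words inside tags and attribute values, and running it twice wraps already tagged words again, so it is best tried on a copy of the book first.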

Last edited by roger64; 02-23-2018 at 08:26 AM.
Old 02-09-2020, 08:00 PM   #27
roger64
Wizard
Search and replace excluding the headers

I am trying to use old-style numerals (old nums) in an ePub. For this I need to wrap a span around the digits, like:

Code:
(\d+)
<span class="smcp" >\1</span>
However, I need to exclude the headers; otherwise I have to correct all of them...
Could a function do it?

Last edited by roger64; 02-09-2020 at 08:17 PM.
Old 02-10-2020, 01:17 AM   #28
davidfor
Grand Sorcerer
Quote:
Originally Posted by roger64 View Post
Search and replace excluding the headers

I am trying to use old-style numerals (old nums) in an ePub. For this I need to wrap a span around the digits, like:

Code:
(\d+)
<span class="smcp" >\1</span>
However, I need to exclude the headers; otherwise I have to correct all of them...
Could a function do it?
If this is just numbers in paragraphs, you could do:

Code:
(<p>.*?)(\d+)(.*?</p>)
With the replace:

Code:
\1<span class="smcp" >\2</span>\3
That has a problem if a paragraph contains more than one number: each match consumes the whole paragraph, so only the first number in it is wrapped per pass. The following might be better, but it still has issues:

Code:
(<p>.*?[^\>\d])(\d+)(.*?</p>)
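
As for whether a function could do it, here is a hedged sketch of one way. It assumes Regex-function mode with a search expression that matches a whole paragraph, for example `<p\b[^>]*>.*?</p>` (with the dot matching newlines if paragraphs span several lines), so headers are never matched. Inside each paragraph the function wraps every run of digits found in the text, while passing tags and character entities through untouched. `TAG_ENTITY_OR_DIGITS` and `wrap_digits` are names introduced here.

```python
import re

# Group 1: a whole tag or a character entity, to be copied through unchanged.
# Group 2: a run of digits in the text, to be wrapped.
TAG_ENTITY_OR_DIGITS = re.compile(r'(<[^>]+>|&#?\w+;)|(\d+)')


def wrap_digits(m):
    if m.group(1) is not None:
        # Tag or entity: digits in attributes or in &#160; stay as they are.
        return m.group(1)
    return '<span class="smcp">%s</span>' % m.group(2)


def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # match.group() is one whole paragraph; rewrite all the numbers inside it.
    return TAG_ENTITY_OR_DIGITS.sub(wrap_digits, match.group())


# Trying it outside the editor on a single paragraph:
src = '<p class="c1">In 1984 and 2001&#160;x</p>'
m = re.search(r'<p\b[^>]*>.*?</p>', src, re.S)
result = replace(m, 1, '', None, None, {}, None)
```

Because each paragraph is handled in a single call, several numbers per paragraph are not a problem. Running the replacement a second time would wrap the same numbers again, so it should be applied only once.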
Old 02-10-2020, 06:05 AM   #29
roger64
Wizard
@davidfor

Thanks very much for your help. I'll try it.
Old 07-02-2020, 07:34 PM   #30
Ted Friesen
Enthusiast
Regex-function code

In regex-function mode "create/edit" brings up a dialogue box with two sections: 1. function name and 2. Code.

Function name has a drop down list of about a dozen functions. I thought that choosing a built-in function would populate the code panel with the appropriate Python code so I could learn from it, but it does not change from the default:

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    return ''

There are a few examples in the User Manual, but they do not cover all of the built-in functions, which would be handy. Where might I find the code for those functions and others? Is there, perhaps, a thread like the saved searches thread?

If this is some secret that a novice shouldn't know, please let me know. I don't want to accidentally cross the beams and annihilate the universe.