Old 10-16-2015, 08:36 AM   #1
moldy
Enthusiast
Posts: 43
Karma: 10
Join Date: Oct 2015
Device: Kindle
Find and Replace from Lists

Hi Everyone

I am new to regex, so I don't know whether what I am trying to do is possible. My skills are limited at the moment, but I am learning.

I would like to convert US books to UK versions by searching for US words and phrases and replacing them with their UK equivalents, e.g. 'color' to 'colour' and 'cell phone' to 'mobile phone'.

I would like to do this using two lists so I can add to the lists as I find new words or phrases to convert.

I think the function would be something like:

START
n=0
FETCH LIST#1/STRING#1(color)
FETCH LIST#2/STRING#1(colour)
FIND LIST#1/STRING#1 in the text
IF MATCH FOUND
REPLACE LIST#1/STRING#1 with LIST#2/STRING#1
n+1
GO TO SEARCHn

SEARCHn
FETCH LIST#1/STRING#2
FETCH LIST#2/STRING#2
FIND LIST#1/STRING#2 in the text
IF MATCH FOUND
REPLACE LIST#1/STRING#2 with LIST#2/STRING#2
n+1
GO TO SEARCHn

And so on until end of list.
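
For what it's worth, the loop described above could be sketched in Python roughly like this. It is only an illustration: the names US_WORDS, UK_WORDS and convert are made up for the example, and it does plain substring replacement, so 'color' inside a longer word is changed too.

```python
# Hypothetical parallel lists: entry i of US_WORDS is replaced by entry i of UK_WORDS.
US_WORDS = ['color', 'cell phone']
UK_WORDS = ['colour', 'mobile phone']


def convert(text):
    # Walk both lists in step and apply each find/replace pair in turn.
    for us, uk in zip(US_WORDS, UK_WORDS):
        text = text.replace(us, uk)
    return text


print(convert('My cell phone is a nice color.'))
```

New pairs are added by appending to both lists at the same position.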

Is this possible? Has this already been done in some way?
I would be very grateful if someone could point me in the right direction.

Thanks - moldy
Old 10-16-2015, 10:42 AM   #2
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
You'd need to create a single Search that finds ALL words, then the Function-Replace would switch between each word and its replacement.

You could also do an editor plugin which does each find-replace in turn. (It could take wordlists.)
But no one has written such a calibre editor plugin. In fact, the only publicly posted editor plugin is DiapDealer's plugin for dealing with spans and stuff. Would be nice if people would create more plugins.



Personally, my first step would be to switch the Language metadata to en_GB, activating the en_GB dictionary, and fixing the common misspellings.
That would catch "color" --> "colour" at least, although not "cell phone" --> "mobile phone".

Last edited by eschwartz; 10-16-2015 at 10:45 AM.
Old 10-16-2015, 10:59 AM   #3
theducks
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Sigil's Saved Searches have a 'chained' (my term) mode: you can put individual saved searches into a group, then load the group name and execute it.

It took me a long time to figure out how to use that.

Bring up the Saved Search Editor. You will see the controls for this on the right.
Old 10-16-2015, 11:01 AM   #4
kovidgoyal
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You can create multiple saved search and replaces, select them all and execute them. Give them names that include something common like [GB], then filter the saved search list by [GB], press Ctrl+A to select all shown searches and execute them.
Old 10-16-2015, 11:06 AM   #5
theducks
Quote:
Originally Posted by kovidgoyal View Post
You can create multiple saved search and replaces, select them all and execute them. Give them names that include something common like [GB], then filter the saved search list by [GB], press Ctrl+A to select all shown searches and execute them.
Wow
(told you I could be dense)
Old 10-18-2015, 05:32 AM   #6
moldy
Hi All and thanks for the helpful replies.

The saved search solution sort of works but is not ideal.

Searching for, say, 'color' finds color, colorful, coloring etc., which is quite useful. But searching for 'odor' and replacing it gives me odour (fine) but also deodourant (wrong).
Searching for 'mom' and replacing it gives me mum (fine) but also mument (wrong).

Trying to use regex to manipulate the search I searched for [Oo]dor[^\w].
This worked but then what about odorless - another search to save and name.

Trying to speed things up I exported the saved searches and opened the file in an old copy of DreamWeaver I have. I could see the search sections so I set up a template to add a search:

{
"case_sensitive": true,
"dot_all": true,
"find": "([Ww])ord([^\w])",
"mode": "regex",
"name": "",
"replace": "\\1ord\\2"
},

I added some words and saved the file but when I tried to import it I got this:

calibre, version 2.41.0
ERROR: Unhandled exception: <b>ValueError</b>:No JSON object could be decoded

calibre 2.41 [64bit] isfrozen: True is64bit: True
Windows-8-6.2.9200 Windows ('64bit', 'WindowsPE')
('Windows', '8', '6.2.9200')
Python 2.7.9
Windows: ('8', '6.2.9200', '', 'Multiprocessor Free')
Successfully initialized third party plugins: Kindle Collections && Diaps Editing Toolbag && Manage Series && Count Pages
Traceback (most recent call last):
File "site-packages\calibre\gui2\tweak_book\search.py", line 1090, in import_searches
File "json\__init__.py", line 338, in loads
File "json\decoder.py", line 366, in decode
File "json\decoder.py", line 384, in raw_decode
ValueError: No JSON object could be decoded
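
One likely cause, judging only by the template shown above (the full edited file is not shown, so this is an assumption): inside a JSON string a backslash must itself be escaped, so the single backslash in ([^\w]) is not valid JSON, whereas \\1 in the replace line is. A trailing comma after the last entry would trigger the same error. A small Python check of the backslash case:

```python
import json

# The template as typed: a single backslash before "w" inside a JSON string.
broken = r'{"find": "([Ww])ord([^\w])", "replace": "\\1ord\\2"}'

try:
    json.loads(broken)
    broken_ok = True
except ValueError:  # the json module's decode error is a ValueError
    broken_ok = False

# Letting the json module do the escaping doubles the backslash automatically.
fixed = json.dumps({"find": r"([Ww])ord([^\w])", "replace": r"\1ord\2"})

print(broken_ok)
print(fixed)
```

In a hand-edited file the find line would therefore need to read "([Ww])ord([^\\w])".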

Well, that's where I'm up to in my project. As much as I would like to, I don't have the capability to write a plugin.

Any input would be welcomed and much appreciated. Thanks - moldy
Old 10-18-2015, 05:37 AM   #7
moldy
Small error in my previous post:

"name": "Word",

I did actually name all my saved searches.

moldy
Old 10-18-2015, 05:39 AM   #8
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383099
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by moldy View Post
Small error in my previous post:

"name": "Word",

I did actually name all my saved searches.

moldy
If you make a mistake in a post, you can click the "Edit" button underneath the message to edit it.
Old 10-18-2015, 06:34 AM   #9
moldy
Thanks for the tip Harry. I'll remember that for future posts ;-)
Old 10-18-2015, 06:45 AM   #10
kovidgoyal
Use the \b operator to match word boundaries, like this

\bcolor\b

And you can create a single saved search to do all your replacing if you are willing to use function mode, something like this

\b(color|odor|vapor|...)\b

(the parentheses make the \b boundaries apply to every alternative, not just the first and last)

and define a function to do the actual replacement,

http://manual.calibre-ebook.com/function_mode.html

Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Look up the matched word; fall back to the original text if it is not in the table
    return {'color':'colour', 'odor':'odour', 'vapor':'vapour'}.get(match.group(), match.group())
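
A slightly fuller sketch of the same idea, which also keeps capitalisation and a plural "s". The replace signature is the one from the post; US_TO_GB, PATTERN and convert are names made up for this example, and convert only exists so the function can be tried outside calibre with Python's re module. In the editor you would put the equivalent expression in the Find box yourself.

```python
import re

# Hypothetical word table; extend it as new spellings turn up.
US_TO_GB = {'color': 'colour', 'odor': 'odour', 'vapor': 'vapour'}

# Whole words only, case-insensitive, with an optional plural "s" as group 2.
PATTERN = (r'(?i)\b('
           + '|'.join(sorted(map(re.escape, US_TO_GB), key=len, reverse=True))
           + r')(s?)\b')


def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    word, plural = match.group(1), match.group(2)
    gb = US_TO_GB.get(word.lower())
    if gb is None:
        return match.group()      # not in the table: leave the text alone
    if word.isupper():
        gb = gb.upper()           # COLOR -> COLOUR
    elif word[0].isupper():
        gb = gb.capitalize()      # Color -> Colour
    return gb + plural


def convert(text):
    # Stand-in for the editor: call replace() for every match with dummy arguments.
    return re.sub(PATTERN, lambda m: replace(m, 0, '', None, None, None, None), text)


print(convert('The Color of odors; deodorant and colorful stay.'))
```

Words such as colorful are deliberately left alone here because the pattern only allows an optional "s" after the listed word; they would need their own table entries.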
Old 10-20-2015, 04:20 AM   #11
moldy
Thanks for your help Kovid. I hadn't heard of the \b boundary before and this is very useful.

I found I can edit the saved searches file in notepad++ and then reload it without problem.

There are at least a couple of thousand words spelled differently in US v. GB English and of course they have capitalisations and plurals.

I am going to save the words I come across in common usage as individual saved searches and add to the file as I go along.
I will run the searches at the start of an editing session and then use the spell checker to find any words not in my file.
At the moment I am ignoring words that have double letters in GB but not in US - too many of them.
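
A possible way to build such entries from a word list instead of typing each one, sketched in Python. It assumes the entry keys seen in an exported saved-searches file (case_sensitive, dot_all, find, mode, name, replace), that each US/GB pair starts with the same letter, and that the words contain only plain letters. WORDS and make_entry are hypothetical names, and the printed list would still need pasting into a file exported from calibre so the surrounding structure stays as calibre wrote it.

```python
import json

# Hypothetical word list, US spelling -> GB spelling.
WORDS = {'odor': 'odour', 'color': 'colour'}


def make_entry(us, gb):
    # Capture the first letter so its case is kept, and an optional plural "s".
    first, rest = us[0], us[1:]
    return {
        'case_sensitive': True,
        'dot_all': True,
        'find': r'\b([%s%s])%s(s?)\b' % (first.upper(), first.lower(), rest),
        'mode': 'regex',
        'name': '[GB] ' + us,
        'replace': r'\1%s\2' % gb[1:],
    }


entries = [make_entry(us, gb) for us, gb in sorted(WORDS.items())]

# json.dumps takes care of doubling the backslashes.
print(json.dumps(entries, indent=2))
```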

(Incidentally the words check and checker are a problem because in GB we have cheque and chequer as well as check and checker - can't think of a solution!)

My general find-replace is now:

\b([Oo])dor([s]?)\b
\1dour\2

This seems to work OK.
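
A quick way to try the pair above outside the editor, using Python's re module (an assumption that it behaves the same as the editor's regex engine for this simple pattern; FIND, REPLACE and convert are just names for the example):

```python
import re

FIND = r'\b([Oo])dor([s]?)\b'
REPLACE = r'\1dour\2'


def convert(text):
    # Apply the saved search to a test string.
    return re.sub(FIND, REPLACE, text)


print(convert('Odor, odors, deodorant, odorless'))
# prints: Odour, odours, deodorant, odorless
```

Note that odorless is left untouched, so it still needs its own search.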

Once I have built up a sizable list of words I will post the file somewhere for anyone to use.

Thanks again - moldy
Old 10-21-2015, 09:25 AM   #12
Phssthpok
Age improves with wine.
Posts: 576
Karma: 95229
Join Date: Nov 2014
Device: Kindle Oasis, Kobo Libra II
Quote:
Originally Posted by moldy View Post
I would like to convert US books to UK versions by searching and replacing US words and phrases to UK. e.g. 'color' to 'colour' and 'cell phone' to 'mobile phone'.
The problem is fairly intractable -- see for example http://home.comcast.net/~helenajole/Harry.html which gives a textual comparison between "Harry Potter and the Philosopher's Stone" (UK edition) and "Harry Potter and the Sorcerer's Stone" (US edition -- they seemed to assume that no-one in the US would be familiar with the concept "Philosopher's Stone", and what the hell is a philosopher anyway? Or something). Changes include "bangs" (US) instead of "fringe" (UK). What are you going to do if someone lets off fireworks and there are "lots of bangs" -- will you have "lots of fringe"?

Similarly, Terry Pratchett's Bromeliad trilogy features a bulldozer called Jekub (JCB) in the UK edition, which is called Cat in the US edition, and all the jokes were changed to suit.

And don't even get me started on common cultural oopsies like "Will you knock me up in the morning?" or "Can I bum a fag?"...

Last edited by Phssthpok; 10-21-2015 at 09:31 AM.
Old 10-21-2015, 09:28 AM   #13
HarryT
This whole thing seems like an exercise in futility. As the previous poster says, there are likely to be numerous differences between the UK and US editions of a book besides simple spelling changes. I'd simply accept that American English and British English are different variants of the language.
Old 10-21-2015, 09:31 AM   #14
Phssthpok
Quote:
Originally Posted by Phssthpok View Post
see for example http://home.comcast.net/~helenajole/Harry.html which gives a textual comparison between "Harry Potter and the Philosopher's Stone" (UK edition) and "Harry Potter and the Sorcerer's Stone" (US edition)
One other example from this, just for the sake of entertainment:

UK: Dudley had a tantrum because his knickerbocker glory wasn't big enough...

US: Dudley had a tantrum because his knickerbocker glory didn't have enough ice cream on top...

Presumably the concept of anyone serving a not-big-enough knickerbocker glory in the US just did not compute...
Old 10-21-2015, 09:34 AM   #15
Phssthpok
Quote:
Originally Posted by HarryT View Post
I'd simply accept that American English and British English are different variants of the language.
I think it was Oscar Wilde who said that Britain and America are two countries separated by a common language...