Quote:
Originally Posted by phossler
Added - actually, the document language is English, so spell check flags the foreign words. From the spell check error report, I can copy the word to the saved search and do a replace all. Removes it from spell check error since I don't like to Ignore or Add To Dictionary
This is exactly how I would handle it.
Ever since Calibre added Multi-Language Spellcheck, you can easily mark the words with lang + xml:lang.
Example sentence:
Code:
I ate some español sofritos today.
1. Use Calibre's Tools > Check Spelling with Show only misspelled words checked.
Most foreign words should pop up as misspelled. "sofritos" would stick out like a sore thumb.
Use Change selected word to, and replace the word with something like "@sofritos@".
Note: The very last word in the list is the word itself, so just click on that and make your adjustments.
2. After the end of the first pass, do a mass Search/Replace (in Regex mode):
Search: @(.+?)@
Replace: <i lang="es" xml:lang="es">\1</i>
Code:
I ate some español <i lang="es" xml:lang="es">sofritos</i> today.
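If you ever want to run the same replace outside Calibre, here is a rough Python sketch of step 2. The function name wrap_marked_words and the lang parameter are my own inventions for illustration; only the regex and the replacement string come from the steps above.

```python
import re

# Placeholder pattern from step 2: anything between two @ signs, non-greedy.
MARKER = re.compile(r"@(.+?)@")


def wrap_marked_words(html: str, lang: str = "es") -> str:
    """Turn @word@ placeholders into <i lang=... xml:lang=...>word</i>.

    Hypothetical helper: the name and the `lang` parameter are assumptions.
    """
    replacement = rf'<i lang="{lang}" xml:lang="{lang}">\1</i>'
    return MARKER.sub(replacement, html)


sample = "I ate some español @sofritos@ today."
print(wrap_marked_words(sample))
```

One thing to watch: any other pair of @ signs on the same line (two email addresses, say) would match too, so pick a marker that doesn't occur in your book.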
3. Run Spellcheck List again, and repeat Step 1.
You'll easily be able to see which Spanish words you've caught so far, and narrow the list down further.
4. Now uncheck Show only misspelled words, and do a few more passes. That should get you most of the way there.
5. To join most "phrases" (which are just runs of individually tagged foreign words), search for two Spanish italics next to each other:
Search: (<i lang="es" xml:lang="es">.+?)</i> <i lang="es" xml:lang="es">
Replace: \1 (with a single space after the \1, otherwise the two words get glued together)
and it'll merge them:
Code:
I ate some <i lang="es" xml:lang="es">español sofritos</i> today.
That should carry you most of the way there.
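For anyone who wants to script the merge from step 5, here is a minimal Python sketch. The function name merge_adjacent_italics is made up for illustration, and the replacement keeps one space between the merged words. A single pass only joins pairs, because the opening tag of the second word is consumed by the match, so a three-word phrase needs the replace run again. The loop below simply repeats until nothing changes, which is the same as hitting Replace All a few times in Calibre.

```python
import re

# Step 5 pattern: a Spanish <i>, its text, then "</i> <i ...>" to be removed.
ADJACENT = re.compile(
    r'(<i lang="es" xml:lang="es">.+?)</i> <i lang="es" xml:lang="es">'
)


def merge_adjacent_italics(html: str) -> str:
    """Merge neighbouring Spanish <i> spans separated by one space.

    Hypothetical helper; repeats the replace until the text stops changing.
    """
    while True:
        merged = ADJACENT.sub(r"\1 ", html)  # keep the space between words
        if merged == html:
            return merged
        html = merged


sample = (
    'I ate some <i lang="es" xml:lang="es">español</i> '
    '<i lang="es" xml:lang="es">sofritos</i> today.'
)
print(merge_adjacent_italics(sample))
```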
Quote:
Originally Posted by phossler
Not sure what the actual rules are for marking/tagging foreign text is, but that'll be a fun research project for another day
Just wondering, why exactly are you marking all foreign words in italics? Are you trying to enforce a Style Guide (CMOS?) or something along those lines?