

03-07-2025, 06:28 PM   #1
vanessamartins00
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2024
Device: none
Script to change the language of some words

Hello!

Is there a script that can mark all texts that are not in my default content language and insert a span? I have a file in Portuguese, but I need to mark all words in English with a character style.
03-07-2025, 09:41 PM   #2
KevinH
Sigil Developer
Posts: 8,478
Karma: 5703586
Join Date: Nov 2009
Device: many
There is no plugin for this that I know about. But my guess is you can use the Spellcheck Editor to do some of the work.

It should help find all the words not in the Portuguese dictionary and you can add them to your own word list. You should end up with a file with one English word per line.
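As a rough illustration of working with such a list, here is a minimal Python sketch that reads a one-word-per-line file. The function name `load_word_list` and the file name in the usage note are hypothetical, and UTF-8 encoding is an assumption about how the list was saved.

```python
from pathlib import Path


def load_word_list(path):
    """Read one word per line, skipping blank lines and duplicates.

    The order of first appearance is preserved.
    Assumes the file is saved as UTF-8.
    """
    seen = set()
    words = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        word = line.strip()
        if word and word not in seen:
            seen.add(word)
            words.append(word)
    return words
```

For example, `words = load_word_list("english_words.txt")` would return the English words as a list, ready for counting or for building search patterns.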

The next step depends on just how many words are in that list. If fewer than, say, twenty, I would use regular expression Find and Replace to add the spans, searching for each word.
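The same idea can be sketched outside the Find and Replace dialog with Python's `re` module. This is only an illustration under stated assumptions: the function name `wrap_words`, the class name `foreign`, and the sample sentence are made up, and the function is meant for plain text runs. Run on raw XHTML it could also match inside tags or attribute values.

```python
import re


def wrap_words(text, words, lang="en", css_class="foreign"):
    """Wrap each whole-word occurrence of the listed words in a span.

    The class name and the attribute layout are illustrative choices.
    """
    if not words:
        return text
    # Longest words first, so a longer entry is preferred over a shorter one.
    ordered = sorted(words, key=len, reverse=True)
    alternation = "|".join(re.escape(w) for w in ordered)
    # \b keeps "deadline" from matching inside "deadlines".
    pattern = re.compile(r"\b(?:%s)\b" % alternation)
    open_tag = '<span class="%s" lang="%s" xml:lang="%s">' % (css_class, lang, lang)
    return pattern.sub(lambda m: open_tag + m.group(0) + "</span>", text)


sample = "Ela disse que o deadline do software era hoje."
print(wrap_words(sample, ["deadline", "software"]))
```

Matching is case-sensitive here, so a capitalized form at the start of a sentence would need its own entry in the list.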

If instead of single words there are whole phrases or sentences in English, it gets much harder, as you need to wrap the entire sentence or phrase in a single span, not one for each word.

If there are more than twenty, or many of them are in phrases and sentences, then there really is no better way than to proofread the entire manuscript, adding the proper spans as you go. Be sure to add lang attributes too, since they indicate the language of the English words and help assistive technologies such as screen readers.

I have found it easier to have a screen reader read the text aloud and just listen for the places where I need to interrupt and edit.

So the right approach really depends on whether the English appears as single words or in phrases.

In many fiction works often only one character uses foreign phrases and so searching for references to that character can make things much easier.

It is a shame the original author did not properly mark the foreign text in some way.

Perhaps someone has written a plugin to help, but I do not know of one. The closest thing I have seen is a routine that uses Python function replace to work through a list of single words and wrap each in a span, followed by a routine that merges spans that follow each other to deal with phrases and sentences.
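The merging step could look roughly like the sketch below. It is not the routine mentioned above, just one possible approach. It assumes every word was wrapped with exactly the same opening tag (the hypothetical `SPAN_OPEN` value) and that the wrapped words contain no nested tags.

```python
import re

# Assumed to be the exact opening tag produced by the word-wrapping step.
SPAN_OPEN = '<span class="foreign" lang="en" xml:lang="en">'
SPAN_CLOSE = "</span>"

# One of our spans (no nested tags), then only whitespace or light
# punctuation, then another of our spans opening.
_PAIR = re.compile(
    "(" + re.escape(SPAN_OPEN) + r"[^<]*)"
    + re.escape(SPAN_CLOSE)
    + r"([\s,;:'’-]*)"
    + re.escape(SPAN_OPEN)
)


def merge_adjacent_spans(text):
    """Join neighbouring foreign-word spans into a single span."""
    previous = None
    # Each pass joins non-overlapping pairs; repeat until nothing changes
    # so that runs of three or more words collapse into one span.
    while previous != text:
        previous = text
        text = _PAIR.sub(r"\1\2", text)
    return text


wrapped = SPAN_OPEN + "happy" + SPAN_CLOSE + " " + SPAN_OPEN + "hour" + SPAN_CLOSE
print(merge_adjacent_spans(wrapped))
```

Spans separated by an unmarked word are left alone, so a stray Portuguese word inside an English phrase still splits the phrase in two and has to be fixed by hand.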

That is not easy to get working well. Sigil's Beta4 build has the Python Function Replace capability, but someone would still have to write the code.

I wonder if this is something you could teach an AI to do?

Perhaps someone else knows a better approach.
03-14-2025, 07:11 PM   #3
Haudek
Member
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
I have been working on such a plugin for some time. I have been following the MR forum for years, but only now decided to register.
Despite my inexperience on the MR forum, can I post a public beta version?

I drew heavily on existing plugins, learning from the best. In the end, I went my own way because, after all, I was creating the plugin primarily for myself.

Although I'm now using it myself, since I really like Sigil, I thought I'd share my work.
Unfortunately, I have not had a chance to test the plugin in an environment other than Windows.

I have tried to test the plugin in various scenarios, but I am aware that its operation can be destructive to files in certain situations if users don't fully understand how it works.

I would also find it very useful to have test files to examine the plugin's behavior in the real world.
Attached image: sigil-foreignwords-beta-preview.png
03-15-2025, 03:37 AM   #4
Doitsu
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by Haudek
Despite my inexperience on the MR forum, can I post a public beta version?
New plugins are always welcome. What kind of method does your plugin use to identify foreign words and what languages are supported?
03-15-2025, 04:33 AM   #5
Haudek
Member
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
Oh, the plugin is not so clever.
The first step, running the "Spellcheck" command and adding the foreign words to the user dictionary, has to be done by a human.
But after that, a bit of magic happens: if all the words have been selected correctly, they are combined into phrases and wrapped in a <span> with the selected attributes.

Today I will prepare a new thread for the plugin and invite beta testing.
I admit that only 10% of the plugin was coding and the other 90% was polishing the interface, but I did it just to make the plugin look good in front of discerning MR forum users.

I know that further testing and code improvements are needed, but without users who will devote their valuable time to beta testing there is little I can do. For my personal needs, I improved a few e-books and it just "worked".

As for supported languages: I have not tested any language that is not based on the Latin alphabet. It will probably also work for Cyrillic (I still have to check).
I'm not even sure whether the plugin will support Arabic, Hebrew or Asian languages in the future.
03-20-2025, 04:19 PM   #6
Haudek
Member
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
Quote:
Originally Posted by vanessamartins00
Is there a script that can mark all texts that are not in my default content language and insert a span? I have a file in Portuguese, but I need to mark all words in English with a character style.
I encourage you to check out the "Foreign Words" plugin.