#1 |
Junior Member
Posts: 5
Karma: 10
Join Date: Mar 2024
Device: none
|
Hello!
Is there a script that can find all text that is not in my default content language and wrap it in a span? I have a file in Portuguese, but I need to mark all the English words with a character style. |
#2 |
Sigil Developer
Posts: 8,478
Karma: 5703586
Join Date: Nov 2009
Device: many
|
There is no plugin for this that I know about. But my guess is you can use the Spellcheck Editor to do some of the work.
It should help you find all the words that are not in the Portuguese dictionary, and you can add them to your own word list. You should end up with a file with one English word per line.
The next step depends on how many words are in that list. If there are fewer than, say, twenty, I would use regular expression Find and Replace to add the spans, searching for each word. If instead of single words there are whole phrases or sentences in English, it gets much harder, as you need to wrap the entire sentence or phrase in a single span, not one span per word. If there are more than twenty words, or many of them are in phrases and sentences, then there really is no better way than to proofread the entire manuscript, adding the proper spans as you go. Be sure to add lang attributes too, as they indicate the language of the English words and help assistive technologies such as screen readers. I have found that having a screen reader read the text aloud, and just listening to know when to interrupt and edit, is easier.
So the right approach really depends on whether the English appears as single words or in phrases. In many works of fiction, only one character uses foreign phrases, so searching for references to that character can make things much easier. It is a shame the original author did not properly mark the foreign text in some way.
Perhaps someone has written a plugin to help, but I do not know of one. The closest thing I have seen is a routine that uses Python Function Replace to work on a list of single words and wrap them in spans, followed by a routine that merges spans that follow each other, to deal with phrases and sentences. Not easy to get working well. Sigil's Beta4 build has that Python Function Replace capability, but someone would still have to write the code. I wonder if this is something you could teach an AI to do? Perhaps someone else knows a better approach. |
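For illustration, the two-step idea described above (wrap each listed word in a span, then merge spans that follow each other) could be sketched in plain Python roughly as follows. This is not the routine mentioned in the post and it does not use Sigil's Function Replace interface; the function names, the `foreign` class name, and the set of characters allowed between merged spans are all assumptions. It also treats its input as plain text, so running it on raw XHTML could match words inside tags or attribute values.

```python
import re

# Hypothetical markup: lang="en" marks the language, the class name is made up.
SPAN_OPEN = '<span lang="en" class="foreign">'
SPAN_CLOSE = '</span>'


def wrap_words(text, words):
    """Step 1: wrap every whole-word occurrence of a listed word in a span."""
    if not words:
        return text
    alternation = "|".join(re.escape(w) for w in words)
    pattern = re.compile(r"\b(?:%s)\b" % alternation, re.IGNORECASE)
    return pattern.sub(lambda m: SPAN_OPEN + m.group(0) + SPAN_CLOSE, text)


def merge_adjacent_spans(text):
    """Step 2: join spans separated only by whitespace or light punctuation."""
    gap = re.compile(
        re.escape(SPAN_CLOSE) + r"([\s,;:'’-]*)" + re.escape(SPAN_OPEN)
    )
    # Keep the separator itself, drop the closing/opening tag pair around it.
    return gap.sub(r"\1", text)


sample = 'Ela disse: "see you later", e saiu.'
wrapped = wrap_words(sample, ["see", "you", "later"])
print(merge_adjacent_spans(wrapped))
```

Words that are spelled the same in both languages would be wrapped wherever they occur, which is one reason a manual review pass is still needed afterwards.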
#3 |
Member
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
|
I have been working on such a plugin for some time. I have been following the MR forum for years, but only now decided to register.
Despite my inexperience on the MR forum, may I post a public beta version? I drew heavily on existing plugins, learning from the best. In the end, I went my own way because, after all, I was creating the plugin primarily for myself. Although for now I am using it only myself, I really like Sigil, so I thought I'd share my work. Unfortunately, I have not had a chance to test the plugin in any environment other than Windows. I have tried to test it in various scenarios, but I am aware that it can damage files in certain situations if users don't fully understand how it works. Test files for examining the plugin's behavior in the real world would also be very useful. |
#4 |
Grand Sorcerer
Posts: 5,680
Karma: 23983815
Join Date: Dec 2010
Device: Kindle PW2
|
|
#5 |
Member
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
|
Oh, the plugin is not so clever.
The first step, which is running the "Spellcheck" command and adding the foreign words to the user dictionary, has to be done by a human. After that, a bit of magic happens: if all the words have been selected correctly, they are combined into phrases and wrapped in a <span> with the selected attributes.
Today I will prepare a new thread for the plugin and invite beta testing. I admit that only 10% of the plugin was coding and the other 90% was polishing the interface, but I did that just to make the plugin look good in front of discerning MR forum users. I know that further testing and code improvements are needed, but without users willing to devote their valuable time to beta testing there is little I can do. For my personal needs, I improved a few e-books and it just "worked".
As for supported languages, I have not tested languages that are not based on the Latin alphabet. It will probably also work for Cyrillic (I have to check). I am not even sure whether the plugin will support Arabic, Hebrew or Asian languages in the future. |
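As a rough illustration only (this is not the plugin's code, which is not shown in this thread), combining selected words into phrases and wrapping each phrase in one span with chosen attributes could be done in a single pass, as in the sketch below. The function names, the attribute dictionary, and the set of separator characters allowed inside a phrase are assumptions, and the input is assumed to be plain text rather than full XHTML.

```python
import re
from html import escape


def load_word_list(lines):
    """Return the non-empty entries of a one-word-per-line list."""
    return [ln.strip() for ln in lines if ln.strip()]


def build_open_tag(attrs):
    """Build an opening span tag from a dict of attribute names and values."""
    parts = ['%s="%s"' % (name, escape(value, quote=True))
             for name, value in attrs.items()]
    return "<span " + " ".join(parts) + ">" if parts else "<span>"


def wrap_phrases(text, words, attrs):
    """Wrap each run of listed words (a 'phrase') in a single span."""
    if not words:
        return text
    word = r"\b(?:%s)\b" % "|".join(re.escape(w) for w in words)
    # A run is one listed word followed by any number of
    # (separator, listed word) pairs.
    run = re.compile(r"%s(?:[\s,;:'’-]+%s)*" % (word, word), re.IGNORECASE)
    open_tag = build_open_tag(attrs)
    return run.sub(lambda m: open_tag + m.group(0) + "</span>", text)


words = load_word_list(["see\n", "you\n", "\n", "later\n"])
print(wrap_phrases("Ele disse see you later, amigo.", words,
                   {"lang": "en", "class": "foreign"}))
```

Matching whole runs avoids a separate merge step, at the cost of a larger pattern. EPUB XHTML often carries `xml:lang` alongside `lang`; both can be passed in the attribute dictionary if that is wanted.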
#6 | |
Member
Posts: 24
Karma: 111614
Join Date: Mar 2025
Location: Poland
Device: Kindle Voyage
|
Quote:
|
|
Tags |
language, script |
Thread | Thread Starter | Forum | Replies | Last Post |
import changes foreign language to nonsense words | dandman | Library Management | 9 | 05-15-2024 07:01 PM |
100 most frequently used words by language | KevinH | Sigil | 4 | 04-16-2021 02:36 PM |
Please help me to change the dictionary and change keyboard language! Manually usb | temp0rary | Onyx Boox | 1 | 06-13-2020 04:54 PM |
One-touch look-up of words in foreign language book | andrewkirk | Kobo Reader | 8 | 06-09-2015 08:20 AM |
Script to change pdf metadata (accented) | beco | Calibre | 0 | 12-09-2012 05:51 PM |