I think the solution would be to create a proper morphems.txt file for Spanish. That file contains rules for inflections in the language as a whole, not on a word-by-word basis. You need to come up with a rule for generating the root word from each type of inflection, and then write the corresponding line in the morphems.txt file. I expect that one or two dozen rules will cover more than 90% of your cases.
I ran the instructions for creating a dictionary through Google Translate (Russian to English), and here is the part about the morphemes. It should give you an idea of how to create the rules. You can also look at the English morphems.txt file provided with the converter to see some simple examples of rules.
Edit: I had not looked above to see that the original poster ran into problems with the size of the morphems.txt file when adding rules for the irregular verbs. Perhaps you should add rules for the regular verbs first, and only tackle the irregular verbs afterwards.
The morphems.txt file may contain the following elements:
1) Character classes:
%number=charlist
For example, the character classes "Russian vowels" and "Russian consonants" are described by the lines:
%1=аеиоу
%2=бвгджзклмнпрстфхцчшщ
2) Morphemes:
^suffix1/suffix2/...=variant1/variant2/...
If the end of the selected word matches one of the endings on the left side, every replacement ending on the right side is tried in turn (the original form of the word has priority, then the variants from left to right).
On the left side you can use the symbol "?" (any character) and the digits 0-9 (the number of a character class). On the right side, a dot "." stands for the original character that was matched in the ending being replaced.
Examples of morphemes:
^2е/2и/2у/2ы/2ой/2ам/2ами=.а
The words "книги", "книгой" and "книгам" will be replaced by "книга".
^2ой/2ей/2ому/2ему/2ых/2их/2ым/2им=.ый/.ой/.ий
For the word "быстрому", the variants "быстрый", "быстрой" and "быстрий" are checked, and the word is replaced by "быстрый".
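To make the rule format a bit more concrete, here is a rough Python sketch of how I read the description above. It is not the converter's actual code. The function names (parse_morphems, candidates, lookup) are made up for illustration, and I am assuming that each "." on the right side takes the next character that was matched by a "?" or a class digit on the left side, and that a candidate only counts if it is found in the dictionary.

```python
DIGITS = "0123456789"

# Sample rules copied from the examples above.
SAMPLE = """\
%1=аеиоу
%2=бвгджзклмнпрстфхцчшщ
^2е/2и/2у/2ы/2ой/2ам/2ами=.а
^2ой/2ей/2ому/2ему/2ых/2их/2ым/2им=.ый/.ой/.ий
"""


def parse_morphems(text):
    """Split morphems.txt-style text into character classes and rules."""
    classes, rules = {}, []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("%"):
            number, chars = line[1:].split("=", 1)
            classes[number] = set(chars)
        elif line.startswith("^"):
            left, right = line[1:].split("=", 1)
            rules.append((left.split("/"), right.split("/")))
    return classes, rules


def matches(pattern, ending, classes):
    """True if ending fits pattern ('?' = any char, digit = class number)."""
    if len(pattern) != len(ending):
        return False
    for p, c in zip(pattern, ending):
        if p == "?":
            continue
        if p in DIGITS:
            if c not in classes.get(p, ()):
                return False
        elif p != c:
            return False
    return True


def candidates(word, classes, rules):
    """Return the original word followed by every form the rules produce."""
    result = [word]  # the original form has priority
    for suffixes, variants in rules:
        for suffix in suffixes:
            n = len(suffix)
            if n == 0 or n > len(word):
                continue
            stem, ending = word[:-n], word[-n:]
            if not matches(suffix, ending, classes):
                continue
            # Characters matched by '?' or a class digit, in order.
            wild = [c for p, c in zip(suffix, ending) if p == "?" or p in DIGITS]
            for variant in variants:
                it = iter(wild)
                # Assumption: each '.' consumes the next wildcard match.
                new_ending = "".join(next(it, "") if ch == "." else ch
                                     for ch in variant)
                form = stem + new_ending
                if form not in result:
                    result.append(form)
    return result


def lookup(word, dictionary, classes, rules):
    """Return the first candidate present in the dictionary, else None."""
    for form in candidates(word, classes, rules):
        if form in dictionary:
            return form
    return None


classes, rules = parse_morphems(SAMPLE)
print(lookup("книгой", {"книга", "быстрый"}, classes, rules))
print(lookup("быстрому", {"книга", "быстрый"}, classes, rules))
```

With the two sample rules, the lookups resolve to "книга" and "быстрый", matching the examples in the translated instructions. For Spanish you would swap in your own classes and rules and run your problem words through candidates() to see what each rule generates before feeding the file to the converter.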
Last edited by rkomar; 03-15-2020 at 10:53 AM.
Reason: Added note about irregular verbs