Old 02-03-2018, 08:48 AM   #637
Northguy
Quote:
Originally Posted by odinokij
(*) Note: I don't speak German or Dutch, so if you find any mistakes in these languages, please report them so I can fix them.

I hope it will also be useful for you,

Odinokij.
Hey great! ...

I started working on the 'matching.py' of the original plug-in, but your version looks a lot more elaborate. Comparing it to the original matching.py, I have some questions:
1) In general: I do not speak Spanish or Portuguese, so I cannot follow your comments.

2) In def fuzzy_it(text, patterns=None), why didn't you change this line? (tweaks.get('title_sort_articles', r'^(a|the|an)\s+'), ''),

3) In def get_title_tokens, I added 'NL', 'ebook', 'e-Book' and 'druk' as possible alternatives, something like:
Quote:
(r'(?i)[({\[](\d{4}|ebook|e-book|NL|omnibus|anthology|hardcover|paperback|mass\s*market|edition|ed\.)[\])}]', ''),
4) In def get_title_tokens, I think we need to add 'een', because the words 'de', 'het' and 'een' are the articles used in Dutch.
Quote:
tokens_du = ('een', 'de', 'het', 'van', 'met', 'naar')
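To make points 3 and 4 concrete, here is a rough sketch of how the two pieces could work together. It is not the plug-in's actual code: the helper name title_tokens_sketch is made up for illustration, the real get_title_tokens does more than this, and I am assuming the space in "mass \s*market" was only a forum line-wrap artifact. The pattern is used as quoted, so 'druk' is not included.

```python
import re

# Pattern as quoted in point 3. Assumption: "mass \s*market" in the post
# was meant to be "mass\s*market" (forum line wrapping).
BRACKETED_NOISE = re.compile(
    r'(?i)[({\[](\d{4}|ebook|e-book|NL|omnibus|anthology|hardcover|paperback'
    r'|mass\s*market|edition|ed\.)[\])}]'
)

# Dutch stop words as quoted in point 4 ('een', 'de', 'het' are the articles).
TOKENS_DU = ('een', 'de', 'het', 'van', 'met', 'naar')


def title_tokens_sketch(title):
    """Hypothetical helper: strip bracketed noise, then drop Dutch stop words."""
    cleaned = BRACKETED_NOISE.sub('', title)
    # Compare case-insensitively so 'De' and 'de' are treated the same.
    return [tok for tok in cleaned.split() if tok.lower() not in TOKENS_DU]


if __name__ == '__main__':
    print(title_tokens_sketch('Een huis (NL) [ebook]'))  # ['huis']
    print(title_tokens_sketch('De aanslag (1982)'))      # ['aanslag']
```

Without 'een' in the tuple, the first title would keep 'Een' as a token, which is the reason for point 4.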