Member
Posts: 10
Karma: 98538
Join Date: Mar 2026
Location: Berlin
Device: PocketBook InkPad 4
How to change fuzzy search threshold (6.8+)
In firmware 6.8, the dictionary lookup algorithm was completely rewritten. It now starts with a fuzzy search, which can be frustratingly overzealous (https://www.mobileread.com/forums/sh...&postcount=482); if that finds nothing, various morphological engines are tried in turn, including Hunspell.
Fuzzy search discards candidates based on a normalized edit distance threshold, which is hardcoded to 0.4. To make it less aggressive, this threshold needs to be lowered. Changing it turns out to be quite simple, as long as you have root/SSH access (follow the instructions in https://www.mobileread.com/forums/sh...d.php?t=325185). SSH into the device and run the following as root. On 6.10 this changes the threshold to 0.1; on 6.8, use 1f instead of 0f in the sed expression:
Code:
$ mkdir -p /mnt/ext1/patches
# Hex-dump the library, patch the first match of the byte pattern, convert back to binary
$ xxd -p /ebrmain/cramfs/lib/libdictionary.so | tr -d '\n' | \
sed 's/d90f43e3/b90f43e3/' | \
xxd -r -p > /mnt/ext1/patches/libdictionary.so
# Point the library symlink at the patched copy
$ mount -o remount,rw /ebrmain/
$ rm /ebrmain/lib/libdictionary.so
$ ln -s /mnt/ext1/patches/libdictionary.so /ebrmain/lib/libdictionary.so
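Since sed works on the raw hex stream, it may be worth checking that the patched copy differs from the original in exactly one byte before relying on it. A minimal sketch, assuming cmp, wc and tr are available (on the device, or on a PC with copies of both files); differs_by_one_byte is a hypothetical helper name, not something shipped with the firmware:

```shell
# Hypothetical helper: succeeds only if the two files have the same size
# and differ in exactly one byte.
differs_by_one_byte() {
    # Files of different length cannot be a one-byte in-place patch.
    [ $(wc -c < "$1") -eq $(wc -c < "$2") ] || return 1
    # cmp -l prints one line per differing byte (offset and both octal values).
    n=$(cmp -l "$1" "$2" | wc -l | tr -d ' ')
    [ "$n" -eq 1 ]
}

# Usage on the device (paths taken from the commands above):
# differs_by_one_byte /ebrmain/cramfs/lib/libdictionary.so \
#     /mnt/ext1/patches/libdictionary.so && echo "exactly one byte changed"
```

If the check fails, the pattern either did not match or matched somewhere unexpected, and the original symlink is best left in place.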
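For reference, the instruction being patched is the movt marked with ! in this disassembly (its encoding e3430fd9 is stored little-endian, hence d90f43e3 in the hex dump):
Code: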
$ objdump -d libdictionary.so | grep -C8 '0x3fd9' | c++filt
4f30c: e92d47f0 push {r4, r5, r6, r7, r8, r9, sl, lr}
4f310: e24dd0c0 sub sp, sp, #192 @ 0xc0
4f314: e1a06001 mov r6, r1
4f318: e309199a movw r1, #39322 @ 0x999a
4f31c: e1a04000 mov r4, r0
4f320: e3090999 movw r0, #39321 @ 0x9999
4f324: e28d8068 add r8, sp, #104 @ 0x68
4f328: e3491999 movt r1, #39321 @ 0x9999
! 4f32c: e3430fd9 movt r0, #16345 @ 0x3fd9
4f330: e59d50e0 ldr r5, [sp, #224] @ 0xe0
4f334: e58d1000 str r1, [sp]
4f338: e1a01006 mov r1, r6
4f33c: e98d0021 stmib sp, {r0, r5}
4f340: e1a00008 mov r0, r8
4f344: e1a09003 mov r9, r3
4f348: e1a07002 mov r7, r2
4f34c: ebff5b1c bl 25fc4 <pocketbook::dictionary::Dictionary::levenshteinLookup(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, double, std::atomic<bool> const*) const@plt>
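The movw/movt pair in the listing builds the upper half of an IEEE 754 double (0x3fd99999, with 0x9999999a as the lower half), so the value of the patched byte maps to a threshold in a predictable way. A rough sketch of that mapping for any POSIX shell with awk; byte_to_threshold is a hypothetical helper name, and the 0.6 term is an approximation of the untouched lower mantissa bits (0x9999...9a):

```shell
# Hypothetical helper: print the approximate threshold for a given
# replacement byte (two hex digits), assuming the upper byte stays 0x3f
# and the remaining mantissa bits stay 0x9999...9a.
byte_to_threshold() {
    b=$((0x$1))
    awk -v b="$b" 'BEGIN {
        e = 1008 + int(b / 16)   # 11-bit exponent: 0x3f0 plus the high nibble
        m = b % 16               # top 4 bits of the mantissa: the low nibble
        # The untouched mantissa bits contribute roughly 0.6 of one nibble step.
        printf "%.5f\n", 2 ^ (e - 1023) * (1 + (m + 0.6) / 16)
    }'
}

byte_to_threshold d9   # original byte, prints 0.40000
byte_to_threshold b9   # patched byte,  prints 0.10000
```

Other values can be tried the same way before patching, for example c9 for roughly 0.2.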
The d9 byte holds the low 4 bits of the exponent and the top 4 bits of the mantissa, so tweaking this byte alone covers the range from 0.00003 (00) to 1.975 (FF), which is more than enough and avoids the need to understand the exact encoding of ARM instructions. In the example above, I changed it to b9, which corresponds to 0.1, but you can try out different values.

The rest of the script replaces the symlink (originally it points to ../cramfs/lib/libdictionary.so). If anything goes wrong, just change the symlink back: the version in cramfs remains read-only, so the original library is untouched.
___________________________________
My impression is that fuzzy search tries to compensate for deficiencies of the morphological engines. The Hunspell dictionaries that can be downloaded from PB servers are the usual spell-checking ones found in LibreOffice. They are not optimized for stemming, which often requires explicitly mapping conjugations to base forms: the Hunspell index.dic plays the same role as a StarDict .syn file, so to speak. I therefore invested quite a bit of time in creating custom Hunspell dictionaries for German and Spanish; hopefully someone finds them useful: https://codeberg.org/datyoma/hunspell-stemming-dicts (copy to /mnt/ext1/system/morphology)

Last edited by issybird; 05-21-2026 at 01:42 PM. Reason: Fixed link