01-30-2020, 02:23 PM | #271 | |
Enthusiast
Posts: 34
Karma: 510
Join Date: Feb 2016
Device: Kobo
|
Quote:
But I spoke about "Le Nouveau Littré". It means the new "Littré". The "Littré" is free but "Le Nouveau Littré" is not. So, I wonder if it's possible to recover the one that is inside my Vivlio. Thanks |
|
01-30-2020, 03:27 PM | #272 |
Guru
Posts: 897
Karma: 149877
Join Date: Jul 2013
Location: Netherlands
Device: Cracked HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
|
If it is Stardict format, I would also expect an “.ifo” file. Why don’t you upload the files to a file server and post a link? Or install a dictionary converter like Linguae or Penelope and try to import and convert it to Stardict or xdxf-format. From either you can convert to dic-format.
|
01-30-2020, 03:58 PM | #273 |
Guru
Posts: 746
Karma: 619508
Join Date: Sep 2013
Device: EnergySistemEreaderPro, Nook STG, Pocketbook 622, Bookeen Cybooks ...
|
If the dictionary is still on your Vivlio, you can recover it via a bash script or PBTerm (though it may be encrypted; some dics are and some are not).
|
02-02-2020, 03:19 AM | #274 | |
Enthusiast
Posts: 34
Karma: 510
Join Date: Feb 2016
Device: Kobo
|
Quote:
I tried to use Penelope without success (maybe I didn't install Python correctly). I put my files here: https://1fichier.com/?m2moq80o797u2wgvytws Could someone convert them into a PocketBook-compatible format? Thanks in advance |
|
02-02-2020, 12:10 PM | #275 |
Guru
Posts: 897
Karma: 149877
Join Date: Jul 2013
Location: Netherlands
Device: Cracked HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
|
TLDR: The dictionary is neither encrypted nor binary. Conversion will probably take a little programming, but not much. Maybe someone recognizes which dictionary format uses the described structure (sqlite3 database for the index and zipped text files for the meanings) and you can use that knowledge to pick a converter.
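A quick way to check what such files really are, whatever their extensions say, is to look at their contents. This is only a minimal sketch: the function name `sniff` is made up for illustration, and it assumes the archive is a regular zip file, as the TLDR suggests.

```python
import zipfile
from pathlib import Path

# Every SQLite 3 database file starts with this 16-byte header string.
SQLITE_MAGIC = b"SQLite format 3\x00"


def sniff(path):
    """Return 'sqlite3', 'zip' or 'unknown' based on the file's contents.

    Hypothetical helper for illustration; it only inspects the file and
    never modifies it.
    """
    path = Path(path)
    with path.open("rb") as handle:
        if handle.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC:
            return "sqlite3"
    # is_zipfile looks for the zip end-of-central-directory record.
    if zipfile.is_zipfile(path):
        return "zip"
    return "unknown"
```

Calling `sniff("NouveauLittre.dict")` and `sniff("NouveauLittre.dict.idx")` on the uploaded files would be the obvious first test before trying any converter.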
I downloaded them and it appears that the dict-file is actually an archive file containing markup text in 120 files numbered c_1 to c_120. The idx-file is actually a sqlite3-database that can be browsed with any sqlite browser. (I used sqlitebrowser.) In the database there are 4 tables. One table is called T_DictIndex and holds per row a Word, an Offset, a Size and the Chunk number: Code:
F_Word F_Offset F_Size F_ChunckNum
a 0 574 1
Code:
<b>A,</b> N. m. [<f>&a;</f>] ou [<f>&â;</f>] Voyelle et première lettre de l'alphabet. <i>Une panse d' <i>a </i>, </i>la première partie d'un petit <i>a </i> dans l'écriture. <f>&os;</f> <i>N'avoir pas fait une panse d' <i>a </i>, </i>c'est-à-dire n'avoir rien écrit. <f>&ns;</f> <i>Prouver par A + B, </i>avec précision et rigueur. <f>&ns;</f> <i>De A à Z, </i>du début à la fin. <f>&ns;</f> <i>A4, </i>format d'une feuille de papier de 21 X 29,7 cm. <i>A3, </i>format 29,7 X 42 cm. <f>&ns;</f> La. Code:
F_Word F_Offset F_Size F_ChunckNum
à 574 2523 1
Code:
<b>À,</b> Prép. [<f>&a;</f>] (lat. <i>ad</i>, mouvement, direction, proximité, lat. <i>ab</i>, séparation, origine, et lat. <i>apud</i>, relation, accompagnement) <f>&os;</f> Marque direction, tendance : <i>aller à Rome ; aimer à lire. </i> <f>&os;</f> S'emploie devant le régime indirect des verbes actifs : <i>donner de l'argent à un pauvre. </i> <f>&os;</f> Sert à déterminer le lieu où est quelque chose, où s'exécute une action : <i>résider à Paris ; être à sa place. </i> <f>&os;</f> Sert à indiquer le temps, le moment, etc. : <i>à la fin du mois. </i> <f>&os;</f> Marque appartenance, possession : <i>rendez à César ce qui est à César ; il a un style à lui. </i> <f>&os;</f> Avec un complément indique l'espèce : <i>vache à lait ; </i>la qualité : <i>or à vingt-deux carats ; </i>la forme ou la structure : <i>clou à crochet, table à tiroir ; </i>la destination : <i>marché à la volaille ; </i>la conformité, la convenance : <i>à mon avis ; </i>l'instrument : <i>pêcher à la ligne ; </i>la mesure, le poids, la quantité : <i>vendre à la livre, à la douzaine ; </i>le prix, la valeur : <i>pain à vingt centimes la livre, dîner à trois francs ; </i>l'intention : <i>à regret ; </i>la cause : <i>se ruiner à jouer ; </i>l'effet, le résultat : <i>blesser à mort. </i> <f>&os;</f> <i>À </i> précédé et suivi du même mot marque succession, gradation, ordre : <i>deux à deux ; </i>jonction : <i>bout à bout ; </i>opposition : <i>face à face. </i> <f>&os;</f> <i>À </i> se place après certains adjectifs pour en déterminer le sens : <i>fa cile à dire ; prêt à combattre. </i> <f>&os;</f> <i>À </i> suivi d'un infinitif équivaut souvent au participe précédé de <i>en </i> : <i>à vrai dire. </i> <f>&os;</f> <i>À </i> devant un infinitif peut quelquefois s'expliquer par<i>de quoi </i> : <i>verser à boire. </i> <f>&os;</f> <i>À </i> indique ce qu'on doit faire : <i>c'est un avis à suivre ; </i>ce qui doit être la suite d'un événement : <i>c'est une affaire à vous perdre. 
</i> <f>&os;</f> <i>À </i> s'emploie dans certaines phrases elliptiques : <i>à moi ! au feu ! à ta santé ! </i>
All in all, it's close enough to see the pattern: the index words are surrounded by <b> bold start and </b> stop tags. However, the description also holds start and stop tags. So if you (1) take the index words from the table, (2) search for the text between two consecutive index words, (3) put the first index word and the found text on a line, separated by a delimiter such as a comma, and (4) repeat that for all 69389 entries, then you have reconstructed your dictionary in Stardict csv-format, which can be converted to PocketBook format. You could also take the found terms and place them directly in xdxf-format. From xdxf-format you can use PocketBook's converter.exe to convert it to PocketBook's dic-format. Last edited by Markismus; 02-02-2020 at 01:08 PM. |
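The reconstruction described above could be sketched roughly as follows. This is a variation, not the script used in this thread: instead of searching for the text between two index words, it slices each entry out with F_Offset and F_Size. It assumes those two columns are byte positions in the decompressed chunk (the two sample rows, 0 + 574 = 574, fit that reading but do not prove it), that the chunks are UTF-8, and that they sit at the archive root as c_1 to c_120. The function names are made up, and a tab is used as delimiter because the definitions are full of commas.

```python
import sqlite3
import zipfile


def iter_entries(idx_path, dict_path, encoding="utf-8"):
    """Yield (headword, definition) pairs, ordered by chunk and offset.

    Assumption: F_Offset/F_Size are byte positions inside the decompressed
    chunk named c_<F_ChunckNum> (column name spelled as in the database).
    """
    chunks = {}  # decompressed chunks, cached by chunk number
    con = sqlite3.connect(idx_path)
    try:
        with zipfile.ZipFile(dict_path) as archive:
            rows = con.execute(
                "SELECT F_Word, F_Offset, F_Size, F_ChunckNum "
                "FROM T_DictIndex ORDER BY F_ChunckNum, F_Offset"
            )
            for word, offset, size, chunk_num in rows:
                if chunk_num not in chunks:
                    chunks[chunk_num] = archive.read(f"c_{chunk_num}")
                raw = chunks[chunk_num][offset:offset + size]
                yield word, raw.decode(encoding, errors="replace").strip()
    finally:
        con.close()


def write_tsv(entries, out_path):
    """Write one 'headword<TAB>definition' line per entry."""
    with open(out_path, "w", encoding="utf-8", newline="\n") as handle:
        for word, definition in entries:
            # Collapse line breaks and tabs so each entry stays on one line.
            one_line = " ".join(definition.split())
            handle.write(f"{word}\t{one_line}\n")
```

Usage would be something like `write_tsv(iter_entries("NouveauLittre.dict.idx", "NouveauLittre.dict"), "littre.tsv")`. If the offsets turn out to be character positions rather than bytes, decode the chunk first and slice afterwards.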
02-02-2020, 01:10 PM | #276 |
Guru
Posts: 897
Karma: 149877
Join Date: Jul 2013
Location: Netherlands
Device: Cracked HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
|
According to Penelope, it seems to be a Bookeen dictionary.
I used penelope to convert Bookeen to Stardict: Code:
python -m penelope -i NouveauLittre.dict,NouveauLittre.dict.idx -j bookeen -f fr -t fr -p stardict -o output
I've uploaded the results* to a fileserver and attached them.
_____________________
*These are actually the tested and corrected results from a few posts down. Last edited by Markismus; 02-06-2020 at 04:25 PM. |
02-03-2020, 12:59 PM | #277 | |
Enthusiast
Posts: 34
Karma: 510
Join Date: Feb 2016
Device: Kobo
|
Quote:
But thank you very much for taking the time to convert my dictionary. I'll try it ASAP on my Bookeen Inkpad 3. Thanks again. |
|
02-03-2020, 01:26 PM | #278 | ||
Guru
Posts: 897
Karma: 149877
Join Date: Jul 2013
Location: Netherlands
Device: Cracked HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
|
Quote:
Quote:
I did forget to test the dictionary, because the generated Stardict info-file looked fine. If something is off about the dictionary, would you please let me know? If others would like their dictionaries converted too, post in the thread Pocketbook dictionary format revisited. Last edited by Markismus; 02-04-2020 at 02:56 AM. |
||
02-04-2020, 03:20 PM | #279 | |
Groupie
Posts: 169
Karma: 100516
Join Date: Jan 2018
Device: Cybook Orizon, PocketBook Touch HD
|
Quote:
|
|
02-05-2020, 03:09 AM | #280 |
Guru
Posts: 897
Karma: 149877
Join Date: Jul 2013
Location: Netherlands
Device: Cracked HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
|
Yes, I see it. Garbage in = garbage out, I am afraid.
This is the entry in the chunk c_1 in the archive dict-file: Code:
<b>A,</b> N. m. [<f>&a;</f>] ou [<f>&â;</f>] Voyelle et première lettre de l'alphabet. <i>Une panse d' <i>a </i>, </i>la première partie d'un petit <i>a </i> dans l'écriture. <f>&os;</f> <i>N'avoir pas fait une panse d' <i>a </i>, </i>c'est-à-dire n'avoir rien écrit. <f>&ns;</f> <i>Prouver par A + B, </i>avec précision et rigueur. <f>&ns;</f> <i>De A à Z, </i>du début à la fin. <f>&ns;</f> <i>A4, </i>format d'une feuille de papier de 21 X 29,7 cm. <i>A3, </i>format 29,7 X 42 cm. <f>&ns;</f> La.
As you can see, the accented characters are not the problem. Rather, the idx-file contains a doctype definition that defines all those entities: Code:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"[ <!ENTITY ns "♦"> <!ENTITY os "•"> <!ENTITY oo "›"> <!ENTITY co "‹"> <!ENTITY a "a"> <!ENTITY â "ɑ"> <!ENTITY an "ɑ̃"> <!ENTITY b "b"> <!ENTITY d "ɗ"> <!ENTITY e "ə"> <!ENTITY é "e"> <!ENTITY è "ɛ"> <!ENTITY in "ɛ̃"> <!ENTITY f "f"> <!ENTITY g "ɡ"> <!ENTITY h "h"> <!ENTITY h2 "'"> <!ENTITY i "i"> <!ENTITY j "J"> <!ENTITY k "k"> <!ENTITY l "l"> <!ENTITY m "m"> <!ENTITY n "n"> <!ENTITY gn "ɲ"> <!ENTITY ing "ɳ"> <!ENTITY o "o"> <!ENTITY o2 "ɔ"> <!ENTITY oe "ɶ"> <!ENTITY on "ɔ̃"> <!ENTITY eu "ɸ"> <!ENTITY un "ɶ̃"> <!ENTITY p "p"> <!ENTITY r "ʀ"> <!ENTITY s "s"> <!ENTITY ch "ʃ"> <!ENTITY t "t"> <!ENTITY u "ɥ"> <!ENTITY ou "u"> <!ENTITY v "v"> <!ENTITY w "w"> <!ENTITY x "x"> <!ENTITY y "y"> <!ENTITY z "z"> <!ENTITY Z "ʒ">]> <html xml:lang="fr" xmlns="http://www.w3.org/1999/xhtml"> <head> <title></title> </head> <body> sametypesequence=m sametypesequence=h Last edited by Markismus; 02-05-2020 at 04:04 AM. |
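To turn a doctype like the one above into a lookup table, the entity declarations can be pulled out with a regular expression. A minimal sketch, assuming every declaration has the simple `<!ENTITY name "value">` shape shown here; `parse_entities` is a made-up name and the sample string is shortened from the real doctype.

```python
import re

# Matches declarations of the form <!ENTITY name "value">.
ENTITY_DECL = re.compile(r'<!ENTITY\s+(\S+)\s+"([^"]*)"\s*>')


def parse_entities(doctype):
    """Return a dict mapping entity names to their replacement text."""
    return dict(ENTITY_DECL.findall(doctype))


# Shortened sample; the real doctype declares many more entities.
sample = '<!DOCTYPE html [ <!ENTITY ns "♦"> <!ENTITY os "•"> <!ENTITY o2 "ɔ"> ]>'
table = parse_entities(sample)
for name, value in table.items():
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in value)
    print(f"&{name}; -> {value} ({code_points})")
```

Printing the code points makes it easy to spot values that consist of two characters, such as a base letter followed by a combining tilde.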
02-05-2020, 03:23 AM | #281 | |
Enthusiast
Posts: 34
Karma: 510
Join Date: Feb 2016
Device: Kobo
|
Quote:
I've tested the dic file on my Inkpad 3. It works fine when I am reading a book and press on a word: I get the definition and the page layout is good. But if I use the dictionary directly in the dictionary manager, I lose the layout and some strange characters are inserted in the text, such as " <f>&os;</f> ". That's what EastEriq mentioned. And as EastEriq said, the phonetic patterns are not shown when reading a book, and are separated by "&" in the dictionary manager. I find it strange that it works perfectly while I read a book but not in the dictionary manager. But I rarely use the dictionary manager... |
|
02-05-2020, 03:44 AM | #282 |
Guru
Posts: 897
Karma: 149877
Join Date: Jul 2013
Location: Netherlands
Device: Cracked HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
|
@tropoy, it seems that your dictionary program used whilst reading a book properly handles html automatically and knows the apparently typical symbol references defined in the doctype. This is somewhat expected, because converter.exe seems to strip out all html-tags that are more complex than bold and italics: All the interpretation needs to be on the interpreter side.
However, your dictionary manager doesn't know html at all, it seems. What dictionary manager are you referring to? @EastEriq The tags <f> and </f> are new and were demolished by my script with the conversion of ">" to "<". This is corrected now in the script. Also now all the ampersands are left alone if they are closely followed by a ";". Last edited by Markismus; 02-05-2020 at 05:53 AM. |
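The "leave ampersands alone if they are closely followed by a semicolon" rule can be written as a negative lookahead. This is not the actual conversion script, only a sketch under two assumptions: that other ampersands should be escaped to `&amp;`, and that "closely" means within 8 non-space characters. Both the function name and the limit are made up.

```python
import re

# An ampersand counts as the start of an entity reference when 1 to 8
# characters (no whitespace, '&', ';', '<' or '>') and then ';' follow it.
BARE_AMPERSAND = re.compile(r"&(?![^\s&;<>]{1,8};)")


def escape_bare_ampersands(text):
    """Escape '&' to '&amp;' unless it already starts an entity reference."""
    return BARE_AMPERSAND.sub("&amp;", text)


print(escape_bare_ampersands("A & B <f>&os;</f> [<f>&â;</f>] R&D"))
```

Existing references such as `&os;`, `&amp;` or `&#9830;` pass through unchanged, so running the function twice gives the same result as running it once.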
02-05-2020, 07:52 AM | #283 |
Guru
Posts: 897
Karma: 149877
Join Date: Jul 2013
Location: Netherlands
Device: Cracked HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
|
Would you test these, please? I've created a filter to replace all instances of a named entity with its numerical counterpart based on a given DOCTYPE. The current information flow is:
Bookeen dict --> using Penelope --> Stardict --> using script --> xdxf
xdxf --> using script --> Pocketbook dic
xdxf --> using Linguae with the settings unsorted and sametypesetting h --> Pocketbook dic
(Linguae makes a mess of this dictionary if it tries to ASCII sort, and searching for 'a' would then return 'ABN'.) Last edited by Markismus; 02-06-2020 at 04:25 PM. |
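A filter of the kind described above might look like the following. It is a sketch, not the script from this thread: it takes an already built name-to-text table (three hand-picked entries from the doctype, written as escapes) and rewrites each known `&name;` as decimal character references. `to_numeric_refs` is a made-up name.

```python
import re

# A named reference: '&', a name without whitespace or markup characters, ';'.
# Numeric references such as &#8226; are excluded by the '#' in the class.
ENTITY_REF = re.compile(r"&([^\s&;#<>]+);")


def to_numeric_refs(text, table):
    """Replace known named entities with decimal character references."""
    def replace(match):
        value = table.get(match.group(1))
        if value is None:
            return match.group(0)  # unknown name: leave it untouched
        # One reference per character, so two-character values still work.
        return "".join(f"&#{ord(ch)};" for ch in value)

    return ENTITY_REF.sub(replace, text)


# Hand-picked entries: os = bullet, ns = diamond, an = alpha + combining tilde.
table = {"os": "\u2022", "ns": "\u2666", "an": "\u0251\u0303"}
print(to_numeric_refs("<f>&os;</f> [<f>&an;</f>] &amp;", table))
# prints: <f>&#8226;</f> [<f>&#593;&#771;</f>] &amp;
```

Standard entities that are not in the table, like `&amp;`, are left for the reading application to handle.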
02-05-2020, 08:12 AM | #284 | |
Enthusiast
Posts: 34
Karma: 510
Join Date: Feb 2016
Device: Kobo
|
Quote:
I've tried your new files... It seems to behave exactly the same as before: still "<f>&os;</f> " in the internal dictionary manager of the Inkpad 3, and a good layout but no phonetics when reading a book. Thanks Last edited by tropoy; 02-05-2020 at 08:23 AM. |
|
02-05-2020, 11:09 AM | #285 |
Groupie
Posts: 169
Karma: 100516
Join Date: Jan 2018
Device: Cybook Orizon, PocketBook Touch HD
|
Ah, that there is an entity translation table somewhere makes sense. However, no joy with your latest .dic for me. Three different applications, pbreader, stock PB dictionary app and CR3 differ in whether they handle standard html entities like or not, and whether they literally render the custom ones or ignore them, but it seems they are all still there. Are you sure you uploaded your last conversion?
Last edited by EastEriq; 02-05-2020 at 12:18 PM. |
|