Old 04-17-2023, 08:25 AM   #1
xcube
Junior Member
xcube began at the beginning.
 
Posts: 3
Karma: 10
Join Date: Apr 2023
Device: Pocketbook Lux 2
Install Italian "Zingarelli" dictionary.

Hi,

Is it possible to install the Italian dictionary "Zingarelli" on a PocketBook Lux 2?

Regards.
xcube is offline   Reply With Quote
Old 04-18-2023, 03:06 PM   #2
EastEriq
Groupie
EastEriq can program the VCR without an owner's manual.
 
Posts: 199
Karma: 195502
Join Date: Jan 2018
Device: Cybook Orizon, PocketBook Touch HD
Where do you get it from and in which format? I have a 7.2MB VocabolarioItaliano.dic, but I forgot where I got it from.
EastEriq is offline   Reply With Quote
Old 04-19-2023, 03:05 PM   #3
xcube
Junior Member
 
Posts: 3
Karma: 10
Join Date: Apr 2023
Device: Pocketbook Lux 2
Quote:
Originally Posted by EastEriq View Post
Where do you get it from and in which format? I have a 7.2MB VocabolarioItaliano.dic, but I forgot where I got it from.
Hi,

Could you pass me the "VocabolarioItaliano.dic"?

I am trying to get the Italian dictionary from the Kindle.
xcube is offline   Reply With Quote
Old 04-20-2023, 01:04 PM   #4
EastEriq
Groupie
 
Posts: 199
Karma: 195502
Join Date: Jan 2018
Device: Cybook Orizon, PocketBook Touch HD
What I stored is probably this, from this post of 2012. The OP there says it was converted from a freely available one, at a link which is dead by now.

Digging through the forums, I gather that at some point a firmware upgrade for some PocketBook model included the Zingarelli as a bonus. I don't know if that file has been circulated since.

OTOH, it may be possible to convert the dictionary you have into .dic format. See e.g. https://www.mobileread.com/forums/sh...d.php?t=325203 . I myself would like to get it in StarDict format for KOReader if possible (I'm not aware of a toolchain for converting .dic to StarDict, only the opposite).

Lastly, I also found on my disk a smaller Zanichelli Italian Dictionary.ld2 (2.9 MB, which may still be a respectable size). I completely forgot what format it is and where I got it from.

ETA: oh, the zip from https://www.mobileread.com/forums/at...5&d=1347238978 also contains a plain txt file. Converting to StarDict may be easy.

Last edited by EastEriq; 04-20-2023 at 01:48 PM.
EastEriq is offline   Reply With Quote
Old 04-20-2023, 01:50 PM   #5
EastEriq
Groupie
 
Posts: 199
Karma: 195502
Join Date: Jan 2018
Device: Cybook Orizon, PocketBook Touch HD
It is mentioned here as a bonus that came with a firmware for the Pb 912.

Last edited by EastEriq; 04-21-2023 at 03:59 AM. Reason: link description as text
EastEriq is offline   Reply With Quote
Old 04-20-2023, 02:38 PM   #6
Markismus
Guru
Markismus causes much rejoicing
 
Markismus's Avatar
 
Posts: 955
Karma: 149907
Join Date: Jul 2013
Location: Rotterdam
Device: HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
I think my script will take it if you rename the txt file to a csv extension, e.g. vocabolario_Italiano.txt.csv, and give "   " (3 spaces) as the delimiter. Something like
Code:
perl pocketbookdic.pl vocabolario_Italiano.txt.csv it '   '
The script and further explanation are on github.

Last edited by Markismus; 04-20-2023 at 02:40 PM.
Markismus is offline   Reply With Quote
Old 04-20-2023, 03:39 PM   #7
xcube
Junior Member
 
Posts: 3
Karma: 10
Join Date: Apr 2023
Device: Pocketbook Lux 2
Thank you very much for all the information and the dictionary.
I will try to find out about everything.
xcube is offline   Reply With Quote
Old 04-21-2023, 05:58 AM   #8
EastEriq
Groupie
 
Posts: 199
Karma: 195502
Join Date: Jan 2018
Device: Cybook Orizon, PocketBook Touch HD
Quote:
Originally Posted by Markismus View Post
I think my script will take it if you rename the txt file to a csv extension, e.g. vocabolario_Italiano.txt.csv, and give "   " (3 spaces) as the delimiter. Something like
Code:
perl pocketbookdic.pl vocabolario_Italiano.txt.csv it '   '
The script and further explanation are on github.
Yes, right. At first look, though, there are some irregularities in the use of two and three spaces as field delimiters, maybe due to the previous conversion process. I'll see if I can find a pattern and a useful regexp to massage the source.
EastEriq is offline   Reply With Quote
Old 04-22-2023, 05:06 PM   #9
EastEriq
Groupie
 
Posts: 199
Karma: 195502
Join Date: Jan 2018
Device: Cybook Orizon, PocketBook Touch HD
I've spent some time regularizing this Vocabolario_Italiano.txt. Looking into it, I found a number of blank lines, repeated keywords, repeated entries, and irregular spacing. Moreover, proper names were not capitalized. Possibly all this comes from an earlier conversion attempt.
I corrected what I caught, and standardized on three spaces as the delimiter. I attach my result here.
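
For anyone wanting to redo this kind of cleanup, here is a minimal Python sketch. It is not the procedure actually used for the attached file: it assumes one entry per line and that the first run of two or more spaces separates the headword from the definition. The names normalize_lines and DELIM are made up for illustration.

```python
import re

DELIM = "   "  # three spaces, the delimiter settled on above
_SPLIT = re.compile(r" {2,}")  # any run of two or more spaces


def normalize_lines(lines):
    """Drop blank lines and exact duplicate entries, and rewrite the
    headword/definition separator as exactly three spaces."""
    seen = set()
    cleaned = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        parts = _SPLIT.split(line, maxsplit=1)
        if len(parts) != 2:
            # No delimiter found: keep the line untouched for manual review.
            cleaned.append(line)
            continue
        entry = (parts[0], parts[1])
        if entry in seen:
            continue  # skip exact repeated entries
        seen.add(entry)
        cleaned.append(f"{parts[0]}{DELIM}{parts[1]}")
    return cleaned
```

Repeated keywords with different definitions are kept on purpose, since they may be distinct senses of the same word.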

With that I'm able to run Markismus' tool and generate an xdxf (in fact two, _reconstructed and _unbloated; I don't know which is better) and the .dic again for PocketBook.

What I was not able to do at all is generate the StarDict version from the xdxf. PocketBookDict didn't produce one, despite my setting isCreateStardictDictionary = 1 in DicControls.pm and DicGlobals.pm (I don't know Perl); the new version of pyglossary requires Python 3.9 while I am on 3.8, and an older pyglossary gives me an empty .dict.

If someone wants to take over from here....

PS. The attached file contains a single & at line 69831, which may have to be replaced with &amp; depending on the toolchain.
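
If the toolchain produces XML (as xdxf is), a bare ampersand can be escaped automatically. A minimal sketch, assuming the text should keep any entities it already contains; the function name is hypothetical:

```python
import re

# Matches "&" only when it does not already start an entity
# such as &amp; or &#38; or &#x26;
_BARE_AMP = re.compile(r"&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);)")


def escape_bare_ampersands(text: str) -> str:
    """Replace every bare '&' with '&amp;' and leave existing entities alone."""
    return _BARE_AMP.sub("&amp;", text)
```

For a plain txt or csv target the character should probably be left as it is.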
Attached Files
File Type: zip VocabolarioItaliano.zip (5.52 MB, 219 views)

Last edited by EastEriq; 04-22-2023 at 05:11 PM.
EastEriq is offline   Reply With Quote
Old 04-22-2023, 08:24 PM   #10
nezih
Enthusiast
nezih is less competitive than you.
 
nezih's Avatar
 
Posts: 43
Karma: 14828
Join Date: Feb 2023
Device: Boox Page, Kobo Aura SE
I unpacked the .dic file with this script I wrote: https://gist.github.com/anezih/b20e2...861efeff3f6072
When unpacked, I got a TSV file of 20K entries, far fewer than the 80K in the txt file. Anyway, I was able to convert the TSV file to StarDict via PyGlossary; see the attached screenshot.

If you want to convert it yourself, run: python pocketbook2tsv.py VocabolarioItaliano.dic

EDIT: There was a bug it seems, now it finds 80449 entries.
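
A count like this can be cross-checked against the txt source by comparing headword sets. A rough sketch, assuming the TSV uses a tab between headword and definition and the txt uses three spaces; the function names are invented for the example:

```python
def headwords(lines, delimiter):
    """Return the set of headwords (the text before the first delimiter)."""
    words = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # ignore blank lines
        words.add(line.split(delimiter, 1)[0].strip())
    return words


def compare(tsv_lines, txt_lines):
    """Return (headwords only in the txt, headwords only in the TSV)."""
    tsv = headwords(tsv_lines, "\t")
    txt = headwords(txt_lines, "   ")
    return sorted(txt - tsv), sorted(tsv - txt)
```

Both arguments can be open file objects, e.g. opened with encoding="utf-8" if that is what the files use.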
Attached Thumbnails
vocabolarioitaliano.png (87.3 KB)

Last edited by nezih; 04-22-2023 at 09:46 PM. Reason: correction
nezih is offline   Reply With Quote
Old 04-23-2023, 04:23 PM   #11
EastEriq
Groupie
 
Posts: 199
Karma: 195502
Join Date: Jan 2018
Device: Cybook Orizon, PocketBook Touch HD
Good to know that the PocketBook .dic format has been decoded to some extent.

I also understood that your gist works correctly only in Python 3.9, not in 3.8 or 2.7.

I quickly compared the results of your gist on VocabolarioItaliano.dic from the original post and on VocabolarioItaliano_reconstructed.dic, which is the output of Markismus' PocketBookDict. The results differ mostly in spacing (which I corrected) and in the ordering of words with accented vowels. The result from VocabolarioItaliano_reconstructed.dic has the ordering right imho: it alphabetizes accented vowels as if they were unaccented, as is customary in Italian, rather than listing them in code order after all unaccented letters. But I also spotted cases in which one file includes sense (1) of a term where the other includes sense (2) (e.g. abbaglio, abbonare, acapnia, accapponare). We don't know how the original VocabolarioItaliano.dic was really generated, but I note that these multiple senses are all present in the VocabolarioItaliano.txt of the original attachment. @Markismus, bug in your program? As for abbozzare, I have 3 entries in the csv I attached, whereas your result holds only (1).
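
To illustrate the ordering difference: a sort key that ignores accents can be sketched in Python with the standard unicodedata module. This is only an approximation of Italian collation, not what either tool does internally, and italian_sort_key is a made-up name.

```python
import unicodedata


def italian_sort_key(word: str):
    """Sort key that treats accented vowels like their unaccented form.

    The original word is kept as a tie-breaker so that, e.g., an accented
    and an unaccented spelling still get a stable relative order.
    """
    decomposed = unicodedata.normalize("NFD", word)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return (base.casefold(), word)


words = ["zeta", "abbà", "àbaco", "abate"]
by_code_point = sorted(words)                     # accented initials end up last
by_base_letter = sorted(words, key=italian_sort_key)
```

With plain code-point sorting "àbaco" lands after "zeta"; with the key above it sorts next to the other "ab..." words.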

Not sure what to do with _info.txt, which contains as Wordlist only a few of the terms, with some missing the initial letter (e.g. "ccozzaticcio", "d usum delphini", "pparenza"), but whatever...
EastEriq is offline   Reply With Quote
Old 04-23-2023, 05:01 PM   #12
EastEriq
Groupie
 
Posts: 199
Karma: 195502
Join Date: Jan 2018
Device: Cybook Orizon, PocketBook Touch HD
Btw, @Markismus, why
Quote:
abat-jour s.m. invar. paralume | lampada con paralume.
in the csv becomes
Quote:
<head><k>abat-jour</k></head><def>abat-jour s.m. invar. paralume | lampada con paralume.</def>
in the xdxf? (repeated head term in the <def>, the only case I spotted)
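
To check whether it really is the only case, the xdxf can be scanned for definitions that start with their own head term. A quick sketch, assuming entries are laid out exactly as in the snippet above; a regex is used instead of a real XML parser, and the function name is invented:

```python
import re

# Assumes the <head><k>...</k></head><def>...</def> layout quoted above.
_ENTRY = re.compile(r"<head><k>(.*?)</k></head><def>(.*?)</def>", re.DOTALL)


def repeated_heads(xdxf_text: str):
    """Return the head terms whose definition begins with the term itself."""
    hits = []
    for key, definition in _ENTRY.findall(xdxf_text):
        if definition.lstrip().startswith(key + " "):
            hits.append(key)
    return hits
```

It may flag false positives where a definition legitimately begins with the headword, so the hits still need a look by eye.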
EastEriq is offline   Reply With Quote
Old 04-23-2023, 05:23 PM   #13
nezih
Enthusiast
 
nezih's Avatar
 
Posts: 43
Karma: 14828
Join Date: Feb 2023
Device: Boox Page, Kobo Aura SE
The Wordlist is composed of the first entry of each zlib stream. The dictionary program may use it to jump to the relevant stream based on sorting. Other than that, the txt file contains keyboards, collates and other stuff that was required by the converter program.
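
The jump could work roughly like the sketch below: a binary search over the first words picks the stream that should hold a given term. This is a guess at the idea, not the reader's actual code; it assumes the Wordlist is sorted and compares plain strings, whereas the real program presumably uses the collates file.

```python
import bisect


def find_block(first_words, word):
    """Return the index of the stream whose range should contain `word`,
    or None if `word` sorts before the first stream's first entry."""
    index = bisect.bisect_right(first_words, word) - 1
    return index if index >= 0 else None
```

Only that one stream then has to be decompressed and searched.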

About the Python version: the type hints I added raise the minimum version to 3.9, I guess.
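
If that is the cause, it is most likely the built-in generic syntax such as list[str] or tuple[str, str], which only became subscriptable in Python 3.9 (PEP 585). A hint written with the typing module also runs on 3.8. The function below is a hypothetical illustration, not code from the gist:

```python
from typing import Tuple


def split_entry(line: str) -> Tuple[str, str]:
    """Split a TSV line into (headword, definition).

    typing.Tuple works on Python 3.8; writing tuple[str, str] here
    would raise a TypeError at definition time before 3.9.
    """
    head, _, definition = line.partition("\t")
    return head, definition
```

Putting from __future__ import annotations at the top of the file is another option, as long as the new syntax appears only in annotations.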

For the spacing differences: at line 58, I convert every run of two or more spaces to one. If you don't want that, add # before that line to comment it out. Finally, at line 59 I try to remove the control characters, which I thought were the results of decoding errors, but they may denote markup for styling like bold, italic etc. I don't have access to a PocketBook device, so I don't know how those are rendered in lookup results.
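
For readers without the gist open, the two steps amount to something like this sketch. It is a reconstruction from the description above, not the gist's actual lines 58 and 59, and the switches are added here so either step can be turned off without editing the code:

```python
import re

_MULTISPACE = re.compile(r" {2,}")
# ASCII control characters except tab, newline and carriage return
_CONTROL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def clean_definition(text, collapse_spaces=True, strip_control=True):
    """Optionally strip control characters and collapse runs of spaces."""
    if strip_control:
        text = _CONTROL.sub("", text)
    if collapse_spaces:
        text = _MULTISPACE.sub(" ", text)
    return text
```

Passing strip_control=False keeps the control characters in case they turn out to be styling markup.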
nezih is offline   Reply With Quote
Old 04-24-2023, 01:57 AM   #14
EastEriq
Groupie
 
Posts: 199
Karma: 195502
Join Date: Jan 2018
Device: Cybook Orizon, PocketBook Touch HD
Quote:
Originally Posted by nezih View Post
The Wordlist is composed of the first entry of each zlib stream.
Strange that a few of them are off by one character, though. Possibly a bug in your script? But alas, not really a concern. And frankly, it is quite likely that the closed converter.exe, which is what generated those .dics, is much more bug-ridden than the sources we can look into. That could explain the omission of entries with multiple senses, the alphabetic sorting, and whatnot.
EastEriq is offline   Reply With Quote
Old 04-24-2023, 11:22 AM   #15
Markismus
Guru
 
Markismus's Avatar
 
Posts: 955
Karma: 149907
Join Date: Jul 2013
Location: Rotterdam
Device: HiSenseA5ProCC, Cracked OnyxNotePro, Note5, Kobo Glo, Aura
Quote:
Originally Posted by nezih View Post
Other than that, the txt file contains keyboards, collates and other stuff that were required by the converter program.

About the Python version: the type hints I added raise the minimum version to 3.9, I guess.
If you found new keyboard/collates/morphems txt files, would you be willing to submit a pull request to the github repository for the language files?
Markismus is offline   Reply With Quote