MobileRead Forums > E-Book Readers > Kobo Reader > Kobo Developer's Corner
Old 12-09-2012, 11:11 AM   #151
AlPe
Digital Amanuensis
Posts: 727
Karma: 1446357
Join Date: Dec 2011
Location: Turin, Italy
Device: Several eReaders and tablets
Doesn't the current code already do that? Other scripts will use characters with ord() > 127.

If ord(x) > 127, then character x is considered ok.

In other words: right now, a keyword goes into 11.html if and only if its 2-character prefix contains an ASCII character x (ord(x) <= 127) that is not a letter. I have tested that, in this case, the keyword is correctly retrieved.

Am I missing something?
Old 12-09-2012, 11:20 AM   #152
tshering
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
I did not look at your code. I was only guessing your code would implement the idea
Quote:
move all keywords containing a non-letter (defined as [A-Za-z]) in the first two characters into 11.html

Last edited by tshering; 12-09-2012 at 11:38 AM.
Old 12-09-2012, 11:26 AM   #153
AlPe
Digital Amanuensis
Ah, ok. No, the code is supposed to do what you asked for.

1) If the keyword is only 1 character long, the code appends an "a" to it.
2) Then it looks at the first two characters of the keyword. If each of them is either an ASCII letter or has ord() > 127, the keyword goes to a suitable ??.html file. Otherwise, it goes to 11.html.

There might still be a small "slack", for example in the mentioned case of "o'clock". But I tested that putting "o'clock" into 11.html works, so I guess the current code is fine.
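A minimal sketch of the two-step rule described in this post, for illustration only. It is not Penelope's actual code: the function name `prefix_file` is hypothetical, and the prefix is used unchanged because the thread does not say whether any case folding or other normalization is applied to the file name.

```python
import string

# [A-Za-z], the definition of "letter" used in this thread
ASCII_LETTERS = set(string.ascii_letters)


def prefix_file(keyword):
    """Return the .html file a (non-empty) keyword would be filed under."""
    if len(keyword) == 1:
        keyword += "a"  # step 1: pad one-character keywords with an "a"
    prefix = keyword[:2]
    # step 2: each of the two characters must be an ASCII letter or non-ASCII
    acceptable = all(ch in ASCII_LETTERS or ord(ch) > 127 for ch in prefix)
    # Assumption: the prefix is used as-is for the file name
    return (prefix if acceptable else "11") + ".html"
```

With this sketch, `prefix_file("o'clock")` gives `11.html` because of the apostrophe, while a keyword starting with an accented letter keeps its own two-character file.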
Old 12-09-2012, 11:35 AM   #154
tshering
Wizard
I would like to mention that entries with identical keywords (and different definitions) have to follow directly on each other. If not, the Kobo displays only the first occurrence. For "economy" reasons, I did not check for this with my Japanese-English dictionary. Therefore, I have to do it anew.
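A quick way to check this adjacency requirement before building a dictionary might look like the following sketch. The helper name `scattered_keywords` is hypothetical; it only assumes the keywords are available as a list in output order.

```python
def scattered_keywords(keywords):
    """Return keywords that reappear after a different keyword came in between."""
    seen = set()
    scattered = []
    previous = None
    for kw in keywords:
        # A repeat is fine only if it directly follows the same keyword
        if kw != previous and kw in seen and kw not in scattered:
            scattered.append(kw)
        seen.add(kw)
        previous = kw
    return scattered
```

An empty result means every group of identical keywords is contiguous.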

Last edited by tshering; 12-09-2012 at 11:43 AM.
Old 12-09-2012, 11:42 AM   #155
AlPe
Digital Amanuensis
Penelope sorts the keys (i.e., keywords) before processing and outputting them, so the code should be fine w.r.t. this issue.

Also, in the current version I wrapped the content of a word+definition into a <div>, so that two identical keywords with different definitions will be cleanly displayed one below the other.
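To illustrate why sorting takes care of the adjacency issue, here is a small sketch. It is not Penelope's code: `render_sorted` is a hypothetical name, and the text inside each `<div>` is a placeholder layout, since the thread does not show the real markup.

```python
import html


def render_sorted(entries):
    """entries: iterable of (keyword, definition) pairs; returns one <div> per entry."""
    # sorted() is stable, so identical keywords end up adjacent and keep
    # their original relative order
    ordered = sorted(entries, key=lambda pair: pair[0])
    # Assumption: "keyword: definition" is an illustrative layout only
    return [
        f"<div>{html.escape(word)}: {html.escape(definition)}</div>"
        for word, definition in ordered
    ]
```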
Old 12-12-2012, 10:34 AM   #156
tshering
Wizard
Today I ran into a problem. I made a French dictionary. To my disappointment, no word with an accented vowel in its first two letters showed up. The reason was evidently the encoding of the filenames inside the zip file. This surprised me, since my Japanese dictionaries work perfectly, so why shouldn't the French dictionary? Anyhow, I tried zipping the files under Ubuntu rather than under Windows, and the dictionary worked fine.
The culprit is a strange default behavior of 7-Zip under Windows (or maybe I should rather say the culprit was me not being aware of it). 7-Zip encodes in UTF-8 only those filenames that contain characters not supported by the local codepage. Therefore, under Windows, one has to enforce UTF-8 encoding.
Code:
7z a -tzip -mcu=on dicthtml.zip *.html words
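One way to see how the names in an existing archive were stored is to look at the UTF-8 filename flag (general-purpose bit 11, value 0x800) of each entry. The sketch below uses Python's standard `zipfile` module; the helper name `utf8_flags` is hypothetical.

```python
import zipfile

UTF8_FLAG = 0x800  # general-purpose bit 11: "file name is encoded as UTF-8"


def utf8_flags(zip_path):
    """Map each member name to True if its UTF-8 filename flag is set."""
    with zipfile.ZipFile(zip_path) as zf:
        return {
            info.filename: bool(info.flag_bits & UTF8_FLAG)
            for info in zf.infolist()
        }
```

A name with an accented character whose flag is False was stored in some other codepage. Names that are pure ASCII read the same either way, so a missing flag on those is harmless.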
Old 12-12-2012, 12:06 PM   #157
gouni
Connoisseur
Posts: 86
Karma: 546021
Join Date: Nov 2012
Device: kobo
Tshering,

Here is my small table of letters for French.

Open it with Notepad++.


http://www.mediafire.com/?ptmn10cplqr7kaa

Last edited by gouni; 12-12-2012 at 12:12 PM.
Old 12-12-2012, 01:34 PM   #158
gouni
Connoisseur
I need your help using 7-Zip.

I use the Windows console.

I have 300 files named xx.html, where xx = ab, ac, ad, ae, ...

I want to compress each of them with gzip.

The names of my *.html files are in the Windows ANSI encoding, and I want to convert them to UTF-8,

so that I end up with

my 300 files as xx.html.gz

with accented UTF-8 characters in their names.

What command should I use?

Thanks.

aa.html = aa.html.gz
.
aé.html = aé.html.gz
.
ûf.html = ûf.html.gz
.
.
zz.html = zz.html.gz

Last edited by gouni; 12-12-2012 at 01:41 PM.
Old 12-12-2012, 02:25 PM   #159
ShellShock
Wizard
Posts: 1,176
Karma: 2431850
Join Date: Sep 2008
Device: IPad Mini 2 Retina
Quote:
Originally Posted by tshering View Post
Today I ran into a problem. I made a French dictionary. To my disappointment, no word with an accented vowel in its first two letters showed up. The reason was evidently the encoding of the filenames inside the zip file. This surprised me, since my Japanese dictionaries work perfectly, so why shouldn't the French dictionary? Anyhow, I tried zipping the files under Ubuntu rather than under Windows, and the dictionary worked fine.
The culprit is a strange default behavior of 7-Zip under Windows (or maybe I should rather say the culprit was me not being aware of it). 7-Zip encodes in UTF-8 only those filenames that contain characters not supported by the local codepage. Therefore, under Windows, one has to enforce UTF-8 encoding.
Code:
7z a -tzip -mcu=on dicthtml.zip *.html words
Great find!
Old 12-12-2012, 04:08 PM   #160
tshering
Wizard
Gouni,

there is no need to convert the filenames when creating the gz files. Run

Code:
for %i in (*.html) do 7z a -tgzip "gz\%i" "%i"
then change to the gz sub-directory and do

Code:
7z a -tzip -mcu=on dicthtml.zip *.html words
If the location of 7z.exe is not known to the system, give the full path, for instance "C:\Program Files\7-Zip\7z" (including the quotation marks).
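For readers who would rather avoid the console, the two steps can be approximated with Python's standard library. This is a sketch, not a tested replacement for the 7-Zip commands: the function name `build_dicthtml` is hypothetical, and it assumes the `words` file (if there is one) sits next to the .html files. Python's `zipfile` stores non-ASCII member names as UTF-8 with the corresponding flag set.

```python
import gzip
import shutil
import zipfile
from pathlib import Path


def build_dicthtml(src_dir, out_name="dicthtml.zip"):
    """Gzip every *.html in src_dir into src_dir/gz, then zip the results."""
    src = Path(src_dir)
    gz_dir = src / "gz"
    gz_dir.mkdir(exist_ok=True)

    # Step 1: like `7z a -tgzip "gz\%i" "%i"`; gz/aa.html is the gzipped aa.html
    for page in sorted(src.glob("*.html")):
        with open(page, "rb") as fin, gzip.open(gz_dir / page.name, "wb") as fout:
            shutil.copyfileobj(fin, fout)

    # Step 2: zip the gzipped pages under their original names
    out_path = gz_dir / out_name
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for member in sorted(gz_dir.glob("*.html")):
            zf.write(member, arcname=member.name)
        words = src / "words"  # assumption: index file kept beside the pages
        if words.exists():
            zf.write(words, arcname="words")
    return out_path
```

Calling `build_dicthtml(r"e:\zztest")` would leave the archive in `e:\zztest\gz\dicthtml.zip`.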
Old 12-12-2012, 04:14 PM   #161
tshering
Wizard
Quote:
Originally Posted by gouni View Post
Tshering,

Here is my small table of letters for French.

Open it with Notepad++.


http://www.mediafire.com/?ptmn10cplqr7kaa
Thank you for the table. I am not sure what I am supposed to do with it. Is this a part of the description of your problem?
Old 12-12-2012, 09:03 PM   #162
gouni
Connoisseur
Thank you, Tshering. Yes, it is part of the description of my problem.

I do not know this area, but I am listening to your advice.

I did this with Notepad++; my text is in UTF-8, but I saved my files in utf8 names.

I work with Windows; I don't know Unix, and I don't know programming.

All my file names (éa, éb, ..., ûm .html) use Windows characters.


I read the 7-zip.chm help file to try to understand, but it is difficult for me.

Last edited by gouni; 12-12-2012 at 09:18 PM.
Old 12-13-2012, 05:22 AM   #163
tshering
Wizard
Quote:
Originally Posted by gouni View Post
All my file names (éa, éb, ..., ûm .html) use Windows characters.
This is fine. Make a backup copy of your files for safety and execute the two commands given in post #160. Tell us if it worked.
Old 12-13-2012, 07:16 AM   #164
gouni
Connoisseur
Thank you for your time, tshering. This is very difficult for me; I have never done it before. I got the second command to work (it took me a day to understand, but I got it).


I know how to do this:

7z a -tzip -mcu=on e:\zzzarch e:\zztest\*.html

Result: all files in zztest go correctly into the file zzzarch.zip.



But I cannot get the first command to work.

My folder layout on e:
e:\zztest = my folder with all files *.html

7z" for %i in (*.html) do 7z a -tgzip" "gz\%i" "%i"

Does replacing (*.html) with e:\zztest\*.html not work correctly?

Last edited by gouni; 12-13-2012 at 07:34 AM.
Old 12-13-2012, 07:31 AM   #165
tshering
Wizard
Do not replace "*.html" by "e:\zztest\*.html". Instead, change to that directory by executing
Code:
cd e:\zztest
and then execute
Code:
for %i in (*.html) do 7z a -tgzip "gz\%i" "%i"
Edit:
If you are not already on e: then execute first
Code:
e:
before you change to "e:\zztest".

Last edited by tshering; 12-13-2012 at 07:40 AM.