
MobileRead Forums > E-Book Readers > Kobo Reader > Kobo Developer's Corner

Old 11-24-2012, 02:51 PM   #106
gouni
Connoisseur
Posts: 86
Karma: 546021
Join Date: Nov 2012
Device: kobo
I am taking a break.

A French character encoding bug:

I have trouble with é and è in my UTF-8 test.
Old 11-24-2012, 03:16 PM   #107
tshering
Wizard
Posts: 3,489
Karma: 2914715
Join Date: Jun 2012
Device: kobo touch
Quote:
Originally Posted by gouni View Post
I am taking a break.

A French character encoding bug:

I have trouble with é and è in my UTF-8 test.
You are right. After hard work it is best to have a relaxing break. After you have recovered, make sure that the last line in the index file is followed by LF. If there are problems with the character encoding please give more details.
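The check for the final LF can be sketched in Python. This is only a minimal sketch: the helper name ensure_trailing_lf is made up for illustration and is not part of the marisa tools. It appends a single LF only when the last byte of the file is not already one.

```python
def ensure_trailing_lf(path):
    """Append one LF to the file if its last byte is not already LF.

    Returns True if the file was changed, False otherwise.
    (Hypothetical helper, not part of the marisa tools.)
    """
    with open(path, "rb") as f:
        data = f.read()
    # An empty file or a file that already ends with LF needs no fix.
    if data == b"" or data.endswith(b"\n"):
        return False
    with open(path, "ab") as f:
        f.write(b"\n")
    return True
```

Calling ensure_trailing_lf("index.txt") before building the words file would then make sure the last word is terminated like all the others.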
Old 11-24-2012, 03:34 PM   #108
gouni
Connoisseur
I don't understand.

My index.txt contains the words below; the number after each word is the result I get from marisa-lookup:

allocation = 1
allocution = 2
allodial = 0
allographe = 3
allogène = -1
Old 11-24-2012, 04:06 PM   #109
tshering
Wizard
I am not sure, but I guess your index file is OK. The problem might be caused by the encoding of the Windows terminal. Type
Code:
marisa-predictive-search words
At the marisa-predictive-search prompt type
Code:
a
and hit Enter. All five words should then be listed.
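Why the terminal encoding matters can be illustrated with a few lines of Python. marisa stores the words as plain bytes, so a word typed in the console only matches if the console sends the same bytes as the index file contains. The sketch assumes code page 850 for the console; that is a common default on Western European Windows, but the code page actually used on gouni's machine is not known.

```python
word = "allogène"

# Bytes as they are stored in a UTF-8 index file.
utf8_bytes = word.encode("utf-8")

# Bytes a console would send under the ASSUMED code page 850.
cp850_bytes = word.encode("cp850")

print(utf8_bytes)
print(cp850_bytes)

# The two byte sequences differ, so a byte-wise lookup cannot match.
print(utf8_bytes == cp850_bytes)

# Decoding the UTF-8 bytes as cp850 shows how the console displays
# the word: the single letter è turns into two other signs.
# ascii() is used so that printing works in any terminal.
print(ascii(utf8_bytes.decode("cp850")))
```

Words without accented letters are identical in both encodings, which would explain why only allogène fails.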
Old 11-24-2012, 04:36 PM   #110
gouni
Connoisseur
Yes! A thousand thanks, tshering.


It is the Windows console that is the problem.

I now get my five words; only my è is shown as a different sign.

Just one question: my index is in UTF-8 with BOM.

Should my dictionary also be in UTF-8, or in UTF-8 with BOM?
Old 11-24-2012, 04:41 PM   #111
gouni
Connoisseur
It is getting late, so I'll rest. I am happy, and thank you for your help.

Tomorrow I'll move forward and make the files aa, ab, ac, ad, ae ... zz. For the moment I am going to sleep.
Old 11-24-2012, 05:29 PM   #112
tshering
Wizard
I hope you had a refreshing sleep!

As for the format of the index, it is UTF-8 without BOM. In case of the html files, you can use both, but I would stick to UTF-8 without BOM.
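If the editor has already saved the index with a BOM, the three BOM bytes sit in front of the first word and would presumably end up as part of that word. A minimal Python sketch for removing them; the function names are made up for illustration:

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte order mark, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def rewrite_without_bom(path):
    """Rewrite the file in place without BOM (hypothetical helper)."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(strip_utf8_bom(data))
```

rewrite_without_bom("index.txt") leaves a file without BOM unchanged, so it is harmless to run it on every file.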

Before you start making the files aa ab and so on, try to make a small dictionary with just one html in order to check whether it is working. Make sure that you have an epub with the corresponding words so that you can check the dictionary function easily.
Old 11-25-2012, 07:45 AM   #113
gouni
Connoisseur
My dictionary test works very well. A hundred thanks to you all.

This time I kept the original html tags <g> <i>, the separators and the signs. The presentation looks very beautiful.

I still need your help with Notepad++:
how do I select lines 868442 to 920568 to copy and paste them?

I can already jump to a line xxxxx with the "go to line" function.


The completed dictionary will be yours.
Old 11-25-2012, 10:56 AM   #114
tshering
Wizard
Quote:
Originally Posted by gouni View Post
The presentation looks very beautiful.
I am glad for you!

Quote:
Originally Posted by gouni View Post
I still need your help with Notepad++:
how do I select lines 868442 to 920568 to copy and paste them?
I downloaded Notepad++ yesterday for the first time. Therefore I do not know much about it.

You could try something different. Save the following code as dictlines.bat into the folder where your text file is.
Code:
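@echo off
rem dictlines.bat: print the lines FROM_LINE to TO_LINE of INPUT_FILENAME
if [%1] == [] goto usage
if [%2] == [] goto usage
if [%3] == [] goto usage
setlocal EnableDelayedExpansion
rem Count from 1 so that line 1 is the first line of the file.
rem Note: for /f skips empty lines, so they are not counted.
set /a counter=1
rem The options below set eol to a line feed and delims to nothing,
rem so every line is read in full. The empty line is required.
rem The counter is compared without quotes to get a numeric comparison.
for /f ^"usebackq^ eol^=^

^ delims^=^" %%a in (%3) do (
        if !counter! GTR %2 goto :eof
        if !counter! GEQ %1 echo %%a
        set /a counter+=1
)
goto :eof
:usage
echo Usage: dictlines.bat FROM_LINE TO_LINE INPUT_FILENAME ^> RESULT_FILENAME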
At the command prompt write for instance:
Code:
dictlines 1 4 "mydic.txt" > ét.txt
and hit ENTER. This will write the lines 1 to 4 into the file ét.txt.

But producing the whole dictionary in this way is too much manual work and too time consuming. I was hoping ShellShock would help with a piece of C code at this point.
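In the meantime, the same line extraction can be sketched in Python. This is only an illustration under assumptions: the function names are made up, the input is assumed to be UTF-8, and line numbers are 1-based and inclusive as in Notepad++. Unlike the batch file, empty lines are counted too.

```python
import itertools


def extract_lines(in_path, first, last):
    """Return lines first..last (1-based, inclusive) of a UTF-8 text file."""
    # newline="" keeps the original line endings untouched.
    with open(in_path, encoding="utf-8", newline="") as f:
        return list(itertools.islice(f, first - 1, last))


def write_lines(in_path, out_path, first, last):
    """Copy lines first..last of in_path into out_path."""
    with open(out_path, "w", encoding="utf-8", newline="") as out:
        out.writelines(extract_lines(in_path, first, last))
```

For the example above, write_lines("mydic.txt", "ét.txt", 1, 4) would write lines 1 to 4 into ét.txt; the range 868442 to 920568 works the same way.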

Last edited by tshering; 11-25-2012 at 11:02 AM.
Old 11-25-2012, 11:11 AM   #115
tshering
Wizard
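A correction for the Usage line of dictlines.bat: the > has to be escaped. Instead of
Code:
INPUT_FILENAME >
it has to read
Code:
INPUT_FILENAME ^>
Otherwise the usage message is redirected into a file instead of being displayed.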
Old 11-25-2012, 12:06 PM   #116
gouni
Connoisseur
Thanks I'll try to digest all this.
Old 11-29-2012, 11:03 AM   #117
gouni
Connoisseur
Hello

Starting from the file index.txt,

marisa-build -owords index.txt

gives the file words.

I want to go the reverse way:

I have a words file and I want to recover the index file.

Is it possible to get the index back from the words file with marisa?
Old 11-29-2012, 02:08 PM   #118
tshering
Wizard
Quote:
Originally Posted by gouni View Post
Hello

Starting from the file index.txt,

marisa-build -owords index.txt

gives the file words.

I want to go the reverse way:

I have a words file and I want to recover the index file.

Is it possible to get the index back from the words file with marisa?
Unfortunately the marisa tools do not offer a possibility to dump the marisa-file as a whole. With marisa-reverse-lookup you can retrieve the content line by line by typing the line number at the marisa prompt. This, of course, is time consuming in case of large files.
One thing you can do is the following. Type
Code:
marisa-predictive-search -n0 words > index.txt
Then type at the marisa prompt:
Code:
a ENTER
b ENTER
c ENTER
and so on.
This will write all entries starting with a, b, c and so on, together with some additional information, to index.txt. You can then edit the file and remove the additional information. Be aware that marisa is case sensitive.
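Removing the additional information can be sketched in Python. The sketch assumes that every result line has the form ID, TAB, WORD, TAB, QUERY and that all other lines (for example a line counting the results) contain no tab. This layout is an assumption and should be checked against the actual output first. The function name is made up for illustration.

```python
def keys_from_search_output(lines):
    """Collect the word (second tab-separated field) of each result line.

    ASSUMED line layout: ID<TAB>WORD<TAB>QUERY. Lines without a tab are
    skipped, and a word that is listed more than once is kept only once.
    """
    keys = []
    seen = set()
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2:
            continue  # not a result line
        key = fields[1]
        if key not in seen:
            seen.add(key)
            keys.append(key)
    return keys
```

Reading the redirected file as UTF-8, passing its lines to keys_from_search_output and writing the result back with one word per line, each followed by LF, should give a file that can be fed to marisa-build again.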
Old 11-30-2012, 01:18 PM   #119
gouni
Connoisseur
Thanks, tshering.
Old 12-01-2012, 02:44 AM   #120
Papi
Addict
Posts: 311
Karma: 547600
Join Date: Jul 2010
Location: Paris
Device: Kindle Keyboard, Kindle NT, PRS-650
Very interesting thread. I'd like to build my own dictionaries too, but with commercial dictionaries as sources. Back when I was in the Kindle world, I bought some Mobipocket dictionaries that I'd love to be able to use on the Kobo. Has anyone tried something like that?

Also, in French we have something very annoying for dictionaries, and it is indeed not handled by the French Larousse dictionary found in the Kobo: s', l', m', t', which can precede a verb or a noun. For example, abris -> l'abris. If I put l'abris as a variant of abris, as far as I understood, it won't work (as it didn't with go/went). Maybe a file l'.html would work (as shown in the o'clock example), but it would contain a lot of words, basically all the nouns and verbs starting with a vowel.