11-26-2012, 07:59 PM | #1 |
Zealot
Posts: 121
Karma: 6899
Join Date: Nov 2012
Device: Onyx Boox M92, Kindle Paperwhite 1 (Wi-Fi)
|
M92: problems with utf-8 pdf file and a dictionary
I use the "new Pdf Reader (beta)" and the "1.8.1 20121113 mixed" firmware on a M92 (from ArtaTech).
1. The first time I click the "Aa" icon (which opens the dictionary), it takes a long time to start and there is no "busy" indicator, so I never know whether my click was accepted and the dictionary is starting (and so I usually click the icon several times). Would it be possible to add some kind of "busy" icon? The same problem appears in other cases, too: sometimes something takes a lot of time but the user is not told to be patient, so one never knows what to do.
2. I open a utf-8 encoded pdf file (it contains German umlauts). Everything is displayed fine, but when I try to use the dictionary (the "Aa" icon), words which contain umlauts are not found at all. If I then click the "D" icon (the "full screen" dictionary window), I can see that it interprets every two-byte utf-8 character as two separate ascii characters (if I edit the word, delete these two ascii characters and insert one German character, it finds the word).
3. While working in the "D" mode (the "full screen" dictionary window), when I switch to another dictionary, I immediately see the new translation (coming from the new dictionary). This is not the case in the "Aa" mode (the small window at the bottom): after switching the dictionary, the old translation stays, even after one clicks the "?" icon. The only way to get the new translation is to select another word from the text and then select the original word again; only then does the translation from the new dictionary appear.
4. The "keyboard Language" setting is not preserved: each time I click the "D" icon, I need to "Switch Language" from "English" to "German" (note: the "system language" is "English" and this should stay). Would it be possible to have book-specific "keyboard Language" settings (or at least to preserve the last changed setting globally for all books)?
5. When I click the "abc" icon (the "related words" window in the dictionary window), I often do not get anything (even though the word itself has been found in the dictionary).
6. Some StarDict dictionaries are "colourful" (they contain a lot of html code which controls fonts, colors, ...). One example is the "WAHRIG.digital - Deutsches Wörterbuch". Such dictionaries are properly displayed in both the small "Aa" and the "full screen" "D" windows (and it actually looks nice, even on the black-and-white screen). They are, however, improperly displayed in the "abc" window (the "related words" window). One simply sees the original html code (i.e. instead of the "colourful" text, one gets a dump of the html code, one line per related word).
Best regards, Pepe.
P.S. Just a small explanation about the "1.8.1 20121113 mixed" firmware on an M92 (from ArtaTech): it is a mixture of the "Booxtor 1.8 20121113" and the "ArtaTech 1.7 20120927" firmwares (the current file is called "update-02040001.zip").
Last edited by pepe_alter_ego; 11-27-2012 at 02:29 AM. |
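The symptom in point 2 (one umlaut turning into two characters) can be reproduced on a PC. This is only a sketch under an assumption: that the selected text reaches the dictionary as UTF-8 bytes and is then decoded one byte per character, here as Latin-1. The word "Bücher" and the variable names are illustrative; nothing here is taken from the reader's actual code.

```python
# Assumption (not confirmed for the M92): the looked-up word arrives as
# UTF-8 bytes, but the dictionary field decodes each byte as one
# Latin-1 character.
word = "Bücher"

utf8_bytes = word.encode("utf-8")        # "ü" becomes two bytes: 0xC3 0xBC
misread = utf8_bytes.decode("latin-1")   # every byte becomes one character

print(len(word), len(misread))           # the misread word is one character longer
print(misread)                           # "ü" shows up as two unrelated characters

# Undoing the wrong decoding step recovers the original word, which is
# what deleting the two characters and typing one umlaut does by hand.
repaired = misread.encode("latin-1").decode("utf-8")
print(repaired == word)
```

If this assumption holds, the dictionary never sees "Bücher" at all, which would explain why words with umlauts are not found.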
11-27-2012, 03:18 AM | #2 | |
Groupie
Posts: 164
Karma: 10020
Join Date: Mar 2012
Device: Onyx Boox M92, Onyx Boox T68
|
Quote:
regards |
|
11-27-2012, 04:00 AM | #3 |
Zealot
Posts: 121
Karma: 6899
Join Date: Nov 2012
Device: Onyx Boox M92, Kindle Paperwhite 1 (Wi-Fi)
|
|
11-27-2012, 04:51 AM | #4 |
Connoisseur? Addict!
Posts: 136
Karma: 2720
Join Date: Aug 2010
Location: Germany
Device: Onyx M92
|
Maybe you should propose those changes to Artatech? What the hell is Booxtor supposed to do about this Artatech-branded device? Or do you expect the Onyx guys to dig through what Artatech changed?
This forum is about the Onyx M92. |
11-27-2012, 06:01 AM | #5 | |
Zealot
Posts: 121
Karma: 6899
Join Date: Nov 2012
Device: Onyx Boox M92, Kindle Paperwhite 1 (Wi-Fi)
|
Quote:
From what you write I assume that you actually run the original (unmodified) "Booxtor 1.8 20121113" firmware and you do NOT get any of the problems that I described, or did I get you wrong? Best regards, Pepe. |
|
11-27-2012, 06:16 AM | #6 |
Connoisseur
Posts: 99
Karma: 5810
Join Date: Jun 2012
Location: europe
Device: Boox i62HD
|
Hello, can you upload at least one page of the utf-8 pdf document?
|
11-27-2012, 06:20 AM | #7 | |
Connoisseur? Addict!
Posts: 136
Karma: 2720
Join Date: Aug 2010
Location: Germany
Device: Onyx M92
|
Quote:
I am running the firmware available for the non-branded Onyx M92, yes.
1. I don't have that problem. Everything loads right away.
2. Maybe if you could provide the UTF-8 encoded PDF document, somebody could check whether the same thing happens on their device. PDF documents tend to be error-prone.
3. Same issue.
4. Same issue.
5. Not sure about this one; I guess it is the same as for you.
6. Didn't check. |
|
11-27-2012, 07:37 AM | #8 |
Mono
Posts: 699
Karma: 13333
Join Date: Jan 2012
Device: Boox M92
|
I think 6) has been fixed in 1.8. It was definitely present in all 1.7 versions.
3) was in all 1.7 versions, but it seemed to me that it was OK in 1.8. |
11-27-2012, 08:25 AM | #9 |
Zealot
Posts: 121
Karma: 6899
Join Date: Nov 2012
Device: Onyx Boox M92, Kindle Paperwhite 1 (Wi-Fi)
|
I am unable to extract a single page from the original pdf file (the one I was using when writing the first post here), so I tried to prepare a small test case, and I noticed another problem with the dictionary.
7. I open a pdf file created by LaTeX (a scientific article, for example), then I click "Aa" and try to select a word from the text. It always selects many words at once (from one to several lines of text). Try the attached "german_utf8_letter.pdf" file: no matter where I click, the whole text of the letter is selected (the same problem appears in the "old" and in the "new" Pdf Reader).
I think I was wrong when saying that the original pdf file uses utf-8. I just noticed that it was the "ascii / text pdf previewer" on my Linux machine which was displaying utf-8. Now I looked at this file as a binary one and I have no idea what the encoding is (there are some readable ascii texts inside, but the majority is just binary data to me).
Well, if you say that you expect some bugs to be fixed in 1.8 (with respect to 1.7), then I'm afraid it really may be the case that the "mixed" firmware uses too much from the original ArtaTech 1.7 firmware (according to the description, the only "old" things should be the "synthesis IVONA, cloud, library, rss reader ...", and the rest should come from the newest "Booxtor 1.8 20121113"). I cannot test the unmodified "Booxtor 1.8 20121113" firmware, as my device will not let me perform the update (it simply ignores the update file). (Note, however, that "mSSM" reports that he sees issue 3) on his device, even though he uses the newest unmodified firmware.)
Just for the record, after I open a new pdf file and click the "Aa" dictionary icon for the first time, it takes about 10 to 15 seconds before I see the small dictionary window. Subsequent clicks are handled really fast, however (only the very first one, after I open a new pdf file, is slow).
Last edited by pepe_alter_ego; 11-27-2012 at 09:03 AM. |
11-27-2012, 09:32 AM | #10 |
Zealot
Posts: 121
Karma: 6899
Join Date: Nov 2012
Device: Onyx Boox M92, Kindle Paperwhite 1 (Wi-Fi)
|
Hi,
I have got a one-page extract from a German book (I don't know the title; it was sent to me by a colleague whom I asked for a small example). This pdf file shows all the symptoms of my original one, so it should allow you to test it on your devices. Please report the results: if there are problems specific to the "mixed" firmware, I will need to try to contact its author.
Note: issue 7) reported in my previous post does NOT appear with this pdf file, as I am able to select single words in this "book". I believe this problem appears in pdf files that originate from LaTeX.
Best regards, Pepe.
P.S. I just noticed one more thing related to issue 2) reported in my first post here. If the German umlaut character is uppercase, you will see only one character in the "full screen" dictionary window (i.e. after clicking the "D" icon), but in fact there is another, invisible one right after the one that you can see. You need to delete two characters (and then enter a single one from the keyboard) if you want to get the translation. If the German umlaut letter is lowercase, you will see both wrong ascii characters.
Last edited by pepe_alter_ego; 11-27-2012 at 10:08 AM. |
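The uppercase/lowercase difference in the P.S. is what one would expect if UTF-8 bytes are being read as Latin-1, which is an assumption here and not something confirmed for the M92. In UTF-8, the second byte of Ä, Ö and Ü falls in the range 0x80 to 0x9F, which Latin-1 maps to non-printing control characters, while the second byte of ä, ö and ü maps to a printable symbol. The helper names below are made up for this sketch.

```python
import unicodedata


def misread_as_latin1(ch: str) -> str:
    """Decode the UTF-8 bytes of ch one byte at a time as Latin-1 (assumed bug)."""
    return ch.encode("utf-8").decode("latin-1")


def visible_count(text: str) -> int:
    """Count characters that are not control characters (Unicode category Cc)."""
    return sum(1 for c in text if unicodedata.category(c) != "Cc")


for ch in "ÄÖÜäöü":
    wrong = misread_as_latin1(ch)
    code_points = [hex(ord(c)) for c in wrong]
    print(ch, code_points, "visible characters:", visible_count(wrong))
```

Under this assumption every misread umlaut is two characters long, but for the uppercase ones only the first is displayable, so two deletions are still needed.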
11-27-2012, 10:50 AM | #11 |
Mono
Posts: 699
Karma: 13333
Join Date: Jan 2012
Device: Boox M92
|
Ad 7) In my opinion there is some kind of bug in the "word recognition". I think the algorithm treats some (but not all) accented letters as "not a letter", which means that it cuts words into smaller parts (it happens a lot in Czech texts).
In your case, the reason for the bad behaviour is that the spaces are missing. Annotation export misbehaves in the same way for certain pdfs (LaTeX-generated ones, if I am not mistaken): all spaces in the text disappear.
****
I tested the dictionary a bit. The (abc) bug is solved for most dictionaries in 1.8, but not all. I have one dictionary which uses a lot of font formatting, and it is shown without the formatting being removed or correctly interpreted. The "dictionary change not recognized" bug is still there. Strangely, a change of dictionary is immediately reflected in the (abc) view, but not in the others. |
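Both word-selection effects described above can be imitated with two toy tokenizers. This is a guess at the kind of logic involved, not the reader's real algorithm: `ASCII_WORD` stands for a rule that accepts only unaccented letters, `UNICODE_WORD` for one that accepts (roughly) any Unicode letter, and all names and sample strings are invented for the illustration.

```python
import re

# Hypothetical rule that treats accented letters as "not a letter".
ASCII_WORD = re.compile(r"[A-Za-z]+")
# Rule that accepts any Unicode word character except digits and "_".
UNICODE_WORD = re.compile(r"[^\W\d_]+")


def split_words(pattern: re.Pattern, text: str) -> list:
    """Return every run of characters that the pattern considers a word."""
    return pattern.findall(text)


# Accented letters cut the word into smaller parts under the ASCII rule.
print(split_words(ASCII_WORD, "Bücher"))
print(split_words(UNICODE_WORD, "Bücher"))

# If the pdf text layer carries no spaces, a whitespace-based selection
# sees the whole line as a single "word".
print("SehrgeehrterHerrDoktor".split())
```

The first case matches words being cut at accented letters, the second matches a click selecting the whole text when spaces are missing from the extracted text.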
11-28-2012, 11:45 AM | #12 |
Zealot
Posts: 121
Karma: 6899
Join Date: Nov 2012
Device: Onyx Boox M92, Kindle Paperwhite 1 (Wi-Fi)
|
Should I report the problem with LaTeX-originated pdf files somewhere?
I checked quite a few scientific articles (coming mainly from the arXiv repository), and the majority of them have this problem. |
11-29-2012, 05:45 AM | #13 | |
Booxtor
Posts: 1,126
Karma: 2305664
Join Date: Jun 2011
Location: Germany
Device: a lot of..
|
Quote:
|
|
|