04-23-2011, 05:44 PM   #1
wonderose began at the beginning.
Posts: 31
Karma: 10
Join Date: Apr 2011
Location: Berlin, Germany
Device: Android Tablets
OCR software / Abbyy Finereader: highlighting, exporting PDFs with notes and highlighted passages

I thought I should open a new thread collecting all the problems I have encountered when reading OCR-converted texts on my eBook reader. In my case, the OCR software is Abbyy Finereader and the eBook reader is a Sony PRS-650.

1) When the Sony reader displays PDF files that have been converted by OCR from image to text, each page of the original shows up first as a JPG image layer, then as a text layer, and finally, not always but very often, as a blank page. This succession applies to the whole book, so a book of 200 pages means turning up to 600 pages, which is rather inconvenient and cumbersome for the reader. Is there a way to solve this problem? To my mind, the best solution would probably be to let the reader decide which of the two layers he wants to see.
See also

2) When I mark a short passage in an article or book that has been converted by OCR from image to text, the Reader appears to highlight the whole page instead of the sentence I intended. In the Reader Library the highlighted passage is displayed just as I wanted it, yet on the Reader itself the whole page from that sentence upwards is colored dark grey. What causes this problem? And is there anything one can do to prevent it?

3) The Reader Library does let me view the text with all the highlights, bookmarks and annotations. But they do not show up in Adobe Acrobat X or in any other program, because they are saved separately as XML files. And unfortunately, there is still no support for exporting the notes via calibre.

As a consequence, there is no way to back up the texts together with all the notes, let alone to continue working on books and articles I have already read in software less cumbersome than the Reader Library. In my own case, after a series of freezes (probably due to my SD card) I already had to erase the memory of my reader, thereby losing all my notes and highlights.

Although I am certainly very happy to have this feature, I have to say that I am not very happy with the way notes and highlights are actually exported. I think the user needs to have some say in it. For instance, he should be able to decide whether each entry is accompanied by the exact date on which the note was taken or the text highlighted, or whether he would rather leave that out. Furthermore, he should be the one to decide whether the text and the corresponding notes begin with page 1 or rather with page 371 or page 71 (as might be the case for some articles).
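To make the two options above concrete, here is a rough Python sketch of an export with an optional date and a configurable first page number. Please note that the XML layout (the `annotations`, `annotation`, `highlight` and `note` elements and the `page` and `date` attributes) and the function name `export_notes` are made up for illustration; I do not know the actual schema the Reader Library uses, so this only shows the idea and is not a working converter for Sony's files.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation file. The real Reader Library schema is
# not documented here and almost certainly looks different.
SAMPLE_XML = """
<annotations>
  <annotation page="3" date="2011-04-20">
    <highlight>OCR output often contains a hidden text layer.</highlight>
    <note>Check this against the scan.</note>
  </annotation>
  <annotation page="1" date="2011-04-19">
    <highlight>First highlighted sentence.</highlight>
  </annotation>
</annotations>
"""


def export_notes(xml_text, include_dates=True, first_page=1):
    """Return highlights and notes as plain text, sorted by page.

    include_dates: whether to print the date next to each entry.
    first_page:    printed page number of the first page in the file,
                   e.g. 371 for an article taken from a journal volume.
    """
    root = ET.fromstring(xml_text)
    entries = sorted(root.iter("annotation"),
                     key=lambda a: int(a.get("page")))
    lines = []
    for ann in entries:
        # Shift the file's 1-based page index to the printed page number.
        page = int(ann.get("page")) + first_page - 1
        header = f"p. {page}"
        if include_dates and ann.get("date"):
            header += f" ({ann.get('date')})"
        lines.append(header)
        highlight = ann.findtext("highlight")
        if highlight:
            lines.append(f'  "{highlight.strip()}"')
        note = ann.findtext("note")
        if note:
            lines.append(f"  Note: {note.strip()}")
    return "\n".join(lines)


if __name__ == "__main__":
    # An article whose first page is printed as page 371, without dates.
    print(export_notes(SAMPLE_XML, include_dates=False, first_page=371))
```

With `include_dates=False` and `first_page=371`, the entries would be listed under the printed page numbers of the article and without timestamps, which is the kind of choice I would like the export to offer.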

Is there any way to export not only the highlighted parts but the whole text, i.e. a PDF file, with all the highlights and all the notes?

There has already been a discussion on this subject:

And here too:


Last edited by wonderose; 05-01-2011 at 12:00 AM.