Old 08-23-2011, 07:06 PM   #1
nirious
Junior Member
Posts: 8
Karma: 10
Join Date: Aug 2011
Device: pocketbook pro 903
scanned pdf files

Hi,

I just noticed that though I can read pdf files, the standard pdf reader on my pocketbook pro 903 seems to fail at rendering one of my pdf files.

I have this pdf file that is about 50 MB in size, containing about 500 pages.
I can correctly open it on my windows pc.
I have already opened pdfs with more pages than this one on the reader, though they were much smaller in MB.
So I concluded that this one is probably just a bunch of scans grouped into a pdf document.

Anyway, what I get is the following:
in the status bar, it reports the correct number of pages, but all text and pictures are replaced by blocks.

In the properties it says the pdf was created using ABBYY finereader 10
It also says it is PDF version 1.6

Can it be that this version is not supported by the pdf reader on the pocketbook?
Or is this pdf protected in some way that puts blocks over the text so you cannot read it?
Or maybe the reader cannot handle pdf files of this size?
Or maybe the reader cannot handle scanned pdf files?

Is there a better reader out there?


I am using pocketbook pro 903
software version: D903.2.0.5 20110314_121317

kind regards,

Nirious
Old 08-23-2011, 07:52 PM   #2
rkomar
Wizard
Posts: 3,054
Karma: 18821071
Join Date: Oct 2010
Location: Sudbury, ON, Canada
Device: PRS-505, PB 902, PRS-T1, PB 623, PB 840, PB 633
Quote:
Originally Posted by nirious View Post
Can it be that this version is not supported by the pdf reader on the pocketbook?
Or is this pdf protected in some way that puts blocks over the text so you cannot read it?
Or maybe the reader cannot handle pdf files of this size?
Or maybe the reader cannot handle scanned pdf files?

Is there a better reader out there?
If parts of pages are blocked out, then I doubt that the pdf is made up of scanned images. Usually the whole page shows up blank if there is a problem with displaying a scanned image. I have scanned many technical books myself into PDF format, some over 1GB in size, and I never had a problem reading them on my 902. So, I would guess that your pdf is not a scan.

The PDF files I get from O'Reilly are version 1.6, and they get displayed properly on my 902.

I have no experience with encrypted pdf files, so I can't tell if that is your problem.

There are two pdf readers available on the 903. Hold down the round button over your book in the library app, choose "Open with..." in the menu, and then select the app that doesn't have the bullet beside its name (the bullet marks the default app). Maybe the other one will work for you.

Last edited by rkomar; 08-23-2011 at 07:56 PM. Reason: Added a note about version 1.6 pdf.
Old 08-23-2011, 08:00 PM   #3
Elfwreck
Grand Sorcerer
Posts: 5,187
Karma: 25133758
Join Date: Nov 2008
Location: SF Bay Area, California, USA
Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié)
Quote:
Originally Posted by nirious View Post
In the properties it says the pdf was created using ABBYY finereader 10
It also says it is PDF version 1.6
Version 1.6 is compatible with Acrobat 7.x and later; the file may not open properly on earlier versions of Acrobat. Your ereader shouldn't have any problem with ver. 1.6 PDFs.

Finereader 10 means it's gone through OCR conversion; I'm not familiar enough with its export options to know what might cause problems on different readers.
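
For anyone who wants to confirm the version outside of a viewer's properties dialog: a PDF file starts with a header line such as `%PDF-1.6`. The snippet below is only a minimal sketch of reading that header. The function name `pdf_header_version` is made up for illustration, and the header is not always the final word, since from PDF 1.4 onward a `/Version` entry in the document catalog can override it.

```python
import re


def pdf_header_version(first_bytes):
    """Return the version string from a PDF header such as b'%PDF-1.6', or None.

    Only the start of the file is inspected. A /Version entry in the
    document catalog (not checked here) may override this value.
    """
    match = re.search(rb"%PDF-(\d+\.\d+)", first_bytes[:1024])
    return match.group(1).decode("ascii") if match else None


# Typical use (the file name is hypothetical):
# with open("book.pdf", "rb") as handle:
#     print(pdf_header_version(handle.read(1024)))
```

If this reports 1.6 for the problem file as well as for files that display fine, the version alone is unlikely to be the cause.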
Old 08-24-2011, 02:51 AM   #4
abijah
Enthusiast
Posts: 28
Karma: 2692
Join Date: Jul 2011
Device: kobo aura h2o
Did you try the other pdfviewer on your device?
Old 08-24-2011, 06:07 PM   #5
nirious
I tried loading it at work using the GNOME document viewer on Fedora, and it opened fine,
as did the adobe reader in windows.

The thing I noticed is that this ebook is actually a very poor scan.
I can tell when I zoom into the text.
Also, the text (even in the desktop viewers) is not black as in other pdfs but rather in shades of gray, with little white dots inside the characters in some places.

The ebook readers on my pc actually do a pretty fine job of upscaling the text, using some on-the-fly ocr or something.

I mean, when I load a page, all the text starts out blurry and then suddenly becomes sharper.

Also, when I use the text select tool, the selected text becomes much sharper and is shown in a font that is not exactly the same but very similar.

Also, when I select parts of words, the selection already shows the next characters (though correctly) when the cursor is not yet there, if you know what I mean.
Like, I select "abc" (cursor is right after "c") in "abcdefgh" and the highlighted part already shows "abcde" (which is correct).

I also tried printing the book, but the print quality is really poor (like when you photocopy a copy of the same page over and over again).

Also, when I try to reprint the book using a pdf printer or a pdf to djvu, xps or epub converter, it fails after a couple of pages.


@abijah:
Actually I already tried the other viewer on my reader:

adobe viewer :
it just shows empty pages
but it responds rapidly going to the "next" page (you can see the page index increase)


pdf viewer :
this is the one showing the blocks
though the rendering of a page takes about 10 times longer ( about 10 seconds)
Old 08-24-2011, 06:16 PM   #6
nirious
I just found out something else.
Maybe it was OCRed after all, because I just scanned a random text page of a magazine and it did not do the auto recognition thing I encounter with this pdf.

What is different about this pdf compared with all the others is:
1) it is about 10 times larger in MB
2) if you open the pdf in windows using adobe reader and go to document properties, you can choose the "fonts" tab.
In every other pdf it contains a list of fonts with "enclosed" or "enclosed subset" next to them
(edit: I think the right word is embedded, not enclosed; I translated it from my non-English version)

In this particular pdf, the one that shows blocks instead of text (or nothing at all in the other reader), there is also a whole list of fonts, but none of them has the above text next to it.
All fonts seem to be true type fonts.

This leads me to believe that since the fonts do not come with the pdf, the viewer will look for them on the local system. Maybe my desktop windows and desktop linux have those fonts and the ereader doesn't?
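
The same check can be approximated without Adobe Reader. In a PDF, a font whose program is embedded has a font descriptor carrying a `/FontFile`, `/FontFile2` or `/FontFile3` entry, while every font names itself with `/BaseFont`. The sketch below counts both using only the Python standard library. It is a rough heuristic, not a real PDF parser: the names `font_summary` and `_searchable_bytes` are made up for illustration, it will miss anything in encrypted files or in streams that use filters other than plain Flate, and a dedicated tool such as poppler's `pdffonts` (which prints an "emb" column) is likely to be more reliable.

```python
import re
import zlib

# Matches the raw body of each "stream ... endstream" section.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)


def _searchable_bytes(pdf_bytes):
    """Return the raw file plus every stream that inflates with zlib.

    PDF 1.5+ files may keep font dictionaries inside compressed object
    streams, so the raw bytes alone are not enough.
    """
    chunks = [pdf_bytes]
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            chunks.append(zlib.decompressobj().decompress(match.group(1)))
        except zlib.error:
            pass  # not Flate data (e.g. a JPEG image); skip it
    return b"\n".join(chunks)


def font_summary(pdf_bytes):
    """Return (sorted font names, number of embedded font program entries)."""
    data = _searchable_bytes(pdf_bytes)
    names = sorted(
        {m.decode("latin-1")
         for m in re.findall(rb"/BaseFont\s*/([^\s/<>\[\]()]+)", data)}
    )
    embedded = len(re.findall(rb"/FontFile[23]?\b", data))
    return names, embedded
```

A result with several font names but an embedded count of zero would be consistent with the "fonts" tab described above: the fonts are referenced but not shipped inside the file.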

Last edited by nirious; 08-24-2011 at 06:28 PM.
Old 08-24-2011, 06:45 PM   #7
rkomar
I believe that some PDFs show scanned images, but contain the OCR'ed text in a hidden layer underneath. That makes the document searchable, while still showing the original text layout and non-text stuff like figures, icons,... I believe that ABBYY does that for you (maybe Elfwreck can tell you more about it).

If the document is missing the fonts and your system doesn't have them, then that would explain the grayed out text blocks if the hidden text is being partially displayed as well. I don't think you can add new fonts to the system that AdobeViewer will use (at least I haven't been able to do so for EPUB documents after a fair bit of trying). However, you could try adding the missing fonts to the /mnt/ext1/system/fonts directory and pdfviewer might use them.
Old 08-24-2011, 07:31 PM   #8
nirious
Quote:
Originally Posted by rkomar View Post
I believe that some PDFs show scanned images, but contain the OCR'ed text in a hidden layer underneath.
I actually downloaded a trial of acrobat x to test this.
And indeed it shows me that there is the hidden OCRed text, then the visible text (which is no text at all),
and further some poorly scanned pages.

And indeed the text blocks correspond to all the places that contain hidden OCRed text.

So some questions:
1) Why are the actual scans not shown on the reader, in either app?

2) Is there no support for this kind of pdf?
The desktop pdf readers seem to handle it quite well.

3) Since I now have a trial of the full adobe acrobat:
Can I move the text from the hidden layer to the visible one and add the 50ish tables and images that obviously are not included in the hidden "text"?

The hidden text seems to have the right layout at first glance.

4) And since I am messing around in the pdf, I guess there must be a way to embed the fonts in the pdf anyway? I have the complete list of used fonts from the acrobat reader app.
Old 08-24-2011, 09:00 PM   #9
rkomar
The results of OCR can often be wrong, especially for non-text stuff like equations, tables, embedded images with text in them,... Replacing all the text with OCR results would produce a very bad copy in many cases. So, it seems best to display the original images, but underlay them with hidden text to allow searching of the document. Bad OCR results then just mean missed search results rather than completely wrong text.

I don't know why the OCR results are partly visible in your case. I have some PDF files that have the hidden text under scanned images, and they work perfectly well on my 902. The text and equations show up clearly in the scan, and I can search for words via the hidden layer. The hidden text is not shown at all in the display. Maybe they made the hidden layer visible in your document because the original scan was really poor quality. Anyway, try finding and adding the missing fonts to the document. That is probably the easiest way to fix it.
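
One common way for a PDF to carry hidden text is text rendering mode 3 ("neither fill nor stroke"), which appears in a page's content stream as the operator `3 Tr`. Whether this particular file does it that way is an assumption. The sketch below just counts such operators so a file can be checked for an invisible text layer. The name `count_invisible_text_ops` is made up, the scan is a heuristic over Flate-compressed and uncompressed streams only, and stray matches inside binary image data are possible.

```python
import re
import zlib

# Matches the raw body of each "stream ... endstream" section.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)\r?\nendstream", re.DOTALL)
# "3 Tr" selects text rendering mode 3; the lookbehind rejects e.g. "13 Tr".
INVISIBLE_TR = re.compile(rb"(?<![\d.])3\s+Tr\b")


def count_invisible_text_ops(pdf_bytes):
    """Count '3 Tr' operators across all streams in the file."""
    count = 0
    for match in STREAM_RE.finditer(pdf_bytes):
        raw = match.group(1)
        try:
            content = zlib.decompressobj().decompress(raw)
        except zlib.error:
            content = raw  # not Flate-compressed; scan the bytes as they are
        count += len(INVISIBLE_TR.findall(content))
    return count
```

A count above zero suggests the file has an invisible text layer of this kind. A viewer that ignores the rendering mode, or cannot find the referenced fonts, might then draw that layer in a way the desktop viewers do not.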
Old 08-26-2011, 06:33 AM   #10
bense2k
Member
Posts: 14
Karma: 10316
Join Date: Feb 2011
Device: Pocketbook pro 602
Maybe that will help.
https://www.mobileread.com/for...d.php?t=147420