|  08-23-2011, 07:06 PM | #1 | 
| Junior Member  Posts: 8 Karma: 10 Join Date: Aug 2011 Device: pocketbook pro 903 |  scanned pdf files 
			
			Hi,

I just noticed that, although I can read PDF files in general, the standard PDF reader on my PocketBook Pro 903 fails to render one of them. The file is about 50 MB and contains about 500 pages. It opens correctly on my Windows PC. I have already opened PDFs with more pages than this one on the reader, though they were much smaller in MB, so I concluded that this one is probably just a bunch of scans grouped into a PDF document.

What I get is the following: the status bar reports the correct number of pages, but all text and pictures are replaced by blocks. The properties say the PDF was created using ABBYY FineReader 10 and that it is PDF version 1.6.

Can it be that this version is not supported by the PDF reader on the PocketBook? Or is the PDF somehow protected, so that blocks are put over the text and you cannot read it? Or maybe the reader cannot handle PDF files of this size? Or maybe it cannot handle scanned PDF files? Is there a better reader out there?

I am using a PocketBook Pro 903, software version D903.2.0.5 20110314_121317.

Kind regards,
Nirious | 
|   |   | 
|  08-23-2011, 07:52 PM | #2 | |
| Wizard            Posts: 3,059 Karma: 18821071 Join Date: Oct 2010 Location: Sudbury, ON, Canada Device: PRS-505, PB 902, PRS-T1, PB 623, PB 840, PB 633 | Quote: 
 The PDF files I get from O'Reilly are version 1.6, and they display properly on my 902. I have no experience with encrypted PDF files, so I can't tell whether that is your problem.

There are two PDF readers available on the 903. Hold down the round button over your book in the library app, choose "Open with..." in the menu, and then select the app that doesn't have the bullet beside its name (the bullet marks the default item). Maybe the other one will work for you.

Last edited by rkomar; 08-23-2011 at 07:56 PM. Reason: Added a note about version 1.6 pdf. | |
|   |   | 
|  08-23-2011, 08:00 PM | #3 | |
| Grand Sorcerer            Posts: 5,187 Karma: 25133758 Join Date: Nov 2008 Location: SF Bay Area, California, USA Device: Pocketbook Touch HD3 (Past: Kobo Mini, PEZ, PRS-505, Clié) | Quote: 
 Finereader 10 means it's gone through OCR conversion; I'm not familiar enough with its export options to know what might cause problems on different readers. | |
|   |   | 
|  08-24-2011, 02:51 AM | #4 | 
| Enthusiast            Posts: 28 Karma: 2692 Join Date: Jul 2011 Device: kobo aura h2o | 
			
			Did you try the other pdfviewer on your device?
		 | 
|   |   | 
|  08-24-2011, 06:07 PM | #5 | 
| Junior Member  Posts: 8 Karma: 10 Join Date: Aug 2011 Device: pocketbook pro 903 | 
			
			I tried loading it at work using the GNOME document viewer on Fedora, and it opened fine, as it did in Adobe Reader on Windows.

What I noticed is that this ebook is actually a very poor scan. I can tell when I zoom into the text. Also, the text (even in the desktop viewers) is not black as in other PDFs but rather in shades of gray, in some places even with little white dots inside the characters themselves.

The readers on my PC actually do a pretty fine job of upscaling the text, using some on-the-fly OCR or something. I mean, when I load a page, all the text starts out blurry and then suddenly becomes sharper. Also, when I use the text select tool, the selected text becomes much sharper and appears in a font that is not exactly the same but very similar. And when I select parts of words, the selection already shows the next characters (correctly) when the cursor is not there yet, if you know what I mean. For example, I select "abc" in "abcdefgh" (the cursor is right after "c") and the highlighted part already shows "abcde" (which is correct).

I also tried printing the book, but the print quality is really poor (it looks like what you get when you copy the same page over and over again on a copier). And when I try to reprint the book using a PDF printer, or a PDF to DjVu, XPS or EPUB converter, it fails after a couple of pages.

@abijah: Actually, I had already tried the other viewer on my reader:
adobe viewer: it just shows empty pages, but it responds rapidly when going to the "next" page (you can see the page index increase).
pdf viewer: this is the one showing the blocks, though rendering a page takes about 10 times longer (about 10 seconds). | 
|   |   | 
|  08-24-2011, 06:16 PM | #6 | 
| Junior Member  Posts: 8 Karma: 10 Join Date: Aug 2011 Device: pocketbook pro 903 | 
			
			I just found out something else. Maybe it was OCRed after all, because I just scanned a random text page of a magazine and it did not do the auto-recognition thing I see with this PDF.

What is different about this PDF compared with all the others is:
1) It is about 10 times larger in MB.
2) If you open a PDF in Adobe Reader on Windows and go to the document properties, you can choose the "Fonts" tab. In every other PDF it contains a list of fonts with "embedded" or "embedded subset" next to them (edit: I first wrote "enclosed"; I translated it from my non-English version). This particular PDF, the one that gives blocks instead of text, or nothing at all in the other reader, also has a whole list of fonts, but none of them has that text next to it. All the fonts seem to be TrueType fonts.

This leads me to believe that, since the fonts do not come with the PDF, the viewer looks for them on the local system. Maybe my desktop Windows and desktop Linux have those fonts and the e-reader doesn't?

Last edited by nirious; 08-24-2011 at 06:28 PM. | 
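The "Fonts" tab check described above can be roughly approximated without Adobe Reader. In a PDF, an embedded font program is referenced from the font descriptor by a `/FontFile`, `/FontFile2` (TrueType) or `/FontFile3` key, so a descriptor without any of these is not embedded. The following is only a minimal sketch: the function name `font_embedding_report` is made up for illustration, and it assumes the font descriptors are stored uncompressed in the file. PDF 1.5 and later can pack objects into compressed object streams, in which case this scan finds nothing and a real PDF library is needed.

```python
import re

# Body of each indirect object: "12 0 obj ... endobj" (heuristic, not a parser).
OBJ_RE = re.compile(rb"\d+\s+\d+\s+obj(.*?)endobj", re.DOTALL)
DESCRIPTOR_RE = re.compile(rb"/Type\s*/FontDescriptor\b")
NAME_RE = re.compile(rb"/FontName\s*/([^\s/<>\[\]()]+)")
# /FontFile (Type 1), /FontFile2 (TrueType), /FontFile3 (other formats).
EMBED_RE = re.compile(rb"/FontFile[23]?\b")


def font_embedding_report(pdf_bytes):
    """Return {font_name: is_embedded} for uncompressed font descriptors."""
    report = {}
    for match in OBJ_RE.finditer(pdf_bytes):
        body = match.group(1)
        if not DESCRIPTOR_RE.search(body):
            continue  # not a font descriptor object
        name = NAME_RE.search(body)
        key = name.group(1).decode("latin-1") if name else "(unnamed)"
        report[key] = bool(EMBED_RE.search(body))
    return report
```

Calling it as `font_embedding_report(open("book.pdf", "rb").read())` (the file name is a placeholder) gives a dictionary in which every font mapped to `False` would have to be supplied by the device itself.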
|   |   | 
|  08-24-2011, 06:45 PM | #7 | 
| Wizard            Posts: 3,059 Karma: 18821071 Join Date: Oct 2010 Location: Sudbury, ON, Canada Device: PRS-505, PB 902, PRS-T1, PB 623, PB 840, PB 633 | 
			
			I believe that some PDFs show scanned images, but contain the OCR'ed text in a hidden layer underneath.  That makes the document searchable, while still showing the original text layout and non-text stuff like figures, icons,...  I believe that ABBYY does that for you (maybe Elfwreck can tell you more about it). If the document is missing the fonts and your system doesn't have them, then that would explain the grayed out text blocks if the hidden text is being partially displayed as well. I don't think you can add new fonts to the system that AdobeViewer will use (at least I haven't been able to do so for EPUB documents after a fair bit of trying). However, you could try adding the missing fonts to the /mnt/ext1/system/fonts directory and pdfviewer might use them. | 
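For anyone who wants to check for such a hidden layer: in a PDF page's content stream, text is made invisible with the text rendering mode operator, `3 Tr` (mode 3 means neither fill nor stroke), and OCR tools commonly use it for the searchable layer under a scan. The following is a minimal sketch under assumptions: the helper names are hypothetical, the page content stream has already been extracted from the file, and the regular expression is a heuristic that could also match "3 Tr" inside a text string.

```python
import re
import zlib

# A single-digit operand followed by the Tr operator, e.g. "3 Tr".
TR_RE = re.compile(rb"(?<![\w.])(\d)\s+Tr\b")


def decode_flate(raw_stream):
    """Inflate a stream stored with /Filter /FlateDecode."""
    return zlib.decompress(raw_stream)


def text_render_modes(content_stream):
    """List the text rendering modes set in a decoded content stream."""
    return [int(m.group(1)) for m in TR_RE.finditer(content_stream)]


def has_invisible_text(content_stream):
    """True if the stream switches to mode 3 (invisible text)."""
    return 3 in text_render_modes(content_stream)
```

A page that draws an image and then sets `3 Tr` before its text operators matches the "scan on top, hidden OCR text underneath" layout described above. A viewer that ignores the rendering mode, or lacks the fonts, may draw that layer anyway.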
|   |   | 
|  08-24-2011, 07:31 PM | #8 | |
| Junior Member  Posts: 8 Karma: 10 Join Date: Aug 2011 Device: pocketbook pro 903 | Quote: 
 And indeed it shows me that there is the hidden OCRed text, then the visible text (which is no text at all), and further some poorly scanned pages. And indeed the text blocks correspond to all the places where there is hidden OCRed text.

So some questions:
1) Why are the actual scans not shown on the reader, in either app?
2) Is there no support for this kind of PDF? The desktop PDF readers seem to handle it quite well.
3) Since I now have a trial of the full Adobe Acrobat: can I move the text from the hidden layer to the visible one and add the 50 or so tables and images that obviously are not included in the hidden "text"? The hidden text seems to have the right layout at first glance.
4) And since I am messing around in the PDF anyway, I guess there must be a way to embed the fonts in the PDF? I have the complete list of used fonts from the Acrobat Reader app. | |
|   |   | 
|  08-24-2011, 09:00 PM | #9 | 
| Wizard            Posts: 3,059 Karma: 18821071 Join Date: Oct 2010 Location: Sudbury, ON, Canada Device: PRS-505, PB 902, PRS-T1, PB 623, PB 840, PB 633 | 
			
			The results of OCR can often be wrong, especially for non-text stuff like equations, tables, embedded images with text in them,...  Replacing all the text with OCR results would produce a very bad copy in many cases.  So, it seems best to display the original images, but underlay them with hidden text to allow searching of the document.  Bad OCR results then just mean missed search results rather than completely wrong text.

I don't know why the OCR results are partly visible in your case. I have some PDF files that have the hidden text under scanned images, and they work perfectly well on my 902. The text and equations show up clearly in the scan, and I can search for words via the hidden layer. The hidden text is not shown at all in the display. Maybe they made the hidden layer visible in your document because the original scan was of really poor quality.

Anyway, try finding the missing fonts and adding them to the document. That is probably the easiest way to fix it. | 
|   |   | 
|  08-26-2011, 06:33 AM | #10 | 
| Member            Posts: 14 Karma: 10316 Join Date: Feb 2011 Device: Pocketbook pro 602 | 
			
			Maybe that will help. https://www.mobileread.com/for...d.php?t=147420 | 
|   |   | 