Device: sony prs-650!
Conversation with Sony support re: image>text PDFs
I had an online chat with Sony support about PDF support.
My question was: does the Sony Reader support "image over text" PDFs?
"Image over text" PDFs (aka "text under image", same thing) give you the best of both worlds. You have the original image of the page to reference if you think the OCR failed at any point. At the same time, you can use all the highlighting, annotating, dictionary, zooming, and reflow advantages of normal text.
Such PDFs are saved in 'two layers': a JPEG layer (an image of the original page) and a text layer (the OCR'ed text under it).
On the Sony, I found that such PDFs do display properly, to a point. The JPEG layer shows normally, and if you crop the page correctly when you make the PDF so that it fills up the screen on the Sony, it's quite legible in either portrait or landscape mode, just like a normal image PDF might be. Unlike an image PDF, you can double-tap a word in the image and you'll get the dictionary; you can also highlight passages and annotate. It's very, very neat.
The limitation here is the small screen size on the Sony (and the 7" 900 series doesn't help much because, as I understand it, the width of the screen is the same, and pretty narrow). On a reader with a 9" screen, reading a book this way, with image-over-text PDFs, should *rock*. (However, the Kindle DX is slow and clunky and I don't want one; I also don't know how it might handle image-over-text PDFs, and I don't want to spend 400 bucks to find out. If anyone knows, please tell me.)
When you're viewing such a PDF on the Sony and you make the font larger, the Sony basically switches to the 'text layer' of the PDF. You're then viewing the OCR'ed text, which reflows normally like any other text document, so you can make the font larger and read it more easily.
But this is where we run into the limitation on the Sony with these types of PDFs. When viewing the text layer with a larger font, if you hit 'next page', you DON'T get the next page. Instead, you get the JPEG layer image. And then a blank page. THEN you get the next page. In other words, you have to hit 'next page' three times to get to the actual next page of OCR text.
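The page-turn behavior described above can be modeled with a few lines of Python. This is only a sketch of what was observed on the device, not how the Sony firmware actually works: the screen names and both functions are made up for illustration, and it assumes each page's reflowed text fits on a single screen and that every page expands into the same text, image, blank sequence.

```python
# Hypothetical model of the observed behavior in zoomed (reflow) mode.
# Assumption: each PDF page becomes three screens, in this order.
SCREENS_PER_PAGE = ["text", "image", "blank"]

def zoomed_screens(num_pages):
    """List every screen the reader steps through, in order."""
    return [(page, kind)
            for page in range(1, num_pages + 1)
            for kind in SCREENS_PER_PAGE]

def presses_to_next_text(screens, position):
    """Count 'next page' presses from `position` to the next text screen."""
    for presses, (_, kind) in enumerate(screens[position + 1:], start=1):
        if kind == "text":
            return presses
    return None  # no further text screen (end of document)

screens = zoomed_screens(3)
print(presses_to_next_text(screens, 0))  # 3
```

Starting on the text of page 1, the model needs three presses to reach the text of page 2, which matches what happens on the reader.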
That can get cumbersome. So I went to the Sony support page online and had a chat with a tech. The transcript is below. For the record, I was connected to a technician *immediately*, and for this session I have nothing but praise for Sony tech support's responsiveness; keep in mind this was a Saturday too.
eSupport Chat Transcript
Jose1@Reader > Hi Jay. Welcome to Sony Online Support. I'm Jose. Please allow me a moment to review your concern.
jay > sure
Jose1@Reader > Basically you want to know if the reader supports character under images, Am I correct?
jay > yes, you know how there is a type of pdf called "text under image" (or sometimes its called "image over text", same thing)
jay > does the prs 650 ereader support that?
jay > i loaded one of those pdfs onto the reader, and I found some strange behaviour
jay > basically, it does display on the reader, and you can zoom into the text and everything -- but -- when you zoom in, something strange happens.
Jose1@Reader > Okay, go on please.
jay > when you zoom in, as you scroll down pages, at the end of a page of text, the jpg picture of the page appears (ie, a pdf of this kind is saved in layers, so basically the text layer is displayed, then when you hit 'next page', the jpg layer is displayed)
jay > so the strange behaviour is that, after you read the text, and you hit 'next page', then instead of the next page of text, what you see on the screen is the jpg of the page
jay > you have to hit 'next page' 3 times to get to the next page of text.
jay > (if you dont zoom in, however, then it works normally because the jpg's of the top layer of the pdf are displayed in sequence normally). However most of the time you will want to zoom in because the top layer jpg is too small to read.
jay > So basically I was thinking maybe the ereader does not support the "text under image" style of pdf. wanted to see if you guys had any information about that support.
Jose1@Reader > Okay I will gladly provide you with that information, just let me look for it.
Jose1@Reader > I would like to ask you:
Jose1@Reader > Have you created/converted these files using converter software?
Jose1@Reader > Or did you just download them?
jay > yes, I created the pdf file using abbyy finereader
jay > abbyy finereader gives the option to save the OCR'ed text as a pdf and you can select what type of pdf, and in those options I chose "text under image"
Jose1@Reader > Oh I see, allow me a few seconds while I search for the information.
Jose1@Reader > Thank you for waiting. I can see here that the reader does support this type of file but, due to the fact that the file is saved using a multilayer system, one for the text and the other for the images, it is normal that the device thinks that one page is the text and the next one is the image and vice versa. And this is because, even though you can see the text and the image in the same page, they are saved separately.
jay > ok, but thats not normal display behaviour right? For instance, when i view the file on my pc, the reader on the pc knows that these are two different layers to be displayed as one, and it treats them correctly when I 'turn page'
jay > So is there any setting that I can set when I create the pdf file, so it works correctly on the sony reader?
jay > maybe is there a way to collapse the two layers into one so the sony displays them correctly? (but still allows the text to be selected and annotated and reflowed and zoomed like text)
Jose1@Reader > Well, it might be that this is not a 'normal' behaviour since the reader does not allow to make 1 layer of the 2 and make them work as 1 single page, but it cannot be changed because the reader does not include something like an application or an 'add-on' to modify files.
jay > I understand. In that case I'd like to suggest this as a future feature for the sony reader... basically the "text under image" pdf has a great advantage which is that one can always view a picture of the original page as it really was, while still getting the benefits of reflowable text (with highlighting and dictionary and everything else).
jay > So I think it would be *great* if the sony reader could handle such files gracefully. Maybe the solution is for sony reader to have a feature where the two layers of the pdf can be individually selected when viewing such a pdf.
jay > that way, if one is zoomed into the text, one can just read normally and hit 'next page' normally, and if one needs to reference the jpg layer, then one can maybe hit the 'options' button and switch layers, see what they need to see in the jpg, and then switch back to the text.
jay > that would be *awesome*, and really very very useful, for people who are using the sony reader for their pdf documents.
Jose1@Reader > Yes I understand that this would be very useful and I really thank you for the feedback and believe me that we will really take it into consideration for future updates/releases.
jay > I hope so! Thanks for your help today.
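The layer-switching feature suggested in the chat could look roughly like the Python sketch below. Everything here is hypothetical: the class and method names are invented for illustration and do not correspond to any real Sony Reader API. The idea is simply that 'next page' always moves to the next logical page, while a separate action toggles between the text layer and the image layer of the current page.

```python
# Hypothetical viewer model for the feature suggested in the chat.
# None of these names correspond to a real Sony Reader API.
class LayeredViewer:
    def __init__(self, num_pages):
        self.num_pages = num_pages
        self.page = 1          # current logical page (1-based)
        self.layer = "text"    # layer currently shown: "text" or "image"

    def next_page(self):
        """Advance one logical page; the selected layer stays the same."""
        if self.page < self.num_pages:
            self.page += 1
        return self.page, self.layer

    def switch_layer(self):
        """Toggle between the text layer and the image layer of this page."""
        self.layer = "image" if self.layer == "text" else "text"
        return self.page, self.layer

viewer = LayeredViewer(num_pages=3)
viewer.switch_layer()      # check the scan of page 1
viewer.switch_layer()      # back to the text
print(viewer.next_page())  # (2, 'text')
```

With this design a single press of 'next page' always lands on the next page of text, and the scanned image is one toggle away whenever the OCR looks doubtful.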
Last edited by ebooker; 11-06-2010 at 02:49 PM.