#16
Liseur de Bonne Aventure
Posts: 374
Karma: 2176666
Join Date: Sep 2008
Location: Paris, France
Device: PRS T1
Quote:
I went on the ADE forum to ask the question there, and here is the answer I got:
"It's not an issue with the 505; this is expected behavior. When you zoom in on a PDF, the contents are 'reflowed', essentially stripping a lot of the formatting so we can enlarge the font sizes. Because of technical limitations with PDF files (and the current implementation of the reflow algorithms), we do not reflow across pages, so you will get gaps between pages. Also, because of limitations with the PDF file structure itself (it is not an easily reflowable content), line breaks will appear in odd places."
The bits I find interesting:
- They talk of the "current implementation" as being a cause, so there might be hope for a brighter future.
- They say "we", but what does it mean? I assume it's the PRS software that reflows text, not ADE?
I completely understand that PDFs weren't originally designed for reflow, but I still find it amazing how difficult, sometimes impossible, it is to get a proper HTML doc from a PDF. Even Acrobat Pro fails to get it right 90% of the time...
#17
Grand Sorcerer
Posts: 19,832
Karma: 11844413
Join Date: Jan 2007
Location: Tampa, FL USA
Device: Kindle Touch
This is the typical "working as designed" answer. I've used it myself as a developer. However, this doesn't mean their design is correct. Why did they decide on this behavior? What makes it better? I would ask them that.
BOb
#18
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Quote:
Dale
#19
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Quote:
PDF is loosely based on PostScript and its ilk, such as PageMaker. A document is really a set of glyphs and a set of locations on the page. It looks to you like a book, but that is really an illusion. In many cases a word may not even be stored with its letters side by side in the file. If the document has been edited, the data can in some cases sit in a totally different portion of the file. It can get very messy. If the PDF was created in one clean shot, it will translate pretty easily, but not all PDF files were made that way. The ADE group is new and young, and I believe they are working hard, but they have a steep learning curve.
Dale
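To make the "glyphs plus locations" point concrete, here is a minimal Python sketch of what a converter has to do before it can even think about reflowing. It is only an illustration: the `Glyph` record, the `glyphs_to_text` function, and the two thresholds are hypothetical names and values chosen for this example, not part of any PDF library, and real files add complications (rotated text, columns, kerning) that this ignores.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """One positioned character (hypothetical record for this sketch)."""
    char: str
    x: float      # left edge, in points
    y: float      # baseline, in points (larger y = higher on the page)
    width: float  # advance width, in points


def glyphs_to_text(glyphs, line_tol=2.0, space_gap=1.5):
    """Rebuild reading order from glyphs that may be stored in any order.

    line_tol and space_gap are assumed thresholds, in points.
    """
    lines = []  # list of (baseline_y, [glyphs on that line])
    # Walk the glyphs from the top of the page down, grouping by baseline.
    for g in sorted(glyphs, key=lambda g: -g.y):
        if lines and abs(lines[-1][0] - g.y) <= line_tol:
            lines[-1][1].append(g)
        else:
            lines.append((g.y, [g]))

    out = []
    for _, line in lines:
        line.sort(key=lambda g: g.x)  # left to right within a line
        text = line[0].char
        for prev, cur in zip(line, line[1:]):
            # A horizontal gap wider than space_gap is treated as a word space.
            if cur.x - (prev.x + prev.width) > space_gap:
                text += " "
            text += cur.char
        out.append(text)
    return "\n".join(out)


# Glyphs deliberately stored out of order, as in an edited PDF.
sample = [
    Glyph("C", 25, 700, 5),
    Glyph("D", 10, 688, 5),
    Glyph("A", 10, 700, 5),
    Glyph("B", 15, 700, 5),
]
print(glyphs_to_text(sample))
```

Note that the spaces and line breaks in the output are guesses made from geometry alone, which is one reason converted text so often breaks in odd places.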
#20
Grand Sorcerer
Posts: 19,832
Karma: 11844413
Join Date: Jan 2007
Location: Tampa, FL USA
Device: Kindle Touch
Quote:
BOb
#21
Liseur de Bonne Aventure
Posts: 374
Karma: 2176666
Join Date: Sep 2008
Location: Paris, France
Device: PRS T1
Quote:
By the way, I found one approach that seems to give good results when converting a PDF to HTML: running ReadIris OCR on a text-based PDF document produces quite a good-looking result. It's a bit ridiculous, considering there's nothing to OCR, but it might prove an effective tool for converting PDFs. It'll do nothing for protected PDFs, obviously.
#22
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Quote:
Dale
#23
Guru
Posts: 860
Karma: 4380
Join Date: Feb 2008
Location: Almada, Portugal
Device: Cybook Gen3, Sony PRS 505, Kindle DXG and Samsung Galaxy Note
Yes.
The best solution so far for converting back from PDF seems to be to treat it as what it was basically designed to be: an exact reproduction of the paper document it mimics. So one must simply treat it as a paper document and, if it's not protected, run OCR on it. OmniPage Pro 16 and FineReader Pro 9 do an outstanding job in this area.
#24
Liseur de Bonne Aventure
Posts: 374
Karma: 2176666
Join Date: Sep 2008
Location: Paris, France
Device: PRS T1
Mmmhhh... I guess I can make it work, although it requires quite a bit of effort. The recognition works fine, and for text with inline pictures and no wrap-around it should work fine, certainly better than any other conversion process I've tried so far. Alas, the HTML output of both ReadIris and OmniPage includes arbitrary page breaks that correspond to the PDF version. Is anyone here knowledgeable enough about either of these programs to know whether there's an option to remove this behaviour?
#25
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Quote:
Dale
#26
Guru
Posts: 860
Karma: 4380
Join Date: Feb 2008
Location: Almada, Portugal
Device: Cybook Gen3, Sony PRS 505, Kindle DXG and Samsung Galaxy Note
In OmniPage Pro, when you choose HTML 3.2 as the output, go to the save screen, select formatted text, then open the options (on the right side) and check whether "insert page breaks" is enabled. If it is, disable it and save again.
I also advise you to experiment with the output options to see which ones give you the best results.
#27
Evangelist
Posts: 416
Karma: 14682
Join Date: May 2008
Location: SF Bay Area
Device: Nook HD, Nook for Windows 8
We do not reflow across pages to reduce the demands for both memory and processor on mobile devices. For instance, in order to get a proper page count for a reflowed PDF (if we were to reflow across pages), the entire PDF would need to be loaded and rendered.
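The trade-off described here can be illustrated with a toy Python model. This is a sketch under loud assumptions, not Adobe's implementation: layout is reduced to "a screen holds a fixed number of words", and the function names `screens_per_page` and `total_screens_cross_page` are invented for the example.

```python
import math


def screens_per_page(page_words, words_per_screen):
    """Per-page reflow: one source page is laid out on its own.

    Only this page's words are needed, so it can be done lazily.
    """
    return math.ceil(len(page_words) / words_per_screen)


def total_screens_cross_page(pages, words_per_screen):
    """Cross-page reflow: text runs on continuously across pages.

    The total cannot be known until every page has been read.
    """
    total_words = sum(len(p) for p in pages)
    return math.ceil(total_words / words_per_screen)


# Three source pages with 250, 130 and 310 words (made-up numbers).
pages = [["w"] * 250, ["w"] * 130, ["w"] * 310]
per_page = [screens_per_page(p, 100) for p in pages]

print(per_page)                              # screens needed by each page alone
print(sum(per_page))                         # total with per-page reflow
print(total_screens_cross_page(pages, 100))  # total with cross-page reflow
```

In this model per-page reflow needs more screens because the last screen of each source page is only partly filled, which corresponds to the gaps between pages. In exchange, the device can lay out one page at a time and still number its screens relative to the original pages.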
#28
Wizzard
Posts: 1,402
Karma: 2000000
Join Date: Nov 2007
Location: UK
Device: iPad 2, iPhone 6s, Kindle Voyage & Kindle PaperWhite
Quote:
If someone really wants the accurate page count, then they would still be able to use the non-reflowed view to locate it.
Last edited by gwynevans; 10-17-2008 at 12:56 PM.
#29
Liseur de Bonne Aventure
Posts: 374
Karma: 2176666
Join Date: Sep 2008
Location: Paris, France
Device: PRS T1
Quote:
I am not sure how important ebooks are for Adobe and the PDF format, but if the format is to be successful on the current generation of readers (and probably the following ones; a screen the size of a paperback is what people really want in the end), SOME way has to be found to make the docs appear correctly without complex user intervention like the one I mentioned earlier in the thread...
#30
Liseur de Bonne Aventure
Posts: 374
Karma: 2176666
Join Date: Sep 2008
Location: Paris, France
Device: PRS T1
By the way, Jim Lester is the person from Adobe who nicely answered when I went to their forum to enquire about reflow. Thanks for taking the time to post here!