#1 |
Custom User Title
Posts: 10,696
Karma: 74203799
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
|
Archive.org PDF scans on Kobo
I have a couple of PDF books from Archive.org/Open Library. (The ePubs provided are often not so great.)
They use some sort of layering/transparency for compression, which I presume is the cause of various issues reading them on my e-Ink Kobo. Mostly it's just slow page turns, which I can deal with. But sometimes pages render badly (just the background layer, no visible text) or not at all. (Could it be running out of video memory? Does an e-Ink device even have video memory?) I've been mucking around with trying to 'fix' the PDF file itself, but is there anything that can be done on the Kobo to help?
Last edited by ownedbycats; 01-04-2021 at 04:42 PM.
#2 |
Running with scissors
Posts: 1,583
Karma: 14328510
Join Date: Nov 2019
Device: none
|
The layering that I'm aware of is there to provide the OCR'd text of the scan. If you open the PDF in SumatraPDF, go to its File menu (under the 3 bar thing at the top left) and select Save As, you can change the Save as Type from PDF documents to Text documents. That saves the OCR'd text, most likely the same text that Archive.org offers when you download the text format.
Not all PDFs have this second invisible layer. One way to tell is to move your mouse over the text: on my system the pointer changes from an arrow to the text-selection I-beam, and you can select and copy the text. But you're selecting and copying from that invisible OCR'd text layer. If there's no invisible text layer the mouse pointer doesn't change. Years ago when Adobe first introduced this text layer they gave as an example how you could scan an entirely handwritten page/document, manually create the invisible text layer and add it to the PDF, no doubt placing the words so that they lined up with the underlying scanned document. I was very impressed with this ability at the time, and I still am. Everyone else seems to take it as commonplace. I guess I'm easily impressed.
#3 |
Resident Curmudgeon
Posts: 79,206
Karma: 145458580
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
The PDFs are going to be a very poor reading experience on your Aura HD even if they worked properly. Don't bother. Not worth the hassle.
#4 |
Custom User Title
Posts: 10,696
Karma: 74203799
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
|
Yes, I've seen the invisible text layer on some PDFs from other sources. From what I recall, the Internet Archive uses LuraTech's brand of mixed raster content (MRC) compression. Basically there are several different images and the text all layered on each page. It's pretty efficient in file size, but slow to render and can look pretty terrible if done poorly.
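For anyone curious what "several images layered on each page" means in practice, here's a toy Python sketch of the general mixed raster idea as I understand it: a 1-bit mask picks, pixel by pixel, between a foreground layer (the ink) and a background layer (the paper). This is only an illustration. The function name, the tiny greyscale grids and the pixel values are all made up, and it is not LuraTech's or the Internet Archive's actual code.

```python
# Toy illustration of mixed raster content (MRC) layering.
# Everything here (names, sizes, pixel values) is hypothetical.

def compose_mrc(background, foreground, mask):
    """Pick the foreground pixel where the mask is 1, else the background pixel."""
    return [
        [fg if m else bg for bg, fg, m in zip(bg_row, fg_row, mask_row)]
        for bg_row, fg_row, mask_row in zip(background, foreground, mask)
    ]

# A 3x4 greyscale "page": 200 = paper tone, 0 = black ink.
background = [[200] * 4 for _ in range(3)]
foreground = [[0] * 4 for _ in range(3)]
mask = [
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]

# Normal rendering: ink shows wherever the mask is set.
page = compose_mrc(background, foreground, mask)

# Simulate a renderer that loses the mask: only the background survives.
empty_mask = [[0] * 4 for _ in range(3)]
background_only = compose_mrc(background, foreground, empty_mask)
```

If a reader fails to apply the mask for some reason, all you'd get is the background layer. That would look like the "just the background layer, no visible text" pages from the first post, though that's a guess on my part about the cause.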
Last edited by ownedbycats; 01-04-2021 at 06:01 PM.
Tags |
internet archive |
|