11-02-2009, 03:25 PM | #1 |
Maratus speciosus butt
Posts: 3,292
Karma: 1162698
Join Date: Sep 2009
Device: PRS-350
|
PDF conversion help
I'm sure this has been covered before, but I'm not sure what search terms I'd use to dig it up.
Scans of books made by Google, Microsoft, and others seem to have multiple layers per page, with at least 2 layers that I can see -- one of mostly just text (and other areas of black) and one of mostly just every other color. Done, I assume, to make OCRing easier. When you open a PDF file, you can often see the background layer appear before the text/black is overlaid on it. Which is fine when you are just viewing it as a PDF. But how do you convert that to other formats and have it look correct?
I have a series of public-domain booklets that I gathered together several months ago from various archives (some from Google, some from the Internet Archive, etc.). They work as PDFs on my Sony Reader, but page changes are slooooooow. So I wanted to convert them all to EPUBs (for myself and to share). Calibre failed utterly in converting them correctly. I have a full retail copy of an ancient version of Acrobat (version 5) which allows full editing of PDFs, but -- on this old version, at least -- when I export the pages, they export as their individual layers, with the "black" layer garbled and useless.
Anyone have a solution? These are the files: http://www.sendspace.com/file/okec99 |
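For readers unsure why a layer exported on its own looks wrong, here is a minimal sketch of what "flattening" such a page means. It assumes the page really is a color background plus a 1-bit text mask of the same size, as described above. The function name `flatten_page` and the list-of-rows pixel layout are made up for illustration and are not taken from any PDF tool.

```python
def flatten_page(background, mask, ink=(0, 0, 0)):
    """Composite a 1-bit text mask over a color background.

    background: list of rows, each a list of (r, g, b) tuples.
    mask: list of rows of 0/1 values; 1 means "text/black goes here".
    ink: color painted wherever the mask is set (assumed black).
    Returns a new list of rows; the inputs are left untouched.
    """
    if len(background) != len(mask):
        raise ValueError("background and mask must have the same height")
    page = []
    for bg_row, mask_row in zip(background, mask):
        if len(bg_row) != len(mask_row):
            raise ValueError("background and mask must have the same width")
        # Where the mask is set, the text layer wins; elsewhere keep the background.
        page.append([ink if m else px for px, m in zip(bg_row, mask_row)])
    return page
```

A converter that writes out only one of the two inputs gets either colors with no text or text with no colors, which may be what the Acrobat export is doing here.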
11-02-2009, 04:14 PM | #2 |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
Though a bit cumbersome, you can use PDFRead 1.8 to convert those scanned .pdf ebooks into either the Sony .lrf format or Mobipocket .prc format (and then use calibre to convert the .prc to .epub).
There is no .epub support within PDFRead, but it can also be made to retain the .html and images used to create the resulting ebooks (using an empty file named "debug" in the PDFRead install directory) and thus that could easily be used with calibre or Sigil to yield a .epub ebook! See the attached samples from "Japanese Fairy Tale Series 01 #01- Momotaro.pdf" noting that the .lrf and .prc were created by PDFRead using two successive "runs" and the .epub was created using calibre from that .prc. It seems PDFRead is a good fit. Have fun converting all of them! |
11-03-2009, 01:55 AM | #3 |
Maratus speciosus butt
Posts: 3,292
Karma: 1162698
Join Date: Sep 2009
Device: PRS-350
|
Spent a long time playing with PDFRead, but couldn't find a way to make the images use the full vertical space (and be centered horizontally), so I ended up having to export them as individual images, archive them as CBRs, and use comiclrf to make them into LRFs. The LRFs mostly look okay, considering the source material. The EPUBs, on the other hand, ended up looking horrible for all of them. But here are the LRFs I made.
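The "export as individual images, then archive" step can be sketched in a few lines of Python. This sketch writes a CBZ (a plain ZIP of page images) rather than a CBR, because creating RAR archives needs an external tool. Whether a given comic converter accepts CBZ is an assumption to check. `pack_cbz` and `natural_key` are hypothetical helper names.

```python
import re
import zipfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".gif"}


def natural_key(name):
    """Sort key that orders 'page2' before 'page10'."""
    parts = re.split(r"(\d+)", name)
    # re.split with a capturing group alternates text (even index) and digits (odd index).
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]


def pack_cbz(image_dir, cbz_path):
    """Store every image in image_dir in a CBZ, renamed to zero-padded page numbers.

    Returns the list of names written to the archive, in page order.
    """
    image_dir = Path(image_dir)
    pages = sorted(
        (p for p in image_dir.iterdir()
         if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES),
        key=lambda p: natural_key(p.name),
    )
    if not pages:
        raise ValueError(f"no page images found in {image_dir}")
    width = len(str(len(pages)))  # pad so plain alphabetical order equals page order
    names = []
    # ZIP_STORED: the images are already compressed, so deflating gains little.
    with zipfile.ZipFile(cbz_path, "w", zipfile.ZIP_STORED) as archive:
        for index, page in enumerate(pages, start=1):
            arcname = f"{index:0{width}d}{page.suffix.lower()}"
            archive.write(page, arcname)
            names.append(arcname)
    return names
```

Renaming to zero-padded numbers is a deliberate choice: readers and converters that sort entries alphabetically will then still show the pages in the right order.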
http://www.sendspace.com/file/zy6qaj |
01-12-2010, 06:59 AM | #4 |
Junior Member
Posts: 3
Karma: 10
Join Date: Dec 2009
Device: prs-505
|
Actually, what you could do is use Rasterfarian. Have a look for manga2ebook and it will put you on its trail. manga2ebook can be used to convert CBR/CBZ to PDF, though it will mess up the page sequence, so calibre does a better job of that. That PDF is a scanned PDF. Rasterfarian is a separate tool, which will rasterize and CONDENSE a PDF. E.g., from a 1.5 GB color PDF of a comic book, you may be able to obtain an LRF of about 200 MB, which is dramatic. Speed should be OK then. But the results are not always the same: if you have a B&W PDF which already has almost no shades, you may only achieve a 50% reduction.
|
01-13-2010, 01:22 PM | #5 |
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
|
I find that PDFLRF and soPDF work better than either PDFRead (sorry Nick) or Rasterfarian for this job -- I'd recommend PDFLRF in particular.
https://www.mobileread.com/forums/showthread.php?t=13135 I should say, Google already offers most of their public domain titles in both ePub and PDF format, though the ePubs are not that great. Not sure why this thread isn't in the PDF forum. |
01-13-2010, 02:47 PM | #6 | |
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
|
Quote:
Thanks for recommending it based on your practical ("real world") experience. |