11-23-2010, 08:47 AM | #1 |
Zealot
Posts: 123
Karma: 998177
Join Date: Aug 2010
Device: Kindle 3
|
PDF conversion ignores images and cropping
Before I put a (text-based) e-book PDF on my Kindle, I crop all the margins/whitespace, headers (repeated chapter names and/or author), and footers (page numbers).
Basically I leave just a few pixels around the text. That way I get the best PDF reading experience. Today, for the first time, I tried converting a PDF to MOBI & EPUB with Calibre. While I was AMAZED by the quality of the conversion, there are two main issues I noticed right away: 1. No images. I wonder why? 2. All the header and footer texts are back. I hope that this can be fixed. Kovid, if you need the PDF file I am using, let me know. Cheers |
11-23-2010, 09:07 AM | #2 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
That's because the "cropped" pdf is really just a pdf with the cutoff material hidden. You can tell your pdf editor to remove the cropped material and then it will be gone, or you can remove headers and footers during conversion.
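As a rough illustration of the second option (this is not calibre's own code), removing headers and footers from extracted text usually comes down to matching them with regular expressions. The sketch below assumes the footer is a line containing only a page number and the header is a known, repeated line; the function name `strip_headers_footers` and the sample text are made up for the example.

```python
import re

def strip_headers_footers(text, header_lines=()):
    """Drop page-number-only lines and known repeated header lines."""
    # Assumption: a footer is a line holding nothing but digits.
    page_number = re.compile(r"^\s*\d+\s*$")
    headers = {h.strip() for h in header_lines}
    kept = []
    for line in text.splitlines():
        if page_number.match(line):
            continue  # bare page number
        if line.strip() in headers:
            continue  # repeated chapter/author header
        kept.append(line)
    return "\n".join(kept)

# Hypothetical text as it might come out of a two-page PDF.
sample = "Chapter One\nIt was a dark night.\n12\nChapter One\nThe rain fell.\n13"
print(strip_headers_footers(sample, header_lines=["Chapter One"]))
```

Real books need more careful patterns, since a bare number can also be legitimate body text.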
|
11-23-2010, 09:24 AM | #3 |
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
|
That is because the cropped area was hidden, not removed. I often crop PDFs using Adobe, then export the cropped PDF as HTML. Once exported, I use calibre to convert the HTML to EPUB. All images and most links work fine, and the cropped area is gone completely.
|
11-23-2010, 12:25 PM | #4 |
Zealot
Posts: 123
Karma: 998177
Join Date: Aug 2010
Device: Kindle 3
|
I am fully aware that cropping does not actually remove anything from the PDF; it just resizes the view area.
Quote:
All PDF viewers I am aware of display cropped documents correctly. Some of those viewers support (cropping-aware) reflow. That means that apparently there is some API that provides PDF content depending on cropping. Which in turn, I hope, means that rather than being something impossible to implement, this is just a feature that is not implemented in Calibre yet. So, once again: "I wonder why?" & "I hope that this can be fixed." |
|
11-23-2010, 12:35 PM | #5 |
creator of calibre
Posts: 43,850
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Implementing cropping in a viewer is easy: you just draw the whole page and crop out the part of the page specified by the cropbox. Implementing cropping in a text extraction tool is not nearly that easy.
Patches welcome. |
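To make the difference concrete, here is a minimal sketch of what cropping-aware extraction would have to do, assuming the extractor already knows a bounding box for every run of text in page coordinates. That assumption is the hard part in practice. The `Box`, `TextRun`, and `visible_runs` names are hypothetical and not part of calibre or any PDF library.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Box:
    x0: float  # left
    y0: float  # bottom
    x1: float  # right
    y1: float  # top

    def contains(self, other: "Box") -> bool:
        """True if `other` lies entirely inside this box."""
        return (self.x0 <= other.x0 and self.y0 <= other.y0
                and other.x1 <= self.x1 and other.y1 <= self.y1)

@dataclass(frozen=True)
class TextRun:
    text: str
    bbox: Box

def visible_runs(runs, crop_box):
    """Keep only the text runs that fall fully inside the crop box."""
    return [r for r in runs if crop_box.contains(r.bbox)]

# Hypothetical page: a header, one body line, and a page number.
page = [
    TextRun("Chapter One", Box(72, 750, 200, 765)),           # above the crop
    TextRun("It was a dark night.", Box(72, 400, 300, 415)),  # inside the crop
    TextRun("12", Box(290, 30, 305, 42)),                     # below the crop
]
crop = Box(60, 60, 550, 740)
print([r.text for r in visible_runs(page, crop)])
```

A viewer never needs the per-run boxes, because it only clips pixels. An extractor also has to decide what to do with runs that straddle the crop boundary, which this sketch simply drops.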
|
11-23-2010, 12:36 PM | #6 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
|
11-23-2010, 05:20 PM | #7 |
Zealot
Posts: 123
Karma: 998177
Join Date: Aug 2010
Device: Kindle 3
|
Because Adobe and other programs can do it, I was hoping there would be a well-known open source library for it. Do you think you'd ever be interested in implementing this and image extraction functionality?
|
11-23-2010, 05:25 PM | #8 |
creator of calibre
Posts: 43,850
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Who knows. I don't make open-ended commitments like that. If at some point I get interested in PDF, maybe. But most likely not, since I have no personal motivation to work on PDF.
|
|