Old 03-06-2015, 07:02 AM   #1
norweger
Addict
Posts: 218
Karma: 29322
Join Date: Mar 2015
Location: Norway
Device: Android-phone, HTC Desire Z
Stripping a file from header text?

Hello,

I prefer reading epub over free websites, so for almost nothing I got this literary guide as an ebook file, but it turns out it was only delivered as a pdf, so I'll have to convert it.

But now there are many unwanted elements in the pdf. At the bottom there's the page number, on the right there's a logo, and on the left there's a copyright text. I'd like to be left with only the relevant text, so the conversion goes more smoothly (if smooth is a relevant word when dealing with pdf files).

Here's the file
https://dl.dropboxusercontent.com/u/...animalfarm.pdf

Do you know of an easy way to strip unwanted elements from such pdf files?
Old 03-06-2015, 11:02 AM   #2
markom
Banned
Posts: 488
Karma: 1080260
Join Date: Sep 2012
Device: sony prs t1 kindle dx ipad
Freeware apps: pdf scissors, briss, k2pdfopt.

http://sourceforge.net/projects/briss/
https://sites.google.com/site/pdfscissors/
http://www.willus.com/k2pdfopt/

k2pdfopt is the more powerful app, but in your case it's better to use briss or scissors because they use intuitive visual rectangles for cropping instead of commands.

It takes about 10 seconds to load that 38-page file and ten more seconds to crop it, if we know what to do.

Since the original file, once cropped, is 15 cm wide, we can also try reading it in landscape mode on a 6" reader (12 cm wide).

To adjust the cropped pdf for landscape viewing (if our reader is poor or slow at it), we can use k2pdfopt beforehand: load the cropped pdf into k2pdfopt and choose fitwidth mode, and after a couple of minutes we get the third attached file.

If the letters are very small (as in this case), we can use k2pdfopt's reflow mode instead (the default mode, with the reflow box checked), which produces the fourth attached file after a couple of minutes.
Attached Files
File Type: pdf animalfarm_cropped.pdf (293.4 KB, 578 views)
File Type: pdf animalfarm_scissored.pdf (274.9 KB, 1734 views)
File Type: pdf animalfarm_cropped_k2opt.pdf (327.4 KB, 998 views)
File Type: pdf animalfarm_cropped_k2optreflow.pdf (7.43 MB, 474 views)

Last edited by markom; 03-06-2015 at 12:38 PM.
Old 03-06-2015, 01:21 PM   #3
norweger
That's great. Thanks. Can you get the last step to run smoothly as well, from pdf to epub?
Old 03-06-2015, 08:03 PM   #4
willus
Fuzzball, the purple cat
Posts: 1,274
Karma: 11087488
Join Date: Jun 2011
Location: California
Device: iPad
Quote:
Originally Posted by markom View Post
... k2pdfopt is the more powerful app, but in your case it's better to use briss or scissors because they use intuitive visual rectangles for cropping instead of commands...
The latest release of k2pdfopt (v2.32, MS Windows version) can now also use visual rectangles to apply a cropping box.
Old 03-06-2015, 08:17 PM   #5
markom
Quote:
Originally Posted by norweger View Post
That's great. Thanks. Can you get the last step to run smoothly as well, from pdf to epub?
For non-scanned material, i.e. an original textual (vector) pdf like this one, conversion to epub should not pose a problem, depending on the tool we use.

Here I used Abbyy 11 to quickly convert the cropped animalfarm pdf to epub.

Personally, I always use pdf optimization only, without conversion to epub, because I want 100% exactness (in recognition and formatting) for every character. For scanned material that is impossible to achieve automatically and quickly with epub, because epub reflows only the imperfect OCR text layer, not the pdf image itself. So I'd use a reflowed pdf (made with k2pdfopt) if the letters were too small for comfortable reading on a 6" screen.

A5-sized pdfs and 2-column A4 pdfs pose no problem even for a 6" screen after cropping, so it's usually only one-column A4 pdfs that should be reflowed for reading on a 6" reader in landscape (2 or 3 screens per pdf page).

k2pdfopt reflows the pdf image itself, not just the imperfect OCR layer behind the image. It also retains pictures and tables, unlike the pdf reflow built into e-readers, and the result is usually a lot faster to flip through because the e-reader's processor doesn't have to compute the reflow itself; it only has to show the already reflowed page.
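To make the idea of reflowing the page image itself more concrete, here is a minimal Python sketch. It is not k2pdfopt's actual algorithm: it assumes the page has already been cut into word boxes, and `WordBox` and `reflow` are hypothetical names used only for illustration. Each box is moved as a whole, so nothing inside a word is ever re-recognised or re-typeset.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    """A word cut out of the page image; only its size matters here."""
    label: str     # stand-in for the word's pixels
    width: float   # width in cm


def reflow(words, line_width, gap=0.2):
    """Greedily pack word boxes into lines no wider than line_width (cm)."""
    lines, current, used = [], [], 0.0
    for word in words:
        # Width the line would have if this word were appended to it.
        needed = word.width if not current else used + gap + word.width
        if current and needed > line_width:
            # Line is full: close it and start a new one with this word.
            lines.append(current)
            current, needed = [], word.width
        current.append(word)
        used = needed
    if current:
        lines.append(current)
    return lines


# Assumed widths: 0.3 cm per letter, purely for the demonstration.
text = "all animals are equal but some are more equal"
words = [WordBox(w, 0.3 * len(w)) for w in text.split()]
for line in reflow(words, line_width=5.0):
    print(" ".join(w.label for w in line))
```

A real tool also has to find the word boxes in the first place and handle pictures and tables, which this sketch leaves out.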
Attached Files
File Type: epub animalfarm_cropped_Abbyy.epub (112.9 KB, 430 views)

Last edited by markom; 03-07-2015 at 02:07 AM.
Old 03-07-2015, 02:21 PM   #6
norweger
Beautiful. Thanks!
Old 03-07-2015, 05:56 PM   #7
norweger
I just bought Hitchens' Why Orwell Matters and ran it through Briss, but a strange thing happened. This image, which was not visible when I viewed the original pdf file, suddenly popped up.

Here's the pdf file. Here's the epub.

What I did was open Briss, leave the first 3 pages without any rectangle, and then run the result through Calibre.

Does the same thing happen when you run it through your scissors?
Old 03-08-2015, 12:07 PM   #8
markom
Quote:
Originally Posted by norweger View Post
I just bought Hitchens' Why Orwell Matters and ran it through Briss, but a strange thing happened. This image, which was not visible when I viewed the original pdf file, suddenly popped up.

...

What I did was open Briss, leave the first 3 pages without any rectangle, and then run the result through Calibre.

Does the same thing happen when you run it through your scissors?
It is the same with pdf-scissors, but if you want to get rid of some pages you can easily do that, either by using a freeware pdf tool or pdf editor to delete them from the pdf, or by using Sigil (a freeware epub editor) to delete them from the epub afterwards.

When we crop or delete a pdf page, we don't genuinely crop or delete the original page; we just mask it, telling the reader to show only part of the page or not to show it at all.
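The masking behaviour described above can be illustrated with a toy model in Python. This is not a real pdf library: `Page`, `Item`, `crop`, `visible_items` and `all_items` are hypothetical names, and the coordinates are made up. The sketch only shows why content outside the crop rectangle can reappear when a converter reads the whole page instead of the cropped view; whether a particular tool does that is an assumption that will vary from tool to tool.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (left, bottom, right, top) in points


@dataclass
class Item:
    name: str
    x: float
    y: float


@dataclass
class Page:
    media_box: Box
    items: List[Item] = field(default_factory=list)
    crop_box: Optional[Box] = None  # None means "show the whole page"


def crop(page: Page, box: Box) -> None:
    """Cropping only records the visible rectangle; no item is deleted."""
    page.crop_box = box


def visible_items(page: Page) -> List[str]:
    """What a viewer that honours the crop box would display."""
    left, bottom, right, top = page.crop_box or page.media_box
    return [i.name for i in page.items
            if left <= i.x <= right and bottom <= i.y <= top]


def all_items(page: Page) -> List[str]:
    """What a tool that ignores the crop box would still find."""
    return [i.name for i in page.items]


# An A4-sized page (595 x 842 points) with body text, a footer and a logo.
page = Page(media_box=(0, 0, 595, 842),
            items=[Item("body text", 300, 400),
                   Item("page number", 300, 20),
                   Item("logo", 560, 400)])
crop(page, (50, 50, 540, 800))
print(visible_items(page))  # only the body text is inside the crop box
print(all_items(page))      # the masked items are still stored on the page
```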

Genuinely deleting the cropped part of a pdf page isn't easy or straightforward, even with Adobe Acrobat.

Also, as already mentioned, this pdf book, like most fiction, is in A5 format or smaller, so there is really no need for epub/mobi conversion. As a cropped or zoomed pdf it is easily and quickly readable and searchable, and can be annotated and scribbled on, in landscape mode on a 6" eink screen, at two or three screens per page depending on the letter size we want. We can even get a much bigger letter size than in the original paper book, because a 6" screen is 12 cm wide and the text width in this book is 10 cm without margins.
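The arithmetic behind these numbers can be sketched in Python. The 12 cm screen width and the 10 cm text width come from the post; the 9 cm screen height (roughly a 6" 4:3 panel in landscape) and the 17 cm text height are assumptions for illustration, and the function names are hypothetical.

```python
import math


def fit_width_scale(content_width_cm: float, screen_width_cm: float) -> float:
    """Magnification when the content is scaled to fill the screen width."""
    return screen_width_cm / content_width_cm


def screens_per_page(content_height_cm: float, scale: float,
                     screen_height_cm: float) -> int:
    """Number of screens needed to page through one scaled pdf page."""
    return math.ceil(content_height_cm * scale / screen_height_cm)


SCREEN_W, SCREEN_H = 12.0, 9.0  # landscape; the height is an assumption
TEXT_W, TEXT_H = 10.0, 17.0     # text block; the height is an assumption

scale = fit_width_scale(TEXT_W, SCREEN_W)
print(f"magnification: {scale:.2f}x")
print("screens per page:", screens_per_page(TEXT_H, scale, SCREEN_H))
```

With the 15 cm wide cropped Animal Farm file from earlier in the thread, the same function gives 12 / 15 = 0.8, so that page has to be shrunk slightly to fit the screen width.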

Last edited by markom; 03-10-2015 at 05:40 AM.
Old 03-11-2015, 03:49 PM   #9
norweger
I read pretty much all my books on an old phone with FBReader (not the latest edition, because the phone is too old to handle it), so that's why I prefer epub.
All times are GMT -4.