MobileRead Forums
Old 01-01-2009, 10:10 AM   #1
TadW
Uebermensch
Posts: 2,467
Karma: 8144
Join Date: Jul 2003
Location: Italy
Device: Kindle
How to Do Everything with PDF Files

The following article gives a good overview of what you can do with PDF files (without using the expensive Adobe Acrobat):

http://www.labnol.org/software/adobe...tutorial/6296/
__________________
"It doesn't matter how good or bad the product is, the fact is that people don't read anymore." - Apple's Steve Jobs, Jan 15, 2008
Old 01-01-2009, 10:24 AM   #2
Nate the great
Sir Penguin of Edinburgh
Posts: 7,287
Karma: 48386
Join Date: Apr 2007
Location: Northern Virginia
Device: Airpanel 100, Jornada 720, Kindle, Smart Q7, Zelda has Sony 600 & 700
good find
Old 01-01-2009, 10:30 AM   #3
xianfox
Ebook Addict
Posts: 200
Karma: 1266
Join Date: Jul 2003
Location: Appleton, Wisconsin, USA
Device: AT&T Tilt, Sony PRS-700
Thanks, some of that will come in handy at work.
__________________
Ebook Device History: Pilot 1000 -> Palm Pilot Professional -> Palm IIIxe -> Handspring Visor Prism -> Toshiba e310 -> Toshiba e740 -> PPC6700 -> Treo 700wx -> iRex iLiad V2 -> Sony PRS-700
Old 01-01-2009, 10:45 AM   #4
JSWolf
Mobile Reader Geek
Posts: 16,132
Karma: 18503
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Sony Reader PRS-505
But I notice no good way to convert from PDF.
__________________
Jon



If you want to listen to really good music while you surf Mobileread, here.
Old 01-02-2009, 06:10 PM   #5
smithno
Zealot
Posts: 117
Karma: 166
Join Date: Jul 2008
Location: Tenn., US
Device: Sony PRS-505, Eee PC, Touch, EZ Reader, jetBook
Quote:
Originally Posted by JSWolf View Post
But I notice no good way to convert from PDF.
PDF was designed as an output format. It will probably never be easy to manipulate.
Old 01-02-2009, 07:57 PM   #6
RWood
Technogeezer
Posts: 7,194
Karma: 54344
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
Quote:
Originally Posted by JSWolf View Post
But I notice no good way to convert from PDF.
"Good" is a matter of conjecture, Jon. The article suggests that "You can upload the PDF document to Zamzar and convert it any formats like doc, html, png, txt or rtf (rich text format). Alternatively, you can convert PDF to HTML using Gmail."

I have used ABBYY PDF Transformer 2.0, ABC Amber PDF Converter, Paperport, and several other packages over the years. There is no one solution for all cases. The correct choice depends on the specific PDF in question, the tools on your computer, what tools are currently available for free, what tools you can get in a functioning trial copy, and how much money you are willing to spend on new tools.

While I am not the biggest fan of PDF for ebooks, PDFs have their place and I have created PDF files for the Sony Reader where I felt they were the best option.
__________________
We've all gotten crazier to keep from going sane.
Old 01-03-2009, 03:57 AM   #7
alexxx
Junior Member
Posts: 4
Karma: 10
Join Date: Aug 2006
Device: nokia770
Too many of the options proposed in the article involve uploading your document to some server.
Call me paranoid, but I don't like this kind of "service" at all - I want my documents to stay on <my> server.
Apart from that, under Linux (which is not mentioned at all in the article) software exists to do practically any kind of conversion you need.



alessandro
Old 01-03-2009, 05:52 AM   #8
Flinx
Member
Posts: 16
Karma: 80
Join Date: Sep 2006
Device: Cybook Gen3
Quote:
Originally Posted by alexxx View Post
Apart from that, under linux (which is not mentioned at all in the article) software exists to do practically any kind of conversion you need.
alessandro
Really? I searched and found no Linux program at all that tries to convert from PDF to floating text with attributes and with paragraph recognition. The only program I could find that generates useful output is PdfGrabber, but I am still interested in a better solution.
Old 01-03-2009, 05:24 PM   #9
bookbinder
Connoisseur
Posts: 61
Karma: 813
Join Date: Jun 2007
Location: Ukraine
Device: Sony Reader 505
google books

I have a few scanned google books in pdf that I'm having a hard time converting to text, even following advice from the article. Has anyone done this successfully? I've tried:
-Zamzar (returns an unopenable doc file)
-Google mail (doesn't display pdf as html)
-Pdf2Word program
Old 01-04-2009, 02:46 AM   #10
labnol
PDF Geek
Posts: 1
Karma: 10
Join Date: Jan 2009
Device: none
Use Google

Quote:
Originally Posted by bookbinder View Post
I have a few scanned google books in pdf that I'm having a hard time converting to text, even following advice from the article. Has anyone done this successfully?
You can upload the scanned PDF files to a public web server, link those files from a web page and then wait for Google's bots to index those PDFs. See complete instructions.
Old 01-04-2009, 07:52 AM   #11
Flinx
Member
Posts: 16
Karma: 80
Join Date: Sep 2006
Device: Cybook Gen3
Quote:
Originally Posted by labnol View Post
...wait for google bots to index those PDF.
The linked example shows why this way is essentially useless. The resulting text has line breaks on each line. A good converter for books has to try to set a line break only at the end of a paragraph.
Old 01-04-2009, 09:59 AM   #12
tompe
Wizard
Posts: 4,219
Karma: 6356
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Cybook Gen3
Quote:
Originally Posted by Flinx View Post
The linked example shows why this way is essentially useless. The resulting text has line breaks on each line. A good converter for books has to try to set a line break only at the end of a paragraph.
Really not true at all. You can also use the convention that two line breaks in a row indicate a new paragraph, like TeX and LaTeX do. It is trivial to convert between the two conventions using some simple program or a one-line script.
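
To illustrate, here is a minimal sketch in Python of the kind of small script meant here. It turns text whose paragraphs are separated by blank lines into text with one line per paragraph. The function name `unwrap_paragraphs` and the sample text are made up for the example, and it assumes the blank lines really do mark paragraph boundaries.

```python
import re


def unwrap_paragraphs(text: str) -> str:
    """Turn blank-line-separated paragraphs into one line per paragraph."""
    # Two or more line breaks in a row (possibly with spaces between)
    # are treated as a paragraph boundary.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, every run of whitespace (including the single
    # line breaks) collapses to one space.
    return "\n".join(" ".join(p.split()) for p in paragraphs)


sample = (
    "First line of a paragraph\n"
    "that continues here.\n"
    "\n"
    "Second paragraph\n"
    "also wrapped."
)
print(unwrap_paragraphs(sample))
```

This only changes the encoding of paragraphs. It does nothing to detect paragraphs that the converter failed to mark in the first place.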
__________________
DRM is EVIL!
/Tommy Persson
Old 01-04-2009, 02:24 PM   #13
Flinx
Member
Posts: 16
Karma: 80
Join Date: Sep 2006
Device: Cybook Gen3
Quote:
Originally Posted by tompe View Post
Really not true at all. You can also use the convention that two line breaks in a row indicate a new paragraph
No, that is not really useful for most standard PDFs. A text object in a PDF file does not contain a real line break. It contains the position on the page where it has to be drawn and a number of characters. The result is a line of text.
The program that does the conversion has to work out from the positions of the text objects in which order the lines come. Simple converters, like most of those available (including Acrobat), take one text object, convert it to text and set a line break at the end, resulting in one line of the output text. The better converters can try to join the separate text objects if their horizontal start position is identical and the line is long enough. But this is a difficult job, and I have not yet found a program that works well enough for me.
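
The joining heuristic described above could be sketched roughly like this in Python. Everything here is an assumption made for illustration: the `TextObject` class, the tolerance `x_tol`, the threshold `min_fill`, and the use of character count as a stand-in for the real width of a line. A real converter would have to read positions and font metrics from the PDF itself, which this sketch does not do.

```python
from dataclasses import dataclass


@dataclass
class TextObject:
    x: float   # horizontal start position on the page
    y: float   # vertical position (larger means higher on the page)
    text: str  # the characters drawn at that position


def join_lines(objects, x_tol=1.0, min_fill=0.8):
    """Group positioned lines into paragraphs.

    A line is appended to the previous one when both start at the same
    horizontal position and the previous line is "long enough", here
    at least min_fill times the longest line on the page.
    """
    lines = sorted(objects, key=lambda o: -o.y)  # top of page first
    if not lines:
        return []
    longest = max(len(o.text) for o in lines)
    paragraphs = [lines[0].text]
    for prev, cur in zip(lines, lines[1:]):
        same_margin = abs(cur.x - prev.x) <= x_tol
        prev_is_full = len(prev.text) >= min_fill * longest
        if same_margin and prev_is_full:
            paragraphs[-1] += " " + cur.text
        else:
            # A short previous line or a changed margin starts a new paragraph.
            paragraphs.append(cur.text)
    return paragraphs


page = [
    TextObject(72, 676, "forest."),
    TextObject(72, 700, "The quick brown fox jumps over the lazy"),
    TextObject(72, 640, "for a bit."),
    TextObject(72, 688, "dog and then keeps running through the"),
    TextObject(72, 652, "A new paragraph starts here and runs on"),
]
for paragraph in join_lines(page):
    print(paragraph)
```

Even on this toy page the weak spots are visible: a paragraph whose last line happens to be full would be merged with the next one, and hyphenation, columns, headers and footers are not handled at all.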
Old 01-04-2009, 02:51 PM   #14
tompe
Wizard
Posts: 4,219
Karma: 6356
Join Date: Oct 2007
Location: Linköping, Sweden
Device: Cybook Gen3
Quote:
Originally Posted by Flinx View Post
No, that is not really useful for most standard PDFs. A text object in a PDF file does not contain a real line break. It contains the position on the page where it has to be drawn and a number of characters. The result is a line of text.
The program that does the conversion has to work out from the positions of the text objects in which order the lines come. Simple converters, like most of those available (including Acrobat), take one text object, convert it to text and set a line break at the end, resulting in one line of the output text. The better converters can try to join the separate text objects if their horizontal start position is identical and the line is long enough. But this is a difficult job, and I have not yet found a program that works well enough for me.
That might be the case, but there is no functional difference between encoding paragraphs with two line breaks or one. What you are talking about is how good a converter is at detecting a paragraph break, but that has no necessary connection to how the encoding is done. You can argue that you lose information if you do not keep the line breaks within a paragraph, since they are impossible to recreate, but it is trivial to take a paragraph delimited by double line breaks and convert it to one line.
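
For the opposite direction, a small sketch using Python's standard `textwrap` module shows the point about lost information. It takes one-line paragraphs and writes them out separated by blank lines. The function name `wrap_paragraphs` and the width of 20 are arbitrary choices for the example. The line breaks it inserts depend only on the chosen width, so the original break positions are not recovered.

```python
import textwrap


def wrap_paragraphs(text: str, width: int = 40) -> str:
    """Turn one-line paragraphs into wrapped, blank-line-separated ones."""
    # Each non-empty input line is taken to be one complete paragraph.
    paragraphs = [line for line in text.split("\n") if line.strip()]
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)


one_line = (
    "alpha beta gamma delta epsilon zeta eta theta iota kappa\n"
    "second paragraph here"
)
print(wrap_paragraphs(one_line, width=20))
```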
__________________
DRM is EVIL!
/Tommy Persson
Old 01-05-2009, 05:28 AM   #15
stonehat
Re-Iliadist
Posts: 6
Karma: 10
Join Date: Oct 2008
Device: none
From TFA:
"Most mobile phones can read PDF files."

I stopped reading after that.