Old 07-12-2009, 09:10 PM   #1
Ham88
Zealot
Posts: 134
Karma: 994
Join Date: Apr 2009
Location: Maine, United States
Device: Ectaco Jetbook
Preserving Formatting through conversion?

There may be a thread on this already, but I couldn't find one, so here it is. I have downloaded a book that is only available in PDF, so far as I could find anyway, so I converted it to epub through calibre. The formatting is entirely messed up: for example, all of the words with two l's in them have lost one (call is now cal), and all of the dialog between characters has become one paragraph. So basically my question is: is there a way to preserve the formatting through conversion without having to go through the file manually and fix it?
Old 07-12-2009, 09:38 PM   #2
Kostas
Still wondering why
Posts: 253
Karma: 800
Join Date: Jun 2009
Location: Athens, Greece
Device: PRS 505, (BlackBerry Bold ?)
Hi Ham88,

Indeed, it's strange behavior. Calibre usually gives very good results (in LRF conversion, which is the format I use). Maybe your PDF source is not a "normal" text-based file.
My only suggestion (I'm pretty sure other experienced members will give you more) would be to try the following online converter:
http://www.lib2go.com/

I have tested it for LRF files and it gives fair results.
Good luck!
Old 07-12-2009, 09:55 PM   #3
Ham88
Zealot
Posts: 134
Karma: 994
Join Date: Apr 2009
Location: Maine, United States
Device: Ectaco Jetbook
Lib2go hates my files, as it claims they are too big. I also noticed that it's based on calibre, so I would get the same results anyway. Any other ideas?
Old 07-12-2009, 09:58 PM   #4
Timoleon
Time Enough at Last
Posts: 385
Karma: 1151316
Join Date: Feb 2008
Location: New England
Device: iPad 3, iPhone 5, Kindle 3, Fire, Sony PRS-350
How about running it through soPDF first, and then letting Calibre tackle it?
Old 07-12-2009, 10:07 PM   #5
Ham88
Zealot
Posts: 134
Karma: 994
Join Date: Apr 2009
Location: Maine, United States
Device: Ectaco Jetbook
I just looked up soPDF. I may be wrong, but it appears to be command-line driven, which is something I avoid because I'm incompetent with the lovely command line. I'm currently using PDFread and am hoping it will yield the results I want. The only problem is that this requires me to be patient, something that I lack. If this doesn't work I'll try soPDF and hope for the best. Thanks for the help so far.
Old 07-13-2009, 12:08 AM   #6
frabjous
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
I wrote a barebones GUI for SoPDF. If you go further down in the SoPDF thread, you'll find discussion of it, and a download link.

I'm not sure SoPDF is the right tool here, however. PDFread is a good one to try. Even though you're not using a Sony, running PDFLRF on it, and then running the result through calibre, may give you a good result too.
Old 07-13-2009, 02:19 AM   #7
doreenjoy
01000100 01001010
Posts: 1,858
Karma: 293916
Join Date: Mar 2009
Device: Polyamorous
I have the same problem when using Calibre to convert PDFs: all the paragraph breaks are lost, and I end up with one long long text file. I have yet to find a good PDF to LRF or PDF to ePUB converter.
Old 07-13-2009, 09:17 PM   #8
Ham88
Zealot
Posts: 134
Karma: 994
Join Date: Apr 2009
Location: Maine, United States
Device: Ectaco Jetbook
Everything I have tried has failed. soPDF worsened the problem: the eventual epub file had only one or two words per line on average. Unfortunately, I think I'm going to have to read this book as a PDF on my PC. Thanks for the help.
Old 07-18-2009, 06:12 AM   #9
HarryT
eBook Enthusiast
Posts: 65,018
Karma: 43118253
Join Date: Nov 2006
Location: UK
Device: Kindle Voyage, iPad Mini, iPhone 4, MS Surface Pro, N7
Calibre is an excellent program, but PDF conversion is not one of its strengths. Try "Book Designer" - it converts PDFs as well as anything I've come across.

Unfortunately, however, PDF files sometimes simply cannot be converted well. A PDF file does not contain "text" - it has no paragraphs, lines, words, etc.; just drawing instructions for individual letters. As such, it is extraordinarily difficult to convert.
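
To illustrate why this is hard, here is a minimal Python sketch of what a converter has to guess at. It assumes a deliberately simplified model in which each letter is just an (x, y, character) tuple; the function name, the fixed character width, and the tolerances are all hypothetical, and no real converter is claimed to work exactly this way.

```python
from typing import List, Tuple

# Hypothetical simplified model: one positioned letter = (x, y, character).
Glyph = Tuple[float, float, str]


def glyphs_to_lines(glyphs: List[Glyph], char_width: float = 6.0,
                    line_tol: float = 2.0, gap_factor: float = 0.5) -> List[str]:
    """Guess lines and word breaks from positioned letters (heuristic only)."""
    # Order top-to-bottom (larger y is higher on the page), then left-to-right.
    ordered = sorted(glyphs, key=lambda g: (-g[1], g[0]))

    # Letters whose y values are within line_tol are assumed to share a line.
    lines: List[List[Glyph]] = []
    for g in ordered:
        if lines and abs(lines[-1][0][1] - g[1]) <= line_tol:
            lines[-1].append(g)
        else:
            lines.append([g])

    result = []
    for line in lines:
        line.sort(key=lambda g: g[0])
        text = line[0][2]
        for prev, cur in zip(line, line[1:]):
            # A horizontal gap wider than a fraction of a letter is taken
            # to be a space between words.
            gap = cur[0] - (prev[0] + char_width)
            if gap > gap_factor * char_width:
                text += " "
            text += cur[2]
        result.append(text)
    return result


if __name__ == "__main__":
    sample = [(18, 100, "y"), (0, 100, "H"), (0, 88, "o"), (6, 100, "i"),
              (24, 100, "o"), (30, 100, "u"), (6, 88, "k")]
    print(glyphs_to_lines(sample))
```

Every threshold here is a guess. If a font's spacing falls on the wrong side of one, words merge or split, and nothing in this model says where a paragraph ends.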
Old 07-19-2009, 07:01 AM   #10
DDHarriman
Guru
Posts: 854
Karma: 1200
Join Date: Feb 2008
Location: Almada, Portugal
Device: Cybook Gen3, Sony PRS 505, Kindle DXG and Samsung Galaxy Note
Hi

My advice is to OCR the PDF file, save the result in a format you can edit (for example, Microsoft Word), proofread it and correct any errors, then create the final eBook in ePub or another format your reader can handle.

Two of the best software applications for OCR are FineReader and OmniPage.

Best regards,
Old 07-19-2009, 12:06 PM   #11
frabjous
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
Quote:
Originally Posted by DDHarriman
Hi

My advice is to OCR the PDF file, save the result in a format you will be able to edit (per example Microsoft word), proof read it and correct the errors found, create final eBook in ePub or other form your reader can handle.

I don't think that falls under the category of "Preserving format through conversion" "without having to go through the file manually and fix it", which is what was asked for.

I continue to maintain that PDFread and/or PDFLRF > Calibre are the best ways to preserve the look of the original PDF while formatting it better for an e-ink device.
Old 07-19-2009, 02:14 PM   #12
DDHarriman
Guru
Posts: 854
Karma: 1200
Join Date: Feb 2008
Location: Almada, Portugal
Device: Cybook Gen3, Sony PRS 505, Kindle DXG and Samsung Galaxy Note
Hi

Frabjous

You are perfectly correct, I missed that the final intent from Ham88 was to:
“(…)to preserve the formatting through conversion without having to go through the file manually and fix it?”

Ham88

Rephrasing, the answer to the question (cited above) you have posed is: no!

Best regards,
Old 07-19-2009, 02:18 PM   #13
JSWolf
Resident Curmudgeon
Posts: 37,884
Karma: 18755150
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Sony Reader PRS-650, iPad, nook STR
There is no such thing as a novel length PDF that will convert from PDF to any other format without errors.
Old 07-19-2009, 03:28 PM   #14
frabjous
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
The tools I suggested convert the PDF pages to images, but then remove the margins and cut up the images into manageable chunks, and also change the file format--not that it matters much if it's just a sequence of images. I'm not sure what you have in mind by "errors", but they should preserve the look more or less exactly, and there would be nothing to manually fix. The only question is whether the results would look nice enough for the purposes of reading.
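
As a rough illustration of the crop-and-slice idea, here is a small Python sketch. It is not the actual code of PDFread or PDFLRF. It assumes a toy page model (rows of pixels, 0 for blank and 1 for ink), and the function names and the fixed chunk height are hypothetical. A real tool would presumably try to cut between lines of text rather than at a fixed height.

```python
from typing import List

# Hypothetical toy model of a rendered page: rows of pixels, 0 = blank, 1 = ink.
Page = List[List[int]]


def crop_margins(page: Page) -> Page:
    """Trim blank rows and columns around the inked area."""
    inked_rows = [i for i, row in enumerate(page) if any(row)]
    if not inked_rows:
        return []  # a completely blank page has nothing to keep
    inked_cols = [j for j in range(len(page[0]))
                  if any(row[j] for row in page)]
    top, bottom = inked_rows[0], inked_rows[-1]
    left, right = inked_cols[0], inked_cols[-1]
    return [row[left:right + 1] for row in page[top:bottom + 1]]


def split_into_chunks(page: Page, chunk_height: int) -> List[Page]:
    """Cut the page into horizontal strips of at most chunk_height rows."""
    if chunk_height <= 0:
        raise ValueError("chunk_height must be positive")
    return [page[i:i + chunk_height]
            for i in range(0, len(page), chunk_height)]


if __name__ == "__main__":
    page = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 0, 0],
        [0, 0, 1, 1, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 1, 0, 0],
        [0, 0, 0, 0, 0],
    ]
    for chunk in split_into_chunks(crop_margins(page), chunk_height=2):
        print(chunk)
```

Nothing in this process ever tries to recognize letters or words, which is why the look survives intact but the text cannot reflow.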
Old 07-19-2009, 03:37 PM   #15
JSWolf
Resident Curmudgeon
Posts: 37,884
Karma: 18755150
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Sony Reader PRS-650, iPad, nook STR
Quote:
Originally Posted by frabjous
The tools I suggested convert the PDF pages to images, but then remove the margins and cut up the images into manageable chunks, and also change the file format--not that matters much if it's just a sequence of images. I'm not sure what you have in mind by "errors", but they should preserve the look more or less exactly, and there would be nothing to manually fix. The only question is whether the results would look nice enough for the purposes of reading.

I meant converting the PDF to some reflowable format, not to images. That cannot be done without errors.