Old 03-15-2012, 03:48 PM   #1
Lbooker
Addict
Posts: 316
Karma: 1021312
Join Date: Jun 2009
Device: Sony PRS-T1
Are things going ahead on the PDF-to-html/xml/epub front?

Are things going ahead on the PDF-to-HTML front?
We all have scores of PDF files on our computers that never get read because we have no convenient way to read them. Printing them is not ecological, and there is no room to store hundreds of printouts; converting them to a clean epub/mobi file is not an available option either. I tried MobiCreator and only get a 20% success rate; it is badly outdated.
If you guys finished programming a very good PDF-to-HTML conversion engine (and from there to epub/mobi/whatever), it would be a giant leap for humanity.
If you are not motivated and need an incentive, you can start a new thread asking for donations/Amazon gifts/whatever. Or I can do that for you if you are too shy!

Last edited by Lbooker; 03-19-2012 at 08:14 AM.
Old 03-15-2012, 10:22 PM   #2
JSWolf
Resident Curmudgeon
Posts: 79,758
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
There is no way to convert a complex or text-based PDF of some decent length without errors.
Old 03-16-2012, 04:32 AM   #3
roffLOL
Member
Posts: 10
Karma: 1538
Join Date: Sep 2011
Location: Sweden
Device: Sony PRS-350
I've been busy with studies and work for several months. Most recently I worked on a conversion for a horrible PDF file.

Moderator Notice
removed link to copyrighted material. don't do this again


I'm pretty sure that when I get a nice result from it the code should be able to handle just about any case with grace. But I'm not there yet. Will try to pull myself together soon.

Last edited by theducks; 03-16-2012 at 10:24 AM. Reason: Warning
Old 03-16-2012, 07:52 AM   #4
Lbooker
Addict
Posts: 316
Karma: 1021312
Join Date: Jun 2009
Device: Sony PRS-T1
Quote:
Originally Posted by JSWolf View Post
There is no way to convert a complex or text-based PDF of some decent length without errors.
Reducing the error percentage of these conversions is a rational challenge for the human mind.


Quote:
Originally Posted by roffLOL View Post
I've been busy with studies and work for several months. Most recently I worked on a conversion for a horrible PDF file.


removed link to copyrighted material

I'm pretty sure that when I get a nice result from it the code should be able to handle just about any case with grace. But I'm not there yet. Will try to pull myself together soon.
Great news! Do you know if the other members of the calibre team are still actively pursuing the same effort? And do you plan to use some of the GPL code out there on the web, like this one:
http://pdftohtml.sourceforge.net/
Check his demo! His code manages to convert a complex document nicely.
I just found out pdftohtml is now part of poppler-utils. I played with it and turned a PDF into hundreds of HTML files, but calibre will not accept them as one book. I also turned this PDF into an XML file, but calibre does not accept XML as input.
Well, with the -s and -i options, I managed to create a single HTML file that calibre converted into epub, but the outcome is no better than what calibre would have produced directly from the PDF file.
So the problem lies in the conversion from HTML to epub.
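
For anyone who wants to repeat the experiment, here is a rough Python sketch of the two steps described above: pdftohtml with -s and -i, then calibre's ebook-convert command-line tool on the resulting HTML. It assumes both tools are installed and on the PATH. The function names are made up for illustration, and picking the largest .html file in the work directory is only a guess at which file holds the book text, since the exact file names pdftohtml writes can differ between versions.

```python
import subprocess
from pathlib import Path


def pdftohtml_cmd(pdf, out_base):
    """Argument list for pdftohtml (hypothetical helper name).

    -s asks for a single document covering all pages, -i ignores images.
    out_base is the output base name that pdftohtml derives file names from.
    """
    return ["pdftohtml", "-s", "-i", str(pdf), str(out_base)]


def ebook_convert_cmd(html, epub):
    """Argument list for calibre's ebook-convert (hypothetical helper name).

    ebook-convert infers input and output formats from the file extensions.
    """
    return ["ebook-convert", str(html), str(epub)]


def convert(pdf, workdir, epub):
    """Run both steps. Requires pdftohtml and ebook-convert to be installed."""
    workdir = Path(workdir)
    workdir.mkdir(parents=True, exist_ok=True)
    out_base = workdir / Path(pdf).stem
    subprocess.run(pdftohtml_cmd(pdf, out_base), check=True)

    # Assumption: the largest HTML file in workdir is the one with the text.
    candidates = sorted(workdir.glob("*.html"), key=lambda p: p.stat().st_size)
    if not candidates:
        raise FileNotFoundError(f"pdftohtml wrote no .html file into {workdir}")
    subprocess.run(ebook_convert_cmd(candidates[-1], epub), check=True)


if __name__ == "__main__":
    # Only prints the commands; call convert(...) to actually run them.
    print(pdftohtml_cmd("book.pdf", "work/book"))
    print(ebook_convert_cmd("work/book.html", "book.epub"))
```

As noted above, do not expect the epub from this route to be better than what calibre produces directly from the PDF; the sketch only makes the test easy to repeat on other files.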

Last edited by Lbooker; 03-16-2012 at 01:28 PM.
All times are GMT -4.