#1 |
Addict
Posts: 316
Karma: 1021312
Join Date: Jun 2009
Device: Sony PRS-T1
|
Are things going ahead on the PDF-to-html/xml/epub front?
We all have scores of PDF files on our computers that never get read because we have no convenient way to read them besides printing them (not ecological, and no room to store hundreds of them), and converting them to a clean epub/mobi file is not an available option. I tried MobiCreator and only got a 20% success rate; it is way outdated.

If you guys finished programming a very good PDF-to-html conversion engine (and then to epub/mobi/whatever), it would be a giant leap for humanity. If you are not motivated and need an incentive, you can start a new thread asking for donations/Amazon gifts/whatever. Or I can do that for you if you are too shy!

Last edited by Lbooker; 03-19-2012 at 08:14 AM. |
#2 |
Resident Curmudgeon
Posts: 79,758
Karma: 145864619
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
There is no way to convert a complex or text-based PDF of any decent length without errors.
|
#3 |
Member
Posts: 10
Karma: 1538
Join Date: Sep 2011
Location: Sweden
Device: Sony PRS-350
|
I've been busy with studies and work for several months. Most recently I worked on a conversion for a horrible PDF file.

[Moderator Notice: removed link to copyrighted material. Don't do this again.]

I'm pretty sure that once I get a nice result from it, the code should be able to handle just about any case with grace. But I'm not there yet. Will try to pull myself together soon.

Last edited by theducks; 03-16-2012 at 10:24 AM. Reason: Warning |
#4 |
Addict
Posts: 316
Karma: 1021312
Join Date: Jun 2009
Device: Sony PRS-T1
|
Quote:
Quote:
http://pdftohtml.sourceforge.net/

Check his demo! His code manages to convert a complex document nicely.

I just found out pdftohtml is now part of poppler-utils. I played with it and turned a PDF into hundreds of HTML files, but calibre will not accept them as one book. I also turned this PDF into an XML file, but calibre does not accept XML as input.

Well, with the -s and -i options, I managed to create a single HTML file that calibre converted into EPUB, but the outcome is no better than what calibre would have done directly with the PDF file. So the problem lies in the conversion from HTML to EPUB.

Last edited by Lbooker; 03-16-2012 at 01:28 PM. |
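For anyone who wants to repeat the pdftohtml runs described in this post, here is a minimal Python sketch of how they might be scripted. It assumes poppler-utils is installed and `pdftohtml` is on the PATH. The function names and the `book.pdf` / `book` file names are hypothetical, and the exact names of the output files depend on the poppler version, so check the output directory after running.

```python
import shutil
import subprocess


def build_pdftohtml_command(pdf_path, output_base, as_xml=False):
    """Build the pdftohtml argument list (hypothetical helper).

    as_xml=False -> "-s": one HTML document containing all pages
    as_xml=True  -> "-xml": XML output instead of HTML
    "-i" is always added so that images are ignored.
    """
    mode_flag = "-xml" if as_xml else "-s"
    return ["pdftohtml", mode_flag, "-i", str(pdf_path), str(output_base)]


def run_pdftohtml(pdf_path, output_base, as_xml=False):
    """Run pdftohtml, failing early if the tool is not installed."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml (poppler-utils) was not found on PATH")
    command = build_pdftohtml_command(pdf_path, output_base, as_xml)
    # check=True raises CalledProcessError if pdftohtml exits with an error
    subprocess.run(command, check=True)
```

Calling `run_pdftohtml("book.pdf", "book")` would attempt the single-file HTML conversion, and `run_pdftohtml("book.pdf", "book", as_xml=True)` the XML one. The resulting HTML file can then be added to calibre as described above.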
||