MobileRead Forums - View Single Post - PDF -> HTML conversion

roffLOL · 09-30-2011, 05:57 AM

Would anyone be interested in a PDF -> HTML converter? I'm currently developing one. For justified PDF documents with a single-page layout (one page of text per PDF page, not those documents with double columns), it will be able to:

retain:
Fonts
Paragraphs and indentations
Alignment
PDF's general logical structure with TOC
[graphics]

remove:
Page numbering
[possibly header and footer]
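
As a rough illustration of the "remove page numbering" item, here is a minimal sketch. It is not taken from the library in this post. It assumes the text has already been extracted into one list of lines per page, and that a page number is a first or last line holding only digits, optionally wrapped in dashes. The names `PAGE_NUMBER` and `strip_page_numbers` are hypothetical.

```python
import re

# Assumption: a page number is a line made only of digits, optionally
# wrapped in dashes, e.g. "12" or "- 12 -".
PAGE_NUMBER = re.compile(r"^\s*-?\s*\d+\s*-?\s*$")


def strip_page_numbers(pages):
    """Return a copy of pages (lists of text lines) without page-number lines.

    Only the first and last line of each page are checked, so numbers
    inside the body text are left alone.
    """
    cleaned = []
    for lines in pages:
        lines = list(lines)  # copy, so the caller's data is not modified
        if lines and PAGE_NUMBER.match(lines[-1]):
            lines.pop()      # page number in the footer position
        if lines and PAGE_NUMBER.match(lines[0]):
            lines.pop(0)     # page number in the header position
        cleaned.append(lines)
    return cleaned


pages = [
    ["Chapter 1", "It was a dark night.", "1"],
    ["- 2 -", "The rain fell.", "In 1984 nothing happened."],
]
print(strip_page_numbers(pages))
```

This is only a heuristic: a body line that happens to be a bare number at the top or bottom of a page would be removed as well, so a real converter would probably also look at the position of the line on the page.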

However, I am developing this library out of my own need, so I will stop working on it once it handles the case described (single page, justified PDF document).

Current status is 90% finished. In other words, only 90% of the development time left.

Say, a month.

The library is written in pure Python (2.6?).

//Humble greetings,
roffLOL

Last edited by roffLOL; 09-30-2011 at 09:17 AM. Reason: Clarification. Written before coffee o'clock.