05-04-2009, 04:23 PM   #46
dauwhe
<geek type="xml"/>
dauwhe has a complete set of Star Wars action figures.
 
Posts: 22
Karma: 276
Join Date: Dec 2008
Location: Greenfield, Massachusetts, USA
Device: Kindle, Kindle 2, Sony Reader, iPod Touch
Quote:
Originally Posted by tirsales View Post
Yes, but it should be possible to create XHTML and ePub from the original source rather than from the PDF, shouldn't it?
Or at least it should be possible to extract the text (or have the complete text in advance) and re-format that...
Things are slightly better with "application" files (InDesign, etc.). If the book was set by a decent typesetter, the split-paragraph problem shouldn't happen, for example. But I do remember a book where most of the text appeared twice when first extracted from Quark. The (bad) typesetter had left almost another complete copy of the book "hidden" in a text box, and the extraction program dutifully found all the text, hidden or not.

Dave