Old 01-25-2016, 11:23 AM   #1
1v4n0
Groupie
Posts: 173
Karma: 40000
Join Date: Oct 2013
Device: kindle
pdf to doc: best way?

Hello guys.

What is the best (and ideally the easiest) way to convert a text pdf to doc/odt/rtf? I mean with the line breaks in the right places too, not just saving to doc from Acrobat. The final goal is an epub.

I did it this way (Acrobat Pro -> save as doc) once, and then bulk-corrected all the line breaks with perfect epub. Not sure if I missed anything.

EDIT: apparently I did. It doesn't undo line breaks where a line ends with punctuation and the next one starts with a guillemet («), and it erases the dash when the new line starts with one (this is probably due to perfect epub's hyphenation regex). It also doesn't undo the line break when a page ends with a period and the next page starts with a capital letter.
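For anyone who wants to handle these cases by hand, here is a minimal Python sketch of the unwrapping step. It is not what perfect epub does internally. It assumes the text has already been exported to plain text and that real paragraphs are separated by blank lines, so every single line break inside a paragraph is treated as a wrap. The function name `unwrap` is made up for illustration.

```python
import re


def unwrap(text: str) -> str:
    """Join hard-wrapped lines into paragraphs.

    Assumption: blank lines separate paragraphs; any other line break
    is a wrap left over from the pdf layout.
    """
    paragraphs = re.split(r"\n\s*\n", text.strip())
    out = []
    for para in paragraphs:
        lines = [ln.strip() for ln in para.split("\n") if ln.strip()]
        joined = ""
        for ln in lines:
            if not joined:
                joined = ln
            elif (
                len(joined) > 1
                and joined.endswith("-")
                and joined[-2].isalpha()
                and ln[0].islower()
            ):
                # Word split by hyphenation: drop the hyphen and glue the halves.
                joined = joined[:-1] + ln
            else:
                # Plain wrap: join with a space. A leading dash or guillemet
                # on the new line is kept, and a period followed by a capital
                # letter (e.g. across a page break) is still joined.
                joined += " " + ln
        out.append(joined)
    return "\n\n".join(out)


sample = (
    "Disse qual-\ncosa.\n«Ciao» rispose.\n- Va bene.\n"
    "Fine pagina.\nNuova pagina.\n\nSecondo paragrafo."
)
print(unwrap(sample))
```

Note that the hyphen rule will also strip the hyphen from a genuine compound that happens to wrap at its hyphen, so the result still needs a quick check.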

EDIT2: Acrobat Pro -> save as HTML does a pretty good job.

Another time I ran the pdf through FineReader, but that was more complicated.

Thanks.

Last edited by 1v4n0; 02-14-2017 at 08:38 AM.