Originally Posted by AnemicOak
With Acrobat Pro I usually save as HTML or RTF, which usually allows formatting like italics to be kept.
I hadn't used Acrobat's save to HTML in a while. Previous versions weren't very good, which was part of the "nightmare of converting". But I have a newer version (10) now, so I tried it out.
It is fairly clean, better than before, and you are right that it saves bold and italics, but it still has some issues. On this very simple test page there are several formatting discrepancies that would need to be fixed. That's not impossible with search and replace, but it is very time-consuming. I would be hesitant to try it on anything more complex or longer than a simple page or two.
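For what it's worth, the search-and-replace part could probably be scripted instead of done by hand. Here is a rough Python sketch of the idea. The patterns in it are only hypothetical examples of typical export clutter (split italic/bold runs, empty tags, doubled spaces), not the actual discrepancies in my test page, and `clean_html` and `RULES` are just names I made up for illustration.

```python
import re

# Hypothetical cleanup rules. These are assumed examples of export clutter,
# not a list of what Acrobat actually produces.
RULES = [
    # Merge italic runs that were split in two: "</i> <i>" becomes " "
    (re.compile(r"</i>(\s*)<i>"), r"\1"),
    # Same for bold runs
    (re.compile(r"</b>(\s*)<b>"), r"\1"),
    # Drop empty inline tags such as "<b></b>" or "<span class='x'> </span>"
    (re.compile(r"<(i|b|span)\b[^>]*>\s*</\1>"), ""),
    # Collapse runs of spaces or tabs into a single space
    (re.compile(r"[ \t]{2,}"), " "),
]


def clean_html(text: str) -> str:
    """Apply each search-and-replace rule in order and return the result."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


sample = "<p><i>Sample</i> <i>OCR</i>  text <b></b>here.</p>"
print(clean_html(sample))  # -> <p><i>Sample OCR</i> text here.</p>
```

Regular expressions are a blunt tool for HTML, so something like this would only be reasonable for simple, predictable export output, and the result would still need to be proofread against the original.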
Sample OCR text.html