Quote:
Originally Posted by NovelFan
The books already have a text layer, I don't need OCR.
PS:
"I convert ebooks professionally."
How does one make money with that?
Umm... in a lot of cases, the text layer is itself produced by OCR so that the PDF can be searched. In more than 90% of the PDFs I've looked at, the extracted text layer is total crap and would need far more cleanup than I'm willing to do unpaid. Even with PDFs that are genuinely text-based, conversion tends to leave a lot of artifacts that have to be cleaned up manually. Kerned letter pairs and ligatures, for example, have a habit of disappearing in some conversions (suddenly "pallet" becomes "pa et").
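To give a rough idea of the easy half of that cleanup, here is a minimal Python sketch. It assumes the extracted text still contains the Unicode ligature characters (U+FB00 to U+FB06, such as "ﬁ" and "ﬂ") and simply expands them to plain letters. The function name expand_ligatures is made up for illustration. It cannot help with the "pa et" case, where the glyph was dropped entirely and there is nothing left to expand; that still needs a spell check or a manual pass.

```python
import unicodedata

# Latin ligature code points in the Alphabetic Presentation Forms block:
# U+FB00 (ff) through U+FB06 (st).
LIGATURE_FIRST = 0xFB00
LIGATURE_LAST = 0xFB06


def expand_ligatures(text: str) -> str:
    """Replace Latin ligature characters with their plain-letter spelling.

    Only the ligature range is touched, so other characters that a full
    NFKC pass would also rewrite (superscripts, full-width forms, ...)
    are left exactly as they were.
    """
    out = []
    for ch in text:
        if LIGATURE_FIRST <= ord(ch) <= LIGATURE_LAST:
            # Compatibility normalization maps e.g. U+FB01 to "fi".
            out.append(unicodedata.normalize("NFKC", ch))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    sample = "The \ufb01nal \ufb02oor plan is in the o\ufb03ce."
    print(expand_ligatures(sample))
```

Run it over the extracted text before any other cleanup, so that later search-and-replace or spell-check steps see ordinary letters.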