11-19-2018, 09:40 AM   #3
gg4u
Junior Member
gg4u understands when you whisper 'The dog barks at midnight.'
Posts: 7
Karma: 42206
Join Date: Nov 2018
Device: Kindle 8
Hi Jps,

Quote:
Have you tried using asciidoc (or the compatible asciidoctor) to convert the text into epub?
Which tool is asciidoc an option of? tesseract? pandoc?

Quote:
You have to apply some very light markup to the text to designate chapters. In return you automatically get hyperlinked table of contents.
So I should add it manually to the .txt file after running tesseract, right? Or can tesseract or pandoc guess the right markup from the white space between paragraphs?

Could you tell me what the right markup is?
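
To make the question concrete, this is roughly the kind of step I imagine doing on the tesseract .txt output. It is only a sketch under assumptions: I am assuming that a `== ` prefix is how asciidoc marks a section title, and the "Chapter N" pattern and the `mark_chapters` name are just my own hypothetical heuristic, not something from tesseract, pandoc or asciidoc.

```python
import re

# Hypothetical heuristic (an assumption, not part of any tool):
# a line such as "Chapter 1" or "CHAPTER 12: Title" starts a chapter.
CHAPTER_RE = re.compile(r"^\s*chapter\s+\d+\b.*$", re.IGNORECASE)


def mark_chapters(text: str) -> str:
    """Prefix chapter-looking lines with '== ' and leave all other lines alone."""
    out = []
    for line in text.splitlines():
        if CHAPTER_RE.match(line):
            # Surround the heading with blank lines so it stands on its own.
            out.append("")
            out.append("== " + line.strip())
            out.append("")
        else:
            out.append(line)
    return "\n".join(out)


if __name__ == "__main__":
    sample = "Chapter 1\nIt was a dark night.\nChapter 2\nThe dog barked."
    print(mark_chapters(sample))
```

If the chapters in the OCR text do not start with a recognizable line like this, I guess the headings would have to be marked by hand.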

I would like to KEEP the IMAGES that come out of ghostscript, like k2pdfopt attempts to do.

Does tesseract allow keeping images (or have some option to detect images and skip them during OCR processing)?