03-15-2020, 01:40 AM   #1760
vasilas7
Junior Member
Posts: 3
Karma: 42956
Join Date: Mar 2019
Device: Kindle 3 Keyboard
Quote:
Originally Posted by willus
Actually, that PDF is very good for processing. It's perfectly straight, consistent from page to page, and very clean. First, I recommend downloading the Greek Tesseract OCR data set and installing it per these instructions. Then you can run one of the following commands, depending on how large you want the text to be.

1. Separate each page into two but don't do any text re-flow.
k2pdfopt -grid 2x1 -n- -ocr t -lang grc source.pdf

2. Same, but with text re-flow.
k2pdfopt -grid 2x1 -fc- -n- -f2p 0 -wrap -ocr t -lang grc source.pdf

3. Even larger text (50% larger with -mag 1.5)
k2pdfopt -grid 2x1 -fc- -n- -f2p 0 -wrap -ocr t -lang grc -mag 1.5 source.pdf

I've attached the results of these three methods for just page 5 of your PDF. You'll notice the text is selectable and searchable, unlike your original.
Thank you so much, Willus. If I want to crop the borders, what do I have to do? I ask because in this PDF file the borders are not in exactly the same position on every page.