04-06-2015, 02:38 AM   #1031
RTL
Quote:
Originally Posted by willus
Yes--you can PM questions to me if you like, but it may be that there are options that can help you. Can you post or PM me a link to your source PDF?
Dear Willus,

Thanks for your reply.

Let me describe the book I need to read with the help of k2pdfopt:

(1) It is a big scanned PDF book, around 600 pages.
(2) I need most of the pages, but not all.
(3) It is a secured PDF file, so I cannot extract any pages from it.
(4) It is a bilingual book (source text and target translation).
(5) The source text is framed in a solid-lined box that occupies a quarter of the page, in either the upper left or the upper right, depending on whether the page number is odd or even. The target translation text flows around the framed source text.

My goals are:

(1) I do not need all of the pages, only most of them.
(2) I do not need the source text. I want to discard it and keep only the target text.

What I have got so far:

As the pages are divided into neither left-right nor upper-bottom sections, the k2pdfopt program cannot simply extract the target text. The upper part of the page is not reflowed at all; it is output as is. The bottom part (where there is no framed source text in the way) is reflowed quite well, but not entirely satisfactorily.
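To make the layout concrete, here is a rough Python sketch of the two regions on each page that contain only target text. It is only an illustration under my own assumptions: the framed box is taken to be exactly the upper half-width by half-height of the page, and I assume it sits on the left on odd pages (the `box_on_left_when_odd` flag flips that). The names `Box` and `target_regions` are made up for this example and are not part of k2pdfopt.

```python
from typing import List, NamedTuple


class Box(NamedTuple):
    """Rectangle in page fractions: (0, 0) is top-left, (1, 1) is bottom-right."""
    left: float
    top: float
    right: float
    bottom: float


def target_regions(page_number: int, box_on_left_when_odd: bool = True) -> List[Box]:
    """Return the regions that hold only target text, in reading order.

    Assumption: the framed source text fills exactly one upper quarter
    of the page, and its side alternates with page parity.
    """
    is_odd = page_number % 2 == 1
    box_on_left = is_odd == box_on_left_when_odd
    # Upper half: keep the side that is NOT occupied by the framed source text.
    upper = Box(0.5, 0.0, 1.0, 0.5) if box_on_left else Box(0.0, 0.0, 0.5, 0.5)
    # Lower half: full width, no framed box to avoid.
    lower = Box(0.0, 0.5, 1.0, 1.0)
    return [upper, lower]
```

On a real scan the box edges would have to be measured, since the frame is unlikely to sit at exactly half the page width and height.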

The markup functionality is very good and useful. But I need some interactive functionality, such as confirming by hand whether to use a page, and subdividing the page to pick out the useful area and discard the rest. Though it is laborious and time consuming, it is worth the time and effort for some really important and useful books.
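For the page-by-page confirmation, one simple approach would be to note the page numbers I want to keep and then turn them into a compact page list. The sketch below does only that second step. The function name `to_page_list` is hypothetical, and the comma-and-dash output format is an assumption about what the converter accepts, so please check it against the program's usage text.

```python
from typing import Iterable


def to_page_list(pages: Iterable[int]) -> str:
    """Compress page numbers into a string such as '1-3,5,9-12'."""
    ordered = sorted(set(pages))  # drop duplicates, put pages in order
    parts = []
    i = 0
    while i < len(ordered):
        start = end = ordered[i]
        # Extend the run while the next page is consecutive.
        while i + 1 < len(ordered) and ordered[i + 1] == end + 1:
            i += 1
            end = ordered[i]
        parts.append(str(start) if start == end else f"{start}-{end}")
        i += 1
    return ",".join(parts)
```

For example, `to_page_list([1, 2, 3, 5, 9, 10, 11, 12])` gives `"1-3,5,9-12"`.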
Attached Thumbnails
Name:	page-format.png