Old 09-18-2019, 10:08 AM   #2
theducks
Well trained by Cats
Posts: 31,094
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
That looks like a PDF conversion with more than the usual from-PDF problems (see the sticky in the Conversion forum).
IMHO, if someone sold it to you like that, return it and ask for your money back.


If you converted it from a PDF you bought (the bit about reading the sticky still applies), you really will need to learn basic REGEX (<< that is a clue to search MR for tutorials).
Judging from your example, this is a very dirty case. Even in a fairly clean case, you are looking at at least a dozen passes (with varied search terms) for the removal, and a bunch more to do the 'joins' after the removal.

IMHO 90% of those passes go: Search -> eyeball the find -> decide between Replace Next or Search (skip the replace). Repeat for that term until the end. New term. Do it all again.
This could take less than 1 hr if you are experienced with crafting REGEX terms.
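
For anyone who would rather script that loop than click through it, here is a rough Python sketch of one removal pass. The page-number pattern and the sample text are assumptions made up for illustration (I have not seen the actual file), and `demo_decide` is only a stand-in for the eyeballing step; `ask_user` is the interactive version.

```python
import re

# Assumed debris pattern: a line holding nothing but a 1-4 digit page number.
PAGE_NUMBER = re.compile(r"^[ \t]*\d{1,4}[ \t]*$\n?", re.MULTILINE)

def review_and_replace(text, pattern, replacement, decide):
    """Replace only the matches that decide(match) approves; keep the rest."""
    def choose(match):
        return replacement if decide(match) else match.group(0)
    return pattern.sub(choose, text)

def ask_user(match):
    """Interactive 'eyeball' step: show the find in context and ask."""
    s = match.string
    context = s[max(0, match.start() - 40):match.end() + 40]
    print(repr(context))
    return input("Replace? [y/N] ").strip().lower() == "y"

def demo_decide(match):
    # Non-interactive stand-in: skip a number that is known to be real text.
    return match.group(0).strip() != "1984"

# Hypothetical sample: one stray page number, one number that belongs to the text.
sample = (
    "The storm broke over the\n"
    "42\n"
    "harbour at dawn.\n"
    "His favourite novel was\n"
    "1984\n"
)
cleaned = review_and_replace(sample, PAGE_NUMBER, "", demo_decide)
print(cleaned)
```

Pass `ask_user` instead of `demo_decide` to approve each find by hand. Each new search term is another pattern run through the same function, which is where the dozen passes come from.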

Then you do your 'join' set of S&Rs to clean up the broken paragraphs.
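
A minimal sketch of what two such 'join' passes might look like in Python. The rules here are my own assumed heuristics, not a fixed recipe: a line break is treated as a false paragraph break when the text before it does not end in sentence punctuation and the text after it starts with a lowercase letter. The hyphen rule will also glue together genuine compounds such as "well-known", so those finds still need eyeballing.

```python
import re

# Assumed heuristic 1: a word split with a hyphen at a line end.
HYPHEN_BREAK = re.compile(r"(\w)-\n+([a-z])")

# Assumed heuristic 2: a break not preceded by sentence-ending punctuation
# (or another newline) and followed by a lowercase letter.
FALSE_BREAK = re.compile(r'(?<![.!?"\n])\n+(?=[a-z])')

def join_paragraphs(text):
    """Run the hyphen pass first, then the plain join pass."""
    text = HYPHEN_BREAK.sub(r"\1\2", text)
    return FALSE_BREAK.sub(" ", text)

# Hypothetical sample with one split word and one broken paragraph.
sample = (
    "It was a remark-\n"
    "able morning and the\n"
    "light was low.\n"
    "\n"
    "Nobody spoke.\n"
)
joined = join_paragraphs(sample)
print(joined)
```

The real paragraph break (after "low.") survives because it follows a full stop. Paragraphs that end without punctuation or start with a lowercase word will fool this, which is why the joins take several passes too.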

OTOH, if this was an OCR-scanned PDF, there are going to be tons of random errors that turn that hour into much longer, because you need to run the find for every variation of each term.
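
One way to cut down the number of variations you have to type is to build a single tolerant pattern from the clean term. This is only a sketch: the confusion groups below (l/I/1, O/0, S/5) are assumed examples of common OCR mix-ups, and "Old Mill Press" is a made-up header. A real scan will have its own set of errors that you only find by looking.

```python
import re

# Assumed groups of characters that OCR tends to confuse with each other.
CONFUSIONS = ["lI1", "O0", "S5"]

def ocr_tolerant(term):
    """Build a regex that matches the term with any character in a
    confusion group swapped for another member of the same group."""
    parts = []
    for ch in term:
        group = next((g for g in CONFUSIONS if ch in g), None)
        if group:
            parts.append("[" + re.escape(group) + "]")
        elif ch == " ":
            parts.append(r"\s+")  # OCR often doubles or mangles spaces
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

# Hypothetical running header and some OCR-damaged variants of it.
pattern = ocr_tolerant("Old Mill Press")
variants = ["Old Mill Press", "0ld Mi11 Press", "OId MiIl  Press"]
matched = [bool(pattern.fullmatch(v)) for v in variants]
print(pattern.pattern, matched)
```

A looser pattern also finds more false positives, so the eyeball-each-find step matters even more here.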