MobileRead Forums - View Single Post

Tex2002ans · 03-01-2021, 07:44 AM

Quote:

Originally Posted by FDPuthuff

I am preparing an MS Word document to be transformed into an E-pub doc.

I have these page numbers and header text I need to erase throughout the document. (As seen in the attached image)

Is there a regex line I could use to find all of these?

See my posts from 2016 in:

"Delete paragraphs in scanned books (S & R with regexes)"

I used regex to remove 5 different variations of "page numbers", leftover headers/footers, and other cruft.

I also broke all the regex down step-by-step + color-coded.

Once you learn the basic concepts, the regex from that thread can be adjusted to fit your specific case.
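As a rough illustration of the idea (not the exact regex from that thread), here is a minimal Python sketch. It assumes the document has already been split into one string per paragraph, and that the cruft falls into two shapes: a paragraph that is only a page number, and a paragraph that is only the running header, possibly with a page number on either side. The header text `THE EXAMPLE BOOK` and the function name `strip_cruft` are made up for the example; adjust both patterns to whatever your document actually contains.

```python
import re

# Hypothetical running header; replace with the header text in your own document.
RUNNING_HEADER = "THE EXAMPLE BOOK"

# A paragraph that is nothing but a page number, e.g. "12" or "- 12 -".
PAGE_NUMBER = re.compile(r"^\s*[-–]?\s*\d{1,4}\s*[-–]?\s*$")

# A paragraph that is only the running header, optionally with a
# page number before or after it, e.g. "THE EXAMPLE BOOK 13".
HEADER = re.compile(
    r"^\s*(?:\d{1,4}\s+)?" + re.escape(RUNNING_HEADER) + r"(?:\s+\d{1,4})?\s*$",
    re.IGNORECASE,
)

def strip_cruft(paragraphs):
    """Return the paragraphs that are neither a bare page number nor a running header."""
    return [
        p for p in paragraphs
        if not (PAGE_NUMBER.match(p) or HEADER.match(p))
    ]

sample = [
    "Chapter text.",
    "12",
    "THE EXAMPLE BOOK 13",
    "14 The Example Book",
    "In 1984 he left.",
    "- 15 -",
]
cleaned = strip_cruft(sample)
```

Because both patterns are anchored with `^` and `$`, a number inside a normal sentence ("In 1984 he left.") is left alone. A paragraph that legitimately consists of a lone number would still be removed, so review the matches before deleting anything.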

Quote:

Originally Posted by JSWolf

So yes, helping to fix the PDF conversion is helping the OP to fix a pirated eBook. I suggest not helping.

JSWolf, you really should stop this constant anti-"piracy" inquisition.

Quote:

Originally Posted by phillipgessert

Sorry folks, piracy didn’t even occur to me. What’s the etiquette here, should I remove my answer? Not that it was necessarily a particularly strong one.

Meh. Just answer the question.

If the OP actually links directly to piracy sites, then the mods would deal with it or lock the thread.

If you believe unfounded claims of "piracy", then just ignore the thread.

If you want to be helpful—and who knows who would stumble upon this thread and ALSO have X problem—then I always answer.

Quote:

Originally Posted by phillipgessert

My question is why is a supposed editor converting a PDF conversion into ePub when the eBook already exists on Amazon?

This happens all the time.

Sometimes the only copies left are scans/PDFs, and the author lost (or doesn't have access to) the source files.

For example, the author may have:

1. the original DOCX (maybe, if you're lucky)
3. the final PDF

But they don't have:

2. The source files (InDesign, etc.)

The final PDF is the only proofed copy.

The original + final documents are way too far apart (hundreds/thousands of changes could've occurred between 2->3).

So in many cases, it's easiest to work backwards from the PDF.

Quote:

Originally Posted by Turtle91

Yup - I concur with the origin of the image. The book linked on amazon is even available "for free" with kindle unlimited. I'm not sure why someone would be paying an editor for a book that's already published??

I get paid to re-clean older/bad conversions all the time.

If the initial conversion was a disaster, lots of errors may have been left in (Amazon KQNs, etc.).

(See that absolutely fantastic talk I linked to last year, "Building Ebooks that Last" + discussing cleaning up the backlist.)