02-26-2021, 12:22 PM | #1 |
Avid Learner
Posts: 39
Karma: 10
Join Date: Sep 2020
Location: Charleston, SC
Device: Kindle Fire, iPad
|
Specific RegEx Help
Hey all my smart peeps,
I have a specific regex question and was told this would be a better part of these boards to ask it. If not, just let me know. I am preparing an MS Word document to be converted into an EPUB. I have page numbers and header text that I need to erase throughout the document (as seen in the attached image). Is there a regex I could use to find all of these? |
02-26-2021, 01:01 PM | #2 |
Addict
Posts: 311
Karma: 3196258
Join Date: Oct 2015
Location: Madison, WI
Device: Kindle 5th Gen
|
I would try something like:
Code:
\n\n\d{1,}\n\n.*[a-z]\n\n
Last edited by phillipgessert; 02-26-2021 at 05:32 PM. |
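For anyone who wants to try the pattern from post #2 outside of Word, here is a minimal Python sketch. It assumes the document has been exported to plain text with a blank line between paragraphs, which the thread does not confirm. Word's own Find and Replace uses a different wildcard syntax, so the pattern will not paste into Word unchanged. The function name `strip_page_headers` and the sample text are made up for illustration.

```python
import re

# Pattern from post #2: blank line, a run of digits (the page number),
# blank line, one line of header text ending in a lowercase letter, blank line.
PAGE_HEADER = re.compile(r"\n\n\d{1,}\n\n.*[a-z]\n\n")


def strip_page_headers(text: str) -> str:
    """Remove page-number + header blocks (hypothetical helper, not from the thread)."""
    # Replace each match with one blank line so the paragraphs on either
    # side of the removed block stay separated.
    return PAGE_HEADER.sub("\n\n", text)


# Invented sample: a page break with a page number and a running header.
sample = (
    "end of one page.\n\n"
    "12\n\n"
    "Sample Book Title\n\n"
    "Start of the next page."
)

print(strip_page_headers(sample))
```

One caveat: the pattern also matches any ordinary paragraph that consists only of digits and is followed by a short line ending in a lowercase letter, so it is worth reviewing the matches before replacing all of them.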
02-26-2021, 10:37 PM | #3 |
Avid Learner
Posts: 39
Karma: 10
Join Date: Sep 2020
Location: Charleston, SC
Device: Kindle Fire, iPad
|
Thank you sir
|
02-27-2021, 02:33 PM | #4 |
the rook, bossing Never.
Posts: 11,173
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
Why aren't headers and page numbers part of the page style and thus easily removed globally with a few seconds' editing?
ANY GUI based wordprocessor I've used in nearly 30 years simply had check boxes to untick to delete that stuff. Including Word 2.0a, Office 4.3, Office 95, Office 2000, Office XP, Word 2003 and Word 2007. I still have all those. |
02-27-2021, 04:22 PM | #5 |
Grand Sorcerer
Posts: 12,177
Karma: 73448616
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
|
I have a hunch that this might be a downloaded DOC of a copyrighted book that someone wants to convert to an ebook.
|
02-27-2021, 04:40 PM | #6 |
Grand Sorcerer
Posts: 5,285
Karma: 98804578
Join Date: Apr 2011
Device: pb360
|
|
02-27-2021, 05:02 PM | #7 |
Bibliophagist
Posts: 35,513
Karma: 145557716
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Forma, Clara HD, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Looking at the text snippet in the OP's message, it appears to match with text in The Flight of the Griffin by C. M. Gray which was copyrighted in 2012 and close to text in An Uncommon Evening by Gecko posted in 2004.
Last edited by DNSB; 02-27-2021 at 05:04 PM. |
02-27-2021, 06:01 PM | #8 | |
Resident Curmudgeon
Posts: 74,044
Karma: 129333562
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
Here is the first chapter (legally).
http://pirategrl1014.blogspot.com/20...f-griffin.html
Here is the link to the eBook on Amazon US.
https://www.amazon.com/dp/B007TKUD7Y
My question is why is a supposed editor converting a PDF conversion into ePub when the eBook already exists on Amazon?
Last edited by JSWolf; 02-27-2021 at 06:15 PM. |
|
02-27-2021, 07:06 PM | #9 |
Addict
Posts: 311
Karma: 3196258
Join Date: Oct 2015
Location: Madison, WI
Device: Kindle 5th Gen
|
Sorry folks, piracy didn’t even occur to me. What’s the etiquette here, should I remove my answer? Not that it was necessarily a particularly strong one.
|
02-28-2021, 06:44 AM | #10 |
the rook, bossing Never.
Posts: 11,173
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
|
Well, that explains the stupidity of not editing the source. The OP just needs to buy the real ebook.
|
02-28-2021, 07:07 AM | #11 |
Grand Sorcerer
Posts: 12,177
Karma: 73448616
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
|
The OP's profile links to a web site. He is a copy editor / proof reader / ebook formatter.
|
02-28-2021, 06:19 PM | #12 |
A Hairy Wizard
Posts: 3,101
Karma: 18727053
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
|
Yup - I concur with the origin of the image. The book linked on Amazon is even available "for free" with Kindle Unlimited. I'm not sure why someone would be paying an editor for a book that's already published.
|
02-28-2021, 07:03 PM | #13 |
Grand Sorcerer
Posts: 5,285
Karma: 98804578
Join Date: Apr 2011
Device: pb360
|
In light of DNSB's finding, things certainly look suspicious.
|
03-01-2021, 06:44 AM | #14 | |||||
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
"Delete paragraphs in scanned books (S & R with regexes)"
I used regex to remove 5 different variations of "page numbers", leftover headers/footers, and other cruft. I also broke all the regex down step-by-step + color-coded.
Once you learn the basic concepts, the regex from that thread can be adjusted to fit your specific case.
Quote:
Quote:
If the OP actually links directly to piracy sites, then the mods would deal with it or lock the thread.
If you believe unfounded claims of "piracy", then just ignore the thread.
If you want to be helpful (and who knows who would stumble upon this thread and ALSO have X problem), then I always answer.
Quote:
Sometimes the only copies left are scans/PDFs, and the author lost (or doesn't have access to) the source files. For example, the author may have:
1. the original DOCX (maybe, if you're lucky)
3. the final PDF
But they don't have:
2. The source files (InDesign, etc.)
The final PDF is the only proofed copy. The original + final documents are way too far apart (hundreds/thousands of changes could've occurred between 2->3). So many times, it's easiest to work backwards from the PDF.
Quote:
If the initial conversion was a disaster, lots of errors were left in (Amazon KQNs, etc. etc.).
(See that absolutely fantastic talk I linked to last year, "Building Ebooks that Last" + discussing cleaning up the backlist.)
Last edited by Tex2002ans; 03-01-2021 at 07:48 AM. |
|||||
03-01-2021, 07:04 AM | #15 |
Resident Curmudgeon
Posts: 74,044
Karma: 129333562
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
It's not an anti-piracy inquisition. I found the eBook is already published at Amazon and available to either buy or borrow from Kindle Unlimited. Why would an editor be asked for help to edit an eBook that's already edited/published and available? If this isn't piracy, what is?
I did not ask if this was a pirated eBook. I did not make any claims without evidence. I found a sample chapter of this eBook and then traced the full eBook to Amazon.
Last edited by JSWolf; 03-01-2021 at 07:06 AM. |
|