Old 02-26-2021, 12:22 PM   #1
FDPuthuff
Avid Learner
FDPuthuff began at the beginning.
 
Posts: 39
Karma: 10
Join Date: Sep 2020
Location: Charleston, SC
Device: Kindle Fire, iPad
Specific RegEx Help

Hey all my smart peeps,

I have a specific regex question and was told this would be a better part of these boards to ask it. If not, just let me know.

I am preparing an MS Word document to be transformed into an E-pub doc.

I have these page numbers and header text I need to erase throughout the document. (As seen in the attached image)

Is there a regex line I could use to find all of these?
Attached image: WordImage.jpg (159.2 KB)
Old 02-26-2021, 01:01 PM   #2
phillipgessert
Addict
phillipgessert ought to be getting tired of karma fortunes by now.
Posts: 311
Karma: 3196258
Join Date: Oct 2015
Location: Madison, WI
Device: Kindle 5th Gen
I would try something like:

Code:
\n\n\d{1,}\n\n.*[a-z]\n\n
That is: two newlines, one or more digits, two more newlines, a string ending in a lowercase letter, then two newlines. It'll take some wrangling to rework that into Word's "regex" (its wildcard search uses a different syntax), but if you move the text into another editor with regex support, it ought to work directly. Take care, though, if any of those headers end in anything other than a lowercase letter (trailing whitespace, punctuation mark, etc.); those won't match.
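For anyone who wants to try the pattern outside Word, here is a minimal Python sketch. It assumes the document has been exported to plain text with one blank line between paragraphs. The function name and the sample text are made up for illustration and are not taken from the attachment.

```python
import re

# Pattern from the post above: blank line, page number, blank line,
# a header line ending in a lowercase letter, blank line.
PAGE_HEADER = re.compile(r"\n\n\d{1,}\n\n.*[a-z]\n\n")


def strip_page_headers(text: str) -> str:
    """Remove page-number + header pairs, leaving one paragraph break."""
    # The match swallows the blank lines on both sides, so put a single
    # paragraph break back to keep the surrounding text separated.
    return PAGE_HEADER.sub("\n\n", text)


# Hypothetical sample text (not from the thread's attachment).
sample = (
    "The end of one page of body text.\n\n"
    "12\n\n"
    "Some Running Header\n\n"
    "The start of the next page of body text."
)

print(strip_page_headers(sample))
```

As the caveat says, a header ending in punctuation or trailing whitespace is left alone by this pattern, so it is worth checking what remains after a first pass.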

Last edited by phillipgessert; 02-26-2021 at 05:32 PM.
Old 02-26-2021, 10:37 PM   #3
FDPuthuff
Avid Learner
FDPuthuff began at the beginning.
 
Posts: 39
Karma: 10
Join Date: Sep 2020
Location: Charleston, SC
Device: Kindle Fire, iPad
Thank you sir
Old 02-27-2021, 02:33 PM   #4
Quoth
the rook, bossing Never.
Quoth ought to be getting tired of karma fortunes by now.
Posts: 11,173
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
Why aren't headers and page numbers part of the page style, and thus easily removed globally with a few seconds of editing?

ANY GUI-based word processor I've used in nearly 30 years simply had check boxes to untick to delete that stuff, including Word 2.0a, Office 4.3, Office 95, Office 2000, Office XP, Word 2003 and Word 2007. I still have all those.
Old 02-27-2021, 04:22 PM   #5
PeterT
Grand Sorcerer
PeterT ought to be getting tired of karma fortunes by now.
Posts: 12,177
Karma: 73448616
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
I have a hunch that this might be a downloaded DOC of a copyrighted book that someone wants to convert to an ebook.
Old 02-27-2021, 04:40 PM   #6
j.p.s
Grand Sorcerer
j.p.s ought to be getting tired of karma fortunes by now.
 
Posts: 5,285
Karma: 98804578
Join Date: Apr 2011
Device: pb360
Quote:
Originally Posted by PeterT View Post
I have a hunch that this might be a downloaded DOC of a copyrighted book that someone wants to convert to an ebook.
That was my first reaction, but looking at the OP more closely, isn't it more likely a manuscript sent to an editor by a clueless author?
Old 02-27-2021, 05:02 PM   #7
DNSB
Bibliophagist
DNSB ought to be getting tired of karma fortunes by now.
Posts: 35,513
Karma: 145557716
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Forma, Clara HD, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Looking at the text snippet in the OP's message, it appears to match with text in The Flight of the Griffin by C. M. Gray which was copyrighted in 2012 and close to text in An Uncommon Evening by Gecko posted in 2004.

Last edited by DNSB; 02-27-2021 at 05:04 PM.
Old 02-27-2021, 06:01 PM   #8
JSWolf
Resident Curmudgeon
JSWolf ought to be getting tired of karma fortunes by now.
Posts: 74,044
Karma: 129333562
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by PeterT View Post
I have a hunch that this might be a downloaded DOC of a copyrighted book that someone wants to convert to an ebook.
It looks like a PDF conversion from a book titled The Flight of the Griffin by C.M. Gray. It's available on Kindle Unlimited. So yes, helping to fix the PDF conversion is helping the OP to fix a pirated eBook. I suggest not helping.

Here is the first chapter (legally).
http://pirategrl1014.blogspot.com/20...f-griffin.html

Here is the link to the eBook on Amazon US.
https://www.amazon.com/dp/B007TKUD7Y

My question is why is a supposed editor converting a PDF conversion into ePub when the eBook already exists on Amazon?

Last edited by JSWolf; 02-27-2021 at 06:15 PM.
Old 02-27-2021, 07:06 PM   #9
phillipgessert
Addict
phillipgessert ought to be getting tired of karma fortunes by now.
Posts: 311
Karma: 3196258
Join Date: Oct 2015
Location: Madison, WI
Device: Kindle 5th Gen
Sorry folks, piracy didn’t even occur to me. What’s the etiquette here, should I remove my answer? Not that it was necessarily a particularly strong one.
Old 02-28-2021, 06:44 AM   #10
Quoth
the rook, bossing Never.
Quoth ought to be getting tired of karma fortunes by now.
Posts: 11,173
Karma: 85874891
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper11
Well, that explains the stupidity of not editing the source. The OP just needs to buy the real ebook.
Old 02-28-2021, 07:07 AM   #11
PeterT
Grand Sorcerer
PeterT ought to be getting tired of karma fortunes by now.
Posts: 12,177
Karma: 73448616
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
The OP's profile links to a web site. He is a copy editor / proofreader / ebook formatter.
Old 02-28-2021, 06:19 PM   #12
Turtle91
A Hairy Wizard
Turtle91 ought to be getting tired of karma fortunes by now.
Posts: 3,101
Karma: 18727053
Join Date: Dec 2012
Location: Charleston, SC today
Device: iPhone 11/X/6/iPad 1,2,Air & Air Pro/Surface Pro/Kindle PW & Fire
Quote:
Originally Posted by DNSB View Post
Looking at the text snippet in the OP's message, it appears to match with text in The Flight of the Griffin by C. M. Gray which was copyrighted in 2012 and close to text in An Uncommon Evening by Gecko posted in 2004.
Yup - I concur with the origin of the image. The book linked on amazon is even available "for free" with kindle unlimited. I'm not sure why someone would be paying an editor for a book that's already published??
Old 02-28-2021, 07:03 PM   #13
j.p.s
Grand Sorcerer
j.p.s ought to be getting tired of karma fortunes by now.
 
Posts: 5,285
Karma: 98804578
Join Date: Apr 2011
Device: pb360
In light of DNSB's finding, things certainly look suspicious.
Old 03-01-2021, 06:44 AM   #14
Tex2002ans
Wizard
Tex2002ans ought to be getting tired of karma fortunes by now.
 
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by FDPuthuff View Post
I am preparing an MS Word document to be transformed into an E-pub doc.

I have these page numbers and header text I need to erase throughout the document. (As seen in the attached image)

Is there a regex line I could use to find all of these?
See my posts from 2016 in:

"Delete paragraphs in scanned books (S & R with regexes)"

I used regex to remove 5 different variations of "page numbers", leftover headers/footers, and other cruft.

I also broke all the regex down step-by-step + color-coded.

Once you learn the basic concepts, the regex from that thread can be adjusted to fit your specific case.
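As one illustration of that kind of adjustment, here is a hypothetical looser variant of the pattern posted earlier in this thread. This is only a sketch and is not the regex from the 2016 thread. It assumes plain text with blank lines between paragraphs, and the names are invented for the example.

```python
import re

# Hypothetical looser variant of the pattern from post #2:
#   [ \t]*  tolerates trailing spaces or tabs after the page number
#   [^\n]+  accepts a header line ending in any character, not just a-z
LOOSE_PAGE_HEADER = re.compile(r"\n\n\d+[ \t]*\n\n[^\n]+\n\n")


def strip_loose(text: str) -> str:
    """Remove number + header pairs, leaving a single paragraph break."""
    return LOOSE_PAGE_HEADER.sub("\n\n", text)


# Hypothetical sample: trailing space after the number, header ends in "!".
sample = "End of a page.\n\n12 \n\nChapter One!\n\nStart of the next page."
print(strip_loose(sample))
```

The trade-off is that a looser pattern will also remove any one-line paragraph that happens to follow a number on its own line, so it is safer to step through the matches than to replace them all blindly.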

Quote:
Originally Posted by JSWolf View Post
So yes, helping to fix the PDF conversion is helping the OP to fix a pirated eBook. I suggest not helping.
JSWolf, you really should stop this constant anti-"piracy" inquisition.

Quote:
Originally Posted by phillipgessert View Post
Sorry folks, piracy didn’t even occur to me. What’s the etiquette here, should I remove my answer? Not that it was necessarily a particularly strong one.
Meh. Just answer the question.

If the OP actually links directly to piracy sites, then the mods would deal with it or lock the thread.

If you believe unfounded claims of "piracy", then just ignore the thread.

If you want to be helpful—and who knows who would stumble upon this thread and ALSO have X problem—then I always answer.

Quote:
Originally Posted by JSWolf View Post
My question is why is a supposed editor converting a PDF conversion into ePub when the eBook already exists on Amazon?
This happens all the time.

Sometimes the only copies left are scans/PDFs, and the author lost (or doesn't have access to) the source files.

For example, the author may have:

1. the original DOCX (maybe, if you're lucky)
3. the final PDF

But they don't have:

2. The source files (InDesign, etc.)

The final PDF is the only proofed copy.

The original + final documents are way too far apart (hundreds/thousands of changes could've occurred between 2->3).

So in many cases, it's easiest to work backwards from the PDF.

Quote:
Originally Posted by Turtle91 View Post
Yup - I concur with the origin of the image. The book linked on amazon is even available "for free" with kindle unlimited. I'm not sure why someone would be paying an editor for a book that's already published??
I get paid to re-clean older/bad conversions all the time.

If the initial conversion was a disaster, lots of errors were left in (Amazon KQNs, etc. etc.).

(See that absolutely fantastic talk I linked to last year, "Building Ebooks that Last" + discussing cleaning up the backlist.)

Last edited by Tex2002ans; 03-01-2021 at 07:48 AM.
Old 03-01-2021, 07:04 AM   #15
JSWolf
Resident Curmudgeon
JSWolf ought to be getting tired of karma fortunes by now.
Posts: 74,044
Karma: 129333562
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
It's not an anti-piracy inquisition. I found the eBook is already published at Amazon and available to either buy or borrow from Kindle Unlimited. Why would an editor be asked for help to edit an eBook that's already edited/published and available? If this isn't piracy, what is?

I did not ask if this was a pirated eBook. I did not make any claims without evidence. I found a sample chapter of this eBook and then traced the full eBook to Amazon.

Last edited by JSWolf; 03-01-2021 at 07:06 AM.