Old 03-01-2021, 07:08 AM   #16
JSWolf
Resident Curmudgeon
Posts: 74,037
Karma: 129333114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by PeterT View Post
The OP's profile links to a web site. He is a copy editor / proofreader / ebook formatter.
So why would the OP be asked for help on editing a PDF conversion of an eBook that's already published? TBH, this stinks of piracy.

The OP really needs to come back and explain himself.
Old 03-01-2021, 07:28 AM   #17
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by JSWolf View Post
So why would the OP be asked for help on editing a PDF conversion of an eBook that's already published? TBH, this stinks of piracy.
Read the later part of my post more closely. I explained multiple reasons why.

And *gasp*, I do such PDF conversions too!!!

Heck, last year I also did a journal. Many of the original articles "already existed for free as HTML", but whoever did it busted the conversion so bad that lots of the formatting/italics were missing.

Many times, it's easier to go back to original scans/PDF and redo them correctly.

Quote:
Originally Posted by JSWolf View Post
The OP really needs to come back and explain himself.
No. They don't have to explain anything.

Again, stop pushing the inquisition.

If you think it's piracy, silently report to the mods, then FDPuthuff can further explain TO THEM if needed.

Last edited by Tex2002ans; 03-01-2021 at 07:50 AM.
Old 03-01-2021, 08:47 AM   #18
FDPuthuff
Avid Learner
Posts: 39
Karma: 10
Join Date: Sep 2020
Location: Charleston, SC
Device: Kindle Fire, iPad
WOW! That escalated quickly.



Tex2002ans, Thank you sir for trying to let cooler heads prevail.


Something my website does not mention yet is that I also do proof-listening for audio-books. I was sent audio files for a couple of chapters of the book, along with PDF files to check the audio against.

I have been trying to wrap my head around regex and was just curious how it might be used to save time removing all the page numbers and header text that does not get converted into proper header info in the Word file.

I appreciate the concern, but it is truly just a question about something I figure I will run into as I streamline my process of converting PDFs.
Old 03-01-2021, 09:41 AM   #19
JSWolf
Resident Curmudgeon
But why the hassle of converting from PDF when there is already a version of this eBook in reflowable format that you can get from Amazon?
Old 03-01-2021, 09:48 AM   #20
jhowell
Grand Sorcerer
Posts: 6,498
Karma: 84420419
Join Date: Nov 2011
Location: Tampa Bay, Florida
Device: Kindles
Quote:
Originally Posted by JSWolf View Post
But why the hassle of converting from PDF when there is already a version of this eBook in reflowable format that you can get from Amazon?
FDPuthuff explained that his task is to compare against the content of the PDF.
Old 03-01-2021, 12:08 PM   #21
DNSB
Bibliophagist
Posts: 35,513
Karma: 145557716
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Forma, Clara HD, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by JSWolf View Post
But why the hassle of converting from PDF when there is already a version of this eBook in reflowable format that you can get from Amazon?
Perhaps the PDF is an edited version of the book?

Anyhow, since the situation has been explained, time to drop the topic.
Old 03-01-2021, 02:43 PM   #22
Tex2002ans
Wizard
Quote:
Originally Posted by FDPuthuff View Post
Tex2002ans, Thank you sir for trying to let cooler heads prevail.


Quote:
Originally Posted by FDPuthuff View Post
I have been trying to wrap my head around regex and was just curious how it might be used to save time removing all the page numbers and header stuff which does not get converted into proper header info in the Word file.
Once you understand the WHY/HOW of regex + thinking in "search by pattern"... you'll be able to work much more efficiently.

For example, instead of doing dozens of individual searches for:
  • Find "Page 123"
  • Find "Page 124"
  • [...]
  • Find "Page 256"
  • Find "Tom" and change to "Smith"
    • Now you'll get "Tomorrow" -> "Smithorrow"!!!
    • And "Tomas's" -> "Smithas's"

Instead, regex lets you search for patterns/categories:
  • Find the word "Page" + followed by any numbers.
    • Regex Search: Page \d+
    • Page = find "Page"
    • \d = any single digit
    • + = "one or more"
  • Find "Tom" + with or without an apostrophe s.
    • Regex Search: \bTom'*s*\b
    • \b = make sure this is the edge of a word
    • Tom = find "Tom"
    • ' = find the apostrophe
    • * = "zero or more"
    • s = find the "s"
    • * = "zero or more"
    • \b = make sure this is the edge of a word
      • So now it'll hit "Tom" + "Tom's" (and, strictly speaking, "Toms" too)
      • and NOT "Tomorrow" + "Tomas"
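
The two searches above can be tried out in a few lines of Python. This is only a sketch: it assumes Python's built-in re module, the sample sentence and function names are made up for illustration, and other tools may differ in small details. The "Tom" pattern here is a slightly stricter variant, \bTom('s)?\b, which treats the 's as one optional unit so it can be carried over into the replacement.

```python
import re

# Hypothetical sample text, made up for this illustration.
SAMPLE = "Page 123 Tom said Tom's plan for Tomorrow suits Tomas. Page 124"

def find_page_markers(text):
    """Return every 'Page <digits>' match found in the text."""
    return re.findall(r"Page \d+", text)

def rename_tom(text):
    """Change 'Tom' / "Tom's" to 'Smith' / "Smith's"; leave 'Tomorrow' and 'Tomas' alone."""
    # ('s)? captures the optional possessive so it can be put back after 'Smith'.
    return re.sub(r"\bTom('s)?\b", lambda m: "Smith" + (m.group(1) or ""), text)

print(find_page_markers(SAMPLE))
print(rename_tom(SAMPLE))
```

The word boundaries (\b) are what stop "Tomorrow" from turning into "Smithorrow": after "Tom" the next character is another letter, so there is no word edge and the match fails.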

Word's regex uses slightly different symbols from Calibre/Sigil, but all the same concepts apply.

Quote:
Originally Posted by FDPuthuff View Post
Something my website does not mention yet is that I also do proof-listening for audio-books. I was sent audio files for a couple of chapters of the book, along with PDF files to check the audio against.
Fantastic, fantastic.

You may also want to check out these audiobook talks given at ebookcraft 2019 (a yearly ebook conference):

(I'll get around to summarizing all the info from these talks one of these days... lol.)

Quote:
Originally Posted by FDPuthuff View Post
I appreciate the concern, but it is truly just a question about something I figure I will run into as I am streamlining my process of converting PDFs.
If you only care about the text... k2pdfopt can crop headers/footers right out of the PDF.

Willus is the master there...

See his program/thread: "k2pdfopt: optimizes PDFs for viewing on e-readers".

He's extremely helpful/responsive, and has helped hundreds (thousands?) of people crop their PDFs.

You could also see some slightly related discussion/tangents in this thread:

But I'd learn the regex method. It'll be infinitely more efficient in the long-run (and applicable to actual editing/copyediting too!).
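
For the page number / header cleanup itself, a rough sketch of the regex approach in Python. It assumes the converted text puts each page number or running header on a line of its own; the header "MY BOOK TITLE", the sample text, and the function name are all hypothetical, and a real conversion will need its own pattern.

```python
import re

# Hypothetical converted text: running header and page-number lines mixed into the prose.
SAMPLE = """It was a dark night.
MY BOOK TITLE
Page 12
The rain kept falling.
13
She closed the door."""

# Matches a whole line that is only the running header, "Page <digits>", or bare digits.
# re.MULTILINE makes ^ and $ match at the start and end of every line.
JUNK_LINE = re.compile(r"^(?:MY BOOK TITLE|Page \d+|\d+)[ \t]*$\n?", re.MULTILINE)

def strip_page_furniture(text):
    """Delete lines that contain nothing but a running header or a page number."""
    return JUNK_LINE.sub("", text)

print(strip_page_furniture(SAMPLE))
```

Anchoring with ^ and $ matters here: a line such as "13 dogs ran" starts with digits but is left untouched, because the pattern has to account for the entire line.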

Quote:
Originally Posted by JSWolf View Post
But why the hassle of converting from PDF when there is already a version of this eBook in reflowable format that you can get from Amazon?
PDF is usually the proof copy.

As another example:

I JUST completed Book #5 for an author... It was supposed to include 1 chapter from each of his previous 4 books (+ new Foreword/Intro).

Somewhere along the line, the ebook vs. print for #1-4 became wildly out of sync.

I call this the great "bifurcation". See my posts in:

The author only proofed the PDFs, and gave lots of wording/format changes here and there.

When I opened the EPUB/MOBIs, they were missing italics, em dashes, commas, bold labels in the captions, etc.

So what should've been a simple copy/paste of each HTML chapter from #1-4... became a mess.

And the only way to untangle it was to redo everything from the other formats.

Last edited by Tex2002ans; 03-01-2021 at 05:51 PM.