Old 02-22-2023, 12:58 AM   #1
F4in7_
Need help adding a few steps to my Calibre clean-up routine.

I use Calibre to remove publisher ads, make ellipses consistent and properly formatted, and make tags consistent so that some of my saved searches work properly. However, there are a few things I'm trying to figure out but can't get to work.

- Changing the first few words at the beginning of a chapter (not the chapter name itself) from uppercase to sentence case. Sometimes the text after a page break has this too, so it would be great to be able to detect these cases using regular expressions or other methods.
For example, "I'LL FIGURE SOMETHING out." to "I'll figure something out."
- Ensuring quotation marks aren't missing and are actually paired with each other.
For example, What now?” to “What now?”.

Finally, this is a bit of a stretch, but I mentioned that I try to make tags consistent so that my regexes work properly. I use a Calibre plug-in called 'Reformat plug-in'. Once I'm satisfied that I've removed everything that needs removing from the book, I delete all stylesheets and run the plug-in.

Now, I know what you're thinking, and yes, it will nuke the intended structure of the book. But honestly, I've almost never had a problem with how it looks. The only thing holding me back from using it all the time is that sometimes paragraphs that are meant to be centered aren't, and numbered bullet points are spaced out too much. If anyone's familiar with the plug-in, or with a similar method that would resolve some of those issues, I would love to hear it.
Old 02-22-2023, 01:27 AM   #2
Karellen
Quote:
Originally Posted by F4in7_ View Post
Now, I know what you're thinking, and yes, it will nuke the intended structure of the book. But honestly, I've almost never had a problem with how it looks. The only thing holding me back from using it all the time is that sometimes paragraphs that are meant to be centered aren't, and numbered bullet points are spaced out too much. If anyone's familiar with the plug-in, or with a similar method that would resolve some of those issues, I would love to hear it.
Yep, that's exactly what I was thinking. How does your epub work if there is no stylesheet? I am surprised the reader will open the book with a major component missing. I guess the reader is just falling back to default values, but as you noticed, that kills off any styling that differs from the default.

Rather than your hit-and-miss approach, maybe teach yourself CSS and HTML (it's not hard) and fix the book in a correct manner. Sometimes you just need to jump in, as you can't rely on plug-ins and other automated approaches to fix everything.

As for uppercase words, you might find some useful regex in here... https://www.mobileread.com/forums/sh...d.php?t=352157
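For illustration only, and not taken from the linked thread: a minimal Python sketch of the uppercase fix, assuming the paragraph text has already been pulled out of its HTML tags. The function name `sentence_case_opening` and the pattern are hypothetical. It only touches a run of two or more all-caps words at the very start of the text, and it will wrongly lowercase names and acronyms inside that run, so the results still need a manual look.

```python
import re

# Two or more all-caps words at the very start of the text.
# Straight and curly apostrophes are allowed inside a word (e.g. I'LL).
# The lookahead makes sure the run ends on a whole word.
LEADING_CAPS = re.compile(
    r"^[A-Z][A-Z'\u2019]*(?:\s+[A-Z][A-Z'\u2019]*)+(?![\w'\u2019])"
)


def sentence_case_opening(text: str) -> str:
    """Convert a leading all-caps run to sentence case; leave the rest alone."""

    def fix(match: re.Match) -> str:
        # capitalize() keeps the first letter upper and lowercases the rest.
        lowered = match.group(0).capitalize()
        # Restore the pronoun "I" where it stands alone (also covers "i'll").
        return re.sub(r"\bi\b", "I", lowered)

    return LEADING_CAPS.sub(fix, text, count=1)


print(sentence_case_opening("I'LL FIGURE SOMETHING out."))
```

The same logic could probably be adapted to the Function mode of the Calibre editor's search and replace, but that is untested here.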

Quotation marks? Not sure how you could automate finding missing quotation marks. There would be too many false positives from words like it's, students’, etc.
Old 02-22-2023, 01:59 AM   #3
F4in7_
Quote:
Originally Posted by Karellen View Post
Yep, that's exactly what I was thinking. How does your epub work if there is no stylesheet? I am surprised the reader will open the book with a major component missing. I guess the reader is just falling back to default values, but as you noticed, that kills off any styling that differs from the default.

Rather than your hit-and-miss approach, maybe teach yourself CSS and HTML (it's not hard) and fix the book in a correct manner. Sometimes you just need to jump in, as you can't rely on plug-ins and other automated approaches to fix everything.

As for uppercase words, you might find some useful regex in here... https://www.mobileread.com/forums/sh...d.php?t=352157

Quotation marks? Not sure how you could automate finding missing quotation marks. There would be too many false positives from words like it's, students’, etc.
Understandable. The plug-in does generate a stylesheet that, in my experience so far, tries to replicate the intended structure of the original format whilst simplifying the tags.

The uppercase words helped make it easier, so thanks for that.

As for quotation marks, I'm only looking to fix double quotation marks. Would that still be hard?
Old 02-22-2023, 02:21 AM   #4
BetterRed
I use Toxaris' ePUB Tools add-on for MS Word to detect and correct errors in dialogue quote marks. The Transtools Word add-on also has a dialogue-checking feature, but I've not used it.

I am not aware of any similar tools that work on other formats.

BR
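As a rough illustration of the double-quote case raised earlier in the thread, here is a minimal Python sketch that only reports suspect paragraphs rather than fixing them. It assumes the book uses curly double quotes (“ and ”) and that each paragraph is available as plain text; the name `quote_problem` is hypothetical. Because it ignores single quotes, apostrophes such as it's or students’ do not trigger it. A speech that runs over several paragraphs, where only the last paragraph carries the closing quote, will be flagged even though it is correct, so every hit needs a manual check.

```python
from typing import Optional

OPEN, CLOSE = "\u201c", "\u201d"  # curly double quotes: “ and ”


def quote_problem(paragraph: str) -> Optional[str]:
    """Describe the first double-quote pairing problem, or return None."""
    inside = False  # True while we are between an opening and a closing quote
    for ch in paragraph:
        if ch == OPEN:
            if inside:
                return "opening quote inside an open quote"
            inside = True
        elif ch == CLOSE:
            if not inside:
                return "closing quote without an opening quote"
            inside = False
    return "opening quote never closed" if inside else None


for para in ["What now?\u201d", "\u201cWhat now?\u201d"]:
    print(repr(para), "->", quote_problem(para))
```

Reporting rather than auto-inserting the missing mark is a deliberate choice in this sketch: the code can tell that a quote is unpaired, but not reliably where the missing one belongs.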