batch processing regex search/replace? - Page 2

G2B · 11-11-2020, 10:37 AM

Quote:

Originally Posted by retiredbiker

There is actually a sticky for that: https://www.mobileread.com/forums/sh...d.php?t=237181

Thank you for that.

I do realize that calibre does not always use the same tags for the same situations. It depends on how many books I merge together, which means some searches have to be edited.
But I do run a few search/replace commands that are the same for every book. This is my list, copied from a saved search file.

Spoiler:

Then there is also ([a-zI])linebreak-tags([a-zA-Z]) replaced with \1 \2, and similar searches for commas and the like, to remove mid-sentence line breaks.
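As a rough illustration of that line-break search, here is a minimal Python sketch. The post leaves "linebreak-tags" as a placeholder, so the sketch assumes the stray break is a closing </p> followed by an opening <p ...>; that pattern, the function name and the sample text are all assumptions, not the poster's actual saved search.

```python
import re

# Assumption: the unwanted "line break" is a paragraph close/open pair.
# Swap this for whatever tags your book actually uses.
LINEBREAK_TAGS = r'</p>\s*<p[^>]*>'

# A lowercase letter or the pronoun "I" before the break, any letter after it.
MID_SENTENCE_BREAK = re.compile(r'([a-zI])' + LINEBREAK_TAGS + r'([a-zA-Z])')


def join_broken_sentences(html: str) -> str:
    """Replace a mid-sentence paragraph break with a single space."""
    return MID_SENTENCE_BREAK.sub(r'\1 \2', html)


sample = ('<p class="calibre1">He walked to the</p>\n'
          '<p class="calibre1">door and stopped.</p>')
joined = join_broken_sentences(sample)
print(joined)
```

Because the character before the break must be a letter, a paragraph that ends in a full stop is left alone.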

I search for chapter headers and change them to <h2> for 'parts' and <h3> for all the regular chapters. I use <h4> for subchapters that I don't want in the TOC. Since those will be different with each book, I cannot make a general rule for them.

After that it becomes a matter of stylesheet line searches to remove the remaining nonsense rules, until I am left with only my standard ones, which are the same for every book I edit. Some ebooks have multiple tags on every line that cancel each other out, a waste of space that only increases the size of the book needlessly.

Lately, I have started merging all the novels or series from one author together and processing that in bulk. I find it goes faster than doing each one separately. I rename the files per book with the copyright date. After cleaning up the css, I export everything and bring everything up to top-level. Because of the renaming, I get the novels sorted chronologically and the series by name.

I am doing this to get rid of the interminable listings in my ebook-reader when all the books are separate. This way I get one listing for novels. (I used to group them by 5 or 6, but even that can be quite long.)

phossler · 11-11-2020, 10:58 AM

Quote:

I use <i>, <b> and <u> <br/> and <hr/> without classes.
The above alone can get rid of dozens of classes and sometimes reduce the size of an epub by half.

1. I recently reformatted a book for my Kindle that had 4 "font-size: ..." classes just for the basic text. Some were so small that I couldn't read them on the eReader.

Made them all 1em

2. I would prefer to just have <p> for 99% of the text, and only add a class= when needed

3. Same for <Hx> - let the style sheet do all the heavy lifting for the formatting, and only tweak where absolutely needed
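The font-size cleanup in point 1 could be sketched roughly like this in Python. It is a blanket version under stated assumptions: the class names and sample rules are made up, and the pattern rewrites every font-size declaration it finds, including heading sizes, so in practice you would limit it to the body-text rules.

```python
import re

# Capture "font-size:" and the terminating ";" or "}" so that only the
# value in between is rewritten.
FONT_SIZE = re.compile(r'(font-size\s*:\s*)[^;}]+?(\s*[;}])', re.IGNORECASE)


def normalize_font_sizes(css: str) -> str:
    """Set every font-size declaration in the stylesheet text to 1em."""
    # \g<1> is needed because "\11em" would be read as group 11.
    return FONT_SIZE.sub(r'\g<1>1em\2', css)


# Hypothetical rules of the kind a conversion might leave behind.
css = ('.calibre3 { font-size: 0.66667em; margin: 0 }\n'
       '.calibre7 { font-size: x-small }')
normalized = normalize_font_sizes(css)
print(normalized)
```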

hobnail · 11-11-2020, 12:26 PM

Quote:

Originally Posted by phossler

2. I would prefer to just have <p> for 99% of the text, and only add a class= when needed

3. Same for <Hx> - let the style sheet do all the heavy lifting for the formatting, and only tweak where absolutely needed

Clever use of css combinators should let you get rid of 99.9% of all the class= warts. I do have a few in my standard css file but I rarely use them.

G2B · 11-12-2020, 12:12 AM

Quote:

Originally Posted by phossler

I would prefer to just have <p> for 99% of the text, and only add a class= when needed

I like an indent of 2 em for new paragraphs. I find it easier to read than when there is no indent throughout.

I have plenty of books that have only the 'calibre' class for page layout and 'calibre1' for the line indent.

G2B · 11-12-2020, 12:23 AM

I noticed yesterday that one can also enter regex search/replace during the convert process. I thought, why not do it all in advance? I entered the same listing (Preferences > Common options > Search/replace). That didn't seem to work. None of the commands were executed. I am probably not doing it right. But the saved searches in the editor work fine. Batching those saves me a lot of repetition and time.
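For readers who want to see what "batching" a list of searches amounts to, here is a minimal sketch in plain Python. It is not calibre's own mechanism or API; it only uses the standard re module as a stand-in. The rules are hypothetical examples in the spirit of the class-free <i>, <br/> and <hr/> tags mentioned earlier in the thread, not the poster's actual list.

```python
import re

# Hypothetical rules: (search pattern, replacement), applied in this order.
RULES = [
    (r'<span class="italic">(.*?)</span>', r'<i>\1</i>'),
    (r'<br\s+class="[^"]*"\s*/>', '<br/>'),
    (r'<hr\s+class="[^"]*"\s*/>', '<hr/>'),
]


def run_batch(text, rules):
    """Apply each rule in turn; return the new text and per-rule hit counts."""
    counts = []
    for pattern, replacement in rules:
        text, hits = re.subn(pattern, replacement, text)
        counts.append(hits)
    return text, counts


sample = ('<p>A <span class="italic">very</span> long day.'
          '<br class="calibre2"/></p><hr class="calibre5"/>')
cleaned, counts = run_batch(sample, RULES)
print(cleaned)
print(counts)
```

Order matters, since each rule sees the output of the one before it. The per-rule counts make it easy to spot a search that silently matched nothing. Note that the span rule does not cope with nested spans.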

roger64 · 11-23-2020, 07:56 PM

Quote:

Originally Posted by phossler

The Saved Searches list box is really one dimensional. I wish that there was a way to make it into an expandable 'tree' to improve the organization structure.

There is a very handy way to find what you need in the saved searches window.

You can call saved searches according to their names. Say, you just type "w" in the top "filter" field, and you just see all your saved searches which have a "w" in their name. So, it's up to you to just name your saved searches in a custom way.

phossler · 11-24-2020, 09:52 PM

Quote:

Originally Posted by roger64

There is a very handy way to find what you need in the saved searches window.

I haven't been taking full advantage of the filter.

I have been using my own convention to give the first word a 'key' and grouping the saved searches that I typically run together next to each other, like DELETE, TAG, and Hx

One thing -- I'd prefer that the filter started matching just from the left, and not anywhere in the string (personal opinion).
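The difference between the two matching styles can be shown with a small Python sketch. The saved-search names are invented to follow the 'key word first' convention described above, and the case-insensitive comparison is an assumption, not a statement about how the editor's filter is implemented.

```python
# Hypothetical saved-search names using a leading 'key' word.
names = ['DELETE empty spans', 'TAG italic', 'Hx chapter to h3', 'fix TAG soup']


def filter_anywhere(names, text):
    """Keep names that contain the typed text anywhere."""
    needle = text.lower()
    return [n for n in names if needle in n.lower()]


def filter_from_left(names, text):
    """Keep only names that start with the typed text."""
    needle = text.lower()
    return [n for n in names if n.lower().startswith(needle)]


print(filter_anywhere(names, 'tag'))
print(filter_from_left(names, 'tag'))
```

With left-anchored matching, typing the key word returns only the group that was named with it.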
