Quote:
Originally Posted by phossler
Diap's Toolbox plugin will take care of those very easily
|
Yes. I love Diap's "Editing Toolbag" (Calibre) / "TagMechanic" (Sigil). It makes cleaning this stuff up so much easier.
I've written a handful of mini-tutorials on how to use it:
That will help you change garbage like:
Code:
<p class="junk123"><span class="italics123">This</span><span class="junk456"> </span>is<span class="junk456"> </span>an<span class="junk456"> </span>example.</p>
into:
Code:
<p><em>This</em> is an example.</p>
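To make that before/after concrete, here is a rough Python sketch of the same kind of cleanup done with plain regex. This is not how Diap's plugin works internally. It is only an illustration, and it assumes the class names from the example above (junk123, junk456, italics123) and no nested spans. The function name clean is made up for this post.

```python
import re

# Assumed patterns, based only on the class names in the example above.
# Real books will use different names, so adjust these for your own files.
JUNK_SPAN = re.compile(r'<span class="junk\d+">(.*?)</span>', re.DOTALL)
ITALIC_SPAN = re.compile(r'<span class="italics\d+">(.*?)</span>', re.DOTALL)
JUNK_P = re.compile(r'<p class="junk\d+">')


def clean(html: str) -> str:
    """Strip junk classes and turn italic spans into <em> (hypothetical helper)."""
    html = JUNK_SPAN.sub(r"\1", html)             # unwrap junk spans, keep their contents
    html = ITALIC_SPAN.sub(r"<em>\1</em>", html)  # italic spans become <em>
    html = JUNK_P.sub("<p>", html)                # drop the junk class on paragraphs
    return html
```

Regex like this breaks as soon as spans are nested inside each other, which is exactly the sort of case where a tag-aware tool is the safer choice.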
And in 2021, I wrote even more tricks:
Quote:
Originally Posted by jhowell
Try converting to MOBI format and then from MOBI back to EPUB. That will remove a lot of excess formatting.
|
No. It's much better to do a Calibre EPUB->EPUB. This will consolidate a lot of the horrible code.
There are a few bells and whistles you can select in the EPUB conversion to try to remove junk like extra:
- Colors
- Fonts
- Font-size
- Margins
- [...]
This makes the CSS cleaner and strips out a lot of the "useless" stuff, so your manual cleanup steps become much easier.
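If you'd rather script the EPUB->EPUB step than click through the GUI, something like this Python sketch could work. It assumes Calibre's ebook-convert command-line tool is on your PATH and uses its --filter-css option, which takes a comma-separated list of CSS properties to remove. The property list and the function names are only examples I picked for illustration, not a recommended set.

```python
import subprocess

# Example properties to strip. This list is an assumption; pick your own.
DEFAULT_PROPS = ["color", "font-family", "font-size", "margin-left", "margin-right"]


def build_command(src, dst, props=None):
    """Build an ebook-convert command line that filters the given CSS properties."""
    props = DEFAULT_PROPS if props is None else props
    return ["ebook-convert", src, dst, "--filter-css", ",".join(props)]


def convert(src, dst, props=None):
    """Run the conversion; raises CalledProcessError if ebook-convert fails."""
    subprocess.run(build_command(src, dst, props), check=True)
```

Calling convert("book.epub", "book_clean.epub") would then do the same thing as ticking those filter boxes in the conversion dialog. Always keep the original file, since filtered properties are gone from the output.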
I explained these methods in much more detail in:
when RbnJrg asked how to consolidate 26 books/EPUBs, with very similar formatting, into more manageable HTML+CSS code.
About a year later, I wrote:
which summarized more + explained some of the best, bleeding-edge methods.
- - -
Side Note: Some of these most-advanced cleanup tools are still in the works though...
But the pieces/concepts are all there.
Sigil 1.9.10+ added the "Advanced Find/Replace (List-Based)" method I was describing.
For more info on that, see KevinH's:
And the CSSToolbox is still in the works. (I think? I haven't talked with KevinH in a while.)