#1 |
Guru
Posts: 772
Karma: 1537886
Join Date: Sep 2013
Device: Kobo Forma
|
Remove Italic Tags from Selection?
Is there some way to remove italic tags (<i> and </i>, but this could apply to other tags as well) from a selected group of paragraphs in the Editor?
I ask because almost all the time, the author/publisher italicizes things with those tags in the HTML instead of with a css style. So far, the only things I've come up with are manually selecting each tag pair and deleting them, or marking the selection and running a Regex search/replace on the marked text: Code:
find: (<p[^>]*>)<i>(.+?)</i>([\.\?]+)?</p>
replace: \1\2\3</p>

Last edited by enuddleyarbl; 09-11-2022 at 06:51 PM. Reason: better code for finding all-italic paragraphs
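The find/replace pair from this post can be tried outside the editor before running it on marked text. The following is a minimal sketch, assuming Python's `re` module behaves closely enough to the editor's regex search for this pattern; the function name `strip_paragraph_italics` and the sample markup are made up for illustration.

```python
import re

# Pattern from the post: a paragraph whose body starts with <i> and ends
# with </i>, optionally followed by trailing "." or "?" outside the italics.
ALL_ITALIC_PARA = re.compile(r'(<p[^>]*>)<i>(.+?)</i>([\.\?]+)?</p>')


def strip_paragraph_italics(html: str) -> str:
    """Remove an <i>...</i> wrapper that spans a whole paragraph (hypothetical helper)."""
    # Group 3 is optional; Python 3.5+ substitutes an empty string when it
    # did not take part in the match.
    return ALL_ITALIC_PARA.sub(r'\g<1>\g<2>\g<3></p>', html)


sample = (
    '<p class="poem"><i>Roses are red</i>.</p>\n'
    '<p>The <i>HMS Haunty</i> sailed on.</p>'
)
print(strip_paragraph_italics(sample))
```

One caveat with this pattern: a paragraph that begins and ends with separate italic runs, such as `<p><i>a</i> and <i>b</i></p>`, also matches, because `.+?` is allowed to cross the inner tags. The replace then leaves unbalanced tags behind, so it seems safest to use Find/Replace one match at a time on such text rather than Replace All.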
#2 |
Well trained by Cats
Posts: 30,981
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Diap's Toolbag (plugin)
It can delete tags outright or modify them (assign a different tag).
#3 |
Wizard
Posts: 1,611
Karma: 9500498
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
|
|
#4 |
Guru
Posts: 772
Karma: 1537886
Join Date: Sep 2013
Device: Kobo Forma
|
@theducks: Unless I'm missing something with Diap's Toolbag, it can only change all the stuff on a given page (EDIT: sorry, the current HTML file) or all the stuff in the whole book (all the files). I just need to remove those tags from specific clumps of text on a page.
@Karellen: Usually, I've no problem with things like <i>, <em>, <b> and <strong> in the HTML. But, where there are clumps of text that should be treated as a syntactic whole (i.e., a quote at the top of a chapter, a poem, a letter, etc.), I'd like to apply a CSS class to handle the formatting. It's possible the styling embedded in the content will conflict with that present in the class. If the <i> tags were only around small bits of text like ship names or foreign words, that's probably not a problem. I think I could use another class to override their italics to something like bold: Code:
i i {
  font-style: normal;
  font-weight: bold;
}

Last edited by enuddleyarbl; 09-09-2022 at 07:25 PM. Reason: missed a word
#5 |
Bibliophagist
Posts: 45,640
Karma: 168959522
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
That really sounds like something where it would be easier just to make the changes manually. You could search for the <i>blah blah blah</i> text and either replace or find next. Unless there is something very specific that you can use to select the chunks of text you want to remove the italics from, there are no good ways to automatically make the changes.
|
#6 |
Guru
Posts: 772
Karma: 1537886
Join Date: Sep 2013
Device: Kobo Forma
|
Yep. That's what I was afraid of. As it is, unless the number of paragraphs in the block of text is big enough (say, for instance, 4 lines of a poem or more), I just manually delete the tags. If there are more, I'll go through the select, mark, type regex, replace, unmark dance.
|
#7 | |
Resident Curmudgeon
Posts: 79,544
Karma: 145863177
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
|
|
#8 |
Bibliophagist
Posts: 45,640
Karma: 168959522
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
|
Jon, in your opinion, there is no reason to remove those italics. Sadly, it would appear that enuddleyarbl has their own opinion which disagrees with yours. His ebook, his formatting.
|
#9 | |
Resident Curmudgeon
Posts: 79,544
Karma: 145863177
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
Quote:
I never said there's no reason. I said there's no need. There isn't a need. There's a want. A need would be if it wasn't working or was causing some sort of display problem. Some of the things I do when editing an eBook don't need to be done, but I do them because I want to.
|
#10 |
Guru
Posts: 772
Karma: 1537886
Join Date: Sep 2013
Device: Kobo Forma
|
I guess it's not a needful thing, but in case some other person wants to do something similar, I'm starting to look at this from a different direction. Instead of working with the <i>/</i> pair and the text in between, I've decided to look for the whitespace between adjacent italicized paragraphs:
Code:
find: </i>([\.\?]?</p>\s+?<p[^>]*>)<i>
replace: \1

Last edited by enuddleyarbl; 09-11-2022 at 06:35 PM. Reason: initial code didn't handle classless paragraphs & punctuation
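To see what this "seam" search does to a run of italicized paragraphs, here is a small sketch. It assumes Python's `re` module is a fair stand-in for the editor's regex search; the name `merge_italic_seams` and the sample poem markup are hypothetical.

```python
import re

# Pattern from the post: a closing </i> at the end of one paragraph, the
# whitespace between paragraphs, and the opening <i> of the next paragraph.
SEAM = re.compile(r'</i>([\.\?]?</p>\s+?<p[^>]*>)<i>')


def merge_italic_seams(html: str) -> str:
    """Drop the </i> ... <i> pair that straddles each paragraph boundary."""
    # Group 1 keeps the optional punctuation, </p>, whitespace and next <p ...>.
    return SEAM.sub(r'\1', html)


block = (
    '<p><i>Line one</i></p>\n'
    '<p class="line"><i>Line two</i></p>\n'
    '<p><i>Line three</i></p>'
)
print(merge_italic_seams(block))
```

After the replace, only the first `<i>` and the last `</i>` of the run remain, so those two still have to be deleted by hand (or with a second search). Because the pattern uses `\s+?`, paragraphs with no whitespace at all between `</p>` and `<p>` are left untouched.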
#11 | |||||
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
I add a temporary background-color: Code:
i i {
  background-color: red;   /* <---- Add this line. */
}

As I skim through the preview of the ebook, the different colors stand out like a sore thumb. This lets me take a much closer look at the (problematic) areas.

When I'm done fixing up all the code, I remove the background-color.

- - -

Side Note: I sometimes use multiple colors for different things:

Just choose any of the HTML Color Names if you want it easy.

You could then use something like: Code:
background-color: green;

- - -

Quote:
In that case, I mark with a <div> or <blockquote> plus a class="". For example: Code:
<blockquote class="openingquote">
  <p>Some amazing quote.</p>
  <p class="right">—Tex</p>
</blockquote>

<div class="poem">
  <div class="stanza">
    <p class="line">The cow jumped over the moon.</p>
    <p class="line">And Tex jumped over the <em>house</em>!</p>
  </div>
</div>

Spoiler:
What I tend to do is something along these lines:

Search: <i>(.+?)</i>
Replace: <span class="temppoem">\1</span>

and just go through the book on a case-by-case basis. Combined with that background-color trick above, it makes it much easier to find these spots + fix them up.

- - -

After I'm done, I can then do a:

Search: <span class="temppoem">

and go mass replacing those new <span>s with whatever code I need. (Usually use Diap's amazing Toolbag to shift to something clean!)

- - -

Quote:
Italics/Emphasis within italics = Roman (Straight up-and-down, normal text). So let's say you were reading a Fiction book where all the character's inner thoughts are in italics: Code:
<p><i class="thoughts">The ghost is going to <em>kill</em> my cat!</i> Tex doused the door handle in holy water. <i class="thoughts">Remind me to never read the <i class="booktitle">Necronomicon</i> or go aboard the <i class="shipname">HMS Haunty</i> ever again!</i></p>
Similar to quotes within quotes, where you alternate:
You do the same thing with italics! Think of it like an ON/OFF switch:
Then if you made all your thoughts be bold instead, you'd toggle the italics back on for those "inner" emphasis + book titles + ship names.

Quote:
Search: <i>([^<]{100,})</i>
Replace: <span class="replace">\1</span>

Here's a breakdown of that Regex:
This will find really long italics, while ignoring any italics <100 characters long.

(Sometimes I start with high numbers, like 200, then progressively work my way down.)

Quote:
We can give general principles/guidelines, but there definitely isn't a "one-button press" solution for complicated situations like this.

Like you might have another book with hundreds of different "calibre123" classes that all apply italics, and you'd have to figure out what each one does.

99.99% chance the ebooks aren't super clean with their <i> + <em> markup... especially when dealing with nested cases like this!

- - -

Complete Side Note: If you want the ultimate info on Italics <i> and Emphasis <em>, see my posts in:
If you want even more examples of weird edge-cases like "quotes-within-quotes" or "should the punctuation be in italics?" see:
Last edited by Tex2002ans; 09-11-2022 at 05:48 AM. |
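The length-filtered search from this post (`<i>([^<]{100,})</i>`) can be sketched with an adjustable threshold, which makes the "start at 200, then work down" routine easy to try on sample text. This is only an illustration, assuming Python's `re` module in place of the editor's search; `mark_long_italics` and the sample strings are hypothetical.

```python
import re


def mark_long_italics(html: str, min_len: int = 100) -> str:
    """Wrap long, tag-free <i> runs in a temporary span for later review."""
    # [^<] refuses to cross any tag, so italics that contain nested markup
    # (e.g. <em>) are skipped. <i class="..."> is skipped as well, because
    # the pattern only matches a bare <i>.
    pattern = re.compile(r'<i>([^<]{%d,})</i>' % min_len)
    return pattern.sub(r'<span class="replace">\1</span>', html)


long_run = '<i>' + 'x' * 150 + '</i>'   # stands in for a fully italic paragraph
ship = '<i>HMS Haunty</i>'              # short italics that should stay as-is
text = long_run + ship

# Start with a high threshold, then work down, as the post suggests.
for threshold in (200, 100):
    changed = mark_long_italics(text, threshold) != text
    print(threshold, 'changed' if changed else 'unchanged')
```

The temporary `class="replace"` spans can then be located with a plain search and swapped for whatever final markup is wanted.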
#12 |
Guru
Posts: 772
Karma: 1537886
Join Date: Sep 2013
Device: Kobo Forma
|
Tex2002ans: Thanks. I especially like the idea of temporarily changing colors of the types of things you're looking for. That should make seeing them much easier.
|
#13 |
Guru
Posts: 772
Karma: 1537886
Join Date: Sep 2013
Device: Kobo Forma
|
And, of course, it helps a lot now that I've found the Calibre Editor's Saved Search function: Search > Saved Searches (is my face red for not having found that up until now).
|
#14 | |
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
Quote:
Here's an example where I explained them in detail:

(I use them in Sigil, not so much in Calibre.)

For example, I have a 12-step Saved Search I use to instantly clean up Finereader HTML/EPUB cruft.

And here's an example EPUB that I used that method on:

Within seconds, I was able to generate a (relatively clean) EPUB from the PDF. Compared to the Archive.org "EPUB" version... mine is wayyyyyy better.
|
#15 |
Wizard
Posts: 1,194
Karma: 1995558
Join Date: Aug 2015
Device: Kindle
|
You can use the Editor Chains plugin to do this, but it is going to be a two-part process.
Also, you can merge adjacent italic tags into one. You can find an example for that in the plugin's thread. It will need modifying to change the tag name from <span> to <i>. Notes:
|