Old 04-06-2023, 12:03 PM   #16
DNSB
Bibliophagist
Posts: 35,513
Karma: 145557716
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Forma, Clara HD, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by Karellen View Post
In your example a simple regex would have fixed that
Find... <span class="text_14">(.*?)</span>
Replace... \1

Then look at what is leftover and figure out what it does and either leave it, replace it or remove it.
Unfortunately, your regex does horrible things if you have nested spans:

Code:
<span class="text_14"> blah de blah de blah<span class="text_17">more blah de blah</span>yet more blah de blah</span>
which your regex would convert to:

Code:
 blah de blah de blah<span class="text_17">more blah de blahyet more blah de blah</span>
It would be much safer to simply use:

search: <span class="text_14">
replace: <span>

and then remove the naked <span> tags using Diap's "Editing Toolbag" (Calibre) / "TagMechanic" (Sigil).
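
For anyone who wants to see the failure outside the editor, here is a minimal Python sketch. It assumes the editor's regex search treats this pattern the same way Python's `re` module does; the variable names are only for illustration.

```python
import re

nested = ('<span class="text_14"> blah de blah de blah'
          '<span class="text_17">more blah de blah</span>'
          'yet more blah de blah</span>')

# The lazy (.*?) stops at the FIRST </span> it meets. That closing tag
# belongs to the inner text_17 span, not to the text_14 span.
broken = re.sub(r'<span class="text_14">(.*?)</span>', r'\1', nested)
print(broken)

# Safer first step: strip only the class attribute and leave every
# opening and closing tag where it is, so the nesting stays balanced.
step1 = nested.replace('<span class="text_14">', '<span>')
print(step1)
```

After the regex, the leftover closing tag of text_14 ends up closing text_17, so the inner span swallows the trailing text. The plain replace keeps two opening and two closing tags, which leaves only attribute-less spans for a tag-aware tool to remove.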

Last edited by DNSB; 04-06-2023 at 03:21 PM.
Old 04-06-2023, 02:49 PM   #17
Karellen
Wizard
Posts: 1,107
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
Quote:
Originally Posted by DNSB View Post
Unfortunately, your regex does horrible things if you have nested spans:
Well, I wasn't claiming that my regex was a one-stop fix for the entire book. I could only go by the sample he provided and I did not spot a "<span class="text_17">" class in the example.

It's more of a multi-step process, which is why I further stated "Then look at what is leftover and figure out what it does and either leave it, replace it or remove it." But at each step, you get to see what is being removed and if it is appropriate to remove it.
Old 04-06-2023, 03:25 PM   #18
DNSB
Bibliophagist
Posts: 35,513
Karma: 145557716
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Forma, Clara HD, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by Karellen View Post
Well, I wasn't claiming that my regex was a one-stop fix for the entire book. I could only go by the sample he provided and I did not spot a "<span class="text_17">" class in the example.

It's more of a multi-step process, which is why I further stated "Then look at what is leftover and figure out what it does and either leave it, replace it or remove it." But at each step, you get to see what is being removed and if it is appropriate to remove it.
My post was more of a warning since nested spans are relatively common.
Old 04-06-2023, 03:33 PM   #19
Karellen
Wizard
Posts: 1,107
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
Quote:
Originally Posted by DNSB View Post
My post was more of a warning since nested spans are relatively common.
Gotcha
Old 04-09-2023, 05:20 PM   #20
akita328
Member
Posts: 13
Karma: 10
Join Date: Aug 2019
Device: kindle, iPad Marvin
Thanks @Tex2002ans!

and everybody! I think I'm following most of what is being suggested, but it will take time for me to digest it all...

I knew there was something like the rt-click-->replace trick. I will rename the classes, etc., so it will be easier to figure out what they're being used for.

I think I have at least all the book text fixed, so it's readable on Kindle.

Now I'm doing minor tweaks, but otherwise I think all is well.

Thank you everybody for all your help! I love formatting things (learned large doc formatting using LaTeX, so... this is so much easier) and intend to learn more eBook editing by trial and error.

Last edited by akita328; 04-09-2023 at 06:20 PM.
Old 04-09-2023, 07:04 PM   #21
JSWolf
Resident Curmudgeon
Posts: 74,037
Karma: 129333114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by akita328 View Post
Hi everybody,

thank you so much for your input. I will first try azw3-->mobi-->epub and see if that helps... if not I will try learning how to use the Diap Toolbox.
KF8 (AZW3) > Mobi > ePub is a really, really bad idea. That can cause problems you don't want. Best to go KF8 > ePub.
Old 04-09-2023, 07:05 PM   #22
JSWolf
Resident Curmudgeon
Posts: 74,037
Karma: 129333114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by Karellen View Post
Yes, I agree. You need to spend 10min or so figuring out what the classes are doing, especially when they are nested. You might also be removing blockquotes, centering, right aligned, and lots of other styling.

I have never used a plugin to fix these problems. A few well placed regexes can either remove the code or find&replace the convoluted code with your own simpler classes.

In your example a simple regex would have fixed that
Find... <span class="text_14">(.*?)</span>
Replace... \1

Then look at what is leftover and figure out what it does and either leave it, replace it or remove it.
But with Diap's Editing Toolbag you just delete all spans with the class text_14 if you don't want text_14. It's a lot easier and safer. If you have nested spans, the regex given won't work.

Last edited by JSWolf; 04-09-2023 at 07:09 PM.
Old 04-09-2023, 07:07 PM   #23
JSWolf
Resident Curmudgeon
Posts: 74,037
Karma: 129333114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by DNSB View Post
Unfortunately, your regex does horrible things if you have nested spans:

Code:
<span class="text_14"> blah de blah de blah<span class="text_17">more blah de blah</span>yet more blah de blah</span>
which your regex would convert to:

Code:
 blah de blah de blah<span class="text_17">more blah de blahyet more blah de blah</span>
It would be much safer to simply use:

search: <span class="text_14">
replace: <span>

and then remove the naked <span> tags using Diap's "Editing Toolbag" (Calibre) / "TagMechanic" (Sigil).
That's twice the work when Diap's Editing Toolbag can do the <span class="text_14"> delete in one go.
Old 04-09-2023, 08:27 PM   #24
Karellen
Wizard
Posts: 1,107
Karma: 4911876
Join Date: Sep 2021
Location: Australia
Device: Kobo Libra 2
Quote:
Originally Posted by JSWolf View Post
But with Diap's Editing Toolbag you just delete all spans with the class text_14 if you don't want text_14. It's a lot easier and safer. If you have nested spans, the regex given won't work.
Oh geez.
Yes, it won't work if there are nested <span>'s, I am aware of that. I gave a regex that matched the example the OP provided in his post. I didn't read his mind to figure out he had class="calibre_17" nested spans. That will require a different approach.

Yes, I am aware of some of these plugins. But I have no use for them. Why would I need a plugin to perform something I can already, very easily, do without the plugin?

It also seems to me, quietly reading threads from newbies, that newbies become so reliant on plugins to do the easiest of fixes that when things go wrong, or they can't achieve what they want, they are at a complete loss what to do.

Maybe push the plugins less and describe how to fix using alternate methods, like regex.

My personal opinion
Old 04-10-2023, 03:27 AM   #25
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by akita328 View Post
Thanks @Tex2002ans!

and everybody!
You're welcome.

Quote:
Originally Posted by akita328 View Post
I think I'm following most of what is being suggested, but it will take time for me to digest it all...
Yep, it's a lot, but you'll get it one piece at a time.

And the posts aren't going anywhere! (That's the awesome thing about forums, you can visit the posts a year later and refresh your memory and/or still learn more.)

Quote:
Originally Posted by akita328 View Post
I knew there was something like the rt-click-->replace trick. I will rename the classes, etc., so it will be easier to figure out what they're being used for.
Yep, that's one of the best tricks you can do.

And like I said, then you can use Diap's Toolbag in successive rounds to clean up the junk.

Quote:
Originally Posted by akita328 View Post
I love formatting things (learned large doc formatting using LaTeX, so... this is so much easier) and intend to learn more eBook editing by trial and error.
Pfff, if you learned LaTeX, then almost anything is easier!

Quote:
Originally Posted by Karellen View Post
Yes, I am aware of some of these plugins. But I have no use for them. Why would I need a plugin to perform something I can already, very easily, do without the plugin?

It also seems to me, quietly reading threads from newbies, that newbies become so reliant on plugins to do the easiest of fixes that when things go wrong, or they can't achieve what they want, they are at a complete loss what to do.
Yes, but in this specific case... Diap's plugins really do wonders.

(And will be much less likely to cause errors and accidentally delete text!)

- - -

BUT, ever since that Right-Click > Rename was added into Calibre/Sigil, WOW, did that cut down on one of the major pain points.

Then, like DNSB said, just keep on renaming crappy/useless code to simple:

Code:
<span>
<span>
<span>
and then when I see screens full of:

Quote:
<p class="block_21">“How<span> </span>can<span> </span>I<span> </span>persuade<span class="differentjunk"> </span>you<span> </span>that<span> </span>I<span> </span>mean<span> </span>you<span> </span>no<span> </span>harm?”<span> </span>he<span> </span>asked.<span> </span>“I<span> </span>swear to you that I will do nothing to you.”</p>
in one fell swoop, boom, delete them all with the plugin:

Quote:
<p class="block_21">“How can I persuade<span class="differentjunk"> </span>you that I mean you no harm?” he asked. “I swear to you that I will do nothing to you.”</p>
Sigil/Calibre already know which opening <span> matches which closing </span>, so why try to recreate that using some (complicated) regex which may cause you to lose data? :P
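
To make the "matching open/close" point concrete, here is a rough Python sketch of a tag-aware cleanup. It is not how Diap's plugin is implemented; it only illustrates the idea with the standard library's `html.parser`, and the class and function names (`NakedSpanStripper`, `strip_naked_spans`) are made up for this example. It is meant for body fragments, not whole files.

```python
from html.parser import HTMLParser


class NakedSpanStripper(HTMLParser):
    """Drop <span> tags that carry no attributes, keeping their contents."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.span_stack = []  # one entry per open span: True if it was dropped

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            drop = not attrs          # "naked" means no attributes at all
            self.span_stack.append(drop)
            if drop:
                return
        self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> pass through untouched.
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # The stack tells us which opening tag this </span> belongs to.
        if tag == "span" and self.span_stack and self.span_stack.pop():
            return
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")


def strip_naked_spans(markup):
    parser = NakedSpanStripper()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


sample = ('<p class="block_21">How<span> </span>can'
          '<span class="differentjunk"> </span>you<span> </span>go?</p>')
print(strip_naked_spans(sample))
```

Because each closing tag is paired with its opening tag through the stack, a naked span wrapped around a classed span loses only its own two tags, which is exactly the case the lazy regex gets wrong.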

Save the pure regex solution for stuff where it excels, like:

Notice a pattern and wanna find/fix them all? NOW we can use Regex!

Wanna safely remove "excessive <class> and other formatting horrors"? Use a mix of simple search/replace (even regex) + Diap's tools!
All times are GMT -4.