#16 | |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Quote:
<h1 class="calibre10" id="rw-h1_319849-00001"><a class="calibre7" href="../Text/9780857900135_toc.html">4</a></h1>
Now I've run your (Sigil-flavored) regex and you are right, it works! So can you walk me through HOW it works, please, using the above example? I am impressed that it zaps both the opening and the closing tag in a single pass, and without needing a \1 in the replace anywhere.
PS: re the concern that I may over-zealously zap too much stuff: my usual precaution in Sigil is to run Count All to begin with. If that returns a count that matches the number of chapters, then clearly I have no instances outside of chapter headers to worry about, and I can run Replace All.
Last edited by cybmole; 08-09-2014 at 01:54 AM. |
|
#17 | ||
Grand Sorcerer
Posts: 28,400
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
Quote:
Certainly. It's all about the optional elements (indicated by the '?'s).
Code:
</?a ?([^>]+)?>
Code:
</?a
The space that follows is for demarcation, so it doesn't match any other tags that might start with the letter 'a' (addr, abbr, area, etc.). It's made optional with the following '?' because the space won't exist in the closing tag. (NOTE: I can't guarantee it won't match tags like addr, abbr, or area because I frankly haven't tried it. I suspect it might. But those tags are pretty rare. Still, that's why I prefer the \M approach instead of the " ?". "a\M" matches the letter a at the "end of a word." But \M won't work in all flavors of regex.)
That takes us through
Code:
</?a ?
Code:
[^>]+
Code:
([^>]+)?
So put it all together and it will match </a> as well as:
Code:
<a id="blah" class="blahdeblah" href="blahdedblahdeblah.html#doohickey">
Last edited by DiapDealer; 08-09-2014 at 03:03 AM. |
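One rough way to check the walkthrough outside Sigil is with Python's `re` module. This is only a sketch under assumptions: Python's regex flavor is not the same as Sigil's, `\M` is not available there, and `\b` (word boundary) is used as a stand-in for it. The heading is the one quoted earlier in the thread; the `<abbr>` line is a made-up test case.

```python
import re

# Pattern from the walkthrough: optional "/", the letter "a", an optional
# space, an optional run of non-">" characters, then the closing ">".
LOOSE = re.compile(r'</?a ?([^>]+)?>')

# Assumed stand-in for the \M variant: \b demands a word boundary after "a",
# so tag names that merely start with "a" are left alone.
STRICT = re.compile(r'</?a\b[^>]*>')

heading = ('<h1 class="calibre10" id="rw-h1_319849-00001">'
           '<a class="calibre7" href="../Text/9780857900135_toc.html">4</a></h1>')

# subn returns the new text plus the number of replacements made.
cleaned, count = LOOSE.subn('', heading)
print(cleaned)
print(count)

# Hypothetical test case for the tags the NOTE worries about.
other = '<p><abbr title="x">e.g.</abbr></p>'
loose_hit = LOOSE.search(other) is not None
strict_hit = STRICT.search(other) is not None
print(loose_hit, strict_hit)
```

In this flavor the loose pattern does match `<abbr ...>`, which backs up the suspicion in the NOTE, while the boundary version skips it. The replacement count is 2 for the single heading (one opening tag, one closing tag), which is worth keeping in mind when comparing a Count All result against the number of chapters.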
||
#18 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
so simple when you know how
many thanks for that excellent walkthrough |
#19 |
creator of calibre
Posts: 45,223
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
The only place you have to use ADE-based readers is on e-ink devices, and they don't support colors anyway.
|
#20 | |
Grand Sorcerer
Posts: 28,400
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
Last edited by DiapDealer; 08-09-2014 at 03:31 AM. |
|
#21 |
creator of calibre
Posts: 45,223
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
And I just tried overriding the styles like this:
a:link { color: magenta; text-decoration: none }
and it worked fine in my copy of ADE 1.7. |
#22 |
Resident Curmudgeon
Posts: 79,259
Karma: 145488788
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
If the code is <a href="page_10"/> then that's easy to remove.
Search for <a href="page_[0-9]*"/> and replace with nothing. |
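For anyone who wants to try that search-and-replace outside the editor first, here is a minimal Python sketch. The sample line is hypothetical, and `re.subn` is used only so the hit count is visible before trusting a Replace All.

```python
import re

# Search expression from the post; the replacement is the empty string.
PAGE_ANCHOR = re.compile(r'<a href="page_[0-9]*"/>')

# Hypothetical sample line: only the self-closing page anchor should go,
# the ordinary link must survive.
sample = '<p><a href="page_10"/>Some text <a href="ch2.html">next</a></p>'

result, hits = PAGE_ANCHOR.subn('', sample)
print(result)
print(hits)
```

Note that `[0-9]*` also accepts zero digits, so a bare `<a href="page_"/>` would be removed too. If that matters, `[0-9]+` requires at least one digit.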
#23 | |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Quote:
For sure, on the Sony readers I still saw blue + underlined. But I did not do it your way; I just styled the h2 tag (or whatever the tag containing the <a> bit was), so maybe that's why.
I still prefer to remove them, though, so that I do not create dead links by removing an unwanted HTML TOC page, which I consider to be a redundant item. On any reader I'd use the show-me-the-chapters feature, and that would refer to the toc.ncx file. I don't see any added value in keeping an active-links HTML contents page either at the start or at the end of an epub. |
|
#24 |
creator of calibre
Posts: 45,223
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
ADE 1.7 is what is used on Sony readers. And styling the element surrounding a link will not work, because the link's own CSS will override it.
|
#25 | |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Quote:
Given the easy-to-use tag removal regex solutions posted here by others, I guess there's less of a case for wanting an editor or a convert feature to do the job for me. |
|
#26 | |
stumblebum
Posts: 29
Karma: 10
Join Date: Nov 2013
Location: Roseburg, OR
Device: kindle2
|
Quote:
Back to lurking.
larry
Last edited by timberbeast; 08-09-2014 at 08:17 AM. Reason: giving credit |
|
#27 | |
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
Quote:
I was kinda viewing your plugin as a way to remove extraneous elements, not nested per se. So it might be nice to have a plugin that does all the heavy lifting/thinking for you.
EDIT: And I see you added <a>
Last edited by eschwartz; 08-10-2014 at 12:19 AM. |
|
#28 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Another particular horror, sometimes seen in old mobi books, is nested blockquotes.
I've seen them about 6 layers deep in some free Amazon books! By the time you are at the innermost nest, the text has almost been pushed off the screen! Those are a nightmare to remove, and what makes it even trickier is that I usually want to keep the outer layer and then just have a sensible blockquote margin set in CSS.
So if you guys are looking at tools for nested tags, the general challenge is for some code that locates nested tags and then removes all but the outer layer. Is that possible? The same code would sometimes be helpful for simplifying spans.
To be fair, though, it's been a while since I saw one of those blockquote horrors. I think they were a way of overcoming mobi format limitations, and they are unlikely to be nested so badly if folks work in epub or azw. |
#29 |
Well trained by Cats
Posts: 30,917
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
@cybmole
I use:
Code:
(?sm)(<blockquote class="\w">\s+){2,}(.+?)(</blockquote>\s+){2,}
Code:
\2
It is not a perfect solution; you may have to fix (debug) some now-broken code. (IIRC Mobi has no 'margin-left, margin-right' support, thus the use of BQ.) |
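A small Python sketch of what that search/replace pair does, assuming the second code box is the replacement text. The sample markup and the `keep_one` variation are hypothetical and not from the post; Python's `re` is used here as a stand-in for the editor's regex engine.

```python
import re

# Search expression from the post. \w matches a single character, so this
# only fits one-character class names such as class="c".
NESTED = re.compile(
    r'(?sm)(<blockquote class="\w">\s+){2,}(.+?)(</blockquote>\s+){2,}'
)

# Hypothetical sample: three nested layers around one paragraph.
sample = (
    '<blockquote class="c">\n'
    '<blockquote class="c">\n'
    '<blockquote class="c">\n'
    '<p>Deep text</p>\n'
    '</blockquote>\n'
    '</blockquote>\n'
    '</blockquote>\n'
)

# Replacement from the post: only group 2 (the inner content) survives,
# so every matched blockquote layer is dropped, the outer one included.
stripped = NESTED.sub(r'\2', sample)

# Assumed variation: wrap group 2 in one plain blockquote to keep a layer.
keep_one = NESTED.sub(r'<blockquote>\n\2</blockquote>\n', sample)

print(stripped)
print(keep_one)
```

On this sample the expression does not work from inner to outer: it swallows the whole run of opening tags and the whole run of closing tags in one match. Keeping a single layer, as in `keep_one`, means putting one back in the replacement and then setting its margin in CSS.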
#30 |
Wizard
![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() ![]() Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
thanks - now I'll have to remember where I might have saved a test case
Does that code strip the nested tags from inner to outer? Usually the outermost one is the best candidate for keeping. |