|  08-09-2014, 01:51 AM | #16 | |
| Wizard            Posts: 3,720 Karma: 1759970 Join Date: Sep 2010 Device: none | Quote: 
 <h1 class="calibre10" id="rw-h1_319849-00001"><a class="calibre7" href="../Text/9780857900135_toc.html">4</a></h1> Now I've run your (Sigil-flavored) regex, and you are right: it works! So can you walk me through HOW it works, please, using the above example? I am impressed that it zaps both the opening and the closing tag in a single pass, without needing a \1 replace anywhere. PS: re the concern that I may over-zealously zap too much stuff: my usual precaution in Sigil is to run Count All to begin with; if that returns a count that matches the number of chapters, then clearly I have no instances outside of chapter headers to worry about and I can run Replace All. Last edited by cybmole; 08-09-2014 at 01:54 AM. | |
|   |   | 
|  08-09-2014, 02:54 AM | #17 | ||
| Grand Sorcerer            Posts: 28,864 Karma: 207000000 Join Date: Jan 2010 Device: Nexus 7, Kindle Fire HD | Quote: 
 Quote: 
 Certainly. It's all about the optional elements (indicated by the '?'s). Code: </?a ?([^>]+)?> Code: </?a The space that follows is for demarcation so it doesn't match any other tags that might start with the letter 'a' (addr abbr, area, etc...). It's made optional with the following '?' because the space won't exist in the closing tag. (NOTE: I can't guarantee it won't match tags like addr, abbr, or area because I frankly haven't tried it--I suspect it might. But those tags are pretty rare. Still ... that's why I prefer the \M approach instead of the " ?". "a\M" matches the letter a at the "end of a word." But \M won't work in all flavors of regex.) That takes us through Code: </?a ? Code: [^>]+ Code: ([^>]+)? So put it all together and it will match </a> as well as: Code: <a id="blah" class="blahdeblah" href="blahdedblahdeblah.html#doohickey"> Last edited by DiapDealer; 08-09-2014 at 03:03 AM. | ||
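For anyone who wants to try the walkthrough outside Sigil, here is a minimal Python sketch of the same pattern. It is only an illustration: the names `ANCHOR_TAG`, `STRICT_ANCHOR_TAG` and `strip_tags` are made up for this example, and because Python's `re` module has no `\M`, the stricter variant uses `\b` (a word boundary) as a stand-in, which is an assumption rather than what was posted. It also checks the suspicion raised above: the posted pattern does appear to match tags such as `<abbr>`.

```python
import re

# The pattern as posted in the thread: optional "/", the letter "a",
# an optional space, optionally "anything that is not >", then ">".
ANCHOR_TAG = re.compile(r'</?a ?([^>]+)?>')

# Hypothetical stricter variant (not from the thread): "\b" requires the
# "a" to end a word, so <abbr>, <area> and similar tags are left alone.
STRICT_ANCHOR_TAG = re.compile(r'</?a\b[^>]*>')


def strip_tags(pattern, html):
    """Delete every match of pattern, i.e. replace it with nothing."""
    return pattern.sub('', html)


heading = ('<h1 class="calibre10" id="rw-h1_319849-00001">'
           '<a class="calibre7" href="../Text/9780857900135_toc.html">4</a></h1>')
abbr = '<abbr title="for example">e.g.</abbr>'

# Opening and closing anchor tags both go in one pass; the <h1> stays.
print(strip_tags(ANCHOR_TAG, heading))
# -> <h1 class="calibre10" id="rw-h1_319849-00001">4</h1>

# The posted pattern also eats <abbr> ... </abbr>.
print(strip_tags(ANCHOR_TAG, abbr))          # -> e.g.

# The word-boundary variant leaves <abbr> untouched.
print(strip_tags(STRICT_ANCHOR_TAG, abbr))
# -> <abbr title="for example">e.g.</abbr>
```

No `\1` is needed in the replacement because nothing from the match is kept; the capture group is only there so the attribute part can be made optional.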
|   |   | 
|  08-09-2014, 03:14 AM | #18 | 
| Wizard            Posts: 3,720 Karma: 1759970 Join Date: Sep 2010 Device: none | 
			
			So simple when you know how. Many thanks for that excellent walkthrough. | 
|   |   | 
|  08-09-2014, 03:27 AM | #19 | 
| creator of calibre            Posts: 45,598 Karma: 28548962 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			The only place you have to use ADE-based readers is on e-ink devices, and they don't support colors anyway.
		 | 
|   |   | 
|  08-09-2014, 03:28 AM | #20 | |
| Grand Sorcerer            Posts: 28,864 Karma: 207000000 Join Date: Jan 2010 Device: Nexus 7, Kindle Fire HD | Quote: 
   Last edited by DiapDealer; 08-09-2014 at 03:31 AM. | |
|   |   | 
|  08-09-2014, 03:36 AM | #21 | 
| creator of calibre            Posts: 45,598 Karma: 28548962 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			And I just tried overriding the styles like this a:link { color: magenta; text-decoration: none } and it worked fine in my copy of ADE 1.7 | 
|   |   | 
|  08-09-2014, 04:02 AM | #22 | 
| Resident Curmudgeon            Posts: 80,677 Karma: 150249619 Join Date: Nov 2006 Location: Roslindale, Massachusetts Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3 | 
			
			If the code is <a href="page_10"/> then that's easy to remove. Search for <a href="page_[0-9]*"/> and replace with nothing. | 
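A rough Python equivalent of that search-and-replace, for illustration only. The name `PAGE_ANCHOR` and the sample paragraph are invented for this sketch. One deliberate change from the posted pattern: `[0-9]+` is used instead of `[0-9]*`, so that at least one digit is required; that is a suggestion, not what was posted.

```python
import re

# Self-closing page-marker anchors such as <a href="page_10"/>.
# "+" (one or more digits) is an assumption; the thread used "*".
PAGE_ANCHOR = re.compile(r'<a href="page_[0-9]+"/>')

# Made-up sample text containing two page markers.
sample = '<p>Some text<a href="page_10"/> continues<a href="page_11"/>.</p>'

# Replace every match with nothing.
cleaned = PAGE_ANCHOR.sub('', sample)
print(cleaned)  # -> <p>Some text continues.</p>
```

An anchor with real link text, such as `<a href="page_10">10</a>`, is not self-closing and is left untouched by this pattern.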
|   |   | 
|  08-09-2014, 04:35 AM | #23 | |
| Wizard            Posts: 3,720 Karma: 1759970 Join Date: Sep 2010 Device: none | Quote: 
 For sure, on the Sony readers I still saw blue + underlined. But I did not do it your way; I just styled the h2 tag, or whatever the tag containing the <a bit was. Maybe that's why. I still prefer to remove them, though, so that I do not create dead links by removing an unwanted HTML TOC page, which I consider to be a redundant item. On any reader I'd use the "show me the chapters" feature, and that would refer to the toc.ncx file. I don't see any added value in keeping an active-links HTML contents page either at the start or at the end of an epub. | |
|   |   | 
|  08-09-2014, 05:36 AM | #24 | 
| creator of calibre            Posts: 45,598 Karma: 28548962 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			ADE 1.7 is what is used on Sony readers. And styling the element surrounding a link will not work, because the link's CSS will override it.
		 | 
|   |   | 
|  08-09-2014, 06:15 AM | #25 | |
| Wizard            Posts: 3,720 Karma: 1759970 Join Date: Sep 2010 Device: none | Quote: 
 Given the easy-to-use tag-removal regex solutions posted here by others, I guess there's less of a case for wanting an editor or a convert feature to do the job for me. | |
|   |   | 
|  08-09-2014, 08:05 AM | #26 | |
| stumblebum  Posts: 29 Karma: 10 Join Date: Nov 2013 Location: Roseburg, OR Device: kindle2 | Quote: 
  Back to lurking.  larry Last edited by timberbeast; 08-09-2014 at 08:17 AM. Reason: giving credit | |
|   |   | 
|  08-10-2014, 12:14 AM | #27 | |
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | Quote: 
  Your solution looks quite nice too. I was kinda viewing your plugin as a way to remove extraneous elements, not nested per se. So it might be nice to have a plugin that does all the heavy lifting (thinking) for you. EDIT: And I see you added <a>. Last edited by eschwartz; 08-10-2014 at 12:19 AM. | |
|   |   | 
|  08-10-2014, 01:04 AM | #28 | 
| Wizard            Posts: 3,720 Karma: 1759970 Join Date: Sep 2010 Device: none | 
			
			Another particular horror, sometimes seen in old mobi books, is nested blockquotes. I've seen them about 6 layers deep in some free Amazon books! By the time you are at the innermost nest, the text has almost been pushed off the screen! Those are a nightmare to remove, and what makes it even trickier is that I usually want to keep the outer layer and then just have a sensible blockquote margin set in CSS. So if you guys are looking at tools for nested tags, the general challenge is for some code that locates nested tags and then removes all but the outer layer. Is that possible? The same code would sometimes be helpful for simplifying spans. To be fair, though, it's been a while since I saw one of those blockquote horrors. I think they were a way of overcoming mobi format limitations, and are unlikely to be nested so badly if folks work in epub or azw. | 
|   |   | 
|  08-10-2014, 03:54 AM | #29 | 
| Well trained by Cats            Posts: 31,241 Karma: 61360164 Join Date: Aug 2009 Location: The Central Coast of California Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A | 
			
			@cybmole I use: Code: (?sm)(<blockquote class="\w">\s+){2,}(.+?)(</blockquote>\s+){2,} and replace with: Code: \2 It is not a perfect solution; you may have to fix (debug) some now-broken code. (IIRC Mobi has no 'margin-left' or 'margin-right' support, thus the use of BQ.) | 
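To see what that search does, here is a small Python sketch run against a made-up three-deep test case. The names `NESTED_BQ`, `sample`, `stripped` and `rewrapped` are invented for the example. Two things seem worth noting. As posted, `class="\w"` only matches one-character class names. And replacing with `\2` alone appears to drop every matched layer, the outermost included; the `rewrapped` line is a hypothetical variation (not from the thread) that puts a single plain blockquote back around the content.

```python
import re

# The search pattern as posted: two or more opening blockquotes, the
# content (lazy), then two or more closing blockquotes.
NESTED_BQ = re.compile(
    r'(?sm)(<blockquote class="\w">\s+){2,}(.+?)(</blockquote>\s+){2,}'
)

# Made-up test case: three nested layers around one paragraph.
sample = (
    '<blockquote class="a">\n'
    '<blockquote class="b">\n'
    '<blockquote class="c">\n'
    '<p>text</p>\n'
    '</blockquote>\n'
    '</blockquote>\n'
    '</blockquote>\n'
)

# Replacement as posted: keep only group 2 (the content).
stripped = NESTED_BQ.sub(r'\2', sample)
print(repr(stripped))   # -> '<p>text</p>\n'

# Hypothetical variation: wrap the content in one fresh blockquote.
rewrapped = NESTED_BQ.sub(r'<blockquote>\n\2</blockquote>\n', sample)
print(repr(rewrapped))  # -> '<blockquote>\n<p>text</p>\n</blockquote>\n'
```

The re-wrapped version loses the original class attribute, so any margin would have to be set on plain `blockquote` in the CSS.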
|   |   | 
|  08-10-2014, 05:22 AM | #30 | 
| Wizard            Posts: 3,720 Karma: 1759970 Join Date: Sep 2010 Device: none | 
			
			Thanks. Now I'll have to remember where I might have saved a test case. Does that code strip the nested tags from inner to outer, as usually the outermost one is the best candidate for keeping? | 
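If keeping the outermost layer is the goal, one possible approach (a sketch only, not something posted in the thread) is to walk the blockquote tags with a depth counter and emit a tag only when it opens or closes the outer level. The names `BQ_TAG` and `keep_outer_blockquote` are hypothetical. The sketch assumes reasonably well-formed markup, and it flattens every nested blockquote, including ones that were nested on purpose.

```python
import re

# Any opening or closing blockquote tag, with or without attributes.
BQ_TAG = re.compile(r'<(/?)blockquote\b[^>]*>', re.IGNORECASE)


def keep_outer_blockquote(html: str) -> str:
    """Drop nested blockquote tags, keeping only the outermost pair."""
    depth = 0      # number of blockquotes currently open
    pieces = []    # output fragments
    pos = 0        # end of the previous tag match
    for match in BQ_TAG.finditer(html):
        pieces.append(html[pos:match.start()])  # text between tags
        pos = match.end()
        if match.group(1):                      # closing tag
            if depth > 0:
                depth -= 1
            if depth == 0:                      # closes the outer layer
                pieces.append(match.group(0))
        else:                                   # opening tag
            if depth == 0:                      # opens the outer layer
                pieces.append(match.group(0))
            depth += 1
    pieces.append(html[pos:])                   # trailing text
    return ''.join(pieces)


nested = ('<blockquote class="a"><blockquote class="b">'
          '<p>x</p></blockquote></blockquote><p>y</p>')
print(keep_outer_blockquote(nested))
# -> <blockquote class="a"><p>x</p></blockquote><p>y</p>
```

Because the outer opening tag is kept as written, its class survives and can carry the one sensible margin in CSS.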
|   |   | 
|  | 