Old 08-10-2014, 08:34 AM   #31
DiapDealer
Grand Sorcerer
Posts: 28,394
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Earlier I mentioned that I was worried that:
Code:
</?a ?([^>]+)?>
might find other tags that start with 'a', and indeed it does. The address, abbr, and area tags can probably be dismissed, but the <aside> tag is one that I'm sure we're only going to see more and more of. And my regex will include it.

So for the paranoid/pedantic type (like myself), it's probably best to use:
Code:
</?a\b([^>]+)?>
instead (should work in pretty much all regex flavors).

The \b just matches a "word" boundary so that no other tags that start with 'a' will be caught up in the match.
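
For anyone who wants to see the difference, here is a quick check of the two patterns in Python's re module. This is only a sketch: the sample markup is made up for illustration, and Calibre's own search may behave slightly differently from plain re.

```python
import re

# The two patterns from this post.
LOOSE = re.compile(r"</?a ?([^>]+)?>")
STRICT = re.compile(r"</?a\b([^>]+)?>")

# Hypothetical sample markup, not taken from any real book.
sample = (
    '<p>See <a href="#n1">note 1</a>.</p>'
    '<aside id="n1"><abbr title="x">X</abbr> text</aside>'
)

loose_hits = [m.group(0) for m in LOOSE.finditer(sample)]
strict_hits = [m.group(0) for m in STRICT.finditer(sample)]

print(loose_hits)   # also catches the aside and abbr tags
print(strict_hits)  # only the opening and closing a tags
```

The word boundary fails between 'a' and the next letter in "aside" or "abbr", so only real anchor tags survive.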
Old 08-10-2014, 09:13 AM   #32
cybmole
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
I'd never heard of <aside>; I had to google it. It doesn't seem like anything I'd ever expect to encounter in an epub though - would ADE even have a clue what to do with it?

So I think I'll just go on happily nuking everything beginning with <a !

I have an HTML5 quick guide on my tablet, so let's see what else there is in its index for a:
acronym - though it says that is now unsupported
address
article - new for HTML5
Old 08-10-2014, 09:40 AM   #33
DiapDealer
Grand Sorcerer
Posts: 28,394
Karma: 203720150
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Fair enough. As I said, my change is more of a "stickler for details" thing.

But if you're cleaning up the cruft from retail epubs, you're going to start running into the <aside> tag sooner rather than later. Both Kobo and B&N (and even Kindle) books are exhibiting more and more HTML5/EPUB3 features.

Deleting the <aside></aside> tags and leaving their contents intact could make for some very confusing reading (footnotes and other ancillary details stuffed into the middle of sentences and the like).
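
A rough illustration of that problem in Python. The markup is invented, and the "whole element" pattern assumes asides are never nested inside each other, so treat it as a sketch rather than a safe general-purpose cleanup.

```python
import re

# Hypothetical snippet: a note sitting between two body paragraphs.
html = (
    '<p>The ship sailed at dawn.</p>'
    '<aside class="note"><p>Dawn was at 5:40.</p></aside>'
    '<p>Nobody waved.</p>'
)

# Strip only the tags: the note's text is left behind in the main flow.
tags_only = re.sub(r"</?aside\b[^>]*>", "", html)

# Remove the element together with its contents (assumes no nested asides).
whole_element = re.sub(r"<aside\b[^>]*>.*?</aside>", "", html, flags=re.DOTALL)

print(tags_only)
print(whole_element)
```

With the tags-only version the note reads as if it were an ordinary paragraph of the story, which is exactly the confusing result described above.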

Last edited by DiapDealer; 08-10-2014 at 10:22 AM.
Old 08-10-2014, 10:45 AM   #34
theducks
Well trained by Cats
Posts: 30,914
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by cybmole View Post
thanks - now I'll have to remember where I might have saved a test case

does that code strip the nested tags from inner to outer, as usually the outermost one is the best candidate for keeping ?
No, it is pretty stupid. When I can, I usually wrap that in a
(<body.*>)\s*
and
\s+</body>

and put those back in the replace.


I do frequent saves, NOT auto fix, and I stop and repair before the pile gets deep. Use the Preview message to help locate the exception (usually there was a mid-document closing BQ followed by another opening BQ).
What would help is a tool that could count the additional inner opening BQ tags and only remove the close after counting back down.
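
A minimal sketch of what such a counting tool might look like in Python. Assumptions: "BQ" means blockquote, the function name flatten_nested is made up for this example, and the markup has no self-closing or commented-out tags. It keeps the outermost pair and drops the inner ones by tracking nesting depth.

```python
import re

def flatten_nested(html, tag="blockquote"):
    """Keep the outermost <tag>...</tag> pair and drop nested inner pairs."""
    pattern = re.compile(rf"<(/?){re.escape(tag)}\b[^>]*>", re.IGNORECASE)
    depth = 0

    def keep_or_drop(match):
        nonlocal depth
        if match.group(1) != "/":
            # Opening tag: keep it only when it starts a new outermost block.
            depth += 1
            return match.group(0) if depth == 1 else ""
        if depth == 0:
            # Stray closing tag with no opener: leave it for manual repair.
            return match.group(0)
        # Closing tag: keep it only when it closes the outermost block.
        depth -= 1
        return match.group(0) if depth == 0 else ""

    return pattern.sub(keep_or_drop, html)

nested = ("<blockquote><p>a</p><blockquote><p>b</p></blockquote>"
          "<p>c</p></blockquote>")
print(flatten_nested(nested))
```

A stray close with no matching open is deliberately left in place, so the preview can still flag it for a manual fix.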

All times are GMT -4.