04-17-2012, 06:50 AM | #1 |
Junior Member
Posts: 3
Karma: 10
Join Date: Apr 2012
Device: Sony PRS-T1
|
Help composing a regex to find strings enclosed in comment tags
Hi, I have a problem: I want to use calibre to convert an eBook that I downloaded as a bunch of HTML pages to ePub format. The pages contain some comments, which I would like to remove.
The example text I want to remove is as follows: I want to remove the text enclosed in comments, including the comments themselves. Here is what I have tried, but without success: I have tried a few more, but also without success. I wonder why this is so, because I can easily select a tag, but not a comment. Like this: So could somebody help with this? I have also attached the HTML file. Thanks in advance if somebody could help |
04-17-2012, 07:48 AM | #2 |
Guru
Posts: 655
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD, iPad (Marvin)
|
Multiline is the problem here: by default the dot does not match a newline, so a pattern cannot span several lines. That's why the individual tags worked, as they are on the same line.
Try using this Code:
(?mis)<!-- Copyright.+?<!-- /Copyright.+? --> |
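For anyone who wants to try the expression outside calibre, here is a minimal sketch in plain Python. The HTML fragment is made up for illustration (the original poster's file is not reproduced in this thread), and the comment wording is an assumption; only the regex itself comes from the post above.

```python
import re

# Hypothetical page fragment, NOT the original poster's file.
html = """<p>Chapter one.</p>
<!-- Copyright block start -->
<div>Downloaded from example.org</div>
<!-- /Copyright block end -->
<p>Chapter two.</p>
"""

# The expression from the post above, unchanged.
pattern = re.compile(r"(?mis)<!-- Copyright.+?<!-- /Copyright.+? -->")

# Replace every match with nothing, i.e. delete the comments and
# everything between them.
cleaned = pattern.sub("", html)
print(cleaned)
```

Note that the second `.+?` needs at least one character between `/Copyright` and the closing ` -->`, so the closing comment must carry some text after `/Copyright` for this exact pattern to match.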
04-17-2012, 10:25 AM | #3 |
Junior Member
Posts: 3
Karma: 10
Join Date: Apr 2012
Device: Sony PRS-T1
|
Wow, thanks! That is exactly what I need! Thanks!
By the way, where do you know the flag (?mis) from? I have searched here http://manual.calibre-ebook.com/regexp.html and here http://docs.python.org/search.html?q=%28%3Fmis%29 , and even here https://www.google.com.ua/search?sou...w=1280&bih=656 but haven't found anything. Some kind of hidden flag |
04-17-2012, 11:55 AM | #4 |
Guru
Posts: 655
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD, iPad (Marvin)
|
It's a combination of three flags: m (multiline), i (ignore case) and s (dot also matches newlines).
I remembered there was a topic a while ago, so I did a search for 'multiline', looked through several of the results and found this topic |
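To illustrate how the combined inline flag breaks down, here is a small sketch using Python's `re` module. The sample text and the shortened pattern are made up for the demonstration; the point is only that `(?mis)` sets the same options as passing `re.M | re.I | re.S`, and that the `s` part is the one that lets the dot cross line breaks.

```python
import re

# Made-up two-line comment, just for the demonstration.
text = "<!-- note\nspans two lines -->"

# Inline form, as used in the thread.
inline = re.compile(r"(?mis)<!--.+?-->")
# Same options passed as separate flag constants.
explicit = re.compile(r"<!--.+?-->", re.M | re.I | re.S)
# Without the "s" flag the dot stops at the newline.
no_dotall = re.compile(r"(?mi)<!--.+?-->")

wanted = re.M | re.I | re.S
print("inline sets m, i, s:", inline.flags & wanted == wanted)
print("explicit sets m, i, s:", explicit.flags & wanted == wanted)
print("match with s flag:", inline.search(text) is not None)
print("match without s flag:", no_dotall.search(text) is not None)
```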
04-17-2012, 11:57 AM | #5 |
creator of calibre
Posts: 43,745
Karma: 22446736
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Nothing hidden about it, they are described here: http://docs.python.org/library/re.html and that in turn is linked to from here: http://manual.calibre-ebook.com/regexp.html#credits
|
04-17-2012, 12:49 PM | #6 |
Junior Member
Posts: 3
Karma: 10
Join Date: Apr 2012
Device: Sony PRS-T1
|
@Perkin, @kovidgoyal
Thanks for the links. I thought "mis" was one solid flag, and that led me to filter out everything else. Thanks again for the links and for the help; now I have a bunch of readable ePub books on my device. |