09-26-2010, 11:25 AM | #61 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Regex case sensitivity is on in recipes (but controllable by flags), off in the main searchbar, and controlled by option box in S&R (I'm not completely sure about the conversion situations, as I don't do a lot of those). Case sensitivity is important to understand for regex, so don't remove it, but I'd point out where Calibre turns it off so the new user doesn't get confused.
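To make the flag point concrete, here is a minimal sketch using Python's standard `re` module (recipes are Python code, so this is presumably how the flag would be passed there). The sample text is made up for illustration.

```python
import re

text = "Chapter One"  # made-up sample text

# Default behaviour: matching is case sensitive, so lowercase "chapter" fails.
sensitive = re.search(r"chapter", text)

# Passing the IGNORECASE flag makes the same pattern match.
insensitive = re.search(r"chapter", text, re.IGNORECASE)

# The flag can also be set inline at the start of the pattern.
inline = re.search(r"(?i)chapter", text)

print(sensitive)            # None
print(insensitive.group())  # Chapter
print(inline.group())       # Chapter
```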
|
09-26-2010, 11:34 AM | #62 | |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Quote:
Edit: Now it should be correct again. Put the ignore-case flag back in and rewrote the beginning paragraph. Last edited by Manichean; 09-26-2010 at 11:50 AM. |
|
09-26-2010, 03:19 PM | #63 |
Junior Member
Posts: 1
Karma: 10
Join Date: Sep 2010
Location: Pécs, Hungary
Device: Kindle 3
|
A simple way to clean up HTML header/footer
Thanks for the nice tutorial Manichean!
I've found the following regex snippet very handy for cleaning the navigation header and footer out of the HTML version of ProGit (http://progit.org/, CC by-nc-sa, saved with wget). You only have to find the start of the header/footer div, then tell the regex how many lines to delete:
Code:
<div id="footer">(\n.+){27}\n.+script>
HTH |
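For anyone who wants to try the idea outside Calibre first, here is a small Python sketch of the same pattern shape. The HTML is a made-up stand-in (not the real ProGit page), and the line count is 3 instead of 27 to keep it short. Since `.` does not match a newline by default, each `\n.+` consumes exactly one non-empty line.

```python
import re

# Hypothetical page fragment: a footer div, three more lines, then a script line.
html = (
    "<p>Body text</p>\n"
    '<div id="footer">\n'
    "<a href='prev.html'>prev</a>\n"
    "<a href='next.html'>next</a>\n"
    "</div>\n"
    "<script src='x.js'></script>\n"
    "</body>\n"
)

# Same shape as the pattern above: skip 3 whole lines after the div start,
# then one final line that ends in "script>".
pattern = r'<div id="footer">(\n.+){3}\n.+script>'

cleaned = re.sub(pattern, "", html)
print(cleaned)
```

Note that an empty line inside the block would break the count, because `.+` needs at least one character per line.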
09-27-2010, 08:09 AM | #64 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Since I'm happy with the post as it is now, and there seem to be no more suggestions, I've removed the notice that the post is still developing and will notify Kovid that it can be included in the documentation. Of course, that doesn't mean that I won't listen to suggestions anymore, just that edits will happen less frequently.
I'm also going to go ahead and create a wiki article out of this, which I'll link from the main Calibre article. |
09-27-2010, 09:28 AM | #65 | ||
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Overall looking very good Manichean, sorry about the late review. Not sure that backreferences are really required, but you poked some fun at that, so fine by me.
Shouldn't this be two separate paragraphs?: Quote:
This is actually incorrect: (1|2)+ will match all those strings. A group doesn't get 'locked' based on the first character it matches. Quote:
|
||
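A quick way to check the point about (1|2)+ is to run it in Python. The sample strings below are my own picks for illustration, not the ones from the quoted tutorial text. The second pattern shows the case where a group really is "locked": a backreference repeats whatever the group matched.

```python
import re

samples = ["1", "2", "11", "22", "12", "21", "1221"]

# (1|2)+ : every repetition is free to pick either alternative.
free = re.compile(r"(1|2)+")
free_results = {s: bool(free.fullmatch(s)) for s in samples}

# (1|2)\1+ : the backreference \1 must repeat exactly what the group matched.
locked = re.compile(r"(1|2)\1+")
locked_results = {s: bool(locked.fullmatch(s)) for s in samples}

print(free_results)    # every sample maps to True
print(locked_results)  # only "11" and "22" map to True
```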
09-27-2010, 09:42 AM | #66 | ||
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Quote:
Oops... Quote:
|
||
09-27-2010, 09:59 AM | #67 | |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
Quote:
Right - they're basically identical. |
|
09-27-2010, 10:00 AM | #68 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
BTW, I refer to "|" as the "vertical bar," not "pipe," and in the regex context I've seen it formally referred to as the "alternation operator" or, more commonly, just "OR" or the "OR operator." The term "pipe" has a pretty specific meaning in *nixLand (and in WindowsLand), and it has nothing to do with alternation or OR. There could be confusion there. |
|
09-27-2010, 10:21 AM | #69 | |||
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Quote:
Oh, and by all means play with that feature- chaley did a great job on that, I think. Quote:
Quote:
|
|||
09-27-2010, 10:34 AM | #70 |
Well trained by Cats
Posts: 30,006
Karma: 57259778
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Minor Nit.
Whitespace is any non-visible printing character; it occupies space. In ASCII text, non-printing characters were things like BEL, SOT, EOT, DC1... They did things, but occupied no space (and caused no carriage or paper motion). |
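The distinction shows up in regex too. As a rough Python sketch (the sample string is made up): `\s` picks up the space, tab and newline, but not control characters such as BEL (`\x07`) or EOT (`\x04`).

```python
import re

# Made-up sample: letters separated by a space, a tab, a newline, BEL and EOT.
sample = "a b\tc\nd\x07e\x04f"

# \s matches whitespace characters only.
whitespace = re.findall(r"\s", sample)

# Control characters (code below 32) that are not whitespace.
control = [ch for ch in sample if ord(ch) < 32 and not ch.isspace()]

print(whitespace)  # [' ', '\t', '\n']
print(control)     # ['\x07', '\x04']
```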
09-27-2010, 10:44 AM | #71 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
You're correct, and I'm aware of that. I used "won't get printed" as a colloquial way of saying "not using ink, but doing stuff". Keeping in mind that I want to keep this understandable to the more non-technical crowd, do you think that part should be reworded?
|
09-27-2010, 10:47 AM | #72 |
Grand Sorcerer
Posts: 11,806
Karma: 7029971
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
|
@Manichean: good work. Writing to explain ideas is very difficult for lots of reasons. I think you found a good balance.
One nit: in a couple of places you have lines that start with punctuation. These always happen after a code box. One is ', for lower- and uppercase characters you'd' and another is '. The other shorthands can be complemented by'. |
09-27-2010, 11:04 AM | #73 | |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Quote:
I should probably change that habit and proofread the finished, rendered text instead... |
|
09-27-2010, 11:44 AM | #74 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
|
|
10-02-2010, 10:24 PM | #75 |
Evangelist
Posts: 405
Karma: 692
Join Date: Sep 2006
Device: Samsung Galaxy Note 3 | Kindle Paperwhite | iPad Mini
|
If I want to remove multiple lines of text, do I enclose my reg expressions in parentheses and then separate the sets by a ?
|
Tags |
regexp calibre tutorial |
|