Quote:
Originally Posted by Dullahir
What I have learned:
Most of the time, if there are no tags in author names or titles (for example, nothing like <i> or <b>), it's safe to just leave them out of the RegEx.
To remove numbers from books, the expression '\d' would stand in for [0-9]. This would delete every digit from the book. I'm looking for another way, however. In the tutorial, it mentioned 'Page of Number'. Rare are the occasions when the books in my library have 'Page of Number' instead of just a regular number, so I have had trouble making the expression Page [0-9] of 65 work.
Also, I don't think I am going to use the \d expression. While handy, what about when you see things like "11:30. He was late. Again."? Without the expression that is what you'd see, but with it, you'd see ":. He was late. Again."
Any ideas? Because
Page [0-9][0-9] of 65 won't work,
Page [0-9][0-9]+ of 65 won't work. (Double-expressions because of the double integers, I'm assuming.)
I haven't tried [0-9]+ of 65, but I'm not really too hopeful on that, but it won't hurt to try, I guess!
|
In your case:
11:30 won't be captured as a whole by \d+ (it only matches the 11 and the 30 separately). You would need \d+\:\d+
With 12.5, \d+ will only capture the 12; \d+\.\d+ is needed
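To see the difference, here is a quick sketch in Python's re module. That is only an assumption about the flavor; your tool's regex engine may differ in details. The sample sentence with 12.5 is made up for the demo.

```python
import re

# Made-up sample text: the first part is from the question, the 12.5 part is invented.
text = "11:30. He was late. Again. He owed 12.5 dollars."

# \d+ matches each run of digits on its own, never the punctuation between them.
digit_runs = re.findall(r"\d+", text)

# Spelling out the separator matches the whole time or decimal.
times = re.findall(r"\d+:\d+", text)
decimals = re.findall(r"\d+\.\d+", text)

# Deleting every digit leaves the punctuation behind, as described in the question.
stripped = re.sub(r"\d", "", "11:30. He was late. Again.")

print(digit_runs)
print(times)
print(decimals)
print(stripped)
```

So a bare \d (or \d+) is too greedy for cleaning a book, while a pattern that spells out everything around the digits only hits what you mean it to hit.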
Page [0-9][0-9] of 65
for D+ to capture. ALL of the green has to be present, including spaces.
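A small sketch of what "everything has to be present" means, again assuming Python's re module rather than whatever engine your tool uses. The sample strings are invented for illustration.

```python
import re

# Strict: exactly two digits, then the literal " of 65".
strict = re.compile(r"Page [0-9][0-9] of 65")
# Looser: one or more digits on both sides, literal words and single spaces still required.
loose = re.compile(r"Page [0-9]+ of [0-9]+")

# Invented samples: a normal footer, a one-digit page, a doubled space, a missing "Page".
samples = ["Page 12 of 65", "Page 7 of 65", "Page  12 of 65", "12 of 65"]

strict_hits = [s for s in samples if strict.search(s)]
loose_hits = [s for s in samples if loose.search(s)]

print(strict_hits)
print(loose_hits)
```

The strict pattern misses the one-digit page, and both patterns miss the doubled space and the line without "Page", because every literal character in the pattern must appear in the text exactly as written.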
Which may be your problem. You have a normal space (%20) in your pattern.
What if it is an NBSP (which is what I would use so things don't spread out)?
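If that is the cause, one way to cover both kinds of space is a character class. This is a rough sketch assuming Python's re module, where \u00a0 in a pattern stands for the non-breaking space; the footer line is a made-up example, and your tool may want the NBSP written differently.

```python
import re

NBSP = "\u00a0"  # non-breaking space
# Made-up footer line that uses NBSPs instead of normal spaces.
line = f"Page{NBSP}12{NBSP}of{NBSP}65"

# Pattern with normal spaces only.
plain = re.compile(r"Page [0-9]+ of [0-9]+")
# Pattern that accepts either a normal space or an NBSP at each gap.
either = re.compile(r"Page[ \u00a0][0-9]+[ \u00a0]of[ \u00a0][0-9]+")

plain_found = plain.search(line) is not None
either_found = either.search(line) is not None
still_matches_normal = either.search("Page 12 of 65") is not None

print(plain_found, either_found, still_matches_normal)
```

The normal-space pattern finds nothing in the NBSP line, while the character-class version matches both variants.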