Old 11-02-2010, 10:49 AM   #8
ldolse
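Manichean's correct: DOTALL only changes how the dot (.) matches, so it has no effect on a regex that doesn't contain a dot. That said, \s+ matches across newlines regardless of the flag, so as long as it's in the right place the proposed regexes should be OK.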
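
To make the DOTALL point concrete, here's a quick sketch using Python's re module. The sample text and variable names are made up for illustration, they aren't taken from your file:

```python
import re

# Made-up sample: a heading word and its number split across two lines.
text = "Chapter\n1. The Beginning"

# Without DOTALL, "." refuses to match the newline, so this finds nothing.
no_flag = re.search(r"Chapter.1", text)

# With DOTALL, "." matches the newline as well.
with_flag = re.search(r"Chapter.1", text, re.DOTALL)

# \s+ matches the newline whether or not DOTALL is set.
whitespace = re.search(r"Chapter\s+1", text)

print(no_flag)                 # None
print(with_flag is not None)   # True
print(repr(whitespace.group(0)))
```

So adding DOTALL to a pattern with no dot in it changes nothing, while \s+ crosses line breaks on its own.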

All that said, I can't tell what you're really trying to do. Where are you placing this regex so that it does something useful for you? PDF input has some hard-coded regexes that do basically what you're asking for. One of the built-in PDF regexes, the one with more false positives, is only enabled when preprocessing is enabled. I could have sworn that a single number followed by an optional dot (and optionally followed by a title on a second line) was already covered by the default regex...
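
For reference, this is roughly the kind of pattern I mean. To be clear, it's a hypothetical sketch and NOT the actual built-in regex; the pattern, the `heading` name, and the sample text are all assumptions for illustration:

```python
import re

# Hypothetical pattern (not calibre's built-in one): a line holding only a
# number with an optional trailing dot, optionally followed by a title on
# the next line.
heading = re.compile(
    r"^[ \t]*(\d+)\.?[ \t]*$"            # the number line, e.g. "12." or "7"
    r"(?:\n[ \t]*(?P<title>\S[^\n]*))?",  # optional title on the next line
    re.MULTILINE,
)

# Made-up sample text.
text = "some text\n12.\nThe Long Road\n\nBody starts here.\n7\n"

matches = [(m.group(1), m.group("title")) for m in heading.finditer(text)]
print(matches)
```

Note that the optional title part will grab whatever non-blank line follows the number, body text included, which is exactly the sort of false positive a looser pattern produces.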

You could open a bug with your file if you like and we can take a look at it from there.