Old 04-30-2010, 05:18 PM   #19
adolson
I see that Kovid accepted the fix and will put it in the next release, which I guess will be pretty soon, judging by the release history. (I am one day into Calibre and eBooks in general. This is an impressive project, and the frequent releases and fast development are amazing to me.)

I may just wait until then, but in the meantime, I just installed the latest version and am running Ubuntu 10.04. Here's a clip of my PDF from the regex test page:

Quote:
anyone else to know I was using it. Ball Tongue and I would go score some on the sly, after band practice or some other time when the rest of the guys weren’t around. It went on that way for a few 50</p><p>
the final piece</p><p>

months, until I found out something interesting: Munky was doing speed on the days <i>he </i> was off, too. </p><p>
And guess what? So was Jonathan. </p><p>
The part I want to get rid of (marked in red in the original post) is "50</p><p>" followed by "the final piece</p><p>". That's the page number (the footer of the book) and the chapter or book title (the header of each page). If I split them into two regexes, I can't come up with a header regex that removes only the header line, because it also matches the content (some characters and then a </p><p>).

Here is what I tried. I have been converting to TXT format for quick viewing, though EPUB gives the same result:

(?ism)\d+</p><p>.*?</p><p>$

(?m)(\d+</p><p>.*?</p><p>)

(?mi)(\d+</p><p>.*?</p><p>$)

(?mi)(\d+</p><p>$^.*?</p><p>)

...and many other variants...

I based these ideas on the regex given on page 1 that was said to work for multi-line, but I can't figure it out. I'm sure it's something obvious that I'm doing wrong, too. Can anyone help?
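
One way to narrow this down is to run the same patterns through Python's re module directly, outside of Calibre. The following is only a sketch under assumptions: the sample string is my clip typed in by hand (with the long line shortened), and the footer_header pattern is a hypothetical alternative, not the regex from page 1. Calibre's conversion pipeline may feed the text to the regex differently, so a match here does not guarantee the same result there.

```python
import re

# The clip from the regex test page as a single string.
# The long content line is shortened; the line breaks follow the quote.
sample = (
    "It went on that way for a few 50</p><p>\n"
    "the final piece</p><p>\n"
    "\n"
    "months, until I found out something interesting. </p><p>\n"
    "And guess what? So was Jonathan. </p><p>\n"
)

# The four attempts listed above, unchanged.
attempts = [
    r"(?ism)\d+</p><p>.*?</p><p>$",
    r"(?m)(\d+</p><p>.*?</p><p>)",
    r"(?mi)(\d+</p><p>.*?</p><p>$)",
    r"(?mi)(\d+</p><p>$^.*?</p><p>)",
]

# Does each attempt match anywhere in the sample under plain Python re?
matched = [bool(re.search(pattern, sample)) for pattern in attempts]

# Hypothetical alternative: \s* steps over the line break explicitly,
# and [^<\n]* keeps the header part of the match on a single line.
footer_header = re.compile(r"\s*\d+</p><p>\s*[^<\n]*</p><p>\s*")
cleaned = footer_header.sub(" ", sample)

print(matched)
print(cleaned)
```

In plain Python only the first attempt matches. The others lack the s flag, so "." cannot cross the line break between the page number and the header, and in the last one "$^" cannot match because "^" only matches right after a newline, not in front of it.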