MobileRead Forums - View Single Post

smartmart · 10-17-2010, 05:07 AM

Quote:

Originally Posted by BookGnome

I'm not sure how you need to specify it with Calibre's custom syntax, but your regex itself is flawed. Here's a working regex in Python:

Code:

>>> import re
>>> myString = '<p> foo foo foo</p>CHAPTER 1<p> foo foo foo </p>'
>>> re.findall(r'Chapter \d+', myString, re.I)
['CHAPTER 1']

A lot depends on how consistent the input file is, but this should catch any instance of the word 'chapter' followed by a space and one or more digits, without regard to case. How to wrap that in Calibre's regex DSL is a question for the Calibre gurus.
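A small variation on the snippet above, as a sketch only: if the tool taking the pattern accepts just a pattern string and no separate flags argument (an assumption, not something confirmed for Calibre here), the case-insensitive flag can be written inside the pattern as (?i). The sample string and the names `sample`, `pattern` and `matches` are made up for illustration.

```python
import re

# Hypothetical sample input, modelled on the string in the quoted post.
sample = '<p> foo foo foo</p>CHAPTER 1<p> foo </p>Chapter  12<p>chapter one</p>'

# (?i) is the inline form of re.I, so the case flag lives inside the pattern.
# \s+ allows one or more whitespace characters between the word and the digits.
pattern = r'(?i)chapter\s+\d+'

matches = re.findall(pattern, sample)
print(matches)  # ['CHAPTER 1', 'Chapter  12']
```

Note that 'chapter one' is not matched, because \d+ only accepts digits; spelled-out chapter numbers would need a wider pattern.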

I know, I only used a broad regex for testing purposes.

Thanks Idosle, but I'm looking for a solution within calibre (if that's possible), so I can reuse the setting every time.

PS: I don't use calibre's PDF to MOBI conversion because it gets the line wrapping wrong.
It seems that every page of the PDF is treated as a single paragraph.