Old 06-25-2010, 02:55 PM   #1
deadSkip
Device: iPhone
PDF to ePub conversion issue - headers getting left in

I'm hoping someone can give me some pointers on where I'm going wrong here. I'm trying to convert a PDF into ePub, but no matter what I do, the header text is left in. According to both the wizard and RegexBuddy, both headers are matched, but after the conversion they're still there.

Here's an example from the debug output.

Input\Index.html Text:
Code:
will soon manifest themselves.”&nbsp;<br>
“I sense nothing.”&nbsp;<br>
<hr>
<A name=12></a>2&nbsp;<br>
Richard A. Knaak&nbsp;<br>
“Your skills are not honed as mine are, my lord, but that&nbsp;<br>
Regex:
Code:
(?i)(?<=)<hr>\s*<A name=/d+></a>(/d+&nbsp;<br>\sRichard A\. Knaak|Moon of the Spider&nbsp;<br>\s/d+)&nbsp;<br>
Doing this leaves the header text in. I do a similar thing with the parsed file and it's still left in.
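One way to rule out the pattern itself is to run it against the sample outside the converter. The following is only a sketch: it assumes the conversion tool uses Python-style regular expressions, the variable names are made up for illustration, and the empty `(?<=)` prefix is left out since it does not change what is matched. It compares the regex as typed above (with `/d+`, which looks for a literal slash followed by `d`) with a variant that uses `\d+` for digits.

```python
import re

# Sample copied from the Input\Index.html excerpt above.
sample = (
    "“I sense nothing.”&nbsp;<br>\n"
    "<hr>\n"
    "<A name=12></a>2&nbsp;<br>\n"
    "Richard A. Knaak&nbsp;<br>\n"
    "“Your skills are not honed as mine are, my lord, but that&nbsp;<br>\n"
)

# The regex as typed in the post: "/d+" means a literal "/" followed by "d"s.
as_posted = re.compile(
    r"(?i)<hr>\s*<A name=/d+></a>"
    r"(/d+&nbsp;<br>\sRichard A\. Knaak|Moon of the Spider&nbsp;<br>\s/d+)&nbsp;<br>"
)

# Assumed intent: "\d+" for one or more digits.
with_backslash = re.compile(
    r"(?i)<hr>\s*<A name=\d+></a>"
    r"(\d+&nbsp;<br>\sRichard A\. Knaak|Moon of the Spider&nbsp;<br>\s\d+)&nbsp;<br>"
)

posted_match = as_posted.search(sample)
fixed_match = with_backslash.search(sample)

# Remove the header the same way a "remove header" step presumably would.
cleaned = with_backslash.sub("", sample)

print("as posted matches:", posted_match is not None)
print("with \\d+ matches:", fixed_match is not None)
print(cleaned)
```

If the regex entered in the conversion dialog really contains `/d+`, that alone would explain the header surviving. If it already uses `\d+` there, the pattern is probably fine and the problem lies elsewhere.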

Parsed\Index.html:
Code:
themselves.” </p><p>
“I sense nothing.” </p><p>
2 </p><p>
Richard A. Knaak </p><p>
“Your skills are not honed as mine are, my lord, but that shall be remedied soon enough, yes?” </p><p>
Regex:
Code:
(?i)(?<=)(Moon of the Spider\s*</p><p>\s\d+\s*</p><p>|\s\d+\s</p><p>\sRichard A\. Knaak\s*</p><p>)
And yes, I've remembered to check the Remove Header boxes.
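For anyone wanting to reproduce this, the parsed-file regex can be checked against the Parsed\Index.html excerpt with a few lines of Python. This is a sketch under stated assumptions: the converter is assumed to use Python-style regular expressions, the names are illustrative, and the empty `(?<=)` prefix is dropped because it does not affect the match. It only shows whether the pattern finds the header in the text as pasted here, not what the converter does internally.

```python
import re

# Sample copied from the Parsed\Index.html excerpt above.
parsed = (
    "themselves.” </p><p>\n"
    "“I sense nothing.” </p><p>\n"
    "2 </p><p>\n"
    "Richard A. Knaak </p><p>\n"
    "“Your skills are not honed as mine are, my lord, "
    "but that shall be remedied soon enough, yes?” </p><p>\n"
)

# The parsed-file regex from the post, minus the empty lookbehind.
header = re.compile(
    r"(?i)(Moon of the Spider\s*</p><p>\s\d+\s*</p><p>"
    r"|\s\d+\s</p><p>\sRichard A\. Knaak\s*</p><p>)"
)

match = header.search(parsed)
cleaned = header.sub("", parsed)

print("header found:", match is not None)
print(cleaned)
```

On this excerpt the pattern does find the page number and author line, so if the header still survives the conversion, the regex may be running against text that differs from the pasted sample (for example, different whitespace or line breaks at that stage).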