MobileRead Forums - View Single Post

pckopp · 12-11-2010, 01:31 PM

I am converting some PDF files (I know) to EPUB and MOBI. There are several issues, but the one I'm stuck on, which seems solvable, is removing the header from every page. On the Structure Detection page, I check the Remove Header box and then click the 'Magic Wand' to try to create a regular expression that will do what I want. It shows me the file as HTML and I can see what I want to remove. It is:

<hr>
<A name=12></a><b>Book Author </b><br>
<b>Book Title </b><br>

In the A name= tag the number increases so I understand the <a name=\d+> expression. But the rest eludes me. If I understood what the existing expression did, I could probably make some headway. I want to replace the above with nothing. Note that there is a space char after both the author and title strings inside the bold tags.
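For illustration, here is a minimal sketch of a pattern that would match the header block described above. It is written in Python and assumes the conversion tool's regular expression syntax is compatible with Python's `re` module. The strings "Book Author " and "Book Title " are the placeholders from this post, and the sample HTML and the helper name `strip_headers` are hypothetical. The matching is case-insensitive because the source mixes `<A ...>` with `</a>`.

```python
import re

# Hypothetical sample of converted HTML; "Book Author " and "Book Title "
# are the placeholder strings from the post, not real values.
sample = (
    "<p>end of page 11</p>\n"
    "<hr>\n"
    "<A name=12></a><b>Book Author </b><br>\n"
    "<b>Book Title </b><br>\n"
    "<p>start of page 12</p>\n"
)

# One header block: the rule, the numbered anchor, then the two bold lines.
# \d+ matches the page number that increases; \s* absorbs line breaks.
# Note the literal space kept before each </b>.
HEADER_RE = re.compile(
    r"<hr>\s*"
    r"<a name=\d+></a>\s*"
    r"<b>Book Author </b><br>\s*"
    r"<b>Book Title </b><br>\s*",
    re.IGNORECASE,
)


def strip_headers(html: str) -> str:
    """Replace every matched header block with nothing."""
    return HEADER_RE.sub("", html)


cleaned = strip_headers(sample)
print(cleaned)
```

Joined onto one line, the same pattern would read `<hr>\s*<a name=\d+></a>\s*<b>Book Author </b><br>\s*<b>Book Title </b><br>\s*`. Whether the Remove Header field applies it case-insensitively is something I have not confirmed.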

Also, when I click the Test button nothing seems to happen.

Any help appreciated. Thanks!

Thread title: Removing a header · Post #1