Regex help to remove HTML footer
This is the HTML code:
Code:
<br clear="all"/><hr/><div class="center"><small><a href="slide19.html">previous</a> | |
OK, I managed to find the answer.
I used .+\ and when I test the expression, it highlights the sections correctly. However, after the conversion they are still there, even though I ticked "Remove footer".
Not sure what you're trying to remove from that html code. Are you saying every page has 'previous', 'table of contents', and 'next' links?
.+\ should only be .+, but try [^>]* instead: it stops at the first > so it can't run past the end of the tag the way a greedy .+ can. You also need to account for variable spacing across line breaks and between tags; \s* helps with that. If some of the parts don't occur every time, surround them with parentheses, e.g. "(<br[^>]*>)", and add a question mark to make the group optional: "(<br[^>]*>)?" Try something like this: Code:
<br[^>]*>\s*<hr/>\s*<div[^>]*>\s*<small>\s*<a\shref[^>]*>\s*previous\s*</a>\s*\|\s*<a\shref[^>]*>\s*Table\sof\sContents\s*</a>\s*\|\s*<a\shref[^>]*>\s*next\s*</a>\s*</small>\s*</div>
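If you want to check the expression outside the conversion dialog, here is a minimal Python sketch. The sample footer is an assumption: only the "previous" link appears in the posted HTML, so the other two links (index.html, slide21.html) and the lowercase link text are made up for illustration. The name strip_footer is hypothetical, and re.IGNORECASE is added only so the capitalisation of the link text doesn't matter.

```python
import re

# The footer pattern from the post, split across lines for readability.
# re.IGNORECASE is an addition so "table of contents" matches in any case.
FOOTER_RE = re.compile(
    r'<br[^>]*>\s*<hr/>\s*<div[^>]*>\s*<small>\s*'
    r'<a\shref[^>]*>\s*previous\s*</a>\s*\|\s*'
    r'<a\shref[^>]*>\s*Table\sof\sContents\s*</a>\s*\|\s*'
    r'<a\shref[^>]*>\s*next\s*</a>\s*</small>\s*</div>',
    re.IGNORECASE,
)

# Hypothetical page: only the "previous" link comes from the original post,
# the remaining links and file names are invented for this test.
sample = (
    '<p>Slide body</p>\n'
    '<br clear="all"/><hr/><div class="center"><small>'
    '<a href="slide19.html">previous</a> |\n'
    '<a href="index.html">table of contents</a> | '
    '<a href="slide21.html">next</a></small></div>\n'
)


def strip_footer(html: str) -> str:
    """Return html with every match of the footer pattern removed."""
    return FOOTER_RE.sub('', html)


cleaned = strip_footer(sample)
```

If cleaned still contains the links, the real footer differs from the sample somewhere (an extra tag, a different label), and that part of the pattern needs loosening or making optional.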
As your source is HTML, if all else fails, you could always try editing the HTML in a text editor before importing to Calibre.
For example, Notepad++ is a very good free text editor. It supports regex and lets you find/replace across multiple open files in one go.
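If you'd rather script the clean-up than do it in an editor, something like the following Python sketch would do the same find/replace over a folder of HTML files before importing them. This is not a calibre feature; the function name strip_footer_from_files, the "*.html" glob and the UTF-8 encoding are all assumptions, and it overwrites files in place, so run it on a copy.

```python
import re
from pathlib import Path


def strip_footer_from_files(folder, pattern, encoding="utf-8"):
    """Remove every match of pattern from each *.html file in folder.

    Returns the names of the files that were changed. Files are
    overwritten in place, so work on a copy of the source folder.
    """
    regex = re.compile(pattern, re.IGNORECASE)
    changed = []
    for path in sorted(Path(folder).glob("*.html")):
        text = path.read_text(encoding=encoding)
        new_text, count = regex.subn("", text)
        if count:
            # Only rewrite files where the footer was actually found.
            path.write_text(new_text, encoding=encoding)
            changed.append(path.name)
    return changed
```

Call it with the folder path and the footer expression as a raw string, e.g. strip_footer_from_files("slides_copy", r"..."), where "slides_copy" stands in for wherever your copied files live.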