08-22-2009, 08:36 PM   #11
DerSchwarzePrinz
Enthusiast
Posts: 25
Karma: 16
Join Date: Aug 2009
Device: Pocketbook 360, Sony PRS-T1
Quote:
Originally Posted by user_none View Post
In most cases this won't work, because the regex is matched against the HTML produced at an intermediate stage of the conversion pipeline. You will usually need something like:

Code:
(<A name=\d+></a><i>\d+</i><br>\s*<i>Book Title</i><br>)|(<A name=\d+></a><i>Book Title</i><br>\s*<i>\d+</i><br>)
Right now the only way to look at that intermediate HTML is to run the command-line ebook-convert tool with the --debug-input flag.
I found the following in the HTML output:

»Tatsächlich?«,&nbsp;&nbsp;erwiderte&nbsp;&nbsp;der&nbsp;&nbsp;junge&nbsp;&nbsp;Mann&nbsp;&nbsp;tro-<br>
5<br>
<hr>
<A name=6></a>
cken.<br>

How can I remove the passage marked in bold (the number 5 is the page number)?
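
For anyone who wants to try a pattern outside the converter first, here is a minimal Python sketch. It assumes the unwanted passage is the page-number line, the <hr>, and the following anchor, exactly as in the snippet above; the names PAGE_BREAK and strip_page_breaks are made up for illustration, and the pattern has not been checked against real ebook-convert output.

```python
import re

# Assumed shape of the page-break residue (taken from the snippet above):
# a line holding only the page number, then <hr>, then the page anchor.
# ^ with MULTILINE keeps the pattern from eating digits at the end of
# an ordinary text line.
PAGE_BREAK = re.compile(
    r"^\d+<br>\s*<hr>\s*<A name=\d+></a>\s*",
    re.IGNORECASE | re.MULTILINE,
)


def strip_page_breaks(html: str) -> str:
    """Remove page number, <hr>, and page anchor between two pages."""
    return PAGE_BREAK.sub("", html)


sample = (
    "»Tatsächlich?«,&nbsp;&nbsp;erwiderte&nbsp;&nbsp;der&nbsp;&nbsp;junge"
    "&nbsp;&nbsp;Mann&nbsp;&nbsp;tro-<br>\n"
    "5<br>\n"
    "<hr>\n"
    "<A name=6></a>\n"
    "cken.<br>\n"
)

cleaned = strip_page_breaks(sample)
print(cleaned)
```

This only removes the page-break lines; the word split across the page ("tro-" / "cken.") stays hyphenated and would need a separate step.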