MobileRead Forums - View Single Post - Custom recipes (archive, read-only)

jbambridge · 08-06-2009, 10:11 AM

One extra thought:

Checking:

Quote:

File "calibre\ebooks\oeb\base.pyo", line 917, in _parse_xhtml

in the source code shows that this is part of the code that removes empty <a></a> tags. That matches the example I gave, where the publisher has left a strange link in the text.

Adding a

Code:

remove_tags = [dict(name='a')]

to the recipe is a workaround, although it also removes every valid <a> tag.

My Python is not up to fixing the _parse_xhtml code myself though.

Can anyone suggest a better workaround (one that doesn't delete any valid content) or a fix to the Python code?
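One direction that might be narrower, as an untested sketch: strip only the <a> elements that have nothing but whitespace inside them, and leave links with content alone. The names EMPTY_ANCHOR and strip_empty_anchors below are made up for illustration. I am assuming the pattern could be handed to the recipe through preprocess_regexps, and I have not checked whether that actually avoids the _parse_xhtml error.

```python
import re

# Hypothetical pattern: an <a> element with any attributes whose body is
# empty or whitespace only. Anchors with text or child tags do not match.
EMPTY_ANCHOR = re.compile(r'<a\b[^>]*>\s*</a>', re.IGNORECASE)


def strip_empty_anchors(html):
    """Return html with empty <a></a> elements removed."""
    return EMPTY_ANCHOR.sub('', html)


# Assumed use inside a recipe class (not tested against the attached article):
#     preprocess_regexps = [(EMPTY_ANCHOR, lambda match: '')]

if __name__ == '__main__':
    sample = '<p>before<a href="x"></a>after <a href="y">kept</a></p>'
    print(strip_empty_anchors(sample))
```

One caveat: an empty <a name="..."> is sometimes used as a jump target, so this would also drop those, and any internal links pointing at them would stop working.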

John

P.S. I've attached the offending article as an example of the empty <a> tags. index.txt is the output after processing by the recipe, and problem.txt is the original HTML file.