Old 03-11-2011, 02:03 PM   #2
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by TonyDeWonderful View Post
The recipe itself works fine except that the article titles sometimes contain <em> and </em> tags (for example, the article title on the Kindle and Calibre v. 0.7.48 will show "<em>Godzilla</em> vs. Real Life"). This was also occurring in the main title once you opened the article but I was able to remove that via "preprocess_html".

Since the "preprocess_html" did not affect the article title, can someone provide me some direction as to how to remove the <em> and </em> tags from the article title?
Use populate_article_metadata like this:

Code:
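    def populate_article_metadata(self, article, soup, first):
        # Debug output: shows the title before any change is made.
        print("Pop article title is: %s" % article.title)
        # Placeholder assignment: put your cleaned-up title on the right-hand side.
        article.title = article.title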
Except, instead of assigning article.title back to itself, do a replace (or whatever else you want) on article.title to strip the tags.
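For example, here is a rough sketch of what that replace could look like. The helper name strip_em_tags and the FakeArticle stand-in are made up for illustration (they are not part of calibre), and I'm assuming the tags show up in the title as literal <em> and </em> text, as in your "Godzilla" example. In your recipe, populate_article_metadata would be a method of the recipe class rather than a free function.

```python
import re

# Matches <em> and </em>, including variants with attributes or odd casing.
# The \b keeps it from touching other tags that merely start with "em" (e.g. <embed>).
EM_TAG = re.compile(r'</?em\b[^>]*>', re.IGNORECASE)


def strip_em_tags(title):
    """Hypothetical helper: return the title with <em>/</em> markup removed."""
    return EM_TAG.sub('', title)


class FakeArticle(object):
    """Stand-in for the article object passed to the method; only .title is used here."""
    def __init__(self, title):
        self.title = title


# In a real recipe this is defined inside the recipe class.
def populate_article_metadata(self, article, soup, first):
    # Overwrite the title with the cleaned-up version.
    article.title = strip_em_tags(article.title)


if __name__ == '__main__':
    demo = FakeArticle('<em>Godzilla</em> vs. Real Life')
    populate_article_metadata(None, demo, None, True)
    print(demo.title)
```

A plain article.title.replace('<em>', '').replace('</em>', '') would also do if those two exact strings are the only thing you ever see; the regex is just a bit more forgiving.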
Quote:
I've included the recipe that I'm using below.
Use code tags in the future; it makes it easier to use your code (highlight it and hit the pound/hash symbol to mark it as code).

Last edited by Starson17; 03-11-2011 at 02:14 PM.