06-11-2010, 02:58 PM   #2076
Daanish87
Member
Quote:
Originally Posted by Starson17
You can start here:
http://calibre-ebook.com/user_manual...asicnewsrecipe

It looks like you already have the meat of your recipe, but want to get rid of "extra pictures and icons and crazy stuff." There are 2 ways to do it. First, you can try the print_version method. See the link, but basically it means substituting a link to a clean version for your messy version. You have to find that link yourself on the site. The other is to clean up or avoid the "crazy stuff."

I suggest you use Firefox and Firebug to identify what you want to remove or keep, then use keep_only_tags or remove_tags to either get only what you want, or remove the junk. Have fun and feel free to ask here for help.
Hi Starson, thanks for the reply. I actually wanted to try the print_version method, but that is where I got stuck. For example, with Investment Executive:

The normal URL is:

http://www.investmentexecutive.com/client/en/News/DetailNews.asp?id=53928&IdSection=146&cat=146&BImageCI=1

The print is:

http://www.investmentexecutive.com/client/en/News/ImprimerDetail.asp?id=53928&IdSection=146&cat=146&BImageCI=1

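The two URLs are nearly identical: only the page name changes, from DetailNews.asp to ImprimerDetail.asp, and the query string stays the same. I tried a few variations of the replacement in my recipe, but it keeps giving me the same output as before. Could you tell me what I should put for the replacement when writing the code? Thanks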
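For reference, here is a minimal sketch of the substitution being asked about. It assumes the print URL always differs from the normal one only in the page name, which is true for the single example above but is not confirmed for the whole site. It is written as a plain function so it can be run on its own; in a calibre recipe the same body would presumably go in a print_version(self, url) method on the BasicNewsRecipe subclass.

```python
# Sketch only: standalone version of the URL rewrite. In a real recipe this
# logic would live in `def print_version(self, url)` on the recipe class.

def print_version(url):
    """Map a normal article URL to its print-friendly counterpart."""
    # Assumption: only the page name differs; the query string is unchanged.
    return url.replace('DetailNews.asp', 'ImprimerDetail.asp')


# The example article from this post.
article = ('http://www.investmentexecutive.com/client/en/News/'
           'DetailNews.asp?id=53928&IdSection=146&cat=146&BImageCI=1')

print(print_version(article))
```

If the output does not change after adding something like this, one thing to check is that the string being replaced matches the article URLs the feed actually delivers, since those may differ from what the browser shows.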