Quote:
Originally Posted by Starson17
You can start here:
http://calibre-ebook.com/user_manual...asicnewsrecipe
It looks like you already have the meat of your recipe, but want to get rid of "extra pictures and icons and crazy stuff." There are 2 ways to do it. First, you can try the print_version method. See the link, but basically it means substituting a link to a clean version for your messy version. You have to find that link yourself on the site. The other is to clean up or avoid the "crazy stuff."
I suggest you use Firefox and Firebug to identify what you want to remove or keep, then use keep_only_tags or remove_tags to either get only what you want, or remove the junk. Have fun and feel free to ask here for help.
Hi Starson, thanks for the reply. I actually wanted to try the print_version approach, but that is where I got stuck. Take Investment Executive, for example:
The normal URL is:
http://www.investmentexecutive.com/client/en/News/DetailNews.asp?id=53928&IdSection=146&cat=146&BImageCI=1
The print version is:
http://www.investmentexecutive.com/client/en/News/ImprimerDetail.asp?id=53928&IdSection=146&cat=146&BImageCI=1
The URLs are nearly identical: only the page name differs (DetailNews.asp versus ImprimerDetail.asp). I tried a few substitutions, but I keep getting the same output. Could you show me what to put for the replacement when writing the code? Thanks
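To make the question concrete, here is a minimal sketch of the substitution I am after. It assumes the page name is the only difference between the two URLs above and that the query string carries over unchanged. The helper name to_print_url is hypothetical; in a calibre recipe I assume the same one-line replace would go inside the print_version(self, url) method of the recipe class.

```python
def to_print_url(url):
    """Hypothetical helper: map an article URL to its print URL.

    Assumes the print page differs only in the page name
    (DetailNews.asp -> ImprimerDetail.asp), as in the example URLs.
    """
    # str.replace returns the string unchanged when the pattern is absent,
    # so unrelated URLs pass through untouched.
    return url.replace('/News/DetailNews.asp', '/News/ImprimerDetail.asp')


normal = ('http://www.investmentexecutive.com/client/en/News/'
          'DetailNews.asp?id=53928&IdSection=146&cat=146&BImageCI=1')
print(to_print_url(normal))
```

If the output here is right but the recipe still fetches the normal page, the problem is probably in how the method is wired into the recipe rather than in the replacement itself.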