Old 04-04-2011, 12:07 PM   #1
audreypots
Member
Posts: 10
Karma: 10
Join Date: Apr 2011
Device: Kindle
My first recipe YeY! I have a question though...

Hi Guys,

I finally made my first recipe, and so far so good (thanks, calibre!). However, I was wondering if you guys could point me in the right direction. (I really feel that my question is very stupid, but I read in this forum that the only stupid question is the one that's not asked.) First, let me say that I have no idea what Python is and only a very basic knowledge of HTML, but I find this very interesting and think it has great potential, so I am trying to learn...

Anyway, my question is this: can you guys point me in the right direction on how to include the images in the articles?

This is the link of the one I am working on:

http://blog.mysanantonio.com/spursnation/feed/

So far everything is okay, but the images are not showing and I have no idea why. I tried playing around with remove_tags_before/remove_tags_after and keep_only_tags/remove_tags, but no success yet...

Sorry for the rant and I really do appreciate the help!
Old 04-04-2011, 01:27 PM   #2
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
Quote:
Originally Posted by audreypots
Anyway, my question is this: can you guys point me in the right direction on how to include the images in the articles?
Have you tried the Basic recipe? Does it keep images? All the keep or remove tags options you referred to act to eliminate tags. That's all they do, so you should start without any of those options and see if you get images. If not, the problem is elsewhere (It could be cookies, scripting problems, headers, login authentication, etc.). If you get images with the Basic recipe (which does not use any of the tag removal options), then you can look at how to get rid of other junk, without also inadvertently deleting images.

IOW, your first job is to figure out whether the images are missing because of something you are doing, or something the website is doing.
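
As a rough illustration of that first check, here is a sketch that uses only the Python standard library (it is not calibre's own code). It scans a saved copy of an article's HTML for <img> tags. The names `ImageCounter` and `image_sources` and the sample markup are made up for the example. If the raw page already has no images in it, no keep or remove option in the recipe can bring them back.

```python
from html.parser import HTMLParser


class ImageCounter(HTMLParser):
    """Collect the src of every <img> tag seen in an HTML string."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "img":
            self.sources.append(dict(attrs).get("src"))


def image_sources(html):
    """Return the src values of all <img> tags found in html."""
    counter = ImageCounter()
    counter.feed(html)
    counter.close()
    return counter.sources


# Hypothetical sample markup standing in for a saved article page.
sample = '<div class="post"><p>Game recap</p><img src="photo.jpg"></div>'
print(image_sources(sample))
```

Running the same function on the HTML a browser receives and on the HTML the recipe fetched would show whether the site is serving different content to each.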
Old 04-04-2011, 02:24 PM   #3
louhike
Junior Member
Posts: 8
Karma: 10
Join Date: Apr 2011
Device: Kindle 3
Give this a try: keep_only_tags = [dict(name='div', attrs={'class':'post-contents clearfix'})]
You should get everything you want with this parameter.
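
To show roughly how a spec like that picks tags, here is a simplified sketch in plain Python. It assumes attribute values are compared as whole strings, which may be stricter than what calibre's HTML parser actually does. The function name `matches_spec` and the sample tags are hypothetical. The point is that when nothing on the page matches the spec, nothing is kept.

```python
def matches_spec(tag_name, tag_attrs, spec):
    """Simplified check of one tag against a keep_only_tags-style dict.

    Assumption: attribute values are compared as whole strings; the real
    matching is done by calibre's HTML parser and may be more lenient.
    """
    if "name" in spec and spec["name"] != tag_name:
        return False
    for key, wanted in spec.get("attrs", {}).items():
        if tag_attrs.get(key) != wanted:
            return False
    return True


spec = dict(name="div", attrs={"class": "post-contents clearfix"})

# Hypothetical tags, written as (name, attributes) pairs.
page_tags = [
    ("div", {"class": "post-contents clearfix"}),
    ("div", {"class": "post-contents"}),
    ("span", {"class": "post-contents clearfix"}),
]

# Only tags that satisfy both the name and every listed attribute survive.
kept = [tag for tag in page_tags if matches_spec(tag[0], tag[1], spec)]
print(kept)
```

So if the class on the real page differs even slightly from the one in the spec, the result can be an empty article.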
Old 04-04-2011, 09:38 PM   #4
audreypots
Quote:
Originally Posted by Starson17
Have you tried the Basic recipe? Does it keep images? All the keep or remove tags options you referred to act to eliminate tags. That's all they do, so you should start without any of those options and see if you get images. If not, the problem is elsewhere (It could be cookies, scripting problems, headers, login authentication, etc.). If you get images with the Basic recipe (which does not use any of the tag removal options), then you can look at how to get rid of other junk, without also inadvertently deleting images.

IOW, your first job is to figure out whether the images are missing because of something you are doing, or something the website is doing.

Hi, and thank you very much for the help. I should have been clearer earlier: I actually started with just the basic recipe, only giving a title and the url of the feed (just like in the tutorial) and everything is fine. I can read the whole article and, as far as I can tell, there is no extra junk from the site. Then I tried the Economist recipe and noticed that it includes images, so I tried playing around with it to try and add the image from the articles. That's why I used the tag removal options.

Old 04-04-2011, 09:43 PM   #5
audreypots
Quote:
Originally Posted by louhike
Give this a try: keep_only_tags = [dict(name='div', attrs={'class':'post-contents clearfix'})]
You should get everything you want with this parameter.
Hi, thanks for the help! I think I tried this one already; to make sure, I tried it again, and after fetching, the whole article is gone. Each article looks something like this:

Quote:
| Next | Section Menu | Main Menu |
This article was downloaded by calibre from http://blog.mysanantonio.com/spursna...-nbas-top-500/


| Section Menu | Main Menu |
| Next | Section Menu | Main Menu | Previous |
This article was downloaded by calibre from http://blog.mysanantonio.com/spursna...other-players/


| Section Menu | Main Menu |
| Next | Section Menu | Main Menu | Previous |
Old 04-05-2011, 08:00 AM   #6
Starson17
Quote:
Originally Posted by audreypots
Hi, thanks for the help! I think I tried this one already; to make sure, I tried it again, and after fetching, the whole article is gone
Do you get images without any keep_only_tags or remove_tags, etc. in your recipe? If not, changing those options will never produce images. You should always post your recipe if you want others to look at it.

Old 04-05-2011, 08:50 AM   #7
Starson17
Quote:
Originally Posted by audreypots
I actually started with just the basic recipe, only giving a title and the url of the feed (just like in the tutorial) and everything is fine
When you wrote "everything is fine" did you mean you had images?

If you didn't have images, then everything wasn't fine, but if you had images, then why would you have "tried playing around with it to try and add the image from the articles"?

If you did not have images with the basic recipe, the tag removal options won't improve anything.
Old 04-05-2011, 10:43 AM   #8
audreypots
Actually no, there was no image with the basic recipe, but I was able to read the whole article, so I thought it was fine. So if the tag removal options won't improve anything, what can I use/do to add the images? Again, thanks for the help, and enjoy the rest of your day!
Old 04-05-2011, 10:54 AM   #9
Starson17
Post your recipe. Use CODE and SPOILER tags. I'll take a look. As to what you can do to fix it, you have to find out the problem. Now we know it's the site, and not your recipe removing a tag that contains the image. As to what the problem is: "(It could be cookies, scripting problems, headers, login authentication, etc.)."
Old 04-06-2011, 08:57 AM   #10
audreypots
Hi,

This is my code, I am afraid it's very basic:

Spoiler:
Code:
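# Base class that every calibre news recipe extends.
from calibre.web.feeds.news import BasicNewsRecipe

class AdvancedUserRecipe1301845915(BasicNewsRecipe):
    title                 = u'My San Antonio Spurs'
    oldest_article        = 1    # skip articles older than this many days
    max_articles_per_feed = 100  # upper limit of articles taken from each feed
    cover_url             = 'http://blog.mysanantonio.com/spursnation/wp-content/themes/niche-site-spurs/images/logo.png'

    # List of (feed title, feed URL) pairs to download.
    feeds = [(u'My San Antonio Spurs', u'http://blog.mysanantonio.com/spursnation/feed/')]

    # Disabled: with this filter enabled, the fetched articles came out empty.
    #keep_only_tags = [dict(name='div', attrs={'class':'post-contents clearfix'})]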
Old 04-06-2011, 11:08 AM   #11
Starson17
Quote:
Originally Posted by audreypots
Hi,

This is my code, I am afraid it's very basic:
That's OK, 1) I prefer to see what you're actually using, 2) it saves me time in organizing the basic structure, and 3) I'm sure you really are interested enough to post.

I ran your recipe and tracked it back at least far enough to see that the site is sensitive to all sorts of issues. If you turn off cookies in Firefox, or block the cookie in TamperData, you get no images. If you send no User-Agent, you get no images, etc. I suspect it may also be sensitive to other headers, such as the Accept header.

Normally, the recipe system provides basic cookie handling and sends a default User-Agent, so something else is likely to be the problem. I once dealt with a site that required an Accept header, which Calibre was not sending, to get past the Bad Behavior module.

I regret that I don't have time to solve the problem for you. Search for some of my posts on Accept headers, Bad Behavior, cookies, etc. to see how to track the HTTP handshaking, cookies and headers. You would need to see what Calibre sends and match that to the minimum that the site finds acceptable.
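
To make that last comparison concrete, here is a minimal sketch using only the Python standard library. All header values and the `missing_headers` helper are hypothetical stand-ins; the real values would come from whatever you capture with a tool like TamperData. It lists which headers a browser sends that the fetcher does not, then builds (without sending) a request that adds one of them.

```python
from urllib.request import Request


def missing_headers(sent, required):
    """Return the names in `required` that are absent from `sent`.

    Header names are compared case-insensitively.
    """
    sent_names = {name.lower() for name in sent}
    return sorted(name for name in required if name.lower() not in sent_names)


# Hypothetical values: what a fetcher sends vs. what a browser capture shows.
fetcher_headers = {"User-Agent": "ExampleFetcher/1.0"}
browser_headers = {
    "User-Agent": "Mozilla/5.0",
    "Accept": "text/html,application/xhtml+xml",
    "Cookie": "session=abc123",
}

gap = missing_headers(fetcher_headers, browser_headers)
print(gap)

# Build (but do not send) a request that closes the gap for one header.
req = Request(
    "http://blog.mysanantonio.com/spursnation/feed/",
    headers={**fetcher_headers, "Accept": browser_headers["Accept"]},
)
print(req.header_items())
```

Adding the missing headers one at a time and re-testing is one way to find the minimum the site accepts. Carrying the result over into a recipe is a separate step not shown here.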
Old 04-07-2011, 06:04 AM   #12
audreypots
Thank you so much! Please, no regrets (I do not know the exact phrase to reply with, but I hope you get my point), as I have wasted some of your time already. All I really needed was a little nudge in the right direction. A few days ago I didn't even know what the problem was; now I can concentrate on it!

Again many many many thanks!