I am working on a calibre recipe for the Boston Globe (the subscription version). Certain images, such as the editorial cartoon, are often missing from the output, and I don't understand why. Here's the log from today's attempt:
Code:
Fetching http://www.bostonglobe.com/opinion/2012/03/27/editorial-cartoon-romney-house-plans/GIkU9kAMRSFiNAxvKhb5ZO/story.html
Traceback (most recent call last):
  File "site-packages\calibre\web\fetch\simple.py", line 346, in process_images
  File "site-packages\PIL\Image.py", line 1982, in open
IOError: cannot identify image file
Recursion limit reached. Skipping links in http://www.bostonglobe.com/opinion/2012/03/27/editorial-cartoon-romney-house-plans/GIkU9kAMRSFiNAxvKhb5ZO/story.html
The HTML for the image in question is this:
Code:
<img src="/rf/image_960w/Boston/2011-2020/2012/03/27/BostonGlobe.com/EditorialOpinion/Images/03.28ROMNEYHOUSE.tif" data-fullsrc="/rf/image_960w/Boston/2011-2020/2012/03/27/BostonGlobe.com/EditorialOpinion/Images/03.28ROMNEYHOUSE.tif" alt="
">
The image's URL is correct - going to
Code:
http://www.bostonglobe.com/rf/image_960w/Boston/2011-2020/2012/03/27/BostonGlobe.com/EditorialOpinion/Images/03.28ROMNEYHOUSE.tif
does display the image (provided you've got a subscription, of course, which I do).
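For reference, here is a small sketch of how I checked that the relative src from the page resolves to that URL. It uses only the standard library's urljoin (Python 3 import shown); I am assuming calibre resolves relative links in an equivalent way, which I have not verified.

```python
from urllib.parse import urljoin

# Page that was being fetched, copied from the log above.
page_url = (
    "http://www.bostonglobe.com/opinion/2012/03/27/"
    "editorial-cartoon-romney-house-plans/GIkU9kAMRSFiNAxvKhb5ZO/story.html"
)

# The src attribute of the <img> tag, copied from the page's HTML.
img_src = (
    "/rf/image_960w/Boston/2011-2020/2012/03/27/BostonGlobe.com/"
    "EditorialOpinion/Images/03.28ROMNEYHOUSE.tif"
)

# A src starting with "/" replaces the whole path of the page URL,
# keeping only the scheme and host.
resolved = urljoin(page_url, img_src)
print(resolved)
```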
One thought I had was that perhaps calibre (or PIL, which the traceback shows it uses to open images) doesn't support TIF files, but I couldn't find a list of supported image types anywhere.
If that's not the problem, does anyone have an idea of what might be going on?
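In case it helps with diagnosis, here is a minimal sketch of a check I could run on the bytes the server returns for that URL. The idea is to look at the first few bytes to see whether the download really is a TIFF, some other image format with a .tif extension, or an HTML page (for example a login page served to a fetch that isn't logged in). The function name sniff_image_type is my own and is not part of calibre or PIL, and the signature list is deliberately short.

```python
def sniff_image_type(data):
    """Guess what a download contains from its leading bytes.

    Hypothetical helper for diagnosis only; not a calibre or PIL function.
    """
    signatures = [
        (b"\xff\xd8\xff", "jpeg"),
        (b"\x89PNG\r\n\x1a\n", "png"),
        (b"GIF87a", "gif"),
        (b"GIF89a", "gif"),
        (b"II*\x00", "tiff"),  # little-endian TIFF header
        (b"MM\x00*", "tiff"),  # big-endian TIFF header
    ]
    for magic, name in signatures:
        if data.startswith(magic):
            return name

    # Not a known image signature: check whether it looks like an HTML page.
    head = data[:256].lstrip().lower()
    if head.startswith(b"<!doctype") or head.startswith(b"<html"):
        return "html"
    return "unknown"


# Example with made-up byte strings standing in for real downloads.
print(sniff_image_type(b"II*\x00" + b"\x00" * 16))
print(sniff_image_type(b"  <!DOCTYPE html><html>...</html>"))
```

If the result for the real download were "html" or "unknown", the problem would be in what gets fetched rather than in TIF support; if it were "tiff", the format itself would be the more likely culprit.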