MobileRead Forums

RobFreundlich · 09-27-2012, 08:58 AM

Thanks to your tips, I've solved this problem. Here's my solution:

Code:

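    # Module-level imports (these go at the top of the recipe file)
    import os
    import string

    import PIL.Image
    import calibre.utils.magick
    from calibre.web.feeds.news import BasicNewsRecipe

    # Everything below lives inside the recipe class
    valid_filename_chars = "-_.%s%s" % (string.ascii_letters, string.digits)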

    def image_url_processor(self, baseurl, url):
        self.log("===================\nbaseurl: ", baseurl, "\nurl: ", url)
        # This is a hack because some of the URLs just have a leading
        # // instead of http://
        if url.startswith("//"):
            url = "http:" + url

        url = self.get_image(url)

        self.log("url out: ", url, "\n===================")

        return url

    def get_image(self, url):
        # Another hack - sometimes the URLs just have a leading /,
        # in which case I stick on "http://" and the correct domain
        if url.startswith("/"):
            url = self.make_url(url)

        # Get the image bytes
        # Call through the instance so this works whether get_browser
        # is a class method or an instance method
        br = self.get_browser()
        response = br.open(url)
        data = response.get_data()

        # write it to a local file whose name is based on the URL
        filename = ''.join(c for c in url if c in self.valid_filename_chars)
        self.log("filename=%s" % filename)

        with open(filename, "wb") as f:
            f.write(data)

        # Try to read it with PIL, which is what the containing app will do
        try:
            im = PIL.Image.open(filename)
        except Exception:
            # If it failed, read it with ImageMagick and write it to a new file,
            # changing the URL to point to the new file
            self.log("Could not open ", filename, " from ", url)
            self.log("Trying to open and re-save with ImageMagick")
            image = calibre.utils.magick.Image()
            image.read(filename)
            image.save("new_" + filename)
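            # Build a file:/// URL that comes out right on both Windows
            # (C:/...) and Unix (/home/...) paths
            path = os.path.abspath("new_" + filename).replace("\\", "/")
            url = "file:///" + path.lstrip("/")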
            self.log("Succeeded.  Using local file")

        return url
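
A possible simplification of the two URL hacks above, offered as a sketch and not as part of the original recipe: the standard library's urljoin resolves both scheme-relative ("//host/path") and root-relative ("/path") URLs against a base URL. This assumes baseurl is on the domain the images should come from, which may not match whatever make_url does. The helper name absolutize and the example.com URLs are made up for illustration.

```python
# Python 3 import; on Python 2 the same function lives in the urlparse module
from urllib.parse import urljoin


def absolutize(baseurl, url):
    """Resolve url against baseurl (hypothetical helper, not from the recipe)."""
    return urljoin(baseurl, url)


base = "http://example.com/news/story.html"  # placeholder page URL

# Scheme-relative URL: picks up "http:" from the base
print(absolutize(base, "//cdn.example.com/img/a.png"))
# Root-relative URL: picks up scheme and host from the base
print(absolutize(base, "/img/b.png"))
# Already absolute: returned unchanged
print(absolutize(base, "http://other.example.org/c.png"))
```

If this fits the site, image_url_processor could call it once before get_image, in place of the two startswith checks.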

Luckily, ImageMagick manages to load the file successfully AND heal it when saving it back out, so I didn't have to look into what exactly was wrong with these files. Out of curiosity, I did compare the old and new files, and there are differences, but since I don't know squat about the PNG format (and don't have the time or energy to learn), I don't know exactly what they mean.
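
For anyone who wants to dig into what differs between the original and the re-saved file, here is a minimal sketch that lists the chunks in a PNG and checks each chunk's CRC. It relies only on the documented PNG layout (an 8-byte signature, then chunks of length, type, data, and CRC) and the standard library. The function name list_png_chunks is made up for this example, and nothing here is known to be the actual cause of the PIL failure.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def list_png_chunks(data):
    """Return a (type, length, crc_ok) tuple for each chunk in PNG bytes."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    chunks = []
    pos = 8
    # Each chunk is: 4-byte big-endian length, 4-byte type, data, 4-byte CRC
    while pos + 12 <= len(data):
        length, ctype = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        stored = data[pos + 8 + length:pos + 12 + length]
        # The CRC covers the chunk type and data, but not the length field
        expected = zlib.crc32(ctype + body) & 0xFFFFFFFF
        crc_ok = len(stored) == 4 and struct.unpack(">I", stored)[0] == expected
        chunks.append((ctype.decode("ascii", "replace"), length, crc_ok))
        pos += 12 + length
    return chunks
```

Running it on the bytes of the downloaded file and of the "new_" file (for example, list_png_chunks(open(filename, "rb").read())) and comparing the two lists would show whether chunks were dropped, reordered, or had a bad CRC.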

If I had the time, I'd grab the calibre source, find where it's doing the image load, put something like this in, and submit it as a bug fix, but unfortunately, I don't. If anyone else out there wants to do it, go ahead - I happily release this code (particularly the try...except bit that solves the problem) into the public domain.