09-27-2012, 08:58 AM   #5
RobFreundlich
Solved it

Thanks to your tips, I've solved this problem. Here's my solution:

Code:
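    # Module-level imports this excerpt relies on (inferred from the names
    # used below; put them at the top of the recipe file):
    #   import os
    #   import string
    #   import PIL.Image
    #   import calibre.utils.magick
    valid_filename_chars = "-_.%s%s" % (string.ascii_letters, string.digits)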

    def image_url_processor(self, baseurl, url):
        self.log("===================\nbaseurl: ", baseurl, "\nurl: ", url)
        # This is a hack because some of the URLs just have a leading
        # // instead of http://
        if url.startswith("//"):
            url = "http:" + url

        url = self.get_image(url)

        self.log("url out: ", url, "\n===================")

        return url

    def get_image(self, url):
        # Another hack - sometimes the URLs just have a leading /,
        # in which case I stick on "http://" and the correct domain
        if url.startswith("/"):
            url = self.make_url(url)

        # Get the image bytes
        br = BasicNewsRecipe.get_browser()
        response = br.open(url)
        data = response.get_data()

        # write it to a local file whose name is based on the URL
        filename = ''.join(c for c in url if c in self.valid_filename_chars)
        self.log("filename=%s" % filename)

        # "with" closes the file even if the write fails
        with open(filename, "wb") as f:
            f.write(data)

        # Try to read it with PIL, which is what the containing app will do
        try:
            PIL.Image.open(filename)
        except Exception:
            # If it failed, read it with ImageMagick and write it to a new file,
            # changing the URL to point to the new file
            self.log("Could not open ", filename, " from ", url)
            self.log("Trying to open and re-save with ImageMagick")
            image = calibre.utils.magick.Image()
            image.read(filename)
            image.save("new_" + filename)
            url = os.getcwd() + "/new_" + filename
            url = "file:///" + url.replace("\\", "/")
            self.log("Succeeded.  Using local file")

        return url
Luckily, ImageMagick manages to load the file successfully AND heal it when saving it back out, so I didn't have to look into what exactly was wrong with these files. Out of curiosity, I did compare the old and new files, and they differ, but since I don't know squat about the PNG format (and don't have the time or energy to learn), I don't know exactly what the differences mean.

If I had the time, I'd grab the calibre source, find where it's doing the image load, put something like this in, and submit it as a bug fix, but unfortunately, I don't. If anyone else out there wants to do it, go ahead - I happily release this code (particularly the try...except bit that solves the problem) into the public domain.
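
One more note for anyone adapting the code above: the two startswith() hacks could probably be replaced by a single urljoin() call, which resolves both "//host/path" and "/path" against a base URL. This is a different technique from what the recipe does, shown here only as a sketch. It assumes baseurl is the absolute URL of the page containing the image, and the names absolutize and url_to_filename are invented for the example. It uses the Python 3 standard library; on Python 2, urljoin lives in the urlparse module instead.

```python
import string
from urllib.parse import urljoin

# Same whitelist as valid_filename_chars in the recipe
VALID_FILENAME_CHARS = "-_.%s%s" % (string.ascii_letters, string.digits)


def absolutize(baseurl, url):
    """Resolve a possibly relative image URL against the page URL.

    Handles "//host/path" (keeps the base scheme), "/path" (keeps the
    base scheme and host), and already-absolute URLs (left unchanged).
    """
    return urljoin(baseurl, url)


def url_to_filename(url):
    """Build a local file name by dropping every character not whitelisted."""
    return "".join(c for c in url if c in VALID_FILENAME_CHARS)


if __name__ == "__main__":
    base = "http://example.com/news/story.html"  # placeholder page URL
    for raw in ("//cdn.example.com/a.png", "/img/a.png"):
        full = absolutize(base, raw)
        print(raw, "->", full, "->", url_to_filename(full))
```

One thing to watch with the filename step: because slashes are simply dropped, two different URLs (say ".../a/b.png" and ".../ab.png") can collapse to the same file name.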