12-30-2014, 08:11 PM | #1 |
Junior Member
Posts: 3
Karma: 10
Join Date: Dec 2014
Device: Kindle Paperwhite
|
Calibre StorePlugin: Download Chunked Response
I'm developing a StorePlugin for Calibre and I've managed to get the bulk of the parsing completed, but when it gets down to the actual downloading of the ebook, I'm running into some issues.
The page I'm scraping to retrieve download links supplies them in the format "http://server.com/get.php?fileID=XXXX". Checking the headers, it's giving a chunked response. Here's the info:
Connection → keep-alive
Content-Encoding → gzip
Transfer-Encoding → chunked
Passing any download link like this to Calibre throws a ValueError (I'm assuming because it's just saving an empty page as the file, not the chunked data referred to by the header). Any ideas on how to tackle this so I can either provide a proper link or somehow patch in chunked response support? |
12-30-2014, 08:55 PM | #2 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
As far as I know, httplib/urllib/mechanize all support chunked transfer encoding. While I am not the maintainer of Get Books, IIRC Get Books uses mechanize, so there should be no problem with chunked transfer encoding.
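A minimal sketch of that point, using only the Python 3 standard library (http.client is the Python 3 name for httplib). It starts a throwaway local server that answers with Transfer-Encoding: chunked and reads the body back with a plain client. The handler, payload, and function name are made up for illustration and have nothing to do with calibre's own code.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

PAYLOAD = b"pretend this is an ebook"  # illustrative stand-in for file data


class ChunkedHandler(BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # chunked encoding only exists in HTTP/1.1

    def do_GET(self):
        self.send_response(200)
        self.send_header("Transfer-Encoding", "chunked")
        self.send_header("Connection", "close")
        self.end_headers()
        # Each chunk on the wire is "<hex length>\r\n<data>\r\n".
        for i in range(0, len(PAYLOAD), 8):
            chunk = PAYLOAD[i:i + 8]
            self.wfile.write(b"%x\r\n%s\r\n" % (len(chunk), chunk))
        self.wfile.write(b"0\r\n\r\n")  # zero-length chunk ends the body

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch_chunked():
    """Serve PAYLOAD in chunks on a local port and read it back."""
    server = HTTPServer(("127.0.0.1", 0), ChunkedHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection(
            "127.0.0.1", server.server_address[1], timeout=5)
        conn.request("GET", "/get.php?fileID=1")
        body = conn.getresponse().read()  # chunks are reassembled here
        conn.close()
        return body
    finally:
        server.shutdown()
        server.server_close()
```

The client never deals with chunk boundaries itself; read() hands back the reassembled body. Note that this sketch only covers Transfer-Encoding; the Content-Encoding: gzip header from the first post is a separate matter that it does not exercise.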
|
12-31-2014, 12:54 AM | #3 |
Junior Member
Posts: 3
Karma: 10
Join Date: Dec 2014
Device: Kindle Paperwhite
|
Interesting, so something like
s = SearchResult()
s.downloads = { 'FORMAT_HERE': 'http://server.com/get.php?fileID=FILE_ID' }
should be running along its merry way? I'll poke around a bit more and see if I can't get it working, and I'll get in contact with the Get Books maintainer if all else fails. Thanks.
Last edited by itsWeller; 12-31-2014 at 12:57 AM. |
12-31-2014, 01:11 AM | #4 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
As far as I know, it should. Look at ebook_download.py for details.
And note that if the server requires authentication, then you will need to provide a cookie file as well. |
01-02-2015, 07:34 PM | #5 |
Junior Member
Posts: 3
Karma: 10
Join Date: Dec 2014
Device: Kindle Paperwhite
|
Sure enough, after a bit of probing, you were right - the issue isn't with chunked transfer. The server actually appears to be checking the Referer header before authorizing the download, and dropping
br.addheaders = [("Referer", "http://servername.com")]
into ebook_download.py is enough to get the plugin off the ground and enable downloads.
Now the next challenge is, how can I specify headers from within my plugin so I can do this the *right* way? I see I can supply cookies for authentication, but I don't see any way to change the headers of the download request. |
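For anyone wanting to reproduce the symptom outside calibre, here is a rough sketch under stated assumptions: a throwaway local server that refuses downloads unless the Referer header matches, and a client that sets the header through addheaders. It uses the standard library's urllib.request opener as a stand-in for the mechanize browser (its addheaders attribute follows the same list-of-tuples pattern); the handler, the 403 response, and the function names are invented for the demo, not taken from the real site or from ebook_download.py.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

REQUIRED_REFERER = "http://servername.com"  # placeholder from the post above


class RefererCheckingHandler(BaseHTTPRequestHandler):
    """Hypothetical server: only serves the file when Referer matches."""

    def do_GET(self):
        ok = self.headers.get("Referer") == REQUIRED_REFERER
        body = b"ebook bytes" if ok else b"forbidden"
        self.send_response(200 if ok else 403)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def download(url, referer=None):
    """Fetch url, optionally sending a Referer header; return (status, body)."""
    # An empty ProxyHandler keeps the local demo from going through a proxy.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    if referer is not None:
        opener.addheaders = opener.addheaders + [("Referer", referer)]
    try:
        with opener.open(url, timeout=5) as resp:
            return resp.status, resp.read()
    except urllib.error.HTTPError as err:
        return err.code, err.read()


def demo():
    """Request the same URL without and with the Referer header."""
    server = HTTPServer(("127.0.0.1", 0), RefererCheckingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        url = "http://127.0.0.1:%d/get.php?fileID=1" % server.server_address[1]
        return download(url), download(url, REQUIRED_REFERER)
    finally:
        server.shutdown()
        server.server_close()
```

The first request in demo() is rejected and the second succeeds, which is the same behaviour change that the one-line addheaders patch produces.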
01-02-2015, 09:50 PM | #6 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Like I said, I'm not the maintainer of Get Books, so I can't say. To me, the best way to proceed is to allow StorePlugins to specify a function that returns the browser object to use for downloads. The default (base class) implementation of this function should just do what is done currently.
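A rough sketch of what that could look like. Everything here is hypothetical: the class names, the create_browser method, and the browser_for_download helper are illustration only, not calibre's actual StorePlugin API, and a standard library opener stands in for the mechanize browser.

```python
import urllib.request


class BaseStore:
    """Hypothetical stand-in for the store plugin base class."""

    def create_browser(self):
        # Default implementation: a plain browser with no extra headers,
        # i.e. the equivalent of "what is done currently".
        return urllib.request.build_opener()


class RefererStore(BaseStore):
    """Hypothetical plugin for a site that checks the Referer header."""

    referer = "http://servername.com"  # placeholder from the earlier post

    def create_browser(self):
        br = super().create_browser()
        br.addheaders = br.addheaders + [("Referer", self.referer)]
        return br


def browser_for_download(store):
    # The download code asks the plugin for its browser instead of
    # constructing one itself, so per-store headers need no core patches.
    return store.create_browser()
```

With this shape, a plugin that needs nothing special inherits the default, and only the stores with unusual requirements override the one method.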
You can try contacting John and asking for his opinion; his email is at the top of ebook_download.py. Or open a bug report on Launchpad, which will notify him when I assign it to him. |
01-02-2015, 10:44 PM | #7 |
creator of calibre
Posts: 43,860
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
|
Tags |
chunks, download, plugin, storeplugin |
|