Quote:
Originally Posted by eschwartz
Good to hear it helped you.
Note: It should work everywhere, so long as the website offers download links.
It might not work in places where Content-Disposition headers rename the download or redirects are in place -- both lead to downloaded filenames that look like e.g. attachment.php?attachmentid=141344&d=1440341764 and are filtered out because we only accepted PDFs -- or where the website uses a robots.txt to forbid bot downloads.
The solution to all these is in advanced wget usage, for instance in my wgetrc (permanent configuration file) I have trust_server_names=on and content_disposition=on and robots=off. You can also pass those options with
Code:
--execute trust_server_names=on --execute content_disposition=on --execute robots=off
|
Hi, I tried it, and had exactly this problem!
[Could you give me an example of downloading a site using this command? (I'm a noob ...)]
Hmm, I think I figured out what you meant: I have to make the above changes in the wgetrc file, which I can't seem to find ...
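(For reference, in case it helps other noobs: the wgetrc doesn't exist by default -- you create it yourself as a plain text file, one setting per line. Based on the options in the quoted post, it would contain something like:)

Code:
trust_server_names = on
content_disposition = on
robots = off

(Where wget looks for the file depends on the build: it first checks the WGETRC environment variable, and on Unix falls back to ~/.wgetrc; on Windows builds the location varies, so check the documentation that came with your wget.exe.)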
Thanks in advance!
PS: I am using windows, and downloaded wget from this site:
https://builtvisible.com/download-yo...ite-with-wget/ (first link, under download wget) ...
PPS: I am also trying this:
http://www.jensroesner.de/wgetgui/, which is a wget GUI, probably the noob version of wget, and I'm getting html files, which is already a start; I will continue fiddling around ...
PPPS: What I would like to do is download some PDF articles from a journal (newleftreview.org), in a faster way ...
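(A sketch of what such a command might look like, combining the three --execute options from the quoted post with wget's recursive download and PDF filter. This is untested: the --level depth and the URL are guesses on my part, and the journal may require a login before it serves the full PDFs.)

Code:
wget --recursive --level=2 --no-parent --accept pdf --execute trust_server_names=on --execute content_disposition=on --execute robots=off https://newleftreview.org/

(--recursive follows links from the start page, --level=2 limits how deep it goes, --no-parent keeps it from wandering above the start URL, and --accept pdf keeps only files ending in .pdf.)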