08-24-2015, 06:36 PM   #14
bobodude
Connoisseur
Posts: 70
Karma: 1800048
Join Date: Oct 2014
Device: BooX M96
Quote:
Originally Posted by eschwartz
Good to hear it helped you.

Note: It should work everywhere, so long as the website offers download links.
It might not work where Content-Disposition headers rename the download or where redirects are in place. Both lead to downloaded filenames that look like e.g. attachment.php?attachmentid=141344&d=1440341764, which are filtered out because we only accepted PDFs. It also might not work if the website uses a robots.txt to forbid bot downloads.
The solution to all of these is advanced wget usage. For instance, in my wgetrc (permanent configuration file) I have trust_server_names=on, content_disposition=on, and robots=off. You can also pass those options with
Code:
--execute trust_server_names=on --execute content_disposition=on --execute robots=off
Hi, I tried it, and had exactly this problem!

[could you give me an example of downloading a site, using this command (I'm a noob ...)]

Hmm, I think I figured out what you meant: I have to make the above changes in the wgetrc file, which I can't seem to find ...
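
For anyone else who can't find a wgetrc: as far as I can tell the file does not have to exist already, you can create one yourself. Below is a rough sketch, assuming a Unix-style shell, that writes the three settings from the quoted post into a file and points wget at it through the WGETRC environment variable. The file name my-wgetrc and the helper write_wgetrc are made up for illustration.

```shell
#!/bin/sh
# Sketch: write a minimal wgetrc and point wget at it.
# The file name "my-wgetrc" is an assumption; any path should work.
WGETRC_FILE="${WGETRC_FILE:-./my-wgetrc}"

write_wgetrc() {
    # $1 is the file to create (overwritten if it already exists).
    cat > "$1" <<'EOF'
# Build local file names from the redirected URL, not the original one.
trust_server_names = on
# Honor the file name given in a Content-Disposition header.
content_disposition = on
# Ignore robots.txt (use with care).
robots = off
EOF
}

write_wgetrc "$WGETRC_FILE"
# wget loads the file named by the WGETRC environment variable, if set.
export WGETRC="$WGETRC_FILE"
```

On Windows the same three setting lines could presumably be typed into a plain text file with a text editor, with WGETRC set in the command prompt (set WGETRC=...) before running wget. I have not verified this against every Windows build of wget.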



Thanks in advance!

PS: I am using Windows, and downloaded wget from this site: https://builtvisible.com/download-yo...ite-with-wget/ (first link, under download wget) ...


PPS: I am also trying this: http://www.jensroesner.de/wgetgui/, which is a wget GUI, probably the noob version of wget, and I'm getting HTML files, which is already a start. Will continue fiddling around ...


PPPS: What I would like to do is download some PDF articles from a journal (newleftreview.org) in a faster way ...
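
To make the question concrete, here is a sketch of what a full command might look like for that goal. It combines the three --execute options from the quoted post with standard wget flags for a shallow, PDF-only crawl. The start URL, the recursion depth of 1, the one-second wait, and the helper name build_wget_cmd are all assumptions for illustration, not something taken from the thread. The script only prints the command so it can be reviewed first.

```shell
#!/bin/sh
# Sketch only: the start URL and recursion depth are assumptions.
START_URL="https://newleftreview.org/"

build_wget_cmd() {
    # $1 is the page to start crawling from.
    # Prints one wget command line; it does not run wget.
    printf '%s ' wget \
        --recursive --level=1 --no-parent \
        --accept pdf \
        --wait=1 \
        --execute trust_server_names=on \
        --execute content_disposition=on \
        --execute robots=off
    printf '%s\n' "$1"
}

# Print the command for review; paste it into a terminal to run it.
build_wget_cmd "$START_URL"
```

The printed line is a plain wget invocation with no shell-specific quoting, so it should also be usable from a Windows command prompt, provided the installed wget build supports these options.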

Last edited by bobodude; 08-25-2015 at 03:47 PM.