#1
Connoisseur
Posts: 70
Karma: 1800048
Join Date: Oct 2014
Device: BooX M96
Batch download PDFs?
I've been looking around the web for a simple, fast way to download many PDFs from a website, but haven't found a solution I can figure out.
Does anyone know of a simple way to do this? Thanks!
#2
Just a Yellow Smiley.
Posts: 19,161
Karma: 83862859
Join Date: Jul 2015
Location: Texas
Device: K4, K5, fire, kobo, galaxy
Is this a particular website?
#3
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Are all the PDFs linked from one page (or a couple of pages)?
If so, you can do it easily on the command line with wget.
Code:
wget --recursive --level=1 --accept pdf,PDF http://website.com/pdf-index-page.html
Last edited by eschwartz; 08-24-2015 at 04:11 PM. Reason: for stupid PDFs with uppercase extensions ;)
#4
Guru
Posts: 608
Karma: 5007204
Join Date: Sep 2014
Location: Calif
Device: Fire hdx 8.9, Tab S2, Tab S5e, Aura ONE
Are the files all in one directory, and do you want them all? Or are they "not in sequence", so that you only want to select certain files to download? I've gotten all the files in a directory just by selecting the directory for download in FTP software, to update Linux KDE.
I don't recall any FTP software that allows selecting various files for a "batched" download, which really just means having the software download each file in turn. There may be FTP software that offers "batch" download of selected files now, possibly as a browser add-on; e.g. try FireFTP for Firefox. The NcFTP client has/had a batch download, but I think it just gets files that are in sequential order.
#5
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
@crane3 -- who says they are available on an FTP server?
#6
Connoisseur
Posts: 70
Karma: 1800048
Join Date: Oct 2014
Device: BooX M96
Thanks for all the answers. I will try the Firefox add-on.
I tried wget, but I need to find a good tutorial online to get it to work. (I don't have one particular site in mind; this is something I try to do every once in a while for different sites.)
#7
Grand Sorcerer
Posts: 5,786
Karma: 103362673
Join Date: Apr 2011
Device: pb360
#8
Guru
Posts: 608
Karma: 5007204
Join Date: Sep 2014
Location: Calif
Device: Fire hdx 8.9, Tab S2, Tab S5e, Aura ONE
If it is not an FTP server, then it must be some other type of "server", in the sense that a "server" just makes some files available. Even a browser's download option gets files from a server. When copying files from directory A to directory B, directory A may be considered the "server".
#9
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Interesting idea.
But as far as I can tell, that only helps you "download" faster. While interesting, it isn't directly applicable to the idea of "downloading many files in a batch job".
Quote:
Which falls entirely flat as a methodology if you only have an HTTP server available and cannot trawl the server's filesystem hierarchy. As I suggested in the first place, the most likely solution is going to be something like wget, which can recursively download an index page containing links to the desired PDFs.
#10
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
How much experience do you have with the command line? Using the command string I offered above, you simply replace "http://website.com/pdf-index-page.html" with the URL of some web page that contains links to all the PDFs you want.
Tutorial: http://www.thegeekstuff.com/2009/09/...some-examples/
The official wget documentation: https://www.gnu.org/software/wget/ma...ode/index.html
#11
Addict
Posts: 264
Karma: 2121470
Join Date: Oct 2011
Location: Arlington, TX
Device: Kindle PW4, Moon+ Reader on a cheap Android tablet
In the past, I have used a Firefox plugin called DownThemAll to mass-download PDF manuals from IBM web pages. This might work for you.
#12
Connoisseur
Posts: 70
Karma: 1800048
Join Date: Oct 2014
Device: BooX M96
Hi,
I don't have much experience with the command line, but all the answers here got me more intrigued by wget, so I've got it installed now and I'm playing around with it a bit. Thanks for all the answers! It's working! I will be testing it on different websites; I guess it won't work everywhere.
Last edited by bobodude; 08-24-2015 at 05:59 PM.
#13
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Good to hear it helped you.
Note: It should work everywhere, so long as the website offers download links. It might not work where the real filename comes only from a Content-Disposition header or a redirect; in both cases wget sees a filename like e.g. attachment.php?attachmentid=141344&d=1440341764, which is filtered out because we only accepted PDFs. It also might not work where the website uses a robots.txt to forbid bot downloads. The solution to all of these lies in advanced wget usage. For instance, in my wgetrc (permanent configuration file) I have trust_server_names=on, content_disposition=on, and robots=off. You can also pass those options with
Code:
--execute trust_server_names=on --execute content_disposition=on --execute robots=off
Last edited by eschwartz; 08-24-2015 at 06:26 PM.
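Combined with the command from post #3, the full invocation might look roughly like the sketch below. The URL is the same placeholder as before, and the `fetch_pdfs` function and the `RUN` switch are made-up names for illustration, not part of wget. By default the script only prints the command so you can check it first.

```shell
#!/bin/sh
# Sketch only: fetch_pdfs and RUN are hypothetical names, not wget features.

fetch_pdfs() {
  # $1 = the page that links to the PDFs (placeholder URL below)
  index_url=$1

  # Options from the posts above:
  #   --recursive --level=1   follow links one level deep from the index page
  #   --accept pdf,PDF        keep only files ending in .pdf or .PDF
  #   trust_server_names=on   name files after the redirect target
  #   content_disposition=on  honor the filename suggested by the server
  #   robots=off              ignore robots.txt
  set -- --recursive --level=1 --accept pdf,PDF \
    --execute trust_server_names=on \
    --execute content_disposition=on \
    --execute robots=off \
    "$index_url"

  if [ "${RUN:-0}" = "1" ]; then
    wget "$@"          # actually download
  else
    echo "wget $*"     # dry run: just show the command
  fi
}

fetch_pdfs "http://website.com/pdf-index-page.html"
```

Running it with `RUN=1` set in the environment starts the download; without it, nothing is fetched.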
#14
Connoisseur
Posts: 70
Karma: 1800048
Join Date: Oct 2014
Device: BooX M96
Quote:
[Could you give me an example of downloading a site using this command? (I'm a noob...)]
Hmm, I think I figured out what you meant: I have to make the above changes in the wgetrc file, which I can't seem to find. Thanks in advance!
PS: I am using Windows, and downloaded wget from this site: https://builtvisible.com/download-yo...ite-with-wget/ (first link, under "download wget").
PPS: I am also trying this: http://www.jensroesner.de/wgetgui/, which is a wget GUI, probably the noob version of wget, and I'm getting HTML files, which is already a start. I will continue fiddling around.
PPPS: What I would like to do is download some PDF articles from a journal (newleftreview.org) in a faster way.
Last edited by bobodude; 08-25-2015 at 03:47 PM.
#15
Wizard
Posts: 2,728
Karma: 75825105
Join Date: Dec 2010
Location: PDXish
Device: Kindle Voyage, various Android devices
Quote:
And if the files I need aren't all linked from the same page, but I know they are sequential, I have used a bookmarklet to make a numbered list. See this page for an example: https://www.squarefree.com/bookmarkl..._numbered_list
Last edited by Dazrin; 08-24-2015 at 07:45 PM.
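For anyone who would rather stay on the command line, the same numbered-list idea can be sketched in a few lines of shell. The base URL, the `issue-NNN.pdf` naming pattern, and the `make_url_list` function are all assumptions for illustration; adjust them to whatever the real file names look like.

```shell
#!/bin/sh
# Sketch only: make_url_list and the issue-NNN.pdf pattern are hypothetical.

# Print one URL per line for files numbered first..last (zero-padded to 3 digits).
make_url_list() {
  base_url=$1
  i=$2
  last=$3
  while [ "$i" -le "$last" ]; do
    printf '%s/issue-%03d.pdf\n' "$base_url" "$i"
    i=$((i + 1))
  done
}

# Write the list to a file; nothing is downloaded at this point.
make_url_list "http://website.com/pdfs" 1 12 > urls.txt

# To download everything in the list afterwards:
#   wget --input-file=urls.txt
```

wget reads one URL per line from the file given to `--input-file`, so no index page is needed in this case.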