MobileRead Forums > E-Book General > General Discussions

Old 08-23-2015, 07:59 PM   #1
bobodude
Connoisseur
Posts: 70
Karma: 1800048
Join Date: Oct 2014
Device: BooX M96
Batch download PDFs?

I've been looking around the web for a simple and fast way to download many PDFs from a website, but haven't found a solution I am able to figure out.

Does anyone know of a simple way to do this?

Thanks!
Old 08-23-2015, 08:33 PM   #2
Cinisajoy
Just a Yellow Smiley.
Posts: 19,161
Karma: 83862859
Join Date: Jul 2015
Location: Texas
Device: K4, K5, fire, kobo, galaxy
Is this a particular website?
Old 08-23-2015, 08:51 PM   #3
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Are all the PDFs linked from one (or a couple) pages?

If so, you can do it easily on the command-line with wget.

Code:
wget --recursive --level=1 --accept pdf,PDF http://website.com/pdf-index-page.html
If you are on Windows you will almost definitely need Gow.

Last edited by eschwartz; 08-24-2015 at 04:11 PM. Reason: for stupid PDFs with uppercase extensions ;)
Old 08-23-2015, 09:55 PM   #4
crane3
Guru
Posts: 608
Karma: 5007204
Join Date: Sep 2014
Location: Calif
Device: Fire hdx 8.9, Tab S2, Tab S5e, Aura ONE
Are the files all in one directory, and do you want them all? Or are they "not in sequence", so that you want to pick which files to download? I've fetched every file in a directory just by selecting that directory for download in FTP software, when updating KDE on Linux.

I don't recall any FTP software that lets you select various files for a "batch" download, which really just means having the software download each file in turn.

There may now be FTP software that can "batch" download selected files, possibly as a browser add-on; e.g., try FireFTP for Firefox. The NcFTP client has (or had) a batch download, but I think it just gets files that are in sequential order.
Old 08-23-2015, 10:02 PM   #5
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
@crane3 -- who says they are available on an FTP server?
Old 08-24-2015, 04:07 AM   #6
bobodude
Connoisseur
Posts: 70
Karma: 1800048
Join Date: Oct 2014
Device: BooX M96
thanks for all the answers, I will try the firefox add-on,

I tried wget, but need to find a good tutorial online to get it to work ...

(I don't have one particular site in mind, but this is something I try to do every once in a while for different sites ...)
Old 08-24-2015, 01:20 PM   #7
j.p.s
Grand Sorcerer
Posts: 5,786
Karma: 103362673
Join Date: Apr 2011
Device: pb360
Quote:
Originally Posted by eschwartz View Post
Are all the PDFs linked from one (or a couple) pages?

If so, you can do it easily on the command-line with wget.

Code:
wget --recursive --level=1 --accept "*.pdf" http://website.com/pdf-index-page.html
Systems that support FUSE can use httpfs.
Old 08-24-2015, 01:35 PM   #8
crane3
Guru
Posts: 608
Karma: 5007204
Join Date: Sep 2014
Location: Calif
Device: Fire hdx 8.9, Tab S2, Tab S5e, Aura ONE
Quote:
Originally Posted by eschwartz View Post
@crane3 -- who says they are available on an FTP server?
If it is not an FTP server, then it must be some other type of "server", in the sense that a "server" simply makes some files available. Even a browser's download option gets files from a server. When copying files from directory A to directory B, directory A could be considered the "server".
Old 08-24-2015, 03:42 PM   #9
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
Originally Posted by j.p.s View Post
Systems that support FUSE can use httpfs.
Interesting idea.
But as far as I can tell, that only helps you "download" faster. While interesting, it isn't directly applicable to the idea of "downloading many files in a batch job".

Quote:
Originally Posted by crane3 View Post
If it is not an FTP server, then it must be some other type of "server", in the sense that a "server" simply makes some files available. Even a browser's download option gets files from a server. When copying files from directory A to directory B, directory A could be considered the "server".
Well, I thought you were trying to say "use the capability of FTP to download a directory structure as communicated by the FTP protocol", which falls entirely flat as a method if you only have an HTTP server available and cannot browse the server's filesystem hierarchy.



As I suggested in the first place, the most likely solution is going to be something like wget, which can start from an index page and recursively follow its links to the desired PDFs.
Old 08-24-2015, 04:08 PM   #10
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Quote:
Originally Posted by bobodude View Post
thanks for all the answers, I will try the firefox add-on,

I tried wget, but need to find a good tutorial online to get it to work ...

(I don't have one particular site in mind, but this is something I try to do every once in a while for different sites ...)
Well, wget is pretty simple.
How much experience do you have with the command-line?


Using the command string I offered above, you simply replace "http://website.com/pdf-index-page.html" with the URL of a web page that contains links to all the PDFs you want.

Tutorial: http://www.thegeekstuff.com/2009/09/...some-examples/
The official wget documentation: https://www.gnu.org/software/wget/ma...ode/index.html
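
For anyone new to wget, here is the same command again, split up with a note on each option. This is only a sketch: it assumes a POSIX-style shell such as sh or bash, the URL is the placeholder from earlier in the thread, and the `url` and `opts` variable names are made up for the example. It prints the command instead of running it, so it can be checked first.

```shell
# Placeholder page from earlier in the thread; replace it with a real index page.
url="http://website.com/pdf-index-page.html"

opts="--recursive"              # follow the links found on the starting page
opts="$opts --level=1"          # but go only one link away from that page
opts="$opts --accept pdf,PDF"   # keep only files whose names end in pdf or PDF

# Dry run: print the full command. Delete "echo" to start the download.
echo wget $opts "$url"
```

Once the printed command looks right, the same line without `echo` does the actual download.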
Old 08-24-2015, 05:15 PM   #11
Section8
Addict
Posts: 264
Karma: 2121470
Join Date: Oct 2011
Location: Arlington, TX
Device: Kindle PW4, Moon+ Reader on a cheap Android tablet
In the past, I have used a firefox plugin called DownThemAll to mass download pdf manuals from IBM web pages. This might work for you.
Old 08-24-2015, 05:19 PM   #12
bobodude
Connoisseur
Posts: 70
Karma: 1800048
Join Date: Oct 2014
Device: BooX M96
Hi,

I don't have much experience with the command line,

but all the answers here got me more intrigued in wget,

so I've got it installed now, and I'm playing around with it a bit,

thanks for all the answers !!!

It's working!

I will be testing it on different websites, I guess it won't work everywhere ...

Last edited by bobodude; 08-24-2015 at 05:59 PM.
Old 08-24-2015, 06:24 PM   #13
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Good to hear it helped you.

Note: It should work everywhere, so long as the website offers download links.
It might not work in places where content-disposition headers rename the download or redirects are in place -- both lead to downloaded filenames that look like e.g. attachment.php?attachmentid=141344&d=1440341764 and are filtered out because we only accepted PDFs -- or where the website uses a robots.txt to forbid bot downloads.
The solution to all of these lies in advanced wget usage; for instance, in my wgetrc (permanent configuration file) I have trust_server_names=on and content_disposition=on and robots=off. You can also pass those options with
Code:
--execute trust_server_names=on --execute content_disposition=on --execute robots=off
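
Put together with the earlier command, the whole thing might look like the sketch below. It assumes a POSIX-style shell such as sh or bash, reuses the placeholder URL from earlier in the thread, and uses made-up variable names (`url`, `base`, `extra`). It only prints the command, so nothing is downloaded until the `echo` is removed.

```shell
# Placeholder page from earlier in the thread; replace it with a real index page.
url="http://website.com/pdf-index-page.html"

# The basic options from the first command in the thread.
base="--recursive --level=1 --accept pdf,PDF"

# The three settings named above, passed on the command line.
extra="--execute trust_server_names=on"
extra="$extra --execute content_disposition=on"
extra="$extra --execute robots=off"

# Dry run: print the full command. Delete "echo" to start the download.
echo wget $base $extra "$url"
```

Passing the settings this way means no wgetrc file has to be found or edited.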

Last edited by eschwartz; 08-24-2015 at 06:26 PM.
Old 08-24-2015, 06:36 PM   #14
bobodude
Connoisseur
Posts: 70
Karma: 1800048
Join Date: Oct 2014
Device: BooX M96
Quote:
Originally Posted by eschwartz View Post
Good to hear it helped you.

Note: It should work everywhere, so long as the website offers download links.
It might not work in places where content-disposition headers rename the download or redirects are in place -- both lead to downloaded filenames that look like e.g. attachment.php?attachmentid=141344&d=1440341764 and are filtered out because we only accepted PDFs -- or where the website uses a robots.txt to forbid bot downloads.
The solution to all of these lies in advanced wget usage; for instance, in my wgetrc (permanent configuration file) I have trust_server_names=on and content_disposition=on and robots=off. You can also pass those options with
Code:
--execute trust_server_names=on --execute content_disposition=on --execute robots=off
Hi, I tried it, and had exactly this problem !

[could you give me an example of downloading a site, using this command (I'm a noob ...)]

Hmm, I think I figured out what you meant: I have to make the above changes in the wgetrc file, which I can't seem to find ...



Thanks in advance !

PS: I am using Windows, and downloaded wget from this site: https://builtvisible.com/download-yo...ite-with-wget/ (first link, under download wget) ...

PPS: I am also trying this: http://www.jensroesner.de/wgetgui/, which is a wgetGUI, probably the noob version of wget, and I'm getting HTML files, which is already a start; will continue fiddling around ...


PPPS: what I would like to do is download some pdf articles from a journal (newleftreview.org), in a faster way ...

Last edited by bobodude; 08-25-2015 at 03:47 PM.
Old 08-24-2015, 07:38 PM   #15
Dazrin
Wizard
Posts: 2,728
Karma: 75825105
Join Date: Dec 2010
Location: PDXish
Device: Kindle Voyage, various Android devices
Quote:
Originally Posted by Section8 View Post
In the past, I have used a firefox plugin called DownThemAll to mass download pdf manuals from IBM web pages. This might work for you.
Same here although not IBM.

And if the files I need aren't all linked from the same page but I know they are numbered sequentially, I have used a bookmarklet to make a numbered list. See this page for an example: https://www.squarefree.com/bookmarkl..._numbered_list
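
The same numbered-list idea can also be sketched in the shell for use with wget, instead of a bookmarklet plus DownThemAll. This is an assumption-heavy example: the `make_urls` helper, the URL prefix and the `.pdf` suffix are all made up, and it needs a POSIX-style shell such as sh or bash.

```shell
# make_urls PREFIX FIRST LAST
# Print one URL per line: PREFIX + number + ".pdf", for FIRST through LAST.
make_urls() {
    n=$2
    while [ "$n" -le "$3" ]; do
        printf '%s%d.pdf\n' "$1" "$n"
        n=$((n + 1))
    done
}

# Hypothetical site whose files are named issue-1.pdf through issue-5.pdf.
make_urls "http://website.com/files/issue-" 1 5
```

Saving that output to a file (for example by adding `> urls.txt` to the last line) lets wget fetch the whole list with `wget --input-file=urls.txt`.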

Last edited by Dazrin; 08-24-2015 at 07:45 PM.
All times are GMT -4.

