05-09-2015, 03:10 AM   #233
AlterusPrime
Device: Kobo Glo
tshering,
Thanks for your assistance, it's greatly appreciated.
After everything I've tried and asked about, it looks like I've finally made a script that does what I wanted. It has some flaws (for example, if the server's listing contains a directory, it probably won't work), but since it's just an "I'm lazy, I don't want to connect my USB" kind of script, I think that's OK.
So, if anybody ever needs the exact same thing I did, here's the listing:
Spoiler:

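#!/bin/sh
# Download every file listed on an nginx autoindex page into $thedir.

index=/tmp/index_$(date +%Y%m%d_%H%M%S)
pindex=/tmp/pindex_$(date +%Y%m%d_%H%M%S)
cindex=/tmp/cindex_$(date +%Y%m%d_%H%M%S)
theurl=http://192.168.1.10:1111
thedir=/mnt/onboard/.books/wgot_em
thetemp=/mnt/onboard/.books/wgot_em/tmp

if [ ! -d "$thedir" ]; then
    mkdir -p "$thedir"
fi

mkdir -p "$thetemp"

# Fetch the raw index page; clean up and stop if the download fails.
if ! wget -O "$index" "$theurl"; then
    rm -rf "$index" "$thetemp"
    exit 1
fi

# Extract the href target from each line, skipping "../" and anything else ending in "/".
while read -r line; do
    printf '%s\n' "$line" | sed -n "/href/ s/.*href=['\"]\([^'\"]*\)['\"].*/\1/p" | grep -v '/$'
done < "$index" > "$pindex"

# Replace any run of whitespace with a newline so there is one link per line.
sed -r -e 's/\s+/\n/g' "$pindex" > "$cindex"

# Download each file into the temp directory, then move it into place.
while read -r line; do
    [ -n "$line" ] || continue
    wget -O "$thetemp/$line" "$theurl/$line" &&
        mv -f "$thetemp/$line" "$thedir/$line"
done < "$cindex"

rm -f "$index" "$pindex" "$cindex"
rm -rf "$thetemp"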

Spoiler:

Here's an explanation of what this script does.
First, there's a bunch of variables:
$index - the raw index.html from the web server.
$pindex - Prepared index - has only the links instead of the full HTML page.
$cindex - Clean index - the same links with newlines instead of spaces, one link per line.
$theurl - the web server's address and port.
$thedir - where the books are stored once everything is done.
$thetemp - temp folder under $thedir, removed when the script finishes.
The if block makes sure $thedir exists.
Then the script creates $thetemp and gets the unprepared index from the server.
After that, the first while loop does the extraction, pulling every link out of the HTML except "../".
sed then replaces all the spaces with newlines.
The second while loop downloads each file from the server and moves it to $thedir, overwriting any existing file there.
Finally, the script cleans up, removing all the index files and the temp directory.
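
For anyone adapting this: the extraction and download steps described above could also be sketched without the intermediate index files. This is only a sketch under assumptions, not the script from the listing: list_links and fetch_all are hypothetical helper names, the page is assumed to be nginx-style autoindex HTML with one link per line, and links ending in "/" are treated as directories and skipped.

```shell
#!/bin/sh
# Sketch only: list_links and fetch_all are hypothetical helpers,
# not part of the original script.

# Read an autoindex HTML page on stdin, print one file link per line.
list_links() {
    sed -n "s/.*href=['\"]\([^'\"]*\)['\"].*/\1/p" |
        grep -v '/$'    # drop "../" and any other link ending in "/"
}

# fetch_all URL DIR: download every listed file from URL into DIR.
fetch_all() {
    url=$1
    dir=$2
    mkdir -p "$dir"
    wget -q -O - "$url" | list_links | while read -r name; do
        # Download under a temporary name first, so a failed transfer
        # never replaces an existing file; remove the leftover on failure.
        wget -q -O "$dir/$name.part" "$url/$name" &&
            mv -f "$dir/$name.part" "$dir/$name" ||
            rm -f "$dir/$name.part"
    done
}
```

Calling fetch_all "$theurl" "$thedir" would then do the whole job in one go. The .part suffix is just an illustrative choice for the temporary name.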


Note: the script is written for nginx 1.9.0 with autoindex enabled, and all the books should be placed in nginx's /html (aka root) directory. It's provided as-is and not guaranteed to work ('cause I'm bad at scripting), but at least for me it does.
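
Since the script only makes sense against an autoindex page, a small sanity check before parsing might help. A minimal sketch, assuming nginx's default HTML autoindex output, whose title starts with "Index of" (is_autoindex is a hypothetical helper name, not something from the script):

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given file looks like an nginx
# autoindex page (assumes the default HTML format with an "Index of" title).
is_autoindex() {
    grep -q '<title>Index of ' "$1"
}
```

It could be called right after the first wget, e.g. is_autoindex "$index" || exit 1, so an error page from the server is not parsed as a book list.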

Thanks for all the help I got here.

Last edited by AlterusPrime; 05-09-2015 at 03:13 AM. Reason: Weird blank space under the post :O