Old 03-24-2005, 01:20 PM   #1
PostGrant
Enthusiast
PostGrant began at the beginning.
 
Posts: 29
Karma: 27
Join Date: Mar 2005
Device: eBookWise, Dell Axim X5
eBookWise 1150/REB offline reading

Hey guys.

I use SiteScooper to gather all the sites I read for the day (BBC, Guardian, The Times, a couple of my favorite blogs). SiteScooper automatically creates an index page for all of these, so if I run SiteScooper on that index, it creates one big, fat HTML file that is fully indexed and contains all my daily reading.

Really handy so I don't need to transfer 8-9 files - the ugly part is SiteScooper has no interface so I had to write a batch file for all this. The good part is it's a 1-click operation - I run it before my shower, and by the time I'm out it's waiting for me to plug in my eBook.

Anyway, just thought I'd share my experience of how SiteScooper saved my EB. I have a feeling this software isn't really supported anymore... the last update I saw was in 2001. Ugh.

Either SiteScooper needs to be resurrected, or some of the folks at FictionWise/eBook Technologies need to realize the importance of reading more than just DRM stuff.
Old 05-13-2005, 02:05 PM   #2
CINCNORAD
Enthusiast
CINCNORAD began at the beginning.
 
Posts: 28
Karma: 44
Join Date: May 2005
Location: Studio City, CA
Device: Ebookwise//Sony Reader//Nokia N800
Any chance you could share a tutorial on how to accomplish this? And if you would like to be my hero -- possibly share any batch files that would help? Been checking out sitescooper, and man that program looks complicated...
Old 08-16-2006, 11:51 PM   #3
technobritt
Junior Member
technobritt began at the beginning.
 
Posts: 1
Karma: 10
Join Date: Aug 2006
Device: GEB 1150
PostGrant--I second CINCNORAD's request for a tutorial or walkthrough if possible. Reading websites offline would be EXACTLY what I'd use this device for most often.
Old 09-01-2006, 12:24 AM   #4
PostGrant
Enthusiast
PostGrant began at the beginning.
 
Posts: 29
Karma: 27
Join Date: Mar 2005
Device: eBookWise, Dell Axim X5
I hear ya. A few months ago, I talked with the developer of the librarian software, and he was developing a spider for the EB1150. It seemed to work pretty darn well. Maybe he needs more beta testers?

In the meantime, I'll come up with a HOWTO on sitescooper. Give me a few days, I gotta go out of town.
Old 09-01-2006, 02:55 AM   #5
stobs
Connoisseur
stobs is on a distinguished road
 
Posts: 62
Karma: 72
Join Date: Oct 2002
Location: Germany
Device: nook
Please post it to the wiki as well.

-S.
Old 02-13-2007, 04:31 AM   #6
Gatton
Groupie
Gatton has a complete set of Star Wars action figures.
 
Posts: 150
Karma: 368
Join Date: Aug 2004
Location: Charlotte, NC
Device: Kindle Paperwhite 2021, assorted Fire tablets.
I thought I'd bump this as I am looking for ways to read news offline on my EB1150. Are there any EB1150 users who can share some tips? The sitescooper page appears to be down and I don't know if it would even work on OSX which is my only option for now (Windows box blew up.) Alternatively is it possible to use something like wget? Bottom line is it would be nice to download a set of headlines/stories in one big html file for easy reading on this device. Any advice is appreciated. Thanks. Oh and I guess I should say I'm mainly interested in news sites like BBC, Washington Post etc. Thanks again.
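
On the "one big html file" part: once the stories are downloaded (by Sitescooper, wget, or anything else), stitching them into a single indexed page is the easy bit. The following is only a minimal sketch, not what Sitescooper itself does. The function name `build_digest`, the anchor names, and the page title are all made up for illustration, and fetching the pages is deliberately left out.

```python
from html import escape


def build_digest(articles):
    """Merge (title, html_body) pairs into one HTML page with a linked index.

    `articles` is a list of (title, body) tuples; each body is assumed to be
    an HTML fragment that was already downloaded by some other tool.
    """
    toc = []
    sections = []
    for n, (title, body) in enumerate(articles, start=1):
        anchor = f"story{n}"          # hypothetical anchor naming scheme
        safe_title = escape(title)    # titles are plain text, so escape them
        toc.append(f'<li><a href="#{anchor}">{safe_title}</a></li>')
        sections.append(
            f'<h2><a name="{anchor}"></a>{safe_title}</h2>\n'
            f"{body}\n"
            f'<p><a href="#top">Back to index</a></p>'
        )
    return (
        "<html><head><title>Daily reading</title></head><body>\n"
        '<a name="top"></a><h1>Daily reading</h1>\n'
        "<ul>\n" + "\n".join(toc) + "\n</ul>\n"
        + "\n<hr>\n".join(sections)
        + "\n</body></html>\n"
    )
```

Calling `build_digest([("BBC headlines", "<p>...</p>"), ("Washington Post", "<p>...</p>")])` returns the whole page as a string, which can then be written to a file and handed to whatever conversion tool the device needs.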
Old 02-13-2007, 12:45 PM   #7
sea2stars
Zealot
sea2stars is on a distinguished road
 
Posts: 104
Karma: 64
Join Date: Dec 2006
Device: eb1150, Sony Reader
I'm looking into Sitescooper and wget too, since I plan to buy an eb1150 shortly.

Sitescooper should work on a Mac. There's plenty of info on the net about the subject, although there still isn't a GUI; at least I can't find one.

I believe that there are front-ends for wget for the Mac & PC; again, Google is your friend.
Old 02-14-2007, 05:03 AM   #8
TadW
Uebermensch
TadW ought to be getting tired of karma fortunes by now.
 
Posts: 2,583
Karma: 1094606
Join Date: Jul 2003
Location: Italy
Device: Kindle
@sea2stars: Sitescooper is console-based only, and development has been stagnant for a long time. It should definitely work on Mac if you have Perl installed.
Old 04-03-2007, 01:03 AM   #9
ashkulz
Addict
ashkulz will become famous soon enough
 
Posts: 350
Karma: 705
Join Date: Dec 2006
Location: Mumbai, India
Device: Kindle 1/REB 1200
I customized bloglines2html so that it would work for my REB1100. It also does a lot of other things: it downloads all the referenced images, blacklists some image domains, removes some unneeded links, and customizes the default templates to read and navigate properly on the ebook.

You will need to download these three files: bloglines2html and two required libraries: feedparser and BeautifulSoup. Put all of them in a single directory, and install Python if you don't have it installed.

Just run the command
Code:
python bloglines2html.py -u userid -p password -o <some-dir>
Point your creation utility at index.html in the directory. I typically use
Code:
rbmake -bef 1 -o feeds.rb index.html
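
If you want the two commands above as a one-click job, a small wrapper along these lines should do. It is a rough sketch only: the helper names are invented, the flags are exactly the ones shown in the commands above and nothing more, and it assumes `python` and `rbmake` are on your PATH and `bloglines2html.py` sits in the current directory.

```python
import subprocess


def bloglines_cmd(userid, password, out_dir):
    """Command line for the feed download step (flags as given above)."""
    return ["python", "bloglines2html.py",
            "-u", userid, "-p", password, "-o", str(out_dir)]


def rbmake_cmd(book="feeds.rb"):
    """Command line for the .rb creation step, pointed at index.html."""
    return ["rbmake", "-bef", "1", "-o", book, "index.html"]


def build_book(userid, password, out_dir):
    """Run both steps in order; stop with an error if either one fails."""
    subprocess.run(bloglines_cmd(userid, password, out_dir), check=True)
    # Run rbmake from inside the output directory so that "index.html"
    # resolves to the page generated by the first step.
    subprocess.run(rbmake_cmd(), cwd=out_dir, check=True)
```

`build_book("userid", "password", "feeds")` would then leave `feeds.rb` inside the `feeds` directory, since rbmake is run from there.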

Last edited by ashkulz; 04-03-2007 at 01:05 AM. Reason: fixed links
Old 05-26-2009, 10:32 PM   #10
nrapallo
GuteBook/Mobi2IMP Creator
nrapallo ought to be getting tired of karma fortunes by now.
 
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 T1 NSTG iLiad_v2 NC Asus_TF Next1 WPDN
Quote:
Originally Posted by ashkulz
I customized bloglines2html so that it would work for my REB1100. It also does a lot of other things: it downloads all the referenced images, blacklists some image domains, removes some unneeded links, and customizes the default templates to read and navigate properly on the ebook.

You will need to download these three files: bloglines2html and two required libraries: feedparser and BeautifulSoup. Put all of them in a single directory, and install Python if you don't have it installed.

Just run the command
Code:
python bloglines2html.py -u userid -p password -o <some-dir>
Point your creation utility at index.html in the directory. I typically use
Code:
rbmake -bef 1 -o feeds.rb index.html
While the links above are no longer active, I was able to get a copy of the above modified python code and shell script directly from ashkulz a while ago. I attach them here in case you are looking for/need same.

EDIT: provided a revised bloglines2html.py for Windows users (changed three occurrences of 'w' to 'wb' in file operations that work with binary data, i.e. images). See the bloglines2html.py.zip attachment.
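
For anyone wondering why that one-letter change matters: with the Python 2 of that era, a file opened with 'w' on Windows is in text mode, so every newline byte written gets expanded to a carriage return plus newline, which corrupts image data. Opening with 'wb' writes the bytes untouched. The snippet below is a minimal illustration only, not the actual bloglines2html.py code; the helper names and the sample bytes are made up.

```python
import os
import tempfile

# Stand-in for downloaded image data (the first 8 bytes of a PNG file).
# It contains 0x0A bytes, which text mode on Windows would rewrite.
PNG_HEADER = b"\x89PNG\r\n\x1a\n"


def save_image(path, data):
    """Write image bytes exactly as given, on every platform."""
    with open(path, "wb") as fh:   # 'wb', not 'w'
        fh.write(data)


def load_image(path):
    """Read the bytes back, again in binary mode."""
    with open(path, "rb") as fh:
        return fh.read()


def roundtrip(data):
    """Save and reload the data in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "image.png")
        save_image(path, data)
        return load_image(path)
```

With binary mode on both sides, `roundtrip(PNG_HEADER)` gives back the same bytes that went in. (Current Python 3 refuses to write bytes to a text-mode file at all, so there the mistake shows up as a TypeError instead of a silently broken image.)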

EDIT2: provided some sample .imp conversions, but needed to tweak the resulting .html to split <a name= href= > into <a name= ><a href= > as well as re-save a few images that were in an incompatible format for the python image handler. Oh yeah, created the .opf also. I'll try and automate these (necessary revisions) a bit more, later on.
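
The anchor-splitting tweak is a good candidate for automating. Here is a rough regex-based sketch, under a few assumptions of mine: the combined tag always has name before href, the new name-only anchor is closed immediately with `</a>`, and the function name `split_anchors` is just for illustration. It is not part of the attached script.

```python
import re

# One attribute value: double-quoted, single-quoted, or bare.
_ATTR = r"""(?:"[^"]*"|'[^']*'|[^\s>]+)"""

# Matches an opening tag of the form <a name=... href=...>
_COMBINED = re.compile(
    r"<a\s+name=(?P<name>%s)\s+href=(?P<href>%s)\s*>" % (_ATTR, _ATTR),
    re.IGNORECASE,
)


def split_anchors(html):
    """Turn <a name=X href=Y> into <a name=X></a><a href=Y>.

    Tags that carry only a name or only an href are left alone, as is
    everything after the opening tag (link text and the closing </a>).
    """
    return _COMBINED.sub(r"<a name=\g<name>></a><a href=\g<href>>", html)
```

Running it over the generated index.html text before conversion would replace the by-hand edit; a real HTML parser would be more robust if the templates ever change.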

p.s. Added a REB1100 .rb (in bloglines2html - May 26, 2009.rb.zip) created by eBook Publisher. An rbmake version (as ashkulz prepared) may be more compatible with the REB1100.
Attached Files
File Type: zip bloglines2html.zip (57.7 KB, 1365 views)
File Type: zip bloglines2html.py.zip (6.9 KB, 1284 views)
File Type: imp bloglines2html - May 26, 2009.imp (1.45 MB, 1399 views)
File Type: imp bloglines2html - May 26, 2009_1200.imp (1.44 MB, 1397 views)
File Type: zip bloglines2html - May 26, 2009.rb.zip (192.5 KB, 1377 views)

Last edited by nrapallo; 05-27-2009 at 12:40 AM. Reason: added sample .imp ebooks for EBW1150 and REB1200 (and .rb)
 

All times are GMT -4.