03-31-2004, 12:42 PM   #1
Alexander Turcic
Fully Converged
Posts: 18,175
Karma: 14021202
Join Date: Oct 2002
Location: Switzerland
Device: Too many to count here.
Pre-Made Scoops for our Regular Members

We now have a beta section where our regular members (everyone with 20 or more posts) can download pre-made scoops in iSilo format (Plucker will follow).

These scoops are regularly updated (several times a day). The list of scoops is expanding, and currently includes: The Economist, Reuters, NZZ, The Inquirer.

MobileRead.com Team

03-31-2004, 12:57 PM   #2
ignatz
mechanoholic
Posts: 582
Karma: 1000217
Join Date: Mar 2004
Location: Sarasota, FL
Device: Nook STR/iPhone 4S/EVO 4G
Alex, this looks really cool! How did you get the Economist scoop to work?! It looks great! Thanks for this service.

I would love to have a few NYTimes sections up here too, if possible. Maybe National, International, Technology, and Week in Review?

Fantastic work!

03-31-2004, 01:02 PM   #3
Alexander Turcic
Thanks for the compliments. I will add those NYT sections. Are your .site files ready, or do we have to optimize them further?

03-31-2004, 01:14 PM   #4
ignatz
I never finished the final mods to the .site files because I couldn't figure out the proper implementation of the code. But for those sections they should work fine. The problem only arises in sections like Movies or Health where there are commonly articles older than 10 days. In the more "newsy" sections, there are never any posts that old, so there is no problem!

The only note I would add is that my .site file is designed for one large scoop containing all the sections a user would want. It is based on a local HTML file. So if you plan to do it section by section, you will need multiple HTML files, each pointing to only one section, and a scoop for each of those sections. The rest of the .site file can probably stay the same, except for the pointer to the HTML file. I could do these mods for you, if you like, but not until tomorrow...
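
For illustration, here is a rough Python sketch of the "one local HTML file per section" idea. It is not part of Sitescooper and not my actual .site setup: the section names, the example.com URLs, and the function names are all placeholders, and each .site file would still need its pointer changed to the matching generated file.

```python
from html import escape
from pathlib import Path

# Hypothetical section slugs and URLs; substitute the real section pages.
SECTIONS = {
    "national": "http://example.com/news/national/",
    "international": "http://example.com/news/international/",
    "technology": "http://example.com/news/technology/",
    "week-in-review": "http://example.com/news/week-in-review/",
}


def render_index(title, url):
    """Return a minimal HTML page containing a single link to one section."""
    return (
        "<html><head><title>{t}</title></head>\n"
        '<body><a href="{u}">{t}</a></body></html>\n'
    ).format(t=escape(title), u=escape(url, quote=True))


def write_section_indexes(sections, out_dir):
    """Write one index file per section and return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for name, url in sorted(sections.items()):
        path = out / f"{name}.html"
        path.write_text(render_index(name, url), encoding="utf-8")
        written.append(path)
    return written
```

Calling write_section_indexes(SECTIONS, "section_indexes") would leave four small HTML files in that folder, one per section, each of which a separate scoop could start from.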

03-31-2004, 01:17 PM   #5
ignatz
And how did you make the Economist scoop work?

03-31-2004, 01:26 PM   #6
jasondv
Enthusiast
Posts: 29
Karma: 10
Join Date: Sep 2003
Location: Philippines
Device: Palm Tungsten|T
Cool, Alexander. Thanks!

03-31-2004, 01:32 PM   #7
Alexander Turcic
Well, I used the same .site file Morpheus posted earlier. I only added images to it.

Then I made some changes to Sitescooper itself. As recommended by stobs, I changed the default user agent to something like "Internet Explorer" to hide sitescooper. To do so, you must edit /lib/Sitescooper/Main.pm:

Find (around line 977):
Perl Code:
  $self->{useragent}->agent ("sitescooper/$VERSION ($self->{home_url}) ".
          $self->{useragent}->agent);
and change it to:
Perl Code:
  $self->{useragent}->agent ("Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)");
That did the trick here at least.
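
For anyone who wants to try the same trick outside Sitescooper, here is a minimal sketch in Python (not Perl, and not Sitescooper code). It only shows the general idea of sending a browser-like User-Agent header with a request. The function names are made up for illustration, and whether a given site accepts the request is something you would have to test yourself.

```python
import urllib.request

# The same browser-like string used in the Main.pm change above.
BROWSER_UA = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"


def build_request(url, user_agent=BROWSER_UA):
    """Return a Request that announces itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def fetch(url, user_agent=BROWSER_UA, timeout=30):
    """Download a page while sending the chosen User-Agent header."""
    request = build_request(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

build_request does no network access, so you can inspect the header it will send before calling fetch on a real URL.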

Greets

03-31-2004, 01:34 PM   #8
Alexander Turcic
Quote:
Originally Posted by ignatz
I could do these mods for you, if you like, but not until tomorrow...
Lemme think about it and the easiest way to do that.

03-31-2004, 01:50 PM   #9
Tanker Bob
Junior Member
Posts: 8
Karma: 26
Join Date: Feb 2003
Location: United States
Device: Sony PEG-T665C
Fox News scoops would be great. These are a nice touch.

03-31-2004, 02:11 PM   #10
gyffes
Enthusiast
Posts: 25
Karma: 56
Join Date: Mar 2003
Device: eMate!!! oh, and an NX-60
Fox News? Don't you mean, White House State News Service?

Alex, any chance you can set up something like this for Mobipocket? I prefer reading news in landscape mode.

As always, thank you for your work on our behalf.

03-31-2004, 02:13 PM   #11
Alexander Turcic
Gyffes, I am afraid Mobipocket won't be possible because it lacks a Linux-compatible console tool to generate e-books, right? I will see what I can do about it.

03-31-2004, 02:34 PM   #12
ignatz
Quote:
Originally Posted by Alexander
As recommended by stobs, I changed the default user agent to something like "Internet Explorer" to hide sitescooper.
That's a great find. I had understood (though don't know where I read it now...) that you could not spoof with sitescooper. There's a couple of other sites that I've been working on that this may help with. Great work Alex.

03-31-2004, 02:37 PM   #13
Alexander Turcic
Quote:
Originally Posted by ignatz
That's a great find. I had understood (though don't know where I read it now...) that you could not spoof with sitescooper. There's a couple of other sites that I've been working on that this may help with. Great work Alex.
Np. I think together we can make Sitescooper even better. For what it's worth, I could also turn 'user-agent' into a .site option.
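
To make the idea concrete, here is a small Python sketch of how a per-site user-agent option with a fallback could behave. This is only an illustration and not Sitescooper's Perl parser: the "UserAgent" key does not exist in Sitescooper, the simple "Key: value" parsing and "#" comment handling are assumptions, and DEFAULT_UA is a placeholder string.

```python
# Placeholder default; the real default is built inside Main.pm.
DEFAULT_UA = "sitescooper (default)"


def parse_site_options(text):
    """Collect 'Key: value' lines into a dict with lowercased keys."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Assumption: blank lines, '#' comments and lines without ':' are skipped.
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        options[key.strip().lower()] = value.strip()
    return options


def user_agent_for(site_text, default=DEFAULT_UA):
    """Return the site's own user agent if it sets one, else the default."""
    # 'useragent' is a hypothetical option name, not an existing one.
    return parse_site_options(site_text).get("useragent", default)
```

A site that needs a browser-like agent would add one "UserAgent: ..." line, and every other site would keep the default without any change.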

03-31-2004, 02:37 PM   #14
kezza
Lowlife of the Party
Posts: 266
Karma: 4038
Join Date: Oct 2002
Location: seattle
Device: nook, iphone
Any chance of adding PIC mobile (http://www.palminfocenter.com/mobile), CS Monitor (text edition at http://www.csmonitor.com/cgi-bin/red...pl?textEdition), and/or Wired (http://www.wired.com/wireless)? I'd love to use this to gather up my morning reading, rather than running iSiloX, because I'm lazy like that.
Plus, I don't think I'm the only one that likes to read those sites regularly.

03-31-2004, 03:09 PM   #15
gvtexas
Addict
Posts: 346
Karma: 7797
Join Date: Nov 2002
Location: Texas
Device: Sony Clie TH55
These look great, Alex, thanks! I only tried a few, but on The Economist scoop, when you click on the "More From (blank)" links you get screens and screens of URLs. I was expecting more stories, but see only the list of URLs (which aren't links).

All times are GMT -4.