Old 03-03-2003, 04:03 PM   #1
bookrats
Junior Member
bookrats began at the beginning.
 
Posts: 8
Karma: 18
Join Date: Feb 2003

Just curious -- has anyone had any luck grabbing the Zap2it.com TV news page (or any of their other individual pages)? It's at http://tv.zap2it.com/news/tvnewsdaily_headlines.html. I'm trying to get only the links listed under "TV NEWS DAILY", but no luck -- it grabs links from all over the page, making for a very large file (and, more importantly, taking a long time to download and convert).

I'm only a moderately knowledgeable HTML guy; I've tried to deconstruct the Zap2it.com news page into something that can be grabbed by iSiloX. However, no luck -- I get a huge file, even with the link level set to 1 in iSiloX and only following links below the root.

Thought someone else here might have a clue about how to do this. I've searched the archives, but couldn't find any mention of Zap2it.

(Also, is there an FAQ and/or "tricks of the trade" page for figuring out how to get just the info you want out of a web page for iSiloX?)

TIA...

Jeff
Old 03-03-2003, 04:23 PM   #2
Alexander Turcic
Fully Converged
Alexander Turcic ought to be getting tired of karma fortunes by now.
Posts: 18,175
Karma: 14021202
Join Date: Oct 2002
Location: Switzerland
Device: Too many to count here.
Jeff,
Check the new links section; I included it there. Unfortunately, there is no official FAQ on this topic yet. This is the idea:

In iSiloX, under Channel Properties->Links, make sure that Follow Offsite Links is ON. There you will also find the URL Filters menu. What I almost always do is EXCLUDE *every* link first. Yes, everything. Add an exclusion filter:

* (type wildcard).

Now we want to make EXCEPTIONS to the above exclusion.
For that, I look closely at the links that we want to browse with iSiloX. In Zap2it, I noticed that all the links you want contain /tvnewsdaily.html?, for example http://tv.zap2it.com/news/tvnewsdaily.html?30388. So let's add that inclusion filter:

\/tvnewsdaily\.html\? (type regular expression, more on that in a moment)

In addition, I looked at the TV News pages and noticed that there are occasionally little images included with the text. We want those too. So another inclusion filter:

\.jpg (type regular expression)

Regular expressions are a way to define patterns. They are easy to grasp but can become quite complex. Use the search engine on this forum to find where I posted some interesting links to RegEx tutorials.
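If you want to check the patterns outside iSiloX before adding them, here is a rough Python sketch of the "exclude everything, then make exceptions" idea. The two regular expressions are the ones from this post; the function name, the image URL, and the assumption that a filter matches anywhere in the URL are mine for illustration. iSiloX may apply its filters differently.

```python
import re

# Inclusion filters copied from the post above.
INCLUDE_PATTERNS = [
    re.compile(r"\/tvnewsdaily\.html\?"),  # individual TV News Daily stories
    re.compile(r"\.jpg"),                  # images embedded in the stories
]

def is_followed(url: str) -> bool:
    """Mimic the '*' exclusion: a link is dropped unless an inclusion pattern matches.

    Assumption: a pattern may match anywhere in the URL (re.search).
    """
    return any(pattern.search(url) for pattern in INCLUDE_PATTERNS)

# Sample links; the .jpg address is a made-up example, not a real Zap2it URL.
sample_urls = [
    "http://tv.zap2it.com/news/tvnewsdaily.html?30388",
    "http://tv.zap2it.com/news/tvnewsdaily_headlines.html",
    "http://tv.zap2it.com/images/example.jpg",
]

for url in sample_urls:
    print("follow" if is_followed(url) else "skip", url)
```

Note that \.jpg is not anchored to the end of the URL, so it would also let through any link that merely contains ".jpg" somewhere in the middle. Writing \.jpg$ is stricter, if that turns out to matter.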

Greets