01-02-2013, 12:33 PM   #1
swerling
Today's Zaman (English) recipe update

Fixed all RSS feed URLs, removed some invalid ones, and added a few undocumented ones ('Diplomacy', 'Food', etc.).

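Valid feeds were gathered using the following Ruby script, which probes numeric feed IDs from 0 to 1000 and prints the title of every feed that returns a non-trivial body: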

Code:
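require 'mechanize'
require 'nokogiri'

agent = Mechanize.new  # reuse one agent for all requests

# Probe numeric feed IDs and print "url => title" for each non-empty feed.
(0..1000).each do |i|
  url = "http://www.todayszaman.com/#{i}.rss"
  begin
    page = agent.get(url)
  rescue Mechanize::ResponseCodeError
    next  # skip IDs that answer with an HTTP error status
  end
  next unless page.body.size > 20  # near-empty bodies are not real feeds
  title = Nokogiri::HTML(page.body).at_xpath("//title")
  puts "#{url} => #{title.text}" if title  # guard against a missing <title>
end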
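
The "url => title" lines the script prints still have to be turned into recipe entries. Here is a rough sketch of how that could be done, assuming the attached recipe is a calibre news recipe with the usual feeds list of (title, url) tuples. The helper names feeds_entries and feeds_block are made up for illustration, and the sample line uses a placeholder section name, not a real feed title.

```ruby
# Hypothetical helpers (not part of the original post): convert the
# "url => title" lines printed by the probe script into Python tuple
# entries suitable for a calibre recipe's `feeds` list.
def feeds_entries(lines)
  lines.map do |line|
    url, title = line.strip.split(' => ', 2)
    # Skip lines that do not have the expected "url => title" shape.
    next if url.nil? || title.nil? || title.empty?
    # Escape backslashes and single quotes for a Python string literal.
    safe = title.gsub('\\') { '\\\\' }.gsub("'") { "\\'" }
    "    ('#{safe}', '#{url}'),"
  end.compact
end

# Wrap the entries in a Python list assignment.
def feeds_block(lines)
  (['feeds = ['] + feeds_entries(lines) + [']']).join("\n")
end

# Placeholder input: the ID and section name are illustrative only.
sample = ["http://www.todayszaman.com/0.rss => Example Section\n"]
puts feeds_block(sample)
```

Saving the script's output to a file and passing File.readlines of that file to feeds_block should give a block that can be pasted into the recipe and then pruned by hand.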
Attached Files
File Type: gz todays_zaman.py.gz (1.4 KB, 36 views)