I have been working on this for a few hours. I have my table, and I am very happy with it. Somehow it fits on the page.
I gave the recipe a fake feed to parse and just had get_article_url return the address that I wanted. I am not sure why it works, but it does.
Now I want to remove the "feeds" menu that calibre creates (page 2 in any other recipe) and the section menu (page 3 in any other recipe). Is there a way to do that?
Code:
from calibre.ptempfile import PersistentTemporaryFile
from calibre.web.feeds.news import BasicNewsRecipe
import mechanize  # only needed by the commented-out attempt below


class AdvancedUserRecipe1282101454(BasicNewsRecipe):
    title = u'TA stock table'
    oldest_article = 1
    baseURL = 'http://www.tase.co.il/TASE/MarketData/Indices/MarketCap/IndexMainDataMarket.htm?Action=5&IndexID=168'
    __author__ = 'marbs'
    max_articles_per_feed = 1
    #no_stylesheets = True
    #extra_css = ' body{font-family: Arial,Helvetica,sans-serif } '
    cover_url = 'http://money-talks.co.il/wp-content/uploads/2008/02/glasses_on_newspaper.jpg'
    # Fake feed: its articles are ignored, get_article_url always returns the table page
    feeds = [(u'maya', u'http://maya.tase.co.il/bursa/rss/maya.xml')]
    temp_files = []
    articles_are_obfuscated = True
    keep_only_tags = [dict(name='table', attrs={'id': 'NiaROGrid1_DataGrid1'})]
    #style':['float: right;', 'float: left;'

    def get_obfuscated_article(self, url):
        br = self.get_browser()
        # Open the index page first, then fetch the pop-up grid that holds the table
        br.open('http://www.tase.co.il/TASE/MarketData/Indices/MarketCap/IndexMainDataMarket.htm?Action=5&IndexID=168')
        response = br.open('http://www.tase.co.il/TASE/Management/GeneralPages/PopUpGrid.htm?tbl=0&Columns=he-IL_AddColColumns&Titles=he-IL_AddColTitles&ds=he-IL_ds&enumTblType=SharesByIndex&sess=he-IL_&gridName=%D7%A0%D7%AA%D7%95%D7%A0%D7%99+%D7%9E%D7%A1%D7%97%D7%A8+-+%D7%9E%D7%A0%D7%99%D7%95%D7%AA+%D7%AA%22%D7%90+%D7%9B%D7%9C%D7%9C%D7%99')
        html = response.read()
        # Save the page to a temporary file and hand its path back to calibre
        self.temp_files.append(PersistentTemporaryFile('_fa.html'))
        self.temp_files[-1].write(html)
        self.temp_files[-1].close()
        return self.temp_files[-1].name

    # Earlier attempt, kept for reference:
    # def get_obfuscated_article(self, url):
    #     br = BasicNewsRecipe.get_browser()
    #     br.open('http://www.tase.co.il/TASE/MarketData/Indices/MarketCap/IndexMainDataMarket.htm?Action=5&IndexID=168')
    #     br.open('http://www.tase.co.il/TASE/Management/GeneralPages/PopUpGrid.htm?tbl=0&Columns=he-IL_AddColColumns&Titles=he-IL_AddColTitles&ds=he-IL_ds&enumTblType=SharesByIndex&sess=he-IL_&gridName=%D7%A0%D7%AA%D7%95%D7%A0%D7%99+%D7%9E%D7%A1%D7%97%D7%A8+-+%D7%9E%D7%A0%D7%99%D7%95%D7%AA+%D7%AA%22%D7%90+%D7%9B%D7%9C%D7%9C%D7%99')
    #     print_url = 'http://tase.co.il/TASEEng/Management/GeneralPages/PopUpGrid.htm?tbl=0&Columns=en-US_AddColColumns&Titles=en-US_AddColTitles&ds=en-US_ds&enumTblType=SharesByIndex&sess=en-US_&gridName=Market+Data+-+Shares+General'
    #     response = br.follow_link(mechanize.Link(base_url = '', url = print_url, text = '', tag = '', attrs = []))
    #
    #     html = response.read()
    #
    #     self.temp_files.append(PersistentTemporaryFile('_fa.html'))
    #     self.temp_files[-1].write(html)
    #     self.temp_files[-1].close()
    #
    #     return br#self.temp_files[-1].name

    def get_article_url(self, article):
        # Every feed entry is mapped to the same table page
        return 'http://www.tase.co.il/TASE/Management/GeneralPages/PopUpGrid.htm?tbl=0&Columns=he-IL_AddColColumns&Titles=he-IL_AddColTitles&ds=he-IL_ds&enumTblType=SharesByIndex&sess=he-IL_&gridName=%D7%A0%D7%AA%D7%95%D7%A0%D7%99+%D7%9E%D7%A1%D7%97%D7%A8+-+%D7%9E%D7%A0%D7%99%D7%95%D7%AA+%D7%AA%22%D7%90+%D7%9B%D7%9C%D7%9C%D7%99'
So I got a little greedy. Is there an easy way to break the table in half?
I can think of 3 things that might work (I just don't know how to do them).
The 1st is to remove some less relevant columns.
The 2nd is to cut every row in half and have:
1st row right half
1st row left half
2nd row right half
and so on.
The 3rd is to cut the whole table in half and add the rightmost column to the 2nd half too:
1st row right half
2nd row right half
.
.
.
top right cell + 1st row left half
2nd from the top right cell + 2nd row left half
.
.
.
possible?
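To make the three layouts concrete, here is a rough sketch on plain Python lists. It is only an illustration, not recipe code: it assumes the table has already been read into a list of rows, that index 0 of each row is the rightmost column on the Hebrew page (the one with the share name), and that the cut point is picked by hand. The function names and the sample rows are made up for the example. In a recipe, something like this would presumably have to be applied to the table rows in the parsed HTML (for example from preprocess_html), which is not shown here.

```python
def drop_columns(rows, keep):
    """Option 1: keep only the columns whose indexes are listed in `keep`."""
    return [[row[i] for i in keep] for row in rows]


def split_rows_interleaved(rows, cut):
    """Option 2: every row becomes two consecutive rows, cut at index `cut`."""
    out = []
    for row in rows:
        out.append(row[:cut])   # first half of the row (right half on the page)
        out.append(row[cut:])   # second half of the same row
    return out


def split_table_with_key(rows, cut, key=0):
    """Option 3: two tables; the key column is repeated in the second one."""
    first = [row[:cut] for row in rows]
    second = [[row[key]] + row[cut:] for row in rows]
    return first, second


# Made-up sample data: share name followed by four numeric columns
rows = [
    ['AAA', 10, 11, 12, 13],
    ['BBB', 20, 21, 22, 23],
]

narrow = drop_columns(rows, [0, 2])
interleaved = split_rows_interleaved(rows, 3)
first_half, second_half = split_table_with_key(rows, 3)
```

With the sample rows, the third option gives a first table holding the name and the first two numeric columns, and a second table holding the name again plus the remaining two columns, which matches the layout drawn above.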