Thread: Question
05-12-2011, 09:47 AM   #1
gambarini
Question

I want to create a new recipe with a parse_index method for
this page:
http://rassegnastampa.mef.gov.it/mef...e/Default.aspx

When I do this:

print self.index_to_soup(url)

I don't get the entire page, only a small part of it,
something like this:

Quote:
[html]
<table class="ResultsTable" summary="La tabella contiene gli articoli pubblicati nella rassegna stampa di giovedì 12 maggio 2011.">
<caption>
Articoli della rassegna
</caption><thead>
<tr>
<th class="DateCellShort" scope="col" id="data">Data</th>
<th class="TopicCellShort" scope="col" id="sezione">Sezione</th>
<th class="PublicationCellShort" scope="col" id="testata">Testata</th>
<th class="TitleCellShort" scope="col" id="titolo">Titolo</th>
<th class="AuthorCellShort" scope="col" id="autore">Autore</th>
<th class="OcrLinkCellShort" scope="col" id="ocr">OCR</th>
</tr>
</thead>
<tr>
<td class="DateCellShort" headers="data">12/05/2011</td>
<td class="TopicCellShort" headers="sezione">MINISTRO</td>
<td class="PublicationCellShort" headers="testata">Corriere della Sera</td>
<td class="TitleCellShort" headers="titolo"></td>
</tr>
</table>
[/html]

Why is the rest of the page missing?
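
For reference, this is roughly what parse_index would need to do with a table like the one quoted above once the full page comes back. It is only a sketch: it uses Python's standard html.parser instead of the soup object returned by index_to_soup, and the class name ResultsTableParser and the shortened sample string are made up for illustration.

```python
from html.parser import HTMLParser


class ResultsTableParser(HTMLParser):
    """Collect the text of each <td> cell, grouped by table row (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished rows, each a list of cell strings
        self._row = None      # cells of the row currently being read
        self._cell = None     # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:     # header rows hold only <th> cells, so they stay empty
                self.rows.append(self._row)
            self._row = None


# Sample markup shortened from the quoted fragment (one data row only).
sample = """
<table class="ResultsTable">
<thead><tr><th id="data">Data</th><th id="testata">Testata</th></tr></thead>
<tr>
<td class="DateCellShort" headers="data">12/05/2011</td>
<td class="TopicCellShort" headers="sezione">MINISTRO</td>
<td class="PublicationCellShort" headers="testata">Corriere della Sera</td>
<td class="TitleCellShort" headers="titolo"></td>
</tr>
</table>
"""

parser = ResultsTableParser()
parser.feed(sample)
rows = parser.rows
```

With the fragment above, rows holds a single entry whose title cell is empty, which matches the truncated output: the date, section and publication are there, but the title, author and OCR link are not.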