10-16-2011, 02:30 PM   #1
julio:map
Member
Posts: 23
Karma: 12
Join Date: Jul 2011
Device: Cool-er
"El Pais" recipe correction

The International section of the "El Pais" recipe is not working well: the articles come out with no content, although the hyperlinks still work and take you (if you are online) to the correct news items.

I think this is because the article text is in a div with class "cuerpo_noticia", which is not listed in keep_only_tags.
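One way to check this kind of suspicion is to list the div classes on a saved article page and see which ones the recipe does not keep. The following is only a minimal sketch using the standard library: the SAMPLE markup is made up for illustration (it is not copied from the site), the helper names are hypothetical, and comparing the whole class string is an assumption about how the match works, not a description of calibre's internals.

```python
from html.parser import HTMLParser

# Class values copied from the recipe's current keep_only_tags.
KEPT_CLASSES = [
    'cabecera_noticia_reportaje estirar',
    'cabecera_noticia_opinion estirar',
    'cabecera_noticia estirar',
    'contenido_noticia',
    'caja_despiece',
]


class DivClassCollector(HTMLParser):
    """Collect the class attribute of every <div> in a page (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.div_classes = []

    def handle_starttag(self, tag, attrs):
        if tag == 'div':
            cls = dict(attrs).get('class')
            if cls:
                self.div_classes.append(cls)


def unlisted_div_classes(html, kept=KEPT_CLASSES):
    """Return the div classes found in html that are not in the kept list.

    Assumption: the whole class string is compared, as it is written in the recipe.
    """
    collector = DivClassCollector()
    collector.feed(html)
    return [c for c in collector.div_classes if c not in kept]


# Made-up markup for illustration only.
SAMPLE = (
    '<div class="cabecera_noticia estirar"><h1>Headline</h1></div>'
    '<div class="cuerpo_noticia"><p>Article text</p></div>'
)

print(unlisted_div_classes(SAMPLE))
```

With the real page source in place of SAMPLE, any class that holds the article text and shows up in the result is a candidate for adding to keep_only_tags.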

I tried adding it, and at least today it worked.

Just replacing:

keep_only_tags = [ dict(name='div', attrs={'class':['cabecera_noticia_reportaje estirar','cabecera_noticia_opinion estirar','cabecera_noticia estirar','contenido_noticia','caja_despiece']})]

with:

keep_only_tags = [ dict(name='div', attrs={'class':['cabecera_noticia_reportaje estirar','cabecera_noticia_opinion estirar','cabecera_noticia estirar','contenido_noticia','cuerpo_noticia','caja_despiece']})]
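Since the two lines are long and nearly identical, a quick comparison can confirm that the only change is the added class. This is just a sketch: OLD_CLASSES, NEW_CLASSES, added and removed are names chosen here for illustration, and the two lists are copied from the lines above.

```python
# Class lists copied from the original and the corrected keep_only_tags.
OLD_CLASSES = [
    'cabecera_noticia_reportaje estirar',
    'cabecera_noticia_opinion estirar',
    'cabecera_noticia estirar',
    'contenido_noticia',
    'caja_despiece',
]
NEW_CLASSES = [
    'cabecera_noticia_reportaje estirar',
    'cabecera_noticia_opinion estirar',
    'cabecera_noticia estirar',
    'contenido_noticia',
    'cuerpo_noticia',
    'caja_despiece',
]

# Entries present in one list but not the other.
added = [c for c in NEW_CLASSES if c not in OLD_CLASSES]
removed = [c for c in OLD_CLASSES if c not in NEW_CLASSES]

# The same structure the recipe uses, built from the corrected list.
keep_only_tags = [dict(name='div', attrs={'class': NEW_CLASSES})]

print('added:', added)
print('removed:', removed)
```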

Hope this helps somebody.