"El Pais" recipe correction
The International section of "El Pais" is not working well: the news items show no content, although the hyperlinks do work and take you (if you are online) to the correct articles.
I think this is because the articles are in a div with class "cuerpo_noticia", which is not listed in keep_only_tags.
I tried adding it, and at least today it worked.
Just replacing:
keep_only_tags = [ dict(name='div', attrs={'class':['cabecera_noticia_reportaje estirar','cabecera_noticia_opinion estirar','cabecera_noticia estirar','contenido_noticia','caja_despiece']})]
with:
keep_only_tags = [ dict(name='div', attrs={'class':['cabecera_noticia_reportaje estirar','cabecera_noticia_opinion estirar','cabecera_noticia estirar','contenido_noticia','cuerpo_noticia','caja_despiece']})]
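In case the long one-liner is hard to compare by eye, here is a rough sketch of the same change as plain Python that runs outside calibre. The helper with_extra_class and the names OLD_CLASSES and NEW_CLASSES are made up for illustration and are not part of the recipe API; only keep_only_tags and the class names come from the recipe itself.

```python
# Class names copied from the original recipe's keep_only_tags.
OLD_CLASSES = [
    'cabecera_noticia_reportaje estirar',
    'cabecera_noticia_opinion estirar',
    'cabecera_noticia estirar',
    'contenido_noticia',
    'caja_despiece',
]


def with_extra_class(classes, new_class, before):
    """Return a copy of classes with new_class inserted just before `before`.

    Hypothetical helper, used here only to make the one-entry change explicit.
    """
    result = list(classes)
    if new_class not in result:
        result.insert(result.index(before), new_class)
    return result


# The only difference from the old list is the added 'cuerpo_noticia' entry.
NEW_CLASSES = with_extra_class(OLD_CLASSES, 'cuerpo_noticia', 'caja_despiece')

# Same shape as the recipe attribute: a list holding one tag-matching dict.
keep_only_tags = [dict(name='div', attrs={'class': NEW_CLASSES})]
```

In the recipe itself I would still paste the literal list as shown above; the helper is only there to show that a single class name is being added.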
Hope this helps somebody.