01-26-2008, 04:44 PM   #28
Lemoine
Junior Member
Posts: 7
Karma: 10
Join Date: Jan 2008
Device: Prs 505
ISO-8859-1 feed howto

Hello,
Using this wonderful program (thanks a lot, Kovid!), I have tried to add support for "Le Monde", a French newspaper. It was working pretty well, but yesterday they changed both their page structure and their encoding, switching from UTF-8 to ISO-8859-1.
Now my new profile captures the articles, but the characters come out with the wrong encoding.

If I add to the regex, for instance,

<head><meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1"></head>

my characters are correct, but none of the junk gets stripped from the articles.
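
To illustrate the symptom, here is a minimal Python sketch of what I think is going on. This is not the program's own code: `sniff_charset` and `decode_page` are hypothetical names of mine, and the sample page is made up. It only shows that ISO-8859-1 bytes read as UTF-8 give broken characters, while decoding with the charset declared in the meta tag gives the right ones.

```python
import re

# Hypothetical helper (not part of the program): find a charset declaration
# such as the meta tag above in the first bytes of a page.
META_CHARSET = re.compile(rb'charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE)


def sniff_charset(raw: bytes, default: str = "utf-8") -> str:
    """Return the declared charset, or `default` when none is declared."""
    match = META_CHARSET.search(raw[:2048])
    return match.group(1).decode("ascii") if match else default


def decode_page(raw: bytes) -> str:
    """Decode raw page bytes using the charset the page itself declares."""
    return raw.decode(sniff_charset(raw), errors="replace")


# Made-up sample: an ISO-8859-1 page whose body contains two accented letters.
sample = (
    b'<head><meta HTTP-EQUIV="Content-Type" '
    b'CONTENT="text/html; charset=ISO-8859-1"></head>'
    b"<body>Le Monde : \xe9t\xe9</body>"
)

wrong = sample.decode("utf-8", errors="replace")  # assumes UTF-8: accents are lost
right = decode_page(sample)                       # honours the declared charset

print(repr(wrong))
print(repr(right))
```

If the declared charset is a name Python does not know, `bytes.decode` raises `LookupError`, so a real profile would need a fallback there.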

Here is my profile (attached).

I would be very grateful for your help...
Attached Files
File Type: zip Monde.zip (1.6 KB, 501 views)