Old 10-06-2006, 01:12 PM   #7
DTM
Quote:
Originally Posted by PippoPippini
I tried rewriting the links in Bloomberg's feed.

The link filter is http://www\.bloomberg\.com(.*), while the rewrite rule I wrote is http://www.bloomberg.com$1#

But it doesn't work. What's wrong?

Got it!

Your rule doesn't work because the printable page is not just the original URL with a # appended. That's what you see when you hover over the link, but when you actually click it, it runs a JavaScript routine that constructs the real URL.

This is a good example to show how to crack the more advanced problems. To find the real URL for the printable version, click the link for the printer version. When the printer-friendly window opens, right-click in it and select Properties in Internet Explorer or View Page Info in Firefox. That will give you the actual URL for the page.

The example I used was the Economy RSS feed. One of the article pages was:

http://www.bloomberg.com/apps/news?pid=20601068&sid=avAOrwcRZaAU&refer=economy

The printable version was:

http://www.bloomberg.com/apps/news?pid=20670001&refer=economy&sid=avAOrwcRZaAU

Comparing the two, we see that there are two changes: the pid number is different and the "sid" section has swapped places with the "refer" section.
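If you'd rather not compare long URLs by eye, a few lines of Python can do it. This is only a rough sketch and has nothing to do with Sunrise itself; the two URLs are the ones from this post (written with news?pid= as in the filter), and the helper name query_pairs is just something I made up for the illustration.

```python
from urllib.parse import urlsplit, parse_qsl

article = "http://www.bloomberg.com/apps/news?pid=20601068&sid=avAOrwcRZaAU&refer=economy"
printable = "http://www.bloomberg.com/apps/news?pid=20670001&refer=economy&sid=avAOrwcRZaAU"


def query_pairs(url):
    """Return the query string as an ordered list of (name, value) pairs."""
    return parse_qsl(urlsplit(url).query)


article_pairs = query_pairs(article)
printable_pairs = query_pairs(printable)
article_values = dict(article_pairs)

# Parameters whose value differs: name -> (article value, printable value).
changed = {
    name: (article_values.get(name), value)
    for name, value in printable_pairs
    if article_values.get(name) != value
}

# Order in which the parameter names appear in each URL.
order_article = [name for name, _ in article_pairs]
order_printable = [name for name, _ in printable_pairs]

print(changed)
print(order_article, order_printable)
```

The first print shows that only pid changed value; the second shows that sid and refer traded places.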

We are fortunate because it turns out that all of the printable pages for all of the feeds have a pid of 20670001. Our rewrite rule, then, must change the pid number and move the sid segment to the end.

The filter looks like this:

http://www\.bloomberg\.com/apps/news\?pid=.*&sid=(.*)&refer=economy

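Notice that the first .* is not in parentheses: we don't need to save the old pid, because the rewrite replaces it with a fixed number. The second .* is in parentheses because we do need to capture the sid value. Notice also that periods and question marks must be escaped with a backslash, since both are special characters in a regular expression.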
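To see why the backslashes matter, here is a small Python check. I'm assuming Sunrise's regular expressions treat ? and . the same way Python's re module does, which is the usual behavior but not something I have verified against Sunrise itself. The second host name is deliberately bogus.

```python
import re

url = "http://www.bloomberg.com/apps/news?pid=20601068&sid=avAOrwcRZaAU&refer=economy"

# Unescaped: "s?" means "an optional s", so the literal "?" in the URL
# is never matched and the search fails.
unescaped = re.search(r"news?pid=", url)

# Escaped: "\?" matches a literal question mark.
escaped = re.search(r"news\?pid=", url)

# An unescaped "." matches any character, so a wrong host slips through.
bogus_host = "http://wwwXbloomberg-com/"
loose = re.search(r"www.bloomberg.com", bogus_host)
strict = re.search(r"www\.bloomberg\.com", bogus_host)

print(unescaped is None, escaped is not None)
print(loose is not None, strict is None)
```

So a missing backslash on the question mark breaks the filter outright, while a missing backslash on a period just makes it sloppier than you intended.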

The rewritten expression looks like this:

http://www.bloomberg.com/apps/news?pid=20670001&refer=economy&sid=$1

It is identical up to the pid, where we substitute the new number 20670001, then we follow that with the "refer" segment and end with the "sid" segment. The $1 inserts the sid code that we captured from the original link.
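You can try the filter and rewrite outside of Sunrise with a short Python sketch. Two assumptions: Python's re module stands in for whatever regex engine Sunrise uses, and Python writes the back-reference as \1 where the rule above writes $1. The function name to_printable is my own.

```python
import re

# Filter from this post: skip the old pid, capture the sid.
FILTER = r"http://www\.bloomberg\.com/apps/news\?pid=.*&sid=(.*)&refer=economy"

# Rewrite from this post, with $1 spelled as \1 for Python.
REWRITE = r"http://www.bloomberg.com/apps/news?pid=20670001&refer=economy&sid=\1"


def to_printable(url):
    """Apply the rewrite; URLs that don't match the filter come back unchanged."""
    return re.sub(FILTER, REWRITE, url)


article = "http://www.bloomberg.com/apps/news?pid=20601068&sid=avAOrwcRZaAU&refer=economy"
result = to_printable(article)
print(result)
```

This prints the printable URL shown earlier. A link whose refer part is not "economy" does not match this filter and is left alone.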

I checked only a couple of the feeds, but this appears to work for all of them. The only variation from one feed to another is that the "refer" part changes from "economy" to "politics", etc.

Enjoy!

----------------Edit------------------

It occurred to me that you may want to use a variation on the above, as follows:

Filter:
http://www\.bloomberg\.com/apps/news\?pid=.*&sid=(.*)&refer=(.*)

Rewrite:
http://www.bloomberg.com/apps/news?pid=20670001&refer=$2&sid=$1

This allows you to use exactly the same rule for all of the feeds, making it easier to copy and modify the Sunrise documents for each feed.
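Here is a hedged Python sketch for testing the general rule against more than one feed. It converts the $1/$2 style used above into Python's \g<1>/\g<2> style so the rewrite can be pasted in exactly as written. The helper names and the "politics" URL (including its pid and sid values) are made up for the test; only the economy URL comes from a real feed.

```python
import re

FILTER = r"http://www\.bloomberg\.com/apps/news\?pid=.*&sid=(.*)&refer=(.*)"
REWRITE = "http://www.bloomberg.com/apps/news?pid=20670001&refer=$2&sid=$1"


def dollar_to_python(rewrite):
    """Turn $1, $2, ... into Python's \\g<1>, \\g<2>, ... replacement syntax."""
    return re.sub(r"\$(\d+)", r"\\g<\1>", rewrite)


def apply_rule(url, link_filter=FILTER, rewrite=REWRITE):
    """Rewrite url if it matches link_filter; otherwise return it unchanged."""
    return re.sub(link_filter, dollar_to_python(rewrite), url)


economy = "http://www.bloomberg.com/apps/news?pid=20601068&sid=avAOrwcRZaAU&refer=economy"
# Hypothetical link for another feed: pid and sid are placeholders.
politics = "http://www.bloomberg.com/apps/news?pid=00000000&sid=EXAMPLESID&refer=politics"

economy_out = apply_rule(economy)
politics_out = apply_rule(politics)
print(economy_out)
print(politics_out)
```

Both links come out with pid=20670001, the refer value carried over, and the sid moved to the end, which is what the single rule is supposed to do for every feed.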

Last edited by DTM; 10-08-2006 at 12:24 AM.