This sounds like it's begging to be scripted. Take a look at how your library's web site handles authentication. It's probably a form POST that returns a session cookie. Using curl, you can log in and fetch the pages. grep, sed, and awk are your friends: you can use them to do stream editing on the fetched HTML. For a little more work, you can do the whole thing in Perl. You can also finish off by converting the HTML to LRF.
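Here's a rough sketch of the login-then-fetch idea, written in Python's standard library rather than curl or Perl just to keep it in one file. Everything site-specific is a guess: the URLs, the form field names `username` and `password`, and the idea that stripping `<script>` blocks is the cleanup you want are all placeholders. Look at your library's actual login form before trusting any of it.

```python
import http.cookiejar
import re
import urllib.parse
import urllib.request

# Hypothetical URLs: replace with whatever your library's site really uses.
LOGIN_URL = "https://library.example.org/login"
PAGE_URL = "https://library.example.org/book/page1"


def make_session():
    """Return an opener that keeps cookies between requests (like curl's cookie jar)."""
    jar = http.cookiejar.CookieJar()
    return urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))


def login(opener, username, password):
    """POST the login form; any session cookie ends up in the opener's jar."""
    # The field names here are assumptions; read the form's HTML for the real ones.
    form = urllib.parse.urlencode({"username": username, "password": password})
    with opener.open(LOGIN_URL, data=form.encode("utf-8")) as resp:
        return resp.status


def fetch(opener, url):
    """GET a page using the same cookie-carrying opener and return it as text."""
    with opener.open(url) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def strip_scripts(html):
    """sed-style cleanup: drop <script>...</script> blocks from the page."""
    return re.sub(r"<script\b.*?</script>", "", html,
                  flags=re.DOTALL | re.IGNORECASE)


def main():
    session = make_session()
    login(session, "my-card-number", "my-pin")  # placeholder credentials
    page = strip_scripts(fetch(session, PAGE_URL))
    with open("page1.html", "w", encoding="utf-8") as out:
        out.write(page)
```

Once the placeholders match your site, calling `main()` should leave you with a cleaned `page1.html` that you can feed to whatever HTML-to-LRF converter you use. If the site needs a hidden token in the login form, you'd have to fetch the form first and pull that out too.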
- Ed