#76
Enthusiast
Posts: 35
Karma: 12
Join Date: Oct 2006
Device: Amazon Kindle, Sony Reader
I've had some good luck lately finding full-text feeds. The attached XML file has about 50 full feeds, including about 20 that I haven't yet published on the rss2book server.
(I'm using version 13 of rss2book, and the "publish" feature isn't working for me right now.) Anyway, here are the feeds...
Last edited by neilm2; 12-17-2006 at 11:56 AM.

#77
Addict
Posts: 364
Karma: 1035291
Join Date: Jul 2006
Location: Redmond, WA
Device: iPad Mini, Kindle Paperwhite
rss2book publish/subscribe server down due to Pacific Northwest storm
Just want to apologize for publish/subscribe not working. I was in an area badly hit by the windstorm last Thursday; my house is still without electricity and may be for several more days. We had an outage last Wednesday too, so I'm rapidly approaching one week of server downtime. My UPS can only handle about an hour, unfortunately!
I'll post a message once I have power again and you can publish/subscribe away again.
I used to joke when I moved to the USA from South Africa five years ago that I had moved from the first world to the third, but it feels more and more like that. In Cape Town, winds like the ones Seattle had last week are pretty normal, and the power never goes out. Back in the bad old days of apartheid we'd have rare outages if the ANC blew up a generator somewhere, but the power would usually be back up in an hour or two. You'd think that Americans would learn from experience and clear large trees away from power lines (these outages happen to us several times a year, although not this bad), but I guess this is yet another of a number of areas where they don't. Or maybe it's because they privatize the power system but have area-based monopolies, so there is little incentive to invest in fixing things instead of just repeatedly patching. Whatever the case, it makes me yearn to move back to civilization.

#78
Addict
Posts: 364
Karma: 1035291
Join Date: Jul 2006
Location: Redmond, WA
Device: iPad Mini, Kindle Paperwhite
Release 19 is out
Release 19 is done. It uses the iTextSharp library (which is why the zip file is now way bigger), which allows it to generate PDF files without requiring htmldoc. The HTML-to-PDF conversion is fairly basic, so you can still elect to use htmldoc if you want more sophisticated conversion (including a TOC).
Release 19 can also generate RTF files with images. I was hoping to have Gutenberg integration by now but haven't done so; nonetheless, I published a sample 'feed', which is an example of how you can use rss2book to format Project Gutenberg books for your e-book device.

#79
Enthusiast
Posts: 35
Karma: 12
Join Date: Oct 2006
Device: Amazon Kindle, Sony Reader
Thanks, Geekraver. Quick question: What is the advantage of having images with RTF if they don't display on the Sony Reader? Am I missing a trick on how to take advantage of that new feature?

#80
Addict
Posts: 364
Karma: 1035291
Join Date: Jul 2006
Location: Redmond, WA
Device: iPad Mini, Kindle Paperwhite
Quote:

#81
Member
Posts: 17
Karma: 10
Join Date: Dec 2006
Device: prs500
Great program, but I can't get it to combine PDFs; it only works if I uncheck the combine box. Any ideas?

#82
Addict
Posts: 364
Karma: 1035291
Join Date: Jul 2006
Location: Redmond, WA
Device: iPad Mini, Kindle Paperwhite
Quote:

#83
Member
Posts: 20
Karma: 10
Join Date: Jan 2007
Device: Sony PRS-500
@geekraver--
First, thank you so very much for working on this application. The idea of being able to read the NY Times, BBC World News, Newsweek, etc. on my reader makes me as happy as a clam.
I am certain that I am doing something wrong and am hoping that you can point me in the right direction. When I grab an XML feed, I only get the first level of content, i.e., any embedded links are not available. Is there a way to have this application scrape 2 or 3 levels deep?
One other issue has to do with cookies: one of the feeds that I would like (Newsweek.com) returns a document with the following text:
Cookies not enabled
Cookies Required
I would appreciate any advice, and thank you once again for working on this application.

#84
Addict
Posts: 364
Karma: 1035291
Join Date: Jul 2006
Location: Redmond, WA
Device: iPad Mini, Kindle Paperwhite
I'll look into the cookie issue. Can you give me an example of a site with which you have the first problem (levels)?

#85
Member
Posts: 20
Karma: 10
Join Date: Jan 2007
Device: Sony PRS-500
@GeekRaver--
Thank you for replying. As for the first issue, try: http://www.nytimes.com/services/xml/...t/HomePage.xml
Thanks again,
PS--I have written some code that lists the links to all of the .xml files on a given page. Here is the code in case anyone should find it helpful:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
    <html>
    <head>
    <title>GetFeeds</title>
    </head>
    <body>
    <%
    ' Fetch a page and return its HTML, or "Error" if the request fails.
    Function GetHTML(strURL)
        Dim objXMLHTTP, strReturn
        On Error Resume Next   ' needed so the Err check below can run
        Set objXMLHTTP = Server.CreateObject("MSXML2.ServerXMLHTTP")
        objXMLHTTP.Open "GET", strURL, False
        objXMLHTTP.Send
        If Err.Number <> 0 Then
            strReturn = "Error"
        Else
            strReturn = objXMLHTTP.responseText
        End If
        On Error GoTo 0
        Set objXMLHTTP = Nothing
        GetHTML = strReturn
    End Function

    ' Turn a captured href value into an absolute URL.
    Function CleanURL(strURLText, strURL)
        Dim strStringTemp
        strStringTemp = Trim(strURLText)
        If InStr(1, strStringTemp, "http:", 1) < 1 Then
            strStringTemp = strURL & "/" & strStringTemp
        End If
        ' Collapse doubled slashes after the "http://" prefix.
        strStringTemp = Left(strStringTemp, 8) & Replace(Right(strStringTemp, Len(strStringTemp) - 8), "//", "/")
        CleanURL = strStringTemp
    End Function

    ' Write out every link on the page whose URL contains ".xml".
    Sub findLinks(strPageToParse)
        Dim objRegExp, colMatches, itmMatch, strLink, intCounter
        Set objRegExp = New RegExp
        objRegExp.IgnoreCase = True
        objRegExp.Global = True
        ' Match anchor tags and capture the whole href value (greedy capture).
        objRegExp.Pattern = "<a[^>]*?HREF\s*=\s*[""']?([^'"" >]+)[ '""]?[^>]*?>"
        Set colMatches = objRegExp.Execute(strPageToParse)
        intCounter = 0
        For Each itmMatch In colMatches
            strLink = itmMatch.SubMatches(0)
            If InStr(1, strLink, ".xml", 1) > 0 Then
                Response.Write(CleanURL(strLink, strURL) & "<br />")
                intCounter = intCounter + 1
                If intCounter > 999 Then Exit For   ' safety cap on output
            End If
        Next
        Set objRegExp = Nothing
    End Sub

    Dim strURL, strPageToParse
    strURL = "http://www.nytimes.com/services/xml/rss/index.html"
    strPageToParse = GetHTML(strURL)
    Call findLinks(strPageToParse)
    %>
    </body>
    </html>

FtB
Last edited by fritz_the_blank; 01-17-2007 at 02:22 AM.
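
For readers without a classic ASP host, roughly the same link harvesting can be sketched in Python. This is only a sketch and not fritz_the_blank's code: the names `FeedLinkParser` and `find_feed_links` are made up for illustration, it parses the HTML with the standard library instead of a regular expression, and it resolves relative links with `urljoin` rather than the string handling in the script above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FeedLinkParser(HTMLParser):
    """Collect the href of every <a> tag whose target contains ".xml"."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # URL of the page being parsed
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and ".xml" in href.lower():
            # urljoin turns relative links into absolute ones.
            self.links.append(urljoin(self.base_url, href))


def find_feed_links(html, base_url):
    """Return absolute URLs of all .xml links found in an HTML string."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

The page text would still have to be downloaded first (for example with `urllib.request.urlopen`); the function only handles the parsing step, which keeps it easy to try on a saved copy of a page.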

#86
Junior Member
Posts: 3
Karma: 16
Join Date: Dec 2006
Device: Sony Reader
I subscribe to the online version of Economist.com, which uses cookies to determine whether I am a subscriber (i.e., logged in) or not. If rss2book could fetch the online version (i.e., support cookies), it would save me some bucks and the planet some trees.
Have you noticed your reading habits changing with the Sony Reader + rss2book? I, for one, am reading much more online material (versus paper) thanks to this combo. Thanks, Geekraver!

#87
Enthusiast
Posts: 38
Karma: 36
Join Date: Dec 2006
Device: Sony Reader PRS-500
Geekraver, do you plan on releasing the source for rss2book at all?

#88
Addict
Posts: 364
Karma: 1035291
Join Date: Jul 2006
Location: Redmond, WA
Device: iPad Mini, Kindle Paperwhite
I originally did include source, but later decided not to, mostly because I'm now taking donations (all $25 so far) and I figured that if I released the source, someone would probably rip off what I've done and try to make money from it, and I don't feel like dealing with that. There may come a time when I don't care anymore and will again release the source, but I have invested a lot of time in the code adding features that I didn't need to (like the WebDAV publish/subscribe), and it would be nice to see some return on that investment.
My plans for now (once I finish up with a separate project I'm working on, which is why I'm not that active right now) are to try to generalize the code to the point where it can be extended via plug-ins. I'm already quite close to that on the back end (i.e. turning the HTML into PDF, RTF, etc.), and want to do some more on the front end to extend the UI and the range of sources.

#89
Addict
Posts: 364
Karma: 1035291
Join Date: Jul 2006
Location: Redmond, WA
Device: iPad Mini, Kindle Paperwhite
The New York Times is also constrained by cookies. Setting up the actual rss2book entry to pull down full articles is easy, but without login and cookie support you get a registration page. So I'll have to add the ability to do an HTTP POST to such sites with appropriate login info first, so as to get the cookies. I'll work on it in the next week or so.
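
The login-then-fetch flow described here could look roughly like the following Python sketch. It is an illustration only, not rss2book's code: the function names, the form field names and the URLs in the usage note are placeholders, and a real site may require extra form fields or headers.

```python
import http.cookiejar
import urllib.parse
import urllib.request


def build_session():
    """Return (opener, jar): an opener that stores and resends cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar)
    )
    return opener, jar


def make_login_request(login_url, fields):
    """Build an HTTP POST request carrying form-encoded login fields."""
    body = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(login_url, data=body, method="POST")


def fetch_with_login(login_url, fields, article_url):
    """POST the login form once, then fetch the article with the cookies set."""
    opener, _jar = build_session()
    # The response body of the login is not needed, only its cookies.
    opener.open(make_login_request(login_url, fields)).close()
    with opener.open(article_url) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

For example, `fetch_with_login("http://example.com/login", {"user": "me", "password": "secret"}, "http://example.com/article")` would post the form once and then request the article with whatever cookies the site set in response.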

#90
Enthusiast
Posts: 38
Karma: 36
Join Date: Dec 2006
Device: Sony Reader PRS-500
Quote:
That's a shame, as I'm looking at an RSS import for BBeBinder and your HTML output converts quite nicely to LRF. What I was thinking of was to wrap your output engine as a library and use that. If you do change your mind at any point, then let me know. (Note: BBeBinder is totally open source.)
Last edited by AndyQ; 01-21-2007 at 05:29 AM.
Thread | Thread Starter | Forum | Replies | Last Post |
rss2book release 20 now available | geekraver | Sony Reader | 4 | 01-26-2007 01:36 PM |
rss2book release 19 | geekraver | Sony Reader | 2 | 12-30-2006 10:51 AM |
rss2book release 18 | geekraver | Sony Reader | 0 | 12-22-2006 03:57 AM |
rss2book release 16 | geekraver | Sony Reader | 1 | 12-13-2006 05:56 AM |
rss2book release 13 | geekraver | Sony Reader | 0 | 11-13-2006 02:41 AM |