Old 11-30-2010, 03:11 AM   #1
OnwardAhead
Junior Member
OnwardAhead began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Mar 2009
Location: Kenya
Device: Kindle DX
Cablegate Wikileaks

Hi all,

I was curious to see if anyone who has been following the Wikileaks brouhaha of recent days has found an RSS feed of the released documents at http://cablegate.wikileaks.org

It looks like the docs are being released in a slow trickle, and frankly I'd love to set up a Calibre recipe to capture them as they come through. However, the RSS capabilities on the official site seem to be disabled.

If anyone has had success finding an RSS feed for the docs, or has found another ebook resource for them, please let me know. I did find one site that was converting the documents to EPUB (http://www.iphoneworld.ca/news/2010/...e-epub-format/), but sadly they convert each document (currently 278) into an individual EPUB, which makes it a bit chaotic to keep organized. My preference would be to keep them in a single ebook, with each document listed as a separate article.

Cheers,
JD
Old 12-02-2010, 09:22 PM   #2
Phoul
Dances with penguins
Phoul began at the beginning.
 
Posts: 54
Karma: 10
Join Date: Oct 2010
Device: Sony PRS-350
This would be interesting to see.
Old 12-04-2010, 07:39 PM   #3
leamsi
Junior Member
leamsi began at the beginning.
 
Posts: 7
Karma: 12
Join Date: Nov 2010
Location: Mexico
Device: Kindle
I thought it was a good idea, so I made this.

It downloads the last two days' worth of released cables (change the 'DAYS' variable to adjust this), and it uses a hardcoded IP address instead of the DNS name.
EDIT: It now uses one of a few mirrors, since the IP apparently no longer works (DoSed?).

EDIT: Get the latest version at https://github.com/leamsi/calibre_re...blegate.recipe


I have now tried to make the line breaks in the cables more readable (it still gets some things wrong, but it should be better now).
EDIT: Added karunaji's ideas (thanks!) to improve the handling of line breaks, as well as a couple of other heuristics for the same thing.
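
For readers without access to the recipe itself, here is a minimal sketch of the two ideas described above: a 'DAYS' setting that limits how far back to fetch, and a fallback across several mirrors. It is not the code from the recipe. The function names (release_dates, first_working_mirror, is_up) are hypothetical, the mirror list only contains hosts mentioned in this thread, and it assumes that "last two days" means today plus yesterday. It is written for Python 3.

```python
from datetime import date, timedelta
from urllib.request import urlopen

DAYS = 2  # how many days of releases to fetch, counting back from today

# Hosts mentioned in this thread; the recipe's real mirror list may differ.
MIRRORS = ["http://wikileaks.ch", "http://leaks.gooby.org"]


def release_dates(days=DAYS, today=None):
    """Return the release dates to fetch, newest first."""
    today = today or date.today()
    return [today - timedelta(days=offset) for offset in range(days)]


def is_up(url, timeout=10):
    """Return True if the mirror answers with HTTP 200 (network call)."""
    with urlopen(url, timeout=timeout) as response:
        return response.status == 200


def first_working_mirror(mirrors, check=is_up):
    """Return the first mirror for which check(mirror) is true, else None."""
    for mirror in mirrors:
        try:
            if check(mirror):
                return mirror
        except OSError:
            # URLError and socket timeouts are OSError subclasses:
            # treat them as "mirror down" and try the next one.
            continue
    return None
```

Note that a mirror can respond and still be stale, so a check like this does not guarantee that the newest cables are present.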

Last edited by leamsi; 02-06-2011 at 12:27 PM. Reason: Removed the (outdated) copy which was pasted here. Please use the github link
Old 12-05-2010, 04:48 PM   #4
Phoul
Dances with penguins
Phoul began at the beginning.
 
Posts: 54
Karma: 10
Join Date: Oct 2010
Device: Sony PRS-350
Leamsi, thank you for the recipe! I was hoping someone would make something; I'm not so good with Python myself. It does the job quite well.
Old 12-06-2010, 12:02 AM   #5
leamsi
Junior Member
leamsi began at the beginning.
 
Posts: 7
Karma: 12
Join Date: Nov 2010
Location: Mexico
Device: Kindle
Glad you find it useful!

Even if this feels like reading gossip, I find it fascinating.

BTW, just did a quick change to add wikileaks.ch as the default host, since it doesn't seem to fail as often now.
Old 12-06-2010, 02:07 AM   #6
Phoul
Dances with penguins
Phoul began at the beginning.
 
Posts: 54
Karma: 10
Join Date: Oct 2010
Device: Sony PRS-350
Quote:
Originally Posted by leamsi
Glad you find it useful!

Even if this feels like reading gossip, I find it fascinating.

BTW, just did a quick change to add wikileaks.ch as the default host, since it doesn't seem to fail as often now.

Careful doing that; some ISPs are actively blocking that one, since it's the "official" one, I suppose.
Old 12-06-2010, 06:16 PM   #7
Phoul
Dances with penguins
Phoul began at the beginning.
 
Posts: 54
Karma: 10
Join Date: Oct 2010
Device: Sony PRS-350
Leamsi: I've noticed a bit of strange behavior with it, however. I was wondering if you or someone else who knows more than me could address it.

The DAYS variable seems to base itself on December 04. Today being the 6th, I set it to download 1 day, and I got the news that was released on the fourth. Nothing from today or the day before (the fifth). Any ideas?
Old 12-06-2010, 06:56 PM   #8
leamsi
Junior Member
leamsi began at the beginning.
 
Posts: 7
Karma: 12
Join Date: Nov 2010
Location: Mexico
Device: Kindle
Quote:
Originally Posted by Phoul
Leamsi: I've noticed a bit of strange behavior with it, however. I was wondering if you or someone else who knows more than me could address it.

The DAYS variable seems to base itself on December 04. Today being the 6th, I set it to download 1 day, and I got the news that was released on the fourth. Nothing from today or the day before (the fifth). Any ideas?

Uhm, it seems that most of the mirrors I added haven't refreshed themselves since the 4th. Try using wikileaks.ch or check with another mirror (I just randomly checked http://leaks.gooby.org and it seemed to have been updated properly).

Also it seems that no leaks have been released today? The latest in wikileaks.ch is 5th Dec... nothing on the 6th.

Which mirror are you using? (I can take a closer look when I get home as I'm at work right now)
Old 12-06-2010, 07:00 PM   #9
Phoul
Dances with penguins
Phoul began at the beginning.
 
Posts: 54
Karma: 10
Join Date: Oct 2010
Device: Sony PRS-350
Quote:
Originally Posted by leamsi
Uhm, it seems that most of the mirrors I added haven't refreshed themselves since the 4th. Try using wikileaks.ch or check with another mirror (I just randomly checked http://leaks.gooby.org and it seemed to have been updated properly).

Also it seems that no leaks have been released today? The latest in wikileaks.ch is 5th Dec... nothing on the 6th.

Which mirror are you using? (I can take a closer look when I get home as I'm at work right now)

I am now using wikileaks.ch. However, you appear to be right: nothing has gone online today. It may have something to do with the UK arrest warrant for Julian Assange.
Old 12-11-2010, 03:49 AM   #10
karunaji
Evangelist
karunaji ought to be getting tired of karma fortunes by now.
 
Posts: 421
Karma: 1033566
Join Date: Mar 2010
Location: Latvia
Device: Kindle 3 Wifi, Bookeen Opus
I have a suggestion for improving line unwrapping a little bit. You only need to unwrap lines that are longer than a certain threshold, for example 50 chars. Shorter lines are probably headings, so do not remove the line breaks after them.

Also, lines containing ------------ are used for underlining, so do not remove the line breaks before and after them either.

Python is not my forte, but here is an example of how it looks: cables_201012102105.epub
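
The two rules suggested above could be sketched roughly as follows. This is only an illustration of the heuristic, not the code that ended up in the recipe. The function name unwrap is hypothetical, the 50-character threshold is the example value from the post, and treating any run of four or more dashes as an underline is an assumption.

```python
import re

THRESHOLD = 50  # lines longer than this are assumed to be hard-wrapped prose
RULE = re.compile(r"-{4,}")  # assumption: 4+ dashes in a row means underlining


def unwrap(text, threshold=THRESHOLD):
    """Join hard-wrapped lines, keeping the breaks around short lines and rules."""
    out = []
    join_next = False  # True when the previous line was long enough to continue
    for line in text.splitlines():
        stripped = line.strip()
        is_rule = bool(RULE.search(stripped))
        if join_next and stripped and not is_rule:
            # The previous line was long: treat this one as its continuation.
            out[-1] += " " + stripped
        else:
            # Short previous line, blank line, or underline: keep the break.
            out.append(stripped)
        join_next = len(stripped) > threshold and not is_rule
    return "\n".join(out)
```

For example, a short heading followed by a dashed underline keeps both of its line breaks, while a 59-character line followed by a short remainder is merged into a single line. A heading longer than the threshold would be merged with the line after it, which is one reason the recipe needs additional heuristics.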
Old 12-11-2010, 03:57 PM   #11
OnwardAhead
Junior Member
OnwardAhead began at the beginning.
 
Posts: 5
Karma: 10
Join Date: Mar 2009
Location: Kenya
Device: Kindle DX
I'm a bit late getting back to the thread, as I've been traveling. Happy to see that this thread took off, and I appreciate the work of leamsi et al.

I'm off into the 'wilderness' for a few days, and it will be great to have these to read. Many thanks!
Old 12-11-2010, 04:53 PM   #12
Phoul
Dances with penguins
Phoul began at the beginning.
 
Posts: 54
Karma: 10
Join Date: Oct 2010
Device: Sony PRS-350
Thanks for your work on this recipe; it's a great help to have it in git now.
Old 12-12-2010, 02:42 PM   #13
MDrollette
Junior Member
MDrollette began at the beginning.
 
Posts: 1
Karma: 10
Join Date: Dec 2010
Device: none
There is an up-to-date RSS feed of cables as they are released at http://www.leakfeed.com/ It also has JSON and XML formats and a basic API for searching/querying cables.
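
With an RSS feed available, the entries could be read with nothing more than the Python standard library. The sketch below assumes a plain RSS 2.0 layout (item elements with title, link and pubDate). The sample document is made up for illustration and is not real output from the feed mentioned above, whose exact URL and structure are not given in this thread. The function name parse_items is hypothetical.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Made-up sample in standard RSS 2.0 layout; not real feed output.
SAMPLE_RSS = """<rss version="2.0">
  <channel>
    <title>Example cable feed</title>
    <item>
      <title>Example cable A</title>
      <link>http://example.invalid/cable/a</link>
      <pubDate>Sun, 05 Dec 2010 12:00:00 +0000</pubDate>
    </item>
    <item>
      <title>Example cable B</title>
      <link>http://example.invalid/cable/b</link>
      <pubDate>Sat, 04 Dec 2010 09:30:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def parse_items(rss_text):
    """Return a list of dicts with title, link and published date per item."""
    root = ET.fromstring(rss_text)
    items = []
    for item in root.iter("item"):
        pub = item.findtext("pubDate")
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            # pubDate uses the RFC 822 date format; keep None if it is missing.
            "published": parsedate_to_datetime(pub) if pub else None,
        })
    return items
```

Inside a Calibre recipe the feed would more likely be handed to Calibre's own feed handling, so this only shows what the feed data looks like once parsed.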
Old 12-15-2010, 11:11 PM   #14
Phoul
Dances with penguins
Phoul began at the beginning.
 
Posts: 54
Karma: 10
Join Date: Oct 2010
Device: Sony PRS-350
Git's gone.
Old 12-16-2010, 12:07 AM   #15
leamsi
Junior Member
leamsi began at the beginning.
 
Posts: 7
Karma: 12
Join Date: Nov 2010
Location: Mexico
Device: Kindle
Quote:
Originally Posted by Phoul
Git's gone.

My mistake. I changed the repo name and forgot to update the link. It should work now.
All times are GMT -4.