Go Back   MobileRead Forums > E-Book Formats > Workshop
08-10-2009, 11:44 AM   #1
ahi
Generating from Wikipedia

Are there any good tools or programming libraries (Python or PHP) for working with Wikipedia, or with its database, at a reasonably high level, for the purpose of extracting articles?

The sort of thing I'd want is to be able to grab the full HTML of an article, along with its associated images at full resolution, with a single command or command-line call. I'd also like to get a list of the article titles it links to in a similarly straightforward way.
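To make the request concrete, here is a minimal sketch of the kind of call described above, written against the public MediaWiki web API (`action=parse`) rather than a database dump. The endpoint URL, the User-Agent string, and the helper names `build_parse_url`, `extract_article`, and `fetch_article` are assumptions made for illustration, not an existing library. It returns image file names only; resolving them to full-resolution URLs would need a further `prop=imageinfo` query, which is not shown.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the English Wikipedia API. Other wikis use their own host.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_parse_url(title):
    """Build an action=parse request URL for a single article title."""
    params = {
        "action": "parse",
        "page": title,
        "prop": "text|links|images",  # rendered HTML, outgoing links, image names
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_article(response):
    """Pick the HTML, linked article titles, and image names out of the JSON.

    Assumes the default (legacy) JSON layout, where the HTML and the link
    titles are stored under the "*" key.
    """
    parsed = response["parse"]
    return {
        "title": parsed["title"],
        "html": parsed["text"]["*"],
        # Namespace 0 holds ordinary articles; skip talk pages, templates, etc.
        "links": [link["*"] for link in parsed["links"] if link.get("ns") == 0],
        "images": list(parsed["images"]),
    }


def fetch_article(title):
    """Download and unpack one article (performs a network request)."""
    request = urllib.request.Request(
        build_parse_url(title),
        # Hypothetical User-Agent; use one that identifies your own tool.
        headers={"User-Agent": "wiki-ebook-sketch/0.1 (example)"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_article(json.load(reply))
```

With this in place, `fetch_article("E-book")["links"]` would give the list of linked article titles, and the `"html"` entry the rendered page body.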

Ideally, it should also be possible to program a limited but "intelligent" spidering of articles.
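One way to read "limited but intelligent" spidering is a breadth-first crawl with a depth limit, a cap on the number of pages, and a filter that decides which links are worth following. The sketch below assumes exactly that reading; the function `crawl` and its parameters `get_links`, `max_depth`, `max_pages`, and `keep` are hypothetical names, and `get_links` stands for any function that returns the linked titles of an article.

```python
from collections import deque


def crawl(start, get_links, max_depth=1, max_pages=20, keep=lambda title: True):
    """Breadth-first walk over article links, bounded in depth and page count.

    start     -- title of the first article
    get_links -- callable returning the titles an article links to
    max_depth -- how many link hops away from `start` to go
    max_pages -- hard cap on the number of articles visited
    keep      -- predicate deciding whether a linked title is worth following
    """
    seen = {start}
    visited = []
    queue = deque([(start, 0)])
    while queue and len(visited) < max_pages:
        title, depth = queue.popleft()
        visited.append(title)
        if depth >= max_depth:
            continue  # at the depth limit: record the page, do not expand it
        for linked in get_links(title):
            if linked not in seen and keep(linked):
                seen.add(linked)
                queue.append((linked, depth + 1))
    return visited


# Usage with a small in-memory link graph standing in for Wikipedia.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["A", "D"], "D": ["E"]}
titles = crawl("A", lambda t: graph.get(t, []), max_depth=2)
```

The "intelligence" lives in `keep`: it could reject list pages, dates, or anything outside a chosen set of topics, so the crawl stays on subject instead of following every link.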

Thanks for any tips in advance.

- Ahi