08-30-2004, 10:25 PM   #5
hacker
Technology Mercenary
Posts: 617
Karma: 2561
Join Date: Feb 2003
Location: East Lyme, CT
Device: Direct Neural Implant
Wikipedia in iSilo (or Plucker) format can't be built on the code that is available now. A lot of the (broken) Perl scripts out there that try to convert it into something useful don't work, and they aren't really scalable.

The actual MediaWiki software that drives the wiki has to be updated, and I've already done this in my copy. This includes quite a few changes to make the wiki's output more "compatible" with handhelds.

I've been working with the MediaWiki developers to work this out. I have an approach that might work, but it still requires quite a bit of work on the database, content, and schema. The wiki tags used in Wikipedia are not exactly "valid" in most cases, so a set of rules has to be developed to handle this conversion.

Importing the SQL dump of Wikipedia plus images into a MySQL database and then parsing that with Perl to convert it is the wrong approach. Spidering the actual visible output of the modified MediaWiki code, on a per-language basis, is the right way to do this.
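
To make the spidering idea a bit more concrete, here is a rough sketch in Python. It is not the code I'm actually running, just an illustration of crawling the rendered pages of one language's wiki instead of parsing the dump. The names (LinkCollector, article_links, spider), the "/wiki/" path prefix, and the fetch callable are all assumptions made up for the example; swap in whatever your install uses.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags in one rendered page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def article_links(base_url, html, prefix="/wiki/"):
    """Return absolute same-host article URLs found in html.

    The "/wiki/" prefix is an assumed URL layout, not a given.
    """
    parser = LinkCollector()
    parser.feed(html)
    host = urlparse(base_url).netloc
    links = []
    for href in parser.hrefs:
        # Resolve relative links and drop any #fragment.
        url = urljoin(base_url, href).split("#", 1)[0]
        parts = urlparse(url)
        # Stay on this language's host and skip namespaced pages
        # (anything with a colon in the path, e.g. Special: or Talk:).
        on_host = parts.netloc == host
        is_article = parts.path.startswith(prefix) and ":" not in parts.path
        if on_host and is_article and url not in links:
            links.append(url)
    return links


def spider(start_url, fetch, limit=100):
    """Breadth-first crawl; fetch(url) returns HTML text, or None on failure."""
    seen = {start_url}
    queue = deque([start_url])
    pages = {}
    while queue and len(pages) < limit:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue
        pages[url] = html
        for link in article_links(url, html):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

You would run spider() once per language, each with its own start URL, and hand the collected pages to the iSilo or Plucker converter. Passing fetch in as a parameter keeps the crawl logic testable without touching the network; a real fetch would also want rate limiting and error handling.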

It'll happen, but a lot of other pieces have to be fixed up first. See this message I wrote discussing my progress in this, and other areas.