07-21-2007, 10:33 AM   #1
sammykrupa
Reader of the Reader
Posts: 103
Karma: 107
Join Date: Apr 2006
Device: Sony Reader PRS-500
kovidgoyal: templatemaker -- automatic data extractor

templatemaker (http://code.google.com/p/templatemaker/) looks like a perfect fit for the web2disk utility:

Given a list of text files in a similar format, templatemaker creates a template that can extract data from files in that same format.

The library is written in Python, but the underlying longest-common-substring algorithm is implemented in C for performance.
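
To give a feel for the idea, here is a rough sketch in plain Python. It is not templatemaker's own code or API: the names learn, as_text, extract and HOLE are made up for this illustration, and it leans on the standard library's difflib to find the longest common substring instead of a C extension. It also only learns from two samples. The approach assumed here is to find the longest substring the two samples share, keep it as fixed text, and repeat on what is left on either side; anything that differs becomes a "hole" that data can later be pulled from.

```python
import re
from difflib import SequenceMatcher

HOLE = object()  # hypothetical marker for a variable region in the template


def learn(a, b, min_len=2):
    """Split two sample strings into shared literal pieces and HOLE markers."""
    # Longest common substring of a and b (difflib, not templatemaker's C code).
    m = SequenceMatcher(None, a, b, autojunk=False).find_longest_match(
        0, len(a), 0, len(b))
    if m.size < min_len:
        # Nothing substantial in common: treat the whole span as variable.
        return [HOLE] if (a or b) else []
    # Recurse on the text to the left and right of the shared piece.
    left = learn(a[:m.a], b[:m.b], min_len)
    right = learn(a[m.a + m.size:], b[m.b + m.size:], min_len)
    return left + [a[m.a:m.a + m.size]] + right


def as_text(template, marker="!"):
    """Render the template with a visible marker in place of each hole."""
    return "".join(marker if part is HOLE else part for part in template)


def extract(template, text):
    """Return the hole contents of a text that fits the template."""
    pattern = "".join(
        "(.*?)" if part is HOLE else re.escape(part) for part in template)
    match = re.fullmatch(pattern, text, re.DOTALL)
    if match is None:
        raise ValueError("text does not fit the template")
    return match.groups()


template = learn("<b>this and that</b>", "<b>alex and sue</b>")
print(as_text(template))
print(extract(template, "<b>larry and curly</b>"))
```

For web2disk the samples would be whole pages from the same site rather than one-line strings, and a real template would want to learn from more than two of them.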



Check out the example usage!


Sam Krupa