Old 12-07-2006, 04:47 PM   #9
sic
Addict
Posts: 202
Karma: 692
Join Date: Oct 2006
Device: SONY reader
HTML and TXT are the most problematic formats.
But then again, what would a human do? You'd start looking into the file, and near the top you'd find the author, the title, and so on.
We could probably come up with a program, nothing too AI-heavy, that does some pattern matching to figure out the basic metadata.
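
To make the pattern-matching idea a bit more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration: the function name guess_metadata, the "Title:" / "Author:" / "by ..." line patterns, and the 40-line cutoff are made up, and real files would need many more variants.

```python
import re

# Hypothetical patterns for illustration; real files use many other layouts.
TITLE_RE = re.compile(r"^\s*title\s*[:\-]\s*(.+?)\s*$", re.IGNORECASE)
AUTHOR_RE = re.compile(r"^\s*(?:author\s*[:\-]|by\s)\s*(.+?)\s*$", re.IGNORECASE)
HTML_TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def guess_metadata(text, max_lines=40):
    """Guess title and author by scanning near the top of a TXT or HTML file."""
    meta = {"title": None, "author": None}

    # For HTML, the <title> element is the most obvious place to look.
    html_title = HTML_TITLE_RE.search(text)
    if html_title:
        # Collapse runs of whitespace inside the tag into single spaces.
        meta["title"] = " ".join(html_title.group(1).split())

    # Otherwise do what a human would: read the first few lines.
    for line in text.splitlines()[:max_lines]:
        if meta["title"] is None:
            m = TITLE_RE.match(line)
            if m:
                meta["title"] = m.group(1)
                continue
        if meta["author"] is None:
            m = AUTHOR_RE.match(line)
            if m:
                meta["author"] = m.group(1)
    return meta


if __name__ == "__main__":
    sample = "Title: Pride and Prejudice\nAuthor: Jane Austen\n\nChapter 1\n"
    print(guess_metadata(sample))
```

This is only a heuristic. A line such as "By the time he arrived..." near the top would be mistaken for an author, so the guesses would still need a human to review them.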

After that, all that's needed is to throw the results into a table, and that's not a big deal.
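
As a sketch of the "throw them into a table" step, assuming SQLite is an acceptable place to keep the table (the post doesn't say which database is meant), something like the following would do. The table name books, its columns, and the function store_metadata are hypothetical.

```python
import sqlite3


def store_metadata(conn, rows):
    """Store (path, title, author) tuples in a hypothetical 'books' table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books ("
        "path TEXT PRIMARY KEY, title TEXT, author TEXT)"
    )
    # INSERT OR REPLACE lets a rescan of the same file update its row.
    conn.executemany(
        "INSERT OR REPLACE INTO books (path, title, author) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # in-memory database for the demo
    store_metadata(conn, [("austen.txt", "Pride and Prejudice", "Jane Austen")])
    for row in conn.execute("SELECT path, title, author FROM books"):
        print(row)
```

Keying on the file path means a file whose metadata could not be guessed can still be stored with empty title and author and fixed up by hand later.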