Thanks, Manichean!
That worked like a charm! I was trying to do it without regular expressions, so I didn't see the extra field for sending the output to a field other than the source field.
If anyone can advise me on the best scripting language to learn so I can get the title and author from the txt and html files, I would appreciate it greatly.
Are sed and grep the best way, or should I invest the time to learn Python?