10-15-2010, 09:32 AM   #5
chaley
If you feel up to it, you might consider Perl. Text matching and regular expressions are built into the language, which makes it well suited to text hacking.

Python has the advantage of being like 'normal' languages such as C or Java, but with some nice string manipulation thrown in. It wouldn't be my first choice for hacking text, but it isn't a bad choice by any means. Python also has the advantage of being able to use calibre's libraries, so you could set the fields in the database directly. That by itself could make it the best choice around.
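To give a rough idea of what the string handling looks like in Python, here is a minimal sketch using the standard `re` module. It assumes, purely for illustration, that each file contains lines starting with `Title:` and `Author:`; those markers and the `extract_metadata` name are my invention, so adjust the patterns to whatever your files really contain.

```python
import re

# Hypothetical layout: each book's text has lines such as
# "Title: ..." and "Author: ..." (these markers are assumptions).
TITLE_RE = re.compile(r"^Title:[ \t]*(.+?)[ \t]*$", re.MULTILINE)
AUTHOR_RE = re.compile(r"^Author:[ \t]*(.+?)[ \t]*$", re.MULTILINE)


def extract_metadata(text):
    """Return (title, author); either is None when its marker is absent."""
    title = TITLE_RE.search(text)
    author = AUTHOR_RE.search(text)
    return (
        title.group(1) if title else None,
        author.group(1) if author else None,
    )


sample = "Title: Bleak House\nAuthor: Charles Dickens\n\nChapter 1\n"
print(extract_metadata(sample))
```

Returning `None` for a missing field makes it easy to list the files that need fixing by hand instead of writing bad data.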

sed, grep, awk, and similar tools can work too, if you can work out the patterns and how to chain the commands. This option works best if the title and author information is easy to locate (for example, set off by standard bracketing text). You would probably need to generate a batch of calibre command-line scripts as output, but that is also true for Perl.
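For what the generated command lines might look like, here is a small sketch, written in Python only to keep the examples in one language. It assumes you want `calibredb add` with its `--title` and `--authors` options (check `calibredb add --help` on your install), and the file paths and book records are made up.

```python
import shlex

# Hypothetical records: (path, title, author) from whatever pattern
# matching you used; these values are invented for illustration.
books = [
    ("books/bleak_house.txt", "Bleak House", "Charles Dickens"),
    ("books/emma.txt", "Emma", "Jane Austen"),
]


def calibredb_add_command(path, title, author):
    """Build one shell-quoted `calibredb add` command line."""
    args = ["calibredb", "add", "--title", title, "--authors", author, path]
    # shlex.quote protects spaces and quotes in titles and names.
    return " ".join(shlex.quote(arg) for arg in args)


for book in books:
    print(calibredb_add_command(*book))
```

Redirect the output to a file, read it over, and then run it as a shell script. Quoting every argument matters because titles often contain spaces or apostrophes.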

Have fun.