MobileRead Forums - View Single Post

ahi · 08-31-2009, 01:39 PM

Quote:

Originally Posted by HarryT

Undoubtedly - it would probably be 10x faster if it were written in C++. However, I don't know about you, but for me Calibre is "fast enough". If it takes 2 minutes to convert a book from Mobi to ePub for me I'm not really that bothered.

Actually Python can be remarkably fast. Certain sorts of arguably heavy processing of a 100 MB text file take less than 30 seconds on my machine... which now and then makes me wonder whether certain parts of Calibre are violating some basic Python best practices.

The most obvious one, which in the past turned Python programs of mine that should have finished in under two minutes into ones that took literally hours, is the "do not build strings directly" or "do not build strings one character at a time" rule.

Code:

output = ''
for tmpChar in longtext:
     outChar = tmpChar
     # some conditional processing of outChar here
     output += outChar

gets incredibly slow incredibly fast as longtext grows. Strings are immutable, so each += may have to copy everything accumulated so far. A string of size 2X therefore takes far longer than twice as long to process as a string of size X.

Code:

outputList = []
for tmpChar in longtext:
     outChar = tmpChar
     # some conditional processing of outChar here
     outputList.append(outChar)
# join once at the end instead of growing a string inside the loop
output = ''.join(outputList)

however, is consistently fast: processing time grows roughly in proportion to the length of the string.
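If anyone wants to check this on their own machine, here is a rough timing sketch of the two approaches. The function names and the upper() call (standing in for the "conditional processing") are just placeholders of mine, not anything from Calibre. Results will vary by interpreter and version; some CPython builds optimize the += case, so the gap may be much smaller than what I saw.

```python
import timeit


def build_by_concat(longtext):
    """Grow the result with += one character at a time."""
    output = ''
    for ch in longtext:
        # upper() is a hypothetical stand-in for the real per-character work
        output += ch.upper()
    return output


def build_by_join(longtext):
    """Collect the pieces in a list and join them once at the end."""
    parts = []
    for ch in longtext:
        parts.append(ch.upper())
    return ''.join(parts)


def time_both(size, repeats=3):
    """Return (concat_seconds, join_seconds) for a test string of the given size."""
    text = 'a' * size
    t_concat = timeit.timeit(lambda: build_by_concat(text), number=repeats)
    t_join = timeit.timeit(lambda: build_by_join(text), number=repeats)
    return t_concat, t_join


# Doubling the size each time shows how each approach scales.
for size in (10000, 20000, 40000):
    t_concat, t_join = time_both(size)
    print(size, round(t_concat, 4), round(t_join, 4))
```

If the first timing column roughly quadruples each time the size doubles while the second only doubles, you are seeing the slowdown described above.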

Admittedly, I doubt this is news to Kovid... but I imagine there is more than one such pitfall in Python where the obvious way is suboptimal for processing-heavy code. Might something like this be acting as a bottleneck in some of your conversion scripts, Kovid?

- Ahi