Old 08-31-2009, 01:39 PM   #8
ahi
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
Quote:
Originally Posted by HarryT
Undoubtedly - it would probably be 10x faster if it were written in C++. However, I don't know about you, but for me Calibre is "fast enough". If it takes 2 minutes to convert a book from Mobi to ePub for me I'm not really that bothered.
Actually Python can be remarkably fast. Certain sorts of arguably heavy processing of a 100 MB text file take less than 30 seconds on my machine... which now and then makes me wonder whether certain parts of Calibre are violating some basic Python best practices.

The most obvious one, which in the past made Python programs of mine that should have finished in under two minutes take literally hours, is the "do not build strings directly" (or "do not build strings one character at a time") rule.

Code:
output = ''
for tmpChar in longtext:
     outChar = tmpChar
     # some conditional processing of outChar here
     output += outChar
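gets incredibly slow incredibly fast as the size of the longtext string grows. Processing a string of size 2X takes far more than twice as long as processing a string of size X, presumably because Python strings are immutable, so each += can end up copying everything built so far.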

Code:
outputList = []
for tmpChar in longtext:
     outChar = tmpChar
     # some conditional processing of outChar here
     outputList.append(outChar)
output = ''.join(outputList)
however, is consistently fast: longer strings take only proportionally longer to process.
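For anyone who wants to check this on their own machine, here is a rough timing sketch of the two approaches. The function names and the upper-casing step are placeholders I made up to stand in for the "conditional processing"; nothing here is taken from Calibre. Note too that some CPython versions can optimize += on strings in place in simple cases, so the gap you measure may be smaller or larger than what I saw.

```python
import timeit


def build_by_concat(longtext):
    # Grow the result one character at a time with +=.
    output = ''
    for tmp_char in longtext:
        out_char = tmp_char.upper()  # stand-in for the real processing
        output += out_char
    return output


def build_by_join(longtext):
    # Collect the pieces in a list and join them once at the end.
    output_list = []
    for tmp_char in longtext:
        out_char = tmp_char.upper()  # stand-in for the real processing
        output_list.append(out_char)
    return ''.join(output_list)


def compare(n, repeats=3):
    # Time both approaches on a string of 4 * n characters.
    # Returns (seconds for +=, seconds for join), best of `repeats` runs.
    text = 'abc ' * n
    t_concat = min(timeit.repeat(lambda: build_by_concat(text),
                                 number=1, repeat=repeats))
    t_join = min(timeit.repeat(lambda: build_by_join(text),
                               number=1, repeat=repeats))
    return t_concat, t_join


if __name__ == '__main__':
    # Doubling the input size shows how each approach scales.
    for n in (50000, 100000, 200000):
        t_concat, t_join = compare(n)
        print(n * 4, 'chars: += took', t_concat, 's, join took', t_join, 's')
```

If the += column more than doubles each time the input doubles while the join column roughly doubles, you are looking at the same effect described above.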

Admittedly, I doubt this is news to Kovid... but I imagine there is more than one such pitfall in Python where the obvious way is suboptimal for processing-heavy code. Might something like this be acting as a bottleneck in some of your conversion scripts, Kovid?

- Ahi