Old 09-19-2012, 02:03 AM   #445
BetterRed
null operator (he/him)
Posts: 20,568
Karma: 26954694
Join Date: Mar 2012
Location: Sydney Australia
Device: none
I'll make this as detailed as I can. I'm using remote control over a 64Kb/s satellite link with long latencies, so I want to avoid to-ing and fro-ing with questions and answers.

I use Windows 7 64-bit with all updates installed, Calibre 0.8.69 and Count Pages 1.6.3. My settings for the plug-in are attached.

I use Process Explorer to observe what programs are doing.

Some of my books take a long time to word count (15 minutes or more), even though similar books of similar size, by the same author, in the same format and from the same source take a few seconds.

First, what happens when I count words in a single 'slow' book.

Whilst the plug-in is counting a 'slow' book's words, one of the two instances of calibre-parallel uses 24.n% of my quad-core i5, i.e. almost an entire core. My observation is that the main calibre process spawns a calibre-parallel process, which spawns another calibre-parallel process. It's the second calibre-parallel that chews up processor resources, so I assume it's the one doing the 'work'. This situation doesn't bother me too much; it gets there eventually.

Now, what happens when I select a group of books.

I can run into the situation where the group includes several of these 'slow' books. The plug-in spawns up to 5 instances of calibre-parallel. When this happens my computer can become unusable for an unacceptable period (my patience ran out after 33 minutes), because the 4 secondary calibre-parallel processes are EACH consuming 24.n% of the CPU (almost an entire core each) processing 'slow' books, totaling ~98% of the entire CPU.
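The arithmetic behind those figures can be sketched in a few lines of Python. This is only an illustration: the 24.5% reading is an assumed stand-in for the '24.n%' shown in Process Explorer, and the function name is made up.

```python
def total_cpu_share(busy_processes, cores, per_process_pct):
    """Return the share of the whole CPU used by busy single-threaded processes.

    Process Explorer reports usage relative to the whole CPU, so one fully
    busy single-threaded process on a 4-core machine tops out at 100/4 = 25%.
    """
    ceiling = 100.0 / cores  # most a single-threaded process can use
    per_process = min(per_process_pct, ceiling)
    return min(busy_processes * per_process, 100.0)


# Assumed reading: 24.5% stands in for the "24.n%" in the post.
print(total_cpu_share(busy_processes=4, cores=4, per_process_pct=24.5))  # 98.0
```

With one 'slow' book the same calculation gives about a quarter of the CPU, which is why a single count is tolerable and four at once are not.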

If I fight with the sloth-like mouse to stop the job, that doesn't give me back the CPU resources: whilst the primary instance of calibre-parallel dies, the four that were doing the work get detached from the main calibre process and carry on independently, still hogging the CPU, so I have to kill them individually with Process Explorer.

So rather than killing the job from calibre, it's faster and easier to kill the process tree of the primary calibre-parallel process with Process Explorer. Calibre complains, but it doesn't crash and there's no apparent harm done, i.e. the books and database are OK because the plug-in is not accessing them while it's doing the counting.
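What 'kill the process tree' amounts to can be sketched as a walk over parent/child links. The snippet below is a minimal illustration, not anything calibre or Process Explorer actually runs: the PIDs are invented, and the (pid, parent_pid) pairs are assumed to have been read off Process Explorer's tree view.

```python
from collections import defaultdict, deque


def descendants(root_pid, processes):
    """Return the PIDs of every process below root_pid, nearest first.

    `processes` is an iterable of (pid, parent_pid) pairs.
    """
    children = defaultdict(list)
    for pid, parent_pid in processes:
        children[parent_pid].append(pid)

    found = []
    seen = {root_pid}  # guards against loops caused by reused PIDs
    queue = deque(children[root_pid])
    while queue:
        pid = queue.popleft()
        if pid in seen:
            continue
        seen.add(pid)
        found.append(pid)
        queue.extend(children[pid])
    return found


# Hypothetical PIDs: calibre (100) -> primary calibre-parallel (200)
# -> four secondary calibre-parallel workers (201..204).
snapshot = [(200, 100), (201, 200), (202, 200), (203, 200), (204, 200)]
to_kill = [200] + descendants(200, snapshot)
print(to_kill)  # [200, 201, 202, 203, 204]
```

From a command prompt, `taskkill /PID <pid> /T /F` should do the same job, since `/T` ends the given process together with its child processes and `/F` forces termination.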

What have I done to try to 'fix' it?

Based on reading this thread I progressively disabled DEP (at the command line so it was disabled for everything including Windows), disabled my AV, disabled the firewall, disconnected the router, closed all other programs (including disabling the ones that start in the tray), and restarted Windows between each, the final one into Safe Mode - all to no avail.

I haven't tried closing the Tag Browser, which I saw suggested in this thread and elsewhere as a possible solution to various problems, because I can't figure out how to do that.

There's no sign of any memory leaks, and the 'slow books' are in the minority, an estimated < 10%.

Changing the algo between 'ADE' and 'calibre E-book' makes no obvious difference to the speed. I didn't try the 'APNX' algo because it seems to be about page counting.

Is there some way to have this plug-in work in a serial manner rather than multi-tasking via spawning multiple calibre-parallel processes, i.e. limit the number of secondary instances of calibre-parallel to ONE?
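To illustrate the kind of behaviour being asked for, here is a minimal sketch of a worker pool capped at one worker, so books are counted strictly one after another. It is not the plug-in's code: `count_words` and `count_all` are hypothetical names, and threads stand in for the calibre-parallel processes purely to keep the example short.

```python
from concurrent.futures import ThreadPoolExecutor


def count_words(text):
    # Hypothetical stand-in for the plug-in's per-book counting step.
    return len(text.split())


def count_all(books, max_workers=1):
    """Count words for each (title, text) pair, at most max_workers at a time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        titles = [title for title, _ in books]
        counts = pool.map(count_words, (text for _, text in books))
        return dict(zip(titles, counts))


books = [("Book A", "one two three"), ("Book B", "four five")]
print(count_all(books))  # {'Book A': 3, 'Book B': 2}
```

With `max_workers=1` the queue drains serially, so a 'slow' book delays the books behind it but never ties up more than one worker.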

It would 'help' if I could determine the identity (author-title) of book(s) currently being processed in a job, or books that have completed. Then I could kill the job, put the 'slow books' aside to be done one at a time, and redo the group without the slow books.
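The sort of reporting meant here could look roughly like the sketch below: announce each title before counting starts, time it, and sort the books into 'fast' and 'slow' afterwards. All the names are hypothetical, the counting function is passed in by the caller, and the plug-in is not known to expose anything like this.

```python
import time


def split_by_duration(books, count_words, budget_seconds,
                      clock=time.perf_counter, log=print):
    """Count each (title, text) pair in turn and return (fast, slow).

    Both results are lists of (title, seconds); a book is 'slow' when its
    count took longer than budget_seconds.
    """
    fast, slow = [], []
    for title, text in books:
        log(f"counting: {title}")  # shows which book is in progress
        started = clock()
        count_words(text)
        elapsed = clock() - started
        (slow if elapsed > budget_seconds else fast).append((title, elapsed))
    return fast, slow


fast, slow = split_by_duration(
    [("Book A", "one two three")],
    lambda text: len(text.split()),
    budget_seconds=60,
)
print([title for title, _ in slow])
```

The 'slow' list is then the set of books to put aside and count one at a time.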

My limited tests show that a 'slow to count' book does not take significantly longer to convert to and from EPUB and RTF than similar 'fast to count' books. In fact, 'slow book' conversion is usually (always?) faster than 'slow book' word counting, but the opposite is true of a 'fast book'.

I have no doubt that it's something to do with the book content, but I've no idea what, given that similar books are OK and other plug-ins are not slow in processing the same books.

I'll try starting calibre via the start command with the affinity switch. The switch takes a hexadecimal bit mask, so a value of 3 (not 2, which selects a single core) should theoretically limit calibre to 2 cores only.
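Because the affinity switch of the start command expects a hexadecimal bit mask in which each bit selects one core, the value is easy to get wrong. The helper below is a small sketch for working it out; the function name is made up and cores are assumed to be numbered from 0.

```python
def affinity_mask(cores):
    """Return the hexadecimal mask that selects the given zero-based cores."""
    mask = 0
    for core in cores:
        if core < 0:
            raise ValueError("core numbers start at 0")
        mask |= 1 << core  # set the bit belonging to this core
    return format(mask, "X")


print(affinity_mask([1]))        # 2 -> core 1 only
print(affinity_mask([0, 1]))     # 3 -> cores 0 and 1
print(affinity_mask([0, 1, 2]))  # 7 -> cores 0, 1 and 2
```

Assuming calibre.exe is on the PATH, the resulting command would be along the lines of `start /affinity 3 calibre.exe`.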

BR
Attached Thumbnails
Name:	Capture.JPG
Size:	72.4 KB
ID:	92576  

Last edited by BetterRed; 09-19-2012 at 03:06 AM. Reason: forgot the attachment & para re algo's