@g25:
Several comments that you should heed:
Quote:
|
I notice during the "Scrub Work Data at Book Level [Selected Books]" it tends to get stuck at 99% doing miscellaneous scrubbing (or whatever)
|
No. It is not "stuck" at all. It is doing important miscellaneous work in pure SQL that is "above" Python, so there is no way to report its progress back to Calibre's job notification. It does what it does at the Library Level for all books that have a status of any type. When it says "doing miscellaneous scrubbing", that is exactly what it is doing.
Quote:
|
I then have to manually stop the job. The books don't update. But then I exit out of Calibre (Database locked error) and start it back up and the books have been scrubbed with a "book_ok" status and they look fine.
|
Of course they have a "book_ok" status, because, as you will note from the Job Log (which you must watch as the job runs), Book Level Scrubbing had finished. The total count of books scrubbed at that point was identical to the original unscrubbed number. You wrongly stopped the job after Book Level Scrubbing finished but before Miscellaneous Scrubbing finished.
Quote:
|
I then have to manually stop the job.
|
No, you do not "have to" manually stop the job. That was a personal decision on your part. Who knows what did not get fixed as a result. Why not just unplug your PC? Same result.
Quote:
|
Is it because I am doing a huge 6,000 book library? Even though it hangs for several minutes at the 99% mark even when book level scrubbing only a couple of books?
|
It does not "hang". It is doing many different things, as shown below. Those marked "Entirely in SQL" do not "report back" their progress; that does not mean they are "hung".
- add_tag_combinations - * Entirely in SQL
- apply_tag_string_replacement_rules - * Entirely in SQL
- __delete_unused_values
- __set_null_seriesname_index_to_zero
- _insert_single_tags_into_work_tags_single
- _convert_identifiers_isbn_from_10_to_13
- change_work_title_to_web_title_for_previously_validated_work_series - * Entirely in SQL
- miscellany_change_work_index_to_web_index_standalone
- apply_author_mapping_rules - * Entirely in SQL
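To see why a step like those marked "Entirely in SQL" cannot show a percentage, here is a minimal sketch. It assumes SQLite (which Calibre uses); the table and column names are hypothetical, not the plugin's actual schema. A single SQL statement runs to completion inside the database engine, and Python only regains control after it returns, so there is no point at which a progress update could be sent to the GUI.

```python
import sqlite3

# Hypothetical miniature of one "miscellaneous scrubbing" step:
# delete work-tag values that no book references any longer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE work_tags (id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE books_work_tags_link (book INTEGER, tag INTEGER);
    INSERT INTO work_tags (tag) VALUES ('Fiction'), ('Unused'), ('History');
    INSERT INTO books_work_tags_link VALUES (1, 1), (2, 3);
""")

# One statement, executed atomically by SQLite's engine. Python blocks
# here until it finishes -- there is no callback mid-statement, hence
# the job percentage simply sits at its last value until this returns.
conn.execute("""
    DELETE FROM work_tags
    WHERE id NOT IN (SELECT tag FROM books_work_tags_link)
""")
conn.commit()

remaining = [row[0] for row in
             conn.execute("SELECT tag FROM work_tags ORDER BY tag")]
print(remaining)  # ['Fiction', 'History']
```

On a 6,000-book library such a statement can take minutes, during which the job appears frozen at 99% even though the database is working flat out.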
As the User Guide says, scrub your books from start-to-finish in Batches. That means there should be no books whatsoever in the "current" Q&S library that you do not want scrubbed simultaneously, or, if that is impractical, all but one batch's worth of books should have no Work Data whatsoever. Q&S jobs ignore all books with no Work Data. The User Guide says that as well.
Author Level Scrubbing is for all books that have a status of "dirty". Period.
Miscellaneous Scrubbing is for all books in the current Q&S that have any non-null status.
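The two selection rules above can be sketched as simple SQL predicates. This is an illustration only, assuming SQLite; the `work_status` table and its columns are hypothetical names, not the plugin's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE work_status (book INTEGER PRIMARY KEY, status TEXT);
    INSERT INTO work_status VALUES
        (1, 'dirty'), (2, 'book_ok'), (3, NULL), (4, 'dirty');
""")

# Author Level Scrubbing: only books whose status is exactly 'dirty'.
author_level = [r[0] for r in conn.execute(
    "SELECT book FROM work_status WHERE status = 'dirty' ORDER BY book")]

# Miscellaneous Scrubbing: every book with any non-null status.
misc_level = [r[0] for r in conn.execute(
    "SELECT book FROM work_status WHERE status IS NOT NULL ORDER BY book")]

print(author_level)  # [1, 4]
print(misc_level)    # [1, 2, 4]
```

Note that book 3, which has no status (no Work Data), is ignored by both passes, which is why books without Work Data are skipped entirely.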
You should not have copied "real to work" for all 6000. That is the root cause of your problem.
I have 5 "production" Q&S Libraries, which you would see in the Configuration screen, the User Guide, and some of the .txt attachments in the Original Post.
- QuarantineAndScrub_prefilter - where I "Add Books", convert them to EPUB, add ISBNs, modify the EPUBs, count pages, and so forth. Then, they are moved to _prescrub.
- QuarantineAndScrub_prescrub - where I hold them after _prefilter, and then move them to the next library when I feel like working on one or some of them. Some I work on in this library because I am impatient to read them. Obviously I copy all of my "rules" from my "best" Q&S library prior to that. You will note there is a special job just for that under "Special Tools".
- QuarantineAndScrub - where all Author Level and Book Level Scrubbing is done. I then "copy work to real" metadata for the entire Batch. Afterwards, the books are moved to _postscrub.
- QuarantineAndScrub_postscrub - where scrubbed books go for final spot-cleaning. I very likely will (again) copy "real to work", but this time the "real" is the "green" metadata from the prior library. Here I will do Tag Scrubbing and Tag Minimization, Consolidating Series Names, Deriving Genres, Deriving Library of Congress and Dewey Decimal Codes, Pseudonyms, and so forth. When I have finished with a book, I move it to my "pristine" library.
- QuarantineAndScrub_CalibreMain - my "pristine" Calibre Library. I work on it in batches of ~200 or so to tweak the metadata of books that I had prior to creating Q&S. Q&S was for my personal use in the beginning, to speed things up. New books start at _prefilter, but my "original" books are worked on in situ.
At this point, you of course do not want to lose the scrubbing you have done so far. Off the top of my head, this is what I would do if I were in your bind (thankfully I am not):
- Create a copy of your single Q&S Library, books and all, and make that copy a new Q&S Library called "QuarantineAndScrub_prescrub".
- Switch back to the original library you have been working in, and delete the 3,000 books that you have not yet started.
- Finish the good ones and move them forward.
- Switch back to the _prescrub library and delete the ones that you just moved forward.
- Finish the rest there, and move them to the end, bypassing the "real" Q&S library.
There are variations on this that I would consider, such as doing them in batches of 500 instead of one batch of 3,000. That is up to you.
Good luck.
DaltonST
[Additional Comments:]
Quote:
|
But I don't want to do just 100 at a time for 3000 books!
|
Who said you had to do just 100 at a time for 3000 books? You can execute "copy real to work" for "selected books only" as many times as you wish. I prefer batches of 200-500, myself.