Old 06-07-2023, 09:12 PM   #12
tomsem
Grand Sorcerer
Posts: 6,953
Karma: 27060153
Join Date: Apr 2009
Location: USA
Device: iPhone 15PM, Kindle Scribe, iPad mini 6, PocketBook InkPad Color 3
This is what I came up with (not tested on Windows or Linux):

Code:
from logging import info, basicConfig, INFO
from pathlib import Path
from re import fullmatch
from sys import argv

CONTENT_EXTENSIONS = {'.kfx', '.azw', '.azw1', '.azw3', '.pdf', '.txt'}
TMP_REGEX = r'\.tmp\d*_(ASC|PHL)'
basicConfig(level=INFO)


def remove_dir(folder):
    try:
        Path(folder).rmdir()
        info(f'Removed {folder}')
    except OSError as ex:
        info(f'** {ex} **')


def rmtree(root):
    for p in root.iterdir():
        if p.is_dir():
            rmtree(p)
        else:
            p.unlink()

    root.rmdir()


scan_folder = Path('/Volumes/Kindle/documents' if len(argv) == 1 else argv[1])
if not scan_folder.is_dir():
    info(f'Path {scan_folder} does not exist, quitting.')
    exit(1)

content_file_stems = set()
tmp_files = []
sdr_folders = []
for path_ in scan_folder.glob('**/*'):
    if '.sdr' in str(path_.parent) or 'updates' in path_.parent.parts:
        # take no action on content in .sdr folders or /updates folder
        continue
    if path_.is_dir():
        if path_.suffix == '.sdr':
            sdr_folders.append(path_)
        else:
            remove_dir(path_)
    else:
        if path_.suffix in CONTENT_EXTENSIONS:
            content_file_stems.add((path_.parent, path_.stem))
        elif fullmatch(TMP_REGEX, path_.suffix):
            tmp_files.append(path_)

for tmp_file in tmp_files:
    if (tmp_file.parent, tmp_file.stem) not in content_file_stems:
        Path(tmp_file).unlink()
        info(f'Removed orphaned temp file {tmp_file}')

for sdr_folder in sdr_folders:
    if (sdr_folder.parent, sdr_folder.stem) not in content_file_stems:
        rmtree(sdr_folder)
        info(f'Removed orphaned .sdr folder {sdr_folder}')

All standard library, no external dependencies. It needs Python 3.6 or later (for the f-strings). On Mac you can just type 'python clean.py' once the Kindle is plugged in. On Windows, you can use calibre-debug as the interpreter (while supplying a path argument like E:/documents).
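
Since the script deletes things, it may be worth looking at what it would remove first. Here is a rough dry-run sketch using the same matching rules. find_orphans is a name I made up for illustration, it only reports and never deletes, and it also skips the empty-folder cleanup the script does:

```python
from pathlib import Path
from re import fullmatch

CONTENT_EXTENSIONS = {'.kfx', '.azw', '.azw1', '.azw3', '.pdf', '.txt'}
TMP_REGEX = r'\.tmp\d*_(ASC|PHL)'


def find_orphans(scan_folder):
    """Return (tmp_files, sdr_folders) that have no matching content file.

    Hypothetical helper, not part of the script: nothing is deleted here.
    """
    scan_folder = Path(scan_folder)
    content = set()
    tmp_files, sdr_folders = [], []
    for path_ in scan_folder.glob('**/*'):
        if '.sdr' in str(path_.parent) or 'updates' in path_.parent.parts:
            continue  # same exclusions as the script
        if path_.is_dir():
            if path_.suffix == '.sdr':
                sdr_folders.append(path_)
        elif path_.suffix in CONTENT_EXTENSIONS:
            content.add((path_.parent, path_.stem))
        elif fullmatch(TMP_REGEX, path_.suffix):
            tmp_files.append(path_)

    def is_orphan(p):
        # Orphan = no content file with the same folder and stem
        return (p.parent, p.stem) not in content

    return (sorted(p for p in tmp_files if is_orphan(p)),
            sorted(p for p in sdr_folders if is_orphan(p)))
```

Calling find_orphans('/Volumes/Kindle/documents') and printing both lists should show what a real run would remove, assuming the folder layout is what the script expects.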

It runs almost instantaneously, so I don't see any particular penalty to running it every time you plug the Kindle in.

I have seen the tmp files on only one of my Kindles, so it's possible there are other patterns to look for.
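
If other patterns do turn up, one way to handle them might be a list of regexes instead of a single one. This is only a sketch: the first pattern is the one from the script, while the second ('.partial') is purely hypothetical and not something I have seen on any Kindle. TMP_PATTERNS and is_tmp_suffix are made-up names.

```python
from re import fullmatch

TMP_PATTERNS = [
    r'\.tmp\d*_(ASC|PHL)',  # the pattern used in the script
    r'\.partial',           # hypothetical placeholder, for illustration only
]


def is_tmp_suffix(suffix, patterns=TMP_PATTERNS):
    """True if the whole suffix matches any of the patterns."""
    return any(fullmatch(p, suffix) for p in patterns)
```

Because fullmatch has to match the entire suffix, something like '.tmp123_ASC' matches but '.tmp123_XYZ' does not.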

I saw an updates/ folder on a couple of them, with an .sdr folder inside. I'm not sure what it is for, but thought it best to leave it alone.
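
The "leave it alone" rule in the script is a plain substring test on the parent path. A slightly stricter variant, sketched below, compares whole folder names relative to the scan folder. is_protected is a hypothetical name, and this is not what the script does: for example, a folder called my.sdrstuff is skipped by the script's substring test but not by this one.

```python
from pathlib import PurePath


def is_protected(path_, scan_folder):
    """True if path_ sits inside an .sdr folder or an updates folder.

    Only folders below scan_folder are considered, and names are compared
    as whole path components rather than as substrings.
    """
    parents = PurePath(path_).relative_to(scan_folder).parent.parts
    return 'updates' in parents or any(part.endswith('.sdr') for part in parents)
```

An .sdr folder itself is not protected by this check (only its contents are), so it can still be considered for orphan removal.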

I think only /documents needs to be a target for cleanup. I would leave it to the user to clean up screen captures in the root, for example.

And I haven't seen any 'turds' in the .documents folder so far (Scribe).

[Update] Refactored to use pathlib (preferred in Python 3).

Last edited by tomsem; 06-22-2023 at 10:17 PM.