My Calibre library stats are below. The majority of my books are stored redundantly in both AZW3 and EPUB formats. The ZIP files are audiobooks (each is a sequence of MP3s). The PDFs are varied: some are simple books, some are picture-heavy textbooks, and some are piano sheet music.
You can see that the vast majority of my storage is taken up by the audiobooks:
5085 EPUBs = 1.5 GB, compared to
167 audiobooks = 54 GB!!! The PDFs aren't too bad either: 77 PDFs = 1 GB.
Code:
$ du -sh .
66G .
$ find . -iname "*.azw3" -print | wc -l
5050
$ find . -iname "*.azw3" -exec du -csh '{}' + | tail -1
1.8G total
(range 53 KB to 71 MB each)
$ find . -iname "*.epub" -print | wc -l
5085
$ find . -iname "*.epub" -exec du -csh '{}' + | tail -1
1.5G total
(range 28 KB to 83 MB each)
$ find . -iname "*.pdf" -print | wc -l
77
$ find . -iname "*.pdf" -exec du -csh '{}' + | tail -1
1012M total
(range 130 KB to 80 MB each)
$ find . -iname "*.zip" -print | wc -l
167
$ find . -iname "*.zip" -exec du -csh '{}' + | tail -1
54G total
(range 35 MB to 1.4 GB each)
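One thing to watch with the commands above: `find ... -exec du -csh '{}' +` can split a very long file list into several `du` runs, each printing its own `total` line, and `tail -1` then shows only the last batch. I can't tell whether that happened for the 5000+ AZW3/EPUB files here. A minimal sketch that avoids the issue by summing sizes in `awk` follows. It assumes GNU find (for `-printf`), the function name `ext_total` is made up for illustration, and it reports apparent file size in bytes, which can differ a little from the on-disk usage that `du` shows.

```shell
# Count files with a given extension (case-insensitive) and sum their sizes.
# Assumes GNU find for -printf; ext_total is a hypothetical helper name.
# $1 = extension without the dot, $2 = directory to search (default: .)
ext_total() {
    find "${2:-.}" -type f -iname "*.$1" -printf '%s\n' |
        awk '{ n++; bytes += $1 }
             END { printf "%d files, %.0f bytes\n", n, bytes }'
}
```

Usage would be along the lines of `ext_total epub` or `ext_total zip /path/to/library`.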
Edit (added the cover artwork):
Code:
$ find . -iname "*.jpg" -print | wc -l
5242
$ find . -iname "*.jpg" -exec du -csh '{}' + | tail -1
122M total
Edit #2 (added Calibre's data files):
Code:
$ find . -iregex ".*\.\(db\|json\|opf\)$" -print | wc -l
5253
$ find . -iregex ".*\.\(db\|json\|opf\)$" -exec du -csh '{}' + | tail -1
38M total
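The itemised totals above add up to roughly 58G of the 66G that `du -sh .` reports, so other file types presumably account for part of the rest. A rough sketch for listing every extension in one pass is below. It assumes GNU find (for `-printf`), `by_ext` is a hypothetical name, sizes are apparent bytes rather than disk usage, and file names containing newlines or extensions containing spaces would confuse it.

```shell
# Print "extension count bytes" for every file type, largest total first.
# Assumes GNU find for -printf; by_ext is a hypothetical helper name.
# $1 = directory to scan (default: .)
by_ext() {
    find "${1:-.}" -type f -printf '%s %f\n' |
        awk '{
            size = $1
            # The extension is whatever follows the last dot in the line.
            n = split($0, parts, ".")
            ext = (n > 1) ? tolower(parts[n]) : "(none)"
            count[ext]++
            bytes[ext] += size
        }
        END {
            for (e in count) printf "%s %d %.0f\n", e, count[e], bytes[e]
        }' |
        sort -k3,3nr
}
```

Running `by_ext` from the library root should show at a glance which formats hold the space not covered by the per-format commands.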