Most ebook formats are HTML internally, so the viewer just extracts that HTML. Of course, there is more to the viewer than that. My point was that your white blocks issue is unlikely to be caused by the HTML extraction step. If you want to experiment further, you will need to run calibre from source, which is very easy:
https://manual.calibre-ebook.com/develop.html
The viewer code is mostly in src/pyj/read_book and src/calibre/srv/render_book
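
If you want to check the extracted HTML yourself before digging into the viewer, here is a rough sketch for the EPUB case. It relies only on the fact that an EPUB is a ZIP archive of HTML/XHTML files. This is not calibre's code: the function names and the sample book are made up for illustration, and other formats need different handling.

```python
import io
import zipfile

# File extensions treated as HTML content (an assumption for this sketch).
HTML_SUFFIXES = (".html", ".xhtml", ".htm")


def list_html_members(epub):
    """Return the sorted names of HTML/XHTML files inside an EPUB.

    `epub` may be a filesystem path or a binary file-like object.
    """
    with zipfile.ZipFile(epub) as zf:
        return sorted(
            name for name in zf.namelist()
            if name.lower().endswith(HTML_SUFFIXES)
        )


def read_html(epub, name):
    """Return the text of one member of the archive, decoded as UTF-8."""
    with zipfile.ZipFile(epub) as zf:
        return zf.read(name).decode("utf-8", errors="replace")


def make_sample_epub():
    """Build a tiny EPUB-like ZIP in memory so the sketch runs without a real book."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("mimetype", "application/epub+zip")
        zf.writestr("OEBPS/chapter1.xhtml", "<html><body><p>Hello</p></body></html>")
        zf.writestr("OEBPS/style.css", "p { margin: 0 }")
    buf.seek(0)
    return buf


if __name__ == "__main__":
    book = make_sample_epub()  # replace with the path to your own .epub file
    for member in list_html_members(book):
        print(member)
        print(read_html(book, member))
```

If the HTML you get out this way looks normal, that is consistent with the white blocks coming from a later stage than the extraction.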