Ah - I'd forgotten this discussion had already come up here:
https://www.mobileread.com/forums/sho...=preprocess.py
http://bugs.calibre-ebook.com/ticket/2359
I believe the 'Preprocess input file to possibly improve structure detection' option is a result of that feature/bugfix. As I recall, I started looking into how to implement the new function in other input plugins and shortly thereafter took a break from participating in Calibre/Mobileread. It looks like no one else has picked it up, but the logic is more or less ready to go. We just need to create the regexes and figure out how to add the preprocess_html method to the input format plugins.
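
For anyone who wants to pick this up, here is a rough sketch of the shape I have in mind. It is not taken from the Calibre source: the ExampleInput class, the apply_rules helper, the two regexes, and the single-argument preprocess_html signature are all assumptions made up for illustration. The real plugin base class and the real rules would have to come from the existing code.

```python
import re

# Hypothetical (pattern, replacement) rules, invented for this example.
# A real input plugin would need its own format-specific set.
RULES = [
    # Collapse runs of two or more empty paragraphs into a single one.
    (re.compile(r'(?:<p>\s*</p>\s*){2,}', re.IGNORECASE), '<p></p>\n'),
    # Promote paragraphs that look like chapter titles to headings.
    (re.compile(r'<p>\s*(Chapter\s+\d+)\s*</p>', re.IGNORECASE), r'<h2>\1</h2>'),
]


def apply_rules(html, rules=RULES):
    """Apply each regex rule to the markup in order and return the result."""
    for pattern, replacement in rules:
        html = pattern.sub(replacement, html)
    return html


class ExampleInput:
    """Stand-in for an input format plugin; NOT Calibre's real base class."""

    def preprocess_html(self, html):
        # Assumed signature: markup string in, cleaned-up markup string out.
        return apply_rules(html)


if __name__ == '__main__':
    print(ExampleInput().preprocess_html('<p>Chapter 1</p><p>Some text</p>'))
```

Keeping the rules in a plain list means each input format could supply its own regexes while sharing the same loop, which is roughly the part that still needs doing.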