I'm not really sure if this is the most appropriate place to post this. Anyway...
I got a book from Amazon, imported it into calibre, then converted it to EPUB, as you do. The first sign that something was amiss was that it took calibre over two and a half minutes to convert the book. Normal conversions only take around 10 seconds or so. Naturally I wanted to see what the problem was, so I opened the book in the Editor.
I let loose a string of expletives.
Then I let loose a few more for good measure.
I then went and opened the original azw3 in the ebook editor.
The editor reports that the size of the HTML file is 6.5 megabytes. This is not the Bible, mind you, just a run-of-the-mill-length novel.
So, what does the body of the text look like? Here is a very small sample:
Code:
<p class="MsoNormal" style="text-align:justify;text-indent:.25in"><span style="font-size:0.92rem">"<span style="letter-spacing:-.05pt">W</span>h<span style="letter-spacing:-.05pt">a</span>t<span style="letter-spacing:1.4pt"> </span><span style="letter-spacing:-.1pt">s</span><span style="letter-spacing:.05pt">a</span>y<span style="letter-spacing:1.25pt"> </span><span style="letter-spacing:-.3pt">y</span><span style="letter-spacing:.05pt">o</span>u<span style="letter-spacing:.05pt">?</span>"<span style="letter-spacing:1.5pt"> </span>
That's just a small portion of a paragraph.
This happens throughout the entire book.
Sorry about the rant, I just felt I had to get it out.
EDIT: This bit of regex solved the problem...
Code:
<span style="letter-spacing[^>]+>([^<]+)</span>
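For anyone who wants to try the same cleanup outside the Editor, here is a rough Python sketch of what that regex does. It assumes the replacement is just the captured text (`\1`), which the post above doesn't actually state, and the function name `strip_letter_spacing` is made up for illustration. It only handles spans that wrap plain text with no nested tags, same as the regex itself.

```python
import re

# Pattern from the post: a span whose style attribute starts with
# letter-spacing and whose content is plain text (no nested tags).
LETTER_SPACING_SPAN = re.compile(
    r'<span style="letter-spacing[^>]+>([^<]+)</span>'
)


def strip_letter_spacing(html: str) -> str:
    """Replace each matching span with just its text content.

    Assumption: the replacement is the first capture group (\\1).
    """
    return LETTER_SPACING_SPAN.sub(r"\1", html)


# A fragment of the sample paragraph quoted earlier in the post.
sample = (
    '<span style="font-size:0.92rem">"'
    '<span style="letter-spacing:-.05pt">W</span>h'
    '<span style="letter-spacing:-.05pt">a</span>t'
    '<span style="letter-spacing:1.4pt"> </span>'
)

cleaned = strip_letter_spacing(sample)
print(cleaned)
```

The outer font-size span is left alone because its style doesn't start with letter-spacing, so only the per-letter wrappers get unwrapped.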