Buzz off. That first page looks pretty nicely coded. (Horribly un-semantic, but no MS Word sludge).
The obvious guess is that it's due to the use of embedded fonts, which would be easy enough to fix without going back to square one.
If that's not it, you may have to resort to divide and conquer (a binary search, in effect). Remove all the text except for that first page: is it still a problem? Then find an ebook that works and reduce the _working_ ebook to a single (working) page of text. Look at the differences in the code between the two, then mix and match the code to narrow down exactly what is causing the problem.
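If it helps to see the divide-and-conquer idea spelled out, here is a rough sketch in Python. It assumes the markup can be cut into independent chunks and that exactly one chunk is at fault. Both `find_culprit` and `is_broken` are made-up names for illustration; in practice `is_broken` would be you building a test ebook from just those chunks and checking it on the reader by hand.

```python
from typing import Callable, Sequence


def find_culprit(chunks: Sequence[str],
                 is_broken: Callable[[Sequence[str]], bool]) -> int:
    """Return the index of the one chunk that triggers the problem.

    Assumes the full list is known to be broken and that exactly one
    chunk is responsible (hypothetical helper, not from any library).
    """
    lo, hi = 0, len(chunks)          # culprit lies somewhere in chunks[lo:hi]
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_broken(chunks[lo:mid]):
            hi = mid                 # problem is in the first half
        else:
            lo = mid                 # first half is clean, so it is in the second
    return lo


# Stand-in check for the demo: a chunk is "bad" if it uses @font-face.
def demo_check(part: Sequence[str]) -> bool:
    return any("@font-face" in c for c in part)


demo_chunks = ["<p>one</p>", "<p>two</p>",
               "<style>@font-face{}</style>", "<p>three</p>", "<p>four</p>"]
culprit_index = find_culprit(demo_chunks, demo_check)
```

Each round halves what is left to check, so even a long file only takes a handful of test builds. If more than one chunk is at fault, this will find only one of them, so fix it and run the search again.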