Thanks, rzikaou, for uploading your example. That turned out to be important, more so than my suggestion to use a book. I've never understood why there is a 14-byte difference between Amazon's offset into the rawml and where the text actually is. It turns out that for your azw3 the difference is 166 bytes. I have added a -o option for setting the offset to azw3r.c and azw3r.pl, both attached to the first post, and made a new GitHub release.
I don't know whether it is possible to tell in advance what the offset is. I had to get yours experimentally. I moved your azw3r and rawml files into the same directory so that the command would merely be way too long instead of impossibly long.
Code:
azw3r -h -o 166 -i "Test article_CTD7HH6AE5BVXFGTFOTOV54NOREZMUWNa1bd4a78ed253ba5271d0cb7df407fda.azw3r" -r "Test article_CTD7HH6AE5BVXFGTFOTOV54NOREZMUWN.rawml"
1259 1269 Highlight: 'a thousand '
1184 1220 Highlight: 'On a bright</span> Monday in January '
1462 1518 Highlight: 'They packed themselves into a cheerful courtyard outside '
I have shown the "-o 166" at the beginning of the command for clarity. During experimentation it would be better to put it at the end, where it is easier to change between runs.
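If you already know the text of a highlight and the start position recorded for it, the experimental search for the offset could probably be partly automated. What follows is only a minimal Python sketch and is not part of azw3r. The function names are hypothetical, and it assumes that positions are byte positions and that the offset is simply "where the phrase really is in the rawml" minus "the recorded start". Pick phrases that contain no markup, since the rawml can have tags in the middle of a highlight (see the </span> in the output above).

```python
from __future__ import annotations


def candidate_offsets(rawml: bytes, reported_start: int, phrase: bytes) -> list[int]:
    """Return every offset that would make reported_start land on phrase.

    Assumption: offset = actual byte position - reported start position.
    """
    offsets = []
    pos = rawml.find(phrase)
    while pos != -1:
        offsets.append(pos - reported_start)
        # Keep searching: a short phrase may occur more than once.
        pos = rawml.find(phrase, pos + 1)
    return offsets


def common_offsets(rawml: bytes, highlights: list[tuple[int, bytes]]) -> list[int]:
    """Keep only the offsets that agree for every (reported_start, phrase) pair."""
    common = None
    for reported_start, phrase in highlights:
        found = set(candidate_offsets(rawml, reported_start, phrase))
        common = found if common is None else common & found
    return sorted(common or [])


if __name__ == "__main__":
    # Made-up data, not taken from a real rawml file.
    rawml = b"H" * 20 + b"one two three two one"
    print(common_offsets(rawml, [(10, b"two"), (14, b"three")]))
```

With a real file you would read the rawml using open(path, "rb").read() and pass in two or three highlights. If exactly one offset survives, that is the value to try with -o.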