To give the code the honor it deserves, I followed all of your instructions and can report:
You are 100% right.
I have been busy with these books lately. They are SCANS from two sources, so the scan errors differ between them, and they are almost innumerable.
S&R is only a small help; a semi-manual procedure is almost always necessary (Find-Replace-Find-Find-Replace-Find, etc.). This is strenuous if one wants to work quickly and without flaws, and in the end proofreading is still necessary. I like working with texts, so this is a pleasure for me, although others would find it a pointless and exhausting affair. But my temperament does not allow me to do a job twice.
So I cannot start over. Fortunately, the SMALL CAPS are limited to definable, known places: selected headings. Now I filter these out and change them to SMALL CAPS.
One last drop of bitterness (which I discovered a few days ago): small caps do not work on my e-reader. So for now I write: ANHANG.
A few images and code samples from the procedure:
Original
Small Caps conversion
Merging
The first repetition has 104 occurrences; the 2nd, 3rd and 4th repetitions have 52 occurrences each (the reason is clear, so I will not explain it here). I tested only with "Anhang".
Code:
<p class="calibre1">A<span class="small-cap">n</span><span class="small-cap">h</span><span class="small-cap">a</span><span class="small-cap">n</span><span class="small-cap">g</span> </p>
<p class="calibre1">A<span class="small-cap">nh</span><span class="small-cap">an</span><span class="small-cap">g</span> </p>
<p class="calibre1">A<span class="small-cap">nhan</span><span class="small-cap">g</span> </p>
<p class="calibre1">A<span class="small-cap">nhang</span> </p>
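For anyone who would rather script the merging step than repeat Find-Replace by hand, here is a minimal Python sketch of the same idea. It is not the exact expression I used in the editor: the function names `merge_pass` and `merge_all` are made up for illustration, and it assumes the spans carry exactly the attribute `class="small-cap"` and contain plain text with no nested tags.

```python
import re

# Two directly adjacent small-cap spans containing only text (no nested tags).
# Assumption: the attribute is written exactly as class="small-cap".
ADJACENT = re.compile(
    r'<span class="small-cap">([^<]*)</span>'
    r'<span class="small-cap">([^<]*)</span>'
)


def merge_pass(html):
    """Merge each non-overlapping pair of adjacent spans once.

    Returns the new text and the number of replacements made.
    """
    return ADJACENT.subn(r'<span class="small-cap">\1\2</span>', html)


def merge_all(html):
    """Repeat merge_pass until nothing is left to merge.

    Returns the final text and the replacement count of each pass.
    """
    counts = []
    while True:
        html, n = merge_pass(html)
        if n == 0:
            return html, counts
        counts.append(n)


if __name__ == "__main__":
    line = (
        '<p class="calibre1">A'
        '<span class="small-cap">n</span><span class="small-cap">h</span>'
        '<span class="small-cap">a</span><span class="small-cap">n</span>'
        '<span class="small-cap">g</span> </p>'
    )
    merged, counts = merge_all(line)
    print(merged)
    print(counts)
```

Because each pass only merges non-overlapping pairs, several passes are needed, just like the repeated Replace All in the editor. On the first "Anhang" line above, the intermediate results are the same as the lines shown.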
I am looking forward to next time. There are always enough problems.