The ? means the expression is lazy, and will match as few times as possible.
For example, "<!-- comment --> story <!-- comment -->" will leave " story ".
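To make the lazy vs. greedy difference concrete, here's a minimal sketch. The pattern `<!--.*?-->` is my assumption of what the expression looks like; the actual one in the adapter may differ.

```python
import re

# Assumed patterns for illustration; not necessarily what the adapter uses.
LAZY_RE = re.compile(r"<!--.*?-->", re.DOTALL)    # ? makes .* match as little as possible
GREEDY_RE = re.compile(r"<!--.*-->", re.DOTALL)   # without ?, .* runs to the last -->

text = "<!-- comment --> story <!-- comment -->"

lazy_result = LAZY_RE.sub("", text)
greedy_result = GREEDY_RE.sub("", text)

print(repr(lazy_result))    # ' story '
print(repr(greedy_result))  # '' (the story text is swallowed too)
```

Without the ?, the match would run from the first <!-- to the last -->, which would eat the whole story between STORY START and STORY END.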
Any hpfanficarchive story will do, they all have the comments.
Here's the first story in latest:
http://hpfanficarchive.com/stories/v...=834&chapter=1
The body found by the adapter starts with <!-- STORY START --> and ends with <!-- STORY END -->.
The heuristics code really only has one change: an if was changed to a while loop, to remove multiple layers of divs. AO3 uses 2.
That really should be changed to use Beautiful Soup for safety, as the loop might be a problem for a structure like this:
<div>
<div>something</div>
<div>Something else</div>
</div>
I didn't realize that until just now, so it might be a good idea to roll back that change for now.
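Here's a rough sketch of how that structure could go wrong. The regex and the function `strip_outer_divs` are hypothetical stand-ins I made up for illustration, not the actual heuristics code, but they show the general failure mode of unwrapping divs in a loop without a real parser.

```python
import re

# Hypothetical stand-in for the heuristic: one <div> wrapped around everything.
OUTER_DIV_RE = re.compile(r"^\s*<div[^>]*>(.*)</div>\s*$", re.DOTALL)

def strip_outer_divs(html):
    """Repeatedly peel off what looks like a single wrapping <div>."""
    match = OUTER_DIV_RE.match(html)
    while match:
        html = match.group(1)
        match = OUTER_DIV_RE.match(html)
    return html

# Two real layers of wrapping: the loop does the right thing.
nested = "<div><div>story</div></div>"
print(strip_outer_divs(nested))  # story

# One wrapper around two sibling divs: the second pass pairs the first
# opening tag with the last closing tag and leaves unbalanced markup.
siblings = "<div>\n<div>something</div>\n<div>Something else</div>\n</div>"
print(repr(strip_outer_divs(siblings)))
# 'something</div>\n<div>Something else'
```

A parser-based check (e.g. only unwrapping when the div has exactly one child element) would avoid pairing up tags that don't belong together.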
Edit: I just reverted that change, so there is nothing to test in the heuristics for now. I have to figure out how to solve this.