New donga just in.
This time it fetches the actual articles during parse index and takes the section titles from the article content itself. It then caches the content to a temp file, so that nothing is fetched twice.
It no longer uses the print versions, as these didn't contain the article sections. Hopefully I've cleaned up the full versions enough, though.
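For anyone curious how the fetch-once caching could work, here is a rough sketch. It is not the recipe's actual code: `CACHE_DIR`, `cache_path`, `get_article` and the `fetch` callable are all made-up names for illustration, and the real recipe may store things differently.

```python
import hashlib
import os
import tempfile

# Hypothetical cache location; the real recipe may use a different temp file layout.
CACHE_DIR = tempfile.mkdtemp(prefix="article_cache_")


def cache_path(url):
    """Map a URL to a stable file name inside the temp cache directory."""
    digest = hashlib.sha1(url.encode("utf-8")).hexdigest()
    return os.path.join(CACHE_DIR, digest + ".html")


def get_article(url, fetch):
    """Return the article HTML, calling fetch(url) only on a cache miss.

    `fetch` is any callable that takes a URL and returns the page as a string
    (assumed here; the recipe would use its own downloader).
    """
    path = cache_path(url)
    if os.path.exists(path):
        # Already downloaded while building the index: reuse the saved copy.
        with open(path, encoding="utf-8") as f:
            return f.read()
    html = fetch(url)
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)
    return html
```

The idea is that the index pass calls `get_article` to read the section title, and the later download pass calls it again with the same URL and gets the saved copy instead of hitting the site a second time.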