Quote:
Originally Posted by cornfieldcraig
Thanks much for the quick response. Works like a charm. For kicks, I used this bit of code instead and it seemed to yield virtually identical results
In fact, the results are not really the same. Your version appends the full version of the article to its first page, so the beginning of the article appears twice in the ebook.
An example from today's issue is the article here.
If you want to prevent an article from being broken into several chapters, you will have to implement the get_article_url method. You will have to read the page into a Soup, check whether it has a "single page" link (e.g. with your regex), and return the link to the complete page.
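
A rough sketch of that idea, in case it helps. The link-finding part is written as a plain helper so it can be tried outside calibre. The helper name find_single_page_url, the regex, and the assumption that the link text literally reads "single page" are all made up for illustration; the real site's markup may well differ, so swap in your own regex.

```python
import re
from urllib.parse import urljoin

# Hypothetical pattern: an <a> tag whose visible text contains "single page".
# Adjust this to whatever the site actually uses.
SINGLE_PAGE_LINK = re.compile(
    r'<a\b[^>]*\bhref=["\']([^"\']+)["\'][^>]*>\s*[^<]*single\s+page[^<]*</a>',
    re.IGNORECASE,
)


def find_single_page_url(html, page_url):
    """Return the absolute URL of the "single page" link.

    Falls back to page_url when the page has no such link.
    """
    match = SINGLE_PAGE_LINK.search(html)
    if match is None:
        return page_url
    # The href may be relative, so resolve it against the page's own URL.
    return urljoin(page_url, match.group(1))


# Inside the recipe class this could be wired up roughly like so
# (untested sketch, shown as a comment because it needs calibre to run):
#
#     def get_article_url(self, article):
#         url = BasicNewsRecipe.get_article_url(self, article)
#         soup = self.index_to_soup(url)
#         return find_single_page_url(str(soup), url)
```

With this approach every article costs one extra page fetch during the download, since the first page has to be read just to discover the link to the complete version.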