New Test Version Posted
2020-09-24
- Add get_urls_from_page() and get_series_from_page() to adapters, add support.
This version contains some re-architecting that does two things:
- It moves the 'Get Story URLs from Web Page' code down inside the adapters, so it's easier to add things like site login without kludging up the geturls.py code (affects ao3, tth, fimf and scarvesandcoffeenet), and;
- It allows adapters to have a separate get_series_from_page() method that is called during get_urls_from_page() to try and recognize an official 'series' page, more intelligently parse the story URLs for it and give the official series name and description. AO3 and TtH only so far.
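To make the split concrete, here is a rough sketch of how the two methods could fit together. It is not the actual FFF code: the class names, the regexes, the HTML shape and the returned dict keys (urllist, name, desc) are all assumptions made up for illustration.

```python
import re
from urllib.parse import urljoin


class SketchAdapter:
    """Hypothetical stand-in for a site adapter; not FFF's real base class."""

    story_url_re = re.compile(r"/works/\d+$")  # assumed story URL shape

    def get_series_from_page(self, url, html):
        """Site-specific hook. Returns None unless the page is a series page."""
        return None

    def get_urls_from_page(self, url, html):
        # Let the site-specific series parser look at the page first.
        series = self.get_series_from_page(url, html)
        if series is not None:
            return series
        # Generic fallback: collect every link that looks like a story URL.
        return {"urllist": self._story_links(url, html)}

    def _story_links(self, base_url, html):
        urls = []
        for href in re.findall(r'href="([^"]+)"', html):
            full = urljoin(base_url, href)
            if self.story_url_re.search(full) and full not in urls:
                urls.append(full)
        return urls


class SketchSeriesAdapter(SketchAdapter):
    """Hypothetical adapter that recognizes an official series page."""

    def get_series_from_page(self, url, html):
        if "/series/" not in url:
            return None
        name = re.search(r"<h2>(.*?)</h2>", html, re.S)
        desc = re.search(r'<div class="desc">(.*?)</div>', html, re.S)
        stories = re.search(r'<ul class="stories">(.*?)</ul>', html, re.S)
        if stories is None:
            return None
        return {
            # Only links inside the story list count, not links in the desc.
            "urllist": self._story_links(url, stories.group(1)),
            "name": name.group(1).strip() if name else None,
            "desc": desc.group(1).strip() if desc else None,
        }
```

In this sketch the generic fallback would pick up a story link that happens to appear in the series description, while the series-aware version only takes links from the story list and also returns the name and description.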
When FFF presents you with the story URLs for a series anthology, it now also shows you the series name and description above the list. (They're not labeled or anything yet; this is a first-cut version.)
If the series has a name, that will be the name of the anthology (after anthology_title_pattern (default: ${title} Anthology) is applied). You asked for it, that's what you get.
If the series has a description, that will be placed in the book comments and then followed by the 'Anthology containing:' text as before. 'New Only' is still applied.
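As a rough illustration of the naming and comments behavior, the sketch below applies a title pattern and builds the comments text. It assumes the pattern is filled in with Python's string.Template and that the 'Anthology containing:' text is followed by one story title per line; the function names are made up here and the real FFF code may format things differently.

```python
from string import Template


def anthology_title(series_name, pattern="${title} Anthology"):
    # Assumption: the pattern uses ${title} as a string.Template placeholder.
    return Template(pattern).safe_substitute(title=series_name)


def anthology_comments(series_desc, story_titles):
    # Series description first (if any), then the usual listing text.
    parts = []
    if series_desc:
        parts.append(series_desc)
    parts.append("Anthology containing:")
    parts.extend(story_titles)
    return "\n".join(parts)


title = anthology_title("Example Series")
comments = anthology_comments("A short series description.", ["Story One", "Story Two"])
```

With the default pattern, a series named Example Series would come out titled Example Series Anthology.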
Series name and description in this case cannot be edited by replace_metadata, include/exclude, etc., because they are not story metadata.
Right now I'm mostly interested in your comments on the user-interface aspects of this, and whether there's anything obvious I've overlooked. Adding series-specific code for more sites will come afterwards. However, please provide concrete examples if there are other sites you think break the series paradigm I've set.
Example where not all stories have same (first) series:
https://archiveofourown.org/series/1779154
Examples with extra story URLs in descs:
https://archiveofourown.org/series/620242
https://archiveofourown.org/series/1044225
Example with a lengthy series desc now included:
https://www.tthfanfic.org/Series-2696