Testing on this version has been finished.
I'm again looking for some people to beta test the attached newest version before I put it out for everyone.
This version contains:
- New feature - Allow user to set custom CSS in personal.ini for HTML and EPUB output.
- New feature - Allow user to set custom regular expressions in personal.ini to modify metadata.
- New feature - Use Accept-Encoding=gzip to speed up downloads. (Not all sites will use it--it's common for sites to block gzip based on User-Agent.)
- Add progress bars while collecting URLs from stories for list and for updates.
Here's an example of what the default output_css parameter for epub looks like:
Here's an example of what some replace_metadata lines might look like.
## Use regular expressions to find and replace (or remove) metadata.
## For example, you could change Sci-Fi=>SF, remove *-Centered tags,
## etc. See http://docs.python.org/library/re.html (look for re.sub)
## for regexp details.
## Make sure to keep at least one space at the start of each line and
## to escape % to %%, if used.
Magical Girl Lyrical Nanoha=>Nanoha
Puella Magi Madoka Magica.*=>Madoka
- "(.*)-Centered=>" removes all Name-Centered tags entirely (TTH).
- "Puella Magi Madoka Magica.*=>Madoka" changes ffnet's "Puella Magi Madoka Magica/魔法少女まどか★マギカ" to just Madoka.
- "(Friend)(ship)=> \2\1" changes Friendship to shipFriend. Not useful, but demonstrates regular expression group substitution.
- "(.*)Great(.*)=>\1Moderate\2" shows changing one word in the middle of a tag, such as a series name.
- "(?s)(.*)suck(.*)=>\1kcus\2" reverses one word in the middle of a longer text. The (?s) tells the regexp engine to allow . to match newlines, too. Needed to affect many descriptions/summaries.
The search pattern (the part before =>) must match the entire string for a piece of metadata to be changed. "Friend" will not match category "Friendship", but "F.*" will. Replacements are applied to all metadata, so be careful to be specific.
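If you want to try patterns out before putting them in personal.ini, here's a rough Python sketch of the whole-string matching described above. This is not the downloader's actual code; the function name apply_replacements and the rules list are made up for illustration, and it simply assumes each rule is tried in order against one metadata value.

```python
import re

# Hypothetical helper for experimenting with patterns; not the plugin's own code.
def apply_replacements(value, rules):
    """Apply (pattern, replacement) rules in order to one metadata value.

    A rule only fires when its pattern matches the ENTIRE value.
    """
    for pattern, replacement in rules:
        match = re.fullmatch(pattern, value)
        if match:
            # expand() fills in group references such as \1 and \2.
            value = match.expand(replacement)
    return value

# The example patterns from this post, written as (pattern, replacement) pairs.
rules = [
    (r"(.*)-Centered", ""),
    (r"Puella Magi Madoka Magica.*", "Madoka"),
    (r"(Friend)(ship)", r"\2\1"),
    (r"(.*)Great(.*)", r"\1Moderate\2"),
    (r"(?s)(.*)suck(.*)", r"\1kcus\2"),
]

for sample in ["Friendship", "Buffy-Centered", "The Great Adventure"]:
    print(repr(sample), "->", repr(apply_replacements(sample, rules)))
```

A pattern of just "Friend" leaves "Friendship" untouched here, because it does not cover the whole string, while "F.*" does. Likewise, dropping (?s) from the last rule stops it from matching a summary that contains a line break.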
You can have different lists of replacements for different sites. One use could be to normalize the rating tags different sites use to all be the same.