I've been testing the various regular expressions, and so far all of them fail in some respect, most notably with hyphenated names or with dashes in the series or title. Extra spaces around the dashes also munge the whole thing.
Gwynevan's regex has been slightly more robust than the other samples. My attempts to expand on it have not been successful, but I'm not a programmer and not particularly good at regex. I'm trying to account for hyphenated names first, since importing the author accurately matters most. I have avoided \s so far to prevent later problems getting the formula to work with leetspeak, but as that is a lesser concern I can forgo it.
darkmonk & ilovejedd,
My original post defined that sample book clearly, but I deleted it as unnecessary. I stand corrected, so here it is again:
- author = John D. Smith - Jones
- series = Bibliographic Perfection
- series index = # (any numeric value from 1-999; keep in mind the comic/manga readers)
- title = The Perfect Book - A Bedtime Story
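To make the sample concrete, here is a rough Python sketch, not a finished formula. It assumes the filename is laid out as "author - series - index - title"; that layout, the name parse_filename, and the index value 12 are my own assumptions for illustration, not anything settled in this thread. The idea is to anchor on the numeric series index, let the author grab greedily (so "Smith - Jones" stays together), and let the title take whatever is left, dashes included.

```python
import re

# Assumed layout (hypothetical): "Author - Series - Index - Title".
# \s+ on both sides of each separator tolerates extra spaces around dashes.
PATTERN = re.compile(
    r"^(?P<author>.+)\s+-\s+"        # greedy: keeps "Smith - Jones" together
    r"(?P<series>.+?)\s+-\s+"        # lazy: stops at the first dash before the index
    r"(?P<index>\d{1,3})\s+-\s+"     # series index, 1 to 3 digits
    r"(?P<title>.+)$"                # everything left, dashes included
)


def tidy(text):
    """Collapse runs of whitespace and trim the ends."""
    return " ".join(text.split())


def parse_filename(name):
    """Return a dict of the four fields, or None if the layout does not match."""
    match = PATTERN.match(name)
    if match is None:
        return None
    return {
        "author": tidy(match.group("author")),
        "series": tidy(match.group("series")),
        "index": int(match.group("index")),
        "title": tidy(match.group("title")),
    }


sample = ("John D. Smith - Jones - Bibliographic Perfection - 12 - "
          "The Perfect Book - A Bedtime Story")
print(parse_filename(sample))
```

The trade-off is that a greedy author will also swallow the front half of any series name that itself contains a spaced dash, so this favours getting the author right over getting the series right.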
I think the idea of a series and a subordinate series is great. If I remember correctly, Alan Dean Foster wrote in a universe and a series (Humanx and Flinx, etc.), as do some of the manga authors (called "circles": authors write around other members' stories, so A uses B's characters, and so on). A Universe/Setting/(Subordinate) Series structure is feasible.
This raises the question of input versus output. I'm trying to input what I have, whereas you're establishing a valid output pattern. Ultimately this is not a problem, but it occurs to me that working on the same thing might be more effective!
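For what it's worth, the nesting could be held in something as small as the sketch below. The class name SeriesPath, its field names, the "/" separator, and the index 3 are all hypothetical choices of mine, only meant to show that the extra levels can be optional.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SeriesPath:
    """Hypothetical holder for Universe/Setting/(Subordinate) Series."""
    series: str
    universe: Optional[str] = None
    setting: Optional[str] = None
    index: Optional[int] = None

    def label(self, sep="/"):
        # Skip any level that was not supplied, outermost level first.
        levels = [p for p in (self.universe, self.setting, self.series) if p]
        text = sep.join(levels)
        if self.index is not None:
            text = f"{text} #{self.index}"
        return text


print(SeriesPath(series="Flinx", universe="Humanx", index=3).label())
print(SeriesPath(series="Bibliographic Perfection").label())
```

A book with no universe or setting simply falls back to the plain series name, so existing single-level series would not need changing.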