Old 05-25-2009, 05:01 PM   #7
Sabardeyn
Guru
Posts: 644
Karma: 1242364
Join Date: May 2009
Location: The Right Coast
Device: PC (Calibre), Nexus 7 2013 (Moon+ Pro), HTC HD2/Leo (Freda)
Ok, I'm taking back my "Eureka!!" now...

Ouch! Looking at your regex and testing it against the formula from the other topic, I've discovered neither one is perfect.

So far, all of the formulas depend on " - " being used only as a field delimiter. It cannot appear in hyphenated authors' names, nor as part of the series (unlikely) or the title (where it is likely to occur). When an extra " - " occurs, the automatic import fails because the parts of the filename are separated incorrectly.

So, for instance, this example fails in all formulas to import correctly:
Code:
John D. Smith - Jones - Bibliographic Perfection 1 - The Perfect Book - A Bedtime Story.pdf
This can obviously be corrected manually, either before import (change to "Smith-Jones", etc.) or after (edit the book's record). Let me be the first to admit that I would rather have things entered accurately the first time, automagically. Editing is a hassle and easily forgotten. Luckily for me, I'm already using "Smith-Jones" (no space) for hyphenated names. However, there is no good way around the potential problem in the title.
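
To make the failure concrete, here is a rough sketch in Python of what any formula that treats " - " purely as a delimiter ends up doing with the filename above. The function name naive_split is made up for illustration, and this is not one of the actual formulas from either topic, just the assumption they all share.

```python
def naive_split(filename):
    """Treat every ' - ' as a field delimiter (the shared assumption)."""
    stem = filename.rsplit(".", 1)[0]   # drop the extension
    return stem.split(" - ")


name = ("John D. Smith - Jones - Bibliographic Perfection 1 - "
        "The Perfect Book - A Bedtime Story.pdf")
parts = naive_split(name)

# Five fields come back where only three (author, series, title) were meant.
# Reading them positionally puts "Jones" in the series slot.
author, series, title = parts[0], parts[1], " - ".join(parts[2:])
print(len(parts), "|", author, "|", series, "|", title)
```

The second field comes out as "Jones", so the author is cut short and everything after it shifts by one position.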

For my purposes I would prefer a regex that resolves all of the following correctly:
  • John D. Smith - The Perfect Book.pdf
  • John D. Smith - Bibliographic Perfection - The Perfect Book.pdf
  • John D. Smith - Bibliographic Perfection 1 - The Perfect Book.pdf
  • John D. Smith - Bibliographic Perfection 189 - The Perfect Book.pdf
  • John D. Smith - Bibliographic Perfection 1 - The Perfect Book - A Bedtime Story.pdf
  • John D. Smith - Jones - Bibliographic Perfection 1 - The Perfect Book - A Bedtime Story.pdf
  • John D. Smith-Jones & Somebody Else - Bibliographic Perfection 1 - The Perfect Book - A Bedtime Story.pdf
It should also handle author names that are 133t5p34|< (leetspeak), numbers and/or symbols. Why? Because we're already headed down that path and I might as well get a jump on things. Unicode import and export would be good too - more books are being sold internationally and this trend will only grow. (I don't want much, do I?)
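
For what it's worth, here is one possible sketch of a two-pass approach in Python that handles the seven filenames listed above. It is not a tested Calibre import pattern; the names NUMBERED, FALLBACK and parse_filename are my own, and it leans on one heuristic assumption: a run of digits directly before a " - " marks the end of the series. A title that itself contains something like "22 - " would fool it, and a filename with no series number but an extra " - " in the title remains ambiguous.

```python
import re

# Pass 1 (assumption): a numbered series anchors the split. The series part
# may not contain " - ", so any extra " - " before it is pushed into the
# author and any extra " - " after it is left in the title.
NUMBERED = re.compile(
    r"^(?P<author>.+?) - "
    r"(?P<series>(?:(?! - ).)+?) (?P<index>\d+) - "
    r"(?P<title>.+)\.(?P<ext>[^.]+)$"
)

# Pass 2: no series number found, so fall back to
# "author - title" or "author - series - title".
FALLBACK = re.compile(
    r"^(?P<author>.+?) - "
    r"(?:(?P<series>.+?) - )?"
    r"(?P<title>.+)\.(?P<ext>[^.]+)$"
)


def parse_filename(name):
    """Return a dict of author/series/index/title/ext, or None if no match."""
    match = NUMBERED.match(name) or FALLBACK.match(name)
    if match is None:
        return None
    fields = match.groupdict()
    fields.setdefault("index", None)  # FALLBACK has no index group
    return fields


example = ("John D. Smith - Jones - Bibliographic Perfection 1 - "
           "The Perfect Book - A Bedtime Story.pdf")
print(parse_filename(example))
```

Because the author and title groups use "." rather than "\w", they accept digits, symbols and non-ASCII characters in a Python 3 string, so leetspeak and Unicode names do not need special handling in this sketch.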

Last edited by Sabardeyn; 05-25-2009 at 05:10 PM. Reason: Added to the resolution list - large series number example.