Old 07-26-2011, 09:03 PM   #6
unboggling
Wizard
Mixx & chaley, thank you! Both methods, (.*) and (.+), work great in most cases, which saves me a lot of work. So far, the cases where neither method works are names such as "Melissa de la Cruz". Each method yields "Cruz, Melissa de la", but that name needs to be split after the FN (at the first space from the left) to yield "de la Cruz, Melissa". Going the other way in that case, from LN, FN to FN LN, is fine.
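
To illustrate the two split points, here is a minimal Python sketch. The patterns and function names are assumptions made up for illustration, not necessarily the exact expressions suggested earlier in the thread. A greedy first group splits at the last space; a lazy first group splits at the first space.

```python
import re

def swap_at_last_space(name):
    # Greedy (.*) grabs as much as it can, so the split lands on the LAST space.
    return re.sub(r"^(.*) (.+)$", r"\2, \1", name)

def swap_at_first_space(name):
    # Lazy (.*?) grabs as little as it can, so the split lands on the FIRST space.
    return re.sub(r"^(.*?) (.+)$", r"\2, \1", name)

print(swap_at_last_space("Melissa de la Cruz"))   # Cruz, Melissa de la
print(swap_at_first_space("Melissa de la Cruz"))  # de la Cruz, Melissa
```

Splitting at the first space fixes this case but would break a name with middle initials, such as "John D L Smith", which becomes "D L Smith, John".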

Giving it some thought, it seems complex to distinguish correctly between middle initials like "D L" and last-name prefixes like "Da" or "de La". Counting spaces in the FN LN string to find the leftmost space would only work in the special cases with no middle initials, not in cases that have them. And I haven't been using periods after initials either, which further complicates matters for me. If I added periods back to the initials, a regex might be able to distinguish "de la" from "D. L.". Another way to distinguish them would be to count the number of chars between spaces: a token is an initial if it has 2 chars when periods are present, or 1 char when they are not.
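
A rough Python sketch of that idea, under stated assumptions: the PARTICLES set is a hypothetical and incomplete list of surname prefixes, and the function names are invented for illustration. A token counts as an initial if it is a single letter with an optional period; everything else to the left of the surname that matches a known prefix is pulled into the last name.

```python
import re

# Hypothetical, incomplete list of last-name prefixes (an assumption).
PARTICLES = {"da", "de", "del", "di", "la", "van", "von"}

def is_initial(token):
    # One letter, optionally followed by a period: "D" or "D."
    return re.fullmatch(r"[A-Za-z]\.?", token) is not None

def to_last_first(name):
    tokens = name.split()
    if len(tokens) < 2:
        return name
    # Start with the final token as the surname, then extend it leftward
    # over known prefixes, never consuming initials or the first token.
    i = len(tokens) - 1
    while i > 1 and tokens[i - 1].lower() in PARTICLES and not is_initial(tokens[i - 1]):
        i -= 1
    return " ".join(tokens[i:]) + ", " + " ".join(tokens[:i])

print(to_last_first("Melissa de la Cruz"))  # de la Cruz, Melissa
print(to_last_first("John D L Smith"))      # Smith, John D L
```

This still guesses wrong for any prefix missing from the list, so it is a starting point rather than a complete rule.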

Anyway, thanks again. BTW, I'm curious: what's the specific difference between the * and the +?

Last edited by unboggling; 07-26-2011 at 09:21 PM.