| 
			
			 | 
		#1 | 
| 
			
			
			
			 Groupie 
			
			Posts: 160 
				Karma: 416 
				Join Date: Apr 2010 
				
				
				
				Device: Astak EZ Reader Pro AND Sony PRS-505 
				
				
				 | 
	
	
	
		
		
			
			 
			
			Probably the five millionth since this forum was created, I suppose.  :-) 
		
	
		
		
		
		
		
		
		
		
		
		
		
			Many of my files are kept in the following naming format:

			L. Frank Baum - [Wizard of Oz 02] - The Marvelous Land of Oz.lit

			Please note that the square brackets are virtually always used in what I've got stored, and the <space><single dash><space> between the author and the series/series number, and between that and the book title, is also pretty consistent.

			Assuming the vast majority of my books follow this format, does anyone have a good expression to add them with? Ideally the expression would recognize the square brackets as a tip-off that a book series and book number are being disclosed. Is such a thing even possible?

			I ask because if a book ISN'T part of a series, the existing file name is probably something more like this:

			H. G. Wells - The Time Machine.epub

			OPTIONALLY, it's at least possible (although I bet this is even harder to resolve) that some files may look like this:

			Jules Verne - Journey to the Center of the Earth (html).zip

			Pie in the sky: if those ROUND brackets could be a tip-off to ignore something as NOT being part of a book title, that would be ideal. Yeah, even ignorant of how to build these expressions properly, I'm skeptical.

			Does anyone out there have stuff following approximately these "rules", and what have you done to best ensure proper Calibre "importing"? ANY subset of the requirements I list above (dealing with series names in square brackets, ignoring stuff in round brackets, etc.) would be better than nothing, but I don't expect much.

			Please note that I have zero ability at scripting, so I'm really just asking what the best canned solution is. If it's "you're out of luck", I guess I'll figure something else out. If there are proper expressions to handle this already, great. If there are third-party tools outside of Calibre to accurately mass-rename files FIRST in an acceptable way, I suppose that's something I'd be willing to try as well (although using a second tool first seems redundant if Calibre can be made to do it).

			Thanks infinitely in advance for any possible suggestions!

			Last edited by Spiffy; 04-05-2010 at 03:45 PM.  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#2 | 
| 
			
			
			
			 Wizard 
			
			Posts: 1,798 
				Karma: 30548723 
				Join Date: Dec 2006 
				Location: Singapore 
				
				
				Device: Boyue 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			I had the same problem, so I started using BookSorter (http://iterati.org/ebookTools/BookSorter/Default.aspx) to rename my files to "Author - Series # - Title.lit", and then used this expression when adding the books:

			(?P<author>[^_-]+) -?\s*(?P<series>[^_0-9-]*)(?P<series_index>[0-9]*)\s*-\s*(?P<title>[^_].+) ?

			I did see the regex you are looking for somewhere on the forum, but I couldn't find it again.  | 
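For readers who want to try the expression from this post outside Calibre, here is a minimal Python sketch. It assumes the pattern is searched against the filename with its extension already removed, and the `parse` helper is a hypothetical name, not part of Calibre. It suggests why the rename step matters: the renamed form parses into all four fields, while the bracketed original ends up with the series inside the title.

```python
import re

# Pattern quoted from the post above, unchanged.
PATTERN = re.compile(
    r"(?P<author>[^_-]+) -?\s*(?P<series>[^_0-9-]*)"
    r"(?P<series_index>[0-9]*)\s*-\s*(?P<title>[^_].+) ?"
)

def parse(stem):
    """Return the stripped named groups for a filename without its extension."""
    match = PATTERN.search(stem)
    if match is None:
        return None
    return {key: value.strip() for key, value in match.groupdict().items()}

# Renamed to "Author - Series # - Title": every field is filled in.
renamed = parse("L. Frank Baum - Wizard of Oz 02 - The Marvelous Land of Oz")

# Original bracketed name: series stays empty and the title swallows it.
bracketed = parse("L. Frank Baum - [Wizard of Oz 02] - The Marvelous Land of Oz")

print(renamed)
print(bracketed)
```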
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#3 | |
| 
			
			
			
			 Groupie 
			
			Posts: 160 
				Karma: 416 
				Join Date: Apr 2010 
				
				
				
				Device: Astak EZ Reader Pro AND Sony PRS-505 
				
				
				 | 
	
	
	
		
		
		
		
		Another way to do this DID occur to me this morning: doing it in discrete steps. First, import the books without a series on their own, with a fairly standard regex. Then go back, CHANGE the regex to expect a series, and import THOSE books.

		But I guess I would still have to deal with the square brackets. I either need a way to mass-remove them or a way to mass-ignore them during an import. The second problem, the format occasionally appearing at the end in round brackets, I suppose I just have to live with (and manually erase after the fact).  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#4 | 
| 
			
			
			
			 Wizard 
			
			Posts: 4,004 
				Karma: 177841 
				Join Date: Dec 2009 
				
				
				
				Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			I started to respond to this twice, but I'm not at home and can't test anything I post.  It's pretty easy to make the brackets optional if everything else is right. 
		
	
		
		
		
		
		
		
		
		
		
		
		
			This is an optional open bracket: \[? and this is an optional closed bracket: \]? Try this (totally untested):

			Code:
	wrong code posted (untested)

			Code:
	^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?(\[?(?P<series>[^0-9\-]+) (- )?(?P<series_index>[0-9.]+)\]?\s*-\s*)?(?P<title>.+)

			Last edited by Starson17; 04-06-2010 at 07:24 PM.  | 
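The optional-bracket idea can be seen in isolation with a much smaller pattern. This is only an illustrative sketch: `SERIES_RE` is a hypothetical name, and the pattern covers just the series part of a filename, not the full expression above.

```python
import re

# \[? and \]? each match zero or one bracket, so one pattern accepts
# the series with or without square brackets around it.
SERIES_RE = re.compile(r"^\[?(?P<series>[^0-9\-]+) (?P<series_index>[0-9.]+)\]?$")

for text in ("[Wizard of Oz 02]", "Wizard of Oz 02"):
    match = SERIES_RE.match(text)
    print(text, "->", match.group("series"), "|", match.group("series_index"))
```

Both inputs should yield the same series name and index, which is what lets the full expression treat the brackets as decoration.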
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#5 | 
| 
			
			
			
			 Groupie 
			
			Posts: 160 
				Karma: 416 
				Join Date: Apr 2010 
				
				
				
				Device: Astak EZ Reader Pro AND Sony PRS-505 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Hmm.   Good to know that.  
		
	
		
		
		
		
		
		
		
		
		
		
	
	The expression doesn't seem to work, unfortunately, but I appreciate the try. When you use that string and run the test tool inside Calibre against this book:

	L. Frank Baum - [Wizard of Oz 02] - The Marvelous Land of Oz.lit

	the following shows in the test results:

	Title: L. Frank Baum - [Wizard of Oz 02] - The Marvelous Land of Oz
	Authors: nothing
	Series: nothing
	Series index: nothing  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#6 | 
| 
			
			
			
			 Wizard 
			
			Posts: 4,004 
				Karma: 177841 
				Join Date: Dec 2009 
				
				
				
				Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T 
				
				
				 | 
	
	|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#7 | 
| 
			
			
			
			 Groupie 
			
			Posts: 160 
				Karma: 416 
				Join Date: Apr 2010 
				
				
				
				Device: Astak EZ Reader Pro AND Sony PRS-505 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Genius work.  Thank you--that's quite nifty.  It even recognizes that if there are no brackets (or is it counting dashes?), there's no series, and realizes that the position of the title will be different. 
		
	
		
		
		
		
		
		
		
		
		
		
	
	I hate to push, but do you know a way to address the other main issue I had: the occasional optional file type sandwiched between ROUND brackets? Like so:

	Jules Verne - Journey to the Center of the Earth (html).zip

	Ideally, the best result would be to drop those file types, round brackets and everything between them, from the Title. Inevitably, legit titles with round brackets could be affected, I guess, but that's a small price to pay. I won't be greedy, though. You've already saved me a ton of potential headache.  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#8 | 
| 
			
			
			
			 Wizard 
			
			Posts: 4,004 
				Karma: 177841 
				Join Date: Dec 2009 
				
				
				
				Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			(staggering a bit) ..... try this: 
		
	
		
		
		
		
		
		
		
		
		
		
	
	Code: 
	^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?(\[?(?P<series>[^0-9\-]+) (- )?(?P<series_index>[0-9.]+)\]?\s*-\s*)?(?P<title>[a-zA-Z1-9 ]+)(\(.*\))?$  | 
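To check the expression above outside Calibre, a small Python sketch along these lines may help. It assumes the pattern is searched against the filename with the extension removed (approximated here with `os.path.splitext`), and `parse` is a hypothetical helper name. Note that the title class `[a-zA-Z1-9 ]` admits only letters, the digits 1-9 and spaces, so a title containing other characters (an apostrophe, for example) does not appear to match at all.

```python
import os
import re

# Expression quoted from the post above, unchanged.
PATTERN = re.compile(
    r"^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?"
    r"(\[?(?P<series>[^0-9\-]+) (- )?(?P<series_index>[0-9.]+)\]?\s*-\s*)?"
    r"(?P<title>[a-zA-Z1-9 ]+)(\(.*\))?$"
)

def parse(filename):
    """Match the name without its extension; strip whitespace from each group."""
    stem, _extension = os.path.splitext(filename)
    match = PATTERN.search(stem)
    if match is None:
        return None
    # Groups that did not take part in the match come back as None.
    return {key: (value or "").strip() for key, value in match.groupdict().items()}

samples = [
    "L. Frank Baum - [Wizard of Oz 02] - The Marvelous Land of Oz.lit",
    "H. G. Wells - The Time Machine.epub",
    "Jules Verne - Journey to the Center of the Earth (html).zip",
    "Lewis Carroll - Alice's Adventures in Wonderland.epub",  # apostrophe in title
]
for name in samples:
    print(name, "->", parse(name))
```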
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#9 | |
| 
			
			
			
			 Groupie 
			
			Posts: 160 
				Karma: 416 
				Join Date: Apr 2010 
				
				
				
				Device: Astak EZ Reader Pro AND Sony PRS-505 
				
				
				 | 
	
	
	
		
		
		
		
		No dice, though. It tosses everything into the title field again.  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#10 | 
| 
			
			
			
			 Wizard 
			
			Posts: 4,004 
				Karma: 177841 
				Join Date: Dec 2009 
				
				
				
				Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			It works for me.  Try again, or show me what you're testing it on.  It correctly parsed all of these: 
		
	
		
		
		
		
		
		
		
		
		
		
		
			L. Frank Baum - [Wizard of Oz 02] - The Marvelous Land of Oz.lit
			L. Frank Baum - [Wizard of Oz 02] - The Marvelous Land of Oz(lit).lit
			L. Frank Baum - Wizard of Oz 02 - The Marvelous Land of Oz(lit).lit
			L. Frank Baum - Wizard of Oz 02 - The Marvelous Land of Oz.lit

			Last edited by Starson17; 04-06-2010 at 10:17 PM.  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#11 | |
| 
			
			
			
			 Groupie 
			
			Posts: 160 
				Karma: 416 
				Join Date: Apr 2010 
				
				
				
				Device: Astak EZ Reader Pro AND Sony PRS-505 
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	
  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#12 | 
| 
			
			
			
			 Junior Member 
			
			Posts: 3 
				Karma: 10 
				Join Date: Apr 2010 
				
				
				
				Device: none 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			Actually, there are a few instances where it fails to strip version numbers and formats after the title (although it is beautifully written). 
		
	
		
		
		
		
		
		
		
		
		
		
	
	Replacing (\(.*\))?$ with .+ seems to drop everything after the title.  | 
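One caveat with that replacement may be worth checking before relying on it. The sketch below compares three endings for the expression from post #8 under the same assumption as before (pattern searched against the name without its extension). The `title_of` helper and the `[v2] (html)` suffix are hypothetical, chosen only to illustrate a trailing version tag. Because `.+` must consume at least one character, a filename with nothing after the title seems to lose its last letter; `.*` avoids that.

```python
import re

# Shared start of the expression from post #8, up to and including the title group.
PREFIX = (
    r"^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?"
    r"(\[?(?P<series>[^0-9\-]+) (- )?(?P<series_index>[0-9.]+)\]?\s*-\s*)?"
    r"(?P<title>[a-zA-Z1-9 ]+)"
)

def title_of(tail, stem):
    """Return the stripped title for PREFIX + tail, or None when nothing matches."""
    match = re.search(PREFIX + tail, stem)
    return match.group("title").strip() if match else None

with_suffix = "Jules Verne - Journey to the Center of the Earth [v2] (html)"
plain = "H. G. Wells - The Time Machine"

for tail in (r"(\(.*\))?$", r".+", r".*"):
    print(repr(tail), "|", title_of(tail, with_suffix), "|", title_of(tail, plain))
```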
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#13 | |||
| 
			
			
			
			 Groupie 
			
			Posts: 160 
				Karma: 416 
				Join Date: Apr 2010 
				
				
				
				Device: Astak EZ Reader Pro AND Sony PRS-505 
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	
 The regex works perfectly with any of this: Quote: 
	
 Quote: 
	
  | 
|||
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#14 | ||
| 
			
			
			
			 Groupie 
			
			Posts: 160 
				Karma: 416 
				Join Date: Apr 2010 
				
				
				
				Device: Astak EZ Reader Pro AND Sony PRS-505 
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	
 Quote: 
	
  | 
||
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#15 | 
| 
			
			
			
			 Right, Except When Wrong 
			
			Posts: 364 
				Karma: 4323767 
				Join Date: Aug 2007 
				Location: Indianapolis 
				
				
				Device: Kindle Oasis 3 (sometimes iPad Mini). 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			This is so close to what I'm trying to accomplish that I thought it was worth posting my query, too. In my case, book titles are formatted like this: 
		
	
		
		
		
		
		
		
		
		
		
		
	
	Brown, Dan - The Lost Symbol [Robert Langdon #3].epub

	that is:

	AuthorLast, AuthorFirst - Title [Series #SeriesNum].format

	It looks like the code that was provided is very close, but I'm not quite sure where the "delimiters" (not sure of the right term) are between the Author, Series, and Title sections of the RE. Thanks for any help you can provide.  | 
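In the expression from post #8, the delimiters appear to be the `\s*-\s*` pieces (a dash with optional spaces) after the author and after the series, with `\[?` and `\]?` as the optional brackets. For the "Title [Series #SeriesNum]" layout, one possible starting point is sketched below. It is not taken from the thread: the pattern is hypothetical, assumes the extension is already removed, captures the author exactly as written ("Last, First"), and will not cope with hyphens inside the author name.

```python
import re

# Hypothetical pattern for "AuthorLast, AuthorFirst - Title [Series #SeriesNum]".
PATTERN = re.compile(
    r"^(?P<author>[^-]+?)\s+-\s+"      # author, up to the " - " delimiter
    r"(?P<title>[^\[\]]+?)"            # title, stopping before any "[" 
    r"(?:\s*\[(?P<series>[^\]#]+?)\s*#(?P<series_index>[0-9.]+)\])?$"
)

with_series = PATTERN.search("Brown, Dan - The Lost Symbol [Robert Langdon #3]")
without_series = PATTERN.search("Brown, Dan - Angels and Demons")

print(with_series.groupdict())
print(without_series.groupdict())
```

The series part is wrapped in an optional non-capturing group, so names without a bracketed series should still yield an author and title, with the series fields left unset.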
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 