Old 09-24-2010, 12:29 AM   #46
megachirops
Enthusiast
Posts: 31
Karma: 12
Join Date: Mar 2010
Device: Kindle 2, Kindle 3
Quote:
Originally Posted by Calibreuser View Post
I have some books named like so
Lauthor, Fauthor - series ##- title.ext
Grant, Maxwell - The Shadow 331 - Mark Of The Shadow(b).txt

My best so far is (?P<author>.+) - (?P<series>.+) - (?P<title>[^_]+)

but now the series index is part of series

what is series index Var. name?
and how do I change title section to drop crap at end like (b)
I can rename as needed in most cases
Code:
(?P<author>((?!\s-\s).)+)\s-\s(?:(?P<series>.+)\s(?P<series_index>\d+)\s-\s)?(?P<title>[^(]+)(?:\(.*\))?
This should solve for "author(s) - title" and "author(s) - series # - title" as well as get rid of anything in ()'s at the end of the title.
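A quick way to check the expression above outside Calibre is to run it through Python's re module. This is only a sketch: the helper name parse is made up for illustration, and it assumes the file extension has already been stripped before matching.

```python
import re

# Pattern copied from the post above; only the test harness around it is new.
PATTERN = re.compile(
    r"(?P<author>((?!\s-\s).)+)\s-\s"
    r"(?:(?P<series>.+)\s(?P<series_index>\d+)\s-\s)?"
    r"(?P<title>[^(]+)(?:\(.*\))?"
)

def parse(name):
    """Return the named groups for a file name (hypothetical helper)."""
    m = PATTERN.match(name)
    return m.groupdict() if m else None

# "author - series # - title" with trailing "(b)" to be dropped
with_series = parse("Grant, Maxwell - The Shadow 331 - Mark Of The Shadow(b)")
# "author - title" with no series part at all
without_series = parse("Grant, Maxwell - Mark Of The Shadow")

print(with_series)
print(without_series)
```

When the optional series part is absent, the series and series_index groups simply come back as None.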


Quote:
Originally Posted by Calibreuser View Post
can anyone help me author, series, index, and title out of this

Grant, Maxwell - [The Shadow 331] - Mark Of The Shadow(b).txt
edit: Just noticed the 2nd example had the series in []'s. If that's the case, here it is slightly modified to support optional []'s around the series.

Code:
(?P<author>((?!\s-\s).)+)\s-\s(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>\d+)(?:\s*\])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?
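The bracket-tolerant version can be tried the same way in plain Python. Again a sketch under assumptions: the names BRACKET_PATTERN, names and results are illustrative, and the extension is assumed to be removed beforehand.

```python
import re

# Pattern copied from the post above; the optional \[ and \] let the series
# be written either as "[The Shadow 331]" or as "The Shadow 331".
BRACKET_PATTERN = re.compile(
    r"(?P<author>((?!\s-\s).)+)\s-\s"
    r"(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>\d+)(?:\s*\])?\s-\s)?"
    r"(?P<title>[^(]+)(?:\(.*\))?"
)

names = [
    "Grant, Maxwell - [The Shadow 331] - Mark Of The Shadow(b)",
    "Grant, Maxwell - The Shadow 331 - Mark Of The Shadow(b)",
]

# Both spellings should yield the same author, series, index and title.
results = [BRACKET_PATTERN.match(n).groupdict() for n in names]
for r in results:
    print(r)
```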

Last edited by megachirops; 09-24-2010 at 12:44 AM.
Old 09-24-2010, 04:31 AM   #47
Calibreuser
Junior Member
Posts: 2
Karma: 10
Join Date: Sep 2010
Device: nook
Thanks for the help. I will study it a bit before I ask my next question. Most files don't have [ ]; I removed them for consistency's sake.

Calibreuser
Old 09-24-2010, 10:47 AM   #48
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
Just finished another rather large edit; it actually took me longer than writing the original post.
Concerning the features discussed, I think it's getting to the point that relevant stuff is included. As I stated in the first paragraphs, I want this to be an introduction explaining some basic concepts. With that in mind, is there still something missing?
Also, I've shuffled some stuff around to hopefully make more sense for someone new to regular expressions, and cleaned up the formatting.
Old 09-24-2010, 12:23 PM   #49
kovidgoyal
creator of calibre
Posts: 43,744
Karma: 22446736
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
@Manichean: Drop me a PM when you think the tutorial is ready to be included in the User Manual (that is, if including it in the User Manual is OK with you).
Old 09-24-2010, 12:28 PM   #50
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
Quote:
Originally Posted by kovidgoyal View Post
@Manichean: Drop me a PM when you think the tutorial is ready to be included in the User Manual (that is, if including it in the User Manual is OK with you).
Will do.
Old 09-24-2010, 02:44 PM   #51
kacir
Wizard
Posts: 3,447
Karma: 10484861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
Quote:
Originally Posted by Manichean View Post
Concerning the features discussed, I think it's getting to the point that relevant stuff is included. As I stated in the first paragraphs, I want this to be an introduction explaining some basic concepts. With that in mind, is there still something missing?
I think that at this moment your Introduction is almost perfect. It is a good balance between simplicity and complete description of features. You should consider writing technical books or manuals or technical documentation. Very few people can present such advanced stuff in a concise way that does not sound too complicated to regular users.

As for missing features, there is only one thing that might be useful in the new bulk editing of metadata, in the experimental Search and Replace.
All groups in parentheses can be referenced in the replace field by using the escape sequences \1, \2, ... They can also be referenced this way in the search field; I just can't think of a situation where *that* would be used in Calibre.

An example: you have set the authors field in Calibre to FirstName LastName, but all your author_sort metadata is in the form LastName, FirstName, and you want to change it to FirstName LastName.

In the new Search and Replace dialog, you select Search field author_sort, you search for "([^,]*), (.*)" and the replace string is \2 \1
Now, the author_sort that was London, Jack becomes Jack London.
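The same swap can be tried in plain Python, whose re module also accepts \1 and \2 in the replacement string. This is only a sketch of the idea, not Calibre's own dialog; the function name swap_author_sort is made up for illustration. The last lines show a group referenced inside the search pattern itself, here to find a doubled word.

```python
import re

def swap_author_sort(value):
    """Turn 'LastName, FirstName' into 'FirstName LastName' (hypothetical helper)."""
    # Group 1 is everything before the comma, group 2 everything after ", ".
    return re.sub(r"([^,]*), (.*)", r"\2 \1", value)

swapped = swap_author_sort("London, Jack")
# A value without a comma does not match and is returned unchanged.
unchanged = swap_author_sort("Jack London")
print(swapped)
print(unchanged)

# A backreference in the search pattern: \1 must repeat whatever group 1 matched.
doubled = re.search(r"\b(\w+) \1\b", "the the call of the wild")
print(doubled.group(0))
```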

There is another little thing you might want to add.
Complicated regexps look intimidating, because the syntax is very condensed. But writing regexps is not that difficult if you construct them step by step.
In other words, writing regexps is much easier than it would seem when you just read them.
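To illustrate the step-by-step approach, here is a small Python sketch that grows a pattern one piece at a time and checks it after each step. The sample file name is borrowed from earlier in the thread; the variable names step1 to step3 and the simplified pattern pieces are assumptions for illustration, not the expression posted above.

```python
import re

name = "Grant, Maxwell - The Shadow 331 - Mark Of The Shadow"

# Step 1: the author is everything up to the first " - ".
step1 = r"(?P<author>.+?) - "
# Step 2: next comes a series name followed by a number and another " - ".
step2 = step1 + r"(?P<series>.+) (?P<series_index>\d+) - "
# Step 3: whatever is left is the title.
step3 = step2 + r"(?P<title>.+)"

# Test after every step, so a mistake shows up as soon as it is made.
results = []
for pattern in (step1, step2, step3):
    m = re.match(pattern, name)
    results.append(m.groupdict() if m else None)
    print(pattern, "->", results[-1])
```

Each step only adds one small, readable piece, even though the final pattern looks dense when read in one go.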
Old 09-24-2010, 02:50 PM   #52
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
Quote:
Originally Posted by kacir View Post
I think that at this moment your Introduction is almost perfect. It is a good balance between simplicity and complete description of features. You should consider writing technical books or manuals or technical documentation. Very few people can present such advanced stuff in a concise way that does not sound too complicated to regular users.
Thank you for that.

Quote:
Originally Posted by kacir View Post
As for missing features, there is only one thing that might be useful in the new bulk editing of metadata, in the experimental Search and Replace.
All groups in parentheses can be referenced in the replace field by using the escape sequences \1, \2, ... They can also be referenced this way in the search field; I just can't think of a situation where *that* would be used in Calibre.
... which is why, until now, I've left them out...

Quote:
Originally Posted by kacir View Post
An example: you have set the authors field in Calibre to FirstName LastName, but all your author_sort metadata is in the form LastName, FirstName, and you want to change it to FirstName LastName.

In the new Search and Replace dialog, you select Search field author_sort, you search for "([^,]*), (.*)" and the replace string is \2 \1
Now, the author_sort that was London, Jack becomes Jack London.
... but this makes sense. I'll add it once chaley and kovid are done doing all kinds of crazy amazing stuff with the beta and it moves to the main trunk.

Quote:
Originally Posted by kacir View Post
There is another little thing you might want to add.
Complicated regexps look intimidating, because the syntax is very condensed. But writing regexps is not that difficult if you construct them step by step.
In other words, writing regexps is much easier than it would seem when you just read them.
This is actually one of the reasons we needed this text, I think.
Old 09-24-2010, 02:55 PM   #53
kacir
Wizard
Posts: 3,447
Karma: 10484861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
Quote:
Originally Posted by Manichean View Post
... but this makes sense. I'll add it once chaley and kovid are done doing all kinds of crazy amazing stuff with the beta and it moves to the main trunk.
Well, I am using 0.7.19 at this moment and the experimental search and replace in metadata is there.
Old 09-24-2010, 03:07 PM   #54
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
Quote:
Originally Posted by kacir View Post
Well, I am using 0.7.19 at this moment and the experimental search and replace in metadata is there.
Yes, but it has changed quite a bit in the beta. I'm waiting to see the finished shape.
Old 09-26-2010, 07:39 AM   #55
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
Added re-referencing groups and the search & replace feature.

Personally, I consider this iteration of the introduction to be an excellent candidate for a final version. Thus, I'd like to first, of course, thank those who helped me write this, and second, ask those people (and anyone else who wants to) to re-read the post and make sure it's good.
Old 09-26-2010, 09:05 AM   #56
kacir
Wizard
Posts: 3,447
Karma: 10484861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
Quote:
Originally Posted by Manichean View Post
Personally, I consider this iteration of the introduction to be an excellent candidate for a final version.
Absolutely fantastic job.
Thank you very much.

I wish I had had such an excellent text available when I was starting to learn Regular Expressions. That was not with Calibre, but while manipulating texts in various advanced text editors, such as TextPad on Windows or, later, Vim. It took me many, *many* trials and errors(*), and many re-reads of cryptic, very condensed manuals to discover the things that you describe. Then I found the book "Mastering Regular Expressions" ;-)
The book is excellent, but this Introduction is about the maximum that a person who has never used REs before can (and is willing to) digest. Even more important is that most users will never require more than you describe, even if they do fairly advanced stuff with RE magic.

I think you should definitely consider becoming a technical writer. Very few people can understand complicated technical issues AND present them in such a clear, concise manner. Knowing what can be left out for your particular audience is perhaps the most difficult thing for a technical writer.

You have just made Calibre a much more useful tool for many people.

----------------------------
(*) yes, I did fail spectacularly many times and I did take notes ;-)
Old 09-26-2010, 09:12 AM   #57
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
Quote:
Originally Posted by kacir View Post
Absolutely fantastic job.
Thank you very much.

I wish I had had such an excellent text available when I was starting to learn Regular Expressions. That was not with Calibre, but while manipulating texts in various advanced text editors, such as TextPad on Windows or, later, Vim. It took me many, *many* trials and errors(*), and many re-reads of cryptic, very condensed manuals to discover the things that you describe. Then I found the book "Mastering Regular Expressions" ;-)
The book is excellent, but this Introduction is about the maximum that a person who has never used REs before can (and is willing to) digest. Even more important is that most users will never require more than you describe, even if they do fairly advanced stuff with RE magic.

I think you should definitely consider becoming a technical writer. Very few people can understand complicated technical issues AND present them in such a clear, concise manner. Knowing what can be left out for your particular audience is perhaps the most difficult thing for a technical writer.

You have just made Calibre a much more useful tool for many people.

----------------------------
(*) yes, I did fail spectacularly many times and I did take notes ;-)
Thank you so much for the compliments. As for technical writing, my chosen profession is somewhat removed from that, but I'm currently thinking about offering to help Calibre out by, I don't know, helping to write documentation or some such.
Old 09-26-2010, 09:31 AM   #58
kacir
Wizard
Posts: 3,447
Karma: 10484861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
Quote:
Originally Posted by Manichean View Post
As for technical writing, my chosen profession is somewhat removed from that, ...
Well, one of the most often repeated pieces of advice for writers is: "Don't Quit The Day Job!" ;-)
And indeed, the vast majority of authors do have a day job. Even Jeffrey Friedl, author of THE book about Regular Expressions, had a day job while he was writing the book:
http://regex.info/
Old 09-26-2010, 10:21 AM   #59
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
A comment:
Early on in post 1 it starts with "What on earth is a regular expression?" and by the third sentence we're into case, with lots of discussion about case flags in the discussion that follows.
This is a logical place to start that discussion in a general discussion of regexps. A general regexp is case sensitive, and it's an important concept, but as Charles posted about Calibre:
Quote:
ignore case is turned on by default, and therefore cannot be turned off.
This can be pretty confusing if someone is trying to understand regex matching by doing actual searches on the search bar. You say "A" does not match "a" but in Calibre's search bar it does, even if (?i) is not set. If they try to do regexes in the new bulk metadata edit page, they don't use the flags, they use an option switchbox.

Since this discussion is specific to Calibre, I wonder if it would be worth stating up front where ignore case is turned on by default and where it can be controlled by selecting an option box.
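For comparison, this is how the general, case-sensitive behaviour looks in Python's re module, which is where the (?i) flag syntax comes from. It is only a sketch of plain regexp matching and says nothing about which Calibre dialogs ignore case; that part is as described in this post. The variable names are illustrative.

```python
import re

text = "a"

# By default a regexp is case sensitive: "A" does not match "a".
default_match = re.search("A", text)

# Ignoring case can be requested from outside the pattern ...
flag_match = re.search("A", text, re.IGNORECASE)

# ... or from inside it, with the inline (?i) flag at the start.
inline_match = re.search("(?i)A", text)

print(default_match)
print(flag_match.group(0))
print(inline_match.group(0))
```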

Last edited by Starson17; 09-26-2010 at 10:40 AM.
Old 09-26-2010, 10:45 AM   #60
Manichean
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
Quote:
Originally Posted by Starson17 View Post
Since this discussion is specific to Calibre, I wonder if it would be worth stating up front where ignore case is turned on by default and where it can be controlled by selecting an option box.
You're right, I must have missed that comment. That means, if I'm not mistaken, that the only place case matters is the search & replace, where it can be controlled with a checkbox, yes? If that's the case (*snigger* I love really bad puns...), I think I'll remove at least the discussion of the ignore-case flag. The case discussion in the paragraph you pointed to could then be rephrased to the effect that yes, generally, case matters, but Calibre ignores it except in one instance. I'd like to keep at least that one reference.