I'm not sure if this is related to this topic, but I believe it is.
I have a HUGE (50K+) collection of ebooks in a variety of formats. Many of them are probably duplicates, but locating the duplicates can be a very slow, tedious process, even with good bulk renaming tools and such. Most of my family and several friends are readers who use ereaders, and I'd like to use Calibre's content server to allow them access to my library. I'd even be open to allowing the same library access to this forum's members.
Almost everyone out there seems to have some unique format they prefer for listing their books in their libraries. Some prefer "Author First name, last name - title, series". Others prefer "title - author first name, last name, series", or "author last name, first name - title", etc. Heck, despite the file extension indicating the file type, some people actually put things like "(epub)" or "(mobi)" into the titles. Because of this, I have to handraulically rename large portions of my collection to try to see where the duplicates are, and which formats I want to keep. I've been working on it sporadically for the past several years, and I'm still only working in the first half dozen letters of the alphabet. In addition, there are all the special characters people use: dashes, colons, semicolons, brackets, braces, etc.
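To illustrate how names in different layouts might be compared without renaming anything first, here is a minimal Python sketch. It assumes that two files are "probably the same book" when their names contain the same words once punctuation, case, word order, and format tags are ignored. The names `match_key` and `group_candidates` and the list of format tags are made up for this example, and the matching is deliberately crude: it only suggests candidates to review by hand.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumed list of format tags that people embed in titles, e.g. "(epub)".
FORMAT_TAG = re.compile(r"[\(\[\{]\s*(epub|mobi|azw3|pdf)\s*[\)\]\}]", re.IGNORECASE)


def match_key(filename):
    """Reduce a filename to a sorted bag of lowercase words."""
    stem = Path(filename).stem              # drop the real extension
    stem = FORMAT_TAG.sub(" ", stem)        # drop "(epub)", "[mobi]", ...
    words = re.findall(r"[a-z0-9]+", stem.lower())  # ignore punctuation
    return " ".join(sorted(words))          # ignore word order


def group_candidates(filenames):
    """Return lists of filenames that share the same match key."""
    groups = defaultdict(list)
    for name in filenames:
        groups[match_key(name)].append(name)
    return [names for names in groups.values() if len(names) > 1]


names = [
    "Isaac Asimov - Foundation (epub).epub",
    "Foundation - Asimov, Isaac.mobi",
    "Herbert, Frank - Dune.epub",
]
for group in group_candidates(names):
    print(group)
```

With the sample list, the two Asimov files land in one group and the Herbert file is left alone. A name that includes the series will not match one that omits it, so this would miss some duplicates.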
Personally, I prefer the old fashioned way of cataloging: "Author Last Name, Author First Name, Title, Series". What I'd like is a script (or scripts) that would allow me to look at the different elements in the filenames and swap/delete them appropriately, in bulk. Even if I have to go through the file listings a page or two at a time and select individual titles, it would be quicker than what I'm doing now.
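As a rough idea of what such a script could look like, here is a minimal Python sketch. It rests on several assumptions: the elements of each filename are separated by " - ", the person running it says which order the elements are in for a given batch of files, and the output layout is "Last, First - Title - Series". The names `rearrange`, `author_last_first`, and `TARGET_ORDER` are made up for this example. It only computes the new name and does not touch any files.

```python
from pathlib import Path

# Assumed output order: author, then title, then series.
TARGET_ORDER = ("author", "title", "series")


def author_last_first(author):
    """Turn "First Last" into "Last, First"; tidy an existing "Last, First"."""
    author = author.strip()
    if "," in author:
        last, first = (part.strip() for part in author.split(",", 1))
    else:
        # Naive: treats the final word as the surname.
        first, _, last = author.rpartition(" ")
    return f"{last}, {first}" if first else last


def rearrange(filename, layout):
    """Reorder the " - " separated elements of a filename.

    layout names the elements in the order they appear now, for example
    ("title", "author", "series"). Returns None when the name has more
    elements than the layout, so it can be set aside for manual review.
    """
    path = Path(filename)
    parts = [part.strip() for part in path.stem.split(" - ")]
    if len(parts) > len(layout):
        return None
    fields = dict(zip(layout, parts))
    if "author" in fields:
        fields["author"] = author_last_first(fields["author"])
    ordered = [fields[name] for name in TARGET_ORDER if name in fields]
    return " - ".join(ordered) + path.suffix


print(rearrange("Foundation - Isaac Asimov - Foundation 1.epub",
                ("title", "author", "series")))
```

The surname handling is naive, so multi-word surnames such as "Le Guin" would come out wrong and need a manual fix. Once the proposed names have been checked, the actual renaming could be done with `Path.rename`.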
Unfortunately, I'm not a programmer or software whiz, so I don't understand the regex conventions well enough to do this myself. Any help would be really appreciated.