10-13-2012, 08:43 AM | #1 |
Connoisseur
Posts: 77
Karma: 2136220
Join Date: Sep 2012
Device: none
|
Plugin to transform database to upper case
Hi to all,
I have the Windows portable version of calibre installed on an NTFS-formatted external USB drive, and I access the same library from the Linux version when I'm running Linux. I need to be able to add books from both Linux and Windows, and I alternate between the two (Windows at work, Linux at home). Because the two operating systems handle NTFS differently (see the PS below), switching between them causes problems, so I would like to create one plugin (or more, if necessary) that converts the existing database to upper case and keeps it that way when books are added. The plugin(s) should also avoid creating file paths longer than 256 characters.
To convert the existing database, I've thought of a plugin that, for each book, changes the author name(s) and the title to upper case and appends a specific string ('_MYTEMP') to both. The string is needed to force the operating system to rename the files and directories even when the file system is case insensitive. Once the changes are saved, the plugin removes the string from the names and the title and saves again. So at the end of the run I expect the first file tree below to have become the second:
Code:
Federal Aviation Administration
├── FAA Helicopter Flying Handbook - 8083-21 (292)
│   ├── cover.jpg
│   ├── FAA Helicopter Flying Handbook - 8083-21 - Federal Aviation Administration.pdf
│   └── metadata.opf
├── Pilot's Handbook of Aeronautical Knowled (291)
│   ├── cover.jpg
│   ├── metadata.opf
│   └── Pilot's Handbook of Aeronautical Knowled - Federal Aviation Administration.pdf
└── Special Federal Aviation Regulations SFA (293)
    ├── cover.jpg
    ├── metadata.opf
    └── Special Federal Aviation Regulations SFA - Federal Aviation Administration.pdf
Code:
FEDERAL AVIATION ADMINISTRATION
├── FAA HELICOPTER FLYING HANDBOOK - 8083-21 (292)
│   ├── cover.jpg
│   ├── FAA HELICOPTER FLYING HANDBOOK - 8083-21 - FEDERAL AVIATION ADMINISTRATION.PDF
│   └── METADATA.OPF
├── PILOT'S HANDBOOK OF AERONAUTICAL KNOWLED (291)
│   ├── cover.jpg
│   ├── metadata.opf
│   └── PILOT'S HANDBOOK OF AERONAUTICAL KNOWLED - FEDERAL AVIATION ADMINISTRATION.PDF
└── SPECIAL FEDERAL AVIATION REGULATIONS SFA (293)
    ├── cover.jpg
    ├── metadata.opf
    └── SPECIAL FEDERAL AVIATION REGULATIONS SFA - FEDERAL AVIATION ADMINISTRATION.PDF
Obviously the plugin(s) should work on any file type. So my (initial) questions are:
1) Which type of plugin should I write: a FileTypePlugin or a metadata one?
2) How can I loop over all the books?
Thank you,
Xwang
PS: The biggest difference is that Linux can create multiple files whose names differ only in case, and such files are not visible under Windows. The other problem is that Windows has a maximum path length of 256 characters, which Linux does not have, so I can end up with books that are not readable under Windows.
PS2: I prefer to implement this as a plugin because I don't have much time to maintain a personal source branch that would have to be realigned with upstream every time upstream changes. |
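The two-step rename described in this post can be tried outside calibre first. What follows is only a minimal sketch, assuming plain `os.rename` on a scratch directory rather than anything calibre does internally. The helper name `force_upper_rename` is hypothetical, the marker `_MYTEMP` and the 256-character limit are the values quoted in the post, and the check for an existing upper-case entry is an extra safety step that the post does not mention.

```python
import os
import tempfile

MARKER = '_MYTEMP'      # temporary suffix taken from the post
MAX_PATH_LEN = 256      # limit quoted in the post; adjust if needed


def force_upper_rename(path, marker=MARKER, max_len=MAX_PATH_LEN):
    """Rename the last component of `path` to upper case via a temporary name.

    Returns the resulting path (unchanged if the name is already upper case).
    """
    parent, name = os.path.split(path)
    if name == name.upper():
        return path                          # nothing to do
    final = os.path.join(parent, name.upper())
    temp = final + marker                    # longest path that will exist
    if len(temp) > max_len:
        raise ValueError('path too long: %d characters' % len(temp))
    if os.path.exists(temp):
        raise FileExistsError(temp)
    os.rename(path, temp)                    # step 1: the name really changes
    if os.path.exists(final):
        # A different entry already uses the upper-case name
        # (possible on a case-sensitive file system): undo and stop.
        os.rename(temp, path)
        raise FileExistsError(final)
    os.rename(temp, final)                   # step 2: drop the marker
    return final


if __name__ == '__main__':
    scratch = tempfile.mkdtemp()
    author_dir = os.path.join(scratch, 'Federal Aviation Administration')
    os.mkdir(author_dir)
    print(force_upper_rename(author_dir))
    print(os.listdir(scratch))
```

Going through the temporary name is what makes the rename visible on a case-insensitive file system, where renaming `Aaa` directly to `AAA` may be treated as no change at all.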
10-19-2012, 09:10 AM | #2 |
Resident Curmudgeon
Posts: 74,576
Karma: 129670952
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
|
There isn't going to be enough call for someone to write such a plugin. It's too limited, and not enough people would use it to make it worthwhile.
|
10-19-2012, 11:47 AM | #3 |
Plugin Developer
Posts: 6,388
Karma: 3966377
Join Date: Dec 2011
Location: Midwest USA
Device: Kindle Paperwhite(10th)
|
That doesn't stop Xwang from writing one for his own use, though.
First, Xwang, have you proven that changing the author/title to uppercase like that solves your problem? That is, have you tried it manually with a smaller set of books?
Assuming so, I suggest a UI plugin that searches for titles/authors containing lower case letters and updates the metadata on command.
One place you could start is the Extract ISBN plugin. It's the simplest plugin I know of that modifies metadata. You don't need the whole background processing part, but the technique used to update the ISBN can probably be adapted to update title/authors instead.
Another possible way to do it is this:
Code:
db = self.gui.current_db
bookids = db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)
for bookid in bookids:
    mi = db.get_metadata(bookid,index_is_id=True)
    mi.title = mi.title.upper()
    auths=[]
    for auth in mi.authors:
        auths.append(auth.upper())
    mi.authors = auths
    db.set_metadata(bookid,mi)
db.refresh_ids(bookids) |
10-19-2012, 05:20 PM | #4 | |
Connoisseur
Posts: 77
Karma: 2136220
Join Date: Sep 2012
Device: none
|
Quote:
I'm pretty sure that converting the db to upper case and keeping it that way is sufficient to solve my problem. However, it has to be done in two steps: first I have to convert titles and authors to upper case while appending a special string to both, and then I have to remove the special string. I'm not a Python expert, but I suppose adding and removing the string is not a problem, so I can use your code as a base and run the for loop twice.
I have two questions:
1) Does bookid change when the title or authors are changed?
2) What exactly does "db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)" do?
My idea is something like this:
Code:
db = self.gui.current_db
bookids = db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)
for bookid in bookids:
    mi = db.get_metadata(bookid,index_is_id=True)
    mi.title = mi.title.upper()+'_T#@§'
    auths=[]
    for auth in mi.authors:
        auths.append(auth.upper()+'_T#@§')
    mi.authors = auths
    db.set_metadata(bookid,mi)
for bookid in bookids:
    mi = db.get_metadata(bookid,index_is_id=True)
    mi.title = mi.title[:-5]
    auths=[]
    for auth in mi.authors:
        auths.append(auth[:-5])
    mi.authors = auths
    db.set_metadata(bookid,mi)
db.refresh_ids(bookids)
Xwang |
|
10-19-2012, 05:45 PM | #5 | |
Plugin Developer
Posts: 6,388
Karma: 3966377
Join Date: Dec 2011
Location: Midwest USA
Device: Kindle Paperwhite(10th)
|
Glad to help.
Quote:
"db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)" searches the library and gives you back ids (rather than row numbers in the current view) for the search 'title:"~[a-z]" or author:"~[a-z]"', without a search restriction (the None).
'title:"~[a-z]" or author:"~[a-z]"' is a search made of two regular expressions, meaning 'any book whose title contains letters a-z (not A-Z)' or 'any book whose author(s) contain letters a-z (not A-Z)'.
Rather than loop twice, you might just change the name twice in the same loop. |
|
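To see what the pattern `[a-z]` selects, here is a small stand-alone check with Python's `re` module. It illustrates the regular expression only, not calibre's search code. Whether calibre applies the pattern case-sensitively is an assumption worth verifying, because with case-insensitive matching (`re.IGNORECASE`) the same pattern also matches titles that are already all upper case. The sample titles and the helper `has_match` are made up for the illustration.

```python
import re

PATTERN = '[a-z]'   # the pattern used in the search expression

# Sample values, invented for this illustration
titles = ["Pilot's Handbook", "PILOT'S HANDBOOK", "8083-21"]


def has_match(text, flags=0):
    """True if PATTERN matches anywhere in text."""
    return re.search(PATTERN, text, flags) is not None


# Case-sensitive: only text that still contains a lower-case letter
case_sensitive = [t for t in titles if has_match(t)]

# Case-insensitive: any text that contains a letter a-z in either case
case_insensitive = [t for t in titles if has_match(t, re.IGNORECASE)]

print(case_sensitive)
print(case_insensitive)
```

If the second behaviour is the one in effect, the search returns every book with a letter in its title or author, so the selection has to be narrowed in Python instead.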
10-19-2012, 05:56 PM | #6 | |
Connoisseur
Posts: 77
Karma: 2136220
Join Date: Sep 2012
Device: none
|
Quote:
Code:
db = self.gui.current_db
bookids = db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)
for bookid in bookids:
    mi = db.get_metadata(bookid,index_is_id=True)
    mi.title = mi.title.upper()+'_T#@§'
    auths=[]
    for auth in mi.authors:
        auths.append(auth.upper()+'_T#@§')
    mi.authors = auths
    db.set_metadata(bookid,mi)
    mi = db.get_metadata(bookid,index_is_id=True)
    mi.title = mi.title[:-5]
    auths=[]
    for auth in mi.authors:
        auths.append(auth[:-5])
    mi.authors = auths
    db.set_metadata(bookid,mi)
db.refresh_ids(bookids)
Which module should I import?
Xwang |
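The slice `[:-5]` relies on the marker being exactly five characters long and still sitting at the end of the string when the metadata is read back. A slightly more defensive variant, as a plain-Python sketch that does not depend on calibre; the helper names `add_marker` and `strip_marker` are hypothetical:

```python
MARKER = '_T#@§'   # temporary suffix used in the snippet above


def add_marker(text, marker=MARKER):
    """Upper-case text and append the temporary marker."""
    return text.upper() + marker


def strip_marker(text, marker=MARKER):
    """Remove the marker only if it really is at the end of text."""
    if text.endswith(marker):
        return text[:-len(marker)]
    return text
```

Inside the loop, `mi.title = add_marker(mi.title)` and later `mi.title = strip_marker(mi.title)` would replace the string concatenation and the fixed slice, so a title that comes back without the marker is left untouched instead of losing its last five characters.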
|
10-19-2012, 06:02 PM | #7 | |
Plugin Developer
Posts: 6,388
Karma: 3966377
Join Date: Dec 2011
Location: Midwest USA
Device: Kindle Paperwhite(10th)
|
Quote:
As for which module to import: you need to set up a whole plugin; this is just the core snippet. There's official documentation, but I learned even more from examining the code of existing plugins. |
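For orientation, an interface (UI) plugin is normally a zip file whose `__init__.py` holds a small wrapper class that points calibre at the real action class. The following is a minimal sketch, assuming the `InterfaceActionBase` wrapper described in calibre's plugin documentation. The name, description, author and version values are placeholders, the module path is inferred from the tracebacks that appear later in this thread, and the `except ImportError` branch is only a stand-in so the file can be loaded outside calibre.

```python
# __init__.py of the plugin zip (sketch, not the poster's actual file)
try:
    from calibre.customize import InterfaceActionBase
except ImportError:            # not running inside calibre
    class InterfaceActionBase(object):
        pass


class UpperizeDBPlugin(InterfaceActionBase):
    # All values below are placeholders for illustration
    name = 'Upperize DB'
    description = 'Convert titles and authors to upper case'
    supported_platforms = ['windows', 'osx', 'linux']
    author = 'Xwang'
    version = (0, 0, 1)
    minimum_calibre_version = (0, 9, 0)

    # 'module:class' of the code that does the actual work
    actual_plugin = 'calibre_plugins.upperize_db.action:UpperizeDBAction'
```

The class named after the colon in `actual_plugin` has to exist under exactly that name in the referenced module.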
|
10-19-2012, 06:05 PM | #8 | |
Connoisseur
Posts: 77
Karma: 2136220
Join Date: Sep 2012
Device: none
|
Quote:
Tomorrow I'll study the Extract ISBN plugin you suggested previously. Xwang |
|
10-21-2012, 08:45 AM | #9 |
Connoisseur
Posts: 77
Karma: 2136220
Join Date: Sep 2012
Device: none
|
I'm trying to get the plugin running, but when I try to load it into calibre I get the following error:
Code:
Traceback (most recent call last):
  File "/usr/lib/calibre/calibre/gui2/preferences/plugins.py", line 316, in add_plugin
    self.check_for_add_to_toolbars(plugin)
  File "/usr/lib/calibre/calibre/gui2/preferences/plugins.py", line 406, in check_for_add_to_toolbars
    plugin_action = plugin.load_actual_plugin(self.gui)
  File "/usr/lib/calibre/calibre/customize/__init__.py", line 543, in load_actual_plugin
    ac = getattr(importlib.import_module(mod), cls)(gui,
AttributeError: 'module' object has no attribute 'UpperizeDBAction'
Moreover, since I'm using some code from the Extract ISBN plugin (namely the common_utils.py file), I've kept its original copyright notice. Should I also add something to my own code to make clear that I'm using someone else's code?
Thank you,
Xwang |
10-21-2012, 11:44 AM | #10 |
Plugin Developer
Posts: 6,388
Karma: 3966377
Join Date: Dec 2011
Location: Midwest USA
Device: Kindle Paperwhite(10th)
|
Actually, that's not the first error I get:
Code:
calibre, version 0.9.3
ERROR: Unhandled exception: <b>SyntaxError</b>:invalid syntax (calibre_plugins.upperize_db.action, line 39)
Traceback (most recent call last):
  File "site-packages\calibre\gui2\preferences\plugins.py", line 316, in add_plugin
  File "site-packages\calibre\gui2\preferences\plugins.py", line 406, in check_for_add_to_toolbars
  File "site-packages\calibre\customize\__init__.py", line 543, in load_actual_plugin
  File "importlib\__init__.py", line 37, in import_module
  File "site-packages\calibre\customize\zipplugin.py", line 147, in load_module
  File "calibre_plugins.upperize_db.action", line 39
    def upperizedb(self)
                       ^
SyntaxError: invalid syntax
If you're running calibre as 'calibre-debug -g' from the CLI (which I always do), you also see this error on the console:
Code:
Traceback (most recent call last):
  File "site-packages\calibre\gui2\ui.py", line 127, in __init__
  File "site-packages\calibre\gui2\ui.py", line 141, in init_iaction
  File "site-packages\calibre\customize\__init__.py", line 543, in load_actual_plugin
  File "importlib\__init__.py", line 37, in import_module
  File "site-packages\calibre\customize\zipplugin.py", line 150, in load_module
  File "calibre_plugins.upperize_db.action", line 23, in <module>
NameError: name 'InterfaceAction' is not defined |
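Both messages point at `action.py`: the `SyntaxError` marks a `def` line without its trailing colon, and the `NameError` says that `InterfaceAction` is used without having been imported. A minimal sketch of what a corrected file could look like, assuming the `calibre.gui2.actions.InterfaceAction` base class used by interface plugins. The menu text and tooltip are placeholders, the body of `upperizedb` is the core snippet from earlier in the thread, and the fallback class exists only so the sketch can be loaded outside calibre.

```python
# action.py (sketch, not the poster's actual file)
try:
    from calibre.gui2.actions import InterfaceAction   # avoids the NameError
except ImportError:            # not running inside calibre
    class InterfaceAction(object):
        pass


class UpperizeDBAction(InterfaceAction):
    name = 'Upperize DB'
    # (text, icon, tooltip, keyboard shortcut); values are placeholders
    action_spec = ('Upperize DB', None,
                   'Convert titles and authors to upper case', None)

    def genesis(self):
        # Run upperizedb when the toolbar button / menu entry is triggered
        self.qaction.triggered.connect(self.upperizedb)

    def upperizedb(self):      # note the colon, missing in the failing file
        db = self.gui.current_db
        bookids = db.search_getting_ids(
            'title:"~[a-z]" or author:"~[a-z]"', None)
        for bookid in bookids:
            mi = db.get_metadata(bookid, index_is_id=True)
            mi.title = mi.title.upper()
            mi.authors = [auth.upper() for auth in mi.authors]
            db.set_metadata(bookid, mi)
        db.refresh_ids(bookids)
```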
10-21-2012, 03:51 PM | #11 | |
Connoisseur
Posts: 77
Karma: 2136220
Join Date: Sep 2012
Device: none
|
Quote:
Now the plugin works, but I've discovered that it is also necessary to upperize the extension. Is there any way to access it, so that I can upperize it in the same way as titles and authors? Xwang |
|
10-21-2012, 04:14 PM | #12 |
Plugin Developer
Posts: 6,388
Karma: 3966377
Join Date: Dec 2011
Location: Midwest USA
Device: Kindle Paperwhite(10th)
|
Well, that is why I asked if you'd already tested doing it manually to make sure it worked...
Why do you need the extensions upcased? You described the problem as conflicts between files such as Aaa and AAA being different files on linux, but the same file on Windows. |
10-21-2012, 04:50 PM | #13 | |
Connoisseur
Posts: 77
Karma: 2136220
Join Date: Sep 2012
Device: none
|
Quote:
Code:
bookids = db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)
However, I've searched through the code a bit and discovered that file extensions are forced to lower case (see the function format_abspath in database2.py):
Code:
def format_abspath(self, index, format, index_is_id=False):
    '''
    Return absolute path to the ebook file of format `format`

    WARNING: This method will return a dummy path for a network backend DB,
    so do not rely on it, use format(..., as_path=True) instead.

    Currently used only in calibredb list, the viewer and the catalogs (via
    get_data_as_dict()).

    Apart from the viewer, I don't believe any of the others do any file
    I/O with the results of this call.
    '''
    id = index if index_is_id else self.id(index)
    try:
        name = self.format_filename_cache[id][format.upper()]
    except:
        return None
    if name:
        path = os.path.join(self.library_path, self.path(id, index_is_id=True))
        format = ('.' + format.lower()) if format else ''
        fmt_path = os.path.join(path, name+format)
        if os.path.exists(fmt_path):
            return fmt_path
        try:
            candidates = glob.glob(os.path.join(path, '*'+format))
        except: # If path contains strange characters this throws an exc
            candidates = []
        if format and candidates and os.path.exists(candidates[0]):
            try:
                shutil.copyfile(candidates[0], fmt_path)
            except:
                # This can happen if candidates[0] or fmt_path is too long,
                # which can happen if the user copied the library from a
                # non windows machine to a windows machine.
                return None
            return fmt_path
Moreover, after some more tests, I've discovered that if an author has more than one book, the books are correctly upper cased, but the author name remains unchanged.
Finally, when I open the metadata page in calibre, I see a situation like the one in the attached snapshot, where the author sort and title sort fields are still not upper cased (highlighted in red). (In the snapshot I had manually forced the author sort, so it now appears upper cased.)
Xwang |
|
10-21-2012, 04:59 PM | #14 |
Plugin Developer
Posts: 6,388
Karma: 3966377
Join Date: Dec 2011
Location: Midwest USA
Device: Kindle Paperwhite(10th)
|
In addition to mi.title and mi.authors, try doing the upper steps on mi.title_sort and mi.author_sort? That might do it.
As for authors with more than one book, there's also some author metadata kept outside the books. You might try something like this in addition to (or instead of) setting the authors on each book's mi object.
Code:
autid=db.get_author_id(authorname)
db.rename_author(autid, authorname.upper()) |
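The two suggestions in this post can be wrapped in small helpers that take the metadata object and the `db` object as arguments. This is a sketch that uses only the calls and attributes named in the thread (`get_author_id`, `rename_author`, `title_sort`, `author_sort`). The function names are hypothetical, and skipping names that are already upper case or that the database does not know is an addition not taken from the thread.

```python
def upperize_sort_fields(mi):
    """Upper-case the sort fields of a metadata object in place."""
    if mi.title_sort:
        mi.title_sort = mi.title_sort.upper()
    if mi.author_sort:
        mi.author_sort = mi.author_sort.upper()
    return mi


def upperize_author_records(db, author_names):
    """Rename the author records themselves.

    Returns the list of original names that were renamed.
    """
    changed = []
    for name in author_names:
        upper = name.upper()
        if upper == name:
            continue                     # already upper case
        author_id = db.get_author_id(name)
        if author_id is None:
            continue                     # author not found in the database
        db.rename_author(author_id, upper)
        changed.append(name)
    return changed
```

In the main loop, `upperize_sort_fields(mi)` would be called before `db.set_metadata(bookid, mi)`, and `upperize_author_records(db, mi.authors)` once per book or once over the collected set of author names.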
10-21-2012, 06:02 PM | #15 | |
Connoisseur
Posts: 77
Karma: 2136220
Join Date: Sep 2012
Device: none
|
Quote:
I've done as you suggested and now the db is correctly upper cased. I attach the latest version of the plugin in case you would like to have a look at it. The only open issue at the moment is that it renames the whole db again if I run it twice. To solve this, maybe I can add a boolean column to the db and set it to yes when the plugin upper cases a book. The plugin would then check that value and modify a book only if it is not set to yes. I've already added the column to my test db with the name 'is_upper_case_db'; it will be yes only if the book has already been upper cased. The question is: how can I read that value to decide whether I have to upper case the book? Xwang |
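One way to avoid redoing books is to decide in Python whether a book still needs converting, instead of relying only on the search result. A minimal sketch with a hypothetical helper `needs_upper_casing`: it compares each string with its upper-cased form and also honours the flag column if it is set. Reading the column with `mi.get('#is_upper_case_db')` assumes that calibre's metadata object exposes custom columns under their `#` lookup name; that part should be checked against the calibre version in use.

```python
FLAG_COLUMN = '#is_upper_case_db'   # assumed lookup name of the custom column


def needs_upper_casing(mi, flag_column=FLAG_COLUMN):
    """Return True if the title or any author still contains lower-case text."""
    if mi.get(flag_column):          # already marked as converted
        return False
    texts = [mi.title or ''] + list(mi.authors or [])
    return any(text != text.upper() for text in texts)
```

In the loop, wrapping the update in `if needs_upper_casing(mi):` would skip books whose title and authors are already upper case, even without the extra column.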
|
|