MobileRead Forums

Xwang 10-13-2012 09:43 AM

Plugin to transform database to upper case
 
Hi to all,
I have the Windows portable version of calibre installed on an NTFS-formatted external USB drive, and I access the same library from the Linux version when I'm running Linux. I need to be able to add books from both Linux and Windows, since I alternate between the two operating systems (Windows at work, Linux at home).

Alternating between Windows and Linux causes problems because the two operating systems handle NTFS differently (see the PS below), so I would like to write one plugin (or more, if necessary) that converts the existing database to upper case and keeps it that way when books are added.
At the same time, the plugin(s) should avoid creating file paths longer than 256 characters.

To convert the existing database, I thought of writing a plugin that, for each book in the database, changes the author name(s) and the title to upper case and appends a specific string ('_MYTEMP') to each of them (the suffix is needed to force the operating system to rename the files and directories even if the file system is case-insensitive). After the changes are saved, the plugin removes the suffix from the names and the title and saves the changes again.
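
The two-step rename described above can be illustrated on plain files, outside calibre. This is only a minimal sketch of the idea: `rename_to_upper` and `TEMP_SUFFIX` are hypothetical names, and calibre itself renames book folders when the metadata changes, not through a helper like this one.

```python
import os
import tempfile

TEMP_SUFFIX = "_MYTEMP"  # the marker string proposed in the post


def rename_to_upper(directory, name):
    """Rename `name` inside `directory` to upper case via a temporary name.

    The intermediate name differs from the original in more than letter
    case, so each of the two renames is a real name change.
    """
    upper_name = name.upper()
    if upper_name == name:
        return name  # already upper case, nothing to do
    src = os.path.join(directory, name)
    tmp = os.path.join(directory, upper_name + TEMP_SUFFIX)
    dst = os.path.join(directory, upper_name)
    os.rename(src, tmp)  # step 1: upper case plus marker
    os.rename(tmp, dst)  # step 2: drop the marker
    return upper_name


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        open(os.path.join(folder, "Book - Author.pdf"), "w").close()
        rename_to_upper(folder, "Book - Author.pdf")
        print(os.listdir(folder))
```

The early return for names that are already upper case also makes a second run a no-op.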

So I expect that, once the plugin has run, the original file tree
Code:

Federal Aviation Administration
├── FAA Helicopter Flying Handbook - 8083-21 (292)
│   ├── cover.jpg
│   ├── FAA Helicopter Flying Handbook - 8083-21 - Federal Aviation Administration.pdf
│   └── metadata.opf
├── Pilot's Handbook of Aeronautical Knowled (291)
│   ├── cover.jpg
│   ├── metadata.opf
│   └── Pilot's Handbook of Aeronautical Knowled - Federal Aviation Administration.pdf
└── Special Federal Aviation Regulations SFA (293)
    ├── cover.jpg
    ├── metadata.opf
    └── Special Federal Aviation Regulations SFA - Federal Aviation Administration.pdf

is changed as follows, regardless of the OS in use:
Code:

FEDERAL AVIATION ADMINISTRATION
├── FAA HELICOPTER FLYING HANDBOOK - 8083-21 (292)
│   ├── cover.jpg
│   ├── FAA HELICOPTER FLYING HANDBOOK - 8083-21 - FEDERAL AVIATION ADMINISTRATION.PDF
│   └── METADATA.OPF
├── PILOT'S HANDBOOK OF AERONAUTICAL KNOWLED (291)
│   ├── cover.jpg
│   ├── metadata.opf
│   └── PILOT'S HANDBOOK OF AERONAUTICAL KNOWLED - FEDERAL AVIATION ADMINISTRATION.PDF
└── SPECIAL FEDERAL AVIATION REGULATIONS SFA (293)
    ├── cover.jpg
    ├── metadata.opf
    └── SPECIAL FEDERAL AVIATION REGULATIONS SFA - FEDERAL AVIATION ADMINISTRATION.PDF

It would then be nice to have a plugin that does the same thing on save, to keep the library upper case (if this second plugin is difficult, maybe I can modify the first one to check whether the title and author are already upper case before modifying them).

Obviously, the plugin(s) should work with any file type.
So my (initial) questions are:
1) Which type of plugin should I write? A FileTypePlugin or a metadata one?
2) How can I loop over all the books?

Thank you,
Xwang


PS: The biggest difference is that Linux can create multiple files whose names differ only in case, and such files are not visible under Windows. The other problem is that Windows has a maximum path length of 256 characters, which Linux does not have, so some books end up unreadable under Windows.

PS2: I would prefer to implement this as a plugin because I don't have much time to maintain a personal branch of the source code, which would need to be realigned with upstream every time it changes.
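
Both problems from the PS can be checked for before touching the library. The following is a rough sketch on plain path strings: the function names are hypothetical, the 256-character limit is the figure quoted in the post (the limit actually enforced may differ, so it is a parameter), and lower-casing is only an approximation of how a case-insensitive file system compares names.

```python
from collections import defaultdict

MAX_PATH_LENGTH = 256  # limit quoted in the post; pass another value if needed


def find_case_collisions(paths):
    """Group distinct paths that are identical once letter case is ignored."""
    groups = defaultdict(set)
    for path in paths:
        groups[path.lower()].add(path)
    return [sorted(group) for group in groups.values() if len(group) > 1]


def find_long_paths(paths, limit=MAX_PATH_LENGTH):
    """Return the paths whose length exceeds `limit` characters."""
    return [path for path in paths if len(path) > limit]


if __name__ == "__main__":
    library = [
        "Federal Aviation Administration/cover.jpg",
        "FEDERAL AVIATION ADMINISTRATION/cover.jpg",
        "Some Author/Some Book (1)/metadata.opf",
    ]
    print(find_case_collisions(library))
    print(find_long_paths(library + ["x" * 300]))
```

Running such a check on the Linux side would show which entries become invisible or unreadable on Windows.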

JSWolf 10-19-2012 10:10 AM

There isn't going to be enough call for someone to write such a plugin. It's too limited, and not enough people would use it to make it worthwhile.

JimmXinu 10-19-2012 12:47 PM

That doesn't stop Xwang from writing one for his own use, though. :)

First, Xwang, have you proven that changing the author/title to uppercase like that solves your problem? You tried it manually with a smaller set of books, that is?

Assuming so, I suggest a UI plugin that searches for titles/authors with lower case and updates the metadata on command.

One place you could start is with the Extract ISBN plugin. It's the simplest plugin I know of that modifies metadata. You don't need the whole background processing part, but the technique used to update isbn can probably be adapted to update title/authors instead.

Another possible way to do it is this:

Code:

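# Runs inside the plugin's InterfaceAction, where self.gui is calibre's main window.
db = self.gui.current_db
# Ids of books whose title or author(s) match the regex [a-z]; None means no restriction.
bookids = db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)

for bookid in bookids:
    mi = db.get_metadata(bookid, index_is_id=True)
    mi.title = mi.title.upper()
    mi.authors = [auth.upper() for auth in mi.authors]
    db.set_metadata(bookid, mi)

# Refresh calibre's cached data for the changed books.
db.refresh_ids(bookids)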

I haven't tested it, so I doubt it would work exactly as is, but it's a starting point.

Xwang 10-19-2012 06:20 PM

First of all, thank you for your help.
I'm pretty sure that converting the db to upper case and keeping it that way is enough to solve my problem; however, it has to be done in two steps:
first I have to convert the titles and authors to upper case while appending a special string to both; then I have to remove the special string.
I'm not a Python expert, but I suppose that appending the string and removing it is not a problem, so I can use your code as a base and run the for loop twice.
I have two questions:
1) Does bookid change when the title or authors are changed?
2) What exactly does "db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)" do?

My idea is something like this:
Code:

db = self.gui.current_db
bookids = db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)
# Temporary marker; the u prefix keeps it a unicode string on Python 2 as well.
suffix = u'_T#@§'

# First pass: upper-case the names and append the marker.
for bookid in bookids:
    mi = db.get_metadata(bookid, index_is_id=True)
    mi.title = mi.title.upper() + suffix
    mi.authors = [auth.upper() + suffix for auth in mi.authors]
    db.set_metadata(bookid, mi)

# Second pass: strip the marker again.
for bookid in bookids:
    mi = db.get_metadata(bookid, index_is_id=True)
    mi.title = mi.title[:-len(suffix)]
    mi.authors = [auth[:-len(suffix)] for auth in mi.authors]
    db.set_metadata(bookid, mi)

db.refresh_ids(bookids)

This should work if bookid does not change; otherwise I have to run a new search between the two for loops.

Xwang

JimmXinu 10-19-2012 06:45 PM

Glad to help. :)

Quote:

Originally Posted by Xwang (Post 2271776)
I have two questions:
1) Does bookid change when the title or authors are changed?
2) What exactly does "db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)" do?

bookid doesn't change.

"db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)" searches the library and gives you back ids (rather than row numbers in the current view) for the search 'title:"~[a-z]" or author:"~[a-z]"' with no restriction (the None).

'title:"~[a-z]" or author:"~[a-z]"' is a search combining two regular expressions: 'any book with a title containing the letters a-z (not A-Z)' or 'any book with author(s) containing the letters a-z (not A-Z)'.

Rather than loop twice, you might just change the name twice in the same loop.

Xwang 10-19-2012 06:56 PM

Here is the code with a single loop:
Code:

db = self.gui.current_db
bookids = db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)
# Temporary marker; the u prefix keeps it a unicode string on Python 2 as well.
suffix = u'_T#@§'

for bookid in bookids:
    # Step 1: upper-case the names and append the marker.
    mi = db.get_metadata(bookid, index_is_id=True)
    mi.title = mi.title.upper() + suffix
    mi.authors = [auth.upper() + suffix for auth in mi.authors]
    db.set_metadata(bookid, mi)

    # Step 2: reload the metadata and strip the marker again.
    mi = db.get_metadata(bookid, index_is_id=True)
    mi.title = mi.title[:-len(suffix)]
    mi.authors = [auth[:-len(suffix)] for auth in mi.authors]
    db.set_metadata(bookid, mi)

db.refresh_ids(bookids)

Are the changes saved to disk when db.set_metadata(bookid,mi) is executed?
Which module should I import?
Xwang

JimmXinu 10-19-2012 07:02 PM

Quote:

Originally Posted by Xwang (Post 2271810)
Are the changes saved to disk when db.set_metadata(bookid,mi) is executed?
Which module should I import?
Xwang

I believe so.

As for which module to import, you need to set up a whole plugin; this is just the core snippet.

There's official documentation, but I learned even more from examining the code for existing plugins.

Xwang 10-19-2012 07:05 PM

Ok!
Tomorrow I'll study the Extract ISBN plugin you suggested previously.
Xwang

Xwang 10-21-2012 09:45 AM

1 Attachment(s)
I'm trying to get the plugin running, but when I try to import it into calibre I get the following error:
Code:

Traceback (most recent call last):
  File "/usr/lib/calibre/calibre/gui2/preferences/plugins.py", line 316, in add_plugin
    self.check_for_add_to_toolbars(plugin)
  File "/usr/lib/calibre/calibre/gui2/preferences/plugins.py", line 406, in check_for_add_to_toolbars
    plugin_action = plugin.load_actual_plugin(self.gui)
  File "/usr/lib/calibre/calibre/customize/__init__.py", line 543, in load_actual_plugin
    ac = getattr(importlib.import_module(mod), cls)(gui,
AttributeError: 'module' object has no attribute 'UpperizeDBAction'

What's the problem?

Moreover, since I'm using some code from the Extract ISBN plugin (namely the common_utils.py file), I've kept its original copyright notice. Should I also add something to my own code to point out that I'm using someone else's code in it?

Thank you,
Xwang

JimmXinu 10-21-2012 12:44 PM

Actually, that's not the first error I get:
Code:

calibre, version 0.9.3
ERROR: Unhandled exception: <b>SyntaxError</b>:invalid syntax (calibre_plugins.upperize_db.action, line 39)

Traceback (most recent call last):
  File "site-packages\calibre\gui2\preferences\plugins.py", line 316, in add_plugin
  File "site-packages\calibre\gui2\preferences\plugins.py", line 406, in check_for_add_to_toolbars
  File "site-packages\calibre\customize\__init__.py", line 543, in load_actual_plugin
  File "importlib\__init__.py", line 37, in import_module
  File "site-packages\calibre\customize\zipplugin.py", line 147, in load_module
  File "calibre_plugins.upperize_db.action", line 39
    def upperizedb(self)
                      ^
SyntaxError: invalid syntax

But that's a simple missing ':' on line 39. After fixing that, I get your error--from the GUI.

If you're running calibre as 'calibre-debug -g' from CLI (which I always do), you also see this error on the console:

Code:

Traceback (most recent call last):
  File "site-packages\calibre\gui2\ui.py", line 127, in __init__
  File "site-packages\calibre\gui2\ui.py", line 141, in init_iaction
  File "site-packages\calibre\customize\__init__.py", line 543, in load_actual_plugin
  File "importlib\__init__.py", line 37, in import_module
  File "site-packages\calibre\customize\zipplugin.py", line 150, in load_module
  File "calibre_plugins.upperize_db.action", line 23, in <module>
NameError: name 'InterfaceAction' is not defined

Now we know the real problem: You need to import InterfaceAction in action.py. It's commented out. :)
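
For reference, the top of action.py could look roughly like the sketch below. The class name `UpperizeDBAction` and the method name `upperizedb` come from the tracebacks; the `name` and `action_spec` values are placeholders, and the fallback class exists only so the sketch can be loaded outside calibre.

```python
try:
    from calibre.gui2.actions import InterfaceAction
except ImportError:
    # Stand-in so the sketch can be loaded outside calibre; inside calibre
    # the real base class imported above is used.
    class InterfaceAction(object):
        pass


class UpperizeDBAction(InterfaceAction):
    # Placeholder values, not taken from the attached plugin.
    name = 'Upperize DB'
    # (text, icon, tooltip, keyboard shortcut)
    action_spec = ('Upperize DB', None,
                   'Convert titles and authors to upper case', None)

    def genesis(self):
        # Called by calibre when the action is set up: hook up the menu entry.
        self.qaction.triggered.connect(self.upperizedb)

    def upperizedb(self):  # note the trailing colon that line 39 was missing
        db = self.gui.current_db
        # ... the metadata loop from earlier in the thread goes here ...
        return db
```

With the import active, the class body can be evaluated and calibre can find `UpperizeDBAction` in the module.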

Xwang 10-21-2012 04:51 PM

Thank you for your help!
Now the plugin works, but I've discovered that it is also necessary to upper-case the extension.
Is there any way to access it, so that I can upper-case it the same way as the titles and authors?
Xwang

JimmXinu 10-21-2012 05:14 PM

Well, that is why I asked if you'd already tested doing it manually to make sure it worked...

Why do you need the extensions upcased? You described the problem as conflicts between files such as Aaa and AAA being different files on linux, but the same file on Windows.

Xwang 10-21-2012 05:50 PM

1 Attachment(s)
I fear that the lower-case extension is seen as part of the title, because if I run the plugin after the db has already been upper-cased, I see that it renames all the folders again. It seems that this line:
Code:

bookids = db.search_getting_ids('title:"~[a-z]" or author:"~[a-z]"', None)
returns all the books in the library.

However, I searched the code a bit and discovered that file extensions are forced to lower case (see the function format_abspath in database2.py):

Code:

def format_abspath(self, index, format, index_is_id=False):
        '''
        Return absolute path to the ebook file of format `format`

        WARNING: This method will return a dummy path for a network backend DB,
        so do not rely on it, use format(..., as_path=True) instead.

        Currently used only in calibredb list, the viewer and the catalogs (via
        get_data_as_dict()).

        Apart from the viewer, I don't believe any of the others do any file
        I/O with the results of this call.
        '''
        id = index if index_is_id else self.id(index)
        try:
            name = self.format_filename_cache[id][format.upper()]
        except:
            return None
        if name:
            path = os.path.join(self.library_path, self.path(id, index_is_id=True))
            format = ('.' + format.lower()) if format else ''
            fmt_path = os.path.join(path, name+format)
            if os.path.exists(fmt_path):
                return fmt_path
            try:
                candidates = glob.glob(os.path.join(path, '*'+format))
            except: # If path contains strange characters this throws an exc
                candidates = []
            if format and candidates and os.path.exists(candidates[0]):
                try:
                    shutil.copyfile(candidates[0], fmt_path)
                except:
                    # This can happen if candidates[0] or fmt_path is too long,
                    # which can happen if the user copied the library from a
                    # non windows machine to a windows machine.
                    return None
                return fmt_path

If that is indeed the case, the problem becomes changing the search line in my plugin so that it returns only the books/authors that are not already upper case.
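
One way to sidestep the search is to compare each value with its upper-cased form in Python. The sketch below rests on two assumptions: that calibre's regex searches ignore letter case (which would explain why `~[a-z]` matches every book, whatever the extension), and that `db` is the database object used earlier in the thread, offering `all_ids()` and `get_metadata()`. `needs_upper` and `ids_needing_upper` are hypothetical helper names.

```python
def needs_upper(title, authors):
    """Return True if the title or any author is not fully upper case."""
    values = [title] + list(authors)
    return any(value != value.upper() for value in values)


def ids_needing_upper(db):
    """Collect the ids of books that still contain lower-case letters.

    Assumption: `db` offers all_ids() and get_metadata(id, index_is_id=True),
    as the database object used in the plugin is expected to.
    """
    result = []
    for book_id in db.all_ids():
        mi = db.get_metadata(book_id, index_is_id=True)
        if needs_upper(mi.title, mi.authors):
            result.append(book_id)
    return result
```

In the plugin, `bookids = ids_needing_upper(db)` would then take the place of the `search_getting_ids` line, so a second run finds nothing left to rename.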

Moreover, after some more tests, I've discovered that if an author has more than one book, the books are correctly upper-cased, but the author name remains unchanged.

Finally, when I open the metadata page in calibre, I see a situation like the one in the attached snapshot, where the author sort and title sort fields are still not upper case (highlighted in red). (In the snapshot I've manually forced the author sort, so now it appears in upper case.)
Xwang

JimmXinu 10-21-2012 05:59 PM

In addition to mi.title and mi.authors, try doing the upper steps on mi.title_sort and mi.author_sort? That might do it.

As for authors with more than one book, there's also some author metadata kept outside the books. You might try something like this in addition to (or instead of) setting the authors on each book's mi object.

Code:

autid=db.get_author_id(authorname)
db.rename_author(autid, authorname.upper())


Xwang 10-21-2012 07:02 PM

1 Attachment(s)
Well, I don't know how to thank you for your help.
I did as you suggested, and now the db is correctly upper-cased.
I'm attaching the latest version of the plugin in case you would like to have a look at it.
The only open issue at the moment is that it still renames the whole db if I run it twice.
To solve this, maybe I can add an additional boolean field to the db, and when a book is upper-cased by the plugin, the field is set to yes.
The plugin logic should then be modified to look at that value and modify a book only if the field is not set to yes.
I've already added the field to my test db with the name 'is_upper_case_db'; it will be yes only if the book has already been upper-cased.
The question is: how can I read that field to find out whether I have to upper-case the book?

Xwang
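
A possible way to read and set such a field, as a hedged sketch: it assumes the database object used in the thread exposes `get_custom()` and `set_custom()` methods that take the column's lookup name (without the leading `#`) as `label`. Those signatures are an assumption here and should be checked against the installed calibre version; `ids_not_yet_upper` and `mark_as_upper` are hypothetical helper names.

```python
FLAG_LABEL = 'is_upper_case_db'  # lookup name of the yes/no column, without '#'


def ids_not_yet_upper(db, label=FLAG_LABEL):
    """Return the ids of books whose flag column is not set to yes.

    Assumption: db.get_custom(id, label=..., index_is_id=True) returns
    True, False or None for a yes/no column.
    """
    return [book_id for book_id in db.all_ids()
            if not db.get_custom(book_id, label=label, index_is_id=True)]


def mark_as_upper(db, book_id, label=FLAG_LABEL):
    """Set the flag once the book has been upper-cased."""
    db.set_custom(book_id, True, label=label)
```

The plugin would loop over `ids_not_yet_upper(db)` and call `mark_as_upper` after each successful rename. Note that a flag only records what the plugin has done; it says nothing about books edited by hand afterwards.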


All times are GMT -4.