Old 07-07-2010, 09:45 AM   #1
Starson17
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
For Charles/Chaley Merging user defined metadata

Charles,
While reading one of the recent threads, I realized that my Merge code was written before your custom column code. The merge code finds the first record selected (the "dest" destination record), then loops through the other selected book records (the "src" source records) in the order they were selected. For each src record, it copies the book formats in that record (provided that format doesn't already exist in the dest record) into the dest record.
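The format-copying pass described above can be sketched roughly as follows. This is only an illustration using plain dicts as stand-ins for book records; the function name `merge_formats` and the record layout are assumptions made for the example, not calibre's actual code.

```python
def merge_formats(dest_formats, src_records):
    """Copy formats from each source record into dest, in selection order.

    dest_formats: dict mapping a format name (e.g. 'EPUB') to a file path.
    src_records:  list of such dicts, one per source book.
    A format that already exists in dest is never replaced.
    """
    for src_formats in src_records:
        for fmt, path in src_formats.items():
            if fmt not in dest_formats:
                dest_formats[fmt] = path
    return dest_formats


# Hypothetical sample data: dest already has an EPUB, so only MOBI and PDF are added.
dest = {'EPUB': 'dest.epub'}
sources = [{'EPUB': 'a.epub', 'MOBI': 'a.mobi'},
           {'MOBI': 'b.mobi', 'PDF': 'b.pdf'}]
merged = merge_formats(dest, sources)
```

Because sources are visited in selection order, the first source that offers a missing format wins.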

Then it loops through the src records again to merge in the metadata. For each type of metadata, there is a check to see whether it exists in the source metadata information (src_mi) but not in the dest (dest_mi); if so, it's written into the dest record. It occurred to me that there are no checks (at least none written by me) for user-defined metadata. Thus, merge would lose that data if it's present only in a src record. Some types of metadata are handled in a special way (comments are appended; authors may be "Unknown," so they exist but are still handled as if they don't; etc.).
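A minimal sketch of those metadata rules, assuming each record's metadata is a plain dict. The helper names, the dict layout, and the blank-line separator used when appending comments are assumptions for illustration; the real code works on metadata objects.

```python
UNKNOWN = 'Unknown'


def is_empty(name, value):
    """None, '' and [] count as empty; an author list of ['Unknown'] does too."""
    if name == 'authors' and value == [UNKNOWN]:
        return True
    return value is None or value == '' or value == []


def merge_metadata(dest_mi, src_mis):
    """Fill empty dest fields from the sources; append comments instead."""
    for src_mi in src_mis:
        for name, value in src_mi.items():
            if is_empty(name, value):
                continue
            if name == 'comments':
                # Comments are appended rather than overwritten.
                if is_empty(name, dest_mi.get(name)):
                    dest_mi[name] = value
                else:
                    dest_mi[name] = dest_mi[name] + '\n\n' + value
            elif is_empty(name, dest_mi.get(name)):
                dest_mi[name] = value
    return dest_mi


# Hypothetical sample data.
dest_mi = {'title': 'T', 'authors': ['Unknown'], 'comments': 'first', 'cover': None}
src_mi = {'title': 'Other', 'authors': ['A. Writer'], 'comments': 'second',
          'cover': 'c.jpg'}
result = merge_metadata(dest_mi, [src_mi])
```

Note that nothing in this loop knows about user-defined columns, which is exactly the gap described above.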

All this "action" happens starting at line 802 of calibre\gui2\actions.py (pun not intended) in the merge_metadata(self, dest_id, src_ids) function.

Here's a quick example of the merger of the cover:
Code:
if src_mi.cover and not dest_mi.cover:
    dest_mi.cover = src_mi.cover
I did a brief test of a custom user defined Yes/No column and it was not merged. I wonder if you would look it over and add in appropriate tests to merge in the user defined custom columns. I'm not up to speed on how to handle your new metadata, but I suspect it would only take you a few minutes to fix it.

Thanks.

http://bugs.calibre-ebook.com/ticket/6120

Last edited by Starson17; 07-07-2010 at 10:09 AM.
Old 07-07-2010, 10:39 AM   #2
chaley
Grand Sorcerer
Posts: 11,741
Karma: 6997045
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
Will do, but perhaps not for a few days. My wife found me a short consulting gig, and in a fit of madness I said yes. Now I must deliver on it.
Old 07-07-2010, 10:46 AM   #3
Starson17
Quote:
Originally Posted by chaley View Post
Will do, but perhaps not for a few days. My wife found me a short consulting gig, and in a fit of madness I said yes. Now I must deliver on it.
No problem. No one has found it yet. If I get time, I may look at how you handle your metadata. With the previous limited set of predefined metadata, I just checked all the metadata fields. With the new user-defined fields, I suppose I'd have to ask for a list, then cycle through that list. Some may have to be handled in special ways. I don't know how to do either: ask for the list, or figure out which items might need special handling during merge.

(Maybe "a few minutes" to do this was excessively optimistic, even for one of your talents.)
Old 07-07-2010, 10:58 AM   #4
chaley
Actually, it probably is a few minutes.

Look at the new class library.field_metadata, available as db.field_metadata. This is a dict, keyed by the attribute name. The data fetched using the key has everything needed, including the datatype and its column number for fetching and storing the information.

Something like:
Code:
  for key in db.field_metadata:
    if db.field_metadata[key]['is_custom']:
      col_num = db.field_metadata[key]['col_num']
      # now do what is needed, according to type. col_num is used
      # to get the value you are working with. something like
      from_record_value = db.get_custom(from_id, num=col_num)
      # process ... (compute val from from_record_value)
      db.set_custom(to_id, val, num=col_num)
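To make the loop above concrete, here is a self-contained sketch that runs against a small stand-in object. `FakeDB` is hypothetical and models only the calls named in the post (`get_custom`, `set_custom`, `field_metadata`); the key names `is_custom` and `col_num` are taken from the post as written and may differ in the real code. The rule applied (fill the destination only when it is None) is an assumption for this example.

```python
class FakeDB:
    """Hypothetical stand-in for the db object; not the real calibre API."""

    def __init__(self, field_metadata, values):
        self.field_metadata = field_metadata
        self._values = values  # {(book_id, col_num): value}

    def get_custom(self, book_id, num):
        return self._values.get((book_id, num))

    def set_custom(self, book_id, val, num):
        self._values[(book_id, num)] = val


def merge_custom_columns(db, dest_id, src_ids):
    """Fill each empty custom column in dest from the first source that has it."""
    for key in db.field_metadata:
        meta = db.field_metadata[key]
        if not meta['is_custom']:
            continue
        col_num = meta['col_num']
        for src_id in src_ids:
            # Compare with None, not truthiness: False is a real Yes/No value.
            if db.get_custom(dest_id, num=col_num) is not None:
                break
            src_val = db.get_custom(src_id, num=col_num)
            if src_val is not None:
                db.set_custom(dest_id, src_val, num=col_num)


# Hypothetical sample data: one standard field and one custom Yes/No column.
fm = {'title': {'is_custom': False},
      '#read': {'is_custom': True, 'col_num': 1, 'datatype': 'bool'}}
db = FakeDB(fm, {(2, 1): False, (3, 1): True})
merge_custom_columns(db, dest_id=1, src_ids=[2, 3])
```

Here book 1 ends up with the value from book 2 (False), because book 2 was selected first and False counts as a set value.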
Old 07-07-2010, 11:37 AM   #5
Starson17
Quote:
Originally Posted by chaley View Post
Actually, it probably is a few minutes.
Show-off. For me it'd take longer, but thanks for the hints.

Quote:
Look at the new class library.field_metadata, available as db.field_metadata. This is a dict, keyed by the attribute name.
It looks like that includes non-custom keys as well. Correct? Will "for key in db.field_metadata:" loop through all fields, including, e.g. title and authors? If I'm going back into the code again it might make sense to make a single loop for all metadata, instead of keeping the current non-custom tests and adding a loop for the 'is_custom' fields. That way, if the old fields are ever renamed or a new non-custom is added, it will still be merged (default of: if it doesn't exist in the destination, but does exist in a source, merge it in.)

Can you think of any of your field types that need special handling? I've forgotten all of the different types, but if you have one for text, like comments, should a src be appended to the dest or ignored?

Any other types that should be appended or that have defaults that should be overwritten (like the default author "Unknown")?
Old 07-07-2010, 11:55 AM   #6
chaley
Quote:
Originally Posted by Starson17 View Post
It looks like that includes non-custom keys as well. Correct? Will "for key in db.field_metadata:" loop through all fields, including, e.g. title and authors? If I'm going back into the code again it might make sense to make a single loop for all metadata, instead of keeping the current non-custom tests and adding a loop for the 'is_custom' fields. That way, if the old fields are ever renamed or a new non-custom is added, it will still be merged (default of: if it doesn't exist in the destination, but does exist in a source, merge it in.)
Yes, it has all the fields, or at least is supposed to. I wondered whether you could unify the processing. I found in the search code that I could.

If you are going to play with standard fields, then you will want to know about field_metadata[key]['rec_index']. That field is the index into the _data record. You would use db.get(id, rec_index, row_is_id=True) (found in library.caches.py) to get the value for that field for a given db_id. Don't use the db.set function in caches.py, because the data won't be written to the DB.
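As a rough illustration of a single loop over all fields using rec_index, the sketch below reads values through a stand-in cache and only reports what should be written. `FakeCache`, the sample field_metadata, and `fields_to_fill` are hypothetical; the sketch deliberately returns the pending changes instead of writing them, since the write must go through setters that actually reach the DB.

```python
class FakeCache:
    """Hypothetical stand-in for the cache described above; not calibre's code."""

    def __init__(self, rows):
        self._data = rows  # {book_id: [value, value, ...]}

    def get(self, book_id, rec_index, row_is_id=True):
        # Only lookup by book id is modelled in this sketch.
        assert row_is_id
        return self._data[book_id][rec_index]


def fields_to_fill(db, field_metadata, dest_id, src_id):
    """Return {key: src value} for fields that are None in dest but set in src."""
    todo = {}
    for key, meta in field_metadata.items():
        idx = meta['rec_index']
        dest_val = db.get(dest_id, idx, row_is_id=True)
        src_val = db.get(src_id, idx, row_is_id=True)
        if dest_val is None and src_val is not None:
            todo[key] = src_val
    return todo


# Hypothetical sample data: standard and custom fields share one loop.
field_metadata = {
    'title':     {'rec_index': 0, 'is_custom': False},
    'publisher': {'rec_index': 1, 'is_custom': False},
    '#read':     {'rec_index': 2, 'is_custom': True},
}
cache = FakeCache({1: ['T', None, None], 2: ['U', 'Pub', False]})
pending = fields_to_fill(cache, field_metadata, dest_id=1, src_id=2)
```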

Quote:
Can you think of any of your field types that need special handling? I've forgotten all of the different types, but if you have one for text, like comments, should a src be appended to the dest or ignored?
This is really a requirements issue. Are tags (text, is_multiple=True) merged, or do they overwrite? Are comments merged, or is one overwritten? What happens with series indices? The equivalent custom fields should have the same processing.

My guess is that text (non-tag, is_multiple=False), bool, int, float, and date columns should overwrite. Make sense to you?
Quote:
Any other types that should be appended or that have defaults that should be overwritten (like the default author "Unknown")?
No. No type has a required value, and all can be set to None.
Old 07-07-2010, 12:40 PM   #7
Starson17
Quote:
Originally Posted by chaley View Post
This is really a requirements issue. Are tags (text, is_multiple=True) merged, or do they overwrite?
Tags are merged.

Quote:
Are comments merged, or is one overwritten?
Comments are appended.

Quote:
What happens with series indices?
There is currently no independent test for series index. Series index is treated as a pair with series name. If the series name is empty in the destination, then both series name and index are written into the dest from the src. If the dest already has a series name, the index is never changed during merger.
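The series rule just described can be sketched like this, with plain dicts standing in for the records (the function name and dict keys are assumptions for illustration):

```python
def merge_series(dest, src):
    """Treat series name and index as a pair.

    The pair is copied from src only when dest has no series name.
    If dest already has a series name, its index is left untouched,
    even when that index is empty.
    """
    if not dest.get('series') and src.get('series'):
        dest['series'] = src['series']
        dest['series_index'] = src.get('series_index')
    return dest


# Hypothetical sample data.
filled = merge_series({'series': None, 'series_index': 1.0},
                      {'series': 'Foo', 'series_index': 3.0})
kept = merge_series({'series': 'Bar', 'series_index': None},
                    {'series': 'Foo', 'series_index': 3.0})
```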

Quote:
The equivalent custom fields should have the same processing.
Agreed, to the extent possible.

Quote:
My guess is that text (non-tag, is_multiple=False), bool, int, float, and date columns should overwrite. Make sense to you?
Yes - ("overwrite" empty field in dest, if src is not empty)

There seem to be 4 types of text

Text has:
  • tag-like (merge, like tags)
  • comments-like (append)
  • plain-jane text (if dest is empty, fill from first src that is not empty)
  • series-like (???)

As to series-like text, how does this differ from plain-jane text? I'm thinking of someone who has set up a secondary custom series in one field with an associated (in his mind) number field for ordering. It looks like even if the series-like text field is associated with an int or float, it would still be OK to treat the fields independently. As long as the user enters something in his number field when he enters something in the series-like text field, it wouldn't be overwritten and the pair wouldn't be decoupled.
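The three text rules that are settled at this point in the discussion could be sketched as one small dispatch function. The kind names, the function name, and the comment separator are assumptions for illustration; series-like text is deliberately left without a rule because it is still an open question here.

```python
def merge_text(kind, dest_val, src_val):
    """Merge one text value according to the kind of text column."""
    if kind == 'tag-like':
        # Merge like tags: keep dest order, add unseen items from src.
        merged = list(dest_val or [])
        for item in src_val or []:
            if item not in merged:
                merged.append(item)
        return merged
    if kind == 'comments-like':
        # Append src to dest, skipping empty parts.
        parts = [p for p in (dest_val, src_val) if p]
        return '\n\n'.join(parts) if parts else None
    if kind == 'plain':
        # Fill only when dest is empty.
        return dest_val if dest_val else src_val
    raise ValueError('no merge rule decided for %r' % kind)


tags = merge_text('tag-like', ['a', 'b'], ['b', 'c'])
notes = merge_text('comments-like', 'x', 'y')
plain = merge_text('plain', None, 'p')
```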
Old 07-07-2010, 12:55 PM   #8
chaley
Quote:
Originally Posted by Starson17 View Post
As to series-like text, how does this differ from plain-jane text?
In the same way that series differs from (say) publisher. Series custom fields have an associated index.
Quote:
I'm thinking of someone who has set up a secondary custom series in one field with an associated (in his mind) number field for ordering. It looks like even if the series-like text field is associated with an int or float, it would still be OK to treat the fields independently. As long as the user enters something in his number field when he enters something in the series-like text field, it wouldn't be overwritten and the pair wouldn't be decoupled.
There isn't much you can do to maintain association correctness with columns that are paired in the user's head, so your assertion/statement makes sense. However, as series custom fields are physically paired with an index, you should apply the same rule you applied for standard series.

You might not notice the pairing for series custom fields. These fields have two pieces of information. The first is the series name, which acts like a text (is_multiple=False) field. The second is the index, which in this case is stored in the connection record in the DB, which makes it (sort-of) a column in the books table. You can get the value of the series index either from the book view (field_metadata.cc_series_index_column_for() will give you the field number) or by using db.get_custom_extra() (in custom_columns.py). The method db.set_custom() has a keyword parameter used for setting the index field at the same time the series is set.
Old 07-07-2010, 01:04 PM   #9
Starson17
Quote:
Originally Posted by chaley View Post
In the same way that series differs from (say) publisher. Series custom fields have an associated index.
That explains it.

Quote:
you should apply the same rule you applied for standard series.
Yes. Obviously, I was unaware of the above.

Again, thanks for the details.
Old 07-16-2010, 11:53 AM   #10
Starson17
Quote:
Originally Posted by chaley View Post
Look at the new class library.field_metadata, available as db.field_metadata. This is a dict, keyed by the attribute name. The data fetched using the key has everything needed, including the datatype and its column number for fetching and storing the information.

Something like:
Code:
  for key in db.field_metadata:
Charles,

I looked at this after we spoke. I saw a field that I think was called "datatype" that was populated with one of the 9 types of data that could be specified as user-defined data.

Does there happen to be a defined list of all datatypes that I can cycle through in a for loop, something like:

Code:
for datatype in list_of_all_available_datatypes:
I'd like to catch any new datatypes in that loop beyond the current 9, if any are ever added.

Thanks.
Old 07-16-2010, 12:17 PM   #11
chaley
Yes, there is, but it is buried in the custom column code.

It is an excellent idea to add such a declaration to FieldMetadata, specifically
Code:
    VALID_DATA_TYPES = frozenset(['rating', 'text', 'comments', 'datetime',
                                  'int', 'float', 'bool', 'series'])
I will do it now and submit it for Kovid's perusal.

Edit: Code has been pushed and merged into the trunk.
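One way such a declaration could be used to catch datatypes added later is sketched below. The set is copied from the post for the sake of a self-contained example (real code would reference the attribute on FieldMetadata instead); `MERGE_RULES` and `unhandled_types` are hypothetical names, and the rule for 'rating' is an assumption, since ratings were not discussed in this thread.

```python
VALID_DATA_TYPES = frozenset(['rating', 'text', 'comments', 'datetime',
                              'int', 'float', 'bool', 'series'])

# Hypothetical table of merge rules, one entry per datatype.
MERGE_RULES = {
    'text': 'merge if is_multiple, else fill if empty',
    'comments': 'append',
    'series': 'fill name and index together if dest has no series',
    'datetime': 'fill if empty',
    'int': 'fill if empty',
    'float': 'fill if empty',
    'bool': 'fill if empty',
    'rating': 'fill if empty',  # assumption: not discussed in the thread
}


def unhandled_types(valid_types, rules):
    """Return the datatypes that have no merge rule yet, sorted by name."""
    return sorted(set(valid_types) - set(rules))


missing_now = unhandled_types(VALID_DATA_TYPES, MERGE_RULES)
# Simulate a datatype being added in the future.
missing_later = unhandled_types(VALID_DATA_TYPES | {'some_new_type'}, MERGE_RULES)
```

A merge routine could run this check at startup and fall back to a default rule, or log a warning, for anything it reports.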

Last edited by chaley; 07-16-2010 at 01:00 PM.
Old 07-16-2010, 01:46 PM   #12
Starson17
Quote:
Originally Posted by chaley View Post
It is an excellent idea to add such a declaration to FieldMetadata, specifically
Code:
    VALID_DATA_TYPES = frozenset(['rating', 'text', 'comments', 'datetime',
                                  'int', 'float', 'bool', 'series'])
There are 8 there. There are 9 in the pulldown list of new field types the user can create. I vaguely recall a "text*" datatype. I suspected it was the series name and indicated there was a related series number. Most of the other types seem self explanatory. Did I miscount or recall incorrectly?
Old 07-16-2010, 02:07 PM   #13
chaley
The pulldown distinguishes between text/is_multiple=true & false. Tags & single-entry fields have the same underlying type.
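The count of 9 versus 8 can be reproduced with a small sketch: eight underlying datatypes, with 'text' split in two by is_multiple. The function name and the 'text (tag-like)' label are assumptions for illustration, not names used by calibre.

```python
VALID_DATA_TYPES = frozenset(['rating', 'text', 'comments', 'datetime',
                              'int', 'float', 'bool', 'series'])


def column_kind(datatype, is_multiple=False):
    """Distinguish tag-like text from single-entry text; other types pass through."""
    if datatype == 'text' and is_multiple:
        return 'text (tag-like)'
    return datatype


# Every datatype as a single-valued column, plus the multi-valued text variant.
kinds = {column_kind(t) for t in VALID_DATA_TYPES} | {column_kind('text', True)}
```

A merge routine therefore has to look at both the datatype and is_multiple before choosing a rule for a text column.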
Old 07-17-2010, 06:13 PM   #14
Starson17
Charles, two questions:

Code:
from_record_value = db.get_custom(from_id, num=col_num)
db.set_custom(to_id, val, num=col_num)
How do I get/set the series_index associated with a custom field of datatype 'series'?

Is there an easy way to merge tag-like text from two records when is_multiple is True? With tags, it's just:
Code:
for tag in from_mi.tags:
    to_mi.tags.append(tag)
Duplicate tags are automatically removed.
Thanks.
Old 07-18-2010, 07:23 AM   #15
chaley
Quote:
Originally Posted by Starson17 View Post
Charles, two questions:

Code:
from_record_value = db.get_custom(from_id, num=col_num)
db.set_custom(to_id, val, num=col_num)
How do I get/set the series_index associated with a custom field of datatype 'series'?
Get the value with db.get_custom_extra(...)

Set the value with db.set_custom(..., extra=series_index). You set the series and the series_index together. They cannot be set separately.
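Putting the get and set calls together, a merge of one custom series column might look roughly like the sketch below. `FakeDB` is a hypothetical stand-in that models only the calls named in this thread (`get_custom`, `get_custom_extra`, `set_custom` with `extra=`); the real methods take further parameters. The rule applied is the one agreed for standard series: copy the pair only when the destination has no series name.

```python
class FakeDB:
    """Hypothetical stand-in; models only the calls named in this thread."""

    def __init__(self):
        self._series = {}  # {(book_id, num): (series_name, series_index)}

    def get_custom(self, book_id, num):
        return self._series.get((book_id, num), (None, None))[0]

    def get_custom_extra(self, book_id, num):
        return self._series.get((book_id, num), (None, None))[1]

    def set_custom(self, book_id, val, num, extra=None):
        # Series name and index are always stored together.
        self._series[(book_id, num)] = (val, extra)


def merge_custom_series(db, dest_id, src_id, col_num):
    """Copy series name and index as a pair if dest has no series name."""
    if db.get_custom(dest_id, num=col_num):
        return False  # dest already has a series: leave name and index alone
    name = db.get_custom(src_id, num=col_num)
    if not name:
        return False
    index = db.get_custom_extra(src_id, num=col_num)
    db.set_custom(dest_id, name, num=col_num, extra=index)
    return True


# Hypothetical sample data: books 2 and 3 are sources, book 1 is the dest.
db = FakeDB()
db.set_custom(2, 'Saga', num=5, extra=2.0)
db.set_custom(3, 'Other', num=5, extra=7.0)
first = merge_custom_series(db, dest_id=1, src_id=2, col_num=5)
second = merge_custom_series(db, dest_id=1, src_id=3, col_num=5)
```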
Quote:
Is there an easy way to merge tag-like text from two records when is_multiple is True? With tags, it's just:
Code:
for tag in from_mi.tags:
    to_mi.tags.append(tag)
Duplicate tags are automatically removed.
Thanks.
Two things:

1) Your code does not remove duplicates. The append() method adds the tag to the end of the list (array), with no check for duplicates. I assume that later you call db.set_tags(). That method checks for duplicate tags in the input list, which is why your code works.

There are two ways to deal with duplicates, one with a set and one with an 'in' check. The set code works because sets automatically remove duplicates (though the original order of the tags is not preserved), and would be something like:
Code:
to_mi.tags = list(set(to_mi.tags) | set(from_mi.tags))
The list code would look something like:
Code:
for tag in from_mi.tags:
    if tag not in to_mi.tags:
        to_mi.tags.append(tag)
If you are willing to let the db code cull the duplicates, then you can use
Code:
to_mi.tags.extend(from_mi.tags)
2) The custom code also removes duplicates, so you can use similar code. It would be something like:
Code:
# assume that label contains the column label.
# id must be the id, not the index
tags = db.get_custom(to_mi.id, label=label, index_is_id=True)
tags.extend(db.get_custom(from_mi.id, label=label, index_is_id=True))
db.set_custom(to_mi.id, tags, label=label)
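If the order of the merged tags matters and you would rather not rely on the db code to cull duplicates, the set and 'in' approaches above can be combined. The helper below is a sketch with a hypothetical name; it works on plain lists, so it applies equally to standard tags and to tag-like custom columns.

```python
def merge_tags(dest_tags, src_tags):
    """Return dest tags followed by unseen src tags, preserving order."""
    seen = set(dest_tags)      # fast membership test
    merged = list(dest_tags)   # copy, so the inputs are not modified
    for tag in src_tags:
        if tag not in seen:
            seen.add(tag)
            merged.append(tag)
    return merged


# Hypothetical sample data with duplicates in the source list.
combined = merge_tags(['sf', 'classic'], ['classic', 'space', 'sf', 'space'])
```

The set is used only for lookups, so the result keeps the destination's order and adds each new tag once, in the order it first appears in the source.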