I'm in the process of creating a User Interface plugin as a calibre version of Doitsu's new Sigil plugin (convert selected epub text files to MP3 using the MS Windows Speech API and the LAME encoder).
I'm still doing initial testing, so I'm running my .py scripts via calibre-debug from a Windows .bat file rather than as a plugin. Consequently, I don't have access to the calibre library metadata at the moment, so I'm trying to extract the few items I want from the container.mi object.
My problem is understanding whether a container.mi field contains unicode or something else. For example, this print statement
Code:
print('authors:', container.mi.authors, '\ntitle:', container.mi.title)
results in this in my CMD box
Code:
authors: [u'Yrsa Sigur\xf0ard\xf3ttir']
title: A ‘unicode’ title Yrsa Sigurðardóttir
'title' looks like a unicode string, but 'authors' looks like a list of non-unicode items.
Please can you advise how I should be accessing the container.mi data to make sure I always end up with unicode?
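To make the question concrete, here is a rough sketch of the kind of normalization I have in mind. The helper names (to_text, authors_to_text) are my own invention, not part of calibre, and the sketch assumes that any byte strings would be UTF-8 encoded and that authors should be joined with ' & '. Both are guesses on my part.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function, unicode_literals


def to_text(value, encoding='utf-8'):
    """Return value as a unicode string (hypothetical helper, not calibre API).

    Byte strings are decoded; the UTF-8 default is an assumption.
    Anything else is returned unchanged.
    """
    if isinstance(value, bytes):
        return value.decode(encoding, 'replace')
    return value


def authors_to_text(authors, joiner=' & '):
    """Join a list of author names into one unicode string.

    The ' & ' joiner is an assumption about how the MP3 tag should look.
    """
    return joiner.join(to_text(a) for a in authors)


# Stand-in for container.mi.authors, copied from the output above.
sample_authors = ['Yrsa Sigur\xf0ard\xf3ttir']
artist_tag = authors_to_text(sample_authors)
```

With something like this I would pass container.mi.authors and container.mi.title through the helpers before writing the tags, but I don't know whether that is necessary or whether calibre already guarantees unicode.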
I could do what the Sigil plugin does and extract the data directly from container.opf but I'm sure container.mi has already done a far more robust job of that than anything I could come up with.
The metadata will be used to populate the MP3 tags.
I can attach a small test epub if necessary.