#16 |
Sigil Developer
Posts: 8,508
Karma: 5703586
Join Date: Nov 2009
Device: many
|
Hi,
FYI: a Topaz input plugin would be a one-way street. There really is no way to convert any other XHTML-based file to the Topaz file format. The inputs it requires are scanning-based: a list of glyphs and the paths to create each glyph, the x,y position of each glyph on the page (the glyphs are not like font glyphs, as they have no baselines), OCR info, page continuation info, dehyphenation info, fixed page format, etc.

So I actually think it would be easier to create a .tpzZ input plugin that would take the archive with html, cover.jpeg, and opf that I have already generated and handle things. The generated html is XHTML, which can be fed directly into an lxml tree without any need for tidy or Beautiful Soup, so it is already very close to your internal calibre format as it stands right now.

It is almost as if we need a new zip-based file type extension called ".calibre" that represents calibre's internal format. I could modify the plugin code I have now to write to that standard, and calibre could output to that format as well when exporting to disk.

Kevin |
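For readers following along, here is a minimal sketch of the kind of archive described in the post: XHTML, an OPF file, and a cover image packed into one zip. It uses only the Python standard library. The function name and the member names `book.html` and `metadata.opf` are assumptions made for illustration; only `cover.jpeg` is named in the post, and this is not the plugin's actual code.

```python
import io
import zipfile


def build_book_archive(html: str, opf: str, cover: bytes) -> bytes:
    """Pack XHTML, an OPF file and a cover image into one in-memory zip.

    Member names other than cover.jpeg are hypothetical placeholders.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("book.html", html)      # generated XHTML
        zf.writestr("metadata.opf", opf)    # metadata, manifest, guide
        zf.writestr("cover.jpeg", cover)    # raw image bytes
    return buf.getvalue()
```

Writing to an in-memory buffer keeps the sketch free of temporary files; the same `ZipFile` calls work against a path on disk.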
#17 |
creator of calibre
Posts: 45,222
Karma: 27110894
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Just use zip in that case. If you want the cover to import, modify the zip metadata reader to read covers from OPF files. IIRC, the code is in metadata.archive.
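As a rough illustration of what "reading covers from OPF files" inside a zip involves, the sketch below locates the first `.opf` member, looks for a guide reference of type `cover`, falls back to a `<meta name="cover">` entry resolved through the manifest, and returns the image bytes. It uses only the standard library and is not calibre's metadata reader; the function names are hypothetical.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip any XML namespace from an element tag."""
    return tag.rsplit("}", 1)[-1] if isinstance(tag, str) else ""


def find_cover_in_zip(data: bytes):
    """Return (member_path, image_bytes) for the OPF-declared cover, or None."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()
        opf_name = next((n for n in names if n.lower().endswith(".opf")), None)
        if opf_name is None:
            return None
        elements = list(ET.fromstring(zf.read(opf_name)).iter())

        # 1) Guide entry: <reference type="cover" href="..."/>
        href = next(
            (e.get("href") for e in elements
             if _local(e.tag) == "reference"
             and e.get("type", "").lower() == "cover"),
            None,
        )
        # 2) Fallback: <meta name="cover" content="ID"/> plus manifest item ID
        if href is None:
            cover_id = next(
                (e.get("content") for e in elements
                 if _local(e.tag) == "meta" and e.get("name") == "cover"),
                None,
            )
            if cover_id is not None:
                href = next(
                    (e.get("href") for e in elements
                     if _local(e.tag) == "item" and e.get("id") == cover_id),
                    None,
                )
        if href is None:
            return None

        # hrefs in an OPF are relative to the OPF file's own directory
        path = posixpath.normpath(
            posixpath.join(posixpath.dirname(opf_name), href))
        if path not in names:
            return None
        return path, zf.read(path)
```

Comparing local tag names rather than fully qualified ones lets the sketch accept OPF files with or without the OPF namespace declared.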
|
#18 |
Sigil Developer
|
Hi,
Forgive me if I have this wrong, as I am new to this code, but I think metadata.archive falls back to metadata.zip if there is no known type inside the archive (unless it is a comic). It looks like metadata/zip.py already detects an .opf file inside the zip and invokes meta.get_metadata(stream, 'opf'), which in turn invokes opf_metadata(path), which parses it properly, including the cover information from the manifest and the guide.

The issue is that since libprs is not forced to be True and application_id is None (this book is not part of calibre yet), meta.get_metadata(stream, stream_type) will not return the opf information just collected. Instead it will try to pull meta information from the filename and, just before returning, do a base.smart_update(opf) with the opf meta information. But smart_update() will not update the cover or cover_data attributes, as they are not on the list of attributes it tries to smartly update.

Unless this would mess you up, the easiest fix would be to do something along these lines (I think):

    --- opf2.py 2010-12-23 16:39:20.000000000 -0500
    +++ opf2_new.py 2010-12-25 20:47:37.000000000 -0500
    @@ -990,7 +990,7 @@
     for attr in ('title', 'authors', 'author_sort', 'title_sort',
                  'publisher', 'series', 'series_index', 'rating',
                  'isbn', 'tags', 'category', 'comments',
    -             'pubdate'):
    +             'pubdate', 'cover', 'cover_data'):
         val = getattr(mi, attr, None)
         if val is not None and val != [] and val != (None, None):
             setattr(self, attr, val)

Am I understanding this correctly? |
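The behaviour described in this post can be reproduced with a small stand-in. The `Meta` class below is a simplified, hypothetical model rather than calibre's metadata class: it copies only the attributes on an explicit list, using the same "skip empty values" test quoted in the diff, so a cover present on the source object is silently dropped unless `cover` and `cover_data` are on that list. The attribute tuples are shortened for illustration.

```python
class Meta:
    """Tiny stand-in for a metadata object (not calibre's class)."""

    def __init__(self, **fields):
        for name, value in fields.items():
            setattr(self, name, value)

    def smart_update(self, other, attrs):
        # Copy only the listed attributes, skipping empty values.
        for attr in attrs:
            val = getattr(other, attr, None)
            if val is not None and val != [] and val != (None, None):
                setattr(self, attr, val)


# Shortened, illustrative attribute lists (assumption, not the real tuple).
BASE_ATTRS = ("title", "authors", "pubdate")
WITH_COVER = BASE_ATTRS + ("cover", "cover_data")

# Metadata as it might look when read from an OPF file.
from_opf = Meta(title="Real Title", authors=["A. Author"],
                cover_data=("jpeg", b"..."))

# Metadata guessed from the filename, updated without the cover attributes.
guessed = Meta(title="book")
guessed.smart_update(from_opf, BASE_ATTRS)
print(hasattr(guessed, "cover_data"))   # cover was not carried over

# The same update with the cover attributes added to the list.
patched = Meta(title="book")
patched.smart_update(from_opf, WITH_COVER)
print(patched.cover_data[0])            # cover format is now present
```

Note that the `val != (None, None)` test is what keeps an empty `cover_data` pair from overwriting a real one.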
#19 |
creator of calibre
|
Open a ticket for it; I am travelling for the next few days, so this post will get lost.
|
#20 |
Sigil Developer
|
Hi,
Will do. Have fun on your travels. Thanks again for all of your help. Kevin |
#21 |
Sigil Developer
|
Hi,
Please ignore my previous attempt at a patch. The code I thought was running did not run, because it was guarded by a check that application_id is not None, which had to be worked around. I created a new patch to metadata/zip.py, tested it, and it does what I wanted.

As requested, I have created a bug tracker issue with the patch attached: http://bugs.calibre-ebook.com/ticket/8066

Thanks again for all of your help.

KevinH |
Thread | Thread Starter | Forum | Replies | Last Post |
Importing - Metadata aquisition | Justy | Calibre | 1 | 02-05-2010 03:44 PM |
why does html appears as Zip? | yasmeen57 | Calibre | 6 | 10-06-2009 11:25 AM |
regex Issue when Importing | river | Calibre | 3 | 06-16-2009 11:03 AM |
Multiple html issue - too many links and zip isn't created in calibre | Katelyn | Calibre | 1 | 03-10-2009 01:31 PM |
Conversion issue with zip of Warbreaker | Mitchll | Calibre | 6 | 07-28-2008 06:25 PM |