06-07-2012, 11:03 PM   #1
SauliusP.
Plugin developer
Posts: 108
Karma: 24394
Join Date: Feb 2012
Location: Lithuania
Device: Kindle
DOCX Input and DOCX Metadata Reader

The updated and maintained plugin thread is here:
https://www.mobileread.com/forums/sho....php?p=2107703
---------------------------------------
Spoiler:

Hello,

I have made DOCX Metadata Reader and DOCX Input plugins for my own purposes, and maybe someone else can make use of them too. As an article writer I have lots of DOCX files, and I tried to find a good free alternative for DOCX to EPUB or MOBI conversion. However, good EPUB tools are not free, and Amazon's conversion service did not satisfy me: it makes the formatting crappy and "not book-like". So here they are, my own conversion tools. Please feel free to use them for your own purposes. Development will continue, and I will keep adding new features. I was quite surprised that there is no other such plugin for Calibre, as the DOCX format is comparatively simple.

DOCX Metadata Reader simply reads metadata from a DOCX file when it is added to the Calibre library or when the appropriate button is pushed in the book's details editor. The very first picture (if there is one) is used as the cover.
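
For readers curious how such a reader can work, here is a minimal sketch. It is not the plugin's actual code. It assumes the core properties part sits at its conventional path, docProps/core.xml, inside the DOCX zip package; the function name read_docx_metadata and the returned keys are made up for illustration.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Namespaces used by the core properties part of an Office Open XML package.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical mapping from our own key names to elements in core.xml.
FIELDS = {
    "title": "dc:title",
    "author": "dc:creator",
    "description": "dc:description",
    "keywords": "cp:keywords",
}


def read_docx_metadata(stream):
    """Return a dict of the metadata fields present in a DOCX file object."""
    with zipfile.ZipFile(stream) as zf:
        # Assumption: the part lives at the conventional location.
        if "docProps/core.xml" not in zf.namelist():
            return {}
        root = ET.fromstring(zf.read("docProps/core.xml"))
    meta = {}
    for key, path in FIELDS.items():
        node = root.find(path, NS)
        if node is not None and node.text:
            meta[key] = node.text.strip()
    return meta
```

The function accepts any file-like object or path that zipfile.ZipFile accepts, so it can be tried on a real document with read_docx_metadata("article.docx").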

PLANS:
Add an options dialogue to turn cover extraction on or off.



DOCX Input plugin converts a DOCX file to OEB (if I'm not mistaken, a bunch of HTML files with an OPF file and CSS stylesheets). Calibre then converts that to any format it supports. My main target is MOBI, but no MOBI-specific hacks are included for better support.

SUPPORTED FEATURES
1. Conversion of Word styles to CSS, with filtering (only styles that are in use are converted).
2. Paragraph properties: left and right indents, first-line indent, last rendered page break (which might be a manual page break, a style-based page break, a section break, etc.).
3. Image support. Limitation: pictures with text wrapped around them are floated to the left only, as position calculation is a Word feature I didn't like.
4. Tables (nested tables inside a cell are also supported).
5. Everything up to the first rendered page break is considered to be "a cover", since most of the documents I convert include some type of cover followed by a manual page break.
6. Font embedding of DejaVu Serif (included in the plugin itself).
7. Footnotes are saved into individual HTML files and superscript links are added.
8. Paragraphs that have TOC-level styles applied (like Heading 1, 2, etc., or custom ones) are converted to HTML tags of the appropriate level: h1, h2, etc.
9. Font sizes are converted to pt (the same value you see in Word itself).
10. Indents are converted to em (it just looks better).
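
Items 9 and 10 come down to unit arithmetic. The sketch below is not the plugin's code; it only relies on two facts about the format: Word stores font sizes (w:sz) in half-points and indents (w:ind) in twentieths of a point. Using the paragraph's own font size as the base for em is an assumption, and the function names are made up for illustration.

```python
def half_points_to_pt(sz):
    """Convert a w:sz value (half-points) to points."""
    return int(sz) / 2


def twips_to_em(twips, font_size_pt=12.0):
    """Convert an indent in twentieths of a point to em.

    Assumption: 1em equals the paragraph's own font size in points.
    """
    return round(int(twips) / 20 / font_size_pt, 2)


def paragraph_css(sz, left=0, right=0, first_line=0):
    """Build a CSS declaration string from raw Word values."""
    size_pt = half_points_to_pt(sz)
    rules = [f"font-size: {size_pt:g}pt"]
    indents = (
        ("margin-left", left),
        ("margin-right", right),
        ("text-indent", first_line),
    )
    for prop, twips in indents:
        if int(twips):  # skip indents that are zero
            rules.append(f"{prop}: {twips_to_em(twips, size_pt):g}em")
    return "; ".join(rules)
```

For example, a 12 pt paragraph (w:sz of 24) with a half-inch left indent (720) gives a 3em left margin, because 720 / 20 = 36 pt and 36 / 12 = 3.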

NOT SUPPORTED
1. Manual line breaks.
2. Lists (bulleted and numbered). However, for that purpose I use a Word macro (also attached) that converts all lists to plain text (bullets and numbers are preserved).
3. Table styling. For now, only collapsed 1px black borders are hard-coded.
4. Font-face styling. Only the DejaVu Serif font face is supported for conversions with embedded fonts (like EPUB).
5. Footnote back-links.
6. Endnotes are not supported, and support is not planned. If required, I convert all endnotes to footnotes beforehand.
7. Other fancy things, like vector graphics, OLE objects, effects, etc. Not planned either.

PLANNED
1. Options dialogue: cover conversion modes (until first page break, or use Calibre's), font embedding on/off, switch font-size units: em, pt, px, %.
2. Font-face support (but in the far future).
3. List support (including lists that are interrupted and then continued).
4. Line breaks.
5. Hard space (code 160), ndash and mdash converted to HTML entities.
6. Table styling (if not too difficult).
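
Planned item 5 could look roughly like the sketch below. It is only an illustration of the idea, not code from the plugin; the function name to_html_entities is hypothetical, and escaping &, < and > first is an added assumption so that the output stays valid HTML.

```python
import html

# Code points from planned item 5 mapped to their named HTML entities.
ENTITY_MAP = {
    0x00A0: "&nbsp;",   # hard (non-breaking) space, code 160
    0x2013: "&ndash;",  # en dash
    0x2014: "&mdash;",  # em dash
}


def to_html_entities(text):
    """Escape markup characters, then replace the three special characters."""
    escaped = html.escape(text, quote=False)
    return escaped.translate(ENTITY_MAP)
```

Escaping has to happen before the replacement; otherwise the ampersands of the new entities would themselves be escaped.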


USING
To get the best results, Calibre should be tuned a bit.
1. To generate a TOC, go to Common Options, Table of Contents, and add expressions for the HTML headings (use the wizard or enter //h:h1 for Level 1 TOC, //h:h2 for Level 2 and //h:h3 for Level 3).
2. For EPUB conversion, go to the EPUB output options and tick "No default cover" and "No SVG cover".
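
To see what the three TOC expressions from step 1 select, here is a small sketch using Python's standard library instead of Calibre itself. It assumes the h prefix stands for the XHTML namespace, and it uses ElementTree's limited XPath support (.//h:h1 from the root) as a stand-in for //h:h1. The function name headings_by_level is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: "h" is bound to the XHTML namespace.
XHTML_NS = {"h": "http://www.w3.org/1999/xhtml"}


def headings_by_level(xhtml, max_level=3):
    """Return {level: [heading texts]} for h1 up to h<max_level>."""
    root = ET.fromstring(xhtml)
    result = {}
    for level in range(1, max_level + 1):
        nodes = root.findall(f".//h:h{level}", XHTML_NS)
        # itertext() also collects text inside nested tags such as <i>.
        result[level] = ["".join(n.itertext()).strip() for n in nodes]
    return result
```

Running it on one of the converted HTML files shows which paragraphs would end up as Level 1, 2 and 3 TOC entries.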

All critiques, crash reports and suggestions are most welcome, but I will not be quick with responses or new feature development. At the moment I'm quite satisfied with the plugins.

Last edited by SauliusP.; 09-28-2012 at 05:32 AM.
06-08-2012, 12:23 AM   #2
kovidgoyal
creator of calibre
Posts: 43,843
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
If you want your plugins to be available in the calibre add plugins wizard, then you should post this thread in the plugin forum and send kiwidude a PM to add it to the index of plugins. I'll add the DOCX metadata reader plugin to calibre itself, since it is trivial.
06-08-2012, 02:46 AM   #3
SauliusP.
Plugin developer
All right, I reposted the thread and will contact kiwidude. Thank you!
06-08-2012, 04:33 AM   #4
kovidgoyal
creator of calibre
I committed a DOCX metadata reader plugin to calibre.
06-15-2012, 02:07 AM   #5
SauliusP.
Plugin developer
Hi Kovid,

I don't want to be a jerk, but anyway: could you please put "SauliusP." next to the DOCX Metadata Reader plugin that is now included in Calibre? Unless you have refactored it beyond recognition :-)
06-15-2012, 02:17 AM   #6
SauliusP.
Plugin developer
All right, you have indeed refactored it beyond recognition :-) Please ignore the above post.