Save any Document from Microsoft Word 2007 to EPUB using a Free Add-in from Aspose


romeok
08-22-2009, 05:24 AM
Aspose.Words for Microsoft Word is a free utility that allows converting any document opened in Microsoft Word 2007 to the EPUB format. Microsoft Word 2007 can load documents in many formats including DOC, DOCX, RTF, HTML, ODT etc and you can now easily convert them all to EPUB using Aspose.Words for Microsoft Word.

More here

http://www.aspose.com/community/blogs/aspose.words-for-.net-java-reporting-services-and-jasperreports/archive/2009/08/22/save-any-document-from-microsoft-word-2007-to-epub-using-a-free-add-in-from-aspose.aspx

JSWolf
08-22-2009, 07:18 AM
Does this support mobile ADE?

PennyPie
08-22-2009, 07:55 AM
Thanks! This looks cool!

mtravellerh
08-22-2009, 07:56 AM
Does this support mobile ADE?
Doesn't seem so; at least it's not mentioned!

griffonwing
08-22-2009, 01:30 PM
Question: Does this free plug-in require any activation code or license-key?

The reason I ask is that you have to log in to download it. And in order to log in, you have to supply personal info such as Name, Street Address, Phone, Country, etc. This is not something I am willing to supply for a free plug-in for Word.

If it's not a licensed-key required version, I'd like to ask if it could be sent to me via email, or posted somewhere else for download.

PennyPie
08-22-2009, 06:08 PM
This is not something I am willing to supply for a free plug-in for Word.


Ditto...

cmbs
08-22-2009, 09:56 PM
I don't know anything about this software, and am not suggesting you even try it.

But in general, when someone is demanding this personal information which they don't need I always provide wrong information and it works fine. I usually give my correct email address, because it's usually necessary. And sometimes you have to give a real zip code (but it doesn't have to be the one you live in). Other than that, make stuff up.

My phone number is "(999) 999-9999" and my street address is usually something like "you don't need this information"

I try to make it obvious that I'm not telling them; I don't want junk mail or phone calls going to some random person.

JSWolf
08-22-2009, 10:00 PM
Doesn't seem so; at least it's not mentioned!
Then to me, it's totally useless. Also, a Word 2003 version would be nice too if it did support Mobile ADE.

Leep
08-22-2009, 10:09 PM
Aspose.Words for Microsoft Word is a free utility that allows converting any document opened in Microsoft Word 2007 to the EPUB format. Microsoft Word 2007 can load documents in many formats including DOC, DOCX, RTF, HTML, ODT etc and you can now easily convert them all to EPUB using Aspose.Words for Microsoft Word.

More here

http://www.aspose.com/community/blogs/aspose.words-for-.net-java-reporting-services-and-jasperreports/archive/2009/08/22/save-any-document-from-microsoft-word-2007-to-epub-using-a-free-add-in-from-aspose.aspx

They've apparently temporarily pulled the download link according to their website.

cheers

JSWolf
08-22-2009, 10:14 PM
I do hope it's to make sure it's Mobile ADE compatible.

jgray
08-22-2009, 11:09 PM
I immensely dislike having to provide any information, just for a free download. I usually skip such things. If it is something that I really want to try (like this now missing download), I also input bogus information. In these cases, I never provide a real email address. If an email address is required to obtain a download link, registration number, etc., I use services like 10minutemail.com. This gives you a valid email address that forwards to your real address, but it expires in ten minutes. You can extend the timeout for additional ten minute intervals.

Elfwreck
08-22-2009, 11:50 PM
I shudder to think of ePubs made by automatic output from MS Word, using Word's attempts at HTML.

griffonwing
08-23-2009, 01:42 AM
I haven't tried, but does Word07 export HTML better than 03?

brewt
08-28-2009, 12:10 PM
Got it, tried it.
If you're spooked about providing "personal" information, lie. They connect you to the download link once you create the account; it isn't sent in an email.

It seems to be a fairly straightforward single-file Word-to-EPUB converter. It does not enable Word to open pre-existing EPUBs. There's no opportunity/easy way to edit metadata - it seems to use the Word Owner as the author of anything and everything.

Hyperlinks within the file seem to work ok in both calibre and ade (like a word-created toc). Obviously, you can't call on another file with the hyperlinking, being all single-file and all. Funny, there is a calibre_bookmarks.txt file in the generated epub. I'm not seeing any credit for our friend Kovid.

It does strip css - word embeds css-like code at the beginning of each file that Calibre does parse into real-ish css in the epub.

It converts this from the original word-htm:
h1 {
    mso-style-link:"Heading 1 Char";
    margin-top:12.0pt;
    margin-right:0in;
    margin-bottom:3.0pt;
    margin-left:0in;
    text-align:justify;
    line-height:36.0pt;
    page-break-before:always;
    page-break-after:avoid;
    font-size:18.0pt;
    font-family:"Frutiger Linotype","sans-serif";
    font-variant:small-caps;
    font-weight:bold;
}

to an on-the-fly call like this:

<span style="font-family:sans-serif; font-size:18pt; font-variant:small-caps; font-weight:bold">Your Text Here</span>

each time it's used.
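
The effect described here, one stylesheet rule being repeated as an inline style on every element that uses it, can be pictured with a small Python sketch. This is only an illustration of the idea, not Aspose's code: the function names and the `KEPT` list of properties are assumptions, chosen to mirror the four properties visible in the span above.

```python
from html import escape

# Properties the example span above appears to keep. This list is an
# assumption for illustration, not documented converter behaviour.
KEPT = ("font-family", "font-size", "font-variant", "font-weight")


def parse_declarations(block: str) -> dict:
    """Parse 'prop:value;' pairs from the body of a CSS rule."""
    decls = {}
    for part in block.split(";"):
        if ":" in part:
            prop, value = part.split(":", 1)
            decls[prop.strip().lower()] = value.strip()
    return decls


def inline_rule(block: str, text: str) -> str:
    """Flatten a rule body into a <span> carrying an inline style."""
    decls = parse_declarations(block)
    style = "; ".join(f"{p}:{decls[p]}" for p in KEPT if p in decls)
    return f'<span style="{escape(style)}">{escape(text)}</span>'


rule_body = "margin-top:12.0pt; font-size:18.0pt; font-variant:small-caps; font-weight:bold"
print(inline_rule(rule_body, "Your Text Here"))
```

Because the style string is rebuilt for every element, a long document repeats the same declarations many times, which is the file-size cost of inline CSS compared with a single stylesheet rule.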

Some things work differently on the resultant epub between ade and calibre-reader, but they do anyway. The epub does load into calibre ok, and a mobi file can be created, with all the abbreviated style bits.

And yes, Word is perfectly viable as a base-file editor for epubs/mobi. I do it all the time. No matter how you feel about Microsoft, they did do some things right with Word - tocs, spell/grammar check, thesaurus, macros, search and replace by style, easily-changed styles and style sets, comments, footnotes, citations, corrections/somewhat of a revision control, blah blah blah.

There are 2 big things in Word2007 that do make it a significant improvement over 2003:
1) Style Set Management. You can create an arbitrary style set and change any document into it with little duress.
2) Interface customization. You can put an icon onto the Quick Access Toolbar for any command or feature in word including macros. Makes things oh-so-much-faster.
Sure, they moved everything around which takes some time to get used to. And they did get a little carried away with the css-embedment at the beginning of each file (I've seen it have a reference for all possible styles with a reference to every font on my machine - and I've got a boat load of those). But the help (for the most part) works.

So there.

-bjc

romeok
09-04-2009, 09:15 PM
Hi all,

Thanks for trying the plugin and for the feedback. I will try to answer all questions I've seen so far:

1. Personal info and registration. I don't understand what the fuss is about. It is an acceptable practice to require registration, especially for something free. If you don't like providing useful info, don't. You will still get the download. Our company's focus is components for .NET and Java developers and a simple registration is required to access downloads and reply in the forums. We have created this free plug-in based on one of our products and made it available for public use. We do not particularly need information about plugin users, I'd say it will just be an extra cost for us to change the existing website.

2. ADE means Adobe Digital Editions I presume. Yes, EPUB created by Aspose.Words for Microsoft Word reads in ADE well. It is the first on the list of products that we test with.

3. Word 2007 to HTML or Word 2003 to HTML. The answer is neither. Aspose.Words is itself a file conversion engine. In this particular case the flow is MS Word saves DOCX (OOXML), Aspose.Words reads it and converts to EPUB itself. So it is not MS Word to HTML conversion. Our main commercial product is the component for software developers Aspose.Words for .NET and Aspose.Words for Java that offers much more. Conversion to EPUB is just one small feature that we thought would be cool to make into a free add-in for MS Word.

4. Edit Metadata. Aspose.Words converts document properties into EPUB metadata. So if you go into document properties in MS Word and specify title, keywords, author etc, you will get that in the EPUB output.

5. calibre_bookmarks.txt embedded in EPUB. Sorry no idea what you are talking about. Aspose.Words is completely our own code in all its features and does not use any third party libraries. So we do not give credit to anyone.

6. CSS in HTML. This MS Word to EPUB export in Aspose.Words for Microsoft Word is based on the capability of Aspose.Words for .NET to convert various documents to HTML/MHTML and EPUB. A number of options are supported in the actual component (the one for software developers), including different ways of saving CSS. There is an option to save inline CSS like you've seen above (the default). There is an option to save using a CSS stylesheet that is embedded in the file, and an option to save into an external CSS stylesheet (although in EPUB I think it will always be internal). You do not get these options in Aspose.Words for Microsoft Word because it is just V1.0 and we are interested in seeing any feedback. If you feel that EPUB generated by Aspose.Words can be improved, feel free to suggest what you want to see.

I just want to reiterate that Aspose is a company that develops components for .NET and Java developers. This Aspose.Words for Microsoft Word is a free product based on one of our components, Aspose.Words for .NET. We are interested in improving EPUB conversion (as well as all other features) and felt that making it available for the public to use in different scenarios will help us get constructive feedback.

BTW Aspose.Words for .NET is also the engine that is used by Adobe Buzzword to convert documents to EPUB.

NSqirrel
09-05-2009, 04:42 AM
Thanks for posting the link. I am interested, although have yet to buy a reader (very close now that the PRS 600 is almost out in the UK.) Look forward to experimenting in the mean time with the add-in.

brewt
09-05-2009, 04:06 PM
On calibre_bookmarks.txt:

My goof. After one opens an epub with Calibre's reader, Calibre embeds the bookmark file into the epub. Your stuff didn't put it there. Apologies.

Yours is the first generater-epub-er-maker I've seen that the epub created could be opened directly in MobiPocket. So, you're on the right track there.

With a Word Docx file, one has the option to embed fonts directly into the document. Is this on your radar? Sure, one could do that with Indesign, but gad, that's hard. There's oodles of formatting tricks in Word that tend to get stripped in the other builder applications, so if yours would utilize more of them, that would be better (dropcaps, themes, background colorings, tables, smart art, word art, etc)

Overall, I rather liked it, for single-file conversions. Me, I tend to build multi-file books; it makes my organizational book life a lean bit more to my liking. So please, keep up the efforts. Being able to use my favoritist editor directly into epub = good.

-bjc

JSWolf
09-05-2009, 04:14 PM
Does your converter support the mobile ADE specification? That is, no flow greater than 300k.
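
The 300k limit mentioned here can be checked on any EPUB with a few lines of Python. This is a rough sketch under stated assumptions: it treats every HTML/XHTML entry in the ZIP as a "flow", reads "300k" as 300 KiB of uncompressed size, and uses a hypothetical helper name `oversized_flows`. The exact rule applied by mobile ADE devices may differ.

```python
import io
import zipfile

# The "300k" figure from the post, interpreted here as 300 KiB (assumption).
LIMIT = 300 * 1024


def oversized_flows(epub_bytes: bytes, limit: int = LIMIT) -> list:
    """Return (name, uncompressed size) for each HTML entry above the limit."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as z:
        return [
            (info.filename, info.file_size)
            for info in z.infolist()
            if info.filename.lower().endswith((".html", ".xhtml", ".htm"))
            and info.file_size > limit
        ]
```

To use it on a file on disk, read the EPUB in binary mode and pass the bytes in; an empty list means no flow exceeds the limit under these assumptions.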

NSqirrel
09-06-2009, 04:52 AM
Perhaps my question comes into that asked by brewt above (>> one has the option to embed fonts directly into the document. Is this on your radar? )
Being ereaderless at present, I have installed the Sony ereader application on my laptop and created an epub file, with some graphics, with the greatest of ease using your add-on. One thing I noticed is that I cannot change the font size in the ereader application, whereas I can with other epubs I have obtained online. Am I too early in your dev cycle or is the problem with my file creation?

(Edit: Sony app referred to above is the 'eBook Library' application v3.0.00.08040.)

JSWolf
09-06-2009, 08:55 AM
Yes, you can use eBook Library to view ePub and you can change the size of the font with eBook Library.

NSqirrel
09-06-2009, 10:59 AM
Yes, you can use eBook Library to view ePub and you can change the size of the font with eBook Library.
Thanks for the comment, but not quite my point. I know I can do as you suggest with ePub books, but not with those few tests I have generated using the add-in. Hence my question being: does the add-in allow font size changes with eBook Library or am I doing something wrong (-quite likely!)?

Elfwreck
09-06-2009, 12:07 PM
3. Word 2007 to HTML or Word 2003 to HTML. The answer is neither. Aspose.Words is itself a file conversion engine. In this particular case the flow is MS Word saves DOCX (OOXML), Aspose.Words reads it and converts to EPUB itself. So it is not MS Word to HTML conversion.

:wall:

EPUB is an HTML file, or a set of HTML files, with some metadata attached, in a ZIP container, with the extension changed to EPUB. There is no "convert to ePub" without "convert to HTML." Aspose may not use either of Word's HTML conversion processes but it has to convert to HTML somehow. (I suppose it could convert to XML instead of HTML, but the same issues exist--how much of Word's unnecessary repetitive paragraph-level coding does it keep?)
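
The container layout described here can be sketched in Python with the standard `zipfile` module. This is a bare skeleton for illustration, not a complete book: the file names `content.opf` and `chapter1.xhtml` and the function name are assumptions, and a real EPUB also needs the package (OPF) file that `container.xml` points to, which is where the metadata lives.

```python
import io
import zipfile

CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""


def build_epub_skeleton(chapter_html: str) -> bytes:
    """Zip one HTML file into an EPUB-shaped container (package file omitted)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as z:
        # The mimetype entry goes first and is stored without compression.
        z.writestr("mimetype", "application/epub+zip",
                   compress_type=zipfile.ZIP_STORED)
        z.writestr("META-INF/container.xml", CONTAINER_XML,
                   compress_type=zipfile.ZIP_DEFLATED)
        # The book content itself is ordinary (X)HTML.
        z.writestr("chapter1.xhtml", chapter_html,
                   compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()
```

Whatever tool produces the file, the content entries are still HTML, so the quality of that HTML decides how the book behaves in a reader.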

Patricia
09-06-2009, 03:38 PM
Thanks for the generous offer, Romeok. I'll try it out!

Meine
09-18-2009, 08:18 AM
Thanks for the comment, but not quite my point. I know I can do as you suggest with ePub books, but not with those few tests I have generated using the add-in. Hence my question being: does the add-in allow font size changes with eBook Library or am I doing something wrong (-quite likely!)?

I had the same finding as you did, so I went on and posted a question about it on the Aspose forums, and this is what I got back:

.....When viewing EPUB we are normally able to increase and decrease font size dynamically. For instance, in Adobe Digital Editions there are two corresponding buttons in the toolbar. To make this feature work, font sizes in EPUB must be relative, like this:

style="font-size:1.15em"

“em” measurement unit in CSS corresponds to the current effective font size. But Aspose.Words outputs EPUB with absolute size values in points, not em. This is a known issue. We’ll notify you when it’s implemented. But we cannot promise any timeframe because good implementation would be quite complex (considering tables, lists, etc.).....


So it's still something in their conversion method that needs a change :(
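
As a stopgap, the point sizes in a generated file can be rewritten to em units by hand or with a script. Here is a minimal Python sketch of that rewrite, assuming a base size of 12pt equals 1em (that base, the function name `pt_to_em`, and the two-decimal rounding are all assumptions). It only touches `font-size` declarations given in points and ignores the harder cases the Aspose reply mentions, such as tables and lists.

```python
import re

# Matches declarations such as "font-size:18pt" or "font-size: 10.5pt".
FONT_SIZE_PT = re.compile(r"font-size\s*:\s*(\d+(?:\.\d+)?)pt")


def pt_to_em(style: str, base_pt: float = 12.0) -> str:
    """Rewrite absolute point sizes as em values relative to base_pt."""
    def repl(match):
        em = float(match.group(1)) / base_pt
        return f"font-size:{em:.2f}em"
    return FONT_SIZE_PT.sub(repl, style)


print(pt_to_em("font-family:sans-serif; font-size:18pt"))
```

With relative sizes in place, the reader's own font-size setting becomes the base, which is what lets the resize buttons take effect.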

NSqirrel
09-26-2009, 04:09 AM
I had the same finding as you did so I went on and posted a question on the Aspose forums about....
Thanks Meine for the reply and explanation, which is a great help.

kacir
09-26-2009, 05:29 AM
The reason I ask is that you have to log in to download it. And in order to log in, you have to supply personal info such as Name, Street Address, Phone, Country, etc. This is not something I am willing to supply for a free plug-in for Word.
Come on ...
I never have problem supplying those data.

Let us see ...
Name: FirstName
Surname: LastName
Street: First Street 1
Town: Qwerty
Zip: 012345
Country: Afghanistan (it is usually the first on the list. Feel free to supply Albania or Bali)
email: IDoNotWantToTell@SoGetThis.com

It is a good idea to make a habit of registering at all those countless sites that require registration using the same name, like
IRegistered
IvanKuznec (John Smith in Russian)
IDoNotWantToTell
So when you return to the site, you just automatically try logging in with LoggedIn name and password321 password.

The beauty is, you CAN receive email at the above address IDoNotWantToTell@SoGetThis.com. Just go to www.mailinator.com
sogetthis.com is just one of many alternative names mailinator.com has.

kacir
09-26-2009, 05:35 AM
I haven't tried, but does Word07 export HTML better than 03?
Yes it does, but do not select html as an output. Select simplified html or stripped html or something like that (I do not remember and I do not have MSW2007 installed on this comp.). There is such file format in a drop down list of file formats when you select SaveAs in menu.

I say the html is better, but that doesn't mean the code looks good. Word just leaves out lots of styles and other garbage that is only needed if you need to open that html in Word again.

dogsballs
09-28-2009, 05:47 PM
Have downloaded this app to try out, but I seem to be having a compatibility issue. The application won't show up when I try to Save As. I'm using the Microsoft Office 2010 beta, so I'm guessing this is the problem, but wondering if there is a way around it without loading an older version.

brewt
09-28-2009, 07:53 PM
From Wikipedia (http://en.wikipedia.org/wiki/Microsoft_Office_2010),

The Beta Build 4417 was leaked to the internet on August 30, 2009. It contained a number of UI enhancements, as well as the near final implementation of Backstage View.

Technical Preview

On May 15, 2009, the first Technical Preview was leaked to BitTorrent websites. An internal post-Beta build was leaked on July 12, 2009, newer than the official preview build and including a "Limestone" internal test application.

On July 14, 2009, Microsoft started to send out invitations on Connect to test an official preview build of Office 2010. On August 30, 2009, the beta build 4417 was leaked on the internet via torrent networks.

Public Beta

Microsoft has confirmed that a public beta of office 2010 will be released later this year.

Office 2010 isn't exactly out yet, and targeting an app for a Beta release usually doesn't work out so good.

If you have to see it work, at least this afternoon, you have to backstep to non-beta-microsoft-stuff.

Or, wait until 2010 is really out. Since O2010 is a web-based app, I'm not sure how one would go about loading in a 3rd party add-in to Microsoft's Servers without their say-so.

-bjc

dogsballs
09-29-2009, 04:17 AM
cheers brewt

ericshliao
10-10-2009, 05:07 PM
Just tried the docx-to-epub converter. It seems that I can't embed fonts in epub files created by the converter. Is there any solution?

JSWolf
10-12-2009, 07:27 PM
Just tried the docx-to-epub converter. It seems that I can't embed fonts in epub files created by the converter. Is there any solution?
Edit the ePub and add in the fonts you want. It's what I do.

Mr. Dalliard
10-12-2009, 09:17 PM
Question: Does this free plug-in require any activation code or license-key?

The reason I ask is that you have to log in to download it. And in order to log in, you have to supply personal info such as Name, Street Address, Phone, Country, etc. This is not something I am willing to supply for a free plug-in for Word.

If it's not a licensed-key required version, I'd like to ask if it could be sent to me via email, or posted somewhere else for download.


Mr. Dalliard says: "My dear Mr. Wing, this is why we keep a special email account for potential spammers and people we don't really feel like giving our 'real' address to".

You forgot pr0nsites! Anyway, I am inclined, at an angle of approximately 37.2 degrees, to agree.

As for the plugin, I look forward to checking it out, in anticipation of the ereader which my sister, supposedly, sent me over two weeks ago.

romeok
10-17-2009, 07:32 PM
Here I am checking on the status again... Thanks for your interest so far.

I have scheduled the following for implementation in the Aspose.Words for Microsoft Word save-as-EPUB plugin by the end of November:

1. Embed subsets of used True Type Fonts.

2. Investigate and support the ADE mobile specification.

3. Make font resizing in the readers work. At the moment it does not work because we always export a fixed font size, the way it is specified in MS Word. E.g. 10pt we export as 10pt and ADE does not seem to allow resizing this. It only allows resizing for relative font sizes, I guess.


To answer some of the earlier questions:

Yes, to produce EPUB Aspose.Words converts the document into HTML internally, but it does so using its own HTML exporter, it is not using MS Word to convert to HTML so that's why the produced HTML can be a bit "cleaner" than if produced by MS Word.

Yes, Aspose.Words embeds all formatting as inline CSS on each paragraph etc. It is just one of the modes for exporting CSS formatting in Aspose.Words. There are three (inline CSS, embedded CSS stylesheet and external CSS stylesheet). We've defaulted to inline CSS, but I guess we can change to an embedded CSS stylesheet, as it seems some of you do care.

Richey79
09-27-2010, 05:03 AM
Potentially useful, but every time I've downloaded this, the installer crashes halfway through (Win7 x64).