Converting .docx to ePub and combining


wildchild1954
03-25-2011, 04:06 PM
I create my books chapter by chapter in Office 2007 Word. Is there any program out there that will allow me to convert each chapter to ePub and then combine them into one file bookmarking each chapter like PDF files do?

I was originally converting each .docx file to PDF and then combining them in Acrobat but my Pandigital won't read those files correctly. I'm looking for a way to do the same thing only converting into ePub format.

Toxaris
03-25-2011, 07:35 PM
Save the file as filtered HTML, then add the file to Sigil. Save and presto.

Sigil will create a TOC for you, if you use heading styles for your chapter titles.

Remember, the HTML from Word is full of excessive garbage, even the filtered HTML. You can clean/remove a lot without problems.

wildchild1954
03-25-2011, 10:29 PM
I'm computer illiterate so could you explain what you just said? It sounds like you're talking about converting just one file and I need to convert several and then combine them. I have yet to find any of these programs that will keep my formatting and my page headers correctly.

dwig
03-25-2011, 10:58 PM
EPUB files are actually archives containing multiple files. The text files in the archive are (x)html files.
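
If you want to see this for yourself, here is a minimal Python sketch that lists what is inside such an archive. The demo archive and its file names (`OEBPS/chapter1.xhtml` and so on) are made up for illustration; point `list_epub_members` at a real .epub path to inspect one of your own books.

```python
import io
import zipfile


def list_epub_members(epub_file):
    """Return the names of the files stored inside an EPUB (ZIP) archive."""
    with zipfile.ZipFile(epub_file) as archive:
        return archive.namelist()


def build_demo_epub():
    """Build a tiny in-memory archive laid out like an EPUB (hypothetical content)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("mimetype", "application/epub+zip")
        archive.writestr("OEBPS/chapter1.xhtml", "<html><body><h1>One</h1></body></html>")
        archive.writestr("OEBPS/chapter2.xhtml", "<html><body><h1>Two</h1></body></html>")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for name in list_epub_members(build_demo_epub()):
        print(name)
```

Renaming a copy of an .epub to .zip and opening it in any unzip tool shows the same thing without any code.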

It is generally considered good practice for each chapter or other major segment of a book to be an individual (x)html file in the EPUB archive. To accomplish this you could work this way:

1. Export the first chapter from Word as "Web page filtered HTML".
2. Use Calibre to convert this HTML file into an EPUB.

As you create subsequent chapters:

3. Export each as separate "filtered HTML" files from Word
4. Open the EPUB in Sigil and add the next chapter.

Sigil will build and/or update the other necessary components in the EPUB as you add successive files. You will, though, have to do some massaging of the HTML files and the CSS styles to clean up the garbage with which Word burdens its HTML files.
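
To give an idea of what that "massaging" can look like, here is a rough Python sketch that strips a few kinds of Word clutter from an HTML string. The patterns are my assumptions about what typical Word output contains (conditional comments, `<o:p>` tags, inline `style` attributes, `Mso...` class names), and the function name `clean_word_html` is made up. Regular expressions are a crude tool for HTML, so treat this as a starting point and check the result by eye.

```python
import re


def clean_word_html(html):
    """Remove some common Word-specific clutter from an HTML string."""
    # Drop conditional comments such as <!--[if gte mso 9]> ... <![endif]-->
    html = re.sub(r"<!--\[if.*?<!\[endif\]-->", "", html, flags=re.DOTALL)
    # Drop Office namespace tags like <o:p> and </o:p>, keeping any text between them
    html = re.sub(r"</?o:[^>]*>", "", html)
    # Drop inline style attributes (single- or double-quoted)
    html = re.sub(r"\s+style=(\"[^\"]*\"|'[^']*')", "", html)
    # Drop class attributes whose value starts with "Mso" (quoted or unquoted)
    html = re.sub(r"\s+class=[\"']?Mso\w+[\"']?", "", html)
    return html


if __name__ == "__main__":
    sample = "<p class=MsoNormal style='margin:0in'>Hello<o:p></o:p></p>"
    print(clean_word_html(sample))
```

Note that removing every inline style also removes formatting you may have wanted, such as centered text, so you would reapply that through the EPUB's CSS afterwards.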

wildchild1954
03-26-2011, 01:26 PM
That last part there about massaging the files would have me lost. I wouldn't have the slightest idea where or how to even find this information you say can be removed. I am soooooo lost in this but apparently I'm going to have to find a way to change all of my files so that instead of being compiled as PDFs they are combined and created as EPUBs so that the new update on my Pandigital will actually read them correctly in the BN ereader.

yaip
03-26-2011, 05:03 PM
If I run validation thru' Sigil, it says, "no error found". However, if I run it thru' epubcheck-1.2.jar, it tells me value of attribute "preserveAspectRatio" is invalid; must be a string matching the regular expression....

I have, preserveAspectRatio="none"

Which one is the correct check?

dwig
03-26-2011, 10:39 PM
That last part there about massaging the files would have me lost. ...

Another approach would be to combine the files in Word and save as RTF. Then open and resave the RTF in Wordpad, which will strip a ton of junk from the file. You can then import the RTF into Calibre and convert to EPUB.

The results at this stage should be pretty good, but if you need a decent TOC and want other changes you can use Sigil as a "word processor" to make simple changes (break the file into separate chapters, set the chapter headings as proper headings (H1, H2, ...)) without having to delve into code or style editing.

wildchild1954
03-27-2011, 11:26 AM
Can anyone tell me how to combine files in Word?

DMSmillie
03-27-2011, 12:09 PM
Hi wildchild

Both dwig's and Toxaris's suggestions are good ones - they give you alternative ways of doing what you're trying to do.

The simplest way to combine the Word files would be to make a copy of the first file, rename the copy to indicate it's the whole book and not just the first section, then open that file and copy and paste the contents of each of the other files onto the end, in the correct order. Once you've done that, you can save the file, then save it as an RTF file.

However Toxaris' suggestion is worth revisiting - save each file as "Web Page, Filtered", then use Sigil to build them into an EPUB file. Yes, MS Word adds rather a lot of HTML and CSS code that isn't entirely necessary, but using the "Filtered" save setting cuts down on that, and unless you've done really weird stuff with your formatting in Word, it should convert OK. Might not be perfect from an HTML purist standpoint (and I'd include myself in that group most of the time), but unless your ebook requirements are complicated, it should work fine.

About Sigil - it will happily handle the job of taking multiple HTML files and compiling them into a single EPUB. When you start a new file in Sigil, you simply insert all of your HTML files, and Sigil does the job of creating the EPUB file. If you have used Word's heading styles for your section and chapter titles, you can use Sigil's TOC (table of contents) editor/wizard to create a single TOC for the whole EPUB, based on those headings.
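
For the curious, the kind of packaging Sigil does for you can be sketched in a few lines of Python. This is not Sigil's code, just a minimal illustration of an EPUB 2 style layout as I understand it; the function name `build_epub`, the `OEBPS` folder, the chapter file names and the placeholder book ID are all my own choices, and I have not run the output through epubcheck.

```python
import io
import zipfile
from xml.sax.saxutils import escape

CONTAINER_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">\n'
    '  <rootfiles>\n'
    '    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>\n'
    '  </rootfiles>\n'
    '</container>\n'
)


def build_epub(title, chapters, out, book_id="urn:uuid:placeholder-id"):
    """Write an EPUB 2 style archive to `out` (a path or a file object).

    `chapters` is a list of (heading, xhtml_text) pairs in reading order.
    """
    names = [f"chapter{i}.xhtml" for i in range(1, len(chapters) + 1)]
    manifest = "".join(
        f'    <item id="ch{i}" href="{name}" media-type="application/xhtml+xml"/>\n'
        for i, name in enumerate(names, 1))
    spine = "".join(f'    <itemref idref="ch{i}"/>\n' for i in range(1, len(names) + 1))
    nav = "".join(
        f'    <navPoint id="nav{i}" playOrder="{i}">\n'
        f'      <navLabel><text>{escape(heading)}</text></navLabel>\n'
        f'      <content src="{name}"/>\n'
        f'    </navPoint>\n'
        for i, ((heading, _), name) in enumerate(zip(chapters, names), 1))
    opf = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">\n'
        '  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">\n'
        f'    <dc:title>{escape(title)}</dc:title>\n'
        '    <dc:language>en</dc:language>\n'
        f'    <dc:identifier id="bookid">{escape(book_id)}</dc:identifier>\n'
        '  </metadata>\n'
        '  <manifest>\n'
        '    <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml"/>\n'
        f'{manifest}'
        '  </manifest>\n'
        f'  <spine toc="ncx">\n{spine}  </spine>\n'
        '</package>\n'
    )
    ncx = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<ncx xmlns="http://www.daisy.org/z3986/2005/ncx/" version="2005-1">\n'
        f'  <head><meta name="dtb:uid" content="{escape(book_id)}"/></head>\n'
        f'  <docTitle><text>{escape(title)}</text></docTitle>\n'
        f'  <navMap>\n{nav}  </navMap>\n'
        '</ncx>\n'
    )
    with zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        archive.writestr("OEBPS/content.opf", opf)
        archive.writestr("OEBPS/toc.ncx", ncx)
        for name, (_, body) in zip(names, chapters):
            archive.writestr("OEBPS/" + name, body)


if __name__ == "__main__":
    demo = [("Chapter One", "<html><body><h1>Chapter One</h1></body></html>"),
            ("Chapter Two", "<html><body><h1>Chapter Two</h1></body></html>")]
    build_epub("Demo Book", demo, "demo.epub")
```

The table of contents here comes from the headings you pass in, which mirrors the idea of building the TOC from Word's heading styles. In practice Sigil does all of this through its interface, so there is no need to script it unless you want to.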

I'd recommend having a browse around the Sigil forum here in MobileRead, where you'll find the link to download and install Sigil, along with lots of info about how to use it.

DMSmillie
03-27-2011, 12:11 PM
If I run validation thru' Sigil, it says, "no error found". However, if I run it thru' epubcheck-1.2.jar...

Looks like you're using an old version of epubcheck, yaip. I'd download the latest version and install it - might resolve the differences you're seeing when checking your epub file.
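
As a rough cross-check in the meantime, here is a small Python sketch that tests a value against the preserveAspectRatio grammar as I read it in the SVG 1.1 specification (optional `defer`, an alignment keyword, optional `meet` or `slice`). The pattern is my own, not the regular expression that epubcheck prints, so treat the result as a hint only.

```python
import re

# Alignment keywords: "none" or one of the nine xMin/xMid/xMax + YMin/YMid/YMax combinations
_ALIGN = "none|x(?:Min|Mid|Max)Y(?:Min|Mid|Max)"
_PATTERN = re.compile(rf"\s*(?:defer\s+)?(?:{_ALIGN})(?:\s+(?:meet|slice))?\s*")


def is_valid_preserve_aspect_ratio(value):
    """Return True if `value` fits the SVG 1.1 preserveAspectRatio grammar."""
    return _PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("none", "xMidYMid meet", "stretch"):
        print(candidate, is_valid_preserve_aspect_ratio(candidate))
```

By this reading `none` is an accepted value, which fits with the idea that the older checker is the one at fault.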

awp
03-30-2011, 10:05 PM
wildchild1954, you can easily combine multiple DOCX files into one with Atlantis Word Processor, then convert directly to EPUB (http://www.atlantiswordprocessor.com/en/help/ebook.htm):

1) Create a new blank document in Atlantis Word Processor (press Ctrl+N).
2) Choose the "Insert | File..." menu command of Atlantis.
3) Direct the "Open Document" window to a folder containing your files.
4) Select the files by Ctrl+clicking them.
5) Click the "Open" button.
6) Choose the "File | Save Special | Save as eBook..." menu command of Atlantis to save as ePub.

wildchild1954
07-17-2011, 02:56 AM
Sorry I haven't been on the forum in such a long time but I did try Atlantis and now I have a problem with my .docx files. Every single one of them now has that A in the circle on them and I can't access them through Office Word anymore except as a read-only file. How do I get them back the way they were before?

DaleDe
07-17-2011, 10:07 PM
Sorry I haven't been on the forum in such a long time but I did try Atlantis and now I have a problem with my .docx files. Every single one of them now has that A in the circle on them and I can't access them through Office Word anymore except as a read-only file. How do I get them back the way they were before?

You should be able to access them with a right click and select Word from the menu. This problem is a Windows issue. Inside the Windows file browser there is a Tools menu with a Folder Options entry. Inside Folder Options there is a file association choice, and there you can pick which application to associate with a given extension. Atlantis changed the default when it was installed. Word may also have an option somewhere on its menus to change the extension association, but I don't know where it might be in the version of Word you own.

Dale

wildchild1954
07-18-2011, 03:21 AM
Is there any way in Atlantis to change the default back to Office Word 2007? I'll see if I can find a way to change it back in Word but until then I have deleted the Atlantis program as that was the only way I could get the files back to .docx opening in Word.

Toxaris
07-18-2011, 04:28 AM
Actually DaleDe gave you the answer already. Right click on the file, choose 'Open with' and then 'Choose Program'. Select Word and it should be fixed.

wildchild1954
07-18-2011, 12:19 PM
That's going to take forever if I have to do that for every single .docx file that I have!! You're talking about well over 800 files that would have to be done one by one. What I want to know is if there is any way to stop Atlantis from setting itself as the default or to change it back to Word from Atlantis without doing every single file.

Toxaris
07-18-2011, 03:05 PM
You only need to do that once. Doing so changes how all .docx files are opened, not just that one file.

awp
07-22-2011, 02:35 AM
Is there any way in Atlantis to change the default back to Office Word 2007? I'll see if I can find a way to change it back in Word but until then I have deleted the Atlantis program as that was the only way I could get the files back to .docx opening in Word.

Please see also this article:
Changing file associations (http://atlantiswordprocessor.blogspot.com/2010/03/changing-file-associations.html)

wildchild1954
08-02-2011, 02:32 PM
I want to say that I figured out the steps I was questioning in the Atlantis program, tried the trial, and bought it!! It works wonderfully to create ePub files from my Word documents and the only changes I had to make were downsizing my fonts and changing a few of them to something Atlantis would support. That wasn't really a big problem and the support team at Atlantis was wonderful about helping me with that problem.

So I would thoroughly and enthusiastically recommend this program to anyone looking to create ePub files from Word documents.