View Full Version : Converting Word Doc with Tables to Epub?


dhume01
12-06-2010, 02:35 PM
Wondering if anyone could help. I'm trying to figure out an automatic way to convert a Word file that contains tables to EPUB.
I found a terrific version of Marx's Capital online here: http://marxists.org/archive/marx/works/1867-c1/index.htm

I tried it in Calibre, but the formatting wasn't particularly great (or there was some other problem - I tried the conversion about a week ago, and I'm remembering to post now). I have Atlantis Word Processor, but unfortunately it does not support tables.

It seems like it should be a simple conversion, but unfortunately I do not have the technical know-how.

CazMar
12-06-2010, 11:42 PM
I think to do this you will have to create a "table" in the HTML document in your EPUB file. Do you have any software to edit the HTML documents, like Sigil?

dhume01
12-07-2010, 03:37 PM
I tried to convert the doc to HTML via Word, but all that Sigil saw was the first page. I'll see if I can convert the doc to HTML via Calibre, then maybe open it up in Sigil.

CazMar
12-07-2010, 05:33 PM
Are these the tables as in "Chapter Eighteen: Various Formula for the rate of Surplus-Value" etc?

Have you tried simply importing each of the existing HTML documents in Sigil exactly as they are? You will need to also import the style sheet (let me know if you can't work out how to get this). A problem might also come up with the width of each table as they are formatted for a bigger computer screen.

CazMar
12-07-2010, 05:49 PM
Sorry - now I see what you are doing! Using the DOC file, not the HTML documents. (Must be having a slow day!)
This might be one document where my "study note" method will work better :
Handy hint? Study notes on ereader

I was trying to get some university study notes on my Kobo, with the thought I might actually read the stuff on the bus etc. I tried converting it to HTML, making EPUB books etc etc. Works but it's tedious even for an experienced HTML writer and using some of the original PDF files became hard to read. Yesterday I got an inspirational flash and used Open Office to create a set of study notes.
I created a page template exactly the same size as my Kobo screen and set a margin (about .5 cm should do it) and set the type to 10 pitch. Then I copied all my study notes out of the various documents and pasted them into the new template and checked everything was in 10 pitch. This should retain the bolding and new paragraphs from the original document.
Then just tell Open Office to export it as a PDF document. It's easy to read because it is already the same size as your screen, no having to resize.
A quick and dirty way to convert short documents - perhaps lists you need or similar. And you can re-use the template any time.
Until there is a free program to resize PDF documents this might be the easiest way to do this job.

One thing about this method is that you can reproduce difficult formatting with less pain, BUT of course you get a PDF file.
Have you tried using Open Office (Sun's free office suite)? It can be much more user-friendly when doing conversions to EPUB.

dhume01
12-08-2010, 09:12 PM
As it turns out, what I ended up doing was:

1. Edit the doc in Microsoft Word
2. Save the document as a filtered HTML file
3. Add the HTML file and associated folder to a zip file
4. Import the zip file to Calibre, and convert to EPUB
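
For anyone who wants to script the zip step, here is a minimal sketch in Python. It assumes Word saved the filtered HTML next to a resource folder named "<name>_files" (the usual naming on an English install, but check yours), and the function names are made up for illustration. The second function assumes calibre's ebook-convert command-line tool is installed and on your PATH; it is an alternative to importing the zip through the Calibre window, not what was done above.

```python
import subprocess
import zipfile
from pathlib import Path


def zip_filtered_html(html_file, zip_file):
    """Pack a filtered-HTML export and its companion folder into one zip."""
    html_path = Path(html_file)
    # Assumption: Word puts images and other resources in "<name>_files"
    # next to the HTML file; adjust if your export uses a different name.
    resource_dir = html_path.with_name(html_path.stem + "_files")

    with zipfile.ZipFile(zip_file, "w", zipfile.ZIP_DEFLATED) as archive:
        # Keep the HTML file at the top level of the archive.
        archive.write(html_path, html_path.name)
        if resource_dir.is_dir():
            for item in sorted(resource_dir.rglob("*")):
                if item.is_file():
                    # Store paths relative to the HTML file so links still resolve.
                    relative = item.relative_to(html_path.parent).as_posix()
                    archive.write(item, relative)
    return Path(zip_file)


def convert_with_calibre(zip_file, epub_file):
    """Run calibre's ebook-convert tool (must be installed and on PATH)."""
    subprocess.run(["ebook-convert", str(zip_file), str(epub_file)], check=True)
```

Usage would be something like zip_filtered_html("capital.htm", "capital.zip") followed by convert_with_calibre("capital.zip", "capital.epub"), where the file names are only examples.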

Surprisingly it worked pretty darn well. Mr. Goyal's software continues to impress.

CazMar
12-09-2010, 05:33 PM
Yes, he is amazing and could perhaps give lessons to certain other large software companies about making software that actually works!

eping
12-13-2010, 08:52 AM
You can try ePub Maker.
Import from Word, and support tables

Gdzhlpr
12-28-2010, 08:02 PM
I came here looking for a fix for my conversion issues. Your margin suggestions gave me an idea and, fingers crossed, I think I have finally found a solution (for me, anyway).

I use ABBYY FineReader corporate and have set the options as follows:
Open PDF
Save as HTML
Formatted Text in dropdown

OPTIONS SAVE TAB
HTML TAB
Keep line breaks
Keep headers and footers
Generate a table of contents checked
Automatically create files based on headings

After typing up a WordPad doc with all of my settings, I learned I can auto-save in ABBYY but not in Calibre, so the time was not wasted. Here are the settings I changed for Calibre, in the Preferences tab:
Preference Defaults

SENDING BOOKS TO DEVICES:

My Files/Books/{author_sort}/{series}/{series_index} - {title} - {authors}

The line above is the template for sending books to the device in Calibre, per the MobileRead forums.
I added tags, so the new line is:

My Files/Books/{author_sort}/{series}/{series_index} - {title} - {authors} - {tags}
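
To see roughly what a template like this turns into, here is a small Python sketch. It only imitates the placeholder filling with plain string formatting; Calibre's own template handling does more (empty fields, illegal characters, several tags at once), and the book details below are made up for illustration.

```python
def expand_template(template, metadata):
    """Fill each {field} placeholder from a metadata dictionary."""
    return template.format(**metadata)


# The save template from the post above, with the added {tags} field.
template = (
    "My Files/Books/{author_sort}/{series}/"
    "{series_index} - {title} - {authors} - {tags}"
)

# Hypothetical metadata for one book (not taken from any real library).
book = {
    "author_sort": "Marx, Karl",
    "series": "Capital",
    "series_index": "1",
    "title": "Capital, Volume I",
    "authors": "Karl Marx",
    "tags": "Economics",
}

print(expand_template(template, book))
# prints: My Files/Books/Marx, Karl/Capital/1 - Capital, Volume I - Karl Marx - Economics
```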

COMMON OPTIONS
Look & Feel
utf-8

Structure Detection
Preprocess input file to possibly improve structure detection

TABLE OF CONTENTS
Force use of auto-generated Table of Contents is checked

Too bad Calibre doesn't let you save these settings too. However, I did see a plugin tab where I can create and save a template, so that may actually be a possibility.
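
One possible way to keep settings like these in one reusable place is a small script around calibre's ebook-convert command-line tool. This is only a sketch under assumptions: it reads the utf-8 entry as the input character encoding, it covers only the encoding and the forced auto-generated table of contents, and it leaves out the structure-detection preprocessing because that option's name has differed between calibre versions. Check `ebook-convert --help` on your install before relying on the flag names; the function names are made up for illustration.

```python
import subprocess

# Saved options mirroring two of the settings listed above.
# Assumption: these flag names match your calibre version.
SAVED_OPTIONS = [
    "--input-encoding", "utf-8",  # utf-8 entry, read as the input encoding
    "--use-auto-toc",             # always use the auto-generated table of contents
]


def build_command(input_file, output_file, options=SAVED_OPTIONS):
    """Return the ebook-convert command line as a list of arguments."""
    return ["ebook-convert", input_file, output_file, *options]


def convert(input_file, output_file):
    """Run the conversion (needs calibre installed and on PATH)."""
    subprocess.run(build_command(input_file, output_file), check=True)
```

Calling convert("book.html", "book.epub") would then apply the same options every time; the file names are only examples.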

I also discovered the ABBYY hot folder: a nifty little folder I can drop PDF files into, and it will automatically perform the above formatting. Keep in mind there will be no spell check, so count on some errors. I just hope it all works as set up and I don't jinx this process by bragging here. If I had searched a little more I might have found this information sooner. One thing I still may change is the minimum font size on the output file in Calibre. I may set it at 10 or 12, since the reason for the NOOKCOLOR in the first place is print size :)

Good luck, and happy formatting ;) I will post back with any additional changes.