05-15-2013, 01:35 AM | #1 |
Witchman
Workflow: Converting to Kindle and EPUB
WORKFLOW
Converting a Word document to eBook Format

This document puts forward a standard step-by-step procedure to generate eBooks in EPUB, Kindle and other formats with one general workflow. The method can be done entirely with applications that won't cost you a penny. It converts a Word doc to eBook format without first having to nuke all the MS Word formatting, say by passing the text through a text editor like Notepad.

To do the conversion you will need the following applications:
* An HTML editor like DreamWeaver or HTML-Kit (free software).
* Sigil (free software from SourceForge) to convert HTML --> EPUB.
* Calibre (free software from SourceForge) to convert HTML --> Kindle format, EPUB and other formats.
* Gimp (free software from SourceForge) for designing images for book covers.
You can easily download the free software from the internet.

Here is the procedure:

1.) First, create an HTML CSS stylesheet based on the Word Styles used in your document:
• Create, change and save all the styles that are used in your Word document. This involves creating and renaming heading, paragraph and font styles with names like BODYText, CHAPTERNumber, INTROHeader, QUOTE, COPYRIGHT, LICENCEText, TITLEMain etc. Do this for every single style that you use in your eBook document.
• Now create a CSS stylesheet with the same formatting and the same style names as the Word Styles defined for your eBook. To create the stylesheet, use an HTML editor like DreamWeaver, Sigil or HTML-Kit.
• If required, save your style definitions to Normal.dot (Normal.dotm in Word 2007 and later), the global Microsoft Word template, so that you can reuse them when converting other docs and books to eBooks.
• To learn about creating and using CSS stylesheets, go to the resource at www.w3c.org for full documentation.
• Reformat all chapters with classes derived from h1 (Heading 1). Heading 1 classes allow page breaks before and after the chapter number headings.
A Heading 1 (h1) style class should be defined like this in the stylesheet:

    .CHAPTERNumber {
        display:block;
        margin-top:0em;
        margin-right:0em;
        margin-bottom:2em;
        margin-left:0;
        text-align:center;
        text-indent:0;
        page-break-after:avoid;
        font-size:1.1666em;
        /* serif is a generic keyword, so it must not be quoted */
        font-family:"Book Antiqua",serif;
        text-transform:uppercase;
        font-weight:normal;
        page-break-before:always;
    }

Declared in this way, with no element name in front of the class selector, the CHAPTERNumber class can be used globally: on p, div, h1, h2, h3 etc. elements or wherever you like. It ain't object-oriented programming but it sure lightens the workload!!
• Use only em, percentage (%) or other relative values to define the lengths and sizes of fonts, spaces, indents etc. in the HTML version of your ebook. This lets the text in your ebook reflow and resize easily on different hardware, i.e. the different screen sizes of mobile phones, tablets and PCs. Absolute values defined in inches, cm, pixels etc. prevent your ebook text from being resized automatically on different devices.
• Remove all blank lines, i.e. lines that contain only <br /> or <p class="BODYText"> </p> etc. Remove ALL BLANK LINES IN THE DOCUMENT.

2.) Reformat your Word doc to EPUB and Kindle eBook format. I prefer getting rid of all the Microsoft class names in the HTML and CSS version and replacing them with the equivalents that I have named in my MS Word Paragraph Styles.
• Save your book document as Web Page, Filtered (htm, html).
• Using an HTML editor like DreamWeaver or HTML-Kit, search for all <br /> tags and replace them with <p class="BODYText"> </p>. Doing this replaces a hard break with BODYText, which is my own redefinition of MS Word's Normal style (shown as MsoNormal in the HTML). The space between the tags just defines a blank character space on a line.
• Remove all paragraph end tags that contain <br /> in the HTML and replace them with a space using Search and Replace. Remove all unnecessary whitespace or blank lines from the HTML.
• Remove all tabs from the document by searching for ^t and replacing it with nothing.
• Remove all MsoNormal classes, e.g. <p class="MsoNormal">, from the HTML and replace them with your own definition for the body text (the main style that your actual story is written in), such as BODYText.
• Any intentional whitespace that you wish to leave in should be reformatted as: <p class="BODYText"> </p>
  The <br /> and <br> tags cause innumerable formatting problems if they are left in the HTML. So remove all of them!!
• Main headings, chapters and image tags (used at the start of chapters) should only be defined using style classes that you have derived from the Heading 1 (h1) style, e.g. CHAPTERNumber, PROLOGUEHeader etc.
• The last thing you should do in the HTML is a final search and replace on the <br /> tag to make sure they are all gone, in case you put some line breaks in yourself while reformatting!!
• Generate a TOC for your EPUB conversion in the Sigil app. Make sure that you have used <h1> tags, with your own style set in the class attribute of each <h1>, to define the main headings and chapters. The EPUB's TOC is generated by picking up the h1, h2, h3 etc. tags.
• Run the EPUB through Calibre to convert it into Kindle format.
• Test the Kindle version in the Kindle app, and test the EPUB version in Kindle Previewer and in Adobe Digital Editions (free software download).
• You now have your ebook in EPUB and Kindle (and other) formats.

Various Problems Arising
• The worst problem that I had through all of this was trying to set a small JPEG glyph picture above the PROLOGUE heading for a nice effect. No matter what I did, the picture or glyph always caused a page break after it. Then I realized that the two properties, page-break-after:avoid and page-break-before:avoid, were ignored unless the element holding the image tag was declared as being derived from h1. Then it worked like a dream and did not create a page break!!
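The search-and-replace passes in step 2 could also be scripted rather than done by hand in an HTML editor. The following is only a minimal Python sketch of those passes, not part of the original workflow: the function names clean_word_html and count_line_breaks are hypothetical, and it assumes the filtered Word HTML has already been read into a string.

```python
import re

BODY_CLASS = "BODYText"  # the body style name used in this post

# Matches <br>, <br/> and <br /> in any letter case.
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)

# Matches class=MsoNormal with or without quotes, but not longer
# class names that merely start with MsoNormal.
MSO_NORMAL = re.compile(r'class=(["\']?)MsoNormal\1(?![\w-])')


def clean_word_html(html: str, body_class: str = BODY_CLASS) -> str:
    """Apply the step 2 search-and-replace passes to a string of HTML."""
    # Remove tab characters (the ^t search in the post).
    html = html.replace("\t", "")
    # Swap Word's MsoNormal class for your own body text class.
    html = MSO_NORMAL.sub(f'class="{body_class}"', html)
    # Replace every hard break with an intentional blank paragraph.
    blank_paragraph = f'<p class="{body_class}"> </p>'
    return BR_TAG.sub(blank_paragraph, html)


def count_line_breaks(html: str) -> int:
    """Final check: how many <br> tags are still left in the HTML."""
    return len(BR_TAG.findall(html))


if __name__ == "__main__":
    sample = '<p class=MsoNormal>Chapter\tOne</p><br /><p class="MsoNormal">Text</p>'
    cleaned = clean_word_html(sample)
    print(cleaned)
    print("remaining <br> tags:", count_line_breaks(cleaned))
```

A <br /> that sits inside an existing paragraph would end up as a paragraph nested in a paragraph, so the output of a script like this still needs checking by eye in the HTML editor.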
Simple Workflow Summary
* Create a Word HTML (Filtered) file from your Word document or book.
* From the HTML, create your own CSS style definitions in your own formatting style.
* In Calibre, use Convert Books to convert the Word HTML (Filtered) file to EPUB format. Add your cover image in Calibre; it's the easiest way. Turn off Heuristic Processing in Calibre. It causes more problems than it solves, and you will get fewer formatting errors with it off.
* Converting from HTML to EPUB in Calibre will remove some of your own formatting styles (I don't know why) and there will be some other errors. Open the generated EPUB file in Sigil. Import your own CSS file (which represents your own formatting style) into Sigil's Styles directory. Manually fix the tags in the HTML and, where necessary, transfer classes from your own CSS stylesheet to the Sigil stylesheet, called stylesheet.css. This isn't a lot of work but you will need to know about HTML and CSS stylesheets.
* Once your reformatting and error checking is complete, use Calibre again to convert the finished, reformatted EPUB to Kindle or whatever format, then test it. There should be few problems.
* You now have an EPUB and a Kindle file to upload as fully formatted ebooks.

The whole point of the above method was just to find the easiest and quickest way to end up with a Kindle or EPUB ebook ready for upload.

I am currently testing another method of converting to Kindle and EPUB:
* Import your Word doc (Word 2007) ebook into Adobe InDesign. Use File/Place and then, to drop the whole file in, hold down the Shift key and click in the top left of the text block with the mouse.
* Format your ebook.
* Export using the Kindle or EPUB plugins for InDesign (both free from the internet).
* Follow the same Simple Workflow as shown above. Your start point will be either the generated Kindle file or the EPUB file.

* InDesign Pros: Nice and quick. Generates everything: cover image, styles, embedded fonts etc. You can also generate a TOC with metadata. Another notable point and a big plus: whenever I've tested the EPUB generated from InDesign, there have never been any irritating or spurious blank pages in the EPUB ebook.
* InDesign Cons: InDesign will always introduce defaults that you do not want. The worst is the default text-indent:4.233mm, which is automatically generated for all the paragraph styles in your document. There is simply no way to get rid of problems like these except by going through each style and changing them manually later in the generated stylesheet in Sigil.
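If going through each style by hand in Sigil gets tedious, the absolute text-indent values could be rewritten in one pass over the stylesheet text before it is pasted back. This is only a rough Python sketch under stated assumptions: the function name relativise_indents and the 1.5em default are hypothetical choices, not anything InDesign or Sigil provides, and the replacement value should be whatever relative indent you actually want.

```python
import re

# Matches text-indent declarations given in absolute units,
# e.g. text-indent:4.233mm or text-indent: 12pt
ABSOLUTE_INDENT = re.compile(
    r"text-indent\s*:\s*-?\d*\.?\d+\s*(?:mm|cm|in|pt|px)\b",
    re.IGNORECASE,
)


def relativise_indents(css: str, replacement: str = "1.5em") -> str:
    """Swap every absolute text-indent for one relative value.

    The 1.5em default is only an illustrative assumption.
    """
    return ABSOLUTE_INDENT.sub(f"text-indent:{replacement}", css)


if __name__ == "__main__":
    generated = (
        "p.BODYText { text-indent:4.233mm; margin:0; }\n"
        "h1.CHAPTERNumber { text-indent:0; }\n"
    )
    print(relativise_indents(generated))
```

Declarations that are already unitless or relative, such as text-indent:0 or text-indent:1em, are left untouched, so only the generated absolute defaults change.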