12-22-2008, 02:48 PM   #1
ProDigit
My share on making an ebook! For easy, medium, and semi-advanced books.

Hi,

I've been busy for quite some time now searching for programs that let me create ebooks quickly.

Most of the time I use MS Word 2007 to remove Gutenberg or other websites' headers and footers, check for spelling and grammar errors, and save the book as HTML. I then import it into Book Designer to edit it further and make my ebook file from it.
Word 2007 also has an AutoFormat option (Ctrl+Alt+K).
Sometimes it converts a txt file, in one click, into a document with titles and paragraphs. I even once had Word automatically correct manual line breaks and paragraph breaks, so that the text flowed as it should (without sentences being split by a hard enter).
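
For txt files where AutoFormat doesn't fix the hard enters, the same unwrapping can be sketched outside Word. This is only a minimal Python sketch, and it assumes that paragraphs in the txt file are separated by at least one blank line (as in most Gutenberg texts); the function name is made up for this example.

```python
import re

def unwrap_paragraphs(text: str) -> str:
    """Join hard-wrapped lines; blank lines are kept as paragraph breaks."""
    # Normalize Windows/Mac line endings to "\n" first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Assumption: one or more blank lines mark a paragraph break.
    paragraphs = re.split(r"\n\s*\n", text.strip())
    # Inside a paragraph, collapse every run of whitespace
    # (including the hard enters) into a single space.
    return "\n\n".join(" ".join(p.split()) for p in paragraphs)
```

Note that this would also join lines that are meant to stay separate (poetry, verse, tables), so those parts still need a manual check.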

I'd also recommend checking the document in Word's Web Layout view, as occasional hard-enter errors are easier to spot there than in Draft or Print Layout view.

Generally, in Book Designer I remove empty lines, fix page, line and paragraph breaks, set subtitles, and insert a cover picture or the occasional additional picture. Then I create a TOC (table of contents) and export to LRF.
BD also supports other formats, though (LIT, PRC and others).

This will do for the majority of the books available, and is in my eyes about the simplest way to create an ebook. It may differ depending on the file format you need. I use the Sony Reader (PRS-505), so I need to use BD.
It would be nice to hear from other users who convert to other formats.



Next are some slightly more advanced tasks:

Some books have endnotes or references, which I replace with Word endnotes. Footnotes don't always work.
Books that have an index at the back (listing words used and their page numbers) I usually strip of that index in Word, since page numbers make no sense in a document with a different page size and numbering.

Semi-Advanced 1
For several days now I've been busy with the Old Testament of the Bible.
Like some scientific books, it uses a lot of links that let you jump to chapters faster, or cross-link between passages.
Because of the size of the book, I will create the TOC manually in MS Word and disable the Book Designer TOC.

In Word, TOCs, hyperlinks, cross-links and the like are best inserted using bookmarks.
You can then hyperlink text to a bookmark.
Every link except endnotes is best inserted via bookmarks.
You can also hyperlink a word directly to a chapter title or subtitle, but that seems to cause issues when you modify the document.
Also, Word creates a bookmark for every link you click. Those bookmarks often have meaningless names like "_Hlt217721467" and are hard to trace back when you need to look up a link. Give your own bookmarks logical names (e.g. a title, or the name of the word, sentence or picture they refer to).
The links Word creates automatically also only make the file larger and slower to convert in Book Designer, and may even cause trouble in a book that already contains a lot of links.

Therefore, before importing the file into Book Designer, make sure you remove the bookmarks Word created automatically when you clicked links inside the document.
MS Word seems to add such a bookmark every time a link is clicked (you can edit bookmarks by pressing Ctrl+Shift+F5 in Word).
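
If the file has already been saved as HTML, the automatic bookmarks could also be stripped with a small script instead of deleting them one by one in the bookmark dialog. The Python sketch below is untested against real Word output. It assumes the HTML export writes each automatic bookmark as an anchor such as <a name="_Hlt217721467">...</a>, and that such an anchor never wraps another <a> element; anchors that do are left alone. The function name is made up for this example.

```python
import re

# Assumption: Word's automatic bookmarks appear in the HTML export as
# <a name="_Hlt<digits>">...</a>. The inner group refuses to cross another
# opening <a ...> tag, so an anchor wrapping a link is left untouched.
AUTO_BOOKMARK = re.compile(
    r'<a\s+name\s*=\s*"?_Hlt\d+"?\s*>((?:(?!<a\b).)*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

def strip_auto_bookmarks(html: str) -> str:
    """Remove _Hlt... bookmark anchors but keep the text they wrap."""
    return AUTO_BOOKMARK.sub(r"\1", html)
```

Bookmarks you named yourself (anything not starting with "_Hlt") are not matched, so hand-made links keep working.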


So far I have had no success with MS Word 2007 Crosslinks.
I don't know if that option is available in earlier Word editions, but it is in the 2007 edition I work with.

Since the Bible is one of the books with the most chapters around, creating all these hyperlinks and bookmarks is very painful, so I've been looking for ways to semi-automate the job.

I use a lot of 'search & replace' and macros in Word. They reduce my workload by tons!

But certain links (like 'Next Chapter' and 'Previous Chapter', which a normal book doesn't need, but the Bible with its way over 400 chapters does) cannot be added automatically.
I realized I needed an editor with good 'search and replace' functionality, and so far I've found a couple of programs that I'm currently testing.
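
Outside Word, the 'Next Chapter' and 'Previous Chapter' links could in principle be generated by a small script and then pasted into the HTML. The Python sketch below only builds the link lines; it assumes you already have the chapter bookmark names in reading order, and the names used here (like "_Chapter_1") and the function name are just examples.

```python
def chapter_nav(bookmarks: list[str]) -> dict[str, str]:
    """Map each chapter bookmark to an HTML line linking to its neighbours."""
    nav = {}
    for i, name in enumerate(bookmarks):
        links = []
        if i > 0:
            # Every chapter except the first gets a link back.
            links.append(f'<a href="#{bookmarks[i - 1]}">Previous chapter</a>')
        if i < len(bookmarks) - 1:
            # Every chapter except the last gets a link forward.
            links.append(f'<a href="#{bookmarks[i + 1]}">Next chapter</a>')
        nav[name] = "<p>" + " | ".join(links) + "</p>"
    return nav

# Hypothetical bookmark names; in practice the list would hold all chapters.
for name, line in chapter_nav(["_Chapter_1", "_Chapter_2", "_Chapter_3"]).items():
    print(name, line)
```

Where exactly each line goes in the document (above or below the chapter title) is still a manual or search-and-replace decision.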

I previously tried editing my HTML files with Notepad, but it seems Notepad sometimes leaves unwanted traces in the document.
That's especially so when editing .doc files. On top of that, once a .doc file gets larger than 1 MB, search and replace in Notepad becomes painfully slow, even on a 1.66 GHz dual-core machine.
Later on I found that editing .doc files makes little sense, since Book Designer converts Word documents directly to .htm files, and .htm files are easier to edit.
So I searched for a hex/binary editor.
Editors of this kind usually don't leave any traces, but it seemed I could not insert or delete a character in the hex editors I had.

Having learned that BD creates .htm documents, I decided I needed to edit the HTML version.
MS Word, however good a program it is, would not let me see the HTML as plain text, so I decided to go for an external HTML editor (not related to or built into Windows or Word).
I tried several programs, including the painfully slow old 'Edit.com' at the Windows command prompt.
Though Edit works fine, there are issues with copy/paste and lines that won't fit the screen, and there is no 'undo'.

While searching I also noticed that BD has an internal HTM editor. However, that editor changes links and makes them longer; e.g. the link '#_Chapter_1' would be converted by BD to '#_file_location/filename/Chapter_1' or something like that.

So for now I've settled on Microsoft Visual Web Developer 2008 Express Edition.
Having seen their MS Visual Basic 2008 Edition and MS Visual C#/C++ Edition, I can say I'm quite impressed with the program.
It is a text-based HTML editor that organizes and color-codes everything inside an HTM or HTML file to make it easier to edit.
However, loading a large file is very slow.

Microsoft Visual Web Developer 2008 Express Edition is a free download.
You'll need to register it (for free) within 30 days, and the program will keep working fine after that.

I'd like to ask forum members who have experience with text-based HTML editors which ones they would recommend. Obviously it needs to be free and create an HTM(L) document without trouble; so far I think this program, with its built-in macro support, is close to the best available.

After some testing I looked inside the HTML that Word creates, and it seems Word puts a lot of totally unnecessary data in there (like headings and fonts I never used in the document!).
Does anyone know if Book Designer removes those automatically?
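
To get an idea of how much of that data is really unused, a script could list the style classes that the HTML defines but never applies. The Python sketch below doesn't say anything about what Book Designer does with them. It assumes the styles sit in <style> blocks in the file and that class names are applied via class=... attributes (quoted or unquoted); the function name is made up for this example.

```python
import re

def unused_style_classes(html: str) -> set[str]:
    """Return class names defined in <style> blocks but never used in the markup."""
    style_block = re.compile(r"<style[^>]*>(.*?)</style>", re.IGNORECASE | re.DOTALL)
    css = " ".join(style_block.findall(html))
    # Drop the declaration bodies {...} so only the selectors remain.
    selectors = re.sub(r"\{[^}]*\}", " ", css)
    defined = set(re.findall(r"\.([A-Za-z_][\w-]*)", selectors))

    # Look for class attributes outside the <style> blocks.
    markup = style_block.sub(" ", html)
    used = set()
    pattern = r'''\bclass\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))'''
    for dq, sq, bare in re.findall(pattern, markup, re.IGNORECASE):
        used.update((dq or sq or bare).split())
    return defined - used
```

This only reports the names; whether it is safe to delete those style rules by hand is something to test on a copy of the file first.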

Semi-Advanced 2
All of the above works fine for converting texts that already exist in electronic form.
If you have a p-book (paper book) you want to convert, there are some additional steps to take before editing and laying out the book.
You'll need a scanner, or documents that have already been scanned as images, preferably at a resolution higher than 200 dpi.
Then you'll need a program with OCR support that can read, recognize and convert the text.
Programs with OCR support are generally very expensive, in the 350-400 dollar range.
I can't help you find a free one; I only tried trial versions of 2 paid programs, and found that even they still made a lot of errors.
Of the 2 programs I tried, I would recommend Adobe Acrobat 9!

It scans your document pretty fast, and its OCR recognizes text better than the other program I tried.

After OCR has completed, copy and paste the text from Acrobat into a Word document.

In Word, the first thing to do is find all the titles and paragraphs and set their style (meaning their color, size and font type).
I don't know why, but it seems a scanned text still contains a lot of font formatting errors.

Next comes correcting the text manually.
I would suggest using only 'search and replace' in Word, rather than correcting each word individually.
The reason: OCR makes the same mistake repeatedly throughout a document.
With search and replace you can find every occurrence of a string like "'IIne" and replace it with "The" (the word it should be).

Working this way drastically reduces the workload the further you get through the document, since all earlier errors have already been corrected.

You can also search for fragments. For instance, if the OCR spelled the word "workable" as "workalole" (a 'b' was misread as 'lo'), you can search and replace "alole" with "able".
That way words like "do-able", "able", "workable" and "table" get corrected as well.

But don't correct strings that are too short. In the example above, changing "lo" to "b" could turn the word "hello" into "helb".
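
The same search-and-replace idea can be sketched as a small Python script that runs a whole list of corrections in one go and refuses strings that are too short. The correction pairs are the examples from this post; the function name and the minimum length of 4 characters are only assumptions for this sketch, so pick whatever limit suits your text.

```python
def apply_corrections(text: str, corrections: dict[str, str], min_len: int = 4) -> str:
    """Apply every OCR correction to the whole text, rejecting risky short patterns."""
    for wrong, right in corrections.items():
        # Short patterns such as "lo" also match inside correct words ("hello").
        if len(wrong) < min_len:
            raise ValueError(f"pattern {wrong!r} is too short to replace safely")
        text = text.replace(wrong, right)
    return text

# Example pairs taken from the post above.
fixes = {"'IIne": "The", "alole": "able"}
print(apply_corrections("'IIne workalole talole", fixes))
```

Keeping the correction list in a file would let you reuse it for the next book scanned with the same OCR program.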

After the manual correction has been completed, follow the same steps as described above.



------------------------------------------------
The topic above this line will probably be a good guide for many starters on how to create a first ebook for the Sony Reader, and perhaps for other formats as well!
If anyone cares to, they may copy and paste it in full or in part to the wiki, if it's found to have enough value.
------------------------------------------------

Last edited by ProDigit; 12-22-2008 at 03:18 PM.