#1 |
Connoisseur
Posts: 86
Karma: 470352
Join Date: Dec 2012
Device: Kindle Fire, IPad
|
Clean HTML from Word for EPUB
I have a Word doc which I saved as HTML. How can I clean up the HTML while retaining the styling that was originally done in Word, so I can create an EPUB in Sigil?
Any help? |
#2 |
Well trained by Cats
Posts: 31,039
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Moderator Notice
This is still not a Sigil question. This is a WORD-EPUB question. Moving to EPUB per your reply to the previous thread. |
#3 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
You can clean it up manually (there are some guides out there) or use other tools to produce either clean HTML or an ePUB directly from Word.
|
#4 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
Toxaris has modestly refrained from tooting his own horn about a macro he has created, which is shown in his signature line and which a number of people have reported useful.
It does NOT have the capacity to change the essential nature of epubs; nothing does. Epub text is reflowable, so if you want something at a specific position on a certain page you are SOL (simply out of luck). There are other limitations, such as the aggravation of tables and the fact that indexing takes an enormous amount of work because of all the links that have to be created; indexing to multiple return locations is an invitation to insanity. Sigil has a function that helps with indexing. Everything in an epub will vary across devices, which is a headache for Hitch, who heads a company that produces epubs. For some devices there are fixed-layout epubs, which are not full featured. It is a bit like going to the beach and complaining that the sand is not solid concrete. |
#5 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
I would actually not recommend the macro, but rather the add-in, if you can run it. It offers many more features and will actually let you create an ePUB from a Word document, ready to be finalized with Sigil.
|
#6 |
Member
Posts: 15
Karma: 48
Join Date: Dec 2011
Device: Kindle4 Touch
|
I use TextPipe to automate the cleaning process.
#7 |
mostly an observer
Posts: 1,519
Karma: 987654
Join Date: Dec 2012
Device: Kindle
|
I run my Word docs through word2cleanhtml.com online. Requires a template and preferably a style sheet, both of which are on my blog:
The blog: <a href="http://notjohnkdp.blogspot.com">Notjohn's KDP Guide</a> |
#8 |
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
|
You can use Atlantis Word Processor to read your doc file and create an ePub directly from the app. It will retain your formatting and make a clean ePub. You can read about it in our wiki, and I am working on a review (AWP Review).
|
#9 |
Wanderer
Posts: 106
Karma: 472218
Join Date: Jan 2011
Device: Kindle 3, PaperWhite 2
|
If you are going to use the cleaned up file in Sigil, or another tool that uses external style sheets, the process is quite easy.
Make sure that all (I mean ALL!) text in the Word document has been formatted using styles. Create an external .css file that contains the styles you used in the Word document. Save the Word document as a filtered web page. Open the resulting file in an editor and delete everything from <style> to </style>. Insert a link to the stylesheet and you are done! Bob |
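The last two steps (deleting the embedded style block and linking the external stylesheet) can also be scripted. The following is only a rough Python sketch, not something Bob posted: the function name `link_external_stylesheet` and the default file name `styles.css` are made up for illustration, and it assumes the filtered HTML contains a normal `</head>` tag.

```python
import re


def link_external_stylesheet(html: str, css_href: str = "styles.css") -> str:
    """Drop embedded <style> blocks and link an external stylesheet instead.

    Hypothetical helper: the name and the default css_href are assumptions.
    """
    # Remove every <style ...>...</style> block, including its contents.
    without_styles = re.sub(
        r"<style\b[^>]*>.*?</style\s*>\s*",
        "",
        html,
        flags=re.IGNORECASE | re.DOTALL,
    )
    link_tag = f'<link rel="stylesheet" type="text/css" href="{css_href}" />\n'
    # Insert the link just before the first </head>. A function is used as the
    # replacement so backslashes in css_href are not treated as escapes.
    linked, count = re.subn(
        r"</head\s*>",
        lambda m: link_tag + m.group(0),
        without_styles,
        count=1,
        flags=re.IGNORECASE,
    )
    if count == 0:
        raise ValueError("no </head> tag found; is this a complete HTML file?")
    return linked
```

To use it, read the filtered file into a string, pass it through the function, and write the result back out. Word often saves filtered HTML in a Windows code page rather than UTF-8, so check the `charset` declared in the file before choosing an encoding for reading and writing.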
#10 | |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
Quote:
|
|
#11 |
Wanderer
Posts: 106
Karma: 472218
Join Date: Jan 2011
Device: Kindle 3, PaperWhite 2
|
Toxaris: I have not found that to be true. As long as I do everything using styles, there are no extra spans or fonts, even when I have Word create the TOC and endnotes.
The only extra thing I deal with is replacing <p class="MsoNormal"> with just <p>. I am using Word 2010 on a Windows machine. Bob |
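That one remaining replacement is easy to automate as well. Here is a small Python sketch, offered as one possible approach rather than Bob's actual method: the function name `strip_default_paragraph_class` is invented for this example, and the pattern deliberately tolerates quoted or unquoted values and both the `MsoNormal` and `msnormal` spellings, which is an assumption about what may turn up in the file.

```python
import re

# Matches <p class=MsoNormal>, <p class = "msnormal">, and the same tag with
# further attributes after the class. Other classes are left alone.
_DEFAULT_P = re.compile(
    r"""<p\s+class\s*=\s*["']?(?:MsoNormal|msnormal)["']?(\s[^>]*)?>""",
    re.IGNORECASE,
)


def strip_default_paragraph_class(html: str) -> str:
    """Turn Word's default-class paragraphs into plain <p> tags.

    Hypothetical helper; any attributes following the class are kept.
    """
    return _DEFAULT_P.sub(lambda m: "<p" + (m.group(1) or "") + ">", html)
```

Paragraphs that carry one of your own style names are not touched, so the external stylesheet still applies to them.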
Thread | Thread Starter | Forum | Replies | Last Post |
Word macro for clean HTML code | Toxaris | ePub | 135 | 02-28-2015 02:21 AM |
Clean HTML from word | holdit | Workshop | 6 | 10-09-2013 05:20 PM |
How to Clean/Strip HTML from epub file? | Jimbo724 | General Discussions | 9 | 12-12-2012 11:22 AM |
Converting Word-> HTML -> Epub | arturox | Conversion | 37 | 07-18-2012 10:29 AM |
Docvert 2.0 converts MS Word files to clean HTML | Alexander Turcic | Lounge | 0 | 03-16-2006 04:50 AM |