Old 01-27-2013, 12:26 PM   #1
jackibar
Enthusiast
jackibar began at the beginning.
 
Posts: 38
Karma: 12
Join Date: May 2010
Device: iPhone apps
Converting to a Clean html File

What is the easiest/best way to convert a book I've created in Pages (Mac) to a clean html file to then create my ePub in Sigil? In the past, I've manually taken the text file output and gone through each paragraph, adding back in the bold, italics, etc., but this book is 400 pages and has a LOT of formatting and that would just take forever!

I've tried the export to ePub feature from Pages, but I hate the way it outputs the file (very messy and lots of unnecessary css code) - I *could* take that file and clean it up if I have to...

But I'm wondering if there's any way to either copy and paste into Sigil and keep the formatting (just the basic formatting - bold, italic, underline) - or to import a file into Sigil and still have that formatting intact?

Thanks so much for any help!
Old 01-27-2013, 01:31 PM   #2
Turtle91
Guru
Turtle91 ought to be getting tired of karma fortunes by now.
Posts: 669
Karma: 3807234
Join Date: Dec 2012
Location: Shannon, Ireland today
Device: iPhone 5/iPad 1&2/Surface Pro/Kindle PW
Hey Jackibar!

The first step in making things easier is to ditch the Mac.... lol


Seriously though, there are a few programs that people have written to do what you suggest. There is one that you can find on this thread HERE that works very well for "Word on a PC", but I'm not sure if it has been tried on the Mac. I have looked over the code on the linked macro and I didn't see anything that stood out as bad. You may have some luck with it - are you any good at transcribing code?

There are also some more editing tools that I have seen discussed in this thread HERE, but I haven't used them.

If you have some knowledge/experience with search/replace in Pages you can:
- replace the end paragraph markers with </p><p>
- replace anything formatted as bold/italics with the proper <b>/<i> tags
- replace lines formatted as headers with the appropriate heading tag (<h1>, <h2>, etc.)

Then SelectAll from the main screen and copy/paste into your favorite text editor or directly into an html page in Sigil. That keeps the html tags but gets rid of ALL the extraneous codes.
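For anyone who would rather tidy the pasted result with a script, here is a minimal Python sketch of the last step. It assumes the text was joined with the `</p><p>` separator described above, which tends to leave an unbalanced tag at the start or end; the function name `wrap_paragraphs` is made up for illustration and is not part of Sigil or Pages.

```python
import re

def wrap_paragraphs(pasted: str) -> str:
    """Turn text joined by '</p><p>' separators into balanced <p>...</p> lines."""
    text = pasted.strip()
    # Drop a lone opening tag at the very start and a lone closing tag at the end.
    text = re.sub(r"^<p>\s*", "", text)
    text = re.sub(r"\s*</p>$", "", text)
    # Split on the separator that the search/replace step inserted.
    parts = re.split(r"\s*</p>\s*<p>\s*", text)
    # Ignore empty pieces left behind by a trailing separator.
    paragraphs = [p.strip() for p in parts if p.strip()]
    return "\n".join(f"<p>{p}</p>" for p in paragraphs)

if __name__ == "__main__":
    sample = "First <b>bold</b> line.</p><p>Second <i>one</i>.</p><p>"
    print(wrap_paragraphs(sample))
```

Inline tags such as <b> and <i> pass through untouched, so only the paragraph wrappers are rebuilt.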

I hope that helps!
Old 01-27-2013, 03:23 PM   #3
st_albert
Fanatic
st_albert calls his or her ebook reader Vera.
 
Posts: 544
Karma: 64420
Join Date: Feb 2010
Device: none
Possibly another suggestion: can Pages output other formats besides epub? If you could save the book as an rtf file, for example, you could then use LibreOffice (or OpenOffice) with the writer2xhtml extension to output a fairly decent epub that you could easily clean up in Sigil as necessary.

NB: writer2xhtml is a part of the writer2latex package at the above link.

All of the above software is free-as-in-speech, as far as I know.
Old 01-27-2013, 06:41 PM   #4
jackibar
Enthusiast
jackibar began at the beginning.
 
Posts: 38
Karma: 12
Join Date: May 2010
Device: iPhone apps
Quote:
Originally Posted by Turtle91 View Post
Hey Jackibar!

The first step in making things easier is to ditch the Mac.... lol
Nope, not gonna happen

Quote:
Originally Posted by Turtle91 View Post
Seriously though, there are a few programs that people have written to do what you suggest. There is one that you can find on this thread HERE that works very well for "Word on a PC", but I'm not sure if it has been tried on the Mac. I have looked over the code on the linked macro and I didn't see anything that stood out as bad. You may have some luck with it - are you any good at transcribing code?
The newest version actually does say it works on a Mac... But yes, I'm good at coding and tinkering and love to do so

I downloaded and got it installed but when I ran it, it gave a compile error, so not sure what it's mad about...! I'll try some of the other tools recommended and see what I can come up with. Thanks again for the help!
Old 01-27-2013, 06:43 PM   #5
jackibar
Enthusiast
jackibar began at the beginning.
 
Posts: 38
Karma: 12
Join Date: May 2010
Device: iPhone apps
Quote:
Originally Posted by st_albert View Post
Possibly another suggestion: can Pages output other formats besides epub? If you could save the book as an rtf file, for example, you could then use LibreOffice (or OpenOffice) with the writer2xhtml extension to output a fairly decent epub that you could easily clean up in Sigil as necessary.
Thanks for the tip - yes, Pages will export to pretty much any format - Word, ePub, PDF, RTF, and plain text... I saved as RTF and tried pulling that straight into Sigil, but it was way too messy and didn't maintain the bold and italics...

I hadn't heard of the other resources you mentioned, so I'll give that a try... If all else fails, I can always do the search/replace method - was just wondering if there was a more automated way of getting this done!
Old 01-30-2013, 11:39 AM   #6
Notjohn
Addict
Notjohn ought to be getting tired of karma fortunes by now.
 
Posts: 220
Karma: 246332
Join Date: Dec 2012
Device: Kindle
I believe Pages will save a file in *.doc format. I've never tried this (because I don't use Mac OS) but I have run an Open Office *.doc file through word2cleanhtml.com and it cleaned up nicely. Be worth a try, surely.
Old 01-30-2013, 11:07 PM   #7
Hitch
Bookmaker & Cat Slave
Hitch ought to be getting tired of karma fortunes by now.
Posts: 2,504
Karma: 13830371
Join Date: Apr 2010
Location: Phoenix, AZ
Device: Kindle2, iPad, KindleFire and NookColor
Quote:
Originally Posted by jackibar View Post
Thanks for the tip - yes, Pages will export to pretty much any format - Word, ePub, PDF, RTF, and plain text... I saved as RTF and tried pulling that straight into Sigil, but it was way too messy and didn't maintain the bold and italics...

I hadn't heard of the other resources you mentioned, so I'll give that a try... If all else fails, I can always do the search/replace method - was just wondering if there was a more automated way of getting this done!
When you say you "saved as RTF, and tried pulling that straight into Sigil," I assume that you exported the HTML first? I mean, rather than cut-and-paste into BV? I've had good luck with RTFs, overall, and I am 99% sure that I've used the Mini-Mac sitting on my desk to export a Pages file to doc/RTF and kept the individual text formatting, so I'm surprised to read you say that.

I've also been surprised at how clean an exported ePUB was from a very simple Pages document I was sent (as an experiment). It was poetry, and while the output was far from perfect, it would have been super-fast to regex it and have the book done and dusted. I didn't see anything untoward about the ePUB, given that it came from what is basically a word-processor. I think the output for clean-up is six of one, half-dozen of another, considering it will have to be cleaned either as HTML or XHTML. {shrug}. Just my $.02.

Hitch
Old 01-31-2013, 09:25 AM   #8
Notjohn
Addict
Notjohn ought to be getting tired of karma fortunes by now.
 
Posts: 220
Karma: 246332
Join Date: Dec 2012
Device: Kindle
Quote:
Originally Posted by Hitch View Post
I've also been surprised at how clean an exported ePUB was from a very simple Pages document I was sent (as an experiment).
So you think it is better to export the Pages document as an epub, open it in Sigil, and work on it there?

(As opposed to saving it as a *.doc file, then opening it say in Open Office and following one of the more traditional routes to generating html?)
Old 01-31-2013, 09:39 AM   #9
Turtle91
Guru
Turtle91 ought to be getting tired of karma fortunes by now.
Posts: 669
Karma: 3807234
Join Date: Dec 2012
Location: Shannon, Ireland today
Device: iPhone 5/iPad 1&2/Surface Pro/Kindle PW
+1 for Hitch's rec. If Pages puts out anywhere near a clean-ish ePub then putting that through Sigil is probably the best/fastest route. Sigil has pretty much replaced my normal text editor for cleaning up the HTML (that took a bunch of recreating saved regexes) and since it keeps everything packaged in a compliant ePub it is much more convenient.
Old 01-31-2013, 03:58 PM   #10
Hitch
Bookmaker & Cat Slave
Hitch ought to be getting tired of karma fortunes by now.
Posts: 2,504
Karma: 13830371
Join Date: Apr 2010
Location: Phoenix, AZ
Device: Kindle2, iPad, KindleFire and NookColor
Quote:
Originally Posted by Notjohn View Post
So you think it is better to export the Pages document as an epub, open it in Sigil, and work on it there?

(As opposed to saving it as a *.doc file, then opening it say in Open Office and following one of the more traditional routes to generating html?)
nj:

(Between you and Bear, this place is starting to feel very familiar, LOL)...yes, generally, I think I would. I'm not going to state it categorically, because my experience is limited. I don't much care for the Mac environment, so have not spent a lot of time banging around in it. However, when I received that poetry in Pages, and had to open it over on the Mac, while I was exporting the .doc format, I thought, "what the hell" and tried other export options. The ePUB output from the original Pages file was remarkably clean. It would have been a matter of minutes (because we already have the relevant CSS ready-to-go) to rename the elements as needed, marry it to our House CSS and have it done. Now, I want to reiterate that it was an extremely simple file, just one line of text, followed by "enter" after another, so it wasn't a complicated test.

To me, it would have been fast either to a) do it in Sigil or b) explode the exported ePUB, open all the html files in NT Pro, and regex it there. I don't see that going Pages-->Doc-->HTML-->Sigil would have been faster. Now, that's someone who does this all the time speaking. I'm not sure it would be more intuitive for a noob DIY'er, like the folks on the KDP. That's my caveat here. It's fine if you know HTML/XHTML/Regex. Probably not so easy if one doesn't, or needs to upload Word or HTML at the KDP.
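As a rough illustration of the "explode the ePUB and regex it" route: an ePUB is a zip archive, so Python's standard `zipfile` module can read the HTML/XHTML files inside it. The sketch below is only an assumption about how one might script it, not how anyone in this thread actually works; the function name `clean_epub_html` and the sample pattern are hypothetical, and it assumes the content files are UTF-8.

```python
import re
import zipfile

def clean_epub_html(epub_file, pattern, replacement):
    """Return {member name: cleaned markup} for each HTML/XHTML file in an ePUB."""
    cleaned = {}
    # An ePUB is a zip container; epub_file may be a path or a file-like object.
    with zipfile.ZipFile(epub_file) as book:
        for name in book.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                # Assumption: the content documents are UTF-8 encoded.
                markup = book.read(name).decode("utf-8")
                cleaned[name] = re.sub(pattern, replacement, markup)
    return cleaned
```

For example, `clean_epub_html("book.epub", r'\s+class="[^"]*"', "")` would return each content file with its class attributes stripped. Nothing is written back to the archive, so the original export stays untouched while you experiment with patterns.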

Hitch
Old 01-31-2013, 04:05 PM   #11
Turtle91
Guru
Turtle91 ought to be getting tired of karma fortunes by now.
Posts: 669
Karma: 3807234
Join Date: Dec 2012
Location: Shannon, Ireland today
Device: iPhone 5/iPad 1&2/Surface Pro/Kindle PW
Quote:
(Between you and Bear, this place is starting to feel very familiar, LOL)
lol!!
Old 02-01-2013, 03:01 AM   #12
Pranananda
Connoisseur
Pranananda can see what is invisible to the naked eye.
Posts: 97
Karma: 115862
Join Date: Apr 2010
Location: Humboldt County, California
Device: ipad, iPod touch, JetBook Lite
TextEdit puts out clean HTML. If the paragraph and font formatting is complex, you can remove the multiple fonts and multiple font sizes by selecting the entire text and using the tools on the toolbar to change all the fonts to the same font (but preserving the italic and bold), as well as changing all the font sizes to just one size (or leaving the sizes all the same). Plus, you can examine the HTML code in TextEdit by changing the preferences to show the HTML codes instead of rendering the rich text. Afterwards, you can hand edit the HTML code, and just remove the font property altogether.
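If hand-editing gets tedious, removing the font property can also be scripted. This is a hedged Python sketch, assuming the fonts appear as CSS `font`, `font-family`, or `font-size` declarations in a style block or inline `style` attribute; `strip_font_properties` is a made-up helper name. It deliberately leaves `font-style` and `font-weight` alone so italic and bold survive.

```python
import re

# Matches 'font:', 'font-family:' and 'font-size:' declarations up to the
# next ';', '}' or '"'. It does not match font-style or font-weight.
FONT_DECL = re.compile(r'\bfont(?:-family|-size)?\s*:[^;}"]*;?\s*', re.IGNORECASE)

def strip_font_properties(html: str) -> str:
    """Remove font, font-family and font-size declarations from CSS in the markup."""
    return FONT_DECL.sub("", html)

if __name__ == "__main__":
    css = "p.p1 {margin: 0.0px; font: 12.0px Helvetica; font-style: italic}"
    print(strip_font_properties(css))
```

Note that the pattern is blunt: it would also eat body text that happens to contain something like "font: ...", so check the result before pasting it into Sigil.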

Also you can make all the paragraph styles the same using the Copy Ruler / Paste Ruler commands found under Format->Text. This ought to decrease the number of styles in your book down to 1 or 2.
Old 02-02-2013, 06:57 AM   #13
exaltedwombat
Evangelist
exaltedwombat ought to be getting tired of karma fortunes by now.
 
Posts: 444
Karma: 1703930
Join Date: Nov 2011
Device: none
If you're writing a book from scratch, there's a lot to be said for starting off in Sigil. You'll get the ultimate "Clean code", just paragraphs and maybe a Header style for chapter titles. Which is just about all it's sensible to ask of an eBook, though you can play around complicating it if you like!

If you want a printed version, Ctrl-A in Page View of each chapter followed by Ctrl-C and then Ctrl-V into Word doesn't take long. Then you can add page layout as much as you wish, in a medium that will take some notice of it!

(Translate all that into Mac if necessary.)

Last edited by exaltedwombat; 02-02-2013 at 06:59 AM.
Old 02-04-2013, 09:15 AM   #14
Notjohn
Addict
Notjohn ought to be getting tired of karma fortunes by now.
 
Posts: 220
Karma: 246332
Join Date: Dec 2012
Device: Kindle
I don't suppose the developers would be willing to add the basic WordStar formatting commands to Sigil?

I am at this moment writing a book in WordStar "non-document" (text) mode with the extension *.htm. I'd be happy to cut out the middleman!
Old 02-04-2013, 09:39 AM   #15
theducks
Grand Sorcerer
theducks ought to be getting tired of karma fortunes by now.
Posts: 14,854
Karma: 5654321
Join Date: Aug 2009
Location: (The original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
WordStar. I have not heard that name in a long time.

There is a set of WS 5.5 floppies around here somewhere; problem is, they are 5.25".
All times are GMT -4.