Old 11-14-2013, 11:34 AM   #1
christopher88
Junior Member
Posts: 3
Karma: 10
Join Date: Nov 2013
Location: Bournemouth, UK
Device: Kindle, iPad
Bulk convert HTML characters for epub

Hi there,

I was wondering if anyone knows the best way to batch convert text from a doc file to HTML characters? E.g. for any instance of & to be converted to &amp;?

I’ve tried converting from a txt file through Calibre but I noticed that it didn’t take these into account.

Any help would be greatly appreciated!

Chris
Old 11-14-2013, 12:00 PM   #2
Doitsu
Wizard
Posts: 1,691
Karma: 4392001
Join Date: Dec 2010
Device: Kindle 3
You could create a new ePub with Sigil and simply copy and paste the text into the default Section0001.xhtml page in Book View mode. Sigil will automatically convert ampersands and other problematic characters to entities.

AFAIK, Calibre can also convert markdown text files to ePubs.
Old 11-14-2013, 12:10 PM   #3
theducks
Grand Sorcerer
Posts: 13,630
Karma: 5126946
Join Date: Aug 2009
Location: The (original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
IIRC, Writer2EPUB handles this when doing a DOC to EPUB save.
Old 11-14-2013, 02:11 PM   #4
Toxaris
Wizard
Posts: 2,749
Karma: 2117329
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-300, PRS-T1
You could also use my macro or add-in.
Old 11-15-2013, 05:18 AM   #5
christopher88
Junior Member
Thank you very much for your responses!

I’ve used the Sigil option, as I presume, Toxaris, that your plugin won’t work on Mac?

It does convert characters such as & to &amp;; however, it doesn’t seem to convert characters such as “ (&ldquo;), ’ (&rsquo;) and – (&ndash;).

Are these essential for text in ePubs or do they not need to be converted?

Thank you.
Old 11-15-2013, 05:34 AM   #6
Doitsu
Wizard
Quote:
Originally Posted by christopher88 View Post
It does convert characters such as & to &amp;; however, it doesn’t seem to convert characters such as “ (&ldquo;), ’ (&rsquo;) and – (&ndash;).

Are these essential for text in ePubs or do they not need to be converted?
AFAIK, only the five pre-defined XML entities (&, <, >, " and ') need to be converted; all other named HTML entities are pre-defined in the xhtml standard.
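
For anyone who would rather script this than paste text into an editor, here is a minimal Python sketch of that rule, using the standard library's `xml.sax.saxutils.escape`. It is only an illustration, not something any tool in this thread is known to use; the name `escape_xml_text` and the sample string are made up for the example. Curly quotes and dashes are deliberately left untouched.

```python
from xml.sax.saxutils import escape

# escape() already handles &, < and >; the two quote characters
# have to be passed in as extra mappings.
QUOTE_ENTITIES = {'"': "&quot;", "'": "&apos;"}


def escape_xml_text(text: str) -> str:
    """Replace the five reserved XML characters with their predefined entities."""
    return escape(text, QUOTE_ENTITIES)


# Hypothetical sample line; typographic quotes pass through unchanged.
sample = 'Fish & chips <"hot"> isn\'t “fancy”'
print(escape_xml_text(sample))
```

Running the same function over every line of a text file would give a batch conversion of the kind asked about in the first post.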
Old 11-15-2013, 06:27 AM   #7
christopher88
Junior Member
Quote:
Originally Posted by Doitsu View Post
AFAIK, only the five pre-defined XML entities (&, <, >, " and ') need to be converted; all other named HTML entities are pre-defined in the xhtml standard.
Excellent! Thank you very much for your help.
Old 11-15-2013, 07:50 AM   #8
Toxaris
Wizard
Quote:
Originally Posted by christopher88 View Post
Thank you very much for your responses!

I’ve used the Sigil option, as I presume, Toxaris, that your plugin won’t work on Mac?

It does convert characters such as & to &amp;; however, it doesn’t seem to convert characters such as “ (&ldquo;), ’ (&rsquo;) and – (&ndash;).

Are these essential for text in ePubs or do they not need to be converted?

Thank you.
The add-in does not work on Mac; the macro does.
Old 11-28-2013, 10:49 AM   #9
radius
Lector minore
Posts: 327
Karma: 128734
Join Date: Jan 2008
Device: Sony PRS-505, BlackBerry Playbook
Quote:
Originally Posted by Doitsu View Post
Sigil will automatically convert ampersands and other problematic characters to entities.
Hi Doitsu, by "problematic characters", I imagine you are talking about various quotes and brackets? I can see how they could be an issue in tags, but in what way are they and ampersands problematic when they are in the body of the text?

Thanks!
Old 11-28-2013, 11:13 AM   #10
Doitsu
Wizard
Quote:
Originally Posted by radius View Post
Hi Doitsu, by "problematic characters", I imagine you are talking about various quotes and brackets? I can see how they could be an issue in tags, but in what way are they and ampersands problematic when they are in the body of the text?
Since the five pre-defined XML entities are used to define character entities and tags, they need to be "escaped" if they're used as regular characters, i.e. they need to be written as entities. For example, single ampersands need to be inserted as &amp; in Code View mode. (If you enter an ampersand in Book View mode, Sigil will automatically convert it to an entity.)
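
To see why a bare ampersand is a problem even in body text, here is a small Python sketch that feeds fragments to the standard library's XML parser. It only illustrates the well-formedness rule; the helper name `is_well_formed` and the sample fragments are made up for the example, and this is not a description of what Sigil does internally.

```python
import xml.etree.ElementTree as ET


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML, False otherwise."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# A bare ampersand starts an entity reference, so the parser rejects it.
print(is_well_formed("<p>Fish & chips</p>"))
# The escaped form parses without complaint.
print(is_well_formed("<p>Fish &amp; chips</p>"))
# Apostrophes and curly quotes in body text are accepted as-is.
print(is_well_formed("<p>It's “fine”</p>"))
```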
Old 11-28-2013, 02:14 PM   #11
theducks
Grand Sorcerer
Quote:
Originally Posted by radius View Post
Hi Doitsu, by "problematic characters", I imagine you are talking about various quotes and brackets? I can see how they could be an issue in tags, but in what way are they and ampersands problematic when they are in the body of the text?

Thanks!
If using Sigil, use the Omega sign button tool to insert special characters; it includes many that are not on the keyboard.

Last edited by theducks; 11-28-2013 at 02:15 PM. Reason: Sigil note
Old 11-29-2013, 11:00 AM   #12
radius
Lector minore
Oh OK. It's obvious why angle brackets need to be escaped, but I have apostrophes, quotes and maybe even ampersands all over the place in HTML and never realized they might cause a problem. Thanks.
Old 11-29-2013, 11:13 AM   #13
Jellby
frumious Bandersnatch
Posts: 5,804
Karma: 4027751
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon
Quotes and apostrophes, I believe, only have to be escaped when they are used in some attribute value, as in <h1 title="How to be &quot;smart&quot;">; otherwise they are fine in HTML. Ampersands must always be escaped.
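
That distinction can be sketched in Python with the standard library's `html.escape`, which takes a `quote` flag. The helper names `escape_body` and `escape_attr` and the sample strings are invented for this example.

```python
from html import escape


def escape_body(text: str) -> str:
    """Escape &, < and > only; quotes are left alone in body text."""
    return escape(text, quote=False)


def escape_attr(value: str) -> str:
    """Escape &, <, > and both quote characters for use inside an attribute."""
    return escape(value, quote=True)


title = 'How to be "smart"'
body = 'How to be "smart" & rich'

# The attribute value gets &quot;, while the body keeps its plain quotes.
print(f'<h1 title="{escape_attr(title)}">{escape_body(body)}</h1>')
```

Note that `html.escape` writes an apostrophe as the numeric reference `&#x27;` rather than `&apos;`.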
Old 12-02-2013, 12:23 PM   #14
Symmetria
Junior Member
Posts: 8
Karma: 10
Join Date: Dec 2013
Device: none
This online WYSIWYG HTML5-compliant editor also automatically converts special characters (like vowels with umlauts, etc.) into named references (HTML entities).
http://htmleditor.in/index.html

Paste your code into it while it's in source mode, switch to visual mode and back to source mode, and they will be converted.
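
A similar conversion can be sketched offline in Python with the standard library's `html.entities.codepoint2name` table. This is not how the linked editor works internally (that is unknown here); the function name `to_named_references` is made up, and the sketch assumes only non-ASCII characters should be replaced, leaving &, < and > for a separate escaping step.

```python
from html.entities import codepoint2name


def to_named_references(text: str) -> str:
    """Replace non-ASCII characters with named HTML references where one exists."""
    out = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if name is not None and ord(ch) > 127:
            out.append(f"&{name};")
        else:
            # ASCII characters, and characters without a named reference, stay as-is.
            out.append(ch)
    return "".join(out)


print(to_named_references("Müller said “hello” – twice"))
```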