#1 | |
Hmm.
Posts: 124
Karma: 2016606
Join Date: Oct 2015
Device: Android 4.2 Google Play Reader
|
How to allow extended ASCII German characters in my EPUB?
I'm writing a tool to make basic EPUB2 books from a configuration file and Markdown files (which are converted to XHTML). I didn't find anything else that suited my needs: I have large text files that I want to convert to EPUB and add basic formatting to.
Currently, the Firefox plugin EPUBReader, which I use for testing, seems to choke on characters above 127 (extended ASCII). How do I allow these characters, German ones in particular, in my EPUB? Do I have to change something in the XHTML header? I can have my software convert the extended ASCII characters to Unicode, but EPUBReader also seems to produce an error with any Unicode, except named entities. The error I'm getting from EPUBReader about the German characters is:
Code:
XML Parsing Error: not well-formed
Location: file:///C:/Users/XXX/AppData/Roaming/Mozilla/Firefox/Profiles/0ddipa6u.default/epub/54/OEBPS/Text/00intro.xhtml#H2_00intro_00003
Line Number 27, Column 4:
The header for each XHTML file is this:
Quote:
Last edited by crankypants; 11-10-2015 at 09:46 AM. Reason: added code tags to cope with very long line |
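Since the post notes that entities get through where raw characters do not, one possible stopgap is to emit pure-ASCII XHTML with numeric character references. The following is only a Python sketch of that idea (the thread's tool is written in Perl, and the helper name `to_ascii_refs` is made up for illustration). It does not escape `&` or `<`; that is a separate step.

```python
def to_ascii_refs(text: str) -> str:
    """Replace every non-ASCII character with a numeric character
    reference (for example U+00FC becomes &#252;), leaving ASCII as-is."""
    # 'xmlcharrefreplace' is a standard codec error handler for encoding.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


if __name__ == "__main__":
    print(to_ascii_refs("Grüße aus München"))
    # prints: Gr&#252;&#223;e aus M&#252;nchen
```

Numeric references are valid in any XML document regardless of the declared encoding, so this sidesteps the encoding question rather than solving it.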
|
#2 |
The Grand Mouse 高貴的老鼠
Posts: 73,584
Karma: 315126578
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Oasis
|
You must properly specify the character set used in the XHTML files. This is most easily done with Unicode text, IMO.
You could make one sample ePub using your converted Unicode text and something like Sigil, to ensure that you're creating your XHTML and the rest of the ePub with the proper declarations and metadata. |
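As a rough illustration of "declaring the character set properly", here is a Python sketch that writes an XHTML file whose XML declaration and actual byte encoding agree. The template is a generic XHTML 1.1 skeleton, not the poster's real header (which did not survive in the thread), and `write_xhtml` is a hypothetical helper name.

```python
from pathlib import Path
from xml.sax.saxutils import escape

# Illustrative skeleton only; the encoding named in the XML declaration
# must match the encoding the file is actually written in.
XHTML_TEMPLATE = """<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head><title>{title}</title></head>
<body><p>{body}</p></body>
</html>
"""


def write_xhtml(path, title, body):
    """Write a minimal XHTML page as UTF-8, escaping &, < and > in the text."""
    page = XHTML_TEMPLATE.format(title=escape(title), body=escape(body))
    Path(path).write_text(page, encoding="utf-8")
```

Calling `write_xhtml("00intro.xhtml", "Intro", "Grüße")` would store the umlaut and the sharp s as two bytes each, which is what a parser expects after reading `encoding="UTF-8"`.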
#3 |
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
|
You must use UTF-8 for that. All reading systems will expect that. The error you mention is probably caused by something else.
|
#4 | ||
Hmm.
Posts: 124
Karma: 2016606
Join Date: Oct 2015
Device: Android 4.2 Google Play Reader
|
Quote:
What else do I need to do? Modify some of the other files, like content.opf? This is the top of my content.opf file, which already specifies UTF-8. Quote:
Last edited by crankypants; 11-10-2015 at 09:50 AM. |
||
#5 |
The Grand Mouse 高貴的老鼠
Posts: 73,584
Karma: 315126578
Join Date: Jul 2007
Location: Norfolk, England
Device: Kindle Oasis
|
If you specify UTF-8, the text must actually be encoded as UTF-8, not in a high-ASCII (8-bit) German encoding such as ISO-8859-1 or Windows-1252.
|
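The mismatch described here can be reproduced in a few lines. This Python sketch (using the standard library's expat-based parser, not Firefox's, so the exact wording of the error may differ) encodes the same document once as UTF-8 and once as Latin-1 while the declaration says UTF-8 in both cases. The names `DOC` and `is_well_formed` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# The declaration claims UTF-8 in both cases; only the bytes differ.
DOC = '<?xml version="1.0" encoding="UTF-8"?>\n<p>Grüße</p>'


def is_well_formed(data: bytes) -> bool:
    """Return True if the bytes parse as XML, False on a parse error."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(is_well_formed(DOC.encode("utf-8")))    # True
    print(is_well_formed(DOC.encode("latin-1")))  # False
```

In Latin-1 the letter ü is the single byte 0xFC, which is not a valid UTF-8 sequence, so a parser that trusts the declaration rejects the file as not well-formed.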
#6 |
Hmm.
Posts: 124
Karma: 2016606
Join Date: Oct 2015
Device: Android 4.2 Google Play Reader
|
I believe Perl 5.18 has a mode for writing UTF-8. I'll look into it. Thanks.
|
#7 | |
Hmm.
Posts: 124
Karma: 2016606
Join Date: Oct 2015
Device: Android 4.2 Google Play Reader
|
It worked! Thank you. Here's the Perl open statement I used:
Quote:
|
|
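The quoted Perl statement did not survive in this copy of the thread, so it is not reconstructed here. For readers who want the general idea in another language, the following is a Python sketch that reads a file in a legacy 8-bit encoding and writes it back out as UTF-8. The function name `transcode_to_utf8` and the default source encoding `cp1252` (a guess based on the Windows path in the first post) are assumptions, not details from the thread.

```python
def transcode_to_utf8(src, dst, source_encoding="cp1252"):
    """Copy src to dst, decoding with source_encoding and encoding as UTF-8.

    newline="" keeps the original line endings untouched.
    """
    with open(src, "r", encoding=source_encoding, newline="") as fin, \
         open(dst, "w", encoding="utf-8", newline="") as fout:
        for line in fin:
            fout.write(line)
```

The key point is the same as in the thread: the encoding is set explicitly when the output file is opened, so every character written through that handle ends up as UTF-8 bytes.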
#8 |
Zealot
Posts: 142
Karma: 669192
Join Date: Nov 2013
Device: Kindle 4.1.1 no touch
|
You could simply have used any text editor that lets you change a text's encoding, like (I guess) Notepad++ or (I know) jEdit.
|
#9 |
Guru
Posts: 631
Karma: 7544528
Join Date: Apr 2013
Location: Berlin
Device: PRS 350, Kobo Aura
|
Not to discourage you from writing your own script, but maybe you want to look at calibre or pandoc to do what you want. With pandoc you can make whatever you want out of Markdown: EPUB, HTML, DOCX, LaTeX, excellent PDFs via LaTeX (without needing to know anything about LaTeX), etc.
|
Tags |
ascii, epub, german |
|