#1 |
Connoisseur
Posts: 70
Karma: 12
Join Date: Apr 2010
Location: Pittsburgh area
Device: prs-505,900,T2
|
txt/epub - Guillemets conversion problem
Whenever I have text with guillemets ( » « ) and convert to epub I am getting small t & capital T with caron [ ť Ť ] instead. Is there a setting I am missing?
|
#2 | |
Well trained by Cats
Posts: 30,990
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
In the Look & Feel options (on the conversion screen), make sure the code page (input character encoding) is set to one that includes them. Usually it is left blank and Calibre guesses; this time it guessed wrong.
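The symptom in the first post is what you get when Latin-1 bytes are read as ISO-8859-2 (Latin-2). That Calibre guessed exactly that encoding is an assumption and is not confirmed anywhere in this thread; the sketch below only shows that the byte values line up.

```python
# Guillemets as they would be stored in a Latin-1 / Windows-1252 ("ANSI") file.
raw = "»«".encode("latin-1")  # two bytes: 0xBB 0xAB

# Assumption: the auto-detection picked ISO-8859-2 instead of Latin-1.
wrong = raw.decode("iso-8859-2")
right = raw.decode("latin-1")

print(raw.hex(), repr(wrong), repr(right))
```

The same two bytes are valid in both encodings, which is why a guesser has no hard evidence either way when the rest of the file is plain ASCII.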
|
#3 |
Connoisseur
Posts: 70
Karma: 12
Join Date: Apr 2010
Location: Pittsburgh area
Device: prs-505,900,T2
|
|
#4 |
Well trained by Cats
Posts: 30,990
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
|
#5 |
Grand Sorcerer
Posts: 6,251
Karma: 16539642
Join Date: Sep 2009
Location: UK
Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3
|
Try to make sure your TXT file is saved in utf-8 format. Notepad++ (free) should be able to do that.
If Calibre doesn't guess correctly when you convert txt-to-epub then make sure you set Convert - Look&Feel - Input character encoding to utf-8 so Calibre won't have to guess. |
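If you would rather script the re-save than do it by hand in an editor, a minimal sketch follows. The function name `transcode_to_utf8` and the default of `cp1252` (the usual "ANSI" code page on Western European Windows) are assumptions for illustration; use whatever encoding your file is really in.

```python
from pathlib import Path


def transcode_to_utf8(src, dst, source_encoding="cp1252"):
    """Read src using source_encoding and write the same text to dst as UTF-8."""
    # source_encoding is an assumption about the file; a wrong value here
    # silently produces wrong characters rather than an error in most cases.
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")
    return text
```

For example, `transcode_to_utf8("book.txt", "book-utf8.txt")` would leave the original untouched and write a UTF-8 copy next to it (both file names are placeholders).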
#6 |
Connoisseur
Posts: 70
Karma: 12
Join Date: Apr 2010
Location: Pittsburgh area
Device: prs-505,900,T2
|
I will have to do a better job of keeping track of what I am actually doing. There are several steps involved before I even get to the Calibre conversion.
Actually, I would much rather keep it Latin-1/ANSI. What I don't understand is why I have to specify anything at all, since the guillemets are part of ANSI/Latin-1. All the other characters come through fine.
#7 |
Grand Sorcerer
Posts: 6,251
Karma: 16539642
Join Date: Sep 2009
Location: UK
Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3
|
If you're sure that the TXT is in latin1 encoding then try setting Convert - Look&Feel - Input character encoding to latin1, again so Calibre won't have to guess. It will convert your source to utf-8 during the conversion to epub.
|
#8 | ||
Connoisseur
Posts: 70
Karma: 12
Join Date: Apr 2010
Location: Pittsburgh area
Device: prs-505,900,T2
|
Sorry it took so long to get back. Let me try to explain it differently:
If I make a short test txt file (ANSI encoding using Notepad) and copy the following text into it: Quote:
Quote:
Do you get the same? |
||
#9 |
Well trained by Cats
Posts: 30,990
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Try the HTML entities &laquo; and &raquo; in place of the literal « and » characters.
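For anyone who wants to try the entity route without hand-editing, here is a rough sketch using only the Python standard library. The helper name `to_named_entities` is made up for this example and is not part of Calibre.

```python
import html
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace non-ASCII characters that have an HTML 4 name with &name; references."""
    out = []
    for ch in text:
        # Only touch non-ASCII characters; leave quotes, <, > and & alone.
        name = codepoint2name.get(ord(ch)) if ord(ch) > 127 else None
        out.append(f"&{name};" if name else ch)
    return "".join(out)


encoded = to_named_entities("«Bonjour»")
decoded = html.unescape(encoded)  # turns the entities back into characters
print(encoded, decoded)
```

The result is pure ASCII, so it survives any encoding guess, at the cost of a source file that is harder to read.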
|
#10 | |
Connoisseur
Posts: 70
Karma: 12
Join Date: Apr 2010
Location: Pittsburgh area
Device: prs-505,900,T2
|
I need to maintain the txt file and I consider it worse having:
Quote:
I do not see a way to edit ePub within Calibre, and if I understand you right I should make an html file first and from that the ePub. That is what I have been doing all along, not using Calibre though, but I thought Calibre could go direct from a txt file to ePub. Am I wrong? |
|
#11 | |
Well trained by Cats
Posts: 30,990
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
But there are exceptions (especially when you step outside ASCII). You may have found one.
|
#12 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
I'm not sure why you prefer ANSI. ANSI means the file will only display correctly on your computer and on computers whose Windows regional settings match yours. UTF-8 will work on any computer on the planet.
Sometimes I want to kill/throttle whatever idiots at Microsoft insist on never-ending support for, and propagation of, their asinine default ANSI settings. It's 2011, and those idiots are the number one reason this question has come up over and over again for well over a decade now.
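To make the point about regional settings concrete, here is a small sketch. The two code pages are just examples of what "ANSI" can mean on differently configured Windows machines.

```python
byte = b"\xe9"  # a single non-ASCII byte taken from an "ANSI" text file

as_western = byte.decode("cp1252")   # "ANSI" on a Western European Windows
as_cyrillic = byte.decode("cp1251")  # "ANSI" on a Cyrillic Windows

# UTF-8 has no such ambiguity: the same bytes always mean the same character.
utf8_bytes = "é".encode("utf-8")

print(repr(as_western), repr(as_cyrillic), utf8_bytes)
```

The file itself carries no record of which code page wrote it, so the reader's machine simply applies its own.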
#13 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
btw, if you save as ANSI the encoding in Calibre will definitely not be UTF-8. And I'm guessing Latin 1 may be iso-8859-1 - the problem with using ANSI is unless you're familiar with what ANSI is for your local country you need to play this guessing game. You also need to know what Python's name for the encoding is (and since we don't know what encoding you're using it's difficult for us to tell you.)
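If you are unsure what Python calls a given encoding, you can ask the `codecs` module. This is a sketch: the labels in the loop are only examples, and `python_codec_name` is a hypothetical helper name.

```python
import codecs
import locale


def python_codec_name(label):
    """Return Python's canonical codec name for an encoding label, or None if unknown."""
    try:
        return codecs.lookup(label).name
    except LookupError:
        return None


# "ansi" is included on purpose: whether it resolves at all depends on the platform.
for label in ("latin1", "iso-8859-1", "cp1252", "windows-1252", "utf8", "ansi"):
    print(label, "->", python_codec_name(label))

# The encoding this machine uses by default for text files.
print("locale default:", locale.getpreferredencoding(False))
```

Labels that print the same canonical name are aliases for the same codec, so either spelling should be usable wherever a Python encoding name is expected.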
|
#14 |
creator of calibre
Posts: 45,289
Karma: 27111240
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
Generally speaking, Windows' determination not to break backward compatibility is to be admired, but in this instance I agree with ldolse: the crazy mix of encodings that Windows systems typically operate under causes endless headaches.
|
#15 |
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
|
And in this day and age I can't actually imagine that 'anything' would break... It's not like ANSI support would go away, they'd just change their save dialogs to default to UTF8 and all newly saved text files going forward would suddenly have a BOM and work on every PC on the planet - files without a BOM will still open as ANSI....
The occasional Windows 95 era program that actually didn't understand a UTF-8 BOM would still render 98% of the text correctly for a western language.. I imagine in Asia there may be some workflows that assume gb2312/shift-JIS/big5 etc as the default, but you could even keep a registry setting around for people who insisted on staying in the dark ages. |
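The "BOM means UTF-8, no BOM means ANSI" behaviour described here can be sketched in a few lines. The fallback of `cp1252` is an assumption standing in for whatever the local ANSI code page is, and `decode_text` is a hypothetical name.

```python
import codecs


def decode_text(raw, fallback="cp1252"):
    """Decode as UTF-8 when a BOM is present, otherwise use the fallback code page."""
    if raw.startswith(codecs.BOM_UTF8):
        # utf-8-sig strips the three BOM bytes while decoding.
        return raw.decode("utf-8-sig")
    return raw.decode(fallback)


with_bom = codecs.BOM_UTF8 + "«ok»".encode("utf-8")
legacy = "«ok»".encode("cp1252")

print(decode_text(with_bom), decode_text(legacy))

# What a BOM-unaware program would show: stray characters around intact ASCII.
print(with_bom.decode("cp1252"))
```

The last line shows the failure mode mentioned above: the ASCII part of the text survives, and only the BOM and the non-ASCII characters turn into junk.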