MobileRead Forums > E-Book Software > Calibre > Conversion

10-16-2011, 01:04 PM   #1
julo
Connoisseur
Posts: 70
Karma: 12
Join Date: Apr 2010
Location: Pittsburgh area
Device: prs-505,900,T2
txt/epub - Guillemets conversion problem

Whenever I have text with guillemets ( » « ) and convert to epub I am getting small t & capital T with caron [ ť Ť ] instead. Is there a setting I am missing?

10-16-2011, 01:10 PM   #2
theducks
Well trained by Cats
Posts: 30,990
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by julo View Post
Whenever I have text with guillemets ( » « ) and convert to epub I am getting small t & capital T with caron [ ť Ť ] instead. Is there a setting I am missing?
TXT files usually don't record which code page (character set) was used to write them.

In the Look & Feel options (on the convert screen), be sure the code page is set to one that includes the guillemets. Usually it is blank and Calibre guesses. This time it guessed wrong.
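
To illustrate why the code page matters, here is a minimal Python sketch. It is not calibre's code, and the list of code pages is only a sample picked for illustration. It decodes the two byte values that cp1252 and latin1 use for the guillemets under several code pages:

```python
# The single bytes that cp1252 (and latin1) use for the two guillemets.
RAQUO_BYTE = "»".encode("cp1252")  # b'\xbb'
LAQUO_BYTE = "«".encode("cp1252")  # b'\xab'

# Sample code pages, chosen for illustration only.
CODE_PAGES = ["cp1252", "latin1", "cp1250", "iso-8859-2"]


def decode_under(raw: bytes, code_pages=CODE_PAGES) -> dict:
    """Return {code page: decoded text} for the same raw bytes."""
    return {cp: raw.decode(cp) for cp in code_pages}


if __name__ == "__main__":
    for cp, text in decode_under(RAQUO_BYTE + LAQUO_BYTE).items():
        print(f"{cp:>10}: {text}")
```

Under iso-8859-2 (Latin-2) the same two bytes decode to ť and Ť, which matches the symptom described in the first post.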

10-16-2011, 03:03 PM   #3
julo
Quote:
Originally Posted by theducks View Post
...Usually it is blank and Calibre guesses. This time it got it wrong
Thanks for the answer. Is there any way to find out what Calibre guessed?

10-16-2011, 03:57 PM   #4
theducks
Quote:
Originally Posted by julo View Post
Thanks for the answer. Is there any way to find out what Calibre guessed?
Only if you have the debug log from the conversion (debug mode must be turned on beforehand). Even then you would have to wade through it and know what you are looking for.

10-17-2011, 12:41 PM   #5
jackie_w
Grand Sorcerer
Posts: 6,251
Karma: 16539642
Join Date: Sep 2009
Location: UK
Device: ClaraHD, Forma, Libra2, Clara2E, LibraCol, PBTouchHD3
Try to make sure your TXT file is saved in utf-8 format. Notepad++ (free) should be able to do that.

If Calibre doesn't guess correctly when you convert txt-to-epub then make sure you set Convert - Look&Feel - Input character encoding to utf-8 so Calibre won't have to guess.
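
If you would rather script the re-save than use an editor, a minimal Python sketch could look like this. It assumes the source file really is cp1252 (the usual "ANSI" on Western-European Windows); the function names and file names are made up for illustration:

```python
from pathlib import Path


def to_utf8_bytes(raw: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode raw bytes from source_encoding and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def resave_as_utf8(src: Path, dst: Path, source_encoding: str = "cp1252") -> None:
    """Write a UTF-8 copy of src to dst; line endings are left untouched."""
    dst.write_bytes(to_utf8_bytes(src.read_bytes(), source_encoding))


# Hypothetical usage:
# resave_as_utf8(Path("book.txt"), Path("book-utf8.txt"))
```

Working on bytes rather than text mode keeps the original line endings, so only the character encoding changes.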

10-17-2011, 09:01 PM   #6
julo
I will have to do a better job of keeping track of what I am actually doing. There are several steps involved before I even get to the Calibre conversion.
Quote:
Originally Posted by jackie_w View Post
Try to make sure your TXT file is saved in utf-8 format....
Actually I would much rather keep it Latin1/ANSI. But what I don't understand is why I have to specify anything at all, since the guillemets are part of ANSI/Latin1. All the other characters come through fine.

10-17-2011, 09:27 PM   #7
jackie_w
Quote:
Originally Posted by julo View Post
Actually I would much rather keep it Latin1/ANSI. But what I don't understand is why I have to specify anything at all, since the guillemets are part of ANSI/Latin1. All the other characters come through fine.
If you're sure that the TXT is in latin1 encoding then try setting Convert - Look&Feel - Input character encoding to latin1, again so Calibre won't have to guess. It will convert your source to utf-8 during the conversion to epub.
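
The same setting is also available from the command line: calibre's ebook-convert tool accepts an --input-encoding option. A small Python sketch that only builds the command (it assumes calibre's command-line tools are on your PATH; the helper name and file names are hypothetical):

```python
import shlex


def build_convert_command(src: str, dst: str, input_encoding: str = "latin1") -> list:
    """Build an ebook-convert command that sets the input encoding explicitly."""
    # --input-encoding overrides any guessing of the source encoding.
    return ["ebook-convert", src, dst, "--input-encoding", input_encoding]


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(...) once the file names are real.
    print(shlex.join(build_convert_command("test.txt", "test.epub")))
```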

11-07-2011, 05:25 PM   #8
julo
Sorry it took so long to get back. Let me try to explain it differently:

If I make a short test txt file (ANSI encoding using Notepad) and copy the following text into it:
Quote:
This is a test text using »guillemets«.
I get the following output in ePub:
Quote:
This is a test text using ťguillemetsŤ.
I tried convert settings of blank, ascii, cp1252, latin1 and utf-8.

Do you get the same?
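
For anyone who wants to narrow down which decoding produces this output, here is a rough Python sketch. It assumes Notepad's "ANSI" wrote the file as cp1252, and the candidate list is an arbitrary sample, not calibre's detection logic:

```python
SOURCE_TEXT = "This is a test text using »guillemets«."
OBSERVED = "This is a test text using ťguillemetsŤ."

# Candidate decoders to try (illustrative sample only).
CANDIDATES = ["ascii", "cp1252", "latin1", "utf-8", "cp1250", "iso-8859-2"]


def encodings_matching(raw: bytes, observed: str, candidates=CANDIDATES) -> list:
    """Return the candidate encodings that decode raw into exactly observed."""
    matches = []
    for name in candidates:
        try:
            if raw.decode(name) == observed:
                matches.append(name)
        except UnicodeDecodeError:
            pass  # this encoding cannot decode the bytes at all
    return matches


# Assumption: the test file was saved as cp1252.
raw = SOURCE_TEXT.encode("cp1252")
print(encodings_matching(raw, OBSERVED))
```

Of the candidates tried here, only iso-8859-2 reproduces the output, which suggests (but does not prove) that the bytes were read as a Latin-2 code page somewhere along the way.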

11-07-2011, 06:48 PM   #9
theducks
Try HTML entities &laquo; and &raquo;

11-07-2011, 08:08 PM   #10
julo
Quote:
Originally Posted by theducks View Post
Try HTML entities &laquo; and &raquo;
I need to maintain the txt file and I consider it worse having:
Quote:
This is a test text using &raquo;guillemets&laquo;.
which is also how it will then look in the ePub file. Now you really can't read it.

I do not see a way to edit ePub within Calibre, and if I understand you right I should make an html file first and from that the ePub. That is what I have been doing all along, not using Calibre though, but I thought Calibre could go direct from a txt file to ePub.

Am I wrong?

11-07-2011, 08:50 PM   #11
theducks
Quote:
Originally Posted by julo View Post
I need to maintain the txt file and I consider it worse having: [...] which is also how it will then look in the ePub file. Now you really can't read it.

I do not see a way to edit ePub within Calibre, and if I understand you right I should make an html file first and from that the ePub. That is what I have been doing all along, not using Calibre though, but I thought Calibre could go direct from a txt file to ePub.

Am I wrong?
No, you are not wrong.
But there are exceptions (especially when you step outside ASCII). You may have found one.

11-07-2011, 09:19 PM   #12
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
I'm not sure why you have a preference for ANSI. ANSI means the file will only display correctly on your computer and on computers whose Windows regional settings match yours. UTF-8 will guarantee it works on any computer on the planet.

Sometimes I want to kill/throttle whatever idiots at Microsoft insist on never-ending support for and propagation of their asinine default ANSI settings. It's 2011, and those idiots are the number one reason this question has come up over and over again for well over a decade now.

11-07-2011, 09:24 PM   #13
ldolse
btw, if you save as ANSI, the encoding to set in Calibre will definitely not be UTF-8. And I'm guessing Latin 1 may be iso-8859-1. The problem with using ANSI is that unless you're familiar with what ANSI means for your locale, you need to play this guessing game. You also need to know what Python's name for the encoding is (and since we don't know what encoding you're using, it's difficult for us to tell you).
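
On the question of Python's names: the standard library can tell you how it resolves an encoding label, and what the default encoding is on the machine you run it on. A minimal sketch (the labels are just examples, and the helper name is made up):

```python
import codecs
import locale


def canonical_codec_name(label: str) -> str:
    """Return Python's canonical name for an encoding label (raises LookupError if unknown)."""
    return codecs.lookup(label).name


if __name__ == "__main__":
    for label in ["latin1", "iso-8859-1", "cp1252", "windows-1252", "utf8"]:
        print(label, "->", canonical_codec_name(label))

    # The locale's preferred encoding on this machine; on Windows this is
    # typically the "ANSI" code page. The result differs between systems.
    print("locale default:", locale.getpreferredencoding(False))
```

In Python, latin1 and iso-8859-1 resolve to the same codec, while cp1252 (windows-1252) is a separate one. Both encode the guillemets as the same single bytes.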

11-07-2011, 09:27 PM   #14
kovidgoyal
creator of calibre
Posts: 45,289
Karma: 27111240
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Generally speaking, Windows' determination not to break backward compatibility is to be admired, but in this instance I agree with ldolse: the crazy mix of encodings that Windows systems typically operate under causes endless headaches.

11-07-2011, 09:45 PM   #15
ldolse
And in this day and age I can't actually imagine that 'anything' would break... It's not like ANSI support would go away, they'd just change their save dialogs to default to UTF8 and all newly saved text files going forward would suddenly have a BOM and work on every PC on the planet - files without a BOM will still open as ANSI....

The occasional Windows 95 era program that actually didn't understand a UTF-8 BOM would still render 98% of the text correctly for a western language. I imagine in Asia there may be some workflows that assume gb2312/shift-JIS/big5 etc. as the default, but you could even keep a registry setting around for people who insisted on staying in the dark ages.
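
For reference, a small Python sketch of what the BOM is at the byte level, and what a BOM-unaware program that assumes cp1252 would make of such a file. The sample text and helper name are only for illustration:

```python
import codecs


def has_utf8_bom(raw: bytes) -> bool:
    """True if the data starts with the three-byte UTF-8 byte order mark."""
    return raw.startswith(codecs.BOM_UTF8)


text = "»guillemets«"
with_bom = text.encode("utf-8-sig")  # the utf-8-sig codec prepends the BOM
without_bom = text.encode("utf-8")

print(has_utf8_bom(with_bom), has_utf8_bom(without_bom))

# A reader that knows about the BOM strips it and recovers the text.
print(with_bom.decode("utf-8-sig"))

# A BOM-unaware reader assuming cp1252 shows the BOM and every non-ASCII
# character as stray symbols, but leaves the ASCII letters intact.
print(with_bom.decode("cp1252"))
```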

All times are GMT -4.