| 
			
			 | 
		#1 | 
| 
			
			
			
			 Perfectionist 
			
			Posts: 72 
				Karma: 12802 
				Join Date: Apr 2014 
				
				
				
				Device: none 
				
				
				 | 
	
	
	
		
		
			
			 
				
				Character Encoding Question
			 
			
			
			@kovid 
		
	
		
		
		
		
		
		
		
		
		
		
	
	I have a book encoded in ISO 8859-1. When I open it in Sigil, it shows garbled characters all over, but it looks just fine in both the Calibre viewer and the editor. 1) Is Calibre converting it to UTF-8 on the fly? 2) Would it be possible to have Check Book warn about any encoding other than UTF-8 present in the book?  | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#2 | 
| 
			
			
			
			 creator of calibre 
			
			Posts: 45,609 
				Karma: 28549044 
				Join Date: Oct 2006 
				Location: Mumbai, India 
				
				
				Device: Various 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			calibre detects encodings declared in the HTML. I have no idea what Sigil does; from your description, I'd guess it always assumes UTF-8.  
		
	
		
		
		
		
		
		
		
		
		
		
	
	As for checking encodings, IIRC the editor autoconverts to UTF-8 whenever it processes any HTML, so I don't think checking will be possible or even necessary.  | 
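To illustrate the "detect the declared encoding, then convert to UTF-8" idea described above, here is a minimal sketch. It is not calibre's actual code: the function names `declared_encoding` and `to_utf8`, the 1024-byte look-ahead, and the regular expression are all assumptions made for the example, and it ignores byte-order marks and undeclared encodings.

```python
import re

# "charset=..." in a <meta> tag or encoding="..." in an XML declaration.
_DECL = r'''((?:charset|encoding)\s*=\s*["']?)([A-Za-z0-9_.:-]+)'''
_DECL_BYTES = re.compile(_DECL.encode("ascii"), re.IGNORECASE)
_DECL_TEXT = re.compile(_DECL, re.IGNORECASE)
HEAD = 1024  # assumption: declarations appear near the top of the file


def declared_encoding(raw: bytes, default: str = "utf-8") -> str:
    """Return the first encoding named near the top of the file, else the default."""
    match = _DECL_BYTES.search(raw[:HEAD])
    return match.group(2).decode("ascii").lower() if match else default


def to_utf8(raw: bytes) -> bytes:
    """Decode with the declared encoding, update the declaration, re-encode as UTF-8."""
    text = raw.decode(declared_encoding(raw))
    head = _DECL_TEXT.sub(lambda m: m.group(1) + "utf-8", text[:HEAD])
    return (head + text[HEAD:]).encode("utf-8")
```

A reader that skips the first step and always decodes as UTF-8 would misread the non-ASCII bytes of an ISO 8859-1 file, which matches the guess about Sigil above.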
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#3 | |
| 
			
			
			
			 Perfectionist 
			
				
				
				 | 
	
	
	
		
		
		
		
	
 <meta content="application/xhtml+xml; charset=iso-8859-1" http-equiv="content-type"/> and I think e-book readers that do not autoconvert may display the text with strange characters (â, Ã, Â, ¦, etc. instead of “, ”, etc.), as Sigil does. I know I can use the Modify ePub add-on to re-encode in UTF-8; I would just like to know when I need to. As it is, I must open the book in Sigil to find out whether the encoding is off. I guess my question is: would it be possible for Check Book to report "There is a non-UTF-8 encoding declared in xyz.html"? That would be really helpful.  | 
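Until a check like that exists, the request above can be approximated outside calibre. The following is only a sketch under stated assumptions: it treats the EPUB as a plain zip archive, looks only at files ending in `.html`, `.htm` or `.xhtml`, and inspects the first 1024 bytes of each. The function name `non_utf8_declarations` is made up for the example and is not part of calibre or Check Book.

```python
import io
import re
import zipfile

# "charset=..." in a <meta> tag or encoding="..." in an XML declaration.
_DECL = re.compile(rb'''(?:charset|encoding)\s*=\s*["']?([A-Za-z0-9_.:-]+)''', re.IGNORECASE)
_HTML_EXT = (".html", ".htm", ".xhtml")


def non_utf8_declarations(epub):
    """Return (file name, declared encoding) for HTML files declaring a non-UTF-8 encoding.

    `epub` may be a path or a binary file-like object.
    """
    found = []
    with zipfile.ZipFile(epub) as zf:
        for name in zf.namelist():
            if not name.lower().endswith(_HTML_EXT):
                continue
            head = zf.read(name)[:1024]
            for match in _DECL.finditer(head):
                enc = match.group(1).decode("ascii").lower()
                if enc not in ("utf-8", "utf8"):
                    found.append((name, enc))
                    break  # one report per file is enough
    return found


# Usage with a tiny in-memory archive standing in for a real book.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.html", '<meta content="application/xhtml+xml; charset=iso-8859-1" http-equiv="content-type"/>')
    zf.writestr("b.xhtml", '<?xml version="1.0" encoding="utf-8"?><html/>')
buf.seek(0)
for name, enc in non_utf8_declarations(buf):
    print(f"There is a non-UTF-8 encoding declared in {name}: {enc}")
```

This only reports what the files declare. It does not verify that the bytes really are in the declared encoding.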
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#4 | 
| 
			
			
			
			 creator of calibre 
			
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			The editor does not change anything on opening. Run any automated tool, such as Fix HTML, to make it happen.
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#5 | |
| 
			
			
			
			 Perfectionist 
			
				
				
				 | 
	
	
	
		
		
		
		
	
 I'd really like to get to the bottom of this... P.S. I can send you the book via PM, if it helps.  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#6 | 
| 
			
			
			
			 creator of calibre 
			
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			The editor detects the encoding; it simply does not make any *changes* to the file until you run a tool.
		 
		
	
		
		
		
		
		
		
		
		
		
		
	
	 | 
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#7 | 
| 
			
			
			
			 creator of calibre 
			
				
				
				 | 
	
	|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#8 | |
| 
			
			
			
			 Perfectionist 
			
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	 | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#9 | 
| 
			
			
			
			 Perfectionist 
			
				
				
				 | 
	
	
	
		
		
		
		
		 
			
			This may be irrelevant, but I thought I should report it anyway. 
		
	
		
		
		
		
		
		
		
		
		
		
		
			I installed Calibre 1.42 and ran Check Book on a title with iso-8859-1 encoding. I got the message "This file has its encoding declared as %s". If you feel like it, perhaps you could improve the error report so that it names the actual encoding. Just a thought... Also, it would be helpful to have an option to change the encoding throughout the book, not just in the particular HTML file. Last edited by mikapanja; 06-27-2014 at 07:30 AM.  | 
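A literal `%s` in a message usually means a format template was shown without its value being substituted. The sketch below shows that general pattern only. It is a guess at the kind of mistake involved, not calibre's source, and the names `TEMPLATE`, `broken_message` and `fixed_message` are hypothetical.

```python
TEMPLATE = "This file has its encoding declared as %s"


def broken_message(encoding: str) -> str:
    # The placeholder is never filled in, so the user sees a literal "%s".
    return TEMPLATE


def fixed_message(encoding: str) -> str:
    # Substituting the value produces the message the user expected.
    return TEMPLATE % encoding


print(broken_message("iso-8859-1"))
print(fixed_message("iso-8859-1"))
```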
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#10 | |
| 
			
			
			
			 Ex-Helpdesk Junkie 
			
			Posts: 19,421 
				Karma: 85400180 
				Join Date: Nov 2012 
				Location: The Beaten Path, USA, Roundworld, This Side of Infinity 
				
				
				Device: Kindle Touch fw5.3.7 (Wifi only) 
				
				
				 | 
	
	
	
		
		
		
		
	
 Anyway, it got fixed: https://github.com/kovidgoyal/calibr...978c08888b9022  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#11 | |
| 
			
			
			
			 Perfectionist 
			
				
				
				 | 
	
	
	
		
		
		
		
		 Quote: 
	
  | 
|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#12 | 
| 
			
			
			
			 Ex-Helpdesk Junkie 
			
				
				
				 | 
	
	|
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 