Old 06-26-2010, 11:02 AM   #1
Ivo
Member
Ivo began at the beginning.
 
Posts: 22
Karma: 10
Join Date: Jun 2010
Location: Toronto, Ontario
Device: kobo
Character conversion: "—" --> "—"

Hi,
I didn't read the whole Sigil forum, so I don't know if this problem was brought up before. It has to do with how certain characters are converted into a series of ugly ASCII characters. For example, if the original file had the following text:


this was—before

after editing the file with Sigil, it becomes:

this was—before.

The character conversion is permanent: if you save the file (after you have made some changes), the series of ASCII characters "—" appears everywhere that "—" is supposed to be.

The problematic "—" is not a regular dash, but it would be nice to keep it as is.

There are probably other characters that cause a similar problem, and I believe the problem is related to how the characters are encoded.
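
For anyone who wants to see what this kind of encoding mismatch looks like, here is a minimal Python sketch. It assumes the character is the em dash (U+2014) stored as UTF-8 and that the bytes are then wrongly decoded as Windows-1252; the thread confirms neither detail, and the `repair` helper is hypothetical.

```python
# Assumption: the dash is U+2014 stored as UTF-8, and some tool
# decodes those bytes as Windows-1252 (cp1252) instead.
em_dash = "\u2014"
raw = em_dash.encode("utf-8")      # three bytes: E2 80 94
garbled = raw.decode("cp1252")     # three unrelated characters


def repair(text: str) -> str:
    """Undo the wrong decode, assuming cp1252 was the mistaken codec."""
    return text.encode("cp1252").decode("utf-8")


sample = "this was\u2014before"
broken = sample.encode("utf-8").decode("cp1252")
fixed = repair(broken)
```

One character becomes three here because UTF-8 stores U+2014 as three bytes and cp1252 maps every byte to its own character.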

[Edit]
I did a little searching and found that the character in question is a Unicode dash, U+2014 (em dash). My guess is that Sigil doesn't handle Unicode, but I'm sure this particular dash will be found in many books.

Last edited by Ivo; 06-26-2010 at 11:10 AM.
Old 06-26-2010, 11:12 AM   #2
Valloric
Created Sigil, FlightCrew
Valloric ought to be getting tired of karma fortunes by now.
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
Is this an HTML file you're importing or a TXT file? If it's an HTML file, then it's probably specifying the wrong encoding, or none at all.

If it's a TXT file, Sigil recognizes all UTF variants based on BOM presence, and if none is present, falls back to UTF-8. So if you're using a TXT file with a non-Unicode encoding, convert it to UTF-8/16 first.
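
A rough Python sketch of the detection order described above (BOM first, UTF-8 as the fallback). This is not Sigil's actual code; the `decode_text` name and the BOM table are assumptions made for illustration.

```python
import codecs

# Longest BOMs first, so UTF-32 LE (FF FE 00 00) is not mistaken
# for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_text(data: bytes) -> str:
    """Decode by BOM if one is present, otherwise assume UTF-8."""
    for bom, encoding in _BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(encoding)
    return data.decode("utf-8")


text = "this was\u2014before"
from_utf16 = decode_text(codecs.BOM_UTF16_LE + text.encode("utf-16-le"))
from_plain_utf8 = decode_text(text.encode("utf-8"))
```

A file in a legacy 8-bit encoding has no BOM, so it reaches the UTF-8 fallback and either fails to decode or decodes wrongly. That is why converting it to UTF-8/16 first is the safe route.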
Old 06-26-2010, 11:16 AM   #3
Ivo
Member
Ivo began at the beginning.
 
Posts: 22
Karma: 10
Join Date: Jun 2010
Location: Toronto, Ontario
Device: kobo
The original file was HTML, and I already converted it to ePub using calibre (the intermediate step was to import the HTML into an RTF file; maybe it was not necessary). I wanted to make some minor changes and ran into this problem, so the only solution was to unzip the files and use vi.

[Edit] To restate the problem, if you have an epub file that has this character, and try to edit it, you lose all the unicode dashes.
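
Since an epub is a zip archive, a check like the one below can show whether the dashes are still present in the files themselves. This is only a sketch: the `count_em_dashes` helper is hypothetical, it assumes the content files are UTF-8, and the in-memory archive is a stand-in for a real book, not a valid epub.

```python
import io
import zipfile


def count_em_dashes(epub_bytes: bytes) -> dict:
    """Map each (X)HTML member to the number of U+2014 characters in it."""
    counts = {}
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        for name in archive.namelist():
            if name.lower().endswith((".xhtml", ".html", ".htm")):
                # Assumption: content documents are UTF-8 encoded.
                text = archive.read(name).decode("utf-8", errors="replace")
                counts[name] = text.count("\u2014")
    return counts


# Build a tiny stand-in archive in memory (not a valid epub).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("chapter1.xhtml", "<p>this was\u2014before</p>".encode("utf-8"))
    archive.writestr("style.css", "p { margin: 0 }")

result = count_em_dashes(buffer.getvalue())
```

Running the same count before and after an editing session would show whether the dashes were lost.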

Last edited by Ivo; 06-26-2010 at 11:18 AM.
Old 06-26-2010, 11:22 AM   #4
Valloric
Created Sigil, FlightCrew
Valloric ought to be getting tired of karma fortunes by now.
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
Hm, create a new issue on the tracker with the epub file. Calibre often incorrectly specifies two different encodings, but I've recently worked around that.

Which version of Sigil are you using?
Old 06-26-2010, 11:26 AM   #5
Ivo
Member
Ivo began at the beginning.
 
Posts: 22
Karma: 10
Join Date: Jun 2010
Location: Toronto, Ontario
Device: kobo
Valloric, I first saw this problem with 0.2.2, and it is still there with 0.2.3. I don't believe there is any problem with calibre. It is plain and simple: the epub file correctly displays the long dash when you open it, and then when you try to edit that file with Sigil, the problem happens. If you have trouble reproducing it, I can certainly create a dummy epub file that displays properly on any ebook reader and shows the problem when you open it with Sigil.
Old 06-26-2010, 11:38 AM   #6
Ivo
Member
Ivo began at the beginning.
 
Posts: 22
Karma: 10
Join Date: Jun 2010
Location: Toronto, Ontario
Device: kobo
It looks like you are right, Valloric. I used the latest calibre, 0.7.5, added a whole bunch of Unicode characters, and there was no problem when I used Sigil for editing. So the problem only occurs with files that were probably incorrectly encoded with calibre. Thanks.
Old 06-26-2010, 12:37 PM   #7
Valloric
Created Sigil, FlightCrew
Valloric ought to be getting tired of karma fortunes by now.
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
I'd still like to see the epub file you were having problems with if you are willing to provide it.
Old 06-26-2010, 01:37 PM   #8
Ivo
Member
Ivo began at the beginning.
 
Posts: 22
Karma: 10
Join Date: Jun 2010
Location: Toronto, Ontario
Device: kobo
I have gotten around this problem (there are many ways to solve it), but since I started this thread, I've sent you the file. No matter how I try, I cannot reproduce it with any other file.

There are two reasons why I started using Sigil. One is to fix the font problem with the Kobo device, and the second is that calibre doesn't build a proper TOC even when you set proper headers in the RTF file (at least it didn't work for me). So I edit the epub file with Sigil and set the Part/Chapter structure properly, which recreates the TOC. I don't know why chapters do not work well (for me) with calibre and RTF files; I tried Atlantis and it does an excellent job.

In the end, I think Sigil is going to be an excellent tool once it reaches a higher version. And I'm happy that I can use it under Linux. The only problem is that it is a bit slow.
Old 06-26-2010, 01:57 PM   #9
Valloric
Created Sigil, FlightCrew
Valloric ought to be getting tired of karma fortunes by now.
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
Quote:
Originally Posted by Ivo View Post
I have gotten around this problem (there are many ways to solve it), but since I started this thread, I've sent you the file. No matter how I try, I cannot reproduce it with any other file.
It was a very minor bug, took two minutes to fix. Thanks for the bug report.

In case you're curious, it was caused by the XML declaration using single quotes instead of double quotes. The regex failed to account for single quotes.
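
To illustrate the kind of gap described (this is not the actual Sigil regex, which the thread does not show), here is a Python sketch with two hypothetical patterns: one that only accepts double quotes around the encoding name, and one that accepts either quote style. XML allows both in the declaration.

```python
import re

# Hypothetical pattern with the described gap: double quotes only.
DOUBLE_ONLY = re.compile(r'<\?xml[^>]*encoding="(?P<enc>[^"]+)"')

# Accepts either quote character; the named backreference makes the
# closing quote match the opening one.
EITHER_QUOTE = re.compile(
    r'<\?xml[^>]*encoding\s*=\s*(?P<q>["\'])(?P<enc>[^"\']+)(?P=q)'
)


def declared_encoding(head, pattern=EITHER_QUOTE):
    """Return the encoding named in the XML declaration, or None."""
    match = pattern.search(head)
    return match.group("enc") if match else None


single = "<?xml version='1.0' encoding='utf-8'?>"
double = '<?xml version="1.0" encoding="utf-8"?>'

missed = declared_encoding(single, DOUBLE_ONLY)   # no match on single quotes
found = declared_encoding(single)
```

When the declaration is missed, the reader has to guess the encoding, and a wrong guess would produce the garbled characters reported at the top of the thread.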
Old 06-26-2010, 02:01 PM   #10
Ivo
Member
Ivo began at the beginning.
 
Posts: 22
Karma: 10
Join Date: Jun 2010
Location: Toronto, Ontario
Device: kobo
Thanks, I am impressed with the speed of your (re)action!
Old 06-26-2010, 08:32 PM   #11
charleski
Wizard
charleski ought to be getting tired of karma fortunes by now.
 
Posts: 1,196
Karma: 1281258
Join Date: Sep 2009
Device: PRS-505
Quote:
Originally Posted by Ivo View Post
In the end, I think Sigil is going to be an excellent tool once it reaches a higher version.
It's been an excellent tool for quite some time, since it's the only readily-available program to allow post-creation editing of epubs. Valloric and I may have had disagreements in the past, but Sigil occupies a unique, and very valuable, position in the epub ecosystem.
Old 06-26-2010, 09:15 PM   #12
Ivo
Member
Ivo began at the beginning.
 
Posts: 22
Karma: 10
Join Date: Jun 2010
Location: Toronto, Ontario
Device: kobo
Sorry, I take it back. It is an excellent tool! I wish it were a bit faster (e.g., if I open an archive with 7-zip, edit the file, and save it back while still in archive mode, which is possible, things go much faster).

There are a few minor things not worth reporting, and given that it comes for free I certainly like it a lot.
Old 06-27-2010, 07:55 AM   #13
Valloric
Created Sigil, FlightCrew
Valloric ought to be getting tired of karma fortunes by now.
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
Quote:
Originally Posted by charleski View Post
Valloric and I may have had disagreements in the past
We have? Frankly I don't remember. What did we disagree on?
Old 06-27-2010, 08:23 AM   #14
nyrath
Addict
nyrath reads XML... blindfolded
Posts: 281
Karma: 52007
Join Date: Jun 2010
Device: nook
Quote:
Originally Posted by charleski View Post
It's been an excellent tool for quite some time, since it's the only readily-available program to allow post-creation editing of epubs. Valloric and I may have had disagreements in the past, but Sigil occupies a unique, and very valuable, position in the epub ecosystem.
Seconded.
I had a regrettably large number of epubs with spelling mistakes and other editorial gaffes. Sigil not only made fixing these possible, it made it easy.

I also have fun re-inserting diagrams and illustrations present in the dead-tree version of the document but missing from the epub.

I really like Sigil, and it is not just because at my day job I'm a fan of TrollTech's Qt framework.
Old 06-27-2010, 10:32 PM   #15
FizzyWater
You kids get off my lawn!
FizzyWater ought to be getting tired of karma fortunes by now.
Posts: 4,220
Karma: 73492664
Join Date: Aug 2007
Location: Columbus, Ohio
Device: Oasis 2 and Libra H2O and half a dozen older models I can't let go of
Is this the same issue as seen in the two attachments (I wish I could remember how to copy these so they're always visible, but I never can)...

It always happened occasionally, but it seems like lately it's happening to almost every ebook I view.

These are existing ePubs that I add a cover and blurb to in Calibre.
Attached Thumbnails: Sigil.jpg, Calibre.jpg

All times are GMT -4.