02-16-2013, 02:19 PM   #1
travger
Evangelist
Posts: 480
Karma: 270594
Join Date: Aug 2010
Device: palm tx, Windows7, Galaxy A5
Just FYI, ä

This may be fixed in newer versions, but then again maybe it's just been well hidden.

I had a test epub that was made with an old, much simpler version (0.4...) of Sigil. When I opened it in 0.5.907, no html file showed up. (Other books were fine.) So I renamed the html file that wasn't showing, deleting the part of the name containing 'ä', and now Sigil found it. I put the ä back in Sigil, and now there are no problems.

Just thought that I should mention it, as most people don't use ü & ö and this could go unnoticed for quite a while.
02-28-2013, 05:03 PM   #2
Man Eating Duck
Addict
Posts: 254
Karma: 69786
Join Date: May 2006
Location: Oslo, Norway
Device: Kobo Aura, Sony PRS-650
Quote:
Originally Posted by travger
Just thought that I should mention it, as most people don't use ü & ö and this could go unnoticed for quite a while.
Generally I would advise people to never use characters outside of English a-z (or spaces for that matter) in internal filenames. While all software should be using Unicode by now, a lot of it is English-centric so that errors with different charsets never show up in testing by the author(s). These bugs can be hard to track down.
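
A minimal sketch of that advice in Python, for anyone who wants to clean up names before adding files to a book. The function name `ascii_safe_name` and the exact set of allowed characters (letters, digits, dot, underscore, hyphen) are assumptions made for illustration, not something any of the tools mentioned here do for you:

```python
import re
import unicodedata


def ascii_safe_name(name: str) -> str:
    """Return a copy of `name` reduced to plain ASCII without spaces."""
    # Split accented letters into base letter + combining mark (ä -> a + U+0308).
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop everything that cannot be represented in ASCII (the combining marks).
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    # Replace runs of whitespace with a single underscore.
    ascii_only = re.sub(r"\s+", "_", ascii_only)
    # Remove any remaining character outside the assumed safe set.
    return re.sub(r"[^A-Za-z0-9._-]", "", ascii_only)


print(ascii_safe_name("my bäd file.html"))
```

Letters that have no ASCII decomposition (ß or ø, for example) are simply dropped by this sketch, and any links or manifest entries pointing at the old name would still need to be updated by hand.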

On a side note, I can mention that calibre still uses backslashes in links when converting an epub to zip (which generates an otherwise very nice set of html files of your book), but only on Windows. Needless to say, this is also a Bad Idea.
02-28-2013, 08:19 PM   #3
user_none
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
Quote:
Originally Posted by travger
I had a test epub that was made with an old, much simpler version (0.4...) of Sigil. When I opened it in 0.5.907, no html file showed up. (Other books were fine.) So I renamed the html file that wasn't showing, deleting the part of the name containing 'ä', and now Sigil found it. I put the ä back in Sigil, and now there are no problems.
The html file within the archive had 'ä' as part of the filename?

That beta had known issues with Unicode characters as part of the filename. It was a beta and is no longer distributed. Try 0.7.0 and see if the problem persists. If it does, then that is a bug that needs to be fixed.
03-02-2013, 05:37 AM   #4
travger
Evangelist
Sorry, I can't reproduce it. I no longer have 0.4.xx installed. 0.5.9 and 0.7.0 have no problems seeing html files that were renamed by the other.

Yes, the html file within the archive had 'ä' as part of the filename. I deleted that part outside Sigil.

I just thought that there may be some epubs several years old out there, where unsuspecting people may lose access to the html in Sigil and think that the epub is damaged.
03-02-2013, 12:18 PM   #5
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Here is an example of a massive EPUB I created in Sigil 0.5.3 with Unicode filenames:

http://www.mediafire.com/?9v891dfcbw9i87z

(Yes, yes, I know the EPUB is a huge mess and is not exactly fully EPUB compliant; it was a huge WIP that I put on hold to continue other work.)

Sigil 0.5.3 allowed me to save/open/rename files with unicode characters perfectly fine (and if I recall correctly, FlightCrew said nothing about potential filename errors). I imported them into Sigil 0.5.3 using the typical Add Existing File dialog.

My Nook is able to read these EPUBs fine, even the articles with unicode in the filenames.

These HTML files were all auto generated from a website using the "ArticleNumber_Author_ArticleTitle.html" format. Here is an example of one file name:

Code:
3251_Juan.Ramón.Rallo.Julián_Economic.Crisis.and.Paradigm.Shift.html
When running the EPUB through EPUBCheck 3.0, you get this output:

Code:
File name contains non-ascii characters: óá. Consider changing filename
Now, when opened in Sigil 0.7.0 (and I believe 0.6.0+), the unicode files do not appear in the Book Browser, and if the FlightCrew check is done, it gives this error:

Code:
The <item> element's "href" attribute points to file "Text/x3251_Juan.Ram%C3%B3n.Rallo.Juli%C3%A1n_Economic.Crisis.and.Paradigm.Shift.html" which does not exist.
Perhaps there should be some way to gracefully handle Unicode filenames instead of making them disappear completely?
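
For anyone puzzling over the href in that FlightCrew message: it looks like the percent-encoded UTF-8 form of the filename, which is how non-ASCII characters are usually written in URLs. A quick way to check, sketched in Python with the standard library (the variable names are just for illustration):

```python
from urllib.parse import unquote

# href copied from the FlightCrew message above
href = ("Text/x3251_Juan.Ram%C3%B3n.Rallo.Juli%C3%A1n_"
        "Economic.Crisis.and.Paradigm.Shift.html")

# unquote() decodes %XX escapes as UTF-8 by default,
# so %C3%B3 becomes ó and %C3%A1 becomes á.
decoded = unquote(href)
print(decoded)
```

So the manifest entry itself is readable; the open question is whether the name stored in the archive decodes to the same string.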

Last edited by Tex2002ans; 03-02-2013 at 12:22 PM.
03-02-2013, 04:42 PM   #6
user_none
Sigil & calibre developer
Quote:
Originally Posted by Tex2002ans
Here is an example of a massive EPUB I created in Sigil 0.5.3 with Unicode filenames:

...

Sigil 0.5.3 allowed me to save/open/rename files with unicode characters perfectly fine (and if I recall correctly, FlightCrew said nothing about potential filename errors). I imported them into Sigil 0.5.3 using the typical Add Existing File dialog.

My Nook is able to read these EPUBs fine, even the articles with unicode in the filenames.
Later versions of Sigil are stricter about the EPUB 2 spec. The file you've attached is invalid. The filenames in the archive must be utf-8 encoded according to the spec, and Sigil tries to decode the filenames using utf-8. The filenames in this file are _not_ utf-8 encoded.

So what's happening is Sigil is getting the list of files from the OPF and they don't match what it gets from the archive. So Sigil thinks the file does not exist.
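
The mismatch can be reproduced in a few lines of Python. This is only a sketch: it assumes the archive stored the name in CP437 (the actual encoding used in the example file is not stated here), and the shortened filename and the helper name `decode_as_utf8` are made up for illustration.

```python
def decode_as_utf8(raw: bytes) -> str:
    """Decode archive filename bytes as UTF-8; invalid bytes become U+FFFD."""
    return raw.decode("utf-8", errors="replace")


opf_name = "Juan.Ramón.html"        # name as listed in the OPF (hypothetical)
stored = opf_name.encode("cp437")   # assumed non-UTF-8 bytes in the archive
from_archive = decode_as_utf8(stored)

# The ó did not survive the UTF-8 decode, so the two names differ
# and a lookup by the OPF name finds nothing.
print(from_archive == opf_name)
```

Had the bytes been written as UTF-8 in the first place, the same decode would give back the OPF name unchanged.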

Some EPUB readers are more relaxed and don't really care about the filename encoding. These will either ignore utf-8 encoding and use the standard ZIP encoding or they will check if the utf-8 bit is set and only use utf-8 in that case. I've made a change to Sigil for 0.7.1 to check the utf-8 bit and use the standard ZIP filename encoding if it's not set instead. With this change the example file opens properly.
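
Roughly, that fallback looks like the following Python sketch. It is not Sigil's actual code; the helper names `decode_zip_name` and `utf8_flags` are made up, and CP437 is assumed as the "standard ZIP encoding", since that is what the ZIP format defines when the utf-8 flag (bit 11 of the general purpose bit flag) is not set.

```python
import io
import zipfile

UTF8_FLAG = 0x800  # bit 11 of the ZIP general purpose bit flag


def decode_zip_name(raw: bytes, flag_bits: int) -> str:
    """Decode raw filename bytes according to the entry's utf-8 flag."""
    if flag_bits & UTF8_FLAG:
        return raw.decode("utf-8")
    # Flag not set: fall back to the ZIP default encoding (assumed CP437).
    return raw.decode("cp437")


def utf8_flags(zip_bytes: bytes) -> dict:
    """Map each entry name in an in-memory ZIP to whether its utf-8 flag is set."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {info.filename: bool(info.flag_bits & UTF8_FLAG)
                for info in zf.infolist()}
```

`utf8_flags` is only there to inspect an archive: given the bytes of an EPUB, it shows which entries were marked as utf-8 by whatever tool wrote them. Python's own zipfile module applies the same rule when it reads entry names.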



Quote:
Originally Posted by Tex2002ans
When running the EPUB through EPUBCheck 3.0, you get this output:

Code:
File name contains non-ascii characters: óá. Consider changing filename
This recommendation is because of this very situation. These characters must be decoded properly, otherwise they won't match what's in the OPF. A reading system can either A) follow the spec and expect the defined encoding, or B) see what the archive has set as the encoding. With A we get into this situation. With B, well, this assumes the encoding was marked properly. Either way you're going to run into this situation using non-ascii characters.
03-02-2013, 05:40 PM   #7
Tex2002ans
Wizard
Quote:
Originally Posted by user_none
I've made a change to Sigil for 0.7.1 to check the utf-8 bit and use the standard ZIP filename encoding if it's not set instead. With this change the example file opens properly.
Great to hear.
03-02-2013, 09:13 PM   #8
travger
Evangelist
Thanks, wonderful!
travger is offline   Reply With Quote
All times are GMT -4.