Old 12-03-2012, 09:17 AM   #1
boatat72
Member
boatat72 began at the beginning.
 
Posts: 19
Karma: 10
Join Date: Apr 2011
Location: Cambridge UK
Device: Kindle, Sony Reader, iPad, Kobo, Nook
0.6.1 breaks conversion by Calibre

When converting EPUB to EPUB with Calibre, the presence of a non-breaking space [&nbsp;] in a given file causes the CSS file path in the <head> of that document to point to the wrong place.

On opening the document with Sigil I get

The following errors occurred when loading the EPUB:
Not well formed, Cannot perform html updates: myfilename1.xhtml
Not well formed, Cannot perform html updates: myfilename2.xhtml
etc.

Correcting the CSS path then gives this error

entity 'nbsp' not found

Removing all &nbsp; entities cures this completely, but I find them useful for indenting the occasional line in a poem or for other minor positional adjustments.
This did not happen in 0.6.0.
Old 12-03-2012, 10:56 AM   #2
Jellby
frumious Bandersnatch
Jellby ought to be getting tired of karma fortunes by now.

Posts: 7,514
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
So... calibre creates invalid files... wouldn't that be an issue to report to calibre?
Old 12-03-2012, 11:12 AM   #3
boatat72
Possibly, but I have to start somewhere. There may be other entities that cause this to happen. Catch 22. Calibre would probably say 'as it didn't happen before today's update it must be Sigil's problem'.
Old 12-03-2012, 12:25 PM   #4
meme
Sigil developer
meme ought to be getting tired of karma fortunes by now.
 
Posts: 1,275
Karma: 1101600
Join Date: Jan 2011
Location: UK
Device: Kindle PW, K4 NT, K3, Kobo Touch
I'm a little confused on exactly where the nbsp character is in your file, or what the problem is. Do you have a sample EPUB?

nbsp handling did undergo a change in 0.6.1, but I'm not sure if it's related to your issue.

Essentially, the Qt editor does not work as expected with the nbsp character (character code 160): if there are any nbsp characters in your file while you are using it (whether in Book View or when storing/retrieving it for Code View), Qt converts them to normal space characters. This meant that if you used an &nbsp; entity in your code and switched to Book View and back after some editing, all your &nbsp; entities would disappear, because Book View would convert them to nbsp characters, which were then converted to normal spaces. The same happened when loading a file: any nbsp characters were converted to normal spaces.

0.6.1 now converts any nbsp characters into &nbsp; entities when it loads your EPUB/file so that they don't get changed to normal spaces and therefore 'lost'. This is the only character that gets treated as special.

Interestingly, it probably means you'll see more &nbsp; entities in Code View if you edit in Book View. This is because Book View inserts an nbsp character into the code when it thinks you want one (e.g. if you hit the spacebar twice). Before, when you switched from Book View to Code View, that nbsp character would be invisibly converted to a space, and the pretty printing might convert the 2 spaces to 1 space. Now you'll see the &nbsp; entity that you put in there, but you'll probably end up wanting to do a Replace to convert the ones you don't need to ordinary spaces.
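
To make the before/after behaviour concrete, here is a minimal Python sketch of the two effects described above. It is not Sigil's actual code (Sigil is a C++/Qt application), and the function names old_round_trip and new_load are made up for illustration.

```python
NBSP_CHAR = "\u00a0"  # the nbsp character, code 160


def old_round_trip(text: str) -> str:
    """Pre-0.6.1 effect as described: entities become nbsp characters,
    and nbsp characters then end up as plain spaces."""
    return text.replace("&nbsp;", NBSP_CHAR).replace(NBSP_CHAR, " ")


def new_load(text: str) -> str:
    """0.6.1 effect as described: nbsp characters are turned into
    &nbsp; entities on load so they are no longer lost."""
    return text.replace(NBSP_CHAR, "&nbsp;")


line = "<p>one\u00a0two&nbsp;three</p>"
print(old_round_trip(line))  # <p>one two three</p>
print(new_load(line))        # <p>one&nbsp;two&nbsp;three</p>
```

The second form only survives a strict XML parse if something in the file defines the nbsp entity, which is where the errors reported in this thread come in.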
Old 12-03-2012, 02:31 PM   #5
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.

Posts: 27,465
Karma: 192992430
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
When converting EPUB to EPUB with Calibre, the presence of a non-breaking space [&nbsp;] in a given file causes the CSS file path in the <head> of that document to point to the wrong place.
I just can't quite get my head around that. Are you saying there's an nbsp entity in the path to the css file in the <link> tag in the head? That's not valid to begin with ... and I frankly can't think of another nbsp entity situation that would cause the file path to be wrong.

As to the "entity 'nbsp' not found" error (and nothing working until all occurrences are removed) that sounds like an incorrect namespace in the document header.
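
For what it's worth, this is the kind of error a strict XML parser reports when nothing in the document defines the entity: XML itself predefines only &amp;, &lt;, &gt;, &quot; and &apos;, and anything else has to come from a DTD. A small Python sketch using the standard library parser (an assumption for illustration; it is not the parser Sigil uses) shows the difference between the named entity and the numeric reference &#160;, which needs no DTD:

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses as XML without any DTD available."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Same paragraph twice: once with the named entity, once with the numeric one.
named = '<p xmlns="http://www.w3.org/1999/xhtml">a&nbsp;b</p>'
numeric = '<p xmlns="http://www.w3.org/1999/xhtml">a&#160;b</p>'

print(is_well_formed(named))    # False: nbsp is not defined anywhere
print(is_well_formed(numeric))  # True: numeric references are always allowed
```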
Old 12-04-2012, 05:33 AM   #6
boatat72
Hi, thanks for looking at this.
All the &nbsp; are in the text in the <body> and were inserted in book view by hitting the spacebar.
There are none inside the <head> or <link> tags. But it is the <link> that gets damaged if a &nbsp; is present in the text of that file.

There are four files attached, 2 pure Sigil code and the same two after conversion by Calibre. They all contain exactly the same content but made with 2 different versions of Sigil, as per the filenames.
I did not keep a copy of 0.6.0 so used 0.5.3.
Chapter 1 is an example of a poem with lots of differential spacing. [gives error]
Chapter 2 has one &nbsp; [gives error]
Chapter 3 has only ordinary spaces [no error]

The problem does not exist before the EPUB is opened and it does not appear to matter which version of Sigil made the file originally.
The act of opening EPUBs with Sigil 0.6.1 introduces the problem to files that have been converted by Calibre. The same file can be opened by 0.5.3 with no errors.
Attached Files
File Type: epub Made with 0.5.3 Converted by Calibre.epub (49.5 KB, 170 views)
File Type: epub Made with 0.6.1 Converted by Calibre.epub (49.7 KB, 204 views)
File Type: epub TEMPLATE made with 0.5.3.epub (42.5 KB, 205 views)
File Type: epub TEMPLATE made with 0.6.1.epub (42.5 KB, 165 views)
Old 12-14-2012, 11:13 AM   #7
jackie_w
Grand Sorcerer
jackie_w ought to be getting tired of karma fortunes by now.
 
Posts: 6,171
Karma: 16228536
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
I was just about to report an nbsp problem with Sigil 0.6.2 when I spotted this thread. Does anyone know whether a solution is likely in the short term?

The only workaround I can think of is to
  1. explode epub -- find/replace all occurrences of unicode nbsp char with some other 'uncommon' char string -- rebuild epub.
  2. load epub into Sigil to do originally planned changes -- save.
  3. reverse changes from step 1, to put nbsp back again.
Has anyone found a better workaround?
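
The three-step workaround above could be scripted roughly as follows. This is only a Python sketch under some assumptions: the placeholder string @@NBSP@@ is hypothetical (pick anything that never occurs in the book), the text files are assumed to be UTF-8, and the function name swap_in_epub is made up. It works on the EPUB as bytes so that the same function can be used for step 1 and, with the arguments swapped, for step 3.

```python
import io
import zipfile

NBSP = "\u00a0"
PLACEHOLDER = "@@NBSP@@"  # hypothetical marker; must not occur in the book
TEXT_EXTENSIONS = (".html", ".xhtml", ".htm")


def swap_in_epub(epub_bytes: bytes, old: str, new: str) -> bytes:
    """Return a copy of the EPUB with `old` replaced by `new` in (X)HTML files."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        # The EPUB container wants "mimetype" as the first, uncompressed entry.
        names = sorted(src.namelist(), key=lambda n: n != "mimetype")
        for name in names:
            data = src.read(name)
            if name.lower().endswith(TEXT_EXTENSIONS):
                # Assumes UTF-8 text; other encodings would need handling.
                data = data.decode("utf-8").replace(old, new).encode("utf-8")
            method = zipfile.ZIP_STORED if name == "mimetype" else zipfile.ZIP_DEFLATED
            dst.writestr(name, data, compress_type=method)
    return out.getvalue()
```

Step 1 would be swap_in_epub(data, NBSP, PLACEHOLDER), then the edit in Sigil, then swap_in_epub(data, PLACEHOLDER, NBSP) on the saved file to put the nbsp characters back.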

As it stands, epubs created by authors who have used 'nbsp empty paragraphs', rather than proper css styles, to create scenebreaks etc, are pretty much unusable in Sigil without some kind of pre-/post-processing.

I can give more detail and sample epub if necessary.
Old 12-14-2012, 12:36 PM   #8
boatat72
I'm still looking for a robust replacement to create spacing.
Had not thought of your workaround.
The en, em, and thin space entities are all safe in Sigil and Calibre, but some eReaders, especially Sony ones, cannot display them and you get "?" (question mark) or "¤" (undefined currency).

For scene breaks I use either of these dots or three tildes. They are all safe wherever I have tried them. The dot is barely noticeable, if you are looking for subtlety.

<div class="smallbullet">• • • • •</div>

div.smallbullet {
  display: block;
  text-align: center;
  font-size: 0.7em;
  margin-bottom: 0.8em;
  margin-top: 0.8em;
  margin-left: 0;
  margin-right: 0;
  border: 0;
  text-indent: 0;
}

<div class="dot">⋅ ⋅ ⋅ ⋅ ⋅</div>

div.dot {
  display: block;
  text-align: center;
  margin-bottom: 0.8em;
  margin-top: 0.8em;
  margin-left: 0;
  margin-right: 0;
  border: 0;
  text-indent: 0;
}
Old 12-14-2012, 01:02 PM   #9
cybmole
Wizard
cybmole ought to be getting tired of karma fortunes by now.
 
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
I am not seeing this issue.
I have been tweaking a book which has lots of &nbsp; entities; Sigil tells me there are 480 of them.

Many are used as line spacers, i.e. like <p class="something">&nbsp;</p>

The book opens and closes in Sigil just fine: no issue with links, no issues with "lost" spaces.

I reckon I started this book tweak at least one version back, maybe 2 versions back.

I am using the 64-bit versions.

I just re-opened a different recent book and that still has 43 nbsp entities intact.

These are books that have also been run through calibre, because if I fix up chapter breaks I then use a calibre EPUB-to-EPUB conversion to split the XHTML files on a one-per-chapter basis, as that is faster than doing them manually.

Either the bug is being badly described or the 64-bit versions are immune?

Last edited by cybmole; 12-14-2012 at 01:30 PM.
Old 12-14-2012, 01:45 PM   #10
jackie_w
@boatat72,
I'm not too worried (for my own purposes) if I have to lose nbsps in scenebreak ornaments - a single standard space between the bullets/asterisks is acceptable for my personal reading. It's the 'empty paragraphs' that are more problematic.

Re: problems on Sony readers. I think this is an unrelated issue. I do not experience any problems with 'special characters' on any of my Sony readers. This is probably because I never use the hopeless Sony default font to read anything. I expect this problem would go away for you too if you customised your epub fonts, several methods exist depending on which Sony model you're using.


@cybmole,
Hmm, I hadn't considered that there may be a difference depending on whether one was using 32-bit or 64-bit. I'm still using WinXP SP3 32-bit.

I've attached a sample 1-page epub (calibre conversion from html). If you have time, perhaps you could try opening it with Sigil 0.6.2. These are the problems I see:
  1. On opening - an error message "Not well formed, cannot perform html updates: Sigil_nbsp.html"
  2. The css <link> in the html header: the original path has been retained
    Code:
    <link href="stylesheet.css" type="text/css" rel="stylesheet"/>
    rather than what should happen, i.e. path automatically changed to
    Code:
    <link href="../Styles/stylesheet.css" type="text/css" rel="stylesheet"/>
  3. Try changing some of the text on the html page, then try to save the epub. I get an error box:
    "The operation you requested cannot be performed because Sigil_nbsp.html is not a well formed XML document.
    An error was found at or above line 14: entity 'nbsp' not found.
    The Fix Manually option will let you fix the problem by hand."
Attached Files
File Type: epub Sigilnbsp.epub (2.7 KB, 150 views)
Old 12-14-2012, 01:50 PM   #11
cybmole
It opens with no errors for me: Windows 7 64-bit, Sigil 6.0.2.
The code seems intact, and if I add some text and save, the save works OK too. Here is a paste from my Sigil Code View after loading.
Code:
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Sigil vs nbsp</title>
  <link href="../Styles/stylesheet.css" rel="stylesheet" type="text/css" />
</head>

<body class="calibre">
  <h4 class="calibre1">Sigil and the non-breaking space</h4>

  <p class="calibre2">Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.</p>

  <p class="ctr">*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;*</p>

  <p class="calibre2">The above paragraph spacer contains several non-breaking spaces and causes problems when opened in Sigil v6.2.</p>

  <p class="calibre2">&nbsp;</p>

  <p class="calibre2">&nbsp;</p>

  <p class="calibre2">This paragraph is separated from the previous one by 2 empty nbsp paragraphs.</p>
</body>
</html>
Old 12-14-2012, 02:00 PM   #12
jackie_w
Ah, I think we may be at cross purposes. The problem I described is with Sigil v0.6.2 (Build time: 2012.12.06 20:35:42 UTC) not Sigil v0.6.0.2.

If I understood it correctly, meme's description in post #4 sounded as if v0.6.0 silently converted the nbsp entity to a standard space, which could be fairly disastrous with 'empty paragraphs'.

[Added: At least with v0.6.2 I can see when I have an epub with an 'nbsp problem' and can do something about it.]

Last edited by jackie_w; 12-14-2012 at 02:04 PM.
Old 12-14-2012, 02:09 PM   #13
boatat72
Thanks, we may be onto something here.
Sigilnbsp.epub is broken for me with 0.6.2 and Windows 7 32-bit.
If I can persuade my wife to lend me her 64-bit machine I will test this out.
Maybe not tonight, have to decorate the tree.
Old 12-14-2012, 03:13 PM   #14
cybmole
Hmm, all is not completely well with 6.0.2 64-bit.

I am now tweaking a different book, and every time I save it and then load it, I find
<p class="calibre5"></p>
in between every text paragraph,
i.e. I see
<p class="calibre5">para1</p>
<p class="calibre5"></p>
<p class="calibre5">para2</p>
<p class="calibre5"></p>

etc. all through the book.

This has no visual effect in Book View, but it is disconcerting, as I cannot deduce why the empty paras are being added back in for this book but not for others.

Any suggestions for further diagnosis?
Code View sample follows:
Code:
<p class="calibre5"></p>

  <p class="calibre5"><span>On to Leah, then. She is a <span>Volo</span>, which is the top telekinetic demon category. Like Adam, she is a rarity, fathered by a singular high-ranking demon. The difference is that Adam, at twenty-four, only recently learned to use his full powers. As with spell-casters, the progression takes time. Although Adam started being able to inflict burns by twelve, it took another dozen years before he could incinerate. Leah, at thirty-one, has likely been in full use of her power for at least five years now, giving her plenty of practice time.</span></p>

  <p class="calibre5"></p>

  <p class="calibre5"><span>Cary</span><span>'s death was a good indication of what Leah can do. Yet it was the only clear example of her powers I had. Yes, we'd encountered her last year and, yes, lots of objects had gone flying through the air, but there was a problem. Not only hadn't I witnessed anything firsthand, but there'd been a sorcerer involved, meaning it was difficult to tell where his contributions to the chaos left off and Leah's began.</span></p>

  <p class="calibre5"></p>
If I turn pretty print OFF, save and reload, I now see these self-closed tag entries instead:
<p class="calibre5"/>

and when I Replace All of those I see extra line spacing.

Getting worried now about messing up this book, so will post, go make a backup, then ponder a while...

Update: seems OK now. I turned pretty print back on and reloaded, and there were no empty line additions, so the last thing I did (a Replace All aimed at the extra line spacing, followed by a save with pretty print off) seems to have cleaned things up OK.
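
A rough way to check for these before and after a save is a small script run over the unzipped XHTML. The sketch below is Python with a deliberately simple regular expression (an assumption: it ignores edge cases such as a > inside an attribute value). It counts or removes paragraphs that are empty or self-closed while leaving &nbsp; spacer paragraphs alone; the function names are made up.

```python
import re

# Matches <p ...></p> (optionally with whitespace inside) and <p .../>.
EMPTY_P = re.compile(r"<p\b[^>]*?(?:/>|>\s*</p>)")


def count_empty_paragraphs(xhtml: str) -> int:
    """Number of empty or self-closed <p> elements in the text."""
    return len(EMPTY_P.findall(xhtml))


def strip_empty_paragraphs(xhtml: str) -> str:
    """Remove empty or self-closed <p> elements; everything else is kept."""
    return EMPTY_P.sub("", xhtml)


sample = (
    '<p class="calibre5">para1</p>\n'
    '<p class="calibre5"></p>\n'
    '<p class="calibre5"/>\n'
    '<p class="calibre5">&nbsp;</p>'
)
print(count_empty_paragraphs(sample))  # 2
```

Counting first, on a backup copy, seems safer than stripping straight away.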

Last edited by cybmole; 12-14-2012 at 03:20 PM.
Old 12-14-2012, 04:40 PM   #15
meme
If you are having an nbsp problem - change your Preferences -> Clean Source to Pretty Print Tidy, and then re-open your file.

It's likely that the header of your document is invalid and not setting the type of the document to one that supports the nbsp entity. Your preferences may have been accidentally changed to 'Off', so Sigil reads the file as is, which generates the error message. With Pretty Print Tidy on, it corrects as much as it can and should allow your files to be opened.

And if you are seeing any odd entries in your files, try unzipping your original epub or looking at your original HTML file to see what is actually in the file, since it was probably pretty odd to begin with.
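
One quick check along those lines, once the epub is unzipped, is to look for files that use named entities but carry no DOCTYPE at all. The Python sketch below is only a heuristic under stated assumptions: it treats any DOCTYPE as sufficient without checking what it actually declares, and the function name undeclared_entities is made up.

```python
import re

# The only entities XML defines without a DTD.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
NAMED_ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def undeclared_entities(xhtml: str) -> set:
    """Named entities used in a file that has no DOCTYPE to define them.

    Assumption: if any DOCTYPE is present, it is taken on trust.
    """
    if "<!DOCTYPE" in xhtml:
        return set()
    return set(NAMED_ENTITY.findall(xhtml)) - XML_PREDEFINED


page = '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>&nbsp;&amp;</p></body></html>'
print(undeclared_entities(page))  # {'nbsp'}
```

A non-empty result points at a file likely to trigger the "entity not found" error when read as is.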
All times are GMT -4.