Old 03-08-2012, 01:32 PM   #1
Bobosmite
Enthusiast
Posts: 35
Karma: 474582
Join Date: Mar 2012
Device: Kindle-DX
Questions about converting ellipsis

In my Word documents, I expand the spacing in the ellipsis by 3 pt, but conversion ignores it and gives me dot-dot-dot squeezed together. Typically, after conversion, it looks like this: TEXT ... TEXT or: TEXT...TEXT
By adding a space between the dots, this is how I want it to look: TEXT. . .TEXT

When converting to an ebook, can Calibre detect this as an ellipsis and allow the line to break before or after, or is that determined by the reading device? Also, does Calibre recognize non-breaking spaces?

Last edited by Bobosmite; 03-08-2012 at 01:35 PM.
Old 03-08-2012, 02:41 PM   #2
jackie_w
Grand Sorcerer
Posts: 6,198
Karma: 16228558
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
Someone should correct me if I'm wrong, but check your Convert - Look & Feel page. If the 'Smarten punctuation' box is checked, then both dot-dot-dot (3-char) and dot-space-dot-space-dot (5-char) will be converted to a single-char ellipsis in the converted file. The visual appearance of the single-char ellipsis will depend on the font face used in the e-reader application.

As far as I know Calibre respects the non-breaking-space entity. What I don't know is whether dot-nbsp-dot-nbsp-dot will also be converted to a single char ellipsis if 'Smarten punctuation' is checked.

As far as line-breaking at an ellipsis is concerned, I would think the reading application would decide that.
Old 03-08-2012, 02:51 PM   #3
DiapDealer
Grand Sorcerer
Posts: 27,539
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by jackie_w View Post
What I don't know is whether dot-nbsp-dot-nbsp-dot will also be converted to a single char ellipsis if 'Smarten punctuation' is checked.
Calibre uses SmartyPants to smarten punctuation, so dot-nbsp-dot-nbsp-dot will be left alone by the process.
Old 03-08-2012, 06:04 PM   #4
dwig
Wizard
Posts: 1,613
Karma: 6718479
Join Date: Dec 2004
Location: Paradise (Key West, FL)
Device: Current:Surface Go & Kindle 3 - Retired: DellV8p, Clie UX50, ...
Quote:
Originally Posted by Bobosmite View Post
...can Calibre detect this as an ellipsis
Probably not, as you've inserted spaces between the dots.

Quote:
and allow the line to break before or after, or is that determined by the reading device?
It's the reading device/app that determines where line wrapping occurs. If you want the reader to be able to break before and/or after a true ellipsis, you need to insert a standard space before and/or after the ellipsis.

A true ellipsis is a single special character. The three-periods style is an antique typewriter kluge. Many apps, including a properly configured MS Word, will replace these with a proper ellipsis even if you adjust the letter spacing (aka kerning) between the characters. When you space the periods out with space characters, I wouldn't expect any app to do the auto replacement. If you use non-breaking spaces (&nbsp; in HTML) to space the periods, it will flow as if it is a single character, but you need to use standard "breaking" spaces before and/or after if you want the reader to be able to wrap the line at those positions.

Quote:
Also, does Calibre recognize non-breaking spaces?
Yes, at least in most input formats.
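The non-breaking-space approach described above can be sketched in Python. This is only an illustration, not anything calibre does: the function name space_out_ellipses and the choice to pad with one ordinary space on each side are assumptions made for the example.

```python
import re

NBSP = "\u00a0"  # non-breaking space, written &nbsp; in HTML

# Three periods held together by non-breaking spaces, so the run
# cannot be split across lines by the reading app.
SPACED_ELLIPSIS = "." + NBSP + "." + NBSP + "."


def space_out_ellipses(text):
    """Replace each run of exactly three periods with a spaced ellipsis.

    Hypothetical helper: one ordinary (breaking) space is placed on each
    side so the reader may wrap the line before or after the ellipsis.
    Runs of four or more periods are left untouched.
    """
    pattern = r"(?<!\.) ?\.{3} ?(?!\.)"
    return re.sub(pattern, " " + SPACED_ELLIPSIS + " ", text)


print(repr(space_out_ellipses("TEXT...TEXT")))
print(repr(space_out_ellipses("TEXT ... TEXT")))
```

Dropping the two padding spaces from the replacement string would give the tighter TEXT. . .TEXT look, at the cost of removing the wrap points next to the ellipsis.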
Old 03-08-2012, 06:34 PM   #5
Bobosmite
Enthusiast
Posts: 35
Karma: 474582
Join Date: Mar 2012
Device: Kindle-DX
Quote:
Originally Posted by dwig View Post
If you want the reader to be able to break before and/or after a true ellipsis, you need to insert a standard space before and/or after the ellipsis.
Yeah, I had a feeling that might be the case. I'm so used to having full control of the document that going the other direction, to a flat file, is like drawing with ASCII. Once I figure out the conversion logic, it will be much easier.
Old 03-09-2012, 05:29 AM   #6
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Quote:
Originally Posted by DiapDealer View Post
Calibre uses SmartyPants to smarten punctuation, so dot-nbsp-dot-nbsp-dot will be left alone by the process.
Smarten Punctuation primarily relies on SmartyPants, but it also converts ... to an ellipsis and -- to an em-dash.
Old 03-09-2012, 06:49 AM   #7
DiapDealer
Grand Sorcerer
Posts: 27,539
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by ldolse View Post
Smarten Punctuation primarily relies on SmartyPants, but it also converts ... to an ellipsis and -- to an em-dash.
That's all part of what SmartyPants does: quotes, ellipses, and em-/en-dash conversion.

I wasn't guessing. I looked at the code for the SmartyPants script that's included with calibre. The only things Smarten punctuation converts to the ellipsis character are three periods in a row, or period<sp>period<sp>period.

Period-nbsp-period-nbsp-period will escape Smarten punctuation's notice.
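A quick way to check this claim, as a minimal sketch: the function below is a stand-in (the name smarten_ellipses is hypothetical, and this is not calibre's code) that applies only the two patterns described above, three periods in a row and period-space-period-space-period. It emits the U+2026 character directly rather than an HTML entity, which is an assumption made to keep the demo readable.

```python
import re

NBSP = "\u00a0"      # non-breaking space
ELLIPSIS = "\u2026"  # single-character ellipsis


def smarten_ellipses(text):
    """Stand-in that applies only the two ellipsis patterns discussed here."""
    text = re.sub(r"\.\.\.", ELLIPSIS, text)    # three periods in a row
    text = re.sub(r"\. \. \.", ELLIPSIS, text)  # periods split by ordinary spaces
    return text


samples = [
    "Huh...?",
    "Huh. . .?",
    "Huh." + NBSP + "." + NBSP + ".?",  # periods split by non-breaking spaces
]
for sample in samples:
    result = smarten_ellipses(sample)
    status = "converted" if result != sample else "left alone"
    print(repr(sample), "->", repr(result), "(" + status + ")")
```

The literal space in the second pattern matches only U+0020, so the non-breaking-space variant passes through unchanged.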
Old 03-09-2012, 07:20 AM   #8
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Quote:
Originally Posted by DiapDealer View Post
That's all part of what SmartyPants does: quotes, ellipses, and em-/en-dash conversion.

I wasn't guessing. I looked at the code for the SmartyPants script that's included with calibre. The only things Smarten punctuation converts to the ellipsis character are three periods in a row, or period<sp>period<sp>period.

Period-nbsp-period-nbsp-period will escape Smarten punctuation's notice.
I wasn't guessing either, though I didn't fully get the gist of your response. I missed your point about non-breaking spaces, which is true regardless (although I think the OP asked about nbsp as a separate topic). Last time I checked, though, SmartyPants didn't change three dots in a row, e.g. '...'. I'll readily admit that I may no longer be correct about current behavior, given more recent changes in SmartyPants. I know my original statement was true at one time, because Smarten punctuation didn't affect ellipses or double dashes; to resolve the issue, I added and QA'd the extra logic that modifies them. Check the last lines of code here:
http://bazaar.launchpad.net/~kovid/c.../preprocess.py

edit - it's possible you're agreeing with me, but we got mixed up on where the smartypants vs smarten_punctuation code is.

Last edited by ldolse; 03-09-2012 at 07:26 AM.
Old 03-09-2012, 07:54 AM   #9
DiapDealer
Grand Sorcerer
Posts: 27,539
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Last time I checked, though, SmartyPants didn't change three dots in a row, e.g. '...'.
Of course, it's fully configurable, and you can tell it not to convert three consecutive periods to an ellipsis if you want, but its default behavior is to do so. The two regex patterns it looks for in its educateEllipses function are:
\.\.\. and \. \. \.

Code:
import re  # needed by the re.sub calls below

def educateEllipses(str):
	"""
	Parameter:  String.
	Returns:    The string, with each instance of "..." translated to
	            an ellipsis HTML entity.

	Example input:  Huh...?
	Example output: Huh&#8230;?
	"""

	# Three consecutive periods become the ellipsis entity.
	str = re.sub(r"""\.\.\.""", r"""&#8230;""", str)
	# So do three periods separated by single ordinary spaces.
	str = re.sub(r"""\. \. \.""", r"""&#8230;""", str)
	return str


Quote:
edit - it's possible you're agreeing with me, but we got mixed up on where the smartypants vs smarten_punctuation code is.
That appears to be the case.

To be perfectly honest, those last lines of calibre code you linked to have always confused me. SmartyPants has already converted '...' and '. . .' to the ellipsis HTML entity, so I'm not sure why it's being done again before the substitute_entity call. And the first -- replacement is being done to preserve any HTML comments, but it seems as if the final "--" substitution would undo all that.

EDIT: never mind that last part about the -- replacement. That appears to be catching any '--' that has a space on either side of it, which would exclude the HTML comments... but I'm still convinced it's not going to find any occurrences of ' -- ' after the SmartyPants default call.
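To illustrate why a '--' with a space on either side would leave HTML comments alone, here is a rough sketch. The pattern and the function name dash_with_spaces are hypothetical, not the actual calibre code, and replacing the surrounding spaces along with the hyphens is an assumption.

```python
import re

EM_DASH = "\u2014"


def dash_with_spaces(text):
    """Replace a double hyphen only when whitespace sits on both sides.

    Hypothetical pattern for illustration; the real preprocess code
    may differ in what it matches and what it substitutes.
    """
    return re.sub(r"\s--\s", EM_DASH, text)


print(repr(dash_with_spaces("wait -- what")))
print(repr(dash_with_spaces("<!-- note -->")))
```

In a well-formed comment the opening '--' follows '!' and the closing '--' is followed by '>', so neither has whitespace on both sides and the comment delimiters are not matched.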

Last edited by DiapDealer; 03-09-2012 at 08:34 AM.
Old 03-12-2012, 10:38 AM   #10
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Interesting. I didn't realize the logic was already built into SmartyPants, and I distinctly recall having test cases which weren't converted by SmartyPants, which was why the extra lines were added. I agree with you that on a quick review of the code they seem redundant, and it's possible something else was going on. I'll dig back into it when I get a chance.