Old 11-11-2014, 06:28 PM   #1
Psymon
Chief Bohemian Misfit
Posts: 571
Karma: 462964
Join Date: May 2013
Device: iPad, ADE
Using regex for more elegant hyphenation and word wrap

Wow, I only just learned today what "regex" means -- I've seen it here and there in different programs, but never had a clue what exactly it was for until now (duh) -- and what a world of possibilities it might open up for me in simplifying a couple of things that I've been very laboriously doing "manually" so far. I've been reading up all afternoon on regex, though, and I'm still confused on how to go about doing what I want to do, so I hope someone out there can help me come up with the right regex expressions to use.

Basically there are two separate things that I've been doing in order to make my books a little more "elegant," typographically.

PROBLEM #1 - More Selective Hyphenation

I'm primarily an iPad user (forgive me), but I hate the way that it automatically hyphenates words willy-nilly all over the place, even shorter words that didn't need to be, and so what I did to counter that was initially turn hyphenation off in my book completely, by adding this in my styles (wherever I wanted hyphenation to be turned off)...

Code:
-webkit-hyphens:none;
-epub-hyphens:none;
-moz-hyphens:none;
adobe-hyphenate: none;
hyphens:none;
As far as I know, that covers "everything," i.e. whatever devices would allow me to turn hyphenation off to begin with. And then I created a class so that I could selectively turn hyphenation on wherever I would find that it's problematic (that is, where I would find that certain lines would have a ridiculous amount of white space between words, because longer words wouldn't hyphenate)...

Code:
.hyph {
  -webkit-hyphens: auto;
  -epub-hyphens: auto;
  -moz-hyphens: auto;
  adobe-hyphenate: auto;
  hyphens: auto;
}
And then, call me crazy, but I actually went through my whole, entire book(s) looking for "problematic words", and wrapping them with that class...

Code:
<p>Here's a paragraph with an <span class="hyph">unreasonably</span> long word.</p>
As you can imagine, this is an enormous amount of work to pore over the entire book, almost word-for-word, but now I'm thinking that there surely must be an easy way to do a simple search & replace using regex -- but after spending the whole afternoon trying to figure out how, I can't seem to come up with the right expression to use.

What I'd like to search for is something to the effect of this...

[space] + [a word with at least 8 characters] + [a space OR any number of alphanumeric characters]

...and then for the replace function I want to wrap <span class="hyph"></span> around the 8+ character word and -- if it's not too ridiculous a thing to ask -- ALSO any number of punctuation marks that might come after it, but NOT if it's a space, then just close the span right after the word. If this latter is getting too weird, then wrapping it around just the word would be fine, too. The point of searching for a [space] before the word is because if, say, it's a long word at the very beginning of a paragraph (<p>), then obviously that doesn't need to be hyphenated (unless the first word happened to be "supercalifragilisticexpialidocious" or something).

Does that make sense, what I'm trying to do here? I'm having some problems grasping this regex stuff more generally, just for starters, but the biggest thing I can't figure out is how to search for words that would be 8 characters or longer (and ignore all shorter words).

PROBLEM #2 - Selectively Preventing Word Wrap

Another "typographically-annoying" thing is whenever a line happens to end with the first word of a new sentence (or of a phrase after a punctuation mark) when that word is a single letter -- which, as far as I can come up with, would be "I" or "A" or "a", or, in rarer instances, "O".

Here's a made-up example of an especially annoying paragraph...

Quote:
This is an example paragraph for you. I
hate having the "I" at the end of the line
and want it to wrap with the next word.
This should also take into account punc-
tuation, too, for example if I said, "O,
how nice this would be!", or if, say, I
was using a colon or semi-colon: a
phrase starting with "a" (and coming at
the end of a line, just like I just had here)
would be annoying, too.
So what would be nice to have a regex expression for would be to search for...

[any punctuation mark] + [space] + "I" + [space OR punctuation mark + a space] + [word of 5 characters or less, but not longer]

...and then replace that by wrapping the "I" and the following word (if it's 5 characters or less) with <span class="nowrap"></span>, where the "nowrap" class is...

Code:
.nowrap {
white-space: nowrap;
}
The reason that I only want to include words (i.e. the second word) that are only 5 characters or less is because if they're longer than that, well, then you're running into the same potential issue as with the hyphenation issue outlined above, and you'd be better off just letting the "I" (or "a" or whatever) alone, and put up with it being at the end of the line, if it turns out that way.
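For readers who want to try the search described above, here is a rough sketch in Python rather than in Sigil's own Find & Replace. The pattern, the `keep_together` helper, and the choice of which punctuation marks to accept are all assumptions made for illustration; nobody in the thread posted this expression. It looks for a punctuation mark, whitespace, a one-letter word (I, A, a, O), and a following word of at most 5 characters, and it skips the match when the following word is longer.

```python
import re

# Hypothetical pattern, assembled from the description in the post:
SHORT_PAIR = re.compile(
    r"""([.,;:!?"'])"""          # group 1: a punctuation mark ...
    r"""(\s+"?)"""               # group 2: whitespace, optionally an opening quote
    r"""([IAaO][,;:!]?\s+"""     # group 3: one-letter word, optional punctuation, whitespace,
    r"""[\w']{1,5}(?![\w']))"""  #          then a word of at most 5 characters (not longer)
)

def keep_together(text):
    """Wrap the one-letter word and the short word after it in a nowrap span."""
    return SHORT_PAIR.sub(r'\1\2<span class="nowrap">\3</span>', text)

print(keep_together("This is an example paragraph for you. I hate it."))
print(keep_together("Fine. I certainly do."))
print(keep_together('if I said, "O, how nice'))
```

The negative lookahead `(?![\w'])` is what enforces "5 characters or less, but not longer": without it, the pattern would happily match the first five letters of a longer word. This sketch works on plain text only; run against raw XHTML it could also match text inside tags, so it would need checking match by match.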

I hope you all don't think I'm crazy for nit-picking over hyphenation and word wrap like this, but, well, maybe I actually am crazy. Nevertheless, I've been doing this "manually" all along so far, and wow, what an enormous time saver it would make if I could come up with a regex expression that could do this with a simple search & replace instead! I spent the whole afternoon trying to figure this out, but I just can't seem to work out which expressions to use.

Can anyone help?

Last edited by Psymon; 11-11-2014 at 06:31 PM.
Old 11-11-2014, 07:45 PM   #2
Tex2002ans
Wizard
Posts: 2,297
Karma: 12126329
Join Date: Jul 2012
Device: Kobo Forma, Nook
Quote:
Originally Posted by Psymon View Post
PROBLEM #1 - More Selective Hyphenation

I'm primarily an iPad user (forgive me), but I hate the way that it automatically hyphenates words willy-nilly all over the place, even shorter words that didn't need to be, and so what I did to counter that was initially turn hyphenation off in my book completely, by adding this in my styles (wherever I wanted hyphenation to be turned off)...
This is a bad, bad idea... This is something better left at the specific reading program level, and the hyphenation algorithms of the device itself. Yes, SOME devices might be crap at certain words, some of the algorithms might be junky, but those can be fixed with a software update (or will be fixed in other device X).

The ONLY spot I can MAYBE see disabling hyphenation being useful, is if you wanted to disable it in headings. Besides that, it is not recommended. I DEFINITELY don't recommend disabling it everywhere, and ENABLING it on certain words (if anything, you would do the exact opposite).

Side Note: This reminds me a lot of the soft-hyphenation talk. There is even that Calibre Plugin, "Hyphenate This!", which is dedicated towards adding in soft-hyphens everywhere under the sun:

https://www.mobileread.com/forums/sho...d.php?t=208534

Here is more talk on the soft-hyphen problem:

https://www.mobileread.com/forums/sho...d.php?t=230358

Best to leave hyphenation as a user choice, which they can either enable/disable in their reader.

Other Side Note: Here is Wikipedia's page on Widows/Orphans (italics mine):

https://en.wikipedia.org/wiki/Widows_and_orphans

Quote:
Similarly, a single orphaned word at the end of a paragraph can be cured by forcing one or more words from the preceding line into the orphan's line. In web-publishing, this is typically accomplished by concatenating the words in question with a non-breaking space and, if available, by utilizing the orphans: and widows: attributes in Cascading Style Sheets. Sometimes it can also be useful to add non-breaking spaces to the first two (or few) short words of a paragraph to avoid that a single orphaned word is placed to the left or right of a picture or table, while the remainder of the text (with longer words) would only appear after the table.

[...]

In technical writing where a single source may be published in multiple formats, and now in HTML5 with the expectation of viewing content at different sizes/resolutions, use the word processor settings that automatically prevent widows and orphans. Manual overrides like inserted empty lines or extra spaces can cause unexpected white space in the middle of pages.
Quote:
Originally Posted by Psymon View Post
And then, call me crazy, but I actually went through my whole, entire book(s) looking for "problematic words", and wrapping them with that class...[...]
This is going to cause you a ton of headaches, for very little gain. And what happens when the user changes font size, changes dimensions (landscape + portrait, or devices become higher resolution/larger), etc. etc. You are just going to cause yourself TONS of headaches.

IF, and that is a big IF, you wanted that much control over the look, you might as well just go "Fixed Format". (Which brings along its own host of problems).

Quote:
Originally Posted by Psymon View Post
[space] + [a word with at least 8 characters] + [a space OR any number of alphanumeric characters]
This is the best Regex Tutorial:

http://www.regular-expressions.info/tutorial.html

To do the above, you would want something along these lines:

Search: (\b\w{8,}\b)
Replace: <span class="hyph">\1</span>

\b = a "Word Boundary", you can read up on that here: http://www.regular-expressions.info/wordboundaries.html
\w = any "Word Character", you can read up on that here: http://www.regular-expressions.info/shorthand.html
{8,} = 8 or more characters

So, in English, this says "Find a Word Boundary, then any 8 or more Word Characters in a row, followed by another Word Boundary". Since the entire thing is surrounded by parentheses, the match is stored in capture group \1.

Then take everything in \1, and "wrap that entire thing with <span class="hyph"></span>".
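To see what that search and replace does outside Sigil, here is a small Python sketch. `LONG_WORD` is the pattern given above; `LONG_WORD_AFTER_SPACE` and the `wrap_long_words` helper are hypothetical additions, not from the thread, that try to cover the original request (only words preceded by a space, with trailing punctuation pulled inside the span).

```python
import re

# The pattern from the post: 8 or more word characters between word boundaries.
LONG_WORD = re.compile(r"(\b\w{8,}\b)")

# Hypothetical stricter variant: the word must follow a whitespace character,
# and any punctuation directly after it is included in the capture group.
LONG_WORD_AFTER_SPACE = re.compile(r"(?<=\s)(\w{8,}[.,;:!?]*)")

def wrap_long_words(text, pattern=LONG_WORD):
    """Wrap every match of `pattern` in <span class="hyph">...</span>."""
    return pattern.sub(r'<span class="hyph">\1</span>', text)

sample = "Unreasonably long words, like supercalifragilistic, get wrapped."
simple = wrap_long_words(sample)
strict = wrap_long_words(sample, LONG_WORD_AFTER_SPACE)
print(simple)
print(strict)
```

Two caveats apply to both patterns: `\w` also matches digits and underscores, and on raw XHTML the search will also hit long words inside tags and attribute values (file names, class names), so a blind "Replace All" over a whole file is risky.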

Quote:
Originally Posted by Psymon View Post
PROBLEM #2 - Selectively Preventing Word Wrap

Another "typographically-annoying" thing is whenever a line happens to end with the first word of a new sentence (or of a phrase after a punctuation mark) when that word is a single letter -- which, as far as I can come up with, would be "I" or "A" or "a", or, in rarer instances, "O".

Here's a made-up example of an especially annoying paragraph...
Again, you are going to cause yourself lots of problems..... the only way to keep certain words together is adding non-breaking spaces all over the place.... and I highly recommend against that.

Look, at a certain point, you have to accept that reflowable ebooks ARE NOT PRINT.

#1: Give up trying to make them print.
#2: The EPUB standards are just not there to support a lot of the complex typographical decisions.

If you want to do all of that typographical nitpicking in EPUB, you will have to go Fixed Format, OR, just create a PDF using whatever tools (LaTeX, Quark, InDesign, etc. etc.).

Side Note: For example, in French Typography, there seems to be this weird rule of "the last line of a paragraph should not be shorter than the double of the indentation of the next paragraph.":

https://tex.stackexchange.com/questi...ne/28361#28361

These sorts of weird conventions are just NOT POSSIBLE in EPUB.

Other Side Note: Sometimes, I really wonder how typographers survive on the Internet. Their eyeballs/brains must be going crazy from websites not following typography rules, and users reading at all different font sizes + device/monitor sizes. Where they want pixel/mm perfect typography, the rest of us want resizable/reflowable/customizable.

Quote:
Originally Posted by Psymon View Post
I hope you all don't think I'm crazy for nit-picking over hyphenation and word wrap like this, but, well, maybe I actually am crazy.
Meh, it is not "crazy", but a reflowable EPUB is not the format for that. Either go PDF/DJVU, or Fixed Format.

Quote:
Originally Posted by Psymon View Post
Nevertheless, I've been doing this "manually" all along so far, and wow, what an enormous time saver it would make if I could come up with a regex expression that could do this with a simple search & replace instead!
Oh yeah, Regex is a super time saver. Now I can do stuff in a few clicks that used to take me many hours (for example, fixing up Indexes, catching typos in page numbers, adding en dashes between numbers, etc. etc.).
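As one illustration of that kind of cleanup, here is a sketch of the "en dashes between numbers" fix in Python. The pattern and the `en_dash_ranges` helper are assumptions for illustration only; the exact expression used was not posted.

```python
import re

# Hypothetical pattern: a hyphen with a digit directly on each side.
# Lookarounds leave the digits themselves untouched, so "1-2-3" is handled too.
DIGIT_HYPHEN = re.compile(r"(?<=\d)-(?=\d)")

def en_dash_ranges(text):
    """Replace digit-hyphen-digit with digit-en dash-digit (U+2013)."""
    return DIGIT_HYPHEN.sub("\u2013", text)

print(en_dash_ranges("See pages 12-15 and 101-110."))
```

A pattern this broad would also change dates, phone numbers and ISBNs, so it is better stepped through one match at a time than applied with "Replace All".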

Last edited by Tex2002ans; 11-11-2014 at 08:20 PM.
Old 11-11-2014, 08:25 PM   #3
DiapDealer
Grand Sorcerer
Posts: 27,546
Karma: 193191846
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I agree entirely with the previous post. If you want to control hyphenation for a personal document on a specific device for your own use, that's one thing. Have at it. But I wouldn't recommend trying to force something that should be a user/device preference. For one thing, you have no way of knowing where "problematic" lines might create unwanted whitespace when you don't know what fonts/fontsizes readers are using.
Old 11-11-2014, 09:06 PM   #4
Psymon
For what it's worth, with regard to the hyphenation thing I've been doing with my previous (and in-progress) books, in doing it the way I outlined, and testing it out on the iPad in all font sizes, both in portrait and landscape orientation, everything turned out remarkably well, way better than just leaving things to the "default" (i.e. by having not done what I did at all). Not only is the text more visually/aesthetically pleasing to look at, without a zillion needless hyphenations all over the place, but there's not a single instance of "large white spaces" anywhere, and hence, as a result of all that, not only is the book more visually "pleasurable" just to look at (from a design perspective) but overall ease-of-readability is improved, too (because there are fewer hyphenated words all over the place).

I can appreciate both your concerns -- if what you're concerned about was actually happening -- but in fact the exact opposite of what you're concerned about is the end result of doing what I did.
Old 11-11-2014, 09:28 PM   #5
DiapDealer
Quote:
Originally Posted by Psymon View Post
Not only is the text more visually/aesthetically pleasing to look at
Gotta be honest: this ^^^ troubles me. But if you're happy, I'm ecstatic. *shrug*
Old 11-11-2014, 09:31 PM   #6
Psymon
Why would that be "troubling"?
Old 11-11-2014, 10:13 PM   #7
DiapDealer
Quote:
Originally Posted by Psymon View Post
Why would that be "troubling"?
I don't believe anyone's qualified to determine what might be "aesthetically pleasing" for anyone other than themselves. Other than some basic formatting and maybe a few simple flourishes/frills, I believe ebook creators should stay (mostly) out of the way of users and their preferred settings on their preferred readers (with regard to basic default body text formatting). They (readers) should be free to make the decisions about what pleases them aesthetically, rather than having it dictated.

I realize not everyone feels that way--and that's fine. I'm not going to harp on it other than what I've already said.
Old 11-11-2014, 10:20 PM   #8
Psymon
Quote:
Originally Posted by DiapDealer View Post
I don't believe anyone's qualified to determine what might be "aesthetically pleasing" for anyone other than themselves.
Well, I don't know too many people who think that more hyphenation is better, where it's done not because it's actually needed, but simply because the software sees that it "can" hyphenate, and so it does, and as a result there's gratuitous hyphenation all over the place that actually interferes with ease-of-readability and overall visual appeal.

Of course, that latter, as you say, is a judgement call -- but as I said, I don't know too many people (if any) who think that more hyphenation is better, just because you (or the software) "can."
Old 11-11-2014, 10:23 PM   #9
Tex2002ans
What about non-iPad devices, what about iPad #X (future device)?

What about iPhone? I could see hyphenation enabling/disabling causing a much worse problem when you don't have much left/right space available to fit many characters.

What about someone who reads on an iPhone with huge margins on the edges? What about someone that reads with no margins? What about someone who chooses their own fonts? What about someone who wants to read in Marvin instead? *Insert huge list of changes/questions here*.

What if they want to convert this EPUB to Format X and read it in Program XYZ? Your hyphenation code will most likely not be transferred over at all. (And at worst, might get in the way/break something else... for example, soft-hyphens, while they "look nice", break search functionality in many devices).

You should stay as far out of the way of the user/reader as possible, and only give very general guidance with your CSS to tell the device how to treat the book.

It goes back to the ol' argument between "specific" versus "broad" fixes. Your problem is with the CURRENT iBooks hyphenation algorithm. So you add in all of these SPECIFIC manual tweaks, to try to make it look "more aesthetically pleasing"... but the real fix belongs at the READER/DEVICE level: complain to iBooks so that they update/tweak their hyphenation algorithm!

Or heck, what if I did want to read WITHOUT hyphenation? Your code will get in my way. (And I DO disable any and all hyphenation when I read, because I want to catch actual typos/errors, and not see the auto soft-hyphens all over the place).

What if I wanted to read left-aligned text, with no hyphenation (maybe it reminds me of MobileRead posts!)? You enabling/disabling hyphenation with individual spans would make me angry.

Last edited by Tex2002ans; 11-11-2014 at 10:37 PM.
Old 11-11-2014, 10:29 PM   #10
DiapDealer
Quote:
Originally Posted by Psymon View Post
Well, I don't know too many people who think that more hyphenation is better, where it's done not because it's actually needed, but simply because the software sees that it "can" hyphenate, and so it does, and as a result there's gratuitous hyphenation all over the place that actually interferes with ease-of-readability and overall visual appeal.

Of course, that latter, as you say, is a judgement call -- but as I said, I don't know too many people (if any) who think that more hyphenation is better, just because you (or the software) "can."
That's where I think you're limiting the scope of your efforts by focussing so much on preparing for a single platform. Most of us try to make something that renders the same no matter what application or device you read it with. I have absolutely no issues with the built-in hyphenation algorithm in my preferred reading app. Never noticed it trying to hyphenate too much.
Old 11-11-2014, 10:39 PM   #11
Psymon
Quote:
Originally Posted by Tex2002ans View Post
What about non-iPad devices, what about iPad #X (future device)? What about iPhone?

What about someone who reads with huge margins on the edges? What about someone that reads with no margins? What about someone who chooses their own fonts? What about someone who wants to read in Marvin instead? *Insert huge list of changes/questions here*.
Well, okay, those (and the rest of your message) are certainly far more valid arguments against doing what I've been doing -- at least, better than merely saying that "aesthetically pleasing" is a judgement call -- and if I were to just up and scrap it for all the various reasons given, both in my works-in-progress as well as also going back and removing it from my previous books...

well... have you got a brilliant regex script for me that will get rid of every instance of <span class="hyph"></span> without getting rid of the words in-between, and not getting rid of any other spans, and then I'll just allow hyphenation everywhere, let the software do whatever it wants (and which it does), willy-nilly all over the place?

Don't get me wrong, I'll take your (plural) advice, but I do actually think this totally, totally sucks. Of course, I also felt it sucked the other way, too, which compelled me to add in all that extra coding in the first place. :/
Old 11-11-2014, 10:48 PM   #12
PeterT
Grand Sorcerer
Posts: 12,160
Karma: 73448616
Join Date: Nov 2007
Location: Toronto
Device: Nexus 7, Clara, Touch, Tolino EPOS
Try:

Search: <span class="hyph">(.+?)</span>
Replace: \1

Or just revert to a backup you had made of that ePub before making those changes.
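The same search and replace can be checked on a sample line in Python before trusting it on a whole book. This is only a sketch: the pattern is the one given above, while the `strip_hyph_spans` helper and the sample markup (including the `big` class) are made up for illustration.

```python
import re

# Lazy match: capture everything up to the first closing </span>.
HYPH_SPAN = re.compile(r'<span class="hyph">(.+?)</span>')

def strip_hyph_spans(xhtml):
    """Remove <span class="hyph"> wrappers but keep the text inside them."""
    return HYPH_SPAN.sub(r"\1", xhtml)

before = '<p>An <span class="hyph">unreasonably</span> long <span class="big">word</span>.</p>'
after = strip_hyph_spans(before)
print(after)
```

Other spans are left alone because the opening tag has to match `class="hyph"` exactly. The lazy `(.+?)` stops at the first `</span>` it finds, so a hyph span that contains another span nested inside it would be left unbalanced; and `.` does not cross line breaks by default, so a span split over two lines would be skipped.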
Old 11-11-2014, 11:56 PM   #13
Tex2002ans
Quote:
Originally Posted by Psymon View Post
Don't get me wrong, I'll take your (plural) advice, but I do actually think this totally, totally sucks. Of course, I also felt it sucked the other way, too, which compelled me to add in all that extra coding in the first place. :/
Well you think the hyphenation algorithm stinks? The JUSTIFICATION algorithms stink! Especially without access to being able to tweak more complex microtypography (kerning, letter-width (stretching/shrinking a letter by 2-3%), letter-spacing (tweaking the space between characters a tiny bit), etc. etc.).

All these readers do just what your typical word processor does: only change the spacing between WORDS.

Side Note: Well, kerning can be done at the font level, but meh, someone can always choose a new font, and your work goes out the window.

Much of this micro-typography is being implemented in CSS3, but meh, I don't know how well it is going to work, or be supported by reading devices.

Or take the hyphenation algorithm itself: the number of hyphens in a single paragraph should really be minimized as much as possible (and heaven forbid, two lines in a row MUST NOT end in hyphens). Here, you can see a comparison in justification/hyphenation between Word/InDesign/LaTeX:

https://tex.stackexchange.com/questi...esetting-ligat

Then if you REALLY want to get more into the typography rules of hyphenation, there are rules such as "a person's last name SHOULD NOT be hyphenated"... so you have to start wrapping all of those in <span class="donthyphenate">LastName</span>.
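
As a rough sketch of how that wrapping could be automated with a regex (written in Python here rather than as a Sigil search & replace; the function name wrap_surnames and the list of surnames are made up for illustration, only the "donthyphenate" class comes from the example above):

```python
import re

def wrap_surnames(html, surnames, css_class="donthyphenate"):
    """Wrap whole-word occurrences of the given surnames in a span.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    if not surnames:
        return html
    open_tag = '<span class="%s">' % css_class
    names = "|".join(re.escape(name) for name in surnames)
    # The lookbehind skips names that already sit right after the opening
    # tag, so running the function twice does not nest spans.
    pattern = re.compile(r"(?<!%s)\b(%s)\b" % (re.escape(open_tag), names))
    return pattern.sub(lambda m: open_tag + m.group(1) + "</span>", html)

sample = "<p>Mr. Hemingway met Fitzgerald.</p>"
print(wrap_surnames(sample, ["Hemingway", "Fitzgerald"]))
```

Note that this is deliberately naive: it would also match a surname inside an attribute value (a title="..." for instance), so you would still want to review every replacement by hand.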

There are lots of other things you should also avoid, because the result is "more aesthetically pleasing"... but next thing you know, you have a huge mess of code like InDesign or Word outputs!

As I said, it is better to leave this up to the device/reader than to get super nitpicky. There are too many variables for you to worry about. This is easy if you know the EXACT page size, and the EXACT font, and the EXACT font size (like if you are designing a print book)... but once you start changing any of those variables, most of your hard work goes out the window (or gets in the way/causes problems elsewhere).

Anyway, not stopping you from wanting to do all that hyphenation/nitpickyness.... but please, just don't globally disable/enable hyphenation via CSS. Use it sparingly (as was mentioned, maybe only disable hyphens in a <h1>, <h2>, <h3>).

Last edited by Tex2002ans; 11-12-2014 at 12:15 AM.
Old 11-12-2014, 08:03 AM   #14
RbnJrg
Wizard
RbnJrg ought to be getting tired of karma fortunes by now.
 
Posts: 1,539
Karma: 6613969
Join Date: Mar 2013
Location: Rosario - Santa Fe - Argentina
Device: Kindle 4 NT
Quote:
Originally Posted by Psymon
Of course, that latter, as you say, is a judgement call -- but as I said, I don't know too many people (if any) who think that more hyphenation is better, just because you (or the software) "can."
Well Psymon, I AM one of those who think that more hyphenation is better. Of course, my native language is Spanish, but I prefer more hyphenation even in English books.

Regards
Old 11-12-2014, 10:44 AM   #15
theducks
Well trained by Cats
theducks ought to be getting tired of karma fortunes by now.
 
Posts: 29,782
Karma: 54830978
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by Psymon
Well, I don't know too many people who think that more hyphenation is better, where it's done not because it's actually needed, but simply because the software see that it "can" hyphenate, and so it does, and as a result there's gratuitous hyphenation all over the place that actually interferes with ease-of-readability and overall visual appeal.

Of course, that latter, as you say, is a judgement call -- but as I said, I don't know too many people (if any) who think that more hyphenation is better, just because you (or the software) "can."
More - is rarely better
But a forced (upon you) cure is (IMHO) never
(and having no dash is really terrible)


Take the K4 (I have one). 40px of wasted screen space as the MINIMUM l/r margins is not pleasing to me. (I have a fragile hack in place that sets it to 10.)

I love my books to look nice,
Not a stark, Joe Friday style:
"Just the words, Ma'am. Just the words."

But I also understand that some of the tricks-o-the-trade that typesetters (cold or hot) used when letters stayed put no longer apply when the page can flow.

IMHO fixed layout should be reserved for those extra special cases where free-flowing text makes a hash out of the meaning of the work.

I have also seen the mess RMSDK can make using widows and orphans. The cure can backfire. IMHO avoid using force.

<RANT>
Apple is NOT the metric for e-books. Just say "i-won't" force the Apple way on others! There are still more reading devices in use from other brands than from the big(-headed) Apple.
</RANT>
Tags
hyphenation, regex, search & replace, word wrap