Old 12-15-2016, 01:49 AM   #511
Nando Sandiego
Junior Member
Nando Sandiego began at the beginning.
 
Posts: 9
Karma: 10
Join Date: Oct 2016
Device: Hypen app for iOS
css option

Quote:
Originally Posted by Psymon View Post
Hey, folks -- I am trying to learn/do this regex stuff on my own (however slowly)! I'm stumped on something that I would think should be fairly easy, though.

In my book, I've got almost 300 paragraphs that start off with a dropcap, with this being an example of how those paragraphs begin...

Code:
<span class="initial">H</span>onourable
What I want to do is make that first word in smallcaps, and so the code in this latter example would then be...

Code:
<span class="initial">H</span><span class="smallcaps">ONOURABLE</span>
So basically what I want to do is convert the case of that first word to uppercase and then wrap that smallcaps span around the relevant part of the word.

For my regex search I initially came up with this...

<span class=\"initial\">(.+?)</span>([^>]*)\s

...and for replace this...

<span class="initial">\1</span><span class="smallcaps">\U\2\E</span>

...(and in this latter there's an invisible space there that I suppose you won't "see" in this post -- but it would be there in my S&R, of course).

For the life of me, though, that \s won't stop at the first space, that is, after the first word -- it selects the entire paragraph up to the last space in the paragraph! -- and it's also possible that there might actually be not a space, but a comma (or other punctuation) instead, and I'd like that closing span (for my smallcaps) to come before that.

I've searched around the 'net trying to find the solution to this, but just can't seem to find it -- every "answer" that I find on other sites and try just doesn't seem to work.

Thanks in advance, if anyone can help!

(PS. I'm not sure if my "replace" code is correct either, actually -- although I never got that far with figuring this out!)
I was formatting documents much the way you were, but used:
<span class=\"initial\">(.*?)</span>(.*?)\s
and did chapters one at a time.
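The difference between the greedy `[^>]*` in the question and the lazy `(.*?)` above can be checked outside Sigil. The following is only a sketch in Python, not Sigil's own engine: the sample paragraph and the helper name `wrap_smallcaps` are made up for illustration, and because Python's `re` module has no `\U...\E` in replacement strings, the uppercasing is done in a function instead. The `(\w+)` variant is one possible way to stop before a comma or other punctuation, as the question asks.

```python
import re

# Hypothetical sample paragraph modelled on the example in the quoted post.
html = ('<p><span class="initial">H</span>onourable gentlemen, '
        '<i>welcome</i> to the house.</p>')

# Pattern from the question: [^>]* is greedy, so it runs up to the next ">"
# and only backtracks as far as the LAST whitespace before it.
greedy = re.compile(r'<span class="initial">(.+?)</span>([^>]*)\s')

# Pattern from the reply: (.*?) is lazy, so it stops at the FIRST whitespace.
lazy = re.compile(r'<span class="initial">(.*?)</span>(.*?)\s')

# Assumed variant: \w+ matches word characters only, so the capture ends
# before a space, comma, or other punctuation.
first_word = re.compile(r'<span class="initial">(.+?)</span>(\w+)')


def wrap_smallcaps(match):
    """Rebuild the drop cap and wrap the uppercased rest of the word."""
    return ('<span class="initial">' + match.group(1) + '</span>'
            '<span class="smallcaps">' + match.group(2).upper() + '</span>')


greedy_capture = greedy.search(html).group(2)   # 'onourable gentlemen,'
lazy_capture = lazy.search(html).group(2)       # 'onourable'
result = first_word.sub(wrap_smallcaps, html, count=1)
print(result)
```

Note that `\w+` would split a first word containing an apostrophe or hyphen, so such paragraphs would still need a manual check.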

I have since discovered a CSS trick that lets me assign a paragraph class that makes the first letter a drop cap while automatically accounting for opening quotes, and for paragraphs that start with a span (<p class="first"><span class="italic">), <i>, <em>, or seemingly anything else, until the first letter is found.

The only case I have found where the first letter is not drop-capped is when a dash, en dash, or em dash is the first character.


I also set the entire first line of text in small caps dynamically, meaning the first line is in small caps no matter how many words it holds. The result looks much like a printed book.

As a result I do much less regex editing than I was doing before.

The css is:

.parafirst { /* your format options for the first paragraph */ }
.parafirst:first-letter { /* your format options for the drop cap */ }
.parafirst:first-line { font-variant: small-caps; } /* could also be any format you like, such as bold or all uppercase */

I found the above solution while looking for a way for CSS to handle the first word of a paragraph, but that does not seem to exist. The only options are first letter and first line.
Old 02-05-2017, 08:09 AM   #513
GalacticHull
Member
GalacticHull began at the beginning.
 
Posts: 20
Karma: 10
Join Date: Nov 2014
Device: kobo h2O
Hi,

I'm looking to remove, in Sigil: name="54614"

The numbers change, but there are always 6 [edit: 7 in one xhtml file]. I can't determine how to use regular expressions in Sigil to remove every name="######"

Anyone know a specific expression?

Thank you.

Last edited by GalacticHull; 02-05-2017 at 08:12 AM. Reason: additional information
Old 02-05-2017, 08:24 AM   #514
Doitsu
Wizard
Doitsu ought to be getting tired of karma fortunes by now.
 
Posts: 3,852
Karma: 10339686
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by GalacticHull View Post
I'm looking to remove, in Sigil: name="54614"
You could use:

Find:
Code:
name="(\d+)"
The numbers that were found can be referenced with \1.

Last edited by Doitsu; 02-05-2017 at 08:28 AM.
Old 02-05-2017, 08:52 AM   #515
GalacticHull
Member
GalacticHull began at the beginning.
 
Posts: 20
Karma: 10
Join Date: Nov 2014
Device: kobo h2O
Quote:
Originally Posted by Doitsu View Post
You could use:

Find:
Code:
name="(\d+)"
The numbers that were found can be referenced with \1.
When I run that, nothing is found. I'm not too familiar with regular expressions, however. Does it need to be prefaced with something like (?s) or regex ... ?

I appreciate the help. I've rarely encountered the need for regular expressions and so you're dealing with a total idiot.
Old 02-05-2017, 09:16 AM   #516
Doitsu
Wizard
Doitsu ought to be getting tired of karma fortunes by now.
 
Posts: 3,852
Karma: 10339686
Join Date: Dec 2010
Device: Kindle PW2
Quote:
Originally Posted by GalacticHull View Post
When I run that, nothing is found. I'm not too familiar with regular expressions, however. Does it need to be prefaced with something like (?s) or regex ... ?
That was my mistake; I put the parens in the wrong place. It should read:
Code:
name="(\d+)"
This should work.
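For anyone who wants to test the expression outside Sigil, here is a minimal sketch in Python. The sample markup is invented, and the leading `\s+` is an addition that is not part of the pattern above: it removes the space in front of the attribute along with it and avoids touching attributes whose names merely end in "name". In Sigil itself the equivalent would be to search in Regex mode and leave the Replace field empty.

```python
import re

# Invented sample: anchors carrying both an id and a numeric name attribute.
xhtml = ('<p><a id="p12" name="54614"></a>Text '
         '<a id="p13" name="1234567"></a>more</p>')

# \s+ takes the preceding whitespace with it; \d+ accepts any number of digits,
# so 5-, 6- and 7-digit values are all covered.
name_attr = re.compile(r'\s+name="\d+"')

cleaned = name_attr.sub('', xhtml)
print(cleaned)  # <p><a id="p12"></a>Text <a id="p13"></a>more</p>
```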
Old 02-05-2017, 09:28 AM   #517
GalacticHull
Member
GalacticHull began at the beginning.
 
Posts: 20
Karma: 10
Join Date: Nov 2014
Device: kobo h2O
Know what? Whatever you did or didn't do, I'm a fool. This was my second attempt, and after about a dozen I posted here. The fact is, I didn't recognize that Sigil had a specific option to run regex, so I just kept running it in normal search!

And, if not for your help making me confident of that regex, I probably never would have noticed. So thank you.
Old 02-05-2017, 09:28 AM   #518
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
 
Posts: 16,695
Karma: 86991370
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
I'd consider changing that "name=" attribute to "id=" instead of just removing it. It's obsolete, but there still may be operating links that refer to it in their hrefs.
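A rename of this kind can reuse the captured digits through `\1`. The sketch below is an illustration in Python with invented sample markup. It assumes the element has no id attribute yet, since adding a second id would make the file ill-formed, and a purely numeric id may not validate in every EPUB version, so the result is worth checking with epubcheck.

```python
import re

# Invented sample: an old-style named anchor and a link that points at it.
xhtml = '<p><a name="54614"></a>See <a href="#54614">the note</a>.</p>'

# (?<=\s) requires whitespace before "name" without consuming it;
# \1 re-inserts the digits captured by (\d+).
renamed = re.sub(r'(?<=\s)name="(\d+)"', r'id="\1"', xhtml)
print(renamed)  # <p><a id="54614"></a>See <a href="#54614">the note</a>.</p>
```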
Old 02-05-2017, 09:42 AM   #519
GalacticHull
Member
GalacticHull began at the beginning.
 
Posts: 20
Karma: 10
Join Date: Nov 2014
Device: kobo h2O
Quote:
Originally Posted by DiapDealer View Post
I'd consider changing that "name=" attribute to "id=" instead of just removing it. It's obsolete, but there still may be operating links that refer to it in their hrefs.
Thank you for the suggestion, but the ID attribute was there in addition to the name attribute, and nothing referred to the name. I will of course keep that in mind in the future. Now, running epubcheck 4.0.2 (https://github.com/IDPF/epubcheck/releases), it seems I have a perfectly formed epub.
Old 02-05-2017, 09:46 AM   #520
DiapDealer
Grand Sorcerer
DiapDealer ought to be getting tired of karma fortunes by now.
 
Posts: 16,695
Karma: 86991370
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Sounds good. Just wanted to be sure that internal links had been considered.
Old 07-18-2017, 11:16 AM   #521
roger64
Wizard
roger64 ought to be getting tired of karma fortunes by now.
 
Posts: 2,131
Karma: 2150557
Join Date: Jan 2009
Device: Kobo Glo - Kindle PW3 (wifi)
Hi

To avoid a disgraceful line break between the name of a ruler and his number, the French insert a no-break space between them (here represented by _). Thus we find Charles_XII, Henri_II. This rule applies even for Louis_XVI...

I'd like to write a regex to automatically add the missing no-break spaces. We have two parts: a name beginning with a capital letter and a number written in Roman numerals.

The following regex finds all of them
Code:
([A-Z])([a-z]+)\s(I|V|X)+
I dropped the L because a revolution should take place long before number fifty.

However, this regex is a little too greedy because it also matches Hans Viktor, Si Votre..., Pour Vienne...,

So I'd like to be sure it should not work if the Roman numerals are followed by lower case letters. Could some kind helping hand improve this regex?

Last edited by roger64; 07-18-2017 at 11:22 AM.
Old 07-18-2017, 01:11 PM   #522
KevinH
Wizard
KevinH ought to be getting tired of karma fortunes by now.
 
Posts: 2,523
Karma: 772404
Join Date: Nov 2009
Device: many
So after the Roman numerals all you want to allow is whitespace or punctuation (.,), is that right? If so, couldn't you add them as [\s.,]+ at the end? You might want to include single and double quotes as well, and even an exclamation point, for cases such as "Louis XVI!" or "Louis XVI."

If you really want anything other than a lowercase letter you could add [^a-z] or something along those lines to negate the set.
Old 07-18-2017, 03:42 PM   #523
roger64
Wizard
roger64 ought to be getting tired of karma fortunes by now.
 
Posts: 2,131
Karma: 2150557
Join Date: Jan 2009
Device: Kobo Glo - Kindle PW3 (wifi)
@Kevin

Thanks for your help.

After some trials, it seems I get good results this way:

Find:
Code:
([A-Z])([a-zé]+)\s([IVX]+)([\s.,?!]+|<sup>er</sup>)
Replace:
Code:
\1\2\u00a0\3\4
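The search above can be tried in Python as a rough check. This is a sketch, not Sigil's engine: the sample sentence and the function name `add_nbsp` are invented, and the numeral class is written as `[IVX]` because inside a character class a `|` is a literal pipe rather than an alternation. Python's `re` rejects `\u00a0` in a replacement template, so the no-break space is inserted through a function.

```python
import re

NBSP = '\u00a0'  # no-break space

# Name, whitespace, Roman numeral, then whitespace/punctuation or <sup>er</sup>.
ruler = re.compile(r'([A-Z])([a-zé]+)\s([IVX]+)([\s.,?!]+|<sup>er</sup>)')


def add_nbsp(text):
    """Replace the space between a ruler's name and numeral with a no-break space."""
    return ruler.sub(
        lambda m: m.group(1) + m.group(2) + NBSP + m.group(3) + m.group(4),
        text)


sample = 'Charles XII, Henri II et Napoléon I<sup>er</sup> ont lu Hans Viktor.'
fixed = add_nbsp(sample)
print(fixed)
```

"Hans Viktor" is left alone because the V is followed by a lowercase letter, which neither alternative of the last group accepts.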

Last edited by roger64; 07-18-2017 at 03:54 PM. Reason: Napoléon Ier
Old 07-19-2017, 06:17 AM   #524
davidfor
Grand Sorcerer
davidfor ought to be getting tired of karma fortunes by now.
 
Posts: 12,848
Karma: 20165848
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo: Touch, Glo, Aura H2O, Glo HD, Aura ONE
A small suggestion. Rather than listing the possible punctuation, use "\b". That is a word boundary, so it will match everywhere your list does, plus before any other punctuation that might end a word.
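A sketch of the `\b` variant in Python, under the same caveats as any test outside Sigil: the sample text and the name `add_nbsp_boundary` are invented for illustration. Since `\b` is zero-width, nothing after the numeral is consumed, so the replacement only needs two groups.

```python
import re

NBSP = '\u00a0'  # no-break space

# \b succeeds where a word character meets a non-word character (or the end),
# so "XVI!" and "I<sup>" qualify but the "V" in "Viktor" does not.
ruler_b = re.compile(r'([A-Z][a-zé]+)\s([IVX]+)\b')


def add_nbsp_boundary(text):
    """Join name and Roman numeral with a no-break space, using \\b as the stop."""
    return ruler_b.sub(r'\1' + NBSP + r'\2', text)


sample = 'Louis XVI! Henri II; Napoléon I<sup>er</sup> et Hans Viktor'
fixed_b = add_nbsp_boundary(sample)
print(fixed_b)
```

One limit applies to both versions: a standalone capital I, V or X after a capitalised word still matches, so a quick review of the hits before replacing is sensible.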
Old 07-23-2017, 02:41 AM   #525
roger64
Wizard
roger64 ought to be getting tired of karma fortunes by now.
 
Posts: 2,131
Karma: 2150557
Join Date: Jan 2009
Device: Kobo Glo - Kindle PW3 (wifi)
Quote:
Originally Posted by davidfor View Post
A small suggestion. Rather than listing the possible punctuation, use "\b". That is a word boundary, so it will match everywhere your list does, plus before any other punctuation that might end a word.
Thanks for the tip.
All times are GMT -4.