MobileRead Forums > E-Book Software > Sigil

Old 12-12-2011, 12:31 PM   #1
el.motar
Junior Member
el.motar began at the beginning.
 
Posts: 6
Karma: 10
Join Date: Nov 2009
Location: Asturias
Device: Handlin
RegEx

Hi Bookworms.

I am having a problem with a regular expression and wonder if anyone could shed some light on it.

I am trying to replace the p tags with h3 tags around the second-level headers of a book. There are lots of them, I think around 198 in total, so it is a bit tedious by hand.

The headers are all upper-case, normally with one space between words. This is my search string, with minimal matching selected.

<p class="western1">([^a-z]+[^¿][A-Z]+[\s].*)</p>

This does select the headers, but I cannot figure out how to store the text contents of the headers.

I seem to end up with only a part of the header. I have tried including up to 5 round brackets "(" and using \1 to \5 to replace the contents, but no luck.

Would appreciate any help from the experts.

ATB

el.motar
Old 12-12-2011, 01:02 PM   #2
theducks
Grand Sorcerer
theducks ought to be getting tired of karma fortunes by now.
Posts: 14,432
Karma: 5560777
Join Date: Aug 2009
Location: The (original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
Can you paste a complete sample PARAGRAPH for us to see?

Does <p class="western1"> appear on any other line?
If so, what is the line just before the "header"? Is it unique in relation to the header?

As always, do not post more than a few lines of copyrighted material without the author's (included) permission.
Old 12-12-2011, 01:35 PM   #3
el.motar
Hi theducks,

Thanks for your reply

This is the actual paragraph.

<p class="western3">CECILIA VOLANGES A SOFIA CARNAY EN EL CONVENTO DE URSULINAS DE . . .</p>

I don't remember the replacement string, as I have tried quite a few.

Thanks.
Old 12-12-2011, 01:43 PM   #4
theducks
With Minimal matching ticked, search for:

<p class="western3">(.+)</p>

and replace with:

<h3 class="western3">\1</h3>

I used h3, but you can use any heading level.
class="western3" is optional. I usually want some type of class, as ADE does not center headers by default.

The above is only safe if western3 is used only on headers, OR if you step through the document manually with Find Next, then Replace or Find Next (to skip).
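
For anyone who wants to try this search and replace outside Sigil, here is a minimal sketch in Python. It is only an approximation: the lazy `(.+?)` stands in for the "Minimal matching" tick box, Python's `re` module is not the engine Sigil uses, and the second paragraph in the sample is made up for illustration.

```python
import re

# Header line from earlier in the thread, plus a hypothetical body paragraph.
html = (
    '<p class="western3">CECILIA VOLANGES A SOFIA CARNAY EN EL CONVENTO DE URSULINAS DE . . .</p>\n'
    '<p class="western1">Hypothetical body text.</p>'
)

# (.+?) is the lazy form of (.+); it plays the role of "Minimal matching".
pattern = re.compile(r'<p class="western3">(.+?)</p>')

# \1 inserts only what the first pair of round brackets captured.
converted = pattern.sub(r'<h3 class="western3">\1</h3>', html)
print(converted)
```

Only the western3 paragraph is rewritten; the paragraph with the other class is left alone, which is why the class name matters for safety.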
Old 12-12-2011, 02:05 PM   #5
el.motar
This is embarrassing, I must cut down on the wine before posting.

The reason I was having problems with this particular regex find and replace is that there are two kinds of header, plus ordinary paragraphs, all with the same tags, i.e. the header I posted and this type:
<p class="western3">Text text text text ....</p>

I wanted to mark the CARTA headings as h2 tags and the others as h3 tags.

The CARTA headings are easy, but once they are done, the only thing that distinguishes the remaining headers from the ordinary paragraphs using the same tags is that the headers are all capital letters, with a space between the words.

I was trying to select paragraphs that contain only capital letters. The regular expression I posted does the selection, but I could not work out the replacement string.

Thanks for your time.

ATB
el.motar
Old 12-12-2011, 02:31 PM   #6
theducks
\0 is the magic back-reference for the whole matched string.
Old 12-12-2011, 02:36 PM   #7
theducks
OK, there is more to it.
\0 puts the entire original match, tags included, inside the replacement,
so a second and third pass are needed to fix the result:
<h3 class="ct"><p class="western3">CECILIA VOLANGES A SOFIA CARNAY EN EL CONVENTO DE URSULINAS DE . . .</p></h3>

BTW, here is my search: <p class="western3">(([A-Z]+\s+){2,}.+)</p>
and my replace: <h3 class="ct">\0</h3>
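
The difference between the whole-match reference and the first group can be sketched in Python. This is an illustration under assumptions, not a description of Sigil's engine: Python writes the whole-match reference as `\g<0>` rather than `\0`, and the lazy `.+?` again stands in for "Minimal matching". Since the search already wraps the header text in an outer pair of round brackets, `\1` may be enough to avoid the extra passes, assuming it behaves as in post #4.

```python
import re

line = '<p class="western3">CECILIA VOLANGES A SOFIA CARNAY EN EL CONVENTO DE URSULINAS DE . . .</p>'

# Same search as above, with .+? standing in for "Minimal matching".
pattern = re.compile(r'<p class="western3">(([A-Z]+\s+){2,}.+?)</p>')

# Whole match (Python spelling of \0): the old <p> tags end up inside the <h3>.
nested = pattern.sub(r'<h3 class="ct">\g<0></h3>', line)

# Group 1 is the outer pair of round brackets: the header text without its tags.
clean = pattern.sub(r'<h3 class="ct">\1</h3>', line)

print(nested)
print(clean)
```

The first result reproduces the nested markup shown above; the second contains only the header text inside the h3 tags.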
Old 12-12-2011, 03:11 PM   #8
el.motar
Hi theducks,

This is great, it saved me a lot of time with my book.
It worked a treat, and it is good to know about the "\0".

Thank you

ATB
el.motar
Old 12-12-2011, 03:43 PM   #9
theducks
I went to the help page and re-read it after going over your problem (I had the same issue)
and went
Old 12-12-2011, 04:50 PM   #10
el.motar
Well, I got really stuck and could not think straight after an hour or so of struggling with the replacement string.
But looking over your search, I had it all wrong to start with.
I guess a bad search string leads to an even worse replacement string.
I am fairly new to regular expressions, so, as they say, it is hard at the beginning and then it gets more complicated.

BTW, could you tell me what the "{" bracket does?

Thanks

ATB
el.motar
Old 12-12-2011, 05:54 PM   #11
theducks
The braces mean the preceding pattern must repeat,
in this case 2 or more times (a comma with no ending value means no upper limit).

This way it does not match I or A by themselves.
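
A small Python sketch of the `{2,}` repeat count, using only the first half of the search from post #7. The sample strings other than the header words are invented for illustration, and Python's `re` is used here as a stand-in for Sigil's engine.

```python
import re

# Two or more runs of capital letters, each followed by whitespace.
caps_run = re.compile(r'([A-Z]+\s+){2,}')

samples = [
    "CECILIA VOLANGES A SOFIA",  # several all-caps words in a row
    "A text paragraph",          # a lone capital "A", then lower case
    "I went home",               # a lone capital "I", then lower case
    "Text text text text",       # ordinary sentence-case paragraph
]

# match() anchors at the start of the string, like the text right after <p ...>.
results = {s: bool(caps_run.match(s)) for s in samples}

for text, matched in results.items():
    print(matched, text)
```

Only the all-caps sample matches, because a single capital word followed by a space satisfies the group once but not twice.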