Old 12-16-2011, 01:19 AM   #1
David Kudler
Enthusiast
Posts: 48
Karma: 10000
Join Date: Apr 2011
Device: iPad
Regular expressions - Single back-reference, multiple instances?

I have a huge scanned file (500,000+ words) I'm converting into an ebook. I've got the text more or less converted. I'm now dealing with nearly a thousand footnotes. I don't want to have to hand-code all of the buggers so that they link from the reference to the note and back. I've got all of the note references with code ready to go--they all currently read
Code:
<sup><a href="" name"">NUMBER</a></sup>
(where NUMBER is, obviously, an integer -- the note number).

What I want to do is use a regular expression to help fill in the blanks, so that it looks like this:

Code:
<sup><a href="../Text/Foreword.xhtml#noteNUMBER" name="refNUMBER">NUMBER</a></sup>
Here's the search I tried:

Code:
<sup><a href="" name="">(.*)</a></sup>
And here's my attempted replacement:

Code:
<sup><a href="../Text/Foreword.xhtml#note\1" name="ref\1">\1</a></sup>
Unfortunately, when I do get it to work at all, it only finds the first footnote, and munges the code:

Code:
<sup><a href="../Text/Foreword.xhtml#note1"</a></sup>
Any suggestions? Help?

ETA: Okay. Forgot the ? to make the asterisk lazy:

Code:
<sup><a href="" name="">(.*?)</a></sup>
But now the search just crashes the application.

Last edited by David Kudler; 12-16-2011 at 02:26 AM.
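
The difference between the greedy (.*) and the lazy (.*?) in the two searches above can be sketched outside Sigil. This is a minimal illustration using Python's re module, which is an assumption: it is not Sigil's regex engine, and the sample line is made up for the example.

```python
import re

# Two empty footnote references on one line, in the form the thread describes.
# The surrounding text is invented for illustration.
line = (
    '<sup><a href="" name="">1</a></sup> text '
    '<sup><a href="" name="">2</a></sup>'
)

greedy = re.compile(r'<sup><a href="" name="">(.*)</a></sup>')
lazy = re.compile(r'<sup><a href="" name="">(.*?)</a></sup>')

# Greedy: a single match whose group runs on to the LAST </a></sup> on the line.
greedy_groups = greedy.findall(line)

# Lazy: one match per reference, each group holding only the note number.
lazy_groups = lazy.findall(line)

print(greedy_groups)
print(lazy_groups)
```

With the greedy form, the single captured group contains markup from both references, which would explain a replacement that finds only "the first footnote" and mangles the code around it.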
Old 12-16-2011, 02:36 AM   #2
Toxaris
Wizard
Posts: 4,520
Karma: 121692313
Join Date: Oct 2009
Location: Heemskerk, NL
Device: PRS-T1, Kobo Touch, Kobo Aura
Your code works fine in my small test case.

Try the following. Smarter ways will probably be posted below my answer, but this should work:

find:
<sup><a href="" name="">([0-9]{1,})</a></sup>

replace:
<sup><a href="../Text/Foreword.xhtml#note\1" name="ref\1">\1</a></sup>

I tested it in Sigil 0.4.9.02 and it works there. However, I must point out that your initial code is missing an equals sign in the name attribute.
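
For anyone who wants to try the find/replace pair outside Sigil, here is a rough sketch in Python. It assumes Python's re module (not Sigil's own engine) and a made-up sample string; the find and replace strings are the ones given above. It shows one captured group being reused three times in the replacement.

```python
import re

# Find and replace strings as posted in the thread.
FIND = r'<sup><a href="" name="">([0-9]{1,})</a></sup>'
# Periods in the replacement are literal here; only the find pattern
# treats '.' as a metacharacter.
REPLACE = r'<sup><a href="../Text/Foreword.xhtml#note\1" name="ref\1">\1</a></sup>'

# Hypothetical sample text with two unfilled note references.
sample = (
    'First claim.<sup><a href="" name="">7</a></sup> '
    'Second claim.<sup><a href="" name="">123</a></sup>'
)

# re.sub replaces every match, and each \1 expands to that match's number.
result = re.sub(FIND, REPLACE, sample)
print(result)
```

Because the group only accepts digits, it cannot run past the closing tag the way (.*) can, so every reference is matched separately.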
Old 12-16-2011, 04:17 AM   #3
Jellby
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Do not use "name"; it's not supported in XHTML. Use "id" instead.
Old 12-16-2011, 01:06 PM   #4
Toxaris
Of course, missed that one.
Old 12-16-2011, 02:28 PM   #5
David Kudler
Thanks both!

I also forgot to escape out the periods in the file address, which may have been causing the crashing. Catastrophic backtracking?

And it took me a while to figure out that I needed to edit the search term to (\d{3}) instead of just {1}. Too many darn notes. ;-)
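
A note on the quantifiers mentioned here: \d{3} matches exactly three digits, while {1,} means one or more. A small sketch, again assuming Python's re module rather than Sigil and using invented note numbers, shows which references each form picks up.

```python
import re

# Hypothetical note numbers of different lengths.
numbers = ["5", "42", "317", "1004"]
refs = [f'<sup><a href="" name="">{n}</a></sup>' for n in numbers]

exactly_three = re.compile(r'name="">(\d{3})</a>')
one_or_more = re.compile(r'name="">(\d{1,})</a>')


def matched(pattern, items):
    """Return the captured number for every item the pattern matches."""
    return [m.group(1) for m in map(pattern.search, items) if m]


three_hits = matched(exactly_three, refs)
any_hits = matched(one_or_more, refs)

print(three_hits)
print(any_hits)
```

So a search with \d{3} only touches the three-digit notes; catching the one- and two-digit notes needs a separate pass or the open-ended {1,} form.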
Old 12-16-2011, 02:41 PM   #6
David Kudler
So... What seems to be causing the crashing is an inclusion of angle brackets: > or <. Which is kind of a pain.

Sigil has choked and crashed on a number of variations on this search:

Code:
(\d{3})</a>
Am I missing something?
Old 12-16-2011, 02:57 PM   #7
Serpentine
Evangelist
Posts: 416
Karma: 1045911
Join Date: Sep 2011
Location: Cape Town, South Africa
Device: Kindle 3
Quote:
Originally Posted by David Kudler
So... What seems to be causing the crashing is an inclusion of angle brackets: > or <. Which is kind of a pain.
If you're using the beta, this has been fixed.

Quote:
Originally Posted by David Kudler
Code:
(\d{3})</a>
Am I missing something?
That works fine here. It might be related to the previous bug, however, if you're using the beta.
Old 12-16-2011, 03:59 PM   #8
David Kudler
Hmm. I'm using 4.902, which I didn't think was still in beta.
Old 12-16-2011, 05:37 PM   #9
David Kudler
Yup. Switched back to 4.2 and all was well. Eesh.
Old 12-16-2011, 05:46 PM   #10
Toxaris
Strange, because I actually tested my string on the beta release.
Old 12-17-2011, 12:05 PM   #11
David Kudler
FWIW, I was running 4.902 on a MacBook Pro 2.16GHz (4GB) under OS X 10.7.2.