Old 09-28-2012, 02:16 AM   #1
cybmole
Wizard
Posts: 2,857
Karma: 1163098
Join Date: Sep 2010
Device: Kobo aura HD, Kobo Arc, Kindle Fire HDX 8.9 , Kindle for PC
help with hard regex please

This is a regex question, not really a Sigil one, but since this is where the helpful regex experts are, I'm hoping one will step in and advise.

I want to clean up some .xml files and am thinking regex in Notepad++ could do the job if I can figure out the expression.

I want to zap all chunks that look like the example below, where the key search term is X-Fi.

For any block with X-Fi in its second line (the <Name> line), I want to delete the whole block. Ambitious, I know, but can that be done?

<RemoteButton>
<Name>X-Fi 24-bit Wheel Button</Name>
<MidiSignal>0A 41 44 09 76</MidiSignal>
<USBSignal>02 C1 44 89 76</USBSignal>
<ButtonType>btKeyboardEvent</ButtonType>
<KeyCode>173</KeyCode>
</RemoteButton>

Logically, I have to find the start phrase <RemoteButton>\s*<Name>X-Fi, look forward to locate the matching </Remote... and delete everything found in between. I also have to cope with intermediate line feeds, white space and / characters.

Hmm, I seem to be solving my own question as I type...

Will a simple (.*) suffice, i.e. find
<RemoteButton>\s*<Name>X-Fi(.*)</RemoteButton>


Well, I can load the XML into Sigil and the above almost works (it leaves some mysterious de> entries), but I can't see how to then save as XML (Sigil v4).

I tried the same expression in Notepad++ and it does NOT work. Either Notepad++ cannot do multi-line matching or it uses different regex syntax?
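
One possible explanation, sketched in Python rather than in either editor: in many regex engines "." does not match a line break unless a "dot matches newline" option is switched on. Python's re module behaves that way by default; whether a particular editor does is an assumption to check against its own documentation. The sample XML is shortened from the block above.

```python
import re

xml = """<RemoteButton>
<Name>X-Fi 24-bit Wheel Button</Name>
<KeyCode>173</KeyCode>
</RemoteButton>
"""

pattern = r"<RemoteButton>\s*<Name>X-Fi(.*)</RemoteButton>"

# Default mode: "." stops at each line break, so (.*) can never reach
# the closing tag on a later line and the search finds nothing.
single_line = re.search(pattern, xml)

# re.DOTALL lets "." cross line breaks, so the whole block is matched.
multi_line = re.search(pattern, xml, flags=re.DOTALL)

print("match without DOTALL:", single_line is not None)
print("match with DOTALL:", multi_line is not None)
```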

Last edited by cybmole; 09-28-2012 at 02:33 AM.
Old 09-28-2012, 03:55 AM   #2
Jellby
frumious Bandersnatch
Posts: 6,195
Karma: 4800739
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Replace <RemoteButton> and </RemoteButton> with two unused characters, like ¬ and |.

Search for ¬\s*<Name>X-Fi[^|]*\| (with multi-line matching; the final | is escaped so it is read as a literal character rather than as alternation) and replace it with nothing.

Replace ¬ and | back with <RemoteButton> and </RemoteButton>.

If Notepad++ does not have multi-line matching, find another editor that does; it's an essential feature. (In Sigil, you can copy-paste the result instead of saving.)
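
The three steps can be tried outside an editor as well. This is only a rough Python sketch, assuming that ¬ and | appear nowhere else in the file; the function name drop_xfi_blocks and the sample data are made up for illustration.

```python
import re

START, END = "¬", "|"  # marker characters, assumed absent from the file

def drop_xfi_blocks(xml: str) -> str:
    # Step 1: collapse the multi-character tags into single marker characters.
    text = xml.replace("<RemoteButton>", START).replace("</RemoteButton>", END)
    # Step 2: [^|]* cannot run past the end marker, so each match stays
    # inside one block. Outside a character class the | must be escaped.
    text = re.sub(r"¬\s*<Name>X-Fi[^|]*\|", "", text)
    # Step 3: put the original tags back.
    return text.replace(START, "<RemoteButton>").replace(END, "</RemoteButton>")

sample = (
    "<RemoteButton>\n<Name>X-Fi Wheel</Name>\n</RemoteButton>\n"
    "<RemoteButton>\n<Name>foo</Name>\n</RemoteButton>\n"
)
result = drop_xfi_blocks(sample)
print(result)
```

A negated character class such as [^|] matches line breaks too, so no "dot matches newline" option is needed for this variant.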
Old 09-28-2012, 04:08 AM   #3
cybmole
Wizard
In Sigil, it worked 100% when I changed the search expression from
<RemoteButton>\s*<Name>X-Fi(.*)</RemoteButton>
to
<RemoteButton>\s*<Name>X-Fi(.*)/RemoteButton>

Previously it worked with Replace once but not with Replace All: for some obscure reason, each instance left behind a trailing de> from KeyCode>.

I googled Notepad++: it does not have multi-line regex, and I'm not sure that any free editor does, except for Sigil!

Open, edit, then copy-paste from Code View into Notepad++ and save may be the way to go.
Old 09-28-2012, 04:17 AM   #4
Jellby
frumious Bandersnatch
Be careful with greedy matching if you have something like:

Code:
<RemoteButton>
<Name>X-Fi 24-bit Wheel Button</Name>
<MidiSignal>0A 41 44 09 76</MidiSignal>
<USBSignal>02 C1 44 89 76</USBSignal>
<ButtonType>btKeyboardEvent</ButtonType>
<KeyCode>173</KeyCode>
</RemoteButton>

<RemoteButton>
<Name>foo 24-bit Wheel Button</Name>
<MidiSignal>0A 41 44 09 76</MidiSignal>
<USBSignal>02 C1 44 89 76</USBSignal>
<ButtonType>btKeyboardEvent</ButtonType>
<KeyCode>173</KeyCode>
</RemoteButton>
(no X-Fi in the second button)
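
To make the risk concrete, here is a small Python sketch (Python's re module stands in for the editor's engine, which is an assumption; the XML is shortened). With a greedy (.*) and "dot matches newline" on, the match runs from the first X-Fi block to the last </RemoteButton> in the file, so the foo button is deleted as well.

```python
import re

xml = """<RemoteButton>
<Name>X-Fi 24-bit Wheel Button</Name>
<KeyCode>173</KeyCode>
</RemoteButton>

<RemoteButton>
<Name>foo 24-bit Wheel Button</Name>
<KeyCode>173</KeyCode>
</RemoteButton>
"""

greedy = r"<RemoteButton>\s*<Name>X-Fi(.*)</RemoteButton>"

# Greedy .* grabs as much as possible, then backs off only as far as the
# LAST closing tag, swallowing the non-X-Fi block on the way.
result = re.sub(greedy, "", xml, flags=re.DOTALL)

print("foo button survived:", "foo" in result)
```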
Old 09-28-2012, 04:26 AM   #5
cybmole
Wizard
I will have that construction: there will typically be a bunch of X-Fi button definitions, then some other button definitions.

I have not really mastered greedy management, which is why I was concerned about using (.*).

Is there a way to write STARTphrase(.*)ENDphrase type searches that makes them non-greedy?

This will be useful info for books also, for removing redundant SPAN structures.
Old 09-28-2012, 07:07 AM   #6
Jellby
frumious Bandersnatch
I believe you can use (.*?) instead to make it ungreedy. (Or use [^|]* after the replacement, as per my suggestion above, which is essentially the same.)

With <span> it may be more complicated, since they can be nested to no end (I assume <RemoteButton>s would not be nested).
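
A quick way to check the ungreedy form is a small Python sketch like the one below, assuming un-nested <RemoteButton> blocks and using Python's re module in place of the editor's engine. The sample has two X-Fi buttons followed by one other button; only the X-Fi ones should go.

```python
import re

xml = """<RemoteButton>
<Name>X-Fi 24-bit Wheel Button</Name>
<KeyCode>173</KeyCode>
</RemoteButton>
<RemoteButton>
<Name>X-Fi Volume Up</Name>
<KeyCode>175</KeyCode>
</RemoteButton>
<RemoteButton>
<Name>foo 24-bit Wheel Button</Name>
<KeyCode>173</KeyCode>
</RemoteButton>
"""

lazy = r"<RemoteButton>\s*<Name>X-Fi(.*?)</RemoteButton>"

# (.*?) expands one character at a time and stops at the FIRST closing
# tag it can reach, so each match covers exactly one X-Fi block.
result = re.sub(lazy, "", xml, flags=re.DOTALL)

print(result)
```

Because \s* only allows white space between <RemoteButton> and <Name>X-Fi, a match cannot begin at a block whose name is something else.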
Old 09-29-2012, 08:47 PM   #7
PeterT
Taking a break; Fed up
Posts: 6,981
Karma: 44005669
Join Date: Nov 2007
Location: Toronto
Device: Wife: Touch, Arc, Vox Me: Nexus 7, Glo
I just went looking and came across http://www.editpadlite.com/ which claims
Quote:
EditPad Lite is a compact general-purpose text editor. Use EditPad Lite to easily edit any kind of plain text file. EditPad Lite has all the essential features to make text editing a breeze:
  • Large file and long line support.
  • Full Unicode support, including complex scripts and right-to-left scripts.
  • Direct editing of text files using Windows, UNIX, and Mac text encodings (code pages) and line breaks.
  • Tabbed interface for working with many files.
  • Unlimited undo and redo for all open files, even after saving.
  • Automatic backup and working copies prevent data loss.
  • Powerful search-and-replace with literal search terms and regular expressions that can span multiple lines.
Old 09-30-2012, 04:03 AM   #8
Jellby
frumious Bandersnatch
Vim supports multiline regex too, and much more.
Old 09-30-2012, 06:33 AM   #9
DiapDealer
Grand Sorcerer
Posts: 9,269
Karma: 42123822
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by PeterT View Post
I just went looking and came across http://www.editpadlite.com/ which claims
EditPad Lite is a great regex-capable editor. I use it all the time.

There are some subtle differences between its JGSoft regex engine and Sigil's PCRE engine, but they rarely come up. JGSoft doesn't support \K, and its commands for changing the case of text are a little different.