:smack:
Thank you Doitsu. Indeed... You know me too well by now... :D |
Here is a pile of 'code error' corrections I have accumulated over time. A few are mine, most are from generous people who have shared their efforts. Thanks to all of you.
Suggest you copy and paste into a new text file.

FIND / REPLACE text (use with tags)

For a string of letters and numbers
([^>]+)(.*?)
eg. <a name="Chapter_LIII" id="Chapter_LIII"></a>
<a([^>]+)(.*?)></a>
or <body id="0-a5e9337bbdff40f4b38c8f20e5723a9a" class="calibre">
Find id="0-a5e9337bbdff40f4b38c8f20e5723a9a"
With id=([^>]+)(.*?) class
some text like, id=, then ([^>]+)(.*?) and then something to end string of letters & numbers

Find number in <b> in Regex mode
<b>[0-9]+</b>

Find Roman Numerals lower or UPPER CASE
[xvi]+
[XVI]+
\>I[XVI]+
[1 space]
\[\s]

Find I, II, III
<p>[I]+</p>

Find Pg ### in Regex mode
(?DotAll)
[P][g] (\d+)
[P][g] [xvi]+

Find Page_394 in Regex mode
(?DotAll)
\Q"Page_\E(\d+)"

Find id="sigil_toc_id_3"
\Qid="sigil_toc_id_\E(\d+)"

[^\.] will match anything but .
eg [^\.>]</
[,;:] will match any punctuation except period
[^,;:], where ^ stands for NOT in the character set. |
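For anyone who wants to try a few of the patterns above outside the editor, here is a minimal sketch in Python. The sample markup is made up for illustration and is not taken from any real book. One assumption to flag: Python's `re` module does not support the `\Q...\E` literal-quoting syntax, so the sketch uses `re.escape()` for the literal part instead.

```python
import re

# Invented sample markup, only for trying the patterns out.
html = (
    '<a name="Chapter_LIII" id="Chapter_LIII"></a>'
    '<p><b>12</b></p><p>III</p>'
    '<span id="Page_394">Pg 394</span>'
    '<h2 id="sigil_toc_id_3">Title</h2>'
)

# Find a number wrapped in <b> tags.
bold_numbers = re.findall(r"<b>[0-9]+</b>", html)

# Find I, II, III alone in a paragraph.
roman_paragraphs = re.findall(r"<p>[I]+</p>", html)

# Find "Page_394": re.escape() stands in for \Q...\E here.
page_ids = re.findall(re.escape('"Page_') + r'(\d+)"', html)

# Find id="sigil_toc_id_3" and capture the number.
toc_ids = re.findall(re.escape('id="sigil_toc_id_') + r"(\d+)", html)

# Find "Pg ###" and capture the number.
pg_refs = re.findall(r"[P][g] (\d+)", html)

# Find an empty anchor and capture its attributes.
empty_anchors = re.findall(r"<a([^>]+)(.*?)></a>", html)

print(bold_numbers, roman_paragraphs, page_ids, toc_ids, pg_refs)
print(empty_anchors)
```

The captured groups can then be reused in a replacement string as `\1`, `\2` and so on, the same way as in the editor's Replace field.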
Can you please remove all but the first few regex examples/solutions? This thread is for regex examples/tips/issues only.
|
But can you then post the rest in its own thread? Looks quite useful as well. Thanks.
|
Sorry to have to ask, but where and how do I edit my post? You're right about my topic going 'off topic'.
|
I need some help... After OCR there are occasional punctuation errors like this in the text:
Wrong: Quote:
Quote:
I got this far: Code:
Find: |
@Skydancer:
The following should work for you:
Find: »([^»|‘]+)‘
Replace: ,\1‘ |
Thank you, @Doitsu! :bow2:
That got me off to a good start. I modified your regex just a tiny bit so now it works perfectly: Code:
»([^»,|‘]+)‘ |
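A rough sketch of the same find/replace in Python, in case it helps with testing outside the editor. The sample sentence is invented, since the original "wrong" and "right" examples did not survive in this thread, so treat it as an assumption about what the OCR error looks like: an opening » that should have been a low single quote, closed by ‘.

```python
import re

# The modified pattern from the post above: a » followed by text that
# contains no », comma, | or ‘, and is then closed by ‘.
pattern = re.compile(r"»([^»,|‘]+)‘")

def fix_inner_quotes(text: str) -> str:
    # Replace the stray » with a comma and keep the captured text.
    return pattern.sub(r",\1‘", text)

# Hypothetical OCR output, made up for illustration.
sample = "»Er rief »Hilfe‘ und lief weg.«"
fixed = fix_inner_quotes(sample)
print(fixed)
```

The outer » at the start of the sentence is left alone because the text after it runs into another » before reaching a ‘.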
I have a feeling the answer for this is going to seem simple once someone tells me but my brain is not working properly at the moment so... if anyone can help...
I want to find sets of em dashes with some text between them that are in the *same sentence*. So the text between the em dashes cannot include .!? but can include ,;: for example.
For example:
Match: Sanctuaire – là encore situé à Eyralice – abritait
Don't match (the . in this example could be ? or !): les Ténèbres – et de valoir la mort à qui le possédait. Or c’est dans ce livre à la fois oublié et maudit, que Jall devait lire que le Dernier Sanctuaire – là encore situé à Eyralice |
Something like this might work:
Find: –(\w*[^\.\?!]+?)–
Replace: —\1—
That would replace the en dashes with em dashes, and stick the captured "non-sentence" back in the middle. I didn't do thorough testing though, so it probably would break in a lot of edge cases, but it did work correctly on your examples. |
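Here is a small Python sketch of that find/replace, run against the two examples from the question. The function name `fix_dashes` is made up for illustration, and the dashes are written as Unicode escapes only so the en dash (U+2013) and em dash (U+2014) are easy to tell apart.

```python
import re

EN_DASH = "\u2013"
EM_DASH = "\u2014"

# En dash, then text with no sentence-ending punctuation, then en dash.
pattern = re.compile(EN_DASH + r"(\w*[^\.\?!]+?)" + EN_DASH)

def fix_dashes(text: str) -> str:
    # Swap the surrounding en dashes for em dashes, keep the captured text.
    return pattern.sub(EM_DASH + r"\1" + EM_DASH, text)

should_match = "Sanctuaire \u2013 là encore situé à Eyralice \u2013 abritait"
should_not_match = (
    "les Ténèbres \u2013 et de valoir la mort à qui le possédait. "
    "Or c’est dans ce livre à la fois oublié et maudit, que Jall devait "
    "lire que le Dernier Sanctuaire \u2013 là encore situé à Eyralice"
)

print(fix_dashes(should_match))
print(fix_dashes(should_not_match) == should_not_match)
```

In the second example the period between the two dashes stops the match, so the text comes back unchanged.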
Quote:
Quote:
Replace: – \1 –
Hopefully it works, and it will at least save you a lot of time. The rest can probably then be found with a simple:
Find: – <--- Put a space before or after the en dash |