09-15-2015, 02:52 PM | #16 |
Member
Posts: 14
Karma: 10
Join Date: Sep 2015
Device: none
|
Thanks to all of you, I finally did it!
Thom*'s formula works great: <p class="calibre2"><sup class="calibre3">(.*?)</sup> Replacing the group (.*?) with ([0-9]) for 1-9, ([0-9][0-9]) for 10-99, and ([0-9][0-9][0-9]) for 100-999 does the job. Wow, it looks nice! And it is quick. |
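The three digit-only groups above could probably be collapsed into one quantified group, ([0-9]{1,3}), so that a single pass covers 1-999. Below is a minimal sketch using Python's standard re module; it assumes the editor's regex engine treats this pattern the same way, and the sample markup and the helper name verse_numbers are made up for illustration.

```python
import re

# Search string from the thread, with the capture group narrowed to 1-3 digits.
PATTERN = re.compile(r'<p class="calibre2"><sup class="calibre3">([0-9]{1,3})</sup>')


def verse_numbers(html):
    """Return every 1-3 digit number captured by PATTERN, as strings."""
    return PATTERN.findall(html)


# Hypothetical sample: three numbered paragraphs and one non-numeric <sup>.
sample = (
    '<p class="calibre2"><sup class="calibre3">7</sup>text'
    '<p class="calibre2"><sup class="calibre3">42</sup>text'
    '<p class="calibre2"><sup class="calibre3">118</sup>text'
    '<p class="calibre2"><sup class="calibre3">note</sup>text'
)

print(verse_numbers(sample))
```

Unlike (.*?), the digit-only group ignores any <sup> that does not contain a number.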
09-16-2015, 12:40 PM | #17 |
The Fumbler
Posts: 66
Karma: 10
Join Date: Jun 2015
Device: android 4.2/fbreader
|
Another try
I was so inspired by davidfor's suggestion to use Regex-function mode that I delved into it and wrote my first function. It was a challenge that I could not pass up. I do believe that it works just as you requested, so I might as well share it with you.
So:
- Begin by selecting "Regex-Function" in the search/replace panel.
- Leave the "All text files" selected.
- Make sure there are no files listed in the "Function:" box so that when you hit "Create/edit" a new file will be created.
- Click "Create/edit" and name the new file what you like (I named it "FixIt").
- Copy the following code to the body of the new file replacing anything that is there.
Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    newid = ('<p id="id_' + file_name[-10:-5] + ("000" + match.group(1))[-3:] + '" class="calibre1"><sup class="calibre2">' + match.group(1) + '</sup>')
    return newid
- Click "OK" and you are ready to rock and roll.
- Use the search string that I provided you previously: <p class="calibre2"><sup class="calibre3">(.*?)</sup>
- Hit replace and you should be in business.
- It should do what you want to all files.
Please let me know if this works for you or what problems you encounter. That was great fun, thanks for the challenge. |
09-17-2015, 01:55 AM | #18 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
That is interesting; could you talk us through it, please?
I can intuit what most of the code is doing, but what is the significance of needing exactly 4 spaces in two locations? |
09-17-2015, 02:26 AM | #19 |
Grand Sorcerer
Posts: 24,907
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
|
It is Python code. Rather than using delimiters of some sort, Python uses the indent levels to indicate code blocks. The "def" line is defining a function. Any code lines in the function have to be indented under it. The indent is usually four spaces, but a single space or a tab should work, as long as every line in the block is indented the same way.
|
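To illustrate the point about indentation, here is a small sketch that compiles the same two-line function with different indents. The function f and the helper names run and indents_ok are invented for this example; they are not part of calibre.

```python
# The same function written with three different, but consistent, indents.
SOURCE_FOUR_SPACES = "def f(x):\n    y = x + 1\n    return y\n"
SOURCE_ONE_SPACE = "def f(x):\n y = x + 1\n return y\n"
SOURCE_TAB = "def f(x):\n\ty = x + 1\n\treturn y\n"

# Here the second body line is indented less than the first: Python rejects it.
SOURCE_INCONSISTENT = "def f(x):\n    y = x + 1\n  return y\n"


def run(source):
    """Define f from the given source text and call it with 1."""
    namespace = {}
    exec(source, namespace)
    return namespace["f"](1)


def indents_ok(source):
    """Return True if the source compiles without an indentation error."""
    try:
        compile(source, "<sketch>", "exec")
        return True
    except IndentationError:
        return False


for src in (SOURCE_FOUR_SPACES, SOURCE_ONE_SPACE, SOURCE_TAB):
    print(run(src))
print(indents_ok(SOURCE_INCONSISTENT))
```

So the "exactly 4 spaces" is a convention; what matters is that both lines of the function body share the same indent.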
09-17-2015, 03:07 AM | #20 | |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
Quote:
|
|
09-17-2015, 03:12 AM | #21 |
Ex-Helpdesk Junkie
Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
|
|
09-17-2015, 10:44 AM | #22 |
The Fumbler
Posts: 66
Karma: 10
Join Date: Jun 2015
Device: android 4.2/fbreader
|
With Documentation
Here I have documented the function in detail and I have included the debug (print) statements so you can see the results. I hope it helps.
Code:
# Begin with what I assume is kovidgoyal's Python function "replace", which gives access to file_name and the search data.
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Simply key in the desired text to begin the replace string.
    str1 = '<p id="id_'
    print(str1)
    # Take file_name and extract from the 10th character from the end (inclusive) to the 5th from the end (exclusive).
    str2 = file_name[-10:-5]
    print(str2)
    # Take the match, add zeros to the front and extract the last 3 characters.
    str3 = ("000" + match.group(1))[-3:]
    print(str3)
    # Key in the desired text for the middle of the replace string.
    str4 = '" class="calibre1"><sup class="calibre2">'
    print(str4)
    # Take the match (unchanged).
    str5 = match.group(1)
    print(str5)
    # Key in the desired text for the end of the replace string.
    str6 = '</sup>'
    print(str6)
    # Concatenate the strings.
    newid = str1 + str2 + str3 + str4 + str5 + str6
    print(newid)
    # Return the replace string.
    return newid
|
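For anyone who wants to try the function outside the editor, the sketch below drives it with Python's standard re module. This is only an approximation of what the editor does: the placeholder None arguments stand in for the objects calibre would supply, and the file name "text/v3001.html" and the helper run_on_text are assumptions made for the example.

```python
import re


def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Same logic as the function in the thread, without the debug prints.
    return ('<p id="id_' + file_name[-10:-5] + ("000" + match.group(1))[-3:]
            + '" class="calibre1"><sup class="calibre2">' + match.group(1) + '</sup>')


# Search string from the thread.
SEARCH = r'<p class="calibre2"><sup class="calibre3">(.*?)</sup>'


def run_on_text(text, file_name):
    """Apply replace() to every match, roughly as a replace-all would."""
    counter = 0

    def call(match):
        nonlocal counter
        counter += 1
        # None placeholders stand in for metadata, dictionaries, data, functions.
        return replace(match, counter, file_name, None, None, None, None)

    return re.sub(SEARCH, call, text)


# Hypothetical file name: the slice [-10:-5] picks out "v3001".
html = '<p class="calibre2"><sup class="calibre3">7</sup>Some text</p>'
print(run_on_text(html, "text/v3001.html"))
```

Note that the slice only yields the chapter code when the file name ends in a five-character code followed by a five-character extension such as ".html".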
09-18-2015, 04:00 AM | #23 | |
Member
Posts: 14
Karma: 10
Join Date: Sep 2015
Device: none
|
Quote:
Wow, it's amazing, Thom*. The function works and does ALL of the job. This is what I was looking for! |
|
09-18-2015, 04:41 AM | #24 |
Member
Posts: 14
Karma: 10
Join Date: Sep 2015
Device: none
|
Thom*, the function works great, it's amazing.
From file v3001 to v3011 it goes VERY well, counting from 1 to 24. From v3012 to v3017 it seems to skip number 1 and start from 2; nothing happens in the place where number 1 should be. Why is that?
From v3001 to v3011:
<p class="block_1" id="v3001001"><span class="block_2">1</span><span class="text_">
<p class="calibre2" id="v3001002"><sup class="calibre3">2</sup>
....
<p class="calibre2" id="v3001017"><sup class="calibre3">17</sup>
The file v3011:
<p class="block_1" id="v3011001"><span class="block_2">11</span><span class="text_">
<p class="calibre2" id="v3011002"><sup class="calibre3">2</sup>
<p class="calibre2" id="v3011003"><sup class="calibre3">3</sup>
....
<p class="calibre2" id="v3011036"><sup class="calibre3">36</sup>
From v3012 to v3017:
<p class="block_1"><span class="block_2">17</span><span class="text_">
<p id="v3017002" class="calibre1"><sup class="calibre2">2</sup>
<p id="v3017003" class="calibre1"><sup class="calibre2">3</sup>
.....
<p id="v3017016" class="calibre1"><sup class="calibre2">16</sup>
Every file begins with the chapter number (1-17), which has a different format. The strange thing is that for chapters 1 to 11 your function works great, and from chapter 12 on it skips ONLY the first paragraph and starts counting from paragraph 2. |
09-18-2015, 09:10 AM | #25 |
The Fumbler
Posts: 66
Karma: 10
Join Date: Jun 2015
Device: android 4.2/fbreader
|
Wow, this looks like a whole different scenario and I can't quite follow your question.
If the search and replace is skipping the first paragraph in some files, the search criteria are probably not met there. Test that by simply running the search (skip the replace) and see if the paragraph is found. It is common in ebooks for the first paragraph of a chapter to have a slightly different format. |
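The "run the search only" test can also be sketched in Python. The two sample lines are taken from the markup quoted earlier in the thread; the helper name is_found is made up, and Python's standard re module is assumed to behave like the editor's search for this pattern.

```python
import re

# Search string from the thread.
SEARCH = re.compile(r'<p class="calibre2"><sup class="calibre3">(.*?)</sup>')

# First paragraph of a chapter, as quoted for file v3017.
first_paragraph = '<p class="block_1"><span class="block_2">17</span><span class="text_">'
# A later paragraph in the format the search string targets.
later_paragraph = '<p class="calibre2"><sup class="calibre3">2</sup>'


def is_found(line):
    """Return True if the search string matches somewhere in the line."""
    return SEARCH.search(line) is not None


print(is_found(first_paragraph))
print(is_found(later_paragraph))
```

The first paragraph uses class "block_1" and a <span> instead of a <sup>, so the search never reaches it and the replace function is never called for it.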
09-18-2015, 12:36 PM | #26 | |
Member
Posts: 14
Karma: 10
Join Date: Sep 2015
Device: none
|
Quote:
In short: for chapters 1 to 11 the changes include the first paragraph. For chapters 12 to 17 the changes start from the second paragraph. But that's OK, I will do those manually, and if I find a way to do it automatically I will let you know. Thanks again for your help! |
|
09-22-2015, 05:21 AM | #27 |
Member
Posts: 14
Karma: 10
Join Date: Sep 2015
Device: none
|
It seems it was a problem with the format and the search criteria. To clean up the code exported from Word a little more, how can I do the following replacement?
This is my line:
<span class="text_chapter">2 <span class="text_1">This is my text I want to save.</span></span></p>
This is how I want it to be:
<span class="text_chapter"><span class="text_1">2</span></span>This is my text I want to save.</p>
In other words, is there a formula to capture a line of text as a group? And how can I delete all the spaces after the chapter number (2 here)?
Thanks again for the help.
P.S. I have a bigger challenge: is there a way to know when a paragraph number is missing? This is the format of a paragraph number: <sup class="calibre3">2</sup>
|
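One possible approach to both questions, sketched in Python with the standard re module. The patterns are guesses based only on the single sample line and the <sup> format quoted above, so they may need adjusting for the real files; the names move_text_out and missing_numbers, and the start parameter, are invented for this example.

```python
import re

# Chapter number, optional spaces, then the text wrapped in the inner span.
CHAPTER = re.compile(
    r'<span class="text_chapter">(\d+)\s*<span class="text_1">(.*?)</span></span>'
)

# Paragraph number in the format quoted in the post.
SUP = re.compile(r'<sup class="calibre3">(\d+)</sup>')


def move_text_out(line):
    """Keep the number inside both spans and move the text after them."""
    return CHAPTER.sub(
        r'<span class="text_chapter"><span class="text_1">\1</span></span>\2', line
    )


def missing_numbers(html, start=1):
    """List the numbers between start and the highest number found that never appear."""
    found = {int(n) for n in SUP.findall(html)}
    if not found:
        return []
    return [n for n in range(start, max(found) + 1) if n not in found]


line = ('<span class="text_chapter">2 <span class="text_1">'
        'This is my text I want to save.</span></span></p>')
print(move_text_out(line))

page = ('<sup class="calibre3">2</sup><sup class="calibre3">3</sup>'
        '<sup class="calibre3">5</sup>')
print(missing_numbers(page, start=2))
```

In the editor's plain Regex mode, the same search pattern with the replacement <span class="text_chapter"><span class="text_1">\1</span></span>\2 should have the same effect. The start=2 in the second call reflects the earlier observation that the first paragraph of each chapter carries the chapter number in a different format.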