Old 09-15-2015, 02:52 PM   #16
geniale
Member
geniale began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Sep 2015
Device: none
Thanks to all of you, I finally did it!!!
Thom*'s formula is great: <p class="calibre2"><sup class="calibre3">(.*?)</sup>

Replacing the group (.*?) with ([0-9]) for 1-9, ([0-9][0-9]) for 10-99, and ([0-9][0-9][0-9]) for 100-999 does the job.

Wow, it looks nice!!! And it was quick.
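
A possible shortcut for the three digit groups above: a single counted group, ([0-9]{1,3}), matches one to three digits in one pass. Here is a minimal Python sketch of the idea. The sample lines are made up for illustration and only modelled on the markup in this thread.

```python
import re

# One counted group instead of three separate patterns:
# [0-9]{1,3} matches between one and three digits (1-999).
PATTERN = re.compile(r'<p class="calibre2"><sup class="calibre3">([0-9]{1,3})</sup>')

# Hypothetical sample lines, modelled on the markup quoted in this thread.
samples = [
    '<p class="calibre2"><sup class="calibre3">7</sup>',
    '<p class="calibre2"><sup class="calibre3">42</sup>',
    '<p class="calibre2"><sup class="calibre3">123</sup>',
    '<p class="calibre2"><sup class="calibre3">note</sup>',  # not a number, so no match
]

# Collect the captured number from every line that matches.
matched = [m.group(1) for m in map(PATTERN.search, samples) if m]
print(matched)
```

The same pattern should also be usable in the editor's search box, since {1,3} is ordinary regex syntax, but I have only checked it in plain Python.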
Old 09-16-2015, 12:40 PM   #17
Thom*
The Fumbler
Thom* began at the beginning.
 
Posts: 66
Karma: 10
Join Date: Jun 2015
Device: android 4.2/fbreader
Another try

I was so inspired by davidfor's suggestion to use Regex-Function mode that I delved into it and wrote my first function. It was a challenge that I could not pass up. I do believe that it works just as you requested, so I might as well share it with you.

So:
- Begin by selecting "Regex-Function" in the search/replace panel.
- Leave the "All text files" selected.
- Make sure there are no files listed in the "Function:" box so that when you hit "Create/edit" a new file will be created.
- Click "Create/edit" and name the new file what you like (I named it "FixIt").
- Copy the following code to the body of the new file replacing anything that is there.
Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):

    # New opening tag: id = "id_" + 5 characters of the file name + the match padded to 3 digits.
    newid = ('<p id="id_' + file_name[-10:-5] + ("000" + match.group(1))[-3:] + '" class="calibre1"><sup class="calibre2">' + match.group(1) + '</sup>')

    return newid
- Check to be sure you have 0 spaces before "def replace", 4 spaces before "newid" and 4 spaces before "return".
- Click "OK" and you are ready to rock and roll.
- Use the search string that I provided you previously: <p class="calibre2"><sup class="calibre3">(.*?)</sup>
- Hit replace and you should be in business.
- It should do what you want to all files.

Please let me know if this works for you or what problems you encounter.

That was great fun, thanks for the challenge.
Old 09-17-2015, 01:55 AM   #18
cybmole
Wizard
cybmole ought to be getting tired of karma fortunes by now.
 
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
that is interesting, could you talk us through it please.
I can intuit what most of the code is doing, but what is the significance of needing exactly 4 spaces in two locations?
Old 09-17-2015, 02:26 AM   #19
davidfor
Grand Sorcerer
davidfor ought to be getting tired of karma fortunes by now.
 
Posts: 24,907
Karma: 47303748
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
Quote:
Originally Posted by cybmole View Post
that is interesting, could you talk us through it please.
I can intuit what most of the code is doing, but what is the significance of needing exactly 4 spaces in two locations?
It is Python code. Rather than using delimiters of some sort, Python uses the indent levels to indicate code blocks. The "def" line is defining a function. Any code lines in the function have to be indented under it. The indent is usually four spaces, but a single space or a tab should also work, as long as every line in the block is indented the same way.
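
To make the indentation point concrete, here is a small sketch that asks Python itself whether three variants of the same function are acceptable. The helper name compiles and the sample sources are made up for this illustration.

```python
# Three versions of the same tiny function, differing only in indentation.
four_spaces = "def f():\n    x = 1\n    return x\n"
one_space = "def f():\n x = 1\n return x\n"
mixed = "def f():\n    x = 1\n  return x\n"  # second body line is indented less


def compiles(source):
    """Return True if Python accepts the source, False on an indentation error."""
    try:
        compile(source, "<example>", "exec")
        return True
    except IndentationError:
        return False


results = {
    "four_spaces": compiles(four_spaces),
    "one_space": compiles(one_space),
    "mixed": compiles(mixed),
}
print(results)
```

The first two variants are accepted because each block is indented consistently. The third is rejected because its two body lines do not line up.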
Old 09-17-2015, 03:07 AM   #20
cybmole
Wizard
cybmole ought to be getting tired of karma fortunes by now.
 
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
Quote:
Originally Posted by davidfor View Post
It is Python code. Rather than using delimiters of some sort, Python uses the indent levels to indicate code blocks. The "def" line is defining a function. Any code lines in the function have to be indented under it. The indent is usually four spaces, but a single space or a tab should also work, as long as every line in the block is indented the same way.
That makes sense, thanks - 4 spaces seemed arbitrary. Is the stuff in square brackets extracting specific parts of character strings?
Old 09-17-2015, 03:12 AM   #21
eschwartz
Ex-Helpdesk Junkie
eschwartz ought to be getting tired of karma fortunes by now.

Posts: 19,422
Karma: 85397180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
https://stackoverflow.com/questions/...slice-notation
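
For anyone who does not want to follow the link: the square brackets are Python slices, written [start:stop], where negative numbers count from the end of the string. A short sketch of the two slices used in the function, with a made-up file name:

```python
# Hypothetical file name; the function only relies on its last 10 characters.
file_name = "text/v3017.html"

# [-10:-5] starts 10 characters from the end and stops before the 5th from the end,
# which drops the ".html" extension and keeps the 5 characters before it.
chapter = file_name[-10:-5]

# Prepending "000" and keeping the last 3 characters zero-pads short numbers.
padded_one_digit = ("000" + "2")[-3:]
padded_two_digits = ("000" + "16")[-3:]

# Caveat: anything longer than 3 characters is cut down to its last 3.
truncated = ("000" + "1234")[-3:]

print(chapter, padded_one_digit, padded_two_digits, truncated)
```

Note that the file-name slice only gives the expected result when the name ends in a 5-character stem followed by ".html".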
Old 09-17-2015, 10:44 AM   #22
Thom*
The Fumbler
Thom* began at the beginning.
 
Posts: 66
Karma: 10
Join Date: Jun 2015
Device: android 4.2/fbreader
With Documentation

Here I have documented the function in detail and I have included the debug (print) statements so you can see the results. I hope it helps.
Code:
# Begin with what I assume is kovidgoyal's Python function "replace", which gives access to file_name and the search data.
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):

    # Simply key in the desired text to begin the replace string.
    str1 = '<p id="id_'
    print(str1)

    # Take file_name and extract from the 10th character from the end (inclusive) to the 5th from the end (exclusive).
    str2 = file_name[-10:-5]
    print(str2)

    # Take the match, add zeros to the front and keep the last 3 characters.
    str3 = ("000" + match.group(1))[-3:]
    print(str3)

    # Key in the desired text for the middle of the replace string.
    str4 = '" class="calibre1"><sup class="calibre2">'
    print(str4)

    # Take the match (unmodified).
    str5 = match.group(1)
    print(str5)

    # Key in the desired text for the end of the replace string.
    str6 = '</sup>'
    print(str6)

    # Concatenate the strings.
    newid = str1 + str2 + str3 + str4 + str5 + str6
    print(newid)

    # Return the replace string.
    return newid
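
If you want to try the function outside the editor first, something like the following sketch should do. It drives the same logic with Python's re.sub. The helper run_outside_calibre, the sample line, and the file name are made up for illustration, and the None and {} arguments are only stand-ins for the values calibre would normally pass in.

```python
import re


def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    # Same logic as the function in this thread, without the print calls.
    return ('<p id="id_' + file_name[-10:-5] + ("000" + match.group(1))[-3:]
            + '" class="calibre1"><sup class="calibre2">' + match.group(1) + '</sup>')


# The search string given earlier in the thread.
SEARCH = r'<p class="calibre2"><sup class="calibre3">(.*?)</sup>'


def run_outside_calibre(text, file_name):
    """Apply replace() to every match in text (hypothetical test helper)."""
    count = 0

    def call(match):
        nonlocal count
        count += 1
        # None and {} stand in for the arguments calibre would supply.
        return replace(match, count, file_name, None, None, {}, None)

    return re.sub(SEARCH, call, text)


sample = '<p class="calibre2"><sup class="calibre3">2</sup>Some text</p>'
result = run_outside_calibre(sample, "text/v3017.html")
print(result)
```

This only checks the string handling; it says nothing about how the editor walks through the files of a book.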
Old 09-18-2015, 04:00 AM   #23
geniale
Member
geniale began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Sep 2015
Device: none
Quote:
Originally Posted by Thom* View Post
I was so inspired by davidfor's suggestion to use Regex-Function mode that I delved into it and wrote my first function. It was a challenge that I could not pass up. I do believe that it works just as you requested, so I might as well share it with you.

So:
- Begin by selecting "Regex-Function" in the search/replace panel.
- Leave the "All text files" selected.
- Make sure there are no files listed in the "Function:" box so that when you hit "Create/edit" a new file will be created.
- Click "Create/edit" and name the new file what you like (I named it "FixIt").
- Copy the following code to the body of the new file replacing anything that is there.
Code:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):

    newid = ('<p id="id_' + (file_name [-10:-5:]) + ("000" + match.group(1)) [-3:] + '" class="calibre1"><sup class="calibre2">' + match.group(1) + '</sup>')

    return newid
- Check to be sure you have 0 spaces before "def replace", 4 spaces before "newid" and 4 spaces before "return".
- Click "OK" and you are ready to rock and roll.
- Use the search string that I provided you previously: <p class="calibre2"><sup class="calibre3">(.*?)</sup>
- Hit replace and you should be in business.
- It should do what you want to all files.

Please let me know if this works for you or what problems you encounter.

That was great fun, thanks for the challenge.
*****************
Wowwwwwwwwwww, it's amazing, Thom*. The function works and does ALL the job. This is what I was looking for!!!!!!!!!
Old 09-18-2015, 04:41 AM   #24
geniale
Member
geniale began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Sep 2015
Device: none
Thom*, the function works great, it's amazing.

From file v3001 to v3011 it goes VERY well, counting from 1 to 24. From v3012 to v3017, though, it seems to skip number 1 and start from 2; where number 1 should be, nothing happens. Why is that?

From v3001 to v3011:
<p class="block_1" id="v3001001"><span class="block_2">1</span><span class="text_">
<p class="calibre2" id="v3001002"><sup class="calibre3">2</sup>
....
<p class="calibre2" id="v3001017"><sup class="calibre3">17</sup>

The file v3011:
<p class="block_1" id="v3011001"><span class="block_2">11</span><span class="text_">
<p class="calibre2" id="v3011002"><sup class="calibre3">2</sup>
<p class="calibre2" id="v3011003"><sup class="calibre3">3</sup>
....
<p class="calibre2" id="v3011036"><sup class="calibre3">36</sup>

From v3012 to v3017:
<p class="block_1"><span class="block_2">17</span><span class="text_">
<p id="v3017002" class="calibre1"><sup class="calibre2">2</sup>
<p id="v3017003" class="calibre1"><sup class="calibre2">3</sup>
.....
<p id="v3017016" class="calibre1"><sup class="calibre2">16</sup>

Every file begins with the chapter number (1-17), which has a different format. The strange thing is that for chapters 1 to 11 your function works great, but from chapter 12 on it skips ONLY the first paragraph and starts counting from paragraph 2.
Old 09-18-2015, 09:10 AM   #25
Thom*
The Fumbler
Thom* began at the beginning.
 
Posts: 66
Karma: 10
Join Date: Jun 2015
Device: android 4.2/fbreader
Wow, this looks like a whole different scenario and I can't quite follow your question.

If the search and replace is skipping the first paragraph in some files, the search criteria are probably not met there. Test that by simply running the search (skip the replace) and see if the paragraph is found.

It is common in ebooks for the first paragraph of a chapter to have a slightly different format.
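
The search-only test can also be sketched in plain Python: run the search pattern over each paragraph opening and list the ones it does not match. The lines below are modelled on the samples posted in this thread, and the variable names are made up for illustration.

```python
import re

# The search string given earlier in the thread.
SEARCH = re.compile(r'<p class="calibre2"><sup class="calibre3">(.*?)</sup>')

# Lines modelled on the samples posted in this thread.
lines = [
    '<p class="block_1"><span class="block_2">17</span><span class="text_">',
    '<p class="calibre2"><sup class="calibre3">2</sup>',
    '<p class="calibre2"><sup class="calibre3">3</sup>',
]

# Paragraph openings that the search pattern does not match.
unmatched = [line for line in lines if line.startswith("<p") and not SEARCH.search(line)]
for line in unmatched:
    print("no match:", line)
```

Here the first paragraph uses span tags and a different class instead of the sup tag, so the search never finds it and the replace has nothing to work on.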
Old 09-18-2015, 12:36 PM   #26
geniale
Member
geniale began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Sep 2015
Device: none
Quote:
Originally Posted by Thom* View Post
Wow, this looks like a whole different scenario and I can't quite follow your question.

If the search and replace is skipping the first paragraph on some files, probably the search criteria is not met. Test that by simply running the search (skip the replace) and see if the paragraph is found.

It is common in ebooks for the first paragraph of a chapter to have slightly different format.
******
In a few words: for chapters 1 to 11 the changes include the first paragraph. For chapters 12 to 17 the changes start from the second paragraph.

But it's OK, I will do them manually, and if I find a way to do it automatically, I will let you know.

Thanks again for your help!!!
Old 09-22-2015, 05:21 AM   #27
geniale
Member
geniale began at the beginning.
 
Posts: 14
Karma: 10
Join Date: Sep 2015
Device: none
It seems that it was a problem with the format and the search criteria. To clean up the code exported from Word a little more, how can I do this replacement?

This is my line:
<span class="text_chapter">2 <span class="text_1">This is my text I want to save.</span></span></p>

How I want it to be:
<span class="text_chapter"><span class="text_1">2</span></span>This is my text I want to save.</p>

In other words, is there a formula to group a line of text? How can I delete all the spaces that I find after the number 2 of my chapters?

Thanks again for the help.
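
One way this rearrangement might be done is with two capture groups: one for the number and one for the text, with \s* swallowing any spaces after the number. A minimal sketch in plain Python, using the line from this post; the names PATTERN and REPLACEMENT are made up, and it assumes the text itself contains no nested span tags.

```python
import re

# Group 1: the chapter number. \s* eats any spaces that follow it.
# Group 2: the text inside the inner span (assumed to hold no nested spans).
PATTERN = re.compile(
    r'<span class="text_chapter">(\d+)\s*<span class="text_1">(.*?)</span></span>'
)

# Put the number inside both spans and move the text after them.
REPLACEMENT = r'<span class="text_chapter"><span class="text_1">\1</span></span>\2'

line = ('<span class="text_chapter">2 <span class="text_1">'
        'This is my text I want to save.</span></span></p>')

fixed = PATTERN.sub(REPLACEMENT, line)
print(fixed)
```

The same search and replace strings should carry over to the editor's regex mode, since \1 and \2 are ordinary backreferences, but I have only tried them in plain Python.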

P.S. I have a bigger challenge: Is there a way to know when a paragraph number is missing?
<sup class="calibre3">2</sup> - this is the format of the paragraph
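
As for spotting a missing paragraph number, one possible approach is to collect every number that appears in that sup format and report the gaps. A rough sketch, assuming numbering should start at 1 in each file; the function name missing_numbers and the sample text are made up for illustration.

```python
import re

# The paragraph-number format given in this post.
SUP = re.compile(r'<sup class="calibre3">(\d+)</sup>')


def missing_numbers(html):
    """Return the numbers absent between 1 and the highest number found."""
    found = [int(n) for n in SUP.findall(html)]
    if not found:
        return []
    present = set(found)
    return [n for n in range(1, max(found) + 1) if n not in present]


# Hypothetical file content with paragraphs 1 and 4 missing.
sample = ('<sup class="calibre3">2</sup> text '
          '<sup class="calibre3">3</sup> text '
          '<sup class="calibre3">5</sup> text')

gaps = missing_numbers(sample)
print(gaps)
```

Two limits of this sketch: it cannot notice numbers missing after the last one found, and a first paragraph that uses a different format (as in the chapter openings earlier in this thread) will always be reported as missing.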