Old 12-30-2014, 08:55 PM   #16
kovidgoyal
creator of calibre
Posts: 37,010
Karma: 16422171
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
The way those functions work is that they uppercase the contents of any groups in the find expression. You have specified a group that matches H1. You need to specify a group that matches the actual content, like this.

<[Hh][1-6]>(.+?)</[Hh][1-6]>

If you want a case-changing function that ignores text in tag definitions in the matched text, you will need to write one yourself. The builtin functions won't do that because they are for general-purpose use, not specifically for changing text between tags.
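To illustrate what "changing only the contents of the group" means, here is a minimal sketch in plain Python using the standard `re` module. It is not calibre's code: `upper_groups` is a hypothetical helper, and the first find expression (with the group around the tag name) is only a guess at the kind of expression that caused the problem.

```python
import re

def upper_groups(match):
    """Upper-case only the text captured by group 1, leaving the rest of the match alone."""
    whole = match.group(0)
    # Group offsets are relative to the full string, so shift them to the match.
    start = match.start(1) - match.start(0)
    end = match.end(1) - match.start(0)
    return whole[:start] + whole[start:end].upper() + whole[end:]

text = '<h1>chapter one</h1>'

# Group around the tag name: only "h1" is changed.
tag_grouped = re.sub(r'<([Hh][1-6])>.+?</[Hh][1-6]>', upper_groups, text)
# Group around the content: only the heading text is changed.
content_grouped = re.sub(r'<[Hh][1-6]>(.+?)</[Hh][1-6]>', upper_groups, text)

print(tag_grouped)      # <H1>chapter one</h1>
print(content_grouped)  # <h1>CHAPTER ONE</h1>
```

Moving the parentheses is the only difference between the two calls, which is why the placement of the group decides what gets changed.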
Old 12-31-2014, 10:29 AM   #17
phossler
Guru
Posts: 897
Karma: 295106
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
Thanks for your patience and the explanations


Quote:
You need to specify a group that matches the actual content, like this.
Thanks - I'm a little smarter about RegEx now.

Using your Find works exactly as advertised and it correctly finds and highlights the Hx tags.


Quote:
The built in functions won't do that, because, they are for general purpose use, not specifically for changing text between tags.
Understood, but it still seems (to me at least) that the built-in TitleCase function has a possible side effect:


1. It replaces tag markers ('<' and '>') with what is treated like normal text
2. It does not TitleCase the text that it does find


Quote:
'''Title-case matched text. If the regular expression contains groups,
only the text in the groups will be changed, otherwise the entire text is
changed.'''

So I assume that

<[Hh][1-6]>(.+?)</[Hh][1-6]>

would make the \1 group for the Replace just the red text in the Before below?

Before:

Code:
  <h1>TEST1 TEST1 TEST1 TEST1 TEST1 </h1>
  <p>NOW IS THE TIME and this should remain mixed case</p>
  <h1>TEST2 TEST2 TEST2 <br/><br/>TEST3 TEST3 </h1>
  <p>NOW IS THE TIME and this should remain mixed case</p>
  <h1>TEST4 <i>TEST4 TEST4 TEST4</i> TEST4 </h1>

After:

Code:
  <h1>Test1 Test1 Test1 Test1 Test1 </h1>
  <p>NOW IS THE TIME and this should remain mixed case</p>
  <h1>TEST2 TEST2 TEST2 &lt;br/&gt;&lt;br/&gt;TEST3 TEST3 </h1>
  <p>NOW IS THE TIME and this should remain mixed case</p>
  <h1>TEST4 &lt;i&gt;TEST4 TEST4 TEST4&lt;/i&gt; TEST4 </h1>
1. So the simplest case (the first H1) works correctly.

2. I don't understand why the same logic isn't applied to the second and third headings, so that all text between the Hx tags is made title case. I also don't understand why < and > are replaced with entities, which end up being treated like normal text.
Attached image: Capture.JPG
Old 12-31-2014, 10:45 AM   #18
kovidgoyal
creator of calibre
The logic is simple:

*Everything* that matches the expression inside the brackets (that is, the group) is made upper case. Furthermore, the function treats all that text as plain text, not a mix of HTML and plain text. Because the output of the function is being put into an HTML file, < and > get replaced by entities.
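The entity replacement can be reproduced outside calibre. The sketch below uses `html.escape` from the Python standard library as a stand-in for whatever calibre does internally (that part is an assumption); it only shows what happens when captured text containing tags is treated as plain text and then written into HTML.

```python
import html

# The text captured by the group in the second heading from the example above.
captured = 'TEST2 TEST2 TEST2 <br/><br/>TEST3 TEST3 '

# Treating it as plain text means < and > must be escaped before it goes into HTML.
escaped = html.escape(captured)
print(escaped)  # TEST2 TEST2 TEST2 &lt;br/&gt;&lt;br/&gt;TEST3 TEST3 
```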

Or in other words, that function is not designed to be used in the way you are trying to use it.

You need to come up with a function that understands that it could be operating on a mixture of HTML tags and plain text and so restricts itself to only the plain text parts.
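A rough idea of what such a function could look like, as a plain-Python sketch. This is not calibre's implementation: `apply_to_text_only` is a hypothetical helper, the tag pattern is deliberately naive, and `str.title` is used only as a simple stand-in for a real title-casing routine.

```python
import re

# Naive tag pattern: anything from "<" to the next ">".
TAG = re.compile(r'(<[^>]*>)')

def apply_to_text_only(markup, func):
    """Apply func to the text between tags and leave the tags themselves untouched."""
    parts = TAG.split(markup)
    # With one capturing group, re.split puts text at even indexes and tags at odd ones.
    return ''.join(part if i % 2 else func(part) for i, part in enumerate(parts))

result = apply_to_text_only('<h1>TEST4 <i>TEST4 TEST4</i> TEST4 </h1>', str.title)
print(result)  # <h1>Test4 <i>Test4 Test4</i> Test4 </h1>
```

Because the tags are passed through unchanged, nothing needs to be escaped and the inner <i> element survives.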
Old 12-31-2014, 10:46 AM   #19
kovidgoyal
creator of calibre
I have created a builtin function that does that for you; it will be in the next release.

https://github.com/kovidgoyal/calibr...151ff7a9946577
Old 12-31-2014, 11:10 AM   #20
jbacelar
Interested in the matter
Posts: 325
Karma: 426094
Join Date: Dec 2011
Location: Spain, south coast
Device: Pocketbook Touch HD
Paul,
What this expression, <[Hh][1-6]>(.+?)</[Hh][1-6]>, looks for is:
<an h (or H) followed by a number (1 to 6)>anything</another h followed by another number>

Here you have:
<h followed by a number>anything</br or </i

br or i is not an h followed by a number.
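The same breakdown can be written as a commented pattern with Python's `re.VERBOSE` flag. This is only a sketch with the standard `re` module, run outside calibre. One detail worth noting: the lazy `.+?` keeps going until the first closing heading tag, so inner tags such as <br/> or <i> end up inside the group rather than ending the match.

```python
import re

HEADING = re.compile(r'''
    <[Hh][1-6]>      # opening tag: h or H followed by a digit 1-6
    (.+?)            # group 1: as little text as possible...
    </[Hh][1-6]>     # ...up to the first closing heading tag
''', re.VERBOSE)

m = HEADING.search('<h1>TEST2 TEST2 TEST2 <br/><br/>TEST3 TEST3 </h1>')
print(m.group(1))  # TEST2 TEST2 TEST2 <br/><br/>TEST3 TEST3 
```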

I recommend that if you want to use regex, visit this website:
http://www.regular-expressions.info/tutorial.html
Old 12-31-2014, 03:19 PM   #21
phossler
Guru
@kovid -- THANKS!!!! I can see I'll have to learn at least a little python

I was confused by the apparently different treatment by the TitleCase function of the first (simplest) sentence, "Where It Worked Just Fine", and the second and third, where IT LEFT EVERYTHING IN UPPER CASE.

@jbacelar -- The Find Kovid gave me seems to work fine. It would select all this H1 text, including the <h1> and </h1> ...

<h1>TEST2 TEST2 TEST2 <br/><br/>TEST3 TEST3 </h1>

After the Replace

<h1>TEST2 TEST2 TEST2 &lt;br/&gt;&lt;br/&gt;TEST3 TEST3 </h1>

What was confusing me was that the text was not in title case. I understand the replaced entities now.

I believe that Kovid's new built-in function is the only way to handle these types of cases.
Old 06-26-2020, 02:04 PM   #22
Ted Friesen
Member
Posts: 11
Karma: 10
Join Date: May 2016
Device: Kindle
Title-case text built-in function

I'm also having trouble with the "Title-case text (ignore tags)" built-in function. I've wrapped all the UPPER case text that I want to convert to Title case in <h2> tags and am using the search parameter "(?s)<h\d>(.+?)</h\d>".
Applying "Replace-all" results in the deletion of all H tags and the intervening text. No conversion, just deletion.
Editing the built-in function, this is what I see:
def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    return ''
Shouldn't there be more to it?
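If the function that actually runs is the stub shown above, deletion is what one would expect: every match is replaced by the empty string the function returns. A minimal sketch with the standard `re` module (outside calibre, so the extra arguments are filled with placeholder values, which is an assumption made purely for illustration):

```python
import re

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    return ''

text = '<p>intro</p><h2>CHAPTER ONE</h2><p>body</p>'

# re.sub only passes the match object, so the other arguments get placeholders here.
result = re.sub(r'(?s)<h\d>(.+?)</h\d>',
                lambda m: replace(m, 1, None, None, None, None, None),
                text)
print(result)  # <p>intro</p><p>body</p>
```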
Old 06-26-2020, 10:57 PM   #23
kovidgoyal
creator of calibre
https://manual.calibre-ebook.com/fun...n-the-document
Old 06-27-2020, 08:22 PM   #24
phossler
Guru
@Ted - Your PM

I actually run two steps: one to upper case headings, and then a second to title case them

These are my saved searches and this is the function listing for 'Title case text - Ignore tags'

Code:
from calibre.utils.titlecase import titlecase
from calibre.ebooks.oeb.polish.utils import apply_func_to_html_text

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    '''Title-case matched text, ignoring the text inside tag definitions.'''
    return apply_func_to_html_text(match, titlecase)
Attached images: Capture.JPG, Capture2.JPG
Old 06-29-2020, 03:43 PM   #25
Ted Friesen
Member
Title-case text built-in function

Thanks Paul, it worked!!

Question: When I Create/Edit built-in functions is there supposed to be some code there?
Old 06-29-2020, 07:23 PM   #26
phossler
Guru
I have mine as a Saved Search, but you can also do it ad hoc

There is code there that defines the function, but I don't know Python so I never created any

I guess you can create your own function

The Calibre Users' Manual is one of the best I've seen in a long time:

https://manual.calibre-ebook.com/function_mode.html
Attached images: Capture.JPG, Capture1.JPG

Last edited by phossler; 06-29-2020 at 07:25 PM.
Old 07-02-2020, 07:57 PM   #27
Ted Friesen
Member
Thanks for the code Paul.

It sounds like you didn't code the function you sent me, but that it was "built-in". When I choose any of the dozen or so built-in functions the code is always the same


def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    return ''

Does your installation of Calibre actually display appropriate function code when you choose different built-in functions? If so, any ideas why?
Old 07-03-2020, 10:27 AM   #28
phossler
Guru
Yes, it was one of the built in ones that Calibre supplies


Quote:
Automatically fixing the case of headings in the document
Here, we will leverage one of the builtin functions in the editor to automatically change the case of all text inside heading tags to title case:

Find expression: <([Hh][1-6])[^>]*>.+?</\1>

For the function, simply choose the Title-case text (ignore tags) builtin function. This will change titles that look like: <h1>some TITLE</h1> to <h1>Some Title</h1>. It will work even if there are other HTML tags inside the heading tags.
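The find expression from the manual differs from the earlier one in two ways: `[^>]*` allows attributes on the opening tag, and the backreference `\1` requires the closing tag to repeat the tag name captured by group 1. A small sketch with Python's standard `re` module (outside calibre, with made-up sample headings) shows both effects:

```python
import re

HEADING = re.compile(r'<([Hh][1-6])[^>]*>.+?</\1>')

# Attributes and inner tags are fine; the whole heading is matched.
ok = HEADING.search('<h2 class="title">some <i>TITLE</i></h2>')
# The closing tag must use the same name as the opening one, so this does not match.
mismatch = HEADING.search('<h1>some TITLE</h2>')

print(ok.group(0))  # <h2 class="title">some <i>TITLE</i></h2>
print(mismatch)     # None
```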

Don't know why. If I look at the 'code' for the function, I see the attached
Attached image: Capture.JPG
Old 07-03-2020, 01:47 PM   #29
Ted Friesen
Member
Built-in regex-functions code missing

From recent replies, it seems that clicking on create/edit regex-function should reveal the code for built-in functions. My installation (Calibre 4.19 64-bit on Microsoft Windows [Version 10.0.18363.900]) does not.

Any ideas why? Have I turned something off inadvertently? Have I failed to install some module? Should I still be using OS/2?

Having some functions (more than what's in the manual) to play with will help me learn enough to fix my epub book library.

Any help you can give me (samples of code and search strings) will be greatly appreciated.
Old 07-04-2020, 10:52 AM   #30
davidfor
Grand Sorcerer
Posts: 19,362
Karma: 32874111
Join Date: Jul 2011
Location: Sydney, Australia
Device: Kobo:Touch,Glo, AuraH2O, GloHD,AuraONE, ClaraHD, Libra H2O; tolinoepos
Quote:
Originally Posted by Ted Friesen View Post
Thanks for the code Paul.

It sounds like you didn't code the function you sent me, but that it was "built-in". When I choose any of the dozen or so built-in functions the code is always the same


def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    return ''

Does your installation of Calibre actually display appropriate function code when you choose different built-in functions? If so, any ideas why?
Quote:
Originally Posted by Ted Friesen View Post
From recent replies, it seems that clicking on create/edit regex-function should reveal the code for built-in functions. My installation (Calibre 4.19 64-bit on Microsoft Windows [Version 10.0.18363.900]) does not.

Any ideas why? Have I turned something off inadvertently? Have I failed to install some module?
What you have above looks like the default code you get if you open the function editor without a name in the "Function" field in the find box. If you then use the "Function name" drop-down to select another function, the displayed code isn't updated. I don't know whether that is deliberate; I can see it working either way. If you select the function name in the find box and then open the function editor, you get the code for that function.
Quote:
Should I still be using OS/2?
If only we could
Quote:
Having some functions (more than what's in the manual) to play with will help me learn enough to fix my epub book library.

Any help you can give me (samples of code and search strings) will be greatly appreciated.
I don't have any useful examples. I've tried a couple of things, but, it's actually the search that ends up being the problem, not the update.
All times are GMT -4.