Old 02-01-2015, 09:12 AM   #1
phossler
Wizard
Posts: 1,087
Karma: 447222
Join Date: Jan 2009
Location: Valley Forge, PA, USA
Device: Kindle Paperwhite
S/R Function Title-Case

I had some 'unexpected' results trying to re-format a lot of Hx titles into TitleCase.

Ended up making just a small test doc to investigate further.

It seems as if the TitleCase function stops if the first word is already in mixed case, so the first H1 is found but not replaced. The rest are fine.

Code:
<body>
<h1> Chapter 1 AAAAAAAAAAAAAAA BBBBBBBBBBBB CCCCCCCCCCCCCC</h1>
<h1>CHAPTER 1 AAAAAAAAAAAAAAA BBBBBBBBBBBB CCCCCCCCCCCCCC</h1>
<h1>chapter 1 aaaaaaaaaaaaa bbbbbbbbbbbbbb cccccccccccccccc</h1>
<h1>AAAAAAAAAAAAAAA BBBBBBBBBBBB CCCCCCCCCCCCCCCCCCCCCC</h1>
</body>

The workaround I've found is to do the S/R with the built-in UPPERCASE function first, and then the TitleCase.

Is there a way to 'include the call' to the UPPERCASE function as part of the TitleCase function?

I got far enough to see the function (or at least I think it was) but couldn't figure out how to do it myself.

Code:
from calibre.utils.titlecase import titlecase
from calibre.ebooks.oeb.polish.utils import apply_func_to_html_text

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    '''Title-case matched text, ignoring the text inside tag definitions.'''
    return apply_func_to_html_text(match, titlecase)


Thanks
Attachment: Capture.JPG (90.1 KB)

Last edited by phossler; 02-01-2015 at 09:17 AM.
Old 02-01-2015, 01:03 PM   #2
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
Works fine for me.
Old 02-01-2015, 01:21 PM   #3
phossler
Wizard
Thanks for checking, but the first of the four <h1> elements still ends up like this:

<h1> Chapter 1 AAAAAAAAAAAAAAA BBBBBBBBBBBB CCCCCCCCCCCCCC</h1>

Could you please verify that the Saved Search in my screenshot is the same as yours?
Old 02-01-2015, 02:15 PM   #4
eschwartz
Ex-Helpdesk Junkie
Now I see the problem.

Titlecasing ignores words that have capitals in the middle of the word (unless the whole thing is uppercased). Try titlecasing "ChaPter".

See:

src/calibre/utils/titlecase.py line 71
Code:
        if INLINE_PERIOD.search(word) or UC_ELSEWHERE.match(word):
            line.append(word)
            continue
I originally took you at your word and tried to replace "Some text."
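
To make the skip rule concrete, here is a rough standalone sketch in plain Python. It is not calibre's code: `INNER_CAPITAL` and `simple_titlecase` are made-up names, the regex is only a stand-in for the real `UC_ELSEWHERE` pattern, and the real function does much more (small words, punctuation, and so on). It only models the two behaviours described above: words with a capital after the first letter are left alone, unless the whole text is upper case.

```python
import re

# Stand-in for the "capital somewhere after the first letter" check.
# This is an illustrative pattern, not calibre's actual UC_ELSEWHERE regex.
INNER_CAPITAL = re.compile(r"[A-Za-z]+[A-Z]")


def simple_titlecase(text):
    """Very reduced model of the skip rule discussed in this thread."""
    # If the whole text is upper case, every word is lower-cased first.
    all_caps = text.upper() == text
    out = []
    for word in text.split():
        if all_caps:
            word = word.lower()
        # Words that still contain an inner capital are passed through as-is.
        if INNER_CAPITAL.match(word):
            out.append(word)
            continue
        out.append(word[:1].upper() + word[1:])
    return " ".join(out)


mixed = simple_titlecase("Chapter 1 AAAA BBBB")      # text is not all caps
shouting = simple_titlecase("CHAPTER 1 AAAA BBBB")   # text is all caps
odd = simple_titlecase("ChaPter")
```

In this model the mixed-case heading comes back unchanged, because "AAAA" and "BBBB" have inner capitals and the text as a whole is not upper case, while the all-caps heading is converted as expected.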

Last edited by eschwartz; 02-01-2015 at 02:26 PM.
Old 02-01-2015, 02:39 PM   #5
theducks
Well trained by Cats
Posts: 31,047
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Quote:
Originally Posted by eschwartz
Now I see the problem.

Titlecasing ignores words that have capitals in the middle of the word (unless the whole thing is uppercased). Try titlecasing "ChaPter".

See:

src/calibre/utils/titlecase.py line 71
Code:
        if INLINE_PERIOD.search(word) or UC_ELSEWHERE.match(word):
            line.append(word)
            continue
I originally took you at your word and tried to replace "Some text."
Case changing (when mixed) has been problematic for me from way back in the DOS days.

I always lowered first, then set the desired case.
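
As a rough illustration of that habit in plain Python (nothing calibre-specific; `lower_then_title` is just a name made up for this sketch), normalising to lower case first means the starting mix of cases no longer matters:

```python
def lower_then_title(text):
    """Lower-case everything first, then capitalise the first letter of each word."""
    lowered = text.lower()
    # split() with no arguments also collapses runs of whitespace
    # and drops leading/trailing spaces.
    return " ".join(word[:1].upper() + word[1:] for word in lowered.split())


result = lower_then_title(" Chapter 1 AAAA BBBB")
```

Note that this sketch does not preserve the original spacing, and it would flatten words that are meant to keep inner capitals.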
Old 02-01-2015, 02:44 PM   #6
phossler
Wizard
The workaround that I came up with is to run UPPERCASE first and then TitleCase.

That's why I was wondering about modifying the title case RegExFunction and just doing an upper case first, so it all happens in one step.
Old 02-01-2015, 02:47 PM   #7
eschwartz
Ex-Helpdesk Junkie
Try:

Func: "Upper-case text then Title-case it"
Code:
from calibre.utils.icu import upper
from calibre.utils.titlecase import titlecase
from calibre.ebooks.oeb.polish.utils import apply_func_to_match_groups

def replace(match, number, file_name, metadata, dictionaries, data, functions, *args, **kwargs):
    '''Upper-case matched text, then title-case it. If the regular expression
    contains groups, only the text in the groups is upper-cased, otherwise the
    entire match is.'''
    # Upper-case first so titlecase() does not skip mixed-case words.
    text = apply_func_to_match_groups(match, upper)

    # Now return it title-cased.
    return titlecase(text)
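
For anyone who wants to see the "upper-case, then title-case, but leave the tags alone" idea outside calibre, here is a standalone sketch in plain Python. Everything in it is an assumption for illustration: `apply_outside_tags`, `naive_title`, `upper_then_title` and the `TAG` regex are hypothetical stand-ins, not calibre's helpers, and `naive_title` is far simpler than calibre's `titlecase`.

```python
import re

# Hypothetical, simplified tag matcher; real HTML handling is more involved.
TAG = re.compile(r"(<[^>]*>)")


def apply_outside_tags(html, func):
    """Apply func to the text between tags, leaving the tags untouched."""
    parts = TAG.split(html)
    # With one capturing group, re.split alternates text (even index)
    # and tags (odd index).
    return "".join(part if i % 2 else func(part) for i, part in enumerate(parts))


def naive_title(text):
    """Stand-in title-caser: lower-cases only all-caps text, then
    upper-cases the first letter of each run of letters."""
    if text.upper() == text:
        text = text.lower()
    return re.sub(r"[A-Za-z]+", lambda m: m.group()[0].upper() + m.group()[1:], text)


def upper_then_title(text):
    """Do both steps in one call: upper-case first, then title-case."""
    return naive_title(text.upper())


heading = "<h1> Chapter 1 AAAA BBBB</h1>"
one_step = apply_outside_tags(heading, upper_then_title)
title_only = apply_outside_tags(heading, naive_title)
```

With these stand-ins, title-casing alone leaves the mixed-case heading as it was, while the combined function converts it, and the `<h1>` tags are never passed to the case functions.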