#1 |
Member
Posts: 16
Karma: 472024
Join Date: Nov 2012
Device: Samsung Galaxy S3
|
[Old Thread] Capitalize first word in sentence with search and replace?
Hi Folks,
I've converted a .PDF file to .epub and was able to remove the headers and footers with only a little difficulty. I notice, however, that after the conversion a lot of the capitalization at the beginning of sentences has been lost (unrelated to the headers and footers), which is rather annoying.

It occurred to me to use a regex to locate lower-case characters at the start of sentences. Initially, I could think of two cases:

1) The first character of a sentence after a paragraph break. Can locate with "\.<br>\s+[a-z]"
2) The first character of a sentence in the middle of a paragraph, assuming the previous sentence ends with a period and is followed by one space. Can locate with "\. [a-z]"

My question is: what should I put in the replacement text box so that Calibre substitutes the upper-case version of the character found by the search regex? At first I just tried "\.<br>\s+[A-Z]" and "\. [A-Z]", but the replacement took those as literal text strings and wrote them into the book, so that, for example, every sentence that began with a lower-case character in the middle of a paragraph now begins with "\. [A-Z]" rather than the correct letter.

Thanks for any help,
ianc
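For reference, the transformation being asked for can be sketched outside of calibre's dialog in plain Python, where the replacement may be a function instead of a string. This is only a minimal sketch under assumptions: the name `capitalize_sentence_starts` and the sample text are made up for illustration, and it is not a statement about what calibre's replacement box accepts.

```python
import re

# Combines the two search cases from the post: a period followed either by
# "<br>" plus whitespace, or by a single space, and then one lowercase letter.
SENTENCE_START = re.compile(r"(\.(?:<br>\s+| ))([a-z])")

def capitalize_sentence_starts(text: str) -> str:
    """Uppercase a lowercase letter that directly follows a sentence break."""
    # The callback keeps the separator (group 1) and uppercases the letter (group 2).
    return SENTENCE_START.sub(lambda m: m.group(1) + m.group(2).upper(), text)

# Hypothetical sample text, not taken from the book in question.
sample = "It rained. the road flooded.<br>\n  nobody left."
print(capitalize_sentence_starts(sample))
```

Note that a pattern this simple will also capitalize after abbreviations such as "e.g. the", so the result would still need a quick read-through.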
#2 |
Member
Posts: 11
Karma: 30
Join Date: Nov 2012
Device: Nook Color
|
I am not a script wizard so I can't directly write the perfect solution, but it seems you would need to use this unless you iterate the operation for the whole [A-Z] range.
Another trick would be to convert the text to an xls file, use the capitalize function, then convert it back to a text file, though you'd have to think about keeping the paragraphs and chapters intact in the process.
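The "iterate the operation for the whole range" idea can be sketched as 26 plain, non-regex replacements, one per letter. This is a rough Python illustration only; the function name `capitalize_by_iteration` is hypothetical, and it covers just the "period, one space, lowercase letter" case.

```python
import string

def capitalize_by_iteration(text: str) -> str:
    """Run one literal search-and-replace per lowercase letter."""
    for letter in string.ascii_lowercase:
        # e.g. ". a" becomes ". A", then ". b" becomes ". B", and so on.
        text = text.replace(". " + letter, ". " + letter.upper())
    return text

print(capitalize_by_iteration("one. two. three"))
```

Done by hand in a search-and-replace dialog, this would mean repeating the operation once per letter, which is why it is only attractive when no better tool is available.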
#3 |
Gadget Freak
Posts: 1,169
Karma: 1043832
Join Date: Nov 2007
Location: US
Device: EE, Note 8
|
Use Sigil. calibre does not have the full set of PCRE functions.
#4 |
Member
Posts: 16
Karma: 472024
Join Date: Nov 2012
Device: Samsung Galaxy S3
|
Thanks guys. Yes, I eventually did download Sigil, took a look, and was able to do it there. Sigil is a bit more inconvenient to use, or at least it seems so to me, but then I haven't put any time into learning it yet.
It would seem, then, that Calibre's search and replace function cannot use back references to a group in the initial search regexp?
Thanks again,
ianc
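As a side note on back references in general: even where they are available, a plain back reference only copies the matched text back, so it cannot change case on its own. A small Python sketch of that point follows; it uses Python's `re` module, not calibre's dialog, and the sample string is made up.

```python
import re

# Hypothetical sample text.
text = "first sentence. second sentence."

# Group 1 is the period and space, group 2 is the lowercase letter.
pattern = re.compile(r"(\. )([a-z])")

# \1 and \2 re-insert exactly what was matched, so the text is unchanged.
unchanged = pattern.sub(r"\1\2", text)

# \g<2> is the same back reference in Python's explicit syntax; here it is
# only used to mark the letter that a case conversion would have to touch.
marked = pattern.sub(r"\1[\g<2>]", text)

print(unchanged)
print(marked)
```

Changing the case therefore needs either a replacement function or a regex flavour with case-conversion escapes in the replacement string.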
#5 |
Calibre Plugins Developer
Posts: 4,720
Karma: 2197770
Join Date: Oct 2010
Location: Australia
Device: Kindle Oasis
|
Install the "Open With" plugin, assign "Open with Sigil" to a keyboard shortcut, and you will be way ahead of anything you can do in calibre when it comes to search and replace, at least for working with EPUB. I assign Alt+E in my case, and it is second nature to do any editing that way in Sigil, be it CSS tweaks, find/replace operations, TOC manipulations, etc.
Personally, my toes curl every time I see one of these threads and the sort of hoops people are jumping through to try to use the calibre S&R.
#6 |
Member
Posts: 16
Karma: 472024
Join Date: Nov 2012
Device: Samsung Galaxy S3
|
OK, thanks again for the help guys, looks like Sigil it is...
ianc |
#7 |
Groupie
Posts: 159
Karma: 24430
Join Date: Mar 2012
Location: Australia
Device: Nexus 7"
|
FWIW, capitalizing the first word in a sentence (using Sigil) is quite straightforward. For the example below, I would have defined the CSS classes "indentoff" (or whatever) for the first sentence following a scene break, and "caps" to transform a string to capital letters. So:
Find: <p class="indentoff">(.*?)[space]
This will find the first text string after "indentoff" that is followed by a space. Note: [space] here represents a typed space; the literal text "[space]" is not part of the regex!
Replace: <p class="indentoff"><span class="caps">\1</span>[space]
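To see what that find/replace pair does to a paragraph, here is a rough equivalent in Python's `re` module. It is a sketch under assumptions: Sigil uses its own regex engine, the sample paragraph is invented, and the "caps" class is assumed to be defined in the book's stylesheet as described above.

```python
import re

# Same pattern as the Find box, with a real space typed where [space] stands.
FIND = re.compile(r'<p class="indentoff">(.*?) ')
# Same text as the Replace box; \1 is the first word captured by (.*?).
REPLACE = r'<p class="indentoff"><span class="caps">\1</span> '

# Hypothetical paragraph following a scene break.
html = '<p class="indentoff">the storm broke at dawn.</p>'
print(FIND.sub(REPLACE, html))
```

The lazy `(.*?)` stops at the first space, so only the first word ends up inside the span. The text itself is not modified; the capitals come from the CSS class when the book is rendered.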