#1 |
Bookmaker
Posts: 427
Karma: 2143650
Join Date: Sep 2010
Device: Cybook Opus
|
Converting Capitals -> Italics
I'm looking at a Project Gutenberg text that's used capital letters to stand in for italics, and I'd like to change it to display those italics instead. Is there a way to use regular expressions to change all words in all-caps to lower-case and slap a <i></i> tag around them? (I guess I'd have to remove it from every use of "I", but that's a simple find-replace.)
|
#2 |
frumious Bandersnatch
Posts: 7,543
Karma: 19001583
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
|
I wouldn't use a global search and replace, since words in all-caps can also be legitimate, or Roman numerals, or small-caps...
|
#3 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
I'd suggest doing it manually, too.
You can find the capitals by doing a regex search for "[A-Z][A-Z]"; that will find any two consecutive capital letters. This kind of thing is a pretty standard part of converting PG texts. It's complicated by the fact that some of the words in capitals should probably remain capitals; the only way you'll find those is during the next stage of your eBook conversion: a thorough proof-read against a page scan or printed edition.
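For anyone who prefers to run the search outside an editor, here is a minimal Python sketch of the same idea. It is only an illustration: the pattern is widened from "[A-Z][A-Z]" to "[A-Z]{2,}" so that each run of capitals is reported once, and the function name `find_caps` and the sample sentence are made up for the example.

```python
import re

# Variant of the "[A-Z][A-Z]" search: {2,} reports each run of capitals once.
CAPS_RUN = re.compile(r"[A-Z]{2,}")

def find_caps(text, context=20):
    """Return (offset, word, snippet) for each run of capitals, for manual review."""
    hits = []
    for m in CAPS_RUN.finditer(text):
        start = max(0, m.start() - context)
        end = min(len(text), m.end() + context)
        hits.append((m.start(), m.group(0), text[start:end]))
    return hits

sample = "He said it was VERY late, about 10 PM."
for offset, word, snippet in find_caps(sample):
    print(offset, word, repr(snippet))
```

The output is a list to go through by hand; nothing in the text is changed.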
#4 |
Dylanologist
Posts: 200
Karma: 146754
Join Date: Apr 2010
Location: Hanover, New Hampshire, USA
Device: none/all/any
|
|
#5 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
I'm pretty sure that Sigil supports regex searches, yes. Perhaps someone can confirm?
|
#6 |
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
|
#7 |
Guru
Posts: 972
Karma: 4999999
Join Date: Mar 2009
Location: Rosario, Argentina
Device: SONY PRS-T2, Kindle Paperwhite 11th gen
|
Sigil supports regex searches. In "book view" the search is restricted to the current HTML file; in "code view" you can search in all HTML files, but in this case that would be useless, as the HTML tags are included in the search.
In short, in this case you have to repeat the search for each HTML file in your ePub.
#8 | |
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
|
Quote:
|
|
#9 |
Guru
Posts: 972
Karma: 4999999
Join Date: Mar 2009
Location: Rosario, Argentina
Device: SONY PRS-T2, Kindle Paperwhite 11th gen
|
|
#10 |
Enthusiast
Posts: 39
Karma: 10
Join Date: Mar 2009
Device: Kindle 3, Motorola Droid with Aldiko
|
I'm not a regex guru, but I think this search/replace should at least insert the italics tag around capital letters. (I don't know how to convert upper to lower case, however. Perhaps set up a CSS lower-case class for your italics tag if there's no other way?)
Search (case sensitive) for: ([ >“‘])([^a-z]+)([ <”’])
Replace: \1<i>\2</i>\3
It will search for a block of one or more characters containing no lower-case letters (punctuation, numbers and spaces are okay), book-ended by spaces, open/close tag signs, or quotation marks, and then insert the <i></i> around the text. If you know you're only searching for caps, with no numbers or punctuation, then change the middle group to [A-Z]+. Hope this helps. Cheers, Carl.
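On the upper-to-lower-case question: one way around it, if you are willing to run a script on the file instead of using the editor's Find & Replace, is a replacement function. This is only a sketch in Python, not something from Sigil; the name `caps_to_italics` is hypothetical, and the pattern assumes plain ASCII capitals, words of two or more letters, and an optional apostrophe part so that a word like DON'T stays in one piece.

```python
import re

# Whole words of two or more ASCII capitals, optionally followed by an
# apostrophe and more capitals (DON'T). A lone "I" is never matched.
CAPS_WORD = re.compile(r"\b[A-Z]{2,}(?:['’][A-Z]+)*\b")

def caps_to_italics(text):
    """Lower-case each all-caps word and wrap it in <i></i>."""
    return CAPS_WORD.sub(lambda m: "<i>" + m.group(0).lower() + "</i>", text)

print(caps_to_italics("I said it was VERY late."))
print(caps_to_italics("You DON'T mean it"))
```

Because a single capital never matches, "I" is left alone. The sketch does not know about HTML, so any upper-case tag or attribute names in the file would be changed too; check the result before saving.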
#11 |
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
|
An automated search and replace is very dangerous for something like this. There are probably capitals that you DON'T want to turn to italics (e.g. things like "AM" or "PM" in a time). I really would recommend using a search to find the capitals, but deciding on a case-by-case basis what to do with each one.
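If a script is used anyway, the exceptions can at least be made explicit. The sketch below is an illustration in Python, not a substitute for reviewing each hit: the skip list `KEEP_AS_IS` is a hypothetical starting point, and the Roman-numeral check is deliberately loose, so it also protects ordinary words spelled only with those letters (MIX, DID, CIVIL).

```python
import re

CAPS_WORD = re.compile(r"\b[A-Z]{2,}\b")
# Hypothetical skip list; extend it after reviewing your own text.
KEEP_AS_IS = {"AM", "PM"}
# Loose check: any word made only of Roman-numeral letters is left alone.
ROMAN = re.compile(r"[IVXLCDM]+")

def convert(text, keep=KEEP_AS_IS):
    """Italicise all-caps words except skip-list entries and Roman numerals."""
    def repl(m):
        word = m.group(0)
        if word in keep or ROMAN.fullmatch(word):
            return word  # leave legitimate capitals untouched
        return "<i>" + word.lower() + "</i>"
    return CAPS_WORD.sub(repl, text)

print(convert("At 10 PM, in Chapter XII, she was VERY tired."))
```

Anything the script leaves alone or changes still needs to be checked against the page scan during the proof-read.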
|
#12 | |
Dylanologist
Posts: 200
Karma: 146754
Join Date: Apr 2010
Location: Hanover, New Hampshire, USA
Device: none/all/any
|
Quote:
I was not familiar with regex searches until this thread. I followed the link from Sigil's documentation page to Regular-Expressions.info, where I discovered a new world of Find & Replace syntax. As I develop searches for my own work, I will keep a file of the regex patterns I've used. This should speed things along. Thanks all. - Fabe
|