Old 11-10-2010, 08:21 AM   #1
Rand Brittain
Bookmaker
Posts: 416
Karma: 2143650
Join Date: Sep 2010
Device: Cybook Opus
Converting Capitals -> Italics

I'm looking at a Project Gutenberg text that's used capital letters to stand in for italics, and I'd like to change it to display those italics instead. Is there a way to use regular expressions to change all words in all-caps to lower-case and slap a <i></i> tag around them? (I guess I'd have to remove it from every use of "I", but that's a simple find-replace.)
Old 11-10-2010, 11:18 AM   #2
Jellby
frumious Bandersnatch
Posts: 7,515
Karma: 18512745
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
I wouldn't use a global search and replace, since words in all-caps can also be legitimate capitals, Roman numerals, or stand-ins for small caps...
Old 11-10-2010, 11:25 AM   #3
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
I'd suggest doing it manually, too.

You can find the capitals by doing a regex search for "[A-Z][A-Z]"; that will find any two consecutive capital letters. This kind of thing is a pretty standard part of converting PG texts. It's complicated by the fact that some of the words in capitals should probably remain capitals; the only way you'll find that is when you carry out the next stage of your eBook conversion - a thorough proof-read against a page scan or printed edition.
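For anyone who wants to try the search outside an editor, here is a minimal Python sketch of the same "[A-Z][A-Z]" pattern. The function name and sample sentence are made up for illustration, and it only covers ASCII capitals.

```python
import re

# Same pattern as the search above: any two consecutive ASCII capitals.
TWO_CAPS = re.compile(r"[A-Z][A-Z]")

def find_caps(text):
    """Return (offset, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in TWO_CAPS.finditer(text)]

sample = "He was VERY sure that I had seen it at 3 PM."
for offset, pair in find_caps(sample):
    print(offset, pair)
```

A four-letter word such as "VERY" is reported as two hits ("VE" and "RY"), and the lone "I" is not reported at all. Searching for \b[A-Z]{2,}\b instead would report each all-caps word once.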
Old 11-10-2010, 01:41 PM   #4
Fabe
Dylanologist
Posts: 200
Karma: 146754
Join Date: Apr 2010
Location: Hanover, New Hampshire, USA
Device: none/all/any
Quote:
Originally Posted by HarryT
You can find the capitals by doing a regex search for "[A-Z][A-Z]"; that will find any two consecutive capital letters.
Can this kind of search be carried out in Sigil?
Old 11-10-2010, 02:02 PM   #5
HarryT
eBook Enthusiast
I'm pretty sure that Sigil supports regex searches, yes. Perhaps someone can confirm?
Old 11-10-2010, 02:36 PM   #6
Valloric
Created Sigil, FlightCrew
Posts: 1,982
Karma: 350515
Join Date: Feb 2008
Device: Kobo Clara HD
Quote:
Originally Posted by Fabe
Can this kind of search be carried out in Sigil?
Quote:
Originally Posted by HarryT
I'm pretty sure that Sigil supports regex searches, yes. Perhaps someone can confirm?
Yes, in both wildcard and regex modes.
Old 11-10-2010, 02:37 PM   #7
Pablo
Guru
Posts: 970
Karma: 4999999
Join Date: Mar 2009
Location: Rosario, Argentina
Device: SONY PRS-505, PRS-T2
Sigil supports regex searches. In "book view" the search is restricted to the current html file, in "code view" you can search in all html files, but in this case it would be useless, as html tags are included in the search.
In short, in this case you have to repeat the search for each html file in your ePub.
Old 11-10-2010, 02:48 PM   #8
Valloric
Created Sigil, FlightCrew
Quote:
Originally Posted by Pablo
Sigil supports regex searches. In "book view" the search is restricted to the current html file, in "code view" you can search in all html files, but in this case it would be useless, as html tags are included in the search.
In short, in this case you have to repeat the search for each html file in your ePub.
All html tags are lowercased in Sigil, so it would still work well in Code View.
Old 11-10-2010, 03:22 PM   #9
Pablo
Guru
Quote:
Originally Posted by Valloric
All html tags are lowercased in Sigil, so it would still work well in Code View.
True; the only false detections would be in lines 2 and 3 at the beginning of each file (DOCTYPE, etc.), plus the book title/author if they are in capitals.
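One way to sidestep the DOCTYPE hits when working on the raw source, sketched in Python. This is an illustration outside Sigil. The tag pattern is deliberately crude, and the sample page is invented.

```python
import re

TAG = re.compile(r"<[^>]*>")         # crude: any <...> span, including <!DOCTYPE ...>
TWO_CAPS = re.compile(r"[A-Z][A-Z]")

def caps_outside_markup(source):
    """Blank out tags with spaces (so offsets stay valid), then search the rest."""
    blanked = TAG.sub(lambda m: " " * len(m.group()), source)
    return [m.start() for m in TWO_CAPS.finditer(blanked)]

page = '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN">\n<p>It was NOT there.</p>'
for offset in caps_outside_markup(page):
    print(offset, page[offset:offset + 2])
```

Searching the untouched source would stop first at "DO" in DOCTYPE; after blanking the tags, only the capitals in the paragraph text are left. A title or author name in capitals inside the body would still be reported.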
Old 11-10-2010, 06:18 PM   #10
Carl314
Enthusiast
Posts: 39
Karma: 10
Join Date: Mar 2009
Device: Kindle 3, Motorola Droid with Aldiko
I'm not a regex guru, but I think this search/replace should at least insert the italics tag around text in capital letters. (I don't know how to convert upper to lower case, however; perhaps set up a CSS class with text-transform: lowercase for your italics tag if there's no other way?)

Search (case sensitive) for: ([ >“‘])([^a-z]+)([ <”’])
Replace: \1<i>\2</i>\3

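It will search for a block of text which contains at least one character and no lower case (punctuation and numbers are okay; change the + to {2,} to require at least two characters), book-ended by either spaces, open/close tag signs, or quotation marks, and then insert the <i></i> around the text. Note that [^a-z] also matches spaces, so a run of several capitalised words is wrapped as one block.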

If you know you're only searching for caps, with no numbers or punctuation, then change the middle search term to [A-Z]+
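The missing lower-casing step can be done with a replacement function. This is a rough Python sketch, not a Sigil feature. It uses a simpler word-boundary pattern than the search above, and the function name is made up.

```python
import re

# Whole words made of two or more ASCII capitals; a lone "I" is left alone.
CAPS_WORD = re.compile(r"\b[A-Z]{2,}\b")

def caps_to_italics(text):
    """Wrap each all-caps word in <i></i> and lower-case it."""
    return CAPS_WORD.sub(lambda m: "<i>" + m.group().lower() + "</i>", text)

print(caps_to_italics("I was NOT amused."))  # I was <i>not</i> amused.
```

This converts every match blindly, so acronyms and Roman numerals get italicised too, and a word with an apostrophe is split at the apostrophe.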

Hope this helps.

Cheers, Carl.
Old 11-11-2010, 02:47 AM   #11
HarryT
eBook Enthusiast
An automated search and replace is very dangerous for something like this. There are probably capitals that you DON'T want to turn to italics (eg things like "AM" or "PM" in a time). I really would recommend using a search to find the capitals, but deciding on a case by case basis what to do with each one.
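A case-by-case pass can be prepared by listing each candidate with some surrounding text and leaving the decision to a human. The Python sketch below is only an illustration: the KEEP set is a hypothetical skip list seeded with the two examples above, and the names are invented.

```python
import re

CAPS_WORD = re.compile(r"\b[A-Z]{2,}\b")
KEEP = {"AM", "PM"}  # hypothetical skip list; extend it per book

def review_list(text, width=15):
    """Return (word, context) pairs for all-caps words not in KEEP."""
    rows = []
    for m in CAPS_WORD.finditer(text):
        if m.group() in KEEP:
            continue
        start = max(0, m.start() - width)
        rows.append((m.group(), text[start:m.end() + width]))
    return rows

for word, context in review_list("At 3 PM she said it was NOT hers."):
    print(word, "|", context)
```

Nothing is changed in the text; the output is just a checklist to work through by hand.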
Old 11-11-2010, 10:18 AM   #12
Fabe
Dylanologist
Quote:
Originally Posted by HarryT
An automated search and replace is very dangerous for something like this. There are probably capitals that you DON'T want to turn to italics (eg things like "AM" or "PM" in a time). I really would recommend using a search to find the capitals, but deciding on a case by case basis what to do with each one.
I agree with the case-by-case notion. I have used Find Next & Replace to search about 100 instances in a file, and while I would have preferred it all happen correctly and automatically, the peace of mind it gave me was worth it.

I was not familiar with regex searches until this thread. I followed the link from Sigil's documentation page to Regular-Expressions.info, where I discovered a new world of Find & Replace syntax.

As I develop searches for my own work, I will keep a file of the regular expressions I have used. This should speed things along. Thanks all. - Fabe