Old 02-06-2011, 04:12 PM   #16
Perkin
Guru
Posts: 644
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD
Do a search and replace.

Search for
(\s)"(\w)
(copy and paste a backward, i.e. closing, curly quote in place of the ")

Replace with
\1"\2
(copy and paste an opening curly quote in place of the ")

That should do it. Step through a few matches first and, only if you're happy, replace all.

Edit:
Do this on the converted epub, the one with the incorrect 'smartquotes'.
Needless to say, make sure you have a backup.
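For anyone who would rather script this than click through an editor, here is a minimal Python sketch of the same search and replace. It assumes the wrong character is the closing curly quote (U+201D) and the wanted one is the opening curly quote (U+201C); the function name is made up for illustration, and it should only be run on a copy of the text.

```python
import re

OPEN_QUOTE = "\u201c"   # left (opening) curly double quote
CLOSE_QUOTE = "\u201d"  # right (closing) curly double quote

def fix_leading_close_quotes(text):
    """Turn a closing quote that follows whitespace and precedes a word
    character into an opening quote, as in the search/replace above."""
    pattern = r"(\s)" + CLOSE_QUOTE + r"(\w)"
    # \1 and \2 put back the whitespace and the first word character.
    return re.sub(pattern, "\\1" + OPEN_QUOTE + "\\2", text)
```

Like the manual version, this will not catch a wrong quote at the very start of the text, because there is no whitespace in front of it.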
Old 02-07-2011, 08:14 AM   #17
WillAdams
Guru
Posts: 940
Karma: 1760710
Join Date: Feb 2008
Device: Sony PRS-600, Fujitsu Stylistic ST-4121
Unfortunately, automated attempts at this will fail if typed quotes have been used to indicate inches: 5' 2" &c.

Such special characters need special treatment and should be marked as such (and of course for feet and inches one should use the prime and double prime respectively).

TEI has special tags to indicate begin / end quotes which allow for programmatic handling of them. Highly recommended.

William
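As a rough illustration of the prime / double prime suggestion, the sketch below marks the obvious feet-and-inches pattern (digits, apostrophe, optional space, digits, straight double quote) before any quote smartening is attempted. The function name and the pattern are assumptions for illustration only; real texts will contain measurements this does not catch.

```python
import re

PRIME = "\u2032"         # prime, used for feet
DOUBLE_PRIME = "\u2033"  # double prime, used for inches

# digits ' optional-space digits "   e.g. 5' 2"
_FEET_INCHES = re.compile(r"(\d+)'(\s*)(\d+)\"")

def mark_feet_inches(text):
    """Replace typed ' and " in feet-and-inches measurements with primes."""
    return _FEET_INCHES.sub(r"\1" + PRIME + r"\2\3" + DOUBLE_PRIME, text)
```

Once the measurements use primes, a later automated pass over the remaining straight quotes no longer has to guess about them.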
Old 02-07-2011, 11:18 AM   #18
theducks
Grand Sorcerer
Posts: 14,443
Karma: 5567061
Join Date: Aug 2009
Location: The (original) Silicon Valley, USA
Device: Galaxy Tab 2, Astak Pocket Pro, K4NT
Perkin's regex solution does not touch inch marks after numbers:
it only matches whitespace, followed by a quote, followed by a word character.
Old 02-07-2011, 11:44 AM   #19
Perkin
Guru
If you've got places where an inch mark (") has been replaced by a curly double quote,
copy one of those curly quotes and paste it in place of the quote in this search:

(\s\d+?)"(\s)

(That is: whitespace, followed by a number, immediately followed by the quote mark, followed by whitespace.)

Replace with
\1"\2
(using the straight quote mark for inches)
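The same repair can be sketched in Python. This assumes the inch marks were turned into closing curly quotes (U+201D); swap in whichever curly quote your file actually contains. The function name is hypothetical.

```python
import re

CLOSE_QUOTE = "\u201d"  # the curly quote that replaced the inch mark (assumed)

def restore_inch_marks(text):
    """Put a straight " back after a number that is surrounded by whitespace."""
    pattern = r"(\s\d+)" + CLOSE_QUOTE + r"(\s)"
    return re.sub(pattern, '\\1"\\2', text)
```

Note that it needs whitespace after the quote mark, so a measurement at the end of a sentence (followed by a full stop) is left alone and still has to be fixed by hand.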
Old 02-07-2011, 12:01 PM   #20
theducks
Grand Sorcerer
Perkin,
my original post was meant as a positive comment on a feature:
your regex does not pick up those stray inch and foot marks.
Old 02-07-2011, 12:31 PM   #21
Perkin
Guru
@theducks, I knew that, but I wasn't sure about WillAdams's comment and whether he wanted help.
Old 02-08-2011, 12:13 PM   #22
Rajmahid
Enthusiast
Posts: 34
Karma: 2584
Join Date: Jan 2011
Device: Kindle
Actually, I am doing search and replace using an opening smartquote copied and pasted from MS Word (“) in place of the plain-text inch marks, and it works just dandy. But with 93 chapters, all suffering from the same issue, it's taking time. I also have to be especially vigilant about quotes that fall in mid-sentence, or else I have to go back and manually fix the ones I missed. It's also giving me an opportunity to fix some split sentences the original formatter missed. I just wish there was an easier way...

Last edited by Rajmahid; 02-08-2011 at 12:17 PM.
Old 02-08-2011, 01:05 PM   #23
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
I just found a file that exhibited this problem heavily today, and the root cause wasn't unbalanced quotes in the source file. It was actually spaces between the quoted text and the quotes. That really confuses the smartypants function.

A 'bad' example:
" This is bad... "

A 'good' example:
"This is good..."

There isn't really an easy way to fix this automatically with regex - I tried - you just need to go through the text and find places where there is an offending space. You can use a regex to make it simpler though:
"\s+(?=\w)

(?<=\W)\s+"

(?<=\w)\s+"

Just search through the text with those patterns, and replace with " when it's a real match.
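If you want to list the candidates before touching anything, the three patterns can be run from a small Python script. This is only a sketch: the function name and the amount of context shown are arbitrary choices, and it deliberately changes nothing, since every hit still needs a human decision.

```python
import re

# The three search patterns from the post, applied to straight double quotes.
PATTERNS = [
    r'"\s+(?=\w)',   # quote, then space(s), then a word character
    r'(?<=\W)\s+"',  # non-word character, then space(s), then quote
    r'(?<=\w)\s+"',  # word character, then space(s), then quote
]

def find_spaced_quotes(text, context=15):
    """Return sorted (position, snippet) pairs for every pattern match."""
    hits = []
    for pattern in PATTERNS:
        for m in re.finditer(pattern, text):
            start = max(0, m.start() - context)
            hits.append((m.start(), text[start:m.end() + context]))
    return sorted(hits)
```

Expect false positives: the third pattern also matches perfectly normal text such as a space before an opening quote in mid-sentence, which is exactly why the matches have to be reviewed one by one rather than replaced in bulk.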
Old 02-13-2011, 06:27 PM   #24
Billi
Wizard
Posts: 3,207
Karma: 13235489
Join Date: Jun 2009
Location: Berlin
Device: Cybook, iRex, PB, Onyx
Quote:
Originally Posted by Rajmahid View Post
I just encountered, for the first time, an issue with quotation marks appearing inverted -- that is close quotes -- at the BEGINNING of a number of sentences in an ePub when I ran it through Calibre's "smarten punctuation" option. ...

http://www.mobileread.com/forums/sho...+lucia+omnibus

I examined the ePub using Sigil's code view, but saw nothing obviously wrong. Any help with this perplexing issue would be greatly appreciated.
I've looked at the second book mentioned in this thread (The Chronicles of Barsetshire), which has the same problem with Calibre's smarten punctuation. The double and single quotes are set absolutely correctly: every opening mark has a closing mark. In addition, there are a lot of abbreviations like 'Tis or 'Twas or 'oo or 'em or 'cause or 'ere ... that surely cause this kind of problem.

But even after disguising these abbreviations, Calibre won't convert the book properly: some sentences get the right quotation marks and others don't, and I must admit I can see little logic in which quotes Calibre converts correctly and which it doesn't. At the beginning of a paragraph there are mostly backward marks; inside a paragraph, dialogue sometimes starts with the right opening mark and sometimes not. All pairs of single quotes are converted correctly.
I'm not sure if this is really a problem with the file or with Calibre.
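One possible way to "disguise" such abbreviations before smartening is to turn their leading straight apostrophe into a real apostrophe (U+2019) so it can no longer be mistaken for an opening single quote. The sketch below is an assumption about how that could be done, not a description of what was actually tried here; the word list is just the examples from this post and the function name is invented.

```python
import re

# Elided words from the post; extend the tuple for your own book.
ELISIONS = ("tis", "twas", "oo", "em", "cause", "ere")

_ELISION = re.compile(
    r"(?<!\w)'(" + "|".join(ELISIONS) + r")\b", re.IGNORECASE
)

def protect_elisions(text):
    """Replace the straight apostrophe in front of known elisions with U+2019."""
    return _ELISION.sub("\u2019\\1", text)
```

A genuine single-quoted phrase that happens to start with one of these words would be changed too, so the result still needs checking.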

Last edited by Billi; 02-13-2011 at 06:50 PM.
Old 02-14-2011, 02:04 PM   #25
vanpelten
Member
Posts: 19
Karma: 10
Join Date: Feb 2011
Device: Kindle
As a grateful lurker here, I just had to comment on this relatively new problem. Like you, I do not know whether the backward quote phenomena are a Calibre issue or a problem with the books under examination. I'm very glad the Mapp and Lucia books have been 99% corrected by another user, though he had to do it the very hard way. I also downloaded The Chronicles of Barsetshire and had the same experience with quotation marks at the beginning of sentences. There appears to be an even number of quotes on the 10 pages of HTML I looked at, though that is no guarantee that some aren't misplaced and others simply left out.

I hope the author or Calibre takes a look to see what, if anything, can be done.

Other than that, Calibre is one of the finest programs for avid ebook readers. Thank you!
Old 02-15-2011, 02:39 AM   #26
Billi
Wizard
Quote:
Originally Posted by kovidgoyal View Post
"some text, "some quoted text"

Kovid, how exactly does Calibre convert the inch signs?

- Simply by counting, changing every first, third, fifth... sign to an opening quotation mark and every second, fourth, sixth... to a closing one?
- Or by analysing the surrounding characters and applying some kind of script like the one from Rylon in post 12 here?
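To make the first option concrete, here is a minimal Python sketch of pure counting. It is a hypothetical illustration of the strategy asked about, not Calibre's code, and the function name is invented.

```python
def alternate_quotes(text):
    """Naive smartening: odd-numbered " become opening quotes, even-numbered closing."""
    out = []
    opening = True
    for ch in text:
        if ch == '"':
            out.append("\u201c" if opening else "\u201d")
            opening = not opening  # flip for the next quote mark
        else:
            out.append(ch)
    return "".join(out)
```

Run on the quoted example above, which contains three quote marks, the second mark comes out as a closing quote and the third as an opening one, so a single unpaired quote flips everything after it. That is the weakness of counting alone.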
Old 02-15-2011, 03:22 AM   #27
HarryT
eBook Enthusiast
Posts: 63,147
Karma: 41205449
Join Date: Nov 2006
Location: UK
Device: PW2, iPad Retina Mini, iPhone 4, MS Surface Pro, Onyx T68, N7,
Quotes do not, of course, have to be balanced. E.g., this style of quoting is very commonly encountered in novels:

Quote:
John said, "This is the first paragraph...
"And this the second...
"And this the third...
"And this the end of what John said."
i.e., the close quotes are only present in the final paragraph of the quoted speech; the earlier paragraphs have only an open quote. Any "smart quote" software which can't handle this is doomed to failure.
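A quick way to see how often a book does this is to count the straight double quotes per paragraph. The Python sketch below assumes the paragraphs have already been extracted as plain strings; the function name is made up.

```python
def odd_quote_paragraphs(paragraphs):
    """Return the indexes of paragraphs containing an odd number of " marks."""
    return [i for i, p in enumerate(paragraphs) if p.count('"') % 2 == 1]
```

For the four-paragraph speech above it reports the first three paragraphs, all of which are perfectly correct. So an odd count is a hint to look at a paragraph, not proof of an error, and an even count proves nothing either.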
Old 02-15-2011, 08:28 AM   #28
HarryT
eBook Enthusiast
Here are two test files, so we can see whether it's using non-breaking spaces for indentation that causes the problem. Look at the three final stories in the book to find dialogue with quotes.

"test_space.prc" uses non-breaking spaces for indentation.
"test_indent.prc" is the same book using "real" indents.

Does the Calibre "smart quote" feature break for either or both of these?
Attached Files
File Type: prc Test_Indent.prc (737.8 KB, 52 views)
File Type: prc Test_Space.prc (744.2 KB, 56 views)
Old 02-15-2011, 10:23 AM   #29
kovidgoyal
creator of calibre
Posts: 25,652
Karma: 4998489
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
I didn't write this code; calibre just uses the smartypants library. Generally speaking, though, I don't see how it's possible to design an algorithm that gets this transformation right in every case. The algorithm would need to understand the language to get it perfectly right, which is why this option is off by default. If it works for your books, you're lucky; if not, you have to fix them manually.
Old 02-15-2011, 10:47 AM   #30
HarryT
eBook Enthusiast
These test uploads are to try to assist someone who's having trouble with my eBooks, Kovid. It was suggested that the non-breaking spaces at the start of each paragraph may cause problems for the smart-quote algorithm.