Old 10-12-2008, 12:14 PM   #1
ddavtian
Addict
Posts: 271
Karma: 332
Join Date: Nov 2003
Location: San Francisco, USA
Device: Sage, Elipsa, Oasis, Galaxy Tab 8U, S22U
Typos during conversion

Hi guys.

I usually buy MS Reader books at Fictionwise, use ConvertLit, then Calibre to convert them into LRF. Yesterday I noticed one of my books had typos in the LRF file. All "fi" combinations were gone. They were in the LIT file, they are in the EPUB file, but not in the LRF. "first" becomes "rst", "field" is "eld", etc. Also, pictures were not in the lit file but were in epub. I was surprised to see how often these two letters happen together.

This is not a big problem, just wanted to share with you.

David
Old 10-12-2008, 12:37 PM   #2
kovidgoyal
creator of calibre
Posts: 43,835
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
You're saying fi is in the LIT and the EPUB, but not the LRF?
Old 10-12-2008, 01:27 PM   #3
ddavtian
Quote:
Originally Posted by kovidgoyal
You're saying fi is in the LIT and the EPUB, but not the LRF?
I have to correct myself: it is in the LRF when viewing it on my PC with the Calibre viewer. It's not in the LRF file when viewing it in the Sony software on the PC or on the reader. "Infield" becomes "In ield", "fine" is " ine", "first" is " rst", etc.
So Calibre is converting it correctly; Sony is not displaying it.

EPUB is fine on the reader.

Last edited by ddavtian; 10-12-2008 at 01:37 PM.
Old 10-12-2008, 01:52 PM   #4
kovidgoyal
That's because the LIT file uses a special Unicode symbol to represent fi. SONY's LRF viewer's default font can't display that symbol.
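
For anyone who wants to check which symbol is involved, here is a small Python sketch. The function name is made up for illustration. It lists any characters from the Latin ligature range U+FB00 to U+FB06 found in a string, together with their Unicode names; the "fi" ligature is U+FB01.

```python
import unicodedata


def find_ligatures(text):
    """Return (index, code point, Unicode name) for each Latin ligature in text."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if "\ufb00" <= ch <= "\ufb06"  # Latin ligatures in Alphabetic Presentation Forms
    ]


# "Infield" typeset with the single-character fi ligature
print(find_ligatures("In\ufb01eld"))
# [(2, 'U+FB01', 'LATIN SMALL LIGATURE FI')]
```

A font that has no glyph for U+FB01 has nothing to draw at that position, which would match the gap seen in "In ield".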
Old 10-12-2008, 01:57 PM   #5
ddavtian
Thanks Kovid. I'll be reading books in epub then :-)
Old 10-12-2008, 05:38 PM   #6
pilotbob
Grand Sorcerer
Posts: 19,832
Karma: 11844413
Join Date: Jan 2007
Location: Tampa, FL USA
Device: Kindle Touch
Quote:
Originally Posted by kovidgoyal
That's because the LIT file uses a special Unicode symbol to represent fi. SONY's LRF viewer's default font can't display that symbol.
Can your conversion look for these ligature characters and replace them with the two correct characters?

BOb
Old 10-12-2008, 07:23 PM   #7
kovidgoyal
It could, but since there are a very large number of possible ligatures, this would just slow things down a lot.
Old 10-12-2008, 09:59 PM   #8
pilotbob
Quote:
Originally Posted by kovidgoyal
It could, but since there are a very large number of possible ligatures, this would just slow things down a lot.
Really... is Python that slow? Perhaps a utility could do this outside the conversion, and have it be an option.

I think if you build a hash table of ligatures it would be quick to do a lookup. However, I'm not sure how you are doing the conversion. Is this a stream-based process? In .NET you would do this with stream readers and writers. The reader's output is a stream which can be the input to the next reader/writer. The overhead is very small, as the stream is processed from end to end and passed from adapter to adapter as it is processed.

Is there anything like that in Python?
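
For what it's worth, Python generators can be chained in roughly that reader/writer style. The sketch below is only an illustration of the idea, not how Calibre does its conversion: the two-entry table, the function names and the chunk size are all assumptions. The dict plays the role of the hash table, and each stage pulls chunks lazily from the previous one.

```python
import io

# Illustrative subset only; a real table would list every ligature to handle.
LIGATURES = {"\ufb01": "fi", "\ufb02": "fl"}
TABLE = str.maketrans(LIGATURES)  # dict keyed by code point, used for O(1) lookups


def read_chunks(stream, size=4096):
    """Yield successive text chunks from a file-like object opened in text mode."""
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        yield chunk


def replace_ligatures(chunks):
    """Expand ligatures chunk by chunk; each ligature is one code point,
    so a chunk boundary can never split one in half."""
    for chunk in chunks:
        yield chunk.translate(TABLE)


def convert(src, dst, size=4096):
    """Pipe src through the ligature filter into dst."""
    for chunk in replace_ligatures(read_chunks(src, size)):
        dst.write(chunk)


src = io.StringIO("The \ufb01rst \ufb02ight")
dst = io.StringIO()
convert(src, dst)
print(dst.getvalue())  # The first flight
```

Only one chunk is held in memory at a time, so the cost is a single pass over the text plus one dictionary lookup per character.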

BOb
Old 10-12-2008, 10:15 PM   #9
kovidgoyal
The actual replacing happens in C. The problem is the number of ligatures, not the speed of Python.
Old 10-20-2008, 12:15 AM   #10
JSWolf
Resident Curmudgeon
Posts: 73,866
Karma: 128597114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
One solution: after the LIT has had the DRM removed, use lit2oeb to explode it. Then take the HTML, fix the ligatures, and run html2lrf with the --use-spine option on the OPF file, and all will be well.
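
The "fix the ligatures" step could be scripted. The following is a rough sketch under stated assumptions: that the exploded book is a directory of UTF-8 encoded HTML files, and that the list of file extensions is right for your book. Neither is something lit2oeb is known to guarantee, and the function names are made up for illustration. Instead of a hand-written list, it derives the replacements from the compatibility decompositions in Python's unicodedata module.

```python
import unicodedata
from pathlib import Path

HTML_SUFFIXES = {".htm", ".html", ".xhtml"}  # assumed extensions


def build_table(first=0xFB00, last=0xFB04):
    """Map each ligature code point to its plain-letter expansion."""
    table = {}
    for cp in range(first, last + 1):
        # decomposition() returns e.g. '<compat> 0066 0069' for U+FB01
        parts = unicodedata.decomposition(chr(cp)).split()
        table[cp] = "".join(chr(int(h, 16)) for h in parts[1:])
    return table


LIGATURE_TABLE = build_table()


def fix_html_tree(root):
    """Rewrite HTML files under root in place; return the paths that changed."""
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in HTML_SUFFIXES:
            text = path.read_text(encoding="utf-8")  # assumes UTF-8 input
            fixed = text.translate(LIGATURE_TABLE)
            if fixed != text:
                path.write_text(fixed, encoding="utf-8")
                changed.append(path)
    return changed
```

Calling fix_html_tree on the exploded directory before running html2lrf would leave every other file untouched. Work on a copy, since the files are rewritten in place.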
Old 10-20-2008, 12:35 AM   #11
sahlberg
Junior Member
Posts: 9
Karma: 10
Join Date: Oct 2008
Device: prs-505
Quote:
Originally Posted by kovidgoyal
It could, but since there are a very large number of possible ligatures, this would just slow things down a lot.
See http://unicode.org/charts/PDF/UFB00.pdf

I think for this kind of ligature there are only a handful that are common in English.
0xfb00 to 0xfb04 are VERY common.

0xfb05 is rare; I even think there are additional rules that restrict it to some specific words. This one is so rare that I can't even be bothered to google for the specific rules associated with it...

The ligatures fb00 to fb04 are very common and often used, since they look so much better than the individual characters.

Just having an automatic translation of these 5 ligatures would probably cover the vast majority of ligatures a calibre user will ever encounter.
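
As a concrete picture of how small that translation is, here is a minimal Python sketch. It is an illustration only, not the code that ended up in calibre, and the names are made up. The five mappings themselves come straight from the Unicode chart linked above.

```python
# U+FB00 to U+FB04, the five ligatures singled out above.
FIVE_LIGATURES = str.maketrans({
    "\ufb00": "ff",
    "\ufb01": "fi",
    "\ufb02": "fl",
    "\ufb03": "ffi",
    "\ufb04": "ffl",
})


def expand_ligatures(text):
    """Replace each of the five ligature code points with plain letters."""
    return text.translate(FIVE_LIGATURES)


print(expand_ligatures("In\ufb01eld, \ufb01ne, \ufb01rst"))  # Infield, fine, first
```

unicodedata.normalize("NFKC", text) would expand these too, but it also rewrites many other compatibility characters, so a fixed five-entry table is the more conservative choice.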


Best, of course, would be to have the reader updated somehow so that it supports these 5 ligatures, but that might be difficult. There is a reason these ligatures are used in printed media and books: they look much better than the individual characters.


regards
ronnie sahlberg
Old 10-20-2008, 12:57 AM   #12
kovidgoyal
OK, I've added code to replace those five ligatures.