Old 08-31-2009, 09:27 AM   #1
Pablo
Guru
Posts: 970
Karma: 4999999
Join Date: Mar 2009
Location: Rosario, Argentina
Device: SONY PRS-505, PRS-T2
Programming language for development

I have some experience in computer programming in several languages (ANSI C/C++, Borland Delphi/Kylix, a little Java and Python in console applications, to name some). I would like to start developing platform independent tools for ebook processing and formatting, but have not decided what language to use.
Learning a new language/development environment is no problem, in fact I have to study a lot before I can start to do something worth sharing.
So I thought maybe some of the great developers here could give me some advice. I have access to Windows and Linux boxes and have used VMWare Server and Virtual PC. Thanks in advance.
Pablo
Old 08-31-2009, 09:29 AM   #2
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Python seems to be the most popular choice.
Old 08-31-2009, 09:40 AM   #3
GRiker
Comparer of the Ephemeris
Posts: 1,496
Karma: 424697
Join Date: Mar 2009
Device: iPad
I'll second Harry's suggestion of Python. Python is free, quick, portable, interactive and multi-platform. Calibre is written in Python, making it possible to write your own plugins or modifications.

G
Old 08-31-2009, 10:02 AM   #4
Lo Zeno
Addict
Posts: 202
Karma: 4379
Join Date: May 2009
Location: Italy
Device: Hanlin V3 (with lBook firmware & OpenInkPot)
Since you already know C++, you can surely use it: C++ is an "immortal" language that always does its work, and its flexibility is unparalleled.

You could also use Java: it is surely platform-independent and has an incredible amount of development tools (both free and non-free) available, tons of APIs, easily available libraries, and so on. It also has a very wide and stable user base, which is always useful.

You could also try C#: if you develop using Mono and GTK+ (instead of WinForms), you end up with a platform-independent program that runs on Windows, Linux and Mac (that's what I use when I need cross-platform programs). The downside is that the open-source community views Mono and C# with suspicion, fearing that Microsoft could at any moment change its stance on C# and the .NET framework and start demanding license fees.
Old 08-31-2009, 10:21 AM   #5
kovidgoyal
creator of calibre
Posts: 43,744
Karma: 22446736
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Python all the way. Highest productivity language I've ever found and I know over twenty.
Old 08-31-2009, 12:28 PM   #6
radius
Lector minore
Posts: 649
Karma: 1738720
Join Date: Jan 2008
Device: Aura One, Samsung Galaxy Tab S5e, Google Pixel Slate
Quote:
Originally Posted by kovidgoyal View Post
Python all the way. Highest productivity language I've ever found and I know over twenty.
I like Python, but I wonder if being implemented in Python is one of the reasons that Calibre is slow?
Old 08-31-2009, 12:33 PM   #7
HarryT
eBook Enthusiast
Quote:
Originally Posted by radius View Post
I like Python, but I wonder if being implemented in Python is one of the reasons that Calibre is slow?
Undoubtedly - it would probably be 10x faster if it were written in C++. However, I don't know about you, but for me Calibre is "fast enough". If it takes 2 minutes to convert a book from Mobi to ePub for me I'm not really that bothered.
Old 08-31-2009, 01:39 PM   #8
ahi
Wizard
Posts: 1,790
Karma: 507333
Join Date: May 2009
Device: none
Quote:
Originally Posted by HarryT View Post
Undoubtedly - it would probably be 10x faster if it were written in C++. However, I don't know about you, but for me Calibre is "fast enough". If it takes 2 minutes to convert a book from Mobi to ePub for me I'm not really that bothered.
Actually Python can be remarkably fast. Certain sorts of arguably heavy processing of a 100 MB text file take less than 30 seconds on my machine... which now and then makes me wonder whether certain parts of Calibre are violating some basic Python best practices.

The most obvious one, which in the past made Python programs of mine that should have finished in under two minutes take literally hours, is the "do not build strings directly" or "do not build strings one character at a time" rule.

Code:
output = ''
for tmpChar in longtext:
    outChar = tmpChar
    # some conditional processing of outChar here
    output += outChar  # each += may copy the whole string built so far
gets incredibly slow incredibly fast as the size of the longtext string grows. A string of size 2X takes far more than twice as long to process as a string of size X.

Code:
outputList = []
for tmpChar in longtext:
    outChar = tmpChar
    # some conditional processing of outChar here
    outputList.append(outChar)
output = ''.join(outputList)  # build the final string once, at the end
however, is consistently fast; longer strings take only proportionally longer to process.
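
The difference between the two patterns above can be measured rather than assumed. The following is a minimal sketch in Python 3; the function names, the sample sizes and the tab-to-space "processing" step are illustrative assumptions, not code from calibre or from the post. Note that CPython can sometimes extend a string in place when `+=` is used on a local variable, so on some interpreters and versions the gap may be small or absent; timing both on your own data is the only reliable check.

```python
import timeit


def build_with_concat(longtext):
    """Build the result with repeated += (the pattern described as slow)."""
    output = ''
    for ch in longtext:
        out_ch = ' ' if ch == '\t' else ch  # stand-in for conditional processing
        output += out_ch
    return output


def build_with_join(longtext):
    """Collect the pieces in a list and join them once at the end."""
    pieces = []
    for ch in longtext:
        out_ch = ' ' if ch == '\t' else ch  # same stand-in processing
        pieces.append(out_ch)
    return ''.join(pieces)


if __name__ == '__main__':
    # Doubling the input shows how each approach scales with size.
    for size in (10_000, 20_000, 40_000):
        sample = 'x\t' * (size // 2)
        t_concat = timeit.timeit(lambda: build_with_concat(sample), number=5)
        t_join = timeit.timeit(lambda: build_with_join(sample), number=5)
        print(f'{size:>6} chars  +=: {t_concat:.4f}s  join: {t_join:.4f}s')
```

If the `+=` column grows much faster than the `join` column as the size doubles, the interpreter is copying the string on each append; if both grow at about the same rate, the in-place optimization is kicking in.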

Admittedly, I doubt this is news to Kovid... but I imagine there is more than one such pitfall in Python where the obvious way is suboptimal for processing-heavy code. Might something like this be acting as a bottleneck in some of your conversion scripts, Kovid?

- Ahi
Old 08-31-2009, 03:02 PM   #9
kovidgoyal
creator of calibre
Ah, everyone's favorite debate. If you want to get into it seriously, I suggest being more precise. When you say calibre is "slow", what do you mean?

1) Some people complain that the GUI is slow. Other people say that the GUI is fast enough.
2) Some people claim it is slow with large databases, other people say it is fast enough.
3) Some people say it is slow with conversion, other people...
4) Some people say it is slow with news download, ...
5) Startup times are slow

Now, as the person who knows the most about this (and yes, ahi, I am well aware of those and lots of other pitfalls in string processing), let me shed some light on each of these problems.

1) Almost all the actual GUI rendering happens in compiled (C++) code
2) Calibre actually stores all the metadata in memory, so this is not a database/python/string processing problem. Though I do admit that the performance here can be optimized. In fact, that's on my agenda as part of the database independence refactoring for 0.7.
3) The biggest bottleneck in conversion is parsing of CSS. calibre uses a CSS library that uses regular expressions to parse CSS (the regular expressions are again in compiled code). This library's performance can be improved, but I am not going to sit down and re-write it.
4) This is slow because the browser simulation library that calibre uses is not multi-threaded, so all actual downloads happen in a single thread. Until I get around to fixing that library,...
5) This is actually one place where using python hurts. But as far as I am concerned, it's worth it.
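
Point 4 above, downloads running one after another in a single thread, is the kind of I/O-bound work where a thread pool usually helps even in Python, because threads waiting on the network do not hold up each other. The sketch below is only an illustration under stated assumptions: `fake_fetch`, `fetch_all` and the example URLs are hypothetical, the "download" is simulated with a short sleep, and nothing here is calibre's code or the API of the browser simulation library it uses.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_fetch(url):
    """Hypothetical stand-in for a network request: sleeps instead of downloading."""
    time.sleep(0.05)
    return f'contents of {url}'


def fetch_all(urls, max_workers=4):
    """Fetch every URL with a pool of worker threads, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_fetch, urls))


if __name__ == '__main__':
    urls = [f'http://example.invalid/article/{i}' for i in range(8)]

    start = time.perf_counter()
    sequential = [fake_fetch(u) for u in urls]
    print(f'sequential: {time.perf_counter() - start:.2f}s')

    start = time.perf_counter()
    threaded = fetch_all(urls)
    print(f'threaded:   {time.perf_counter() - start:.2f}s')
```

Whether this pattern can be applied in practice depends on the download library being safe to call from several threads at once, which is exactly the limitation described above.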
Old 08-31-2009, 03:16 PM   #10
ahi
Wizard
Quote:
Originally Posted by kovidgoyal View Post
Ah, everyone's favorite debate. If you want to get into it seriously, I suggest being more precise. When you say calibre is "slow", what do you mean?
The part I intend to focus on is strictly file conversion.

I could be wrong... perhaps I'll write down some specific filesize/processing type/length of time data later this week... it seems to me I can quite heavily process several (5+, 10+) MB RTF files with Python in the same amount of time it takes to convert an under 2 MB file from one reflow format to another... which is basically an HTML-ish to HTML-ish type conversion. No?

If you are aware of the various pitfalls, doubtless there is good reason why things take as long as they do... but it always felt oddly (though not disruptively) slow to me.

- Ahi
Old 08-31-2009, 03:18 PM   #11
kovidgoyal
creator of calibre
A good point about RTF, I forgot to mention that. RTF conversion is slow in calibre, but that's largely because of a rather badly designed RTF processing library calibre uses. I've always been meaning to swap it out for another, but never get around to it. Any volunteers?
Old 08-31-2009, 05:38 PM   #12
Pablo
Guru
Quote:
Originally Posted by kovidgoyal View Post
Python all the way. Highest productivity language I've ever found and I know over twenty.
OK, I'll give it a try..... thank you very much for your help. By the way, Calibre is GREAT.
Old 08-31-2009, 06:42 PM   #13
igorsk
Wizard
Posts: 3,442
Karma: 300001
Join Date: Sep 2006
Location: Belgium
Device: PRS-500/505/700, Kindle, Cybook Gen3, Words Gear
Kovid, have you done any profiling of Calibre? From my experience bottlenecks often happen in places where you least expect them...
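
For readers who have not profiled Python code before, the standard library's `cProfile` and `pstats` modules are enough to find where the time goes. This is a minimal sketch; `slow_step`, `fast_step` and `convert` are hypothetical stand-ins for a conversion pipeline and are not taken from calibre.

```python
import cProfile
import io
import pstats


def slow_step(n):
    """Deliberately expensive step, so it shows up in the profile."""
    return sum(i * i for i in range(n))


def fast_step(n):
    """Cheap step, for contrast."""
    return n * 2


def convert(n=200_000):
    """Hypothetical stand-in for a conversion pipeline."""
    return slow_step(n) + fast_step(n)


def profile_report(func, top=5):
    """Run func under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats('cumulative').print_stats(top)
    return buffer.getvalue()


if __name__ == '__main__':
    print(profile_report(convert))
```

Sorting by cumulative time puts the functions that account for most of the run at the top of the report, which is usually the quickest way to check whether a suspected bottleneck is the real one.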
Old 08-31-2009, 10:12 PM   #14
kovidgoyal
creator of calibre
Quote:
Originally Posted by igorsk View Post
Kovid, have you done any profiling of Calibre? From my experience bottlenecks often happen in places where you least expect them...
From time to time, though not nearly as often and thoroughly as I should. Too many new features to implement.
Old 09-01-2009, 02:53 AM   #15
HarryT
eBook Enthusiast
Quote:
Originally Posted by kovidgoyal View Post
3) The biggest bottleneck in conversion is parsing of CSS. calibre uses a CSS library that uses regular expressions to parse CSS (the regular expressions are again in compiled code). This library's performance can be improved, but I am not going to sit down and re-write it.
As I said, I have no "complaints" at all - the speed is generally perfectly acceptable, and I've never known anyone fix reported bugs as rapidly as you do, so kudos for that!

Conversion speed from Mobi to ePub seems to be remarkably inconsistent. All my books are created via the same route: BD -> HTML -> Mobi Creator -> Calibre, and yet sometimes I have two books which are pretty much the same size, and not obviously of different "complexities", and yet one will convert in 15 seconds and the other will take 15 minutes. Not a problem - just an observation.