Old 08-07-2013, 10:31 PM   #1
blackjag
Junior Member
Posts: 4
Karma: 10
Join Date: Aug 2013
Device: Amazon Kindle
Job Manager

Hello, to start with I'd like to thank you guys for an amazingly complete book management system. I'm primarily a Java programmer, working on telemetry and event processing engines. I've worked on a few open source projects before, like InstallJammer and IzPack, and I'm looking for a new project to work on.
Like many others, I've been looking at improving the web functionality of Calibre, since this seems the easiest way to make Calibre available to multiple users. Specifically, I'd like the ability to edit metadata or convert books through the built-in content server. Adding new dialogs and Ajax hooks to do this is easy, and it's not too hard to hack in basic metadata editing. I've run into a few problems with kicking off calibre jobs, though.

The current job manager (gui2/jobs.py) is integrated with the Qt code for viewing and managing the jobs. Before I continue down the rabbit hole of rewriting this stuff to separate it out into more of an MVC setup, I wanted to ask what the current plans for this are. I'd love to contribute towards separating the GUI from the tasks and actions, so that a more complete web GUI could be made. Also, implementing a plugin system for adding features to the content server would hopefully let people focus on improving that rather than on creating yet another standalone web server.
Old 08-07-2013, 11:08 PM   #2
kovidgoyal
creator of calibre
Posts: 45,345
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
There are various things that are on my TODO list for the CS:

1) The content server can run in two contexts: embedded in the GUI and as a standalone process. When it is embedded in the GUI, you have to set up some mechanism for updating the GUI when the CS makes changes. In particular, for your question, a jobs manager will have to integrate with the GUI job manager when running in the GUI. When running as a standalone process, you need to set up some form of inter-process locking to prevent the GUI from accessing the same library at the same time.

2) The CS needs to be rewritten to use a modern Ajax framework, separating the data from the markup/styling completely, making it easy to theme the server.

3) The CS needs to be refactored to use the new database backend.

4) An in-browser ebook viewer needs to be added to the CS.

5) Authentication and session management need to be added, along with some kind of system for per-user restrictions on viewable libraries/subsets of libraries. We will likely have to move off HTTP digest based auth since the incompetents at Google seem to be unable to implement support for it in Android. This means some kind of HTTP form based auth, which is tricky since we can't use SSL in an application server.

6) The content server needs to get support for viewing multiple libraries. This is particularly tricky when running in the GUI and with multiple user sessions. This may actually prove too hard/resource intensive to do at all.

I will start work on 2 and 3 sometime in the relatively near future. It will most likely entail a complete re-write of the server. My plan is to do 2, 3 and 4 and then move on to adding write capabilities to the server. So I would suggest waiting until I am done with 2 and 3 before proceeding, or you will most likely have to redo a lot of work.

Note that adding write support to the current CS is a bad idea since the db backend is not thread safe, which means you will get data corruption/loss when running embedded in the GUI.
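
As a rough illustration of the inter-process locking mentioned in point 1, here is a minimal sketch of an advisory lock file that is created atomically. The LibraryLock class and the library.lock file name are hypothetical and not anything calibre provides. A real implementation would also have to deal with stale locks left behind by crashed processes.

```python
import os
import tempfile


class LibraryLock:
    """Advisory lock: whoever creates the lock file first owns the library."""

    def __init__(self, library_dir, name="library.lock"):  # file name is hypothetical
        self.path = os.path.join(library_dir, name)
        self.fd = None

    def acquire(self):
        try:
            # O_CREAT | O_EXCL fails atomically if the file already exists
            self.fd = os.open(self.path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            return False
        os.write(self.fd, str(os.getpid()).encode("ascii"))
        return True

    def release(self):
        if self.fd is not None:
            os.close(self.fd)
            os.remove(self.path)
            self.fd = None


def demo():
    with tempfile.TemporaryDirectory() as library_dir:
        gui = LibraryLock(library_dir)
        server = LibraryLock(library_dir)
        first = gui.acquire()      # the GUI gets the library
        second = server.acquire()  # the standalone server is refused
        gui.release()
        third = server.acquire()   # now the server can take over
        server.release()
        return first, second, third
```

Both objects live in one process here only to keep the demo short; the same file-creation check works across processes because the exclusivity is enforced by the filesystem.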

Last edited by kovidgoyal; 08-07-2013 at 11:11 PM.
Old 08-08-2013, 12:36 AM   #3
blackjag
Junior Member
Thanks for responding so completely.
1. I think separating the Qt GUI out using MVC would be a great fit here. If there were a controller layer between the database and the GUI, including a task manager, then both the Qt GUI and the web server could run as equal citizens, accessing the controller framework. It wouldn't matter if either or both were running. As tasks were added from each, they would be put into the same scheduler. This could be as simple as rewriting the job manager, and having any editing tasks from the GUI be added as jobs. Both the GUI and CS would need to be informed of database changes, and refresh. As far as the locking goes, can you assume that SQLite will lock successfully if both processes are on the same system?
2+3. Agreed, sounds good. It would be nice to be able to have plugins to add Ajax hooks and functionality to the server, and JavaScript to access those hooks. Perhaps have known lists like header buttons, per-book options, etc., and let the plugins add HTML snippets to those lists when they add in their Ajax hook.
4. There are a number of JavaScript EPUB viewer libraries, so this seems pretty easy to integrate. If plugins could be added to the CS, then wrapping one or more of the ebook viewers up as a plugin would make it easy to add, update, and replace.
5. I think this could be a place where you could use one of the lightweight CMS frameworks. You'd get the user management, plus quite a lot of other features, but it's overkill and none of the user config would be available to the Qt GUI. If you don't want to do that, it might be better to just let users configure a proxy server to provide authentication to the CS.
6. This would be a great feature, but could be solved by running multiple CS instances. The user would just have to configure each to use a different port and serve a different library.

Re your note on thread safety: unfortunately, that's why I was looking to use the job manager to consolidate any write requests into one thread.
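
To show what consolidating writes into one thread could look like, here is a minimal sketch using a queue drained by a single worker. WriteQueue and set_title are hypothetical names for illustration and say nothing about how the calibre job manager is actually structured.

```python
import queue
import threading


class WriteQueue:
    """Funnels all write requests through one worker thread."""

    def __init__(self):
        self._requests = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        while True:
            item = self._requests.get()
            if item is None:  # sentinel: shut down
                break
            func, args = item
            func(*args)

    def submit(self, func, *args):
        self._requests.put((func, args))

    def close(self):
        self._requests.put(None)
        self._worker.join()


def demo():
    titles = {}
    writer_threads = set()

    def set_title(book_id, title):
        writer_threads.add(threading.current_thread().name)
        titles[book_id] = title

    writes = WriteQueue()
    # Requests arrive from several threads (say, the GUI and the content server)
    callers = [
        threading.Thread(target=writes.submit, args=(set_title, i, "Book %d" % i))
        for i in range(5)
    ]
    for t in callers:
        t.start()
    for t in callers:
        t.join()
    writes.close()
    # Every write ran on the single worker thread
    return titles, len(writer_threads)
```

Because only the worker ever touches the data, the backend itself would not need to be thread safe; the cost is that callers no longer get the result of a write synchronously.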
Old 08-08-2013, 03:24 AM   #4
kovidgoyal
creator of calibre
The communication only needs to be one way, i.e. CS->GUI not the other way around. The CS does not maintain live views of data, unlike the GUI, therefore the db layer takes care of GUI->CS updating automatically.

Jobs are asynchronous, and making every single write operation asynchronous is never going to fly. The performance and usability would absolutely suck. Imagine having to launch a job every time you edit a single title in the book list.

CMS frameworks won't work. They will use bog standard HTTP form based authentication, which will not work for us, since we can't use SSL. Also, the CS offers a much more dynamic view of the underlying database than the typical CMS framework even dreams of.

In any case, auth issues are mostly orthogonal to the rest. They can be addressed after the basic refactoring is done. As long as the refactoring keeps session management in mind, the exact details of the implementation can be switched around later.

For the viewer, I will simply use the same code I developed for the calibre viewer, which is tested, works, is well designed and most importantly maintainable by me, since I am the one that has to deal with the "my book isn't looking right in your viewer" bug reports.
Old 08-09-2013, 02:44 PM   #5
blackjag
Junior Member
I disagree that making jobs asynchronous would bring down performance. A proper job executor would allow for far more performance optimization while abstracting the implementation from plugin developers. Need device jobs to execute synchronously? Then have a separate single-thread executor for each device that's added. As jobs for a specific device id are added, simply throw them on the end of the queue for that device. This sort of executor allows you to decide what needs to be synchronous and what doesn't and, best of all, gets every write operation off the GUI event dispatch thread.
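
A minimal sketch of the per-device executor idea, written in Python rather than Java and using the standard concurrent.futures module. DeviceExecutors and the device ids are hypothetical, and submit() is assumed to be called from a single thread, so the dictionary needs no locking here.

```python
from concurrent.futures import ThreadPoolExecutor


class DeviceExecutors:
    """One single-thread executor per device id, so jobs for a device run in order."""

    def __init__(self):
        self._executors = {}

    def submit(self, device_id, func, *args):
        executor = self._executors.get(device_id)
        if executor is None:
            # max_workers=1 makes this a FIFO queue for that device
            executor = self._executors[device_id] = ThreadPoolExecutor(max_workers=1)
        return executor.submit(func, *args)

    def shutdown(self):
        for executor in self._executors.values():
            executor.shutdown(wait=True)


def demo():
    log = {"kindle": [], "ipad": []}
    jobs = DeviceExecutors()
    for n in range(3):
        jobs.submit("kindle", log["kindle"].append, n)
        jobs.submit("ipad", log["ipad"].append, n * 10)
    jobs.shutdown()
    return log
```

Jobs for different devices may interleave with each other, but each device sees its own jobs strictly in submission order.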
To use your example of editing a book's title, perhaps the edit could happen on the database cache and release the event dispatch thread to allow the user to continue using the program. Then, asynchronously, edits to the book itself and the write out of the database change could be added to the executor. The edit is instantaneous as far as the user is concerned, and the i/o can happen as fast as it can without blocking the user. Since the database view has already been updated, the user can proceed just like the entire operation has already completed. This might involve using a task class for i/o that's lighter weight than the current job class.
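
The title-edit flow described above could be sketched roughly as follows: update the in-memory cache first, then queue the slow write. CachedLibrary and slow_write are hypothetical stand-ins, not calibre classes, and the sketch ignores what should happen if the deferred write fails after the cache has already changed.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CachedLibrary:
    """Hypothetical cache that is updated immediately and persisted later."""

    def __init__(self, slow_write):
        self.cache = {}
        self._slow_write = slow_write
        self._io = ThreadPoolExecutor(max_workers=1)  # keeps writes ordered

    def set_title(self, book_id, title):
        self.cache[book_id] = title  # visible to the UI right away
        return self._io.submit(self._slow_write, book_id, title)

    def flush(self):
        self._io.shutdown(wait=True)


def demo():
    disk = {}
    gate = threading.Event()

    def slow_write(book_id, title):
        gate.wait()  # stand-in for slow disk I/O
        disk[book_id] = title

    lib = CachedLibrary(slow_write)
    lib.set_title(1, "Dune")
    # The cache already shows the edit while the write is still pending
    seen_before_flush = (lib.cache.get(1), disk.get(1))
    gate.set()
    lib.flush()
    return seen_before_flush, disk
```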
That setup makes the most sense to me, but I can also see how you would be more comfortable just adding in hooks to have the CS inform the GUI of each change and not change up the existing code too much. I'm not trying to get you to rewrite the entire codebase, just trying to get a feel for some of your architectural decisions and future plans. I'd love to help and if any of these ideas sound good to you, I can upload some UML or a code prototype to better communicate what I'm talking about.
In any case, all write operations from the CS should be asynchronous. Also, I would argue that the CS needs a live data cache as much as the GUI, since it needs to support a fair number of users simultaneously browsing and downloading.
Also, I'm not sure what you mean about the CS being far more dynamic than a CMS. Are you just talking about the ease of adding and removing metadata and columns? In any case, I agree that a CMS isn't a very good fit, but if you really want to set up multiple users with permissions on a per-book or per-library basis, it seems silly to try to write that from scratch instead of just using the parts of an existing system that might fit.
Old 08-09-2013, 10:31 PM   #6
kovidgoyal
creator of calibre
There is absolutely no way that the process of creating a job, running it in a separate thread, updating the UI to show that a job is running, calling the actual update value code, updating the UI to show the job has finished and finally transferring the result back to the GUI thread could possibly be as performant as simply calling the update code in the same thread, no matter what you do in terms of "asynchronous job dispatch". Not to mention that in Python only one CPU bound thread can ever execute at a time.
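
One rough way to check this on a given machine is to time a direct call against the same call dispatched to a fresh thread. This is only a sketch: update_value is a hypothetical stand-in for the real update code, and the threaded variant leaves out the UI updates and result transfer listed above, so it understates the real cost of a job.

```python
import threading
import timeit


def update_value(store):
    store["title"] = "New Title"


def sync_update(store):
    update_value(store)


def threaded_update(store):
    # Create a thread, run the update there, and wait for it to finish
    worker = threading.Thread(target=update_value, args=(store,))
    worker.start()
    worker.join()


def measure(repeats=200):
    store = {}
    sync_time = timeit.timeit(lambda: sync_update(store), number=repeats)
    threaded_time = timeit.timeit(lambda: threaded_update(store), number=repeats)
    return sync_time, threaded_time
```

Calling measure() returns the total seconds spent in each variant; the absolute numbers depend entirely on the machine and Python version.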

The database backend already maintains a live view of the data in your sense of live. My meaning of live simply refers to the fact that unlike a GUI, HTTP is a pull, not push interface, or in other words, web pages do not typically update automatically when the underlying database updates. They only update in response to some user action.

As for not rewriting stuff, that obviously makes sense, prima facie, but in my experience shoehorning a software stack onto a problem that is a bad fit, just to avoid rewriting some code, ends up costing much more time in maintenance over the long term.
Old 08-10-2013, 03:37 AM   #7
blackjag
Junior Member
Forgive my Java terminology. If you're on Python 2.7, you'd need the multiprocessing package instead of threads to speed up the jobs. Same design patterns, just more complicated implementations. The point I was trying to make was about perceived performance for the user. When you block the event dispatch thread, the user has to wait for you to do all the editing I/O. If your change happens in the database cache immediately and the I/O is then forked off, the GUI can proceed and the user can continue to manipulate the library while a separate worker takes care of the slow I/O. If you're using the new global interpreter lock implementation in Python 3, this doesn't give you any absolute speed advantage, but the worker thread will release the GIL on I/O wait, returning it to the event dispatch thread, I believe. From the user's perspective, it becomes far faster and allows them to start setting up the next edit while the software finishes writing out the previous edit. The current setup works decently well on my dev system (2 SSDs in RAID 0), but becomes painful when trying to edit a library on a thumb drive or a 5400 rpm laptop drive.
BTW, the GIL drives me crazy in the same way that the Java VM probably drives Java developers crazy the first time they run out of heap space. Almost all my Python experience is on Jython, which lacks the GIL, so I've only had to deal with it on one embedded project.
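
The point about the GIL being released while a worker waits on I/O can be seen in a small sketch. Here time.sleep stands in for a blocking disk write (CPython releases the GIL for both), and the busy loop stands in for the event dispatch thread; neither is real GUI code.

```python
import threading
import time


def slow_io(done):
    time.sleep(0.2)  # stand-in for a blocking write; the GIL is released while waiting
    done.set()


def demo():
    done = threading.Event()
    worker = threading.Thread(target=slow_io, args=(done,))
    worker.start()
    ticks = 0
    # The "event loop" keeps running while the worker is blocked
    while not done.is_set():
        ticks += 1
    worker.join()
    return ticks
```

The returned tick count is how many loop iterations the main thread managed while the worker was waiting. It would stay near zero only if the worker held the GIL for the whole wait.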
Old 08-10-2013, 08:38 AM   #8
kovidgoyal
creator of calibre
You cannot use multiprocessing. On Unix, multiprocessing does a fork() without exec(), which is totally broken for something as complex as calibre. multiprocessing is really a bit of a joke of a module.

And on Windows, launching a separate worker process that has to load Python and then load the db would take *seconds* on a typical low powered notebook.

You need to step back from the abstract async vs. sync philosophical debate and think about actual concrete implementation.

A sync implementation updates the database in under 0.1 seconds, which is imperceptible to the user, even though the event loop might in fact be blocked for that 0.1 second. An async implementation will result in the value *the user sees* not changing for more than 0.1 seconds, and for ~ 1 second on slower computers, which is absolutely unacceptable.
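
To put a number on the process launch cost for a particular machine, a sketch like the one below times the start of a fresh interpreter through the standard subprocess module, which launches a new program rather than continuing from a forked copy of the current process state. The worker here is a trivial one-liner, an assumption made for brevity; a real worker would also have to import calibre and open the db, so this gives only a lower bound.

```python
import subprocess
import sys
import time


def time_worker_launch():
    """Start a fresh Python interpreter and time how long a trivial job takes."""
    start = time.perf_counter()
    result = subprocess.run(
        [sys.executable, "-c", "print(2 + 2)"],
        capture_output=True,
        text=True,
        check=True,
    )
    elapsed = time.perf_counter() - start
    return result.stdout.strip(), elapsed
```

Comparing the elapsed time against the 0.1 second budget mentioned above shows how much of that budget is gone before the worker has done any useful work.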
Old 11-12-2013, 07:30 AM   #9
tdack
Junior Member
Posts: 1
Karma: 10
Join Date: Nov 2013
Device: iPads, Galaxy Tabs
Quote:
Originally Posted by kovidgoyal
There are various things that are on my TODO list for the CS:
5) ... We will likely have to move off HTTP digest based auth since the incompetents at google seem to be unable to implement support for it in Android. This means some kind of http form based auth, which is tricky since we cant use ssl in an application server.

For those on Apache 2.3+ you can use mod_auth_form to provide authentication that works with Android devices - at least it works with the Galaxy Tab-10.1 V2 & V3 I have.
Old 11-12-2013, 06:28 PM   #10
At_Libitum
Addict
Posts: 265
Karma: 724240
Join Date: Aug 2013
Device: KyBook
Apologies for perhaps going a bit OT here, but since I happen to be buried deep in COPS atm, I am slightly puzzled by the possible pitfalls mentioned for point 6. This is because that is exactly what COPS allows me to do, and the remarks sound a bit ominous. I haven't noticed any ill side effects from this so far, but would like to know if this could create issues. COPS only accesses the dbs for reading, so I don't really expect issues. At most a deadlock situation could occur, but then I would need to be using both COPS and Calibre at the same moment, accessing the same books. That is highly unlikely for me, so that alone diminishes any such risk, I think.

Nevertheless, always better to know in advance what the problems might be.

On the subject of CS and theming: would this somehow include some kind of interface for JS/PHP so users could easily create their own custom frontends such as COPS?
Old 11-12-2013, 09:25 PM   #11
kovidgoyal
creator of calibre
Read-only access to the database can cause no problems. The potential problems happen when you have multiple processes wanting write access to the same library.
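
For a third-party reader that wants to be sure it never writes, one option is to open the SQLite file in read-only mode through a URI. This is a minimal sketch using the standard sqlite3 module; the temporary metadata.db and its books table are a toy stand-in created for the demo, not the real calibre schema.

```python
import os
import pathlib
import sqlite3
import tempfile


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "metadata.db")
        # Set up a tiny database to stand in for a library
        rw = sqlite3.connect(path)
        rw.execute("CREATE TABLE books (id INTEGER PRIMARY KEY, title TEXT)")
        rw.execute("INSERT INTO books (title) VALUES ('Dune')")
        rw.commit()
        rw.close()

        # mode=ro asks SQLite to refuse any write on this connection
        uri = pathlib.Path(path).as_uri() + "?mode=ro"
        ro = sqlite3.connect(uri, uri=True)
        try:
            titles = [row[0] for row in ro.execute("SELECT title FROM books")]
            try:
                ro.execute("INSERT INTO books (title) VALUES ('Emma')")
                write_refused = False
            except sqlite3.OperationalError:
                write_refused = True
        finally:
            ro.close()
        return titles, write_refused
```

With the connection opened this way, an accidental write fails with an error instead of competing with another process for write access.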