Content Management


acemccloudxx
01-06-2008, 10:55 AM
I'm new to the mobile reader game - but not new to ebooks and quite frankly, I am unhappy with the choices for managing content.

What is the use of being an experienced Java developer if I can't use my skills to solve my own problems?

So I have started a project to "build a better mousetrap". Anyone who is interested in participating, please contact me by email or IM. This will be an open source project using other open source tools and libraries.

Ace

JSWolf
01-06-2008, 11:08 AM
I don't program in Java, but I can help with giving you ideas for some of the features.

acemccloudxx
01-06-2008, 11:14 PM
Well, it will need user testing. Users have no preconceived notions about how the software is supposed to work, so they try to use it in ways that the programmers never imagined.

Ace

RWood
01-07-2008, 12:25 AM
So what do you plan for your "better mousetrap" to do?

Is your "content management" an indexing/cataloging application? Is it a formatting application? What are the design parameters?

BTW: While not a Java programmer, I did spend many years in application system design.

acemccloudxx
01-07-2008, 01:06 AM
1. I want to be able to import a variety of file formats. At this point, I am planning on storing them in a format neutral database.

2. I want to be able to export the same variety of file formats.

3. I want the usual sorts of catalog functions for listing and finding documents.

4. I want to be able to read the documents.

Extra credit

5. I want to be able to create new documents or update existing documents.

6. I want my content automatically backed up to my second drive.

7. I want to be able to manage the content on my reader.

Bonus extra credit

8. I want to be able to share content over the network with friends or coworkers.

kovidgoyal
01-07-2008, 03:39 AM
When you say "import" and "export", I assume you mean convert to/from? And which of these goals are you aiming for in version 0.1?

nairbv
01-07-2008, 06:33 AM
acemccloudxx:
You might want to look at kovidgoyal's libprs500 software first to see if it's what you need, especially since you're using a Sony (which it sounds like it caters to). It doesn't currently have a good mass-import feature, which prevented me from really trying it out much.

There's also "My ebook library", which seems pretty nice at first glance, but lacks what I consider to be a few critical features, and might be Windows-only. What mostly kept me from continuing to use it after a few minutes of playing around was that there was no way to select multiple books and edit the information for them all at once. I expect to be able to select a group of books and say that they are all by some author, etc.

Ideally, I think the software should be something of a cross between Windows Photo Gallery and iTunes.

I'm also a Java programmer (though I'm not opposed to programming in any language), so I might be interested in working on something. I think really good ebook cataloging software will be written eventually though, so I don't want to bother working on something that doesn't seem like it will end up being it.

There is quite a bit of ebook cataloging software out there. It's a big project that should be done well, and it seems like there's a lot of distributed effort in various less-than-ideal projects. Each piece of software I see has one or two of the features I seek implemented, but I feel like none will really be all that useful to me until all features are implemented in one piece of software.

Basically what I'm saying is, spend some time looking around first to make sure there isn't already a project out there that it would be more useful to contribute to, rather than starting another project that does one or two cool things but overall isn't all that useful.

I'm gonna go take a look at "ebook explorer 1.1" and "eKitaab" which I haven't tried yet.

Looking at eKitaab's website now, it might actually be good. It doesn't store a database at all, but just edits the filename to store the author, title, and ISBN. Once that info is in the filename, other software like "My ebook library" will also recognize all the files for what they are.

Oh... and eKitaab is written in Java....... we'll see.
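To make the filename idea concrete, here is a minimal Python sketch of storing author, title, and ISBN in a filename and reading them back. The "Author - Title [ISBN].ext" pattern, the BookInfo type, and both function names are my own assumptions for illustration; I haven't checked what naming scheme eKitaab actually uses.

```python
import re
from pathlib import PurePath
from typing import NamedTuple, Optional


class BookInfo(NamedTuple):
    author: str
    title: str
    isbn: Optional[str]


# Hypothetical naming scheme: "Author - Title [ISBN].ext"; the ISBN part is optional.
_PATTERN = re.compile(
    r"^(?P<author>.+?) - (?P<title>.+?)(?: \[(?P<isbn>[0-9Xx-]+)\])?$"
)


def encode_filename(info: BookInfo, extension: str) -> str:
    """Build a filename that carries the metadata."""
    isbn_part = f" [{info.isbn}]" if info.isbn else ""
    return f"{info.author} - {info.title}{isbn_part}.{extension}"


def decode_filename(filename: str) -> Optional[BookInfo]:
    """Recover the metadata, or return None if the name doesn't follow the scheme."""
    match = _PATTERN.match(PurePath(filename).stem)
    if match is None:
        return None
    return BookInfo(match["author"], match["title"], match["isbn"])
```

The appeal is that the metadata travels with the file, so any other tool that understands the same pattern can pick it up. The obvious weakness is that an author or title containing " - " would confuse a pattern this simple.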

acemccloudxx
01-07-2008, 10:11 AM
Kovidgoyal: I want to keep the documents in a format that implements a superset of the features offered by various formats. A simple, mundane example is quotation marks - some programs differentiate between left and right quotation marks. It's easy enough to downgrade the file to plain quotation marks when exporting to a format that does not differentiate (text, LRS/LRF).
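A rough sketch of the quotation mark downgrade, in Python just to keep it short (the function name is made up for illustration): keep the richer left/right marks in the stored text and flatten them only when exporting to a format that does not distinguish them.

```python
# Map typographic quotation marks to their plain ASCII equivalents.
_PLAIN_QUOTES = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark (also used as an apostrophe)
})


def downgrade_quotes(text: str) -> str:
    """Return a copy of text with curly quotes replaced by straight ones."""
    return text.translate(_PLAIN_QUOTES)
```

Going the other way (guessing left versus right from plain quotes) is the hard direction, which is the argument for storing the superset and downgrading on export.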

I also want a format that is easy to manipulate because I plan on creating my own content. That's why I was thinking of using a database (Derby) with an object relational mapping package (Hibernate). So when I say "import" and "export" - I really do mean that.

As far as my goals for the first implementation - something very minimal, importing at least two formats (LRS and maybe PDF). Export one format (LRF?). A simple GUI that can display the catalog.

nairbv: I agree that there are lots of pieces but no complete solution. I looked at the two items you suggested. I don't like the approach that ekitaab uses, nor does it meet all of my needs. "ebook explorer" is a Windows only product and I am seriously considering making my next computer purchase a Mac.

I have to admit that I would rather contribute to a project than start my own - the synergy of a group effort can be powerful. However, I also want what I want and if nobody else is going in a direction that I like, I'll have to do it myself.

Ace

JSWolf
01-07-2008, 10:13 AM
I was hoping when you said "managing content" that you meant you would be writing the better cataloging program. That's what most of us need. I for one do not need this to do format conversion. I just need a better way to keep track of the content I have.

acemccloudxx
01-07-2008, 11:05 AM
Managing content is definitely one of the major features that I need too. I'm a big iTunes fan and something that can do for my books what it does for music is a big MUST for me.

There's more that I want though. I don't just want to be able to create my own content, I also want to be able to fix typos in some of the content that I have. I also want to be able to keep together things that belong together - like the books of a series.

Ace

tompe
01-07-2008, 11:19 AM
[quote=acemccloudxx]
There's more that I want though. I don't just want to be able to create my own content, I also want to be able to fix typos in some of the content that I have. I also want to be able to keep together things that belong together - like the books of a series.
[/quote]


Why build that into one application?

I do not think it is a good idea to convert a file to another format and not keep the original as the master file. And for DRM'ed files you can probably only edit the meta information and not the text of the book.

acemccloudxx
01-07-2008, 11:40 AM
For me, DRM files are pretty much irrelevant - perhaps 0.1% of my content is DRM. Perhaps the publishers will get their thumbs out of their ... noses and this will change but, since conventional books aren't the only kind of content that I want to manage, I doubt it.

I guess that the motivation for "one tool" is philosophical. Different people have different ways of looking at things. I think that a single, carefully integrated solution can do what I want to do with less hassle than a bunch of individual tools, each with its own peculiarities.

I'm not sure how to answer the "master file" question, other than to say that I wasn't planning on deleting files that the user imports. Again, that may be a philosophical issue - I'm interested in the words, not the file that they're in.

Ace

kovidgoyal
01-07-2008, 12:57 PM
Unless you're wedded to Java, I'd highly recommend checking out libprs500. It's under active development and I'm very open to adding more developers :-)

tompe
01-07-2008, 01:55 PM
[quote=acemccloudxx]
I'm not sure how to answer the "master file" question, other than to say that I wasn't planning on deleting files that the user imports. Again, that may be a philosophical issue - I'm interested in the words, not the file that they're in.
[/quote]

But your import will not be perfect, and you will want to redo it when you add some enhancement.

JSWolf
01-07-2008, 02:01 PM
[quote=acemccloudxx]
Managing content is definitely one of the major features that I need too. I'm a big iTunes fan and something that can do for my books what it does for music is a big MUST for me.

There's more that I want though. I don't just want to be able to create my own content, I also want to be able to fix typos in some of the content that I have. I also want to be able to keep together things that belong together - like the books of a series.

Ace
[/quote]

Please DO NOT base your program on iTunes. Lots of people hate iTunes. And I am one of them. So please just do it from the ground up without any format conversion or storing of the source content.

acemccloudxx
01-07-2008, 03:06 PM
Ok, JSWolf, what about iTunes do you dislike? You have to be specific or else your feedback will not be helpful.

Though there is also the possibility that the things that you hate about iTunes will be things that I like. In the field of Java Integrated Development Environments there are three that I have used. One I HATE, one I LOVE and one just doesn't have the features (though I find it eminently usable otherwise). For many folks the experience is the exact opposite.

For the record, I'm not wedded to Java. It's just that I do know Java and I don't know Python.

Xenophon
01-07-2008, 03:38 PM
@acemccloudxx
Do please consider swotting up enough Python to help out with libprs500. Kovid's doing a great job, but he's only one (amazingly productive) guy. He's already done many of the things you talk about.

I'd be working on picking up some Python to help out myself if I wasn't deep in thesis mode at the moment. Not to mention years past my original planned completion date. If I spend time on much of anything that isn't my thesis I'll quickly become dead meat.

Xenophon

acemccloudxx
01-07-2008, 06:14 PM
Ok, libprs500 seems to be pretty decent software. Is it writing files or data somewhere? If so, where would that be (on Windows)? If it's on the C drive, then that would be a big no-go right there.

However, I really am just not in the mood to learn python.

I guess that I'll have to think about this for a bit.

kovidgoyal
01-07-2008, 06:37 PM
It uses a database that can be stored at a configurable location and it uses temporary files a lot.

Why should writing to the C: drive be a problem?

acemccloudxx
01-07-2008, 06:44 PM
[quote=kovidgoyal]
Why should writing to the C: drive be a problem?
[/quote]

Because the configuration of this computer was created by idiots and there is no safe or easy way to change it. I have a 160GB drive with < 13GB in the C partition and am constantly running out of space.

kovidgoyal
01-07-2008, 07:45 PM
Well you can configure the database path :)

GEBSEWS
01-07-2008, 09:29 PM
Collectorz.com

Great program for keeping up with all types of books including ebooks. You can download the info right into the program with very little typing.

I was having a hard time keeping up with which books I had in each format. Some I read on my IPAQ PDA, some on my Sony PRS505 and I've downloaded a lot for the Kindle that I'm on the waiting list for. You can download a fully functional demo that limits your database to 100 items. I tried it for a while before I purchased. I actually created more than one database with 100 items in each and after I purchased the full version, I was able to merge them together. You can create links to cover images, web pages, and actual ebook files on your hard drive. All you have to do then is sort your list by author, title or format, etc. and then click on the links if you want to locate the file on your hard drive. I've created links from Wikipedia, Manybooks.net, and others that take me directly to that book or author. You can also add books and magazines that have combined content from different authors. Your list can be sorted in many ways. You can check the ones you've read and rate them. You can mark books that you don't own yet. I can go on and on about this. You need to try it.

I've been using MS Access for many years to keep up with my books but have never had the time to make a detailed database like this program. They've thought of everything.

GEBSEWS
Sony PRS505
Kindle Waiting List


[quote=JSWolf]
I was hoping when you said "managing content" that you meant you would be writing the better cataloging program. That's what most of us need. I for one do not need this to do format conversion. I just need a better way to keep track of the content I have.
[/quote]

JSWolf
01-07-2008, 09:43 PM
[quote=acemccloudxx]
Ok, JSWolf, what about iTunes do you dislike? You have to be specific or else your feedback will not be helpful.

Though there is also the possibility that the things that you hate about iTunes will be things that I like. In the field of Java Integrated Development Environments there are three that I have used. One I HATE, one I LOVE and one just doesn't have the features (though I find it eminently usable otherwise). For many folks the experience is the exact opposite.

For the record, I'm not wedded to Java. It's just that I do know Java and I don't know Python.
[/quote]

The thing to do is not to copy iTunes at all. Just start with something new. I quite like a nicely laid out data entry screen.

nairbv
01-08-2008, 04:25 AM
Regarding iTunes: it's obviously not going to be an exact copy of iTunes, but iTunes has a hell of a lot of good ideas and nice features. I'm also curious what in particular you don't like about it though.

Being able to edit the author of a selected list of books, for example, is something that iTunes can do (but with songs of course), and that most ebook software currently out there can't do. Most software expects you to select each book individually. This is lame. If I have a directory with 37 plays of Shakespeare, I should be able to select the 37 files, right click, hit "get info," change the author, and be done with all 37 books. Sure, that doesn't enable me to edit 37 titles in one go, but at least I can make *some* progress in organizing files. I don't understand why so many of the programs I've tried didn't implement this basic feature.
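Under the hood this really is a small feature. A minimal Python sketch, where the Book class and the set_field name are hypothetical and not taken from any existing program:

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Book:
    title: str
    author: str = "Unknown"


def set_field(books: Iterable[Book], name: str, value: str) -> int:
    """Set one metadata field on every selected book; return how many changed."""
    changed = 0
    for book in books:
        if not hasattr(book, name):
            raise AttributeError(f"Book has no field {name!r}")
        if getattr(book, name) != value:
            setattr(book, name, value)
            changed += 1
    return changed
```

The GUI only has to hand the current selection to something like set_field(selection, "author", "William Shakespeare"). Per-book fields such as the title still need one edit each.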

Photo Gallery likewise is another piece of software with lots of good ideas to borrow from, like the way it handles tagging of photos. It also has very usable ways of finding your pictures, like all the "folders" (not all of them actual folders, just buttons that dynamically get a bunch of photos based on tags or other search criteria) for grabbing a set of photos by tag, date, recently imported, etc.
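Those dynamic "folders" are essentially saved searches. A small Python sketch of the idea, with made-up names (Book, tagged, imported_since, open_folder) that are only assumptions for illustration:

```python
from dataclasses import dataclass
from datetime import date
from typing import Callable, Dict, List


@dataclass
class Book:
    title: str
    tags: frozenset = frozenset()
    imported: date = date(2008, 1, 1)


# A "folder" is just a predicate that is evaluated each time it is opened.
Predicate = Callable[[Book], bool]


def tagged(tag: str) -> Predicate:
    """Folder that matches every book carrying the given tag."""
    return lambda book: tag in book.tags


def imported_since(cutoff: date) -> Predicate:
    """Folder that matches books imported on or after the cutoff date."""
    return lambda book: book.imported >= cutoff


def open_folder(folders: Dict[str, Predicate], name: str, library: List[Book]) -> List[Book]:
    """Return the books that currently match the named folder."""
    return [book for book in library if folders[name](book)]
```

Because nothing is copied into the folder, retagging a book moves it between folders automatically the next time they are opened.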

Also, if you copy your files to another computer or try a different piece of cataloging software, in both of these programs you don't have to start over on a data entry process like with some of the ebook software out there. I mean, sure it's a lot more difficult with varying ebook formats, and maybe some of these cataloging programs have some way to export these databases but... still far from ideal.

This is what I like about eKitaab. His solution isn't perfect, but at least it addresses this problem. Metadata updates would be better, where possible.

The best solution would be to have all three options (a cached database, updating embedded metadata, and renaming files whose formats don't support metadata, plus the search functions for fetching new data as in eKitaab and My ebook library), and then let the user override a default configuration. That way the user can decide whether to allow the software to change filenames, update metadata, or just rely on the database. Maybe it wouldn't be built that way initially, but the code could be written in a way that allows adding all three options and the configurability later.
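Roughly what I mean by configurable, as a Python sketch. The Store flags, DEFAULT, and plan_writes are names I made up for illustration; no existing program is claimed to work this way.

```python
from enum import Flag, auto
from typing import List


class Store(Flag):
    DATABASE = auto()   # cache the metadata in the catalog database
    EMBEDDED = auto()   # write the metadata into the ebook file itself
    FILENAME = auto()   # rename the file to carry the metadata


# Hypothetical default: everything allowed; the user can switch parts off.
DEFAULT = Store.DATABASE | Store.EMBEDDED | Store.FILENAME


def plan_writes(allowed: Store, format_has_metadata: bool) -> List[Store]:
    """Decide where one book's metadata gets written under the user's settings."""
    steps = []
    if Store.DATABASE in allowed:
        steps.append(Store.DATABASE)
    if format_has_metadata:
        if Store.EMBEDDED in allowed:
            steps.append(Store.EMBEDDED)
    elif Store.FILENAME in allowed:
        # Renaming is only a fallback for formats with no metadata support.
        steps.append(Store.FILENAME)
    return steps
```

A first version could ship with only Store.DATABASE implemented and still keep the same interface for the other two later.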

@acemccloudxx:

For the sake of re-usability, and since you're focused on Java, maybe a good start would be to work on Java libraries for managing content: for example, a start on a library for performing file conversions, or at least fetching/updating metadata in various ebook file formats.

You'll have to write this code anyway for the sake of importing metadata etc. into your own database, but if you're careful about writing it well, and writing it in an extendable, reusable way, the same code could also fill a gap in the eKitaab software at the same time. You'd be biting off a smaller first chunk, actually getting something useful finished quickly, and still making progress on what you want to eventually build.

If I start writing some code, that's probably what I would focus on. Just writing little libraries that will be useful in a bigger project, and that add functionality that I want.

Likewise, even if you don't work on the other java project, you might be able to pull out some code like for however he does his amazon lookup for fetching ISBN numbers from file names.

This is supposed to be much of why we write OO code anyways right? If enough of the necessary libraries for managing ebook content get written, then anyone can slap together exactly-what-they-want in no time.

I wonder how re-usable the code in libprs500 is.... even for someone who does want to make a custom exactly-what-they-want ebook app, if libprs500 is written in a way that it has a bunch of reusable libraries, it's still a big argument for working on Python book management code. I mean, he's already got convert-damn-near-anything-to-lrf built, right? Even if you don't care about lrf in particular, that at least means that he has a way of extracting data from damn near every file type, which is a big piece to be able to borrow, even if writing a new program.

Of course I'm assuming all this software is licensed in ways that make the code reusable too...

tompe
01-08-2008, 08:48 AM
[quote=nairbv]
Being able to edit the author of a selected list of books, for example, is something that iTunes can do (but with songs of course), and that most ebook software currently out there can't do. Most software expects you to select each book individually. This is lame.
[/quote]

This is why I nearly always prefer command line programs, since with them it is easy to do these kinds of things. The first thing I check in a graphical user interface is how I can do the same operation on many objects and whether I can program scripts or macros. Usually you cannot, and I continue to use command line programs.

recycledelectron
01-08-2008, 10:13 AM
I'm a college prof, and I've got a few students working on a senior project to develop a web-based database app to store a digital library. Send me an eMail, and I'll put them in touch with you.

Andy@RecycledElectrons.com

kovidgoyal
01-08-2008, 12:49 PM
[quote=nairbv]
I wonder how re-usable the code in libprs500 is.... even for someone who does want to make a custom exactly-what-they-want ebook app, if libprs500 is written in a way that it has a bunch of reusable libraries, it's still a big argument for working on Python book management code. I mean, he's already got convert-damn-near-anything-to-lrf built, right? Even if you don't care about lrf in particular, that at least means that he has a way of extracting data from damn near every file type, which is a big piece to be able to borrow, even if writing a new program.

Of course I'm assuming all this software is licensed in ways that make the code reusable too...
[/quote]

I come from the Unix world, which means libprs500 is architected in little pieces, each with its own command line interface, so each piece of functionality is reusable not just in Python programs but in any software. The only exception is the database access, and that's only because I haven't had the time to write the actual command line tool; the underlying library design is completely modular.

In fact, adding complete support for converting any new ebook format to libprs500 requires writing only two converters (format->html and html->format) plus a metadata reading/writing tool. All the other features of libprs500 will work automatically with these three pieces in place.
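As a toy illustration of that design (this is not libprs500's actual code or API; the Registry class and the two toy formats below are invented for the example), using HTML as the common intermediate looks something like this:

```python
import html
import re
from typing import Callable, Dict

Converter = Callable[[str], str]


class Registry:
    """Routes every conversion through HTML as the common intermediate format."""

    def __init__(self) -> None:
        self._to_html: Dict[str, Converter] = {}
        self._from_html: Dict[str, Converter] = {}

    def register(self, fmt: str, to_html: Converter, from_html: Converter) -> None:
        """Adding a format means supplying exactly two converters."""
        self._to_html[fmt] = to_html
        self._from_html[fmt] = from_html

    def convert(self, data: str, source: str, target: str) -> str:
        if source == target:
            return data
        return self._from_html[target](self._to_html[source](data))


def txt_to_html(text: str) -> str:
    """Wrap each blank-line-separated paragraph in a <p> element."""
    return "".join(f"<p>{html.escape(p)}</p>" for p in text.split("\n\n"))


def html_to_txt(markup: str) -> str:
    """Pull the paragraphs back out of the <p> elements."""
    paragraphs = re.findall(r"<p>(.*?)</p>", markup, flags=re.S)
    return "\n\n".join(html.unescape(p) for p in paragraphs)


registry = Registry()
registry.register("txt", txt_to_html, html_to_txt)
# "shout" is a made-up second format: the same text, but upper-cased.
registry.register("shout", txt_to_html, lambda m: html_to_txt(m).upper())
```

With N formats this needs 2N converters instead of one for every pair, and a newly registered format can immediately be converted to and from every existing one.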

Xenophon
01-08-2008, 01:20 PM
[quote=kovidgoyal]
I come from the Unix world, which means libprs500 is architected in little pieces, each with its own command line interface, so each piece of functionality is reusable not just in Python programs but in any software. The only exception is the database access, and that's only because I haven't had the time to write the actual command line tool; the underlying library design is completely modular.

In fact, adding complete support for converting any new ebook format to libprs500 requires writing only two converters (format->html and html->format) plus a metadata reading/writing tool. All the other features of libprs500 will work automatically with these three pieces in place.
[/quote]

Kovid:

Please go start a page on your wiki about contributing to and/or reusing libprs500. Stick this post in as the initial entry. Add a plea for someone to contribute the command-line database access (since you haven't had time yet).

And how 'bout a sticky thread here titled 'Writing Code for libprs500'? You could manage Q&A and probably pick up some more contributors that way.

And, once again... thanks for a super tool!

Xenophon

kovidgoyal
01-08-2008, 01:57 PM
I've added a "Design philosophy" section to

https://libprs500.kovidgoyal.net/wiki/Development

and there is a sticky (though in a somewhat obscure forum)

http://www.mobileread.com/forums/showthread.php?t=14083

And I have to say that I've recently been getting a few nice patches. In the next release you'll see support for epub2lrf, epub-meta and the Kindle thanks to patches from third parties.

acemccloudxx
01-13-2008, 08:15 PM
After giving the matter careful thought, I just don't think that libprs500 is compatible in approach or philosophy with what I want in a content management program. I haven't firmly decided to continue my Java efforts. I may just use libprs500 as is and devote my free time to writing.