Old 01-14-2016, 05:35 PM   #1
kensmosis
Junior Member
Posts: 9
Karma: 22
Join Date: Sep 2015
Device: none
Unix timestamps when adding books

Is there an easy way to get calibre to use the unix timestamps for added files when setting the date, published date, and/or modified date fields? Populating any one of these would be sufficient for my purposes. The problem is that I am using calibre to manage a large set of scanned PDFs. The unix timestamps of these pdfs are very important for sorting. However, I cannot seem to get calibre to read them into any of its date fields. It always seems to assign the current date to all the files. I'm probably missing something very basic. I could write a script to stat all the files and edit the database appropriately -- but I'd rather do it the right way (and avoid the extra work) if possible. Any help would be much appreciated.

Cheers,
Ken

Last edited by kensmosis; 01-14-2016 at 05:36 PM. Reason: Typo
Old 01-14-2016, 07:10 PM   #2
BetterRed
null operator (he/him)
Posts: 21,725
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@kensmosis - by default, the Date column contains the date the book was added to the library. It can be changed manually (which would imply the column has been repurposed), but AFAIK it cannot be extracted from a format file or downloaded.

The Published Date column can be extracted from a format file; my guess is that for a PDF, it would have to be in embedded XML as the Dublin Core Date element.

The Modification Date is the date when the book's metadata was last changed in the database. It cannot be edited or updated, other than by changing a book's metadata.

If you want to see them in ISO format, then that can be specified in Preferences->Tweaks->gui_pubdate_display_format (search for date) - 'iso' will show as something like - 2015-12-25T10:54:29+11:00

BR

Last edited by BetterRed; 01-14-2016 at 08:22 PM. Reason: typo
Old 01-14-2016, 10:30 PM   #3
kensmosis
Junior Member
Posts: 9
Karma: 22
Join Date: Sep 2015
Device: none
@BetterRed

Thanks so much for your reply. What I want to do is actually simpler and (to my mind) more natural than downloading the date, extracting it from a format file, or adjusting it manually. I want to use the unix timestamp that the file has before being added. For example, suppose I have a file foo.pdf sitting in my home directory. It has 3 unix timestamps, but let's suppose we just care about the last modification time. Say I run 'ls -l foo.pdf' and it displays Sept 3, 2015. When I add it to Calibre, a copy is created in calibre's library. However, the copy has today's date (Jan 14, 2016) and all the time info from the original file is lost (i.e. it is copied using the equivalent of 'cp' rather than 'cp -a'). Whether because of this or in addition to it, the metadata.db has Jan 14, 2016 in all of its date columns. It seems very natural to me that one could want to import the dates from the unix file itself rather than metadata or some other source. For example, the unix ctime could map to publication date, and the unix mtime could map to the ordinary "timestamp" date in the db. This is a very natural thing to do -- so I'd be quite surprised if there isn't functionality for it. I just assumed I was missing something. But then, my use may not be the typical one. Calibre would be extremely well suited for managing scanned documents (bills, etc) as well as books if it has this feature, but not so much if it doesn't. If it's not a feature yet, then perhaps I'll post it as a suggestion on the calibre site. Unfortunately, because it involves the original file prior to addition, I doubt that a simple add-on can be written to implement it.
Old 01-15-2016, 02:43 AM   #4
BetterRed
null operator (he/him)
Posts: 21,725
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
@kensmosis - arrgh - you mean the file system dates. Sorry, I think of the file system and the operating system as two different beasties.

You're not going to like this - I don't think there is a way to wrangle mtime, ctime, atime, or dtime directly into calibre.

Calibre has two accessible standard date columns: Date/timestamp and Published/pubdate, the Modified/last_modified column is a read only column.

If you were to put the create date (crdate) into the file name (via say a mass rename) then you could put that into Published/pubdate via an add time regex - see Preferences->Add Books->Configure metadata from file name.

The 'difficulty' with importing file system timestamps relates in part to: a) calibre supporting various file systems on various operating systems, b) there can be more than one format file for a book each with its own dates, and c) in most cases the file system dates bear no relationship to anything but when the file landed on my computer and what's happened to it since.

BR
Old 01-15-2016, 03:52 AM   #5
kacir
Wizard
Posts: 3,463
Karma: 10684861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
Quote:
Originally Posted by kensmosis View Post
Is there an easy way to get calibre to use the unix timestamps for added files when setting the date, published date, and/or modified date fields?
There is a way.
THE Unix way (TM)

As the true Unix way this is going to be accomplished by using several simple tools step-by-step.

1. You have a file called john_doe_-_book title.pdf that has the filesystem date set to the desired date.

2. You write a shell script that is going to rename this file to the 2015.01.02---john_doe_-_book title.pdf format.

2.A. I personally would write a directory listing to a file, like this
Code:
cd dir_with_files
ls -l -D%Y%m%d > list_of_files.txt
and then create a vim script that would edit the list_of_files.txt to create a shell script with lines looking like this:
Code:
cp "john_doe_-_book title.pdf" "2015.01.02---john_doe_-_book title.pdf"
You can also get creative with awk or another tool of your preference.

2.B. If this were a frequently recurring job, I would spend some time creating a script that would rename the files with a command similar to this:
Code:
for f in *; do mv "$f" "$(date -d@$(stat --printf='%Y' "$f") +%Y%m%d%H%M%S)-$f"; done
The net is full of examples of how to add the creation date to the file name, but this is not what you want; you want to pull the date the file was originally created. It is possible to find examples of your desired functionality (*). There might even be a GUI tool for Ubuntu, because people often want to rename *.jpeg files from the camera. Krename is one example, Thunar is another. Krename pulls the creation date from the EXIF info inside a JPEG, but it might be possible to persuade it to look at the file date. Have a look: http://ftp.uni-stuttgart.de/gentoo-d...ame-3.0.12.pdf

3. Use Calibre with drag & drop to add files. Configure Calibre's Preferences, Adding books to use the template
Code:
^(?P<published>((?!\s-\s).)+)---(?P<author>((?!\s-\s).)+)\s-\s?(?P<title>[^(]+)(?:\(.*\))?
3.A. If you want to know what the magic tags are, such as ?P<published> for allowed fields, just hover your mouse over the text box next to the field in the Test section of the calibre configuration dialog for Preferences -> Adding books
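
Step 2 can be sketched as a pair of small shell functions that build the same `2015.01.02---` prefix the template in step 3 expects. This is only a sketch under assumptions: it relies on GNU coreutils (`stat -c`, `date -d @EPOCH`), it uses the last-modification time, and the names `dated_name`, `copy_with_dates` and the `renamed` directory are made up for illustration.

```shell
#!/bin/sh
# Sketch only: assumes GNU coreutils (stat -c, date -d @EPOCH).
# dated_name and copy_with_dates are hypothetical helpers, not part of calibre.

# Print "YYYY.MM.DD---<basename>" using the file's last-modification time.
dated_name() {
    mtime=$(stat -c '%Y' "$1") || return 1      # seconds since the epoch
    stamp=$(date -d "@$mtime" '+%Y.%m.%d')      # rendered in the local time zone ($TZ)
    printf '%s---%s\n' "$stamp" "$(basename "$1")"
}

# Copy (not move) every PDF in the current directory into ./renamed,
# so the originals and their timestamps stay untouched.
copy_with_dates() {
    mkdir -p renamed || return 1
    for f in ./*.pdf; do
        [ -f "$f" ] || continue                 # skip if the glob matched nothing
        cp -p "$f" "renamed/$(dated_name "$f")"
    done
}
```

Running `copy_with_dates` in the directory with the PDFs leaves date-prefixed copies in `renamed`, which can then be dragged into Calibre. Copying instead of renaming in place keeps an untouched set of originals in case the prefix format needs changing.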


Do not hesitate to ask when the above instructions are not enough. I do not know your level of expertise ;-)


(*) here are a few examples I was able to find in a hurry:
http://www.arj.no/2015/08/03/add-dat...les-in-ubuntu/
http://askubuntu.com/questions/51179...o-the-filename
http://stackoverflow.com/questions/4...o-date-created

EDIT:
I have replaced
ls -l > list_of_files.txt
in my example with
ls -l -D%Y%m%d > list_of_files.txt
to get date format that is easier to process further
You can also play with the stat command to get the date for the file instead of ls

Last edited by kacir; 01-15-2016 at 06:04 AM.
Old 01-15-2016, 10:05 AM   #6
kensmosis
Junior Member
Posts: 9
Karma: 22
Join Date: Sep 2015
Device: none
@BetterRed You are correct, of course; I am speaking of filesystem dates. I'm afraid I was being too clumsy with my language. From your initial reply, I began to glean that the feature I wanted was less likely present than I had initially hoped; and you've confirmed that. Thanks for clarifying this.

@kacir Thanks for the detailed and informative reply! From both answers, it seems pretty clear that I will have to script things. I had considered your approach as one of two if calibre couldn't do this natively. Perhaps you would be kind enough to advise me which would be better. I have about 10000 files, so I'll probably write a Perl script to manage the process either way.

1. As you say, embed the date in the filename. I appreciate the guidance, and don't see this as being a problem. This approach has the advantage of being easy and unintrusive regarding the metadata.db itself. However, it seems a bit unnatural because the files themselves (the ones managed by calibre) would still lose their timestamps. If for some reason I need that info down the road, I'll have to reconstruct it from the db or hope I have an original copy of the files floating about.

2. Alternately, I could edit the db. I would add the files to calibre as is, and then do two things (to clarify, the 2nd approach involves doing both):
(a) Run a SQL query to modify the "Date/timestamp" and/or "Published/pubdate" fields (or just add custom fields for the filesystem ctime and mtime and populate those). I also can directly modify the "modified" date by this approach regardless of its ro status in calibre.
(b) Using "touch -r $orig_file $calibre_copy_of_file", change the dates of the files themselves under calibre's management.

This 2nd approach requires a little finesse because my files are in multiple subdirectories (I add them using the "include subdirectories" option) -- so there may be recycled filenames. I probably would have to use an md5 checksum to match files instead, but this isn't a big deal. While this approach seems a bit more intrusive and involved, it has the advantage that -- if done right -- in the end the world will look to calibre just as it does to the filesystem itself. The files will seem as if they had been added one by one at the time of their (filesystem) creation, and last modified at the time of their last filesystem modification.
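
Step (b) with checksum matching might look roughly like the sketch below. It assumes GNU `md5sum`, `find`, `grep` and `touch`, assumes the library copies are byte-identical to the originals, and uses a made-up function name, `restore_mtimes`. It is best tried on a throwaway copy of the library first.

```shell
#!/bin/sh
# Sketch only: assumes GNU md5sum, find, grep and touch.
# restore_mtimes is a hypothetical helper.
# Usage: restore_mtimes ORIGINALS_DIR COPIES_DIR

restore_mtimes() {
    orig_dir=$1
    copy_dir=$2
    index=$(mktemp) || return 1

    # Index the originals as "<md5>  <path>" lines (md5sum's default output).
    find "$orig_dir" -type f -name '*.pdf' -exec md5sum {} + > "$index"

    # For each copy, look up the first original with the same checksum.
    find "$copy_dir" -type f -name '*.pdf' | while IFS= read -r copy; do
        sum=$(md5sum < "$copy" | cut -d' ' -f1)
        # The path starts at column 35: 32 hex digits plus two separator characters.
        orig=$(grep -m1 "^$sum " "$index" | cut -c35-)
        if [ -n "$orig" ]; then
            touch -r "$orig" "$copy"    # copy atime/mtime from the original
        else
            echo "no match: $copy" >&2
        fi
    done
    rm -f "$index"
}
```

File names containing newlines or backslashes are not handled here, because `md5sum` escapes them in its output; such files are simply reported as unmatched.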

My inclination is to use (2) even though it is more complicated. In your opinion, is this a big mistake? My concern is that there are hidden dates in the calibre db (or in some config file I don't know about) that would blow everything up when it performs consistency checks. Then calibre would either fail or overwrite my dates with its own "corrected" ones. Of course, I could try (2) and then revert to (1) if it fails -- but it would definitely be useful to know at the outset whether it is a bad idea altogether. If you have any insight into whether I'm setting myself up for a world of hurt, I'd really appreciate it.

I should clarify that I don't mind a bit of unix scripting (including SQL queries). I'm ok with Perl and already have had to write scripts to OCR and process these scanned documents in bulk as part of my bigger project to manage my scanned bills, so an extra script is ok if it is necessary and will work.

I really appreciate your advice and help!

Cheers,
Ken
Old 01-15-2016, 10:18 AM   #7
kovidgoyal
creator of calibre
Posts: 45,347
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
calibredb add -h
calibredb set_metadata -h
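
For readers following along, one way the second command could be driven from a script is sketched below. It only prints a `calibredb set_metadata` line rather than running it. The helper name `print_set_pubdate` is made up, GNU `stat` and `date` are assumed, and the date format that `--field pubdate:` accepts should be checked against the help output above for the installed version.

```shell
#!/bin/sh
# Sketch only: assumes GNU stat/date. print_set_pubdate is a hypothetical helper
# that prints a calibredb command instead of running it (dry run).
# Usage: print_set_pubdate BOOK_ID ORIGINAL_FILE

print_set_pubdate() {
    mtime=$(stat -c '%Y' "$2") || return 1
    day=$(date -d "@$mtime" '+%Y-%m-%d')        # local-time date of last modification
    printf 'calibredb set_metadata --field pubdate:%s %s\n' "$day" "$1"
}
```

The book id has to come from calibre itself, for example from what `calibredb add` reports after adding a file. Once the printed commands look right they can be piped to `sh`.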
Old 01-15-2016, 10:53 AM   #8
itimpi
Wizard
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
Quote:
Originally Posted by kensmosis View Post
I also can directly modify the "modified" date by this approach regardless of its ro status in calibre.
The 'modified' date in Calibre can be changed any time a change is made to metadata, so you do not want to rely on this not changing unexpectedly. The 'added' date is not subject to change so you could use that. You could also set up one or more user defined columns to store such information and these would not be touched by Calibre so that may be the more reliable way to do this sort of thing.
Old 01-15-2016, 12:23 PM   #9
kensmosis
Junior Member
Posts: 9
Karma: 22
Join Date: Sep 2015
Device: none
@kovidgoyal: Thanks! I'd never played with the CLI; that will definitely make the scripting a lot easier, and I won't have to monkey about with the db directly!
@itimpi: Thanks for the dates guidance. I think I'll go the custom columns route and not try to coerce calibre to deviate too much from its normal way of doing things.

Thanks everyone, I really appreciate all the help and now have what it takes to script things efficiently!
Old 01-15-2016, 12:25 PM   #10
theducks
Well trained by Cats
Posts: 31,054
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
Rather than mess with a standard Timestamp field (and create a maintenance nightmare), create a custom column of the Timestamp type. It won't get overwritten if someone uses a standard Calibre feature.

Use the CLI (as hinted by Kovid) and a custom script to populate your library.

(A blank means the entry was not processed using your special rules.)

I see all kinds of issues: the TZ settings, or any other timestamp-affecting setting.
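
One way to sidestep the TZ concern is to render the filesystem time in UTC before handing it to calibre. The sketch below assumes GNU `stat` and `date`, and a custom date column with the made-up lookup name `fs_mtime` (created beforehand, for instance with `calibredb add_custom_column`). It prints a `calibredb set_custom` command rather than running it, and whether the installed calibre accepts this exact timestamp format is an assumption to verify.

```shell
#!/bin/sh
# Sketch only: assumes GNU stat/date and a custom date column whose lookup
# name is "fs_mtime" (a made-up name). Prints the command; does not run it.
# Usage: print_set_fs_mtime BOOK_ID ORIGINAL_FILE

print_set_fs_mtime() {
    mtime=$(stat -c '%Y' "$2") || return 1
    # -u pins the rendering to UTC, so the value does not depend on $TZ.
    stamp=$(date -u -d "@$mtime" '+%Y-%m-%dT%H:%M:%S+00:00')
    printf 'calibredb set_custom fs_mtime %s %s\n' "$1" "$stamp"
}
```

Because the offset is written out explicitly, the same file yields the same string on any machine, whatever its local time zone.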
Old 01-15-2016, 03:22 PM   #11
kacir
Wizard
Posts: 3,463
Karma: 10684861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
Quote:
Originally Posted by kensmosis View Post
@kacir Thanks for the detailed and informative reply! From both answers, it seems pretty clear that I will have to script things. I had considered your approach as one of two if calibre couldn't do this natively. Perhaps you would be kind enough to advise me which would be better. I have about 10000 files, so I'll probably write a Perl script to manage the process either way.
You do not HAVE to write the script nowadays.
I have just tested krename and you can very easily do the rename from its GUI.
Just select the files, set Filename to "Custom name" from the drop-down box, click on the lightbulb button and build the following expression: [creationdate;yyyy.MM.dd]---$
Depending on circumstances I personally would either use krename or write a Gvim script that would then write shell script from an ls -l > something.txt list.
Quote:
Originally Posted by kensmosis View Post
1. As you say, embed the date in the filename. I appreciate the guidance, and don't see this as being a problem. This approach has the advantage of being easy and unintrusive regarding the metadata.db itself. However, it seems a bit unnatural because the files themselves (the ones managed by calibre) would still lose their timestamps. If for some reason I need that info down the road, I'll have to reconstruct it from the db or hope I have an original copy of the files floating about.
I personally have used the modification of the filename as a preparation for adding to Calibre quite a few times and I see many advantages.
The name of the file remains unchanged in Calibre, so it will preserve the info about the original creation date.
Also, it is Not A Good Idea (TM) to work with files behind Calibre's back. Use Calibre itself to access the files inside and treat the filesystem as a black box. There are many, MANY threads discussing this. The files are user readable and this is a great advantage in the case of disaster or something - you can salvage the remains of the library even if it is heavily damaged.

I always use simple drag & drop to add numerous files to Calibre.
Quote:
Originally Posted by kensmosis View Post

2. Alternately, I could edit the db. I would add the files to calibre as is, and then do two things (to clarify, the 2nd approach involves doing both):
(a) Run a SQL query to modify the "Date/timestamp" and/or "Published/pubdate" fields (or just add custom fields for the filesystem ctime and mtime and populate those). I also can directly modify the "modified" date by this approach regardless of its ro status in calibre.
(b) Using "touch -r $orig_file $calibre_copy_of_file", change the dates of the files themselves under calibre's management.

This 2nd approach requires a little finesse because my files are in multiple subdirectories (I add them using the "include subdirectories" option) -- so there may be recycled filenames. I probably would have to use an md5 checksum to match files instead, but this isn't a big deal. While this approach seems a bit more intrusive and involved, it has the advantage that -- if done right -- in the end the world will look to calibre just as it does to the filesystem itself. The files will seem as if they had been added one by one at the time of their (filesystem) creation, and last modified at the time of their last filesystem modification.
I personally prefer solution #1. As I said it is not a good idea to do ANYTHING with files once they are inside the library behind Calibre's back ;-)
Old 01-15-2016, 04:27 PM   #12
BetterRed
null operator (he/him)
Posts: 21,725
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
Quote:
Originally Posted by kensmosis View Post
I won't have to monkey about with the db directly!
@kensmosis - Not only won't you have to, but if you were to do so you would be opening up a Pandora's Box full of mare's nests and snake pits. In other words - don't even think about tinkering with the metadata.db file in the library directory.

That said examining the structure with something like SQLite Browser can be very informative, and browsing tables is sometimes necessary - eg to get an author's id to use in a 'canned' content server query.

Regarding the rest of the library, i.e. the author and book folders, the rules are simple - don't add any subdirectories or files, don't change the names of subdirectories or files, and don't delete or move any subdirectories or files.

But suppose you open a TXT format in, let's say, OOo Writer via calibre's View->Open With feature, use Writer to add chapter headings and such, and then save it as DOCX. By default it will be saved to the book folder - that's OK; open the book folder by tapping 'O', and then drag and drop the DOCX into the Book details sidebar (normally on the right).

There are occasions when you may want to edit a format file directly rather than via calibre's View options - e.g. to replace a crappy cover image in an existing cbz via an archive utility.

Just be mindful of what you're doing.

BR