#1 |
Junior Member
Posts: 9
Karma: 22
Join Date: Sep 2015
Device: none
|
Unix timestamps when adding books
Is there an easy way to get calibre to use the Unix timestamps of added files when setting the date, published date, and/or modified date fields? Populating any one of these would be sufficient for my purposes. The problem is that I am using calibre to manage a large set of scanned PDFs. The Unix timestamps of these PDFs are very important for sorting. However, I cannot seem to get calibre to read them into any of its date fields. It always seems to assign the current date to all the files. I'm probably missing something very basic. I could write a script to stat all the files and edit the database appropriately -- but I'd rather do it the right way (and avoid the extra work) if possible. Any help would be much appreciated.
Cheers, Ken
Last edited by kensmosis; 01-14-2016 at 05:36 PM. Reason: Typo
#2 |
null operator (he/him)
Posts: 21,725
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
@kensmosis - by default, the Date column contains the date the book was added to the library. It can be changed manually (which would imply the column has been re-purposed), but AFAIK it cannot be extracted from a format file or downloaded.

The Published Date column can be extracted from a format file; my guess is that for a PDF it would have to be in embedded XML as the Dublin Core Date element.

The Modification Date is the date when the book's metadata was last changed in the database. It cannot be edited or updated other than by changing a book's metadata.

If you want to see them in ISO format, that can be specified in Preferences->Tweaks->gui_pubdate_display_format (search for date). 'iso' will show as something like 2015-12-25T10:54:29+11:00

BR
Last edited by BetterRed; 01-14-2016 at 08:22 PM. Reason: typo
#3 |
Junior Member
Posts: 9
Karma: 22
Join Date: Sep 2015
Device: none
|
@BetterRed
Thanks so much for your reply. What I want to do is actually simpler and (to my mind) more natural than downloading the date from a file or from a format file or adjusting it manually. I want to use the Unix timestamp that the file has before being added.

For example, suppose I have a file foo.pdf sitting in my home directory. It has 3 Unix timestamps, but let's suppose we just care about the last modification time. I run 'ls -l foo.pdf' and it displays Sept 3, 2015. When I add it to calibre, a copy is created in calibre's library. However, the copy has today's date (Jan 14, 2016) and all the time info from the original file is lost (i.e. it is copied using the equivalent of 'cp' rather than 'cp -a'). Whether because of this or in addition to it, the metadata.db has Jan 14, 2016 in all of its date columns.

It seems very natural to me that one could want to import the dates from the Unix file itself rather than from metadata or some other source. For example, the Unix ctime could map to publication date and the Unix mtime could map to the ordinary "timestamp" date in the db. This is a very natural thing to do -- so I'd be quite surprised if there isn't functionality for it. I just assumed I was missing something. But then, my use may not be the typical one.

Calibre would be extremely well suited for managing scanned documents (bills, etc.) as well as books if it has this feature, but not so much if it doesn't. If it's not a feature yet, then perhaps I'll post it as a suggestion on the calibre site. Unfortunately, because it involves the original file prior to addition, I doubt that a simple add-on can be written to implement it.
#4 |
null operator (he/him)
Posts: 21,725
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
@kensmosis - arrgh - you mean the file system dates.

You're not going to like this - I don't think there is a way to wrangle mtime, ctime, atime, or dtime directly into calibre.

Calibre has two accessible standard date columns, Date/timestamp and Published/pubdate; the Modified/last_modified column is read-only.

If you were to put the create date (crdate) into the file name (via, say, a mass rename), then you could put that into Published/pubdate via an add-time regex - see Preferences->Add Books->Configure metadata from file name.

The 'difficulty' with importing file system timestamps relates in part to: a) calibre supporting various file systems on various operating systems, b) there can be more than one format file for a book, each with its own dates, and c) in most cases the file system dates bear no relationship to anything but when the file landed on my computer and what's happened to it since.

BR
#5 | |
Wizard
Posts: 3,463
Karma: 10684861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
|
THE Unix way (TM)

In the true Unix way, this is going to be accomplished by using several simple tools step by step.

1. You have a file called john_doe_-_book title.pdf that has the filesystem date set to the desired date.

2. You write a shell script that is going to rename this file to the 2015.01.02---john_doe_-_book title.pdf format.

2.A. I personally would make a directory listing to a file, like this
Code:
    cd dir_with_files
    # -D is the BSD ls date-format option; GNU ls uses --time-style=+%Y%m%d instead
    ls -l -D%Y%m%d > list_of_files.txt
Code:
    cp "john_doe_-_book title.pdf" "2015.01.02---john_doe_-_book title.pdf"

2.B. If this was a frequently recurring job, I would spend some time creating a script that would rename the files with a command similar to this:
Code:
    # GNU stat and date: prefix each file name with its modification time
    for f in *; do mv "$f" "$(date -d "@$(stat --printf='%Y' "$f")" +%Y%m%d%H%M%S)-$f"; done

3. Use Calibre with drag & drop to add files. Configure Calibre Preferences, Adding books, to use the template
Code:
    ^(?P<published>((?!\s-\s).)+)---(?P<author>((?!\s-\s).)+)\s-\s?(?P<title>[^(]+)(?:\(.*\))?

Do not hesitate to ask when the above instructions are not enough. I do not know your level of expertise ;-)

(*) Here are a few examples I was able to find in a hurry:
http://www.arj.no/2015/08/03/add-dat...les-in-ubuntu/
http://askubuntu.com/questions/51179...o-the-filename
http://stackoverflow.com/questions/4...o-date-created

EDIT: I have replaced ls -l > list_of_files.txt in my example with ls -l -D%Y%m%d > list_of_files.txt to get a date format that is easier to process further. You can also play with the stat command to get the date for the file instead of ls.

Last edited by kacir; 01-15-2016 at 06:04 AM.
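Before renaming thousands of files, it may help to check the template against a few names. The following is only a sketch: the pattern is the one quoted above, while `dated_name` and `parse_name` are hypothetical helpers, and stripping the extension before matching is an assumption about how calibre applies the pattern. Note that the pattern as written expects a space-hyphen-space between author and title, so an underscore-separated name such as john_doe_-_book title.pdf does not match it.

```python
import os
import re
from datetime import datetime

# The file-name pattern quoted in the post, unchanged.
PATTERN = re.compile(
    r"^(?P<published>((?!\s-\s).)+)---(?P<author>((?!\s-\s).)+)"
    r"\s-\s?(?P<title>[^(]+)(?:\(.*\))?"
)


def dated_name(path):
    """Return the base name prefixed with the file's mtime as YYYY.MM.DD---."""
    mtime = os.stat(path).st_mtime
    stamp = datetime.fromtimestamp(mtime).strftime("%Y.%m.%d")  # local time
    return f"{stamp}---{os.path.basename(path)}"


def parse_name(file_name):
    """Match the pattern against the name without its extension (assumption)."""
    stem = os.path.splitext(file_name)[0]
    match = PATTERN.match(stem)
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Space-separated author and title: the pattern matches.
    print(parse_name("2015.01.02---John Doe - Book Title.pdf"))
    # Underscore-separated, as in step 1: the pattern does not match.
    print(parse_name("2015.01.02---john_doe_-_book title.pdf"))
```

If the underscore-separated names are to be kept, either the rename step or the pattern would need adjusting so that the two agree.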
#6 |
Junior Member
Posts: 9
Karma: 22
Join Date: Sep 2015
Device: none
|
@BetterRed You are correct, of course; I am speaking of filesystem dates. I'm afraid I was being too clumsy with my language.

@kacir Thanks for the detailed and informative reply!

From both answers, it seems pretty clear that I will have to script things. I had considered your approach as one of two if calibre couldn't do this natively. Perhaps you would be kind enough to advise me which would be better. I have about 10000 files, so I'll probably write a Perl script to manage the process either way.

1. As you say, embed the date in the filename. I appreciate the guidance, and don't see this as being a problem. This approach has the advantage of being easy and unintrusive regarding the metadata.db itself. However, it seems a bit unnatural because the files themselves (the ones managed by calibre) would still lose their timestamps. If for some reason I need that info down the road, I'll have to reconstruct it from the db or hope I have an original copy of the files floating about.

2. Alternatively, I could edit the db. I would add the files to calibre as is, and then do two things (to clarify, the 2nd approach involves doing both):
(a) Run a SQL query to modify the "Date/timestamp" and/or "Published/pubdate" fields (or just add custom fields for the filesystem ctime and mtime and populate those). I also can directly modify the "modified" date by this approach regardless of its read-only status in calibre.
(b) Using "touch -r $orig_file $calibre_copy_of_file", change the dates of the files themselves under calibre's management.

This 2nd approach requires a little finesse because my files are in multiple subdirectories (I add them using the "include subdirectories" option) -- so there may be recycled filenames. I probably would have to use an md5 checksum to match files instead, but this isn't a big deal. While this approach seems a bit more intrusive and involved, it has the advantage that -- if done right -- in the end the world will look to calibre just as it does to the filesystem itself. The files will seem as if they had been added one by one at the time of their (filesystem) creation, and last modified at the time of their last filesystem modification.

My inclination is to use (2) even though it is more complicated. In your opinion, is this a big mistake? My concern is that there are hidden dates in the calibre db (or in some config file I don't know about) that would blow everything up when it performs consistency checks. Then calibre would either fail or overwrite my dates with its own "corrected" ones. Of course, I could try (2) and then revert to (1) if it fails -- but it would definitely be useful to know at the outset whether it is a bad idea altogether. If you have any insight into whether I'm setting myself up for a world of hurt, I'd really appreciate it.

I should clarify that I don't mind a bit of Unix scripting (including SQL queries). I'm OK with Perl and have already had to write scripts to OCR and process these scanned documents in bulk as part of my bigger project to manage my scanned bills, so an extra script is fine if it is necessary and will work.

I really appreciate your advice and help!
Cheers, Ken
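The checksum matching mentioned in approach 2 could look roughly like the sketch below. It only reads files and changes nothing in the library. The helper names (`file_digest`, `index_by_digest`, `original_mtimes`) are hypothetical, and it assumes the copy calibre stores is byte-identical to the original, which should be verified on a few files first.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def index_by_digest(root):
    """Map digest -> path for every file under root (first path wins)."""
    index = {}
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            index.setdefault(file_digest(path), path)
    return index


def original_mtimes(originals_root, library_root):
    """Map each library file to the mtime of the original with the same content.

    Library files with no matching original (covers, metadata files) are skipped.
    """
    originals = index_by_digest(originals_root)
    result = {}
    for folder, _dirs, files in os.walk(library_root):
        for name in files:
            path = os.path.join(folder, name)
            source = originals.get(file_digest(path))
            if source is not None:
                result[path] = os.stat(source).st_mtime
    return result
```

Matching on content rather than name sidesteps the recycled-filename problem, at the cost of reading every file twice.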
#7 |
creator of calibre
Posts: 45,347
Karma: 27182818
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
|
calibredb add -h
calibredb set_metadata -h
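As a rough illustration of how those two commands might be driven from a script, the sketch below only builds the argument lists and does not run them. The helper names and the choice of the pubdate field are assumptions for illustration; the exact options and accepted date formats should be checked against the two help screens above. The book id would have to be taken from whatever `calibredb add` reports for the added file.

```python
import os
from datetime import datetime, timezone


def mtime_iso(path):
    """Return the file's modification time as an ISO 8601 string in UTC."""
    mtime = os.stat(path).st_mtime
    return datetime.fromtimestamp(mtime, tz=timezone.utc).isoformat()


def add_command(path, library):
    """Argument list for adding one file to the library at `library`."""
    return ["calibredb", "add", "--with-library", library, path]


def set_pubdate_command(book_id, path, library):
    """Argument list for setting the published date from the file's mtime.

    Assumes pubdate is accepted by --field in field_name:value form.
    """
    return [
        "calibredb", "set_metadata", "--with-library", library,
        "--field", f"pubdate:{mtime_iso(path)}", str(book_id),
    ]
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the commands have been tried by hand on a test library.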
#8 |
Wizard
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
The 'modified' date in Calibre can be changed any time a change is made to metadata, so you do not want to rely on this not changing unexpectedly. The 'added' date is not subject to change so you could use that. You could also set up one or more user defined columns to store such information and these would not be touched by Calibre so that may be the more reliable way to do this sort of thing.
#9 |
Junior Member
Posts: 9
Karma: 22
Join Date: Sep 2015
Device: none
|
@kovidgoyal: Thanks! I'd never played with the CLI; that will definitely make the scripting a lot easier, and I won't have to monkey about with the db directly!
@itimpi: Thanks for the dates guidance. I think I'll go the custom columns route and not try to coerce calibre to deviate too much from its normal way of doing things.
Thanks everyone, I really appreciate all the help and now have what it takes to script things efficiently!
#10 |
Well trained by Cats
Posts: 31,054
Karma: 60358908
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Rather than mess with a standard Timestamp field (and create a maintenance nightmare), create a custom column of the Timestamp type. It won't get overwritten if someone uses a standard Calibre feature.

Use the CLI (as hinted by Kovid) and a custom script to populate your library.

I see all kinds of issues: the TZ settings, or any other setting that affects timestamps.
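To make the time-zone concern concrete, here is a small sketch showing that one and the same file timestamp lands on different calendar days depending on the zone used to render it, together with argument lists for a custom date column. The column label `fs_mtime`, its display name, and the helper names are made up for illustration; the `calibredb add_custom_column` and `calibredb set_custom` argument order should be confirmed with their `-h` output before use.

```python
from datetime import datetime, timedelta, timezone


def utc_stamp(epoch_seconds):
    """Render an epoch timestamp as ISO 8601 in UTC."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


def stamp_in_zone(epoch_seconds, offset_hours):
    """Render the same instant in a fixed-offset zone, for comparison."""
    zone = timezone(timedelta(hours=offset_hours))
    return datetime.fromtimestamp(epoch_seconds, tz=zone).isoformat()


def custom_column_commands(book_id, epoch_seconds, library, label="fs_mtime"):
    """Argument lists to create a date column once and fill it for one book.

    The label and display name are hypothetical; the value is stored in UTC
    so that it does not depend on the machine's TZ setting.
    """
    create = ["calibredb", "add_custom_column", "--with-library", library,
              label, "File mtime", "datetime"]
    fill = ["calibredb", "set_custom", "--with-library", library,
            label, str(book_id), utc_stamp(epoch_seconds)]
    return create, fill


if __name__ == "__main__":
    instant = 1441238400  # midnight UTC on 3 September 2015
    print(utc_stamp(instant))
    print(stamp_in_zone(instant, 11))   # still 3 September, 11:00
    print(stamp_in_zone(instant, -8))   # 2 September, 16:00
```

Storing an explicit offset with every value is one way to keep sorting stable if the library is later opened on a machine with a different TZ setting.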
#11 | |||
Wizard
Posts: 3,463
Karma: 10684861
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
|
I have just tested krename and you can very easily do the rename from its GUI. Just select the files, set Filename to Custom name from the drop-down box, click on the lightbulb button, and build the following expression:
[creationdate;yyyy.MM.dd]---$

Depending on circumstances I personally would either use krename or write a Gvim script that would then write a shell script from an ls -l > something.txt list.

The name of the file remains unchanged in Calibre, so it will preserve the info about the original creation date.

Also, it is Not A Good Idea (TM) to work with files behind Calibre's back. Use Calibre itself to access the files inside and treat the filesystem as a black box. There are many, MANY threads discussing this. The files are user readable, and this is a great advantage in the case of a disaster - you can salvage the remains of the library even if it is heavily damaged.

I always use simple Drag & Drop to add numerous files to Calibre.
#12 |
null operator (he/him)
Posts: 21,725
Karma: 29711016
Join Date: Mar 2012
Location: Sydney Australia
Device: none
|
@kensmosis - Not only won't you have to, but if you were to do so you would be opening up a Pandora's Box full of mare's nests and snake pits.

That said, examining the structure with something like SQLite Browser can be very informative, and browsing tables is sometimes necessary - e.g. to get an author's id to use in a 'canned' content server query.

Regarding the rest of the library, i.e. the author and book folders, the rules are simple - don't add any subdirectories or files, don't change the names of subdirectories or files, and don't delete or move any subdirectories or files.

But let's say you open a TXT format in, say, OOo Writer via calibre's View->Open With feature, use Writer to add chapter headings and such, and then save it as DOCX. By default it will be saved to the book folder - that's OK: open the book folder by tapping 'O', and then drag and drop the DOCX into the Book details sidebar (normally on the right).

There are occasions when you may want to edit a format file directly rather than via calibre's View options - e.g. to replace a crappy cover image in an existing CBZ via an archive utility. Just be mindful of what you're doing.

BR