09-05-2009, 02:09 AM   #3
troymc
 
Hmm... OK, I don't understand. That answer raises more questions:

* So wait, this is calibre-server I'm talking about -- you're saying that when I click the link to download a book, it first creates a copy of the book, then updates the metadata in that copy, and then serves the copy?

* Why not just update the metadata in the original once? That would let calibre-server be a completely safe, fast, read-only app.

* The copy/update behaviour you mentioned would be safe, but as you can see from that one strace line I posted, it's actually opening the original file read/write. Below is the complete strace output from a successful download (minus unrelated futex, fcntl, etc. calls for clarity), and I don't see the behaviour you describe at all. Are you maybe thinking of the save-to-disk or export behaviour?

Here is what I see it do:
* accept the incoming network connection as fd8
* receive the HTTP GET
* stat and open the original file read/write as fd9
* read in the whole file and close fd9
* stream the file back out on fd8

Code:
root@plato:~# strace -f -p 29787 -e trace=\!select,futex,poll,fcntl
Process 29848 attached with 36 threads - interrupt to quit
[pid 29816] restart_syscall(<... resuming interrupted call ...>) = 0
[pid 29816] accept(6, {sa_family=AF_INET, sin_port=htons(2921), sin_addr=inet_addr("127.0.0.1")}, [203278190779564048]) = 8
[pid 29841] recvfrom(8, "GET /get/pdf/The%20Philosophy%20o"..., 8192, 0, NULL, NULL) = 552
[pid 29813] access("/mnt/archive/library/elibrary/philosophy/metadata.db-journal", F_OK) = -1 ENOENT (No such file or directory)
[pid 29813] fstat(3, {st_mode=S_IFREG|0644, st_size=1092608, ...}) = 0
[pid 29813] lseek(3, 24, SEEK_SET)      = 24
[pid 29813] read(3, "\0\0K\334\0\0\0\0\0\0\4*\0\0\0\1"..., 16) = 16
[pid 29841] stat("/mnt/archive/library/elibrary/philosophy/Timothy Williamson/The Philosophy of Philosophy (634)/The Philosophy of Philosophy - Timothy Williamson.pdf", {st_mode=S_IFREG|0666, st_size=1554494, ...}) = 0
[pid 29841] open("/mnt/archive/library/elibrary/philosophy/Timothy Williamson/The Philosophy of Philosophy (634)/The Philosophy of Philosophy - Timothy Williamson.pdf", O_RDWR) = 9
[pid 29841] fstat(9, {st_mode=S_IFREG|0666, st_size=1554494, ...}) = 0
[pid 29841] stat("/mnt/archive/library/elibrary/philosophy/Timothy Williamson/The Philosophy of Philosophy (634)/The Philosophy of Philosophy - Timothy Williamson.pdf", {st_mode=S_IFREG|0666, st_size=1554494, ...}) = 0
[pid 29841] stat("/mnt/archive/library/elibrary/philosophy/Timothy Williamson/The Philosophy of Philosophy (634)/The Philosophy of Philosophy - Timothy Williamson.pdf", {st_mode=S_IFREG|0666, st_size=1554494, ...}) = 0
[pid 29841] fstat(9, {st_mode=S_IFREG|0666, st_size=1554494, ...}) = 0
[pid 29841] lseek(9, 0, SEEK_CUR)       = 0
[pid 29841] fstat(9, {st_mode=S_IFREG|0666, st_size=1554494, ...}) = 0
[pid 29841] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f9b7be72000
[pid 29841] lseek(9, 0, SEEK_CUR)       = 0
[pid 29841] read(9, "%PDF-1.4\r%\342\343\317\323\r\n2079 0 obj <</Lin"..., 1552384) = 1552384
[pid 29841] read(9, "\r\n0000000000 65535 f\r\n0000000000 "..., 4096) = 2110
[pid 29841] read(9, ""..., 4096)        = 0
[pid 29841] close(9)                    = 0
[pid 29841] munmap(0x7f9b7be72000, 4096) = 0
[pid 29841] stat("/etc/localtime", {st_mode=S_IFREG|0644, st_size=3543, ...}) = 0
[pid 29841] fstat(5, {st_mode=S_IFREG|0644, st_size=815373, ...}) = 0
[pid 29841] lseek(5, 815373, SEEK_SET)  = 815373
[pid 29841] write(5, "127.0.0.1 - - [05/Sep/2009:00:30:"..., 259) = 259
[pid 29841] sendto(8, "HTTP/1.1 200 OK\r\nDate: Sat, 05 Se"..., 182, 0, NULL, 0) = 182
[pid 29841] sendto(8, "%PDF-1.4\r%\342\343\317\323\r\n2079 0 obj <</Lin"..., 1554494, 0, NULL, 0) = 49188
[pid 29841] sendto(8, "\375\300\254\262\371Y3d~\302\250%\f\2043\211\233\232`\206B\17k\246 \366\230\35F.Ez\230"..., 1505306, 0, NULL, 0) = 131168
[pid 29841] sendto(8, "\262\362\25+i2\177\314n_\324~\242\201Gq2e\1&\6X`\17D}\3540\2/9\373G"..., 1374138, 0, NULL, 0) = 180356
[pid 29841] sendto(8, "\364:Z\347u\21\227\273\246\304\325&\257\327\233\212u\221\211\261\204\244\22\236\16\374A!\354QC\\y"..., 1193782, 0, NULL, 0) = 245940
[pid 29841] sendto(8, " R/GS18 704 0 R/GS19 698 0 R/GS20"..., 947842, 0, NULL, 0) = 196752
[pid 29841] sendto(8, "p\242\177\341O\243\234\276;*\n\370\341\321\271\273\177\20\354\376\1\247\330\303\375O\16wE\310\16\314\271"..., 751090, 0, NULL, 0) = 262336
[pid 29841] sendto(8, "stream\r\nH\211\344W\333\216\33\271\21}\27\340\177\340\243\r\214{\232\315\276f\27\33\254"..., 488754, 0, NULL, 0) = 147564
[pid 29841] sendto(8, "\222\7\247\344\215\315\363\372\366\262]\332\367\262}\v0\0000\367\370\317\r\nendstream"..., 341190, 0, NULL, 0) = 147564
[pid 29841] sendto(8, "ype/Font>>\rendobj\r842 0 obj/Devic"..., 193626, 0, NULL, 0) = 147564
[pid 29841] sendto(8, "rent 1331 0 R/Count 5/Type/Pages/"..., 46062, 0, NULL, 0) = 46062
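
For comparison, here is a minimal sketch of what a strictly read-only download path could look like. This is only an illustration in Python, not calibre-server's actual code: the function name `stream_read_only` and the 64 KiB chunk size are assumptions made up for the example. It opens the source with O_RDONLY rather than the O_RDWR seen in the trace above, and copies it out in chunks instead of reading the whole file into memory first.

```python
import os

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size, chosen only for this sketch


def stream_read_only(path, out, chunk_size=CHUNK_SIZE):
    """Copy the file at `path` to the writable object `out`.

    The source is opened with O_RDONLY, so this code path cannot modify
    the original file. Returns the number of bytes copied.
    """
    fd = os.open(path, os.O_RDONLY)  # read-only, unlike the O_RDWR in the trace
    sent = 0
    try:
        while True:
            chunk = os.read(fd, chunk_size)
            if not chunk:  # an empty read means end of file
                break
            out.write(chunk)
            sent += len(chunk)
    finally:
        os.close(fd)  # always release the descriptor, even on error
    return sent
```

Here `out` can be any object with a `write()` method, for example an `io.BytesIO` buffer or the file object wrapping a socket. With a 50MB PDF, reading in chunks keeps memory use at roughly one chunk per download rather than the full file size.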

The behaviour you describe could get expensive on the back-end if the load gets high, particularly since 50MB PDFs are not rare in my libraries. It would be nice to know, and be able to control, where these copies are created, so that I could look into mounting a ramdisk there and avoiding some disk I/O on my web server.
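
As a starting point for finding that location, here is a small sketch of how a Python program that relies on the standard `tempfile` module picks its temporary directory. Whether calibre-server creates its copies this way is purely an assumption here, and the helper name `resolve_temp_dir` is made up for the example. What is certain is that `tempfile.gettempdir()` consults the `TMPDIR` environment variable first and caches the result in `tempfile.tempdir`.

```python
import os
import tempfile


def resolve_temp_dir(tmpdir=None):
    """Return the directory tempfile would use if TMPDIR were `tmpdir`.

    With tmpdir=None, report the directory chosen under the current
    environment. The environment and tempfile's cache are restored
    before returning, so calling this has no lasting side effects.
    """
    saved_env = os.environ.get("TMPDIR")
    saved_cache = tempfile.tempdir
    try:
        if tmpdir is not None:
            os.environ["TMPDIR"] = tmpdir
        tempfile.tempdir = None  # clear the cache so the lookup runs again
        return tempfile.gettempdir()
    finally:
        if saved_env is None:
            os.environ.pop("TMPDIR", None)
        else:
            os.environ["TMPDIR"] = saved_env
        tempfile.tempdir = saved_cache
```

If the copies do go through `tempfile`, starting the server with `TMPDIR` pointing at a ramdisk mount should move them there without any code changes. If they are written somewhere else, this sketch does not apply.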


Troy