04-07-2019, 08:26 PM   #15
ceridwen
Device: Kobo Aura H2O
Quote:
Originally Posted by DNSB
I'm not sure how to take your comment. Are you trying to say that a .epub file is the same as a .cbz file? Or are you saying that your ereader should be able to open any file(s) stored in a zip container and display the contents regardless of how the contents were created?

In the real world, the extension is used to indicate how to handle the file. While theoretically a better solution, analyzing the file contents to determine how to handle the file is a quick trip to code bloat and programmers beating their heads against the wall. So how would you recommend telling an epub from a cbz? They both use a zip format container. There are quite a few other filetypes that use a zip container, as can be told from the sheer number of times the first two bytes of a file are PK, or 0x50, 0x4B.

So in your day job, if you happen to rename a Microsoft Word document with a .txt extension, this would not cause any problems? Or take a compressed log file saved as a .tgz file and rename it as .doc? Try to smuggle an MP4 video onto the network by renaming it as .asc?
Mostly I use Linux's file command or bindings to it. It is more or less an open-source compilation of magic bytes and heuristics for determining file types, and it can tell the difference between an EPUB and a non-EPUB zip. I'd have to dig through the code to find out what heuristics it uses for that, but with only two cases to separate, I suspect it doesn't take much code.
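
To give a rough idea of how little code that could be, here is a minimal sketch in Python. It is not taken from file's source, which I haven't checked. It relies on the EPUB container rule that the first entry in the zip is an uncompressed file named mimetype containing application/epub+zip, which puts both strings at fixed offsets (30 and 38) when that entry has no extra field. The names sniff and make_zip are made up for this example.

```python
import io
import zipfile

EPUB_MIMETYPE = b"application/epub+zip"


def sniff(data: bytes) -> str:
    """Classify raw bytes as 'epub', 'zip' or 'unknown' from the leading bytes only."""
    # A non-empty zip starts with a local file header: PK\x03\x04.
    if data[:4] != b"PK\x03\x04":
        return "unknown"
    # The local header is 30 bytes long, so the first entry's name starts at
    # offset 30. Assumption: no extra field, so its content starts at 30 + 8.
    has_name = data[30:38] == b"mimetype"
    has_body = data[38:38 + len(EPUB_MIMETYPE)] == EPUB_MIMETYPE
    return "epub" if has_name and has_body else "zip"


def make_zip(entries):
    """Hypothetical demo helper: build an in-memory, uncompressed zip from (name, bytes) pairs."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        for name, payload in entries:
            zf.writestr(name, payload)
    return buf.getvalue()


if __name__ == "__main__":
    epub_like = make_zip([("mimetype", EPUB_MIMETYPE), ("META-INF/container.xml", b"<container/>")])
    cbz_like = make_zip([("page001.jpg", b"\xff\xd8\xff\xe0")])
    print(sniff(epub_like), sniff(cbz_like), sniff(b"plain text"))
```

As far as I know a CBZ carries no marker of its own, so a check like this can only report it as a plain zip; it separates EPUB from everything else in a zip container, not CBZ from other zips.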