03-23-2011, 06:16 PM | #1 |
Enthusiast
Posts: 39
Karma: 10
Join Date: Jan 2009
Location: South Pacific
Device: Kindle DX
|
Changing Format Without Parsing
For convenience there are some format changes I'd like to make, preferably without dumping everything out of Calibre and re-importing it. Has anybody had any luck with this kind of thing?
PRC-->MOBI (where the PRC file is actually just a MOBI file)
PDB-->Various (output the actual file the PDB contains)
TXT,RTF-->ZIP,RAR,TXTZ (convert uncompressed into compressed format)
I've specifically been trying to reduce the size of my library by compressing text files TXT-->TXTZ in batch, but this results in formatting issues and needs to be babysat on a per-file basis. I'd be just as happy with TXT-->ZIP. Conversion seems to engage file parsing, so that it actually goes TXT-->HTML-->TXTZ and does annoying little things like mashing together snippets of quoted verse/poetry. It seems like there are a few "conversions" of this nature that could safely skip the parsing process. |
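For the PRC-->MOBI case, one way to tell whether a PRC "is actually just a MOBI file" without converting anything is to look at the PalmDB header, where bytes 60-67 hold the type and creator codes (BOOKMOBI for Mobipocket books). The following is only a sketch: the function names are made up for illustration, and it copies the file outside of Calibre's control, so Calibre would still need to be told about the new format afterwards.

```python
import shutil
from pathlib import Path


def looks_like_mobi(path):
    """Return True if the PalmDB header carries the BOOKMOBI type/creator."""
    with open(path, "rb") as f:
        header = f.read(68)
    # Bytes 60-67 of a PalmDB header: 4-byte type followed by 4-byte creator.
    return len(header) >= 68 and header[60:68] == b"BOOKMOBI"


def copy_prc_as_mobi(prc_path):
    """Copy a MOBI-in-PRC file to a sibling .mobi file; the original is kept.

    Returns the new path, or None if the file does not look like a MOBI.
    (Hypothetical helper, not part of Calibre.)
    """
    prc_path = Path(prc_path)
    if not looks_like_mobi(prc_path):
        return None
    mobi_path = prc_path.with_suffix(".mobi")
    shutil.copyfile(prc_path, mobi_path)
    return mobi_path
```

Because the bytes are copied unchanged, no parsing happens and the formatting cannot be altered.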
03-24-2011, 10:08 AM | #2 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
You can also export, compress on your own, change extension to txtz and add to Calibre. Alternatively, with care, you could compress in the library (outside Calibre control), then run the library checker and it should find the new txtz files and the missing txt files. |
|
03-24-2011, 10:32 AM | #3 | |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Quote:
As stated, all conversion is: Input - OEB - Output. As such, all input plugins create OEB and all output plugins require OEB. This makes converting between formats much simpler. Previous versions (<= 0.4) used per input/output conversion code. It was unmanageable and in many cases redundant. So the conversion process was changed to what you see now. |
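To illustrate the Input - OEB - Output idea, here is a toy sketch. It is not calibre's code or API: the dict standing in for OEB and all function names are invented for the example. The point is that every format only needs one reader and one writer, and that anything the reader normalizes (here, line breaks inside a paragraph) is lost even when converting TXT back to TXT.

```python
def txt_input(raw):
    """Toy input plugin: parse text into a format-neutral book structure."""
    paragraphs = []
    for block in raw.split("\n\n"):
        # Joining the lines of a block is what "mashes" verse together.
        paragraphs.append(" ".join(block.split()))
    return {"paragraphs": paragraphs}


def txt_output(book):
    """Toy output plugin: write the neutral structure back out as text."""
    return "\n\n".join(book["paragraphs"])


def html_output(book):
    """Toy output plugin: write the neutral structure as HTML paragraphs."""
    return "".join(f"<p>{p}</p>" for p in book["paragraphs"])


# One reader and one writer per format, instead of one converter per pair.
INPUTS = {"txt": txt_input}
OUTPUTS = {"txt": txt_output, "html": html_output}


def convert(raw, src, dst):
    """Every conversion goes through the shared intermediate structure."""
    book = INPUTS[src](raw)
    return OUTPUTS[dst](book)
```

With N input and M output formats this needs N + M pieces of code rather than N x M, which is the maintainability gain described above.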
|
03-24-2011, 11:13 AM | #4 | |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Quote:
If it's also got a copy of the image and opf file, I wonder how much saving of space he'll get? That was the OP's purpose, but since Calibre already has the image in the library as cover.jpg, would converting a txt to txtz create a second stored copy of the image in the txtz file that duplicated the cover.jpg image already stored? The jpg format is already compressed, so it won't get smaller inside the zip. It just strikes me as a lot of work to save a few pennies worth of storage capacity. Edit: I realized I could answer the latter question myself. Yes, it duplicates/adds the cover/opf, but I still got about 50% file size reduction as compared to the uncompressed txt format. For my entire library, with many pdfs and epubs, it wouldn't reduce the size by much. For a txt only library, it might be worth it if your reader read that format. Last edited by Starson17; 03-24-2011 at 11:32 AM. |
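A rough way to see why the text shrinks while the cover does not: deflate (the compression used by zip) works well on plain text but gains nothing on data that is already compressed. In this sketch random bytes stand in for JPEG data, which is an assumption made to keep the example self-contained, and a repeated line compresses far better than real prose, so the numbers are illustrative only.

```python
import os
import zlib


def deflate_ratio(data):
    """Compressed size divided by original size (lower means more saving)."""
    return len(zlib.compress(data, 9)) / len(data)


# Repetitive sample text; real books will compress less than this.
text = ("Shall I compare thee to a summer's day?\n" * 2000).encode("utf-8")
# Random bytes as a stand-in for an already-compressed cover image.
noise = os.urandom(len(text))

print(f"plain text ratio: {deflate_ratio(text):.2f}")
print(f"already-compressed stand-in ratio: {deflate_ratio(noise):.2f}")
```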
|
03-24-2011, 11:34 AM | #5 |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
I created the TXTZ format because calibre requires ebooks be single files. No reader other than calibre supports it. Markdown and Textile formatted files can reference images, so TXTZ allows everything to be packaged into one file. A side effect of TXTZ is it allows for robust and standardized metadata due to the included OPF.
|
03-24-2011, 11:52 AM | #6 |
Wizard
Posts: 4,004
Karma: 177841
Join Date: Dec 2009
Device: WinMo: IPAQ; Android: HTC HD2, Archos 7o; Java:Gravity T
|
Thanks for the explanation. I can see the need.
I have a very old reader program (uBook) that handles zip files as a directory. It also reads txt format, and the reader device (an IPAQ) has limited space on its SD card. I'd actually created a few zipped txt files with covers - the near equivalent of the txtz format - for use on that reader. Calibre seemed happy with a txt file inside the zip with a cover inside, and the reader program just sees it as a folder with a txt and a cover, which it automatically displays. |
03-25-2011, 07:07 PM | #7 |
Enthusiast
Posts: 39
Karma: 10
Join Date: Jan 2009
Location: South Pacific
Device: Kindle DX
|
OK guys, sounds like the sort of fiddling I intend is best done outside of Calibre. Still, if you've got something whose structure you're already happy with, it'd be nice to be able to convert it without Calibre picking through and restructuring it.
About TXTZ though, a conversion yields this:
\The Complete Works of William Shakespeare (11709)
..\cover.jpg
..\metadata.opf
..\The Complete Works of William Shakespear - William Shakespeare.txt 5.3mb
..\The Complete Works of William Shakespear - William Shakespeare.txtz
..\..\cover.jpg
..\..\index.txt 1990kb
..\..\metadata.opf
Good with the commentary, but does interesting things to the iambic pentameter...
What would be the consequences of zipping text files myself en masse and ending up with:
\The Complete Works of William Shakespeare (11079)
..\cover.jpg
..\metadata.opf
..\The Complete Works of William Shakespear - William Shakespeare.txtz
..\..\The Complete Works of William Shakespear - William Shakespeare.txt 1990kb
And then doing a database restore to pick up the txtz files?
As an aside, any chance of getting a Windows MIME code for txtz that would let Windows Explorer open the archive, or launch a text editor to open the text file? |
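The "zip them myself en masse" idea could be sketched roughly as below. This is only an assumption-laden sketch, not a tested Calibre workflow: the function name is made up, it works on the library folder behind Calibre's back, and Calibre's database will not know about the new files until something like the library check or database restore mentioned in this thread is run. Originals are deliberately left in place, so try it on a copy of the library first.

```python
import zipfile
from pathlib import Path


def zip_txt_files(library_root):
    """Create a .txtz next to every .txt under library_root.

    Each archive holds the untouched text file under its original name,
    so no parsing or reformatting takes place. Existing .txtz files are
    skipped and the .txt originals are not deleted.
    (Hypothetical helper, not part of Calibre.)
    """
    created = []
    for txt in sorted(Path(library_root).rglob("*.txt")):
        txtz = txt.with_suffix(".txtz")
        if txtz.exists():
            continue
        with zipfile.ZipFile(txtz, "w", zipfile.ZIP_DEFLATED) as zf:
            zf.write(txt, arcname=txt.name)
        created.append(txtz)
    return created
```

Calling `zip_txt_files(r"C:\path\to\Calibre Library")` (path is a placeholder) returns the list of archives it created, which makes it easy to review the result before removing any .txt files by hand.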
03-25-2011, 07:12 PM | #8 |
Sigil & calibre developer
Posts: 2,488
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
You can make your own TXTZ and use the metadata editor dialog to add the file to the entry. The TXTZ just needs the TXT file, any images (cover.jpg) and optionally a file called metadata.opf with the metadata in it.
I'll ping Perkin to respond. I believe he uses Windows and has it set up like this. |
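Making a TXTZ by hand, as described above, amounts to zipping the text file together with any images and an optional metadata.opf. A minimal sketch follows; the function name and arguments are invented for illustration, and it assumes all files sit at the top level of the archive under their own names.

```python
import zipfile
from pathlib import Path


def build_txtz(txtz_path, txt_path, extra_files=()):
    """Pack a text file plus optional extras into a TXTZ (zip) archive.

    extra_files may include images such as cover.jpg and, optionally,
    a metadata.opf. Returns the names stored in the finished archive.
    (Hypothetical helper, not part of Calibre.)
    """
    with zipfile.ZipFile(txtz_path, "w", zipfile.ZIP_DEFLATED) as zf:
        # The text is stored byte-for-byte, so its formatting is untouched.
        zf.write(txt_path, arcname=Path(txt_path).name)
        for extra in extra_files:
            zf.write(extra, arcname=Path(extra).name)
    # Read the archive back as a basic sanity check of what was written.
    with zipfile.ZipFile(txtz_path) as zf:
        return zf.namelist()
```

The resulting file can then be attached to the existing book entry through the metadata editor dialog, as suggested above.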
03-26-2011, 03:32 AM | #9 |
Wizard
Posts: 3,720
Karma: 1759970
Join Date: Sep 2010
Device: none
|
With Windows XP you could set entire folders to be stored as compressed; Windows would then compress/decompress individual files as needed. So you'd have your entire library in a compressed folder under the hood, yet calibre would work as normal and you'd have no need to mess with individual file formats.
In Win 7 you right-click the calibre library folder, go to Properties ... Advanced ... tick "Compress contents to save disk space" ... then select apply to subfolders. I suspect it will not save much space though, unless you have a lot of .txt or .rtf format books. The epub, mobi, zip and pdf formats are already compressed, so compressing them again is not going to help much. |
03-26-2011, 04:11 AM | #10 | |
Guru
Posts: 655
Karma: 64171
Join Date: Sep 2010
Location: Kent, England, Sol 3, ZZ9 plural Z Alpha
Device: Sony PRS-300, Kobo Aura HD, iPad (Marvin)
|
Quote:
I use 7Zip, but you should be able to do it with any archive handler. To get it to open the TXTZ, I opened the folder one was in, right-clicked on the *.txtz file and selected 'Open with...', then browsed to where 7Zip was installed, selected '7zFM.exe', and used the option 'Always use the selected program...'. Then when clicking the 'TXTZ' link in the 'Book details' pane or window, the archive will automatically open in 7Zip; I can then double-click on any entries to open them, or use drag'n'drop to add stuff to the archive. I've been using Textile mark-up text a lot and have set any *.text files to open in EditPadPro (which I made a Textile styler for), and can then double-click any *.text file and have it open in EPP and automatically style the Textile markup as well. Any use? |
|
04-01-2011, 12:47 AM | #11 |
Enthusiast
Posts: 39
Karma: 10
Join Date: Jan 2009
Location: South Pacific
Device: Kindle DX
|
Excellent stuff guys, thanks. This ought to let me compress my library size temporarily until I can get around to converting everything to a uniform format.
That's what I was looking for Perkin, pretty much mimics what explorer does. Sorry I'm only able to check back sporadically... cyclone season is over and I've been jonesing to get out of the harbor and do some sailing. |
|