Old 09-16-2011, 12:24 AM   #241
denysm
Member
 
Posts: 12
Karma: 200
Join Date: Jan 2011
Location: Canada
Device: Kobo Touch
Great work Unboggling. How I wish I'd had that document when I started 6 months ago.
Old 09-16-2011, 07:23 AM   #242
unboggling
by the bootstraps
 
Posts: 1,055
Karma: 858115
Join Date: Jan 2011
Location: Southeast US
Device: PRS-T2, Nexus 7, KindleT, iPad1, Kindle3KB
Quote:
Originally Posted by denysm
Great work Unboggling. How I wish I'd had that document when I started 6 months ago.
Thanks, denysm. Yeah, I would have liked seeing something like this when I started out too. Helps to know other people think it's useful.
Old 09-16-2011, 04:14 PM   #243
unboggling
---

Link to latest: Workflow with Examples for New calibre Users, Version 0.90, 2011-09-24, ThreadPost #288.

---


Workflow Map for Managing eBooks with calibre, Version 0.80, 2011-09-16

This version: restructured as map of expanded workflow; new title.


Orient

Start New Iteration. The workflow starts here. When the steps of the workflow cycle are completed, the workflow restarts here for the next iteration of the cycle. By iteration I mean repetition of a process, applying the results to any results of prior use of the process. Doing the workflow is like being a runner on a long circular track, with each lap around being an iteration, and the benefits of repeated laps or runs accumulating over time.


Orient to Workflow Map. This is a map of my workflow for managing eBooks with calibre. I want to streamline my use of eBooks, calibre, and software tools to facilitate the ultimate purpose, reading eBooks on mobile devices. When someone suggests different strategies, methods, or practices might work better, I test how those work during iterations through the workflow cycle. If warranted I adopt them into the workflow on a more permanent basis, and integrate them into the workflow map. That enables progress in small steps toward sounder strategies, more effective methods, and better practices. So the map changes as the workflow changes.

Code:
    
    
    
    
    
>--+--<-------------------------<--------------------------<--+
   |                                                          |
   V                                                          A
   |                                                          |
   +>-+-->  Orient  -->  Learn  -->  Set  -->  Get  >---+     |
      |                                                 |     |
      A                                                 |     |
      |                                                 |     |
   +<-+---<  Edit  <--  Fix  <--  Assess  <--  Add  <---+     |
   |                                                          |
   V                                                          A
   |                                                          |
   +-->-------------->  Rest & Relaxation  >--------------->--+-->

Understand Why. When I first started with eBooks and calibre, I wished I could find more examples of how other people used calibre to manage eBooks. Examples help me learn. That's all this is - examples of what one user does, put together in a workflow map for using eBooks and calibre. This isn't a user guide. It's not advice, except to keep it simple. When I keep it simple, I can venture into learning any new area using small steps from that simple baseline.


Set Learning/Testing Goals. After the first iteration, set goals for the next iteration for learning and trying various features of calibre, or for trying any new strategies, methods, or practices for managing eBooks using calibre along with various tools for fixing formats or reading on devices.


Incorporate Changes. After the first iteration, revise the workflow to adopt any new strategies, methods, or practices that were learned and tested with positive results during the last iteration. Also revise to accommodate any new learning/testing goals.



Learn

Learn from Help Documentation. Read it carefully right after installing calibre, reread as necessary, and generally use it as a reference. That means the Quick-Start Guide that is in the library at installation, plus the Help documents on the calibre site: the Frequently Asked Questions section of the calibre User Manual, any other sections of the User Manual that seem appropriate at the time, and any relevant tutorials. The video tutorials are good, even if some of the graphics in them reflect old versions of calibre. I keep finding things I missed the first few times I went through it all, and parts of the Help are sometimes updated.


Learn from MobileRead. Learn initially about eBooks, calibre, reading device(s), and Digital Rights Management by reading various entries in the MobileRead Wiki, reading recent threads in MobileRead Forums, and browsing various internet sites, then continue to stay abreast of recent developments.


Learn more about Digital Rights Management (DRM). Learn initially how to deal with DRM as it relates to converting eBooks to format of choice for reading device of choice by searching for "Apprentice Alf" on the internet and reading that blog, realizing DRM plugins are not supported or endorsed by MobileRead Forums or the calibre developers. Periodically read newer blog entries and check for updates.


Learn from Mouse Tips and Stickies. Pay close attention to Mouse Tips and Stickies. The little boxes that come up when hovering the cursor over something contain important help messages about how calibre works. They are more up to date than the manual and tutorials due to the calibre software changing so rapidly through revisions, additions, and updates. That's also true for many Stickies at the top of the calibre forum and each calibre sub-forum.


Read calibre's What's New Changelog. When downloading new versions of calibre, read the What's New changelog tab and review the Major New Features tab. They help with knowing what's new or changed, and sometimes enable changes for the better in workflow.


Ask Questions in Forum. After the first month, I thought I had a handle on everything, so when I couldn't find an answer in the Help documents I didn't ask right away in the appropriate calibre forums. That was a mistake. For example, after 8 months of using calibre, I asked how to right-align data in a column and found out I'd never thought to double-click a column heading and use the commands to be discovered there - mildly embarrassing in that case, but worth it for productivity. When I ask a question, I put the specific question in the title of the post with a question mark, such as "How do I right-align a column?", then repeat it in the body of the post with any necessary details about the problem and its context. That way people can see at a glance from the title whether they know the answer, and get the details they need from the body.


Keep An Open Mind. There is usually more than one way to accomplish something in calibre or anywhere else. Remembering that helps me change strategy, methods, or work habits when circumstances indicate change would be good. And helps me keep an open mind when people on MR Forums suggest solutions or other ways.


Learn Gradually. Learn gradually over the long-term more about conversions, format clean-up, regular expressions (regex), HTML, Cascading Style Sheets (CSS), software tools, devices, and other eBook or calibre related knowledge and skills.


Learn Other calibre Features. Learn and try all the calibre features, gradually, one by one. I want to learn and then use these calibre features soon: more sophisticated regex for Conversion Search and Replace, and Cascading Style Sheets (CSS), both to be applied during conversion. I only recently started using Get Books and Fetch News, after ignoring them for months thinking they'd be difficult to learn or not meet my needs - and learned neither was difficult to learn or do, and both meet needs and simplify getting books or getting news and reading it in eBook form. I've had no need yet to use the Content Server and Command Line Interface features, other than trying them a few times to see what general capabilities they provide.



Set

Initially, do these steps. In future iterations, double-check all settings, consider how well each is working, and adjust as necessary.


Set Computer
  • Set Backups. Set backup software to do periodic backups automatically. Mine backs up my internal disk to an external drive on an hourly basis. The calibre application and all associated files are on my internal disk. I have file hosting/syncing services such as Dropbox but haven't used them with calibre and eBooks because I don't want to add another layer of complexity yet. In the future if I do use one of them with calibre libraries, I'll continue doing my own automated backup rather than depending on a server owned by someone else. I've had to restore from backup three different times after making various blunders.
  • Set Security. Set antivirus software to auto-scan all volumes but to exclude calibre libraries from scans. The books that I add to calibre were already scanned at download, scanned again if they were accessed by other applications such as a compression expander or reader, and scanned again when calibre copied them during Add Books. The exclusion prevents antivirus software from slowing down calibre.
  • Set Raw Books. Initially, set up a folder for raw books that are kept outside of calibre and have nothing to do with calibre's book folders. I set Raw Books at the root level of the same external drive where I download to a Download folder that is also at root (the first or top level of folders when looking at icons). The Raw Books folder has sub-folders Pending and Processed. I keep all original book formats in one or the other. When adding books to calibre, calibre copies them but doesn't move them. The raw copies may have bad metadata or not be cleaned up, but at least they are the original incoming formats; keeping them available is an insurance policy against future need. The Raw Books folder functions as a second kind of backup, but raw. I've found myself searching Raw Books numerous times and adding book formats from it into calibre for one reason or another, and I still rely on it being there.
  • Set Raw Books/Pending. Initially, move all eBooks that are elsewhere on the computer and not yet in calibre into the Pending folder, in a relevantly named subfolder - such as "Odds and Ends Collected From Computer" - so they're easy to find when adding to calibre.
  • Minimize Automation. Keep any automation as minimal and simple as possible. Trying to combine different types of automation both inside and outside of calibre when I didn't know what I was doing left me feeling frustrated and overwhelmed the first couple of months. That includes complex scripts, macros, complex computed columns that rely on other columns built from columns, regular expressions, and templates. It's better for me to do a process manually for a while until I'm familiar with it before trying to automate it, and not to automate it until I have a better knowledge of and feel for the technical parts of the automation. I want to keep things simple to avoid confusion, frustration, and extra work from unnecessary complexity.
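
The Raw Books layout described in the list above can be sketched in a few lines of Python. This is only an illustration of the folder structure, not something calibre provides or requires. The function name create_raw_books is made up for the example, and a temporary folder stands in for the root of the external drive.

```python
import tempfile
from pathlib import Path


def create_raw_books(drive_root):
    """Create Raw Books with its Pending and Processed sub-folders.

    drive_root is a stand-in for the root of the drive used for downloads.
    Folders that already exist are left untouched.
    """
    raw_books = Path(drive_root) / "Raw Books"
    for sub_folder in ("Pending", "Processed"):
        (raw_books / sub_folder).mkdir(parents=True, exist_ok=True)
    return raw_books


if __name__ == "__main__":
    # A temporary folder plays the role of the drive root in this demo.
    with tempfile.TemporaryDirectory() as drive_root:
        raw_books = create_raw_books(drive_root)
        print(sorted(p.name for p in raw_books.iterdir()))
```

Running it twice on the same root is harmless, which fits the idea of re-checking the setup early in each iteration.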

Set calibre Preferences:

Initially, most calibre Preference settings are fine at default until getting a feel for what each one does by reading the documentation and testing it. Later, double-check relevant settings early in each workflow iteration and also before starting significant bulk operations such as saving books out or converting a lot of books at once. Some of my current Preference settings are:
  • Look and Feel/Main Interface. Interface Layout: Narrow. Interface Font: Lucida Grande 14.
  • Look and Feel/Book Details. Displayed: Authors, Series, Title, Tags, Formats, Identifiers, Path, Comments, FQR, Genres, Kinds, Status. Default Link Template for wikipedia, unchecked Roman Numerals.
  • Look and Feel/Column Coloring. _q0 value in FQR (Format Quality Rating) for red text in Authors, Series, Title, and tag columns.
  • Look and Feel/Tag Browser. Partitioning method: disabled. For testing purposes, checked: Show average ratings in tag browser.
  • Behavior. Preferred Output Format: EPUB. Even though I primarily read MOBI on Kindle, I prefer EPUB format because it opens fast in the calibre viewer, works on my iPad without conversion, usually converts well to MOBI format, and is useful for clean-up purposes.
  • Behavior. Preferred input format order: EPUB at top, MOBI next, others in default order.
  • Behavior. Use internal viewer for: unchecked PDF and AZW4 formats, checked all others.
  • Add Columns. Discussed below as a Library feature, because unlike most other calibre preference settings that apply across all libraries, custom columns apply only to the library where they were created.
  • Toolbar. Set main toolbar and library context menu. I don't use the others, working mainly in the calibre library view booklist rather than device view booklist.
  • Searching. All unchecked and unused.
  • Conversion/Input Options/Comic Input. Checked Disable conversion of images to black and white.
  • Conversion/Common Options/Page Setup/Output Profile. Set to Kindle. Input profile is default.
  • Conversion. All other conversion options at default.
  • Adding Books. Checked: Read metadata from file contents, Copy To Library preserve date. Unchecked: Auto-merge. Tags to Add: "_New" (no quotes). Regular Expression box for adding by filename, the longest most complex choice in default menu:
    Code:
    (?P<author>[^_-]+) -?\s*(?P<series>[^_0-9-]*)(?P<series_index>[0-9]*)\s*-\s*(?P<title>[^_].+) ?
  • Saving Books. Checked: Save cover, Update metadata in saved copies, Save in OPF, Convert Non-English. Everything else default. Save template:
    Code:
    {author_sort}/{title}/{title} - {authors}
  • Sending Books. Automatic Management, everything else default.
  • Metadata Plugboards. Template for iPad not necessary for my reading needs yet. Template for Kindle adds series and series index to title on Kindle:
    Code:
    {series}{series_index:0>2s| - | - }{title}
  • Sharing by email. Set up, tested, seldom used.
  • Sharing over net. Set up, tested, for Content Server, seldom used. For loading iPad through iTunes, after re-install, didn't set it up in other preference settings yet.
  • Metadata Download. My downloaded metadata fields: Comments, Published date, Publisher, Rating. Sources checked and configured: Amazon (1; Comments, Published date, Publisher), ISBNdb (1; Comments, Publisher), Open Library (3).
  • Plugins. DRM plugins need periodic manual checking for updates. Calibre provides a message when updates are available for MobileRead/calibre-supported plugins and a great update method.
  • Tweaks. Publication date (year only): yyyy, Title and series sorting: strictly alphabetic.
  • Miscellaneous, Keyboard, and Template Functions. All at defaults.
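
To see what the Kindle plugboard template and the save template in the list above produce, here is a rough imitation in plain Python. It does not call calibre's template engine. The function names and sample book data are made up for illustration, and the handling of books without a series is an assumption (the series index is treated as empty, so the " - " prefix and suffix drop out with it).

```python
def kindle_title(title, series="", series_index=1.0):
    """Imitate {series}{series_index:0>2s| - | - }{title} in plain Python.

    Assumption: with no series, the series index counts as empty too,
    so only the title is left.
    """
    if not series:
        return title
    index_text = f"{series_index:g}"   # 1.0 -> "1", 2.5 -> "2.5"
    padded = f"{index_text:0>2s}"      # left-pad with "0" to a width of 2
    return f"{series} - {padded} - {title}"


def save_path(author_sort, title, authors):
    """Imitate the save template {author_sort}/{title}/{title} - {authors}."""
    template = "{author_sort}/{title}/{title} - {authors}"
    return template.format(author_sort=author_sort, title=title, authors=authors)


if __name__ == "__main__":
    print(kindle_title("Foundation and Empire", "Foundation", 2.0))
    print(kindle_title("Red Storm Rising"))
    print(save_path("Clancy, Tom", "Red Storm Rising", "Tom Clancy"))
```

The real Save to Disk also appends the file extension and cleans up characters that are not allowed in file names, which this sketch leaves out.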

Set calibre Libraries:

Initially, decide on the number of libraries to use and each library's purpose, then set up the libraries. Later, review the decision periodically and make any structural changes in libraries as desired early in an iteration.
  • Decide on Number of Libraries. At present I use two libraries: Core and Test. Core is the primary library for all activities except testing possible new library structures or big changes in dealing with books. Test is temporary, frequently deleted then recreated empty to match Core's current structure so I can test from that baseline on books copied from Core. At the start I also used a third library called Add just for processing books, evaluating and fixing formats, and working on the metadata. Now I do that in Core instead, adding books only a few at a time by author slowly, and working on them then and there just after I add them.

Reasons for One Primary Library:
  • Copy and Paste. Copying and pasting metadata across Libraries involves using Library/Quick Switch and search to find the item to copy, copy it, Quick Switch again, and then search to find desired paste location. So it means more time spent or more typing of metadata such as series information when adding new books.
  • Library Restructuring. After I learn something new or realize something I'm doing isn't working as smoothly as it could, I avoid changing Library structure - adding columns then moving metadata around or bulk changing a lot of it - until I have plenty of time and it's early in an iteration. With more than one Library, it's more work to implement Library structure changes across Libraries to make them consistent. While most Preference settings are global across all libraries, custom columns are not - they're specific to each library where created.
  • Metadata Restructuring. Doing this is more time consuming across multiple libraries. Also, I postpone any decisions that result in losing metadata until becoming aware of potential ramifications by asking about it on MobileRead or learning for myself through further experience. I don't want to simplify or streamline things to the extreme of losing data. Examples of what I did in all libraries that needed considerable work later to fix: deleted the articles "The", "An", and "A" from Title; changed publishers like Spectra (an imprint of Bantam) to a higher-level parent name (like Bantam) so publishers were more consistent; used only one of the 3 or 4 co-authors of an anthology rather than taking the trouble to enter all the authors.
  • Edit Metadata in Bulk. This only works across whatever books are selected in one library. For example, to delete an author's middle name, or to change a large multi-author multi-level series name. It's easier when all books by that author or in that complex series are all in the same library.
  • Search. Searching only works across one library at a time.
  • Tag Browser. The lists for each Category in Tag Browser only contain items from that specific library, not all libraries.
  • Content Server. Accessing books in calibre through the Content Server allows access only to whatever is in the Library that the Content Server is looking at. There's presently no way I know of to focus that Content Server on more than one library at a time.
  • Catalog. A Catalog may include all the books in the Library where it is generated, but can't include other books from other Libraries. An instance of calibre on one computer presently can't look at more than one Library simultaneously, or different parts of the same Library simultaneously. So when I want to compare different Libraries or parts of one Library visually, one method is to use a Catalog to see what's in one Library (or part) and use the calibre Library View booklist for the other, but the catalog on device is often awkward for seeing things at a glance.
  • Two Computers. With two libraries, the simplest way to work in Library 1 and see all the metadata in Library 2 at a glance is to use 2 computers side by side, Computer 1 and Computer 2, each running a separate instance of calibre, calibre 1 and calibre 2. This can be done wirelessly, or through a cabled external drive, or through a file hosting/syncing service - and each of those ways has its own various hassles and caveats. It's vital that calibre 1 and calibre 2 don't look at the same library files simultaneously, and that they don't make changes to the same library files successively. That means calibre 2 can't also write to calibre 1's Library files without messing up file permissions, metadata.db, and OPFs. So the safest way to do this is to access a copy of Library 2 from calibre 2 on Computer 2 and treat it as read-only, which is another hassle. Using just one library for everything except testing big changes avoids all of those convolutions and hassles.

Reasons for Separate Processing Library and Storage Library:
  • Simpler Searches. Ease of working with just books to be processed, no need to search on a tag for "New Book" or sort by Date to group them, simpler searches to group books for Edit Metadata in Bulk, simple sorts also group them easily.
  • Safety. Less chance of blunders using Edit Metadata in Bulk Search and Replace to process books if the entire set of previously processed books isn't also there at risk to any errors. After a book is processed in the Processing Library, it goes to live in relative safety in the Storage Library.
  • Sense of Accomplishment. Moving those newly processed books off to their new home results in feelings of accomplishment and satisfaction.

  • Create Libraries. In the Main Toolbar, under the Library icon, is the command Switch/Create Library. Use that to create new libraries. For each library to be created: in the dialog box, click the square button next to the New Location box, navigate to where the new library will be created, click the New Folder button to make and name a folder for the new library, then click the Choose button. Select the button for Create an empty library at the new location. Check the box for Copy structure from the current library; otherwise the new library will be created with default columns only, without the custom columns that are in the current library. Click OK. Repeat to create any other new libraries.
  • Set Columns for Each Library. Initially, add custom columns as desired for each library separately, then in future iterations review those decisions regarding effectiveness and ease of use, and change them with forethought when warranted. Unlike most other calibre preference settings working globally across all libraries, adding custom columns or deleting them apply only to the library where they are created or deleted. I try to minimize my use of metadata in general whether in custom or default columns - the more metadata kept, the more work later to maintain it consistently. Also the more complex automation in computed columns, the more work later to change something. So I try to minimize the number of custom columns. Those I use now are:
    • #dlgroup, DLgroup, text, for DownLoad group info like site name and web address, replaces Source column.
    • #formats, Formats, text, built from other columns, to see a book's formats at a glance in booklist.
    • #fqr, FQR*, comma separated text, for Format Quality Rating tag and format problem tags.
    • #genres, Genres*, comma separated text, for genres.
    • #isbn, ISBN, text, built from other columns, to see at a glance in booklist.
    • #kinds, Kinds*, comma separated text, for kinds such as anthology, collection, omnibus.
    • #pages, Pages, integers, format for numbers {0:,}, for pagecount from Count Pages plugin.
    • #prizes, Prizes*, comma separated text, for awards.
    • #status, Status*, comma separated text, for series-up-to-date, series-multi-author, my-rating, and other status tags.
    • #wkg, Wkg, comma separated text, for Working tags, grouping books for batch operations, also a temporary storage place for column information when restructuring library columns, or moving metadata around in bulk. Replaces #act column.
    • *Note 1. FQR, Genres, Kinds, Prizes, and Status are new columns to enable more precise searching. I moved most tags out of default Tags column to these.
    • Note 2. I tested yes/no columns temporarily: #read, #sma (series-multi-author), #sutd (series-up-to-date), then deleted them because I prefer using the column #status for tags for those and my-rating. I also hid the default ratings column again after testing ratings from various places - I don't agree with a lot of those ratings and prefer my own rating, which also means I've read the book.
    • Note 3. I don't use the default columns Languages, Modified, and Ratings, so keep them hidden unless testing something with one of them.
    • Note 4. Deleted the old #note column, will use default Tags column for any miscellaneous notes or tags that don't fit anywhere else until I have a specific need for some other dedicated column.
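
The number format {0:,} given for the #pages column is Python format-string syntax, so its effect can be checked in a Python prompt. The sketch below assumes calibre applies the column's format the same way Python's str.format does; the helper name and page counts are made up.

```python
PAGES_FORMAT = "{0:,}"  # the format string entered for the #pages column


def format_pages(page_count):
    """Format an integer page count with thousands separators."""
    return PAGES_FORMAT.format(page_count)


if __name__ == "__main__":
    for pages in (96, 1234, 1234567):
        print(pages, "->", format_pages(pages))
```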

Set Plugins.

For new users just starting out with calibre, I'd suggest getting comfortable with calibre for at least a month or so before using plugins. I waited four months before installing any, not counting DRM plugins. When ready, add and configure any desired plugins. That's done through the Plugins icon on the bottom row of Preferences: the Get New Plugins button shows a list of calibre-supported plugins and installs them automatically through calibre. In the Links section, there's a link to the calibre Forum's Plugins Sub-Forum and another link to the Index of Plugins, which has descriptions of each and also allows manual downloading. In Preferences/Plugins, the button Load Plugin from File is for any plugins that were manually downloaded from sites.
  • Plugins Work Across Libraries. Plugins work across all calibre libraries no matter which library was open when it was installed, unlike custom columns.
  • Plugins Frequently Used: Find Duplicates, Open With, Search Internet, Extract ISBN, Count Pages.
  • Plugins Occasionally Used: View Manager, Quick Preferences, Manage Series, Quality Check.
  • Plugins Testing. Sometimes I install and test other plugins to see if they'll fit specific needs, but in general try to minimize the number of plugins installed.
  • Plugin Commands and Icons. It's useful to access plugin commands that act on the current selection using the context menu for library booklist (right click selection), which allows leaving those plugins' icons off the main toolbar. I don't have any plugin icons in device toolbar or plugin commands in device view context menu because I only use plugins in the library view booklist, not in device view booklist.

Set New calibre Version Weekly:

Updates of calibre are usually released every Friday. I always upgrade right away because I want to see and work with any changes as soon as possible.



Get

Decide, Reaffirm, or Change Strategy. I'm "going slow" browsing, downloading, and adding books into calibre. There are learning curves for calibre's more advanced features and for eBook formats and conversions. The more books in the library, the more work to do in applying newly learned knowledge and skills across an entire library. Upgrading 100 books takes significantly less time than 10,000 books. This strategy boils down to browsing, downloading, and adding books to calibre a few by one author at a time, then processing those in calibre before adding more.

Choose an Author. Choose one author to focus on.

Search that Author in calibre. Search in calibre for that author by using Tag Browser or Search Box to get a list of all books by that author in the library. Make sure the list includes all books by that author alone plus those with co-authors, as well as any wishlist items by that author.

Search that Author in Operating System. Search in operating system for all books by that author in the Pending folder in Raw Books.

Search that Author on Good Internet Site. Determine if there are any other books by that author I might want now by going to a good internet site such as Internet Speculative Fiction Database (ISFDB), searching the author there, and seeing a list of all books by that author. Also, keep this open in separate browser tab or window for later use.

Do new Wishlist Items. If there are any I want as new wishlist items - to obtain later - I add wishlist items to calibre library now. I do this now because frequently I change my mind about wishlist, and want to buy and download right away instead of waiting. I don't use the Empty Book command to create a book record without a format, except when I want to drop a format into it. I created a folder containing empty text files titled Empty01 through Empty10 by author "Empty AAA", at first as text files then later converted to EPUB format and saved back out to disk. When needed for wishlist items, I add a group of 10 of those "empty" book formats and change the metadata of one or several appropriately, assigning tag _q0 indicating it's a wishlist item.

Why Not Use Empty Books? Save to Disk doesn't copy empty books (without formats) to disk. Copy to Library (on context menu) will copy empty books (without formats) to a different library. When I routinely worked with two libraries in the past, I preferred using Save To Disk followed by Add Books to a different library instead of Copy To Library for moving books between libraries because Save to Disk ensured the metadata was saved to my EPUB formats' internal metadata fields during the Save, while Copy to Library didn't do that. The only commands that do are Convert format, Save to Disk, and Send to Device, and those commands work on the copy, not the original. An empty book record can't carry internal metadata fields in a format it doesn't have, while an EPUB placeholder format can.

Browse eBook Distribution sites. There are two ways to do this. Search by author's name in calibre's Get Books feature, or navigate by browser to a likely site such as Amazon or Barnes and Noble. Get Books is easiest for those sites it includes, provides price comparisons, and shows which books have DRM or not.

Download Books. Buy the desired books if they're not free, and download to computer rather than device to avoid extra steps.

Get Source Info. While still on Download site, enter the source (usually site name) in a Download Group folder name, put those downloaded books by one author into a sub-folder labeled with that author's name in the Download Group folder, and move that Download Group folder to Raw Books/Pending folder. The folder labeled with source, the Download Group, is usually named like this: "Amazon - Clancy - X" where X is some unique number or phrase like date-time to reduce folder-name conflicts later. In addition to noting source, I can add other metadata I want at the time from the download site in associated folder-names, filenames, or a new text-file - but usually don't do much of that at download time except to make sure I note source in Download Group folder.
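
The Download Group naming and filing step above could be scripted roughly like this. It is a sketch only: the function names are made up, the date-time stamp is just one possible choice for the unique "X" part, and in the spirit of keeping automation minimal it does nothing beyond creating the folders and moving the files.

```python
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def download_group_name(source, author, when=None):
    """Build a folder name like "Amazon - Clancy - 20110916-161400"."""
    when = when or datetime.now()
    return f"{source} - {author} - {when:%Y%m%d-%H%M%S}"


def file_download_group(book_files, pending_dir, source, author, when=None):
    """Move downloaded files into Pending/<Download Group>/<author>/."""
    group_dir = Path(pending_dir) / download_group_name(source, author, when)
    author_dir = group_dir / author
    author_dir.mkdir(parents=True)  # raises if this group/author folder exists
    for book_file in book_files:
        shutil.move(str(book_file), str(author_dir / Path(book_file).name))
    return group_dir


if __name__ == "__main__":
    # Temporary folders stand in for the Download and Raw Books/Pending folders.
    with tempfile.TemporaryDirectory() as root:
        downloads = Path(root) / "Download"
        pending = Path(root) / "Raw Books" / "Pending"
        downloads.mkdir()
        pending.mkdir(parents=True)
        book = downloads / "example.epub"
        book.write_bytes(b"")  # empty placeholder file for the demo
        group = file_download_group([book], pending, "Amazon", "Clancy")
        print(group.name)
```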

Direct Download to Device? If the site allows download only directly to a device, download there first, then copy the book to the computer and into a Download Group folder labeled as in the step above, to add later with a group of books.



Add

For New Author in One Download Group:
  • Set Preferences/Add Books. Ensure Preferences/Add Books is set like this: Add the tag _New to Tags column. Auto-merge unchecked. (Later when the little dialog box comes up asking "Add Duplicates?" always choose to add them, because I generally want to see and assess any new duplicates of book records or even formats before any deletion.)
  • Set Preferences/Add Books, "Read Metadata From". Decide "Read metadata from" methods for the Add and set them. I prefer to use checked "Read metadata from file contents" - because it's easiest, fastest, and works with most relatively recent retail EPUBs and MOBIs. Where that doesn't work well, unchecked "Read metadata from file contents" will read from the file name according to the regex in the Add using filename regex box, which I usually leave set as this menu choice:
    Code:
    (?P<author>[^_-]+) -?\s*(?P<series>[^_0-9-]*)(?P<series_index>[0-9]*)\s*-\s*(?P<title>[^_].+) ?
  • If Reading From Filename, Choose Method. These are choices of methods to use with reading from filename:
  • Standardize author, series, and title in the filename out in the Operating System first before Add Books, either manually or using successive passes with different regexes in a file renamer tool to standardize all of the file names to match a chosen regex for Add Books by filename.
  • Successively import small batches, each batch matching a regex written on the fly to suit that batch of books' different file naming convention, with the regex automatically putting the different elements into the correct calibre columns. This one seems easiest and fastest for someone with sophisticated regex skills, which I don't have.
  • Standardize author, series, and title from the filename inside calibre after Add Books imports them all as a mess, either manually or using Edit Metadata in Bulk, Search and Replace, Regex mode. The first six months I usually went with this "Add it all as a mess" option and did the corrections manually in calibre because I wasn't familiar enough with regex to do it the other ways.
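
Before adding by filename, it can help to check how a few filenames split under the regex quoted above. The sketch below runs that same pattern with Python's `re` module outside calibre. It assumes the file extension is removed before matching and that surrounding spaces are trimmed afterwards; the helper name `parse_filename` and the sample filenames are made up for illustration.

```python
import re
from pathlib import Path

# The pattern quoted above from the Add Books "filename regex" box.
ADD_BOOKS_REGEX = re.compile(
    r"(?P<author>[^_-]+) -?\s*(?P<series>[^_0-9-]*)"
    r"(?P<series_index>[0-9]*)\s*-\s*(?P<title>[^_].+) ?"
)


def parse_filename(filename: str) -> dict:
    """Return the author/series/series_index/title pieces, or {} if no match."""
    stem = Path(filename).stem  # drop the extension (assumed behaviour)
    match = ADD_BOOKS_REGEX.search(stem)
    if match is None:
        return {}
    # Trim stray spaces left around each captured group.
    return {name: value.strip() for name, value in match.groupdict().items()}


if __name__ == "__main__":
    for name in ("Tom Clancy - Jack Ryan 4 - The Hunt for Red October.epub",
                 "Tom Clancy - The Hunt for Red October.epub"):
        print(name, "->", parse_filename(name))
```

Filenames that come back as an empty result, or with pieces in the wrong group, are the ones worth renaming first or importing in a separate batch.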
For a Download Group Folder Containing That Author:
  • Add Books. Add Books by that author out of a specific Download Group (DLgroup) folder. Select them out in the Operating System and drag and drop them onto calibre's library view booklist.
  • Enter DLgroup. For that group, Edit Metadata in Bulk, enter the DLgroup info from that Download Group folder-name, including at least the name of the download site.
  • Repeat. Go back to beginning of the Add section, and repeat for that same author in the next of the other Download Group folders in Pending - dragging and dropping, then entering source. Do each Download Group that contains books by that author. When done, New Author Group is ready for further processing.
  • Mark Relevant Folders -Added. Out in the Operating System, append "-Added" to all the Download Groups for that author that have been added.
  • Note, about this Repeat. Each set of books from each different download site may need a different choice of add method, which is why they're not all combined into a single Add batch for that author. If all the books are relatively new retail books from reputable vendors, they could probably be done all in a batch using Read Metadata From File Contents to save steps, but since they might be from different Download sites, that would cause confusion with naming the correct DL Group for each book.
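
The "Mark Relevant Folders -Added" step can also be scripted. This is a minimal sketch, assuming each Download Group folder in Pending holds one sub-folder per author, as described earlier; `mark_added` is a hypothetical helper name, and it should be tried on a copy of the folder first.

```python
from pathlib import Path
from typing import List


def mark_added(pending: Path, author: str, suffix: str = "-Added") -> List[Path]:
    """Append the suffix to every Download Group folder that holds this author.

    Folders already carrying the suffix are skipped. Returns the new paths.
    """
    renamed = []
    for group in sorted(pending.iterdir()):
        if not group.is_dir() or group.name.endswith(suffix):
            continue
        if (group / author).is_dir():
            target = group.with_name(group.name + suffix)
            group.rename(target)
            renamed.append(target)
    return renamed


if __name__ == "__main__":
    pending = Path("Raw Books") / "Pending"
    if pending.is_dir():
        for folder in mark_added(pending, "Tom Clancy"):
            print("marked:", folder.name)
```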


Assess

For Entire New Author Group:
  • Treat Group of Newsfeeds as New Author Group. Any newsfeed downloads come into calibre directly without having to Add them, so to include them in this workflow, just deal with them along with the current New Author Group being processed, or as their own "Author" Group in their own dedicated workflow iteration if there are enough to make that worthwhile. Skip any steps that don't make sense for newsfeed items. For example, for the newsfeeds I've tried so far, by default Get News automatically converts the downloaded content to Preferred Format, so skip the next step for those. I don't know enough yet about Get News recipes, don't know if they can specify other formats or not.
  • Convert to Preferred Format. I convert everything to EPUB including problem PDFs and AZW4s (Amazon, PDF wrapped in MOBI). Most EPUBs are usually a good enough quality conversion for my fiction-reading purposes, after clean-up, excepting specific types of PDFs or other formats that are graphics-laden, complex layout, or textbooks. The exceptions can be dealt with later. EPUB is usually a good format for initial assessment, possible clean-up, reading on iPad, or conversion to MOBI for Kindle.
For Each Book One by One:
  • Assess Format. I do this using calibre viewer for EPUBs. For other formats, the Open With plugin will open each in Acrobat for PDFs, Kindle Previewer for MOBI based formats, or any other tools for other formats when necessary.
  • If Decide Not to Fix Now. If the format's clean-up needs look beyond my present skill level or I just want to delay fixing, I tag it as "needs fix" (_q1) and also tag the kinds of format problems it may have. If the problems are serious and the book wasn't free, I try to get my money back from the vendor. Now that this book has been assessed, go back to the previous step and assess the next book.
  • If Decide to Fix Now. If I want to read it soon and it looks like I can successfully fix it in five minutes maximum time, I proceed with preliminary decisions about how to fix the format.
Choose a Format and Tool for Fixing. If a conversion is necessary, I want to use the cleanest and least converted format available as the conversion Input Format. That's usually the Original Format that was added to calibre, before any conversions. If the Original Format isn't already one of the format choices listed below, a conversion is necessary. Choices of Conversion Output Format to use for fixing include:
  • EPUB for fix in Sigil or other EPUB editor.
  • RTF for fix in Open Office, Word, or other editor.
  • HTMLZ for fix in any HTML editor, might be useful after I learn HTML.
  • PDF for fix in Acrobat or other PDF editor.
  • MOBI for fix in MOBI editor.
  • TXT or TXTZ lose formatting such as Bold/Italic, sometimes useful.
Choose a Fix-Format Conversion Sequence. Skip the first conversion in the sequence if a fixable format already exists, and just save out a copy of that and clean it up. These are some of the conversion sequences I've tried. I like #1 then #2 for quality of results, but I'm not yet comfortable enough in HTML to edit directly in HTML editor or in Sigil. (Note, EPUB innards are mostly XML, XHTML, and HTML, plus images.) Presently I'm choosing #3, which is open as RTF and fix in Open Office, save to ODT, convert that to EPUB in calibre - this is so far best for me in simplicity and ease, resulting usually in good readability though not finely-tuned format quality.
  1. Original Format --> HTMLZ --- HTML editor fix, save as HTMLZ (or HTML then zip) --> Preferred Format.
  2. Original Format --> EPUB --- Sigil fix, EPUB, save as EPUB --- Already was/is my Preferred Format.
  3. Original Format --> RTF --- Open Office fix, save as ODT --> Preferred Format.
  4. Original Format --> RTF --- Open Office fix, save as RTF --> Preferred Format.
  5. Original Format --> HTMLZ --- Unzip, Open Office fix, save as HTML, zip --> Preferred Format.
  6. Original Format --> RTF --- Word fix, save as DOCX --> Open Office ODT --> Preferred Format.
  7. Original Format --> RTF --- Word fix, save as RTF --> Open Office ODT --> Preferred Format.
  8. Original Format --> RTF --- Word fix, save as RTF --> Preferred Format.
  9. Original Format --> RTF --- Word fix, save as HTML, zip --> Preferred Format.
  • Note 1: Calibre supports ODT as input format but not as output format.
  • Note 2: Calibre does not support Word DOC and DOCX as input or output formats.
  • Note 3: Writer2ePub extension to Open Office is an option for converting to simple EPUB after fix in Open Office.
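
Sequence #3 above is done through calibre's Convert dialog in this workflow. For anyone who prefers the command line, calibre also ships the `ebook-convert` tool, which picks input and output formats from the file extensions. The sketch below only builds the two commands for sequence #3 and does not run them; the function name `sequence_3_commands` and the folder layout are assumptions for illustration.

```python
from pathlib import Path
from typing import List


def sequence_3_commands(original: Path, workdir: Path) -> List[List[str]]:
    """Commands for: Original --> RTF, (fix in Open Office, save ODT), ODT --> EPUB."""
    rtf = workdir / (original.stem + ".rtf")
    odt = workdir / (original.stem + ".odt")
    epub = workdir / (original.stem + ".epub")
    return [
        # Step 1: convert the original format to RTF for editing.
        ["ebook-convert", str(original), str(rtf)],
        # Manual step in between: open the RTF in Open Office, fix, save as ODT.
        # Step 2: convert the fixed ODT to EPUB.
        ["ebook-convert", str(odt), str(epub)],
    ]


if __name__ == "__main__":
    for cmd in sequence_3_commands(Path("FixFormats/book.pdf"), Path("FixFormats")):
        print(" ".join(cmd))
        # To actually run one: subprocess.run(cmd, check=True)
```

The second command only makes sense after the manual Open Office fix has produced the ODT file.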


Fix

For Each Book One by One (continued):
  • Convert. Convert if necessary to choice of fixable format. I chose #3 sequence this time, so it's RTF.
  • Doublecheck Save To Disk settings. If necessary change settings to include checks for Save cover separately, Update metadata in saved copies, and Save metadata in OPF file. I also check Convert non-English characters to English equivalents.
  • Save To Disk. Save out the book including all formats existing in the record. Save into a FixFormats folder.
  • Open Format in Clean-up Application. Drag and drop the format to be fixed (#3, RTF) onto Open Office (#3) icon.
  • Choose Edit Menu, Find & Replace. Choose options in Find & Replace carefully.
  • Replace problems with fixes. This usually requires multiple passes for each different problem - for a person like me who is not sophisticated yet with regex syntax and convolutions. As a rule of thumb for using character-mode search/replace, the first pass finds/replaces the most complex string, the next pass the most complex remaining string, and so on, until that sequence is done and the particular problem is fixed. For fiction, I always want to get rid of headers, footers, and page numbers and avoid splitting paragraphs in the process. Less frequently I'll spend time to fix other annoyances, such as pagination problems by removing all page breaks then inserting a pagebreak to precede each chapter heading; bold style applied to 17 chapters but not the others; margins, indents, and section breaks; Table of Contents only for large omnibuses or story collections, since I'm not worrying about or using TOCs in novels.
  • Keep in Mind:
  • Stick to Time Limit. The maximum time I'm willing to spend for clean-up of one format is five minutes. It all requires practice and care to do it well. The more regex I learn, the faster the clean-up process is accomplished. The more books I clean up, the better my related knowledge and skill-set. If it exceeds the time limit I stop, tag it _q0 format quality along with $xXYZABC tags describing the format problems, and it's demoted to placeholder.
  • Stick to Goal. My goal as a reader isn't a perfect eBook, but to spend the least amount of time to make it "readable by me with as little annoyance as possible."
  • Use What Works. I'm comfortable in Word so I had been using the conversion sequence discussed above that includes Word DOCX. The conversions from RTF to DOCX to ODT each reduced size considerably but I'm not experienced enough with evaluating formats to know much about their resulting quality except that it "looked okay for me to read now." During this current iteration of calibre use, I want to reduce the number of conversions and simplify that process so now I'm using RTF to fix in Open Office to ODT.
  • Learn better ways. After I'm more sophisticated using regex, I'll switch to using calibre's conversion search and replace to remove headers, footers, and page numbers. Once I learn enough HTML to be more comfortable, I'll switch to cleaning-up using the simplest path available, Sigil or HTML editor, or calibre's Tweak EPUB or Conversion Search/Replace. I'm making it a high priority to learn regex, Sigil, CSS, and HTML.
  • Minimize Conversions. Minimizing the number of conversions in a clean-up sequence saves time, simplifies workflow, and most important, achieves higher quality of format. Like photocopying copies of copies of copies, or successively converting audio files through "lossy" compressions or types of recording media, each step loses more formatting and content information while introducing more errors.
  • Avoid Extremes. There are two extremes regarding fixing format problems. One is to just want to read eBooks and not care much about the formatting and any format problems. The other is to want to make each format as perfect as possible, while skills tend to keep increasing, generating a need to periodically spend a lot of time going back to re-fix older books to bring them up to par across the library. I want to sit the fence between these extremes. I don't worry about all format problems, just the ones that annoy me the most that I currently know how to fix.
  • Revisit Older Unfixables. Occasionally, when processing books for an author, I see old formats tagged with various format problems that I couldn't fix months ago, and realize I can fix some of them now because along the way I learned something since then, which is why tagging unfixable formats with tags for types of format problems is helpful. So those get added to the books to fix now.

  • Save as ODT to FixFormats folder. Save it as ODT into the same folder in FixFormats that calibre saved out, into the subfolder holding the other formats, then quit Open Office.
  • Add into calibre. Drag and Drop from Clean-Up folder to add those book formats back into calibre.
  • Convert to EPUB. Convert ODT to EPUB.
  • Assign Format Quality Rating Tag. Do quick assessment of the new EPUB. Rate its format quality.
Format Quality Rating Tags:
  • _q0, wishlist item or bad format, both useful placeholders.
  • _q1, indicates "needs clean-up" if I delay clean-up until later, formerly I didn't use it.
  • _q2, rare cases where it's more than minor annoyance, not fixable, but retained anyway.
  • _q3, okay, readable with only minor annoyance.
  • _q4, good, readable with no annoyance.
  • _q5, excellent. I don't bother with this, except for a few examples.
  • Note. I also use _q0 as the basis to color a record's text red. For bad formats, it saves the trouble of using an empty book or empty book placeholder format. When using catalogs or content server with devices, it indicates wishlist items. For catalogs, it goes into the Read section's choice of columns (FQR) to get a checkmark indicating wishlist item.
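
The rating tags above form an ordered scale, which makes them easy to compare. A small sketch of that idea, outside calibre: the dictionary text is taken from the list above, while the helper names and the "readable means _q3 or better" cut-off are assumptions drawn from the tag descriptions.

```python
# Format Quality Rating tags and their meanings, as listed above.
FORMAT_QUALITY = {
    "_q0": "wishlist item or bad format (placeholder)",
    "_q1": "needs clean-up, delayed until later",
    "_q2": "more than minor annoyance, not fixable, retained anyway",
    "_q3": "okay, readable with only minor annoyance",
    "_q4": "good, readable with no annoyance",
    "_q5": "excellent",
}


def quality_level(tag: str) -> int:
    """Turn a tag such as '_q3' into its numeric level (3)."""
    if tag not in FORMAT_QUALITY:
        raise ValueError(f"unknown format quality tag: {tag!r}")
    return int(tag[2:])


def is_placeholder(tag: str) -> bool:
    """_q0 marks wishlist items and bad formats kept as placeholders."""
    return quality_level(tag) == 0


def is_readable(tag: str) -> bool:
    """Assumed cut-off: _q3 and above read with minor annoyance or better."""
    return quality_level(tag) >= 3
```

Comparing `quality_level` of two records gives a quick answer to which copy has the better format.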

  • Remove ODT Format. If new EPUB format looks okay, remove the ODT format in the new record.
  • Keep Original Format If Problem. If the Original Format didn't convert well to EPUB, I keep it in the book record along with the EPUB that was generated for initial assessment. This happens often with PDFs with complex graphics, old image-based PDFs, or technical PDFs with complex layouts. In these cases I add a specific tag _q2 that means "keep original format".
  • Delete Original Incoming Format When Have Good EPUB. If it wasn't an EPUB to begin with, and it converted well to EPUB or cleaned-up well, I delete the original incoming format from calibre's book record. I still have the downloaded original out in Raw Books. Most formats that I keep in calibre are readable with only "minor annoyance" or "better" on my reading devices, once metadata has been updated and corrected and annoying format problems cleaned up. If a format is not readable without major annoyance, I either delete it completely or tag it as "bad format" with _q0 along with tags describing the format problems, keep it as a placeholder, and don't read that book yet, hoping I'll find a better format some day or learn how to fix those problems in the future.
  • Remove Old Book. Remove the old record and all its formats.
  • Go to Next Book to Assess. Go back to Assess section, For Each Book One by One sub-section, and assess the next book, until they've all been assessed and those chosen for fixing now are all fixed.
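
For the "Replace problems with fixes" step, the header, footer, and page number removal can be pictured as a regex pass over plain text. This is a rough sketch only, assuming the book has been exported to text with one header or page number per line; it is not how Open Office or calibre do it internally, and `strip_running_text` is a made-up name.

```python
import re
from typing import List

# A line holding only a page number, optionally written as "Page 12".
PAGE_NUMBER_LINE = re.compile(r"^\s*(?:Page\s+)?\d+\s*$", re.IGNORECASE)


def strip_running_text(text: str, running_headers: List[str]) -> str:
    """Drop page-number lines and lines equal to a known header or footer."""
    headers = {h.strip().lower() for h in running_headers}
    kept = []
    for line in text.splitlines():
        if PAGE_NUMBER_LINE.match(line):
            continue  # page number on its own line
        if line.strip().lower() in headers:
            continue  # repeated header or footer text
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "It was a dark night and the\nThe Hunt for Red October\n12\nwind howled.\n"
    print(strip_running_text(sample, ["The Hunt for Red October"]))
```

One caution: a chapter heading that is just a number on its own line would be removed too, so a pattern like this needs a quick check of the result before saving.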


Edit

Column and Edit Metadata Order:
  • Note, Column Order. Moving from left to right, my columns are in this order: Authors, Series, Title, Tags, Formats, Pages, Size, DLgroup (Download Group), FQR (Format Quality Rating), Genres, Kinds, Prizes, Status, Wkg (Working), Published, Publisher, ISBN, Date. I don't use Modified, Languages, or Rating. I prefer to leave columns in my preferred order for browsing books in library booklist. The View Manager plugin allows switching to different column orders and sorts at the click of a button - but I don't use it often because I'm most comfortable using just one column order for everything.
  • Note, Metadata Editing Order. The general progression for entering and editing metadata is similarly left to right, with a few exceptions that don't match the order of columns.
For Each Book One by One:
  • Enter Format Quality Rating Tag. I put appropriate Format Quality Rating tag in FQR column.
  • Correct Authors. I correct the author(s) names in Authors from information in the book itself. Open it with calibre viewer to the title or copyright page areas and doublecheck information there.
  • Correct Series Name and Series Index. At this point I use whatever series name and index number is in the book. Later I correct it to match a series convention obtained at a web site, or a convention I've previously used on other series members in the library.
  • Correct Title. Do necessary Title corrections, including edition information (EdNo: 3; Ed: Editor's Name) and variant titles in parentheses (vt: Variant Title) as well as titles of major elements of omnibuses (Title of E1; Title of E2; Title of E3). So a search of Title field later will find all that information. Temporarily this extra metadata meant for Title column goes into the Working column leaving Title pure for the purpose of metadata download, then appended to Title after the metadata download.
For Entire New Author Group:
  • Count Pages. The plugin does the work on selected books.
  • Extract ISBN. The plugin does the work on selected books. This facilitates a more specific and accurate metadata download. Note the plugin can't do PDFs, which is one reason I convert all PDFs to EPUB.
  • Note, re EPUB. Count Pages and Extract ISBN plugins don't work on PDFs. Also many metadata fields do not get updated in PDFs, but do in EPUBs. Count Pages and Extract ISBN do work on EPUB and MOBIs, which is one reason I convert all formats to EPUB, even when I know the conversion will be bad. Can always delete the EPUBs later in those cases, which are tagged with Format Quality Rating _q2 during Assess, to know which records need to keep the original format.
  • Do Metadata Download. Prior corrections in Authors and Title columns are vital for this, and ISBN narrows it down. I do a limited Metadata Download making sure not to overwrite any of the columns I've just filled in, by checking only the desired fields for download in Preferences/Metadata Download. My download choices usually are: Published date, Publisher, Comments, Cover. Calibre grabs ISBN automatically if it's not already in the record. I always grab a cover, even when book has internal cover already. I keep only a few sources checked (figuring the more checked, the slower the grab). Amazon's seemed more consistently accurate with broader item availability than others. By default I also use ISBNdb and Open Library. Others I keep unchecked and only use on a case by case basis when needed or when testing something.
  • Optionally, do metadata download testing. I just finished testing Goodreads. When I have time I'll test Barnes & Noble, then Fantastic Fiction. The tags and ratings I sometimes downloaded from Goodreads were a temporary thing while testing. At this point My Tags column has only _New (unless testing a tag source, which tags it would have too). DLgroup column has source info. FQR column has Format Quality Rating, and possible tags on types of format problems still remaining after assessment and fix.
For Each Book One by One:
  • Enter my tags. I use my own tag scheme for these. Now I know enough from the format assessment and downloaded metadata such as comments to decide on what tags I want to use for genre, book-type (omnibus, collection, anthology, short story), and so on. I include a tag for "To Read" on any book that I want to read soon. I recently switched from using the default Tags column to custom columns for most tags, to enable more precise informal searching.
  • Delete Temporary Tags. Delete any temporary tags I may have downloaded for testing after noting how they compare to my own regarding standardization and consistency across similar kinds of books (which is poorly so far, even Goodreads tags).
For Entire New Author Group:
  • Move Extra Metadata to Title. Copy any extra metadata belonging to Title from Working column where it was stored temporarily, appending to Title using Edit Metadata in Bulk Search and Replace in Regex Mode. Delete those from Working by deleting all tags in the working column for that selection.
  • Find All Books by that Author. Use the Tag Browser or Search Box to find the older books in the library by that author including any with co-authors, and group those with the newly added books. Those become part of New Author Group now.
  • Deal with Duplicates. If duplicates exist, deal with them. I do it by eye in booklist or I use the Find Duplicates Plugin command in the context menu for the selection of books by that author. Then I compare Format Quality Rating tags. If the newer is a better format than the older, I delete the formats in the older record and merge the newer better format into the older record, then delete the newer record. If the newer record has a worse format rating, I just delete that newer record without merging.
  • Consider, Less is More. That's a good rule of thumb to remember while entering metadata in tags or columns. Less is more. It's better for me to have fewer tags and columns that are relatively consistent and standard than a whole lot of tags and columns in which the information is an unstandardized and inconsistent mess. Making the metadata consistent across a library requires a lot of work, steadily increasing in time required as the library grows and I want to make changes, and the more metadata there is per record, that requires even more time. So less is more. Less time for maintaining the library. More time for reading.
  • Choose a Site for Metadata Standardization. I like Internet Speculative Fiction Database (ISFDB) best for standardization and consistency (except for genre tags, where they're not) and it's great for series name standardization, awards, covers, ISBN13s, identifying elements of omnibuses, etc. Sometimes I also use the original download site (indicated in DLgroup column), author's site, Amazon, WorldCat, or Wikipedia. Most of my books are speculative fiction so using ISFDB where possible works well for me, even for some paranormal romance. For thriller, suspense, romance, or other-genre novels Wikipedia is often good if it's a popular author. I use the plugin Search the Internet sometimes, but when I'm not looking for one specific book title it's easier to go to a site navigating by browser url box or bookmark.
  • Do final additions and corrections. For the entire New Author Group, I double-check all metadata, and add more or correct it manually if necessary by comparing with existing books in calibre by that author and using good web sites that use relatively standard conventions across metadata categories to correct and standardize metadata across all books by that author. I get better covers, correct ISBN13s for editions, edition metadata, Series Name and Series Index, Published-dates, or other relevant metadata from relevant internet sites. My overall goal here is to make all the metadata standardized and consistent for that author, and ultimately for all books across the entire library.
  • Standardize ISBNs. I use a web site "ISBN Convert" that does conversions from ISBN10 to ISBN13 to change any remaining ISBN10s (Links section, Metadata, ISBN Convert).
  • Delete Unnecessary Formats. Delete any remaining unnecessary formats. Nearly all of my book records, after assessment, possible fix, and entering/updating metadata, contain just one format, EPUB, and some contain just PDF. For most records containing PDF format and tagged Format Quality Rating _q2 (readable with more than minor annoyance but retained anyway), I usually delete the assessment EPUB now and keep the original PDF alone in the record, particularly useful when it's reference material that I'll refer to a lot. By unchecking PDF in Preferences/Behavior "Use Internal Viewer For", and leaving it unchecked all the time, and setting that choice of application for opening .pdf filetype in the operating system, doubleclicking the record in calibre allows the operating system to open it directly in a PDF reader or editor like Acrobat on the computer. These "problem" PDFs are usually more readable in native format PDF on computer, Kindle, and iPad, than in a converted format. Another EPUB can always be generated from it later if necessary.
  • Set Metadata. Optionally, convert EPUB to EPUB to lock in metadata changes into internal fields in the EPUB format to immediately force them to be consistent with OPF and metadata.db. Another way to do this which is better because it avoids an unnecessary conversion but involves a couple more steps: Save out the entire New Author Group, Add it Back in, compare briefly with the older set to make sure none are missing, then delete the older set.
  • Delete _New Tag. Delete the tag _New .
  • Give Self a Pat on the Back. The group of books by an author calibre-wide and computer-wide is now completed.
  • Note, Re Raw Books and Author Folders. I don't bother to re-organize all books by one author into one relevant author folder per author throughout Raw Books folder, for several reasons. New books go into it by Download Group folder, not by a complete one-author group folder. Operating System search can find an author name wherever it resides across the Pending or entire Raw Books folders as long as author name is relatively intact somewhere in the folder path or filename. Sometimes there are multiple formats of the same title, all in various filename standards, and sometimes multiples of the same title and the same format, so trying to put all of these into order into one author folder without having filename conflicts is time consuming and not worth it. These formats are insurance, just there if needed in the future on a case by case basis, per author or title search.
  • Note, To Solve Raw Books Author Organization Problem. Optionally, to solve that problem, just save out all the books that were just processed for that author, into a folder at same level as Raw Books named something like "Calibred Books By Author." Periodic saves of each newly processed author to that folder will gradually build an alternate arrangement organized by author containing only processed formats with good metadata. I haven't started doing this to date because I'm not sure I need yet another kind of backup, since the books in the calibre library are also arranged similarly.
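
The "Standardize ISBNs" step above uses a web site, but the ISBN10 to ISBN13 conversion itself is simple arithmetic: prefix 978 to the first nine digits, then recompute the check digit with alternating weights of 1 and 3. A minimal sketch of that standard calculation follows; it is not the code behind the "ISBN Convert" site, and the function name is made up.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 (hyphens or spaces allowed) to its 978-prefixed ISBN-13."""
    digits = isbn10.replace("-", "").replace(" ", "")
    if len(digits) != 10 or not digits[:9].isdigit():
        raise ValueError(f"not an ISBN-10: {isbn10!r}")
    # Keep the first nine digits; the old check digit (possibly 'X') is dropped.
    core = "978" + digits[:9]
    # Weights alternate 1, 3, 1, 3, ... across the twelve digits.
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


if __name__ == "__main__":
    print(isbn10_to_isbn13("0-306-40615-2"))  # 9780306406157
```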
For Relevant Selections:
  • Generate Catalog. Select all books in Library, create catalog with only "Books by Authors" checked in the Catalog dialog box and wishlist items based on the setting in Catalog E-Book options Tab, Read Books, FQR (Format Quality Rating) column, "_q0" column value. That will put a checkmark next to wishlist items. When created, I tag the catalog with a "To Read" tag, to catch it in a search for the next time I load the device. Using a Catalog on a device is an easy way to compare two different Libraries simultaneously (one in calibre, the other on device in Catalog), or to look at two different parts of one Library simultaneously.
  • Convert Before Device Loading. Select for conversion any books tagged "To Read" that I want to load to device in next step. I convert to whatever format is necessary for the desired device, such as MOBI for the Kindle. Preferred conversion input format is the EPUB I converted to originally after Adding, or the cleaned up EPUB that replaced it. In the case of problem PDFs or any other problem or complex formats, I'll try reading it in native format without conversion on Kindle or iPad, and where that doesn't work well try determining how to convert it or fix it in a way that does work well for one of my devices.
  • Load Device. Load a reading device with any books tagged "To Read" that I want to read now, plus the most recent catalog and the most recent newsfeeds. I load a device only a few books at a time, usually less than ten counting catalog and newsfeeds. If I travelled more frequently, I'd load more books. Other considerations: different reading devices have different characteristics and capabilities (such as weight, color display or not) that make them suitable for different types of reading, so generally I use Kindle for fiction and iPad for technical, graphics, and problem-PDF reading.
  • Read Some Books. Read some books on a reading device. Not necessarily the same author recently processed. This is the reward for all the hard work.
  • Rate Each Book Read. After reading a book, rate that book's content quality with a rating tag in Status column (or alternatively if using a yes/no column for read, a checkmark in Read column and stars in Rating column) and correct any other metadata as necessary with the new knowledge about the book from reading it.
  • Delete Book from Device. After rating a book and updating the metadata, delete it off the device using calibre directly while device is connected.
  • Delete Conversion Format for Device. If a format was created by conversion just for reading on that device, such as MOBI for Kindle, delete it from the book record.
  • Note, Collections. I don't use collections on devices or in columns. Find it easier to refer to tags in calibre while avoiding doing any extra work for what I feel to be a redundant tag effort for collections. If I travelled more without a laptop handy I'd reconsider that and probably use collections and the Kindle Collection plugin.
For Next Iteration:
  • Go Slow. I don't want Pending to hold thousands of books. Fewer than 200 max seems good. I try not to Download more than I can process out of Pending into calibre at a slow, relaxed, and comfortable pace. Go easy on the fuel pedal. The process is fueled by downloading more books. Learning and experimenting are fuel additives that enable smoother engine performance, better mileage, and less wear and tear on the vehicle and its owner/operator. With periodic re-supply of fuel and additives the process never ends.
  • Restart. Go to first step of first section of workflow map, and start the cycle again for the next iteration.

________________


Link

Links Key:
  • Some links are for information, some are for software tools to use in conjunction with calibre. Tools recommended by experienced people at MobileRead have "recommended" in parentheses. Any I haven't tried enough myself are indicated with "noted" in parentheses.
  • Internal links for MobileRead (MR) Thread Posts are for information, or in rare cases for a script and labeled "script".
  • External information links are for wikis and other information sources.
  • External links for tools are for software.

KISS Principle:
Calibre:
File Renaming:
Metadata:
Workflow:
Devices:
eBooks, MobileRead, Reader Software, Stores:
Formats, Conversions:
EPUB:
HTML:
HTML (Alternate Browsers & HTML Readers):
Graphics:
MOBI (or related Kindle formats):
PDF:
TXT, RTF, DOC:


KISS

Goals. KISS my use of eBooks, calibre, and other software tools. Determine strategies and methods for gathering, managing, and cleaning up eBooks. Gradually learn relevant "best practices." Learn to use calibre and supportive software tools better. Manage eBooks better. All to facilitate the ultimate purpose, reading eBooks.

Definition of KISS. I use the "verb" form of the principle "Keep It Simple Stupid" as meaning "to simplify a complex project or series of tasks in order to improve results." The word "Stupid" in the principle is not used or intended in a pejorative manner. When I say "to KISS" I mean "to simplify and improve." Wikipedia explanation of KISS Principle.

Reasons to KISS. When I started out new to eBooks and calibre in January 2011, I frequently felt overwhelmed. Paper books are fundamentally different than eBooks, generating a need to determine different strategies and methods for managing and using eBooks, which I hadn't done yet. The calibre eBook library management application allows new users to use it in simple ways while also accommodating more advanced users with many features and complexities. I didn't know much about eBooks in general, let myself get tangled in complexities and sidetracks, felt overwhelmed and frustrated at first. Gradually the more I learned about eBooks in general and the more I consciously simplified my use of calibre, the more success I experienced managing eBooks.

History. At seven months into using calibre, I wanted some discussion on strategies, methods, and work habits, so I laid out what I was doing as the original post of the thread "KISS for New calibre Users". After subsequent discussion, I recast the KISS posts from "giving advice to new users" to "documenting what I'm doing, as one slightly experienced user." I also restarted from a zero baseline for both eBooks and calibre, a situation somewhat similar to that of a new user starting out with eBooks and calibre, with the addition of some perspective based on experience.

Baseline Configuration.
  • 2011-08-14. Deleted calibre application, library files, configuration directory, all associated system files on all computers (2 Macs). Deleted nearly all eBooks off all storage devices including backups. Installed latest binary version of calibre with its one library containing one eBook as a clean new calibre installation on primary computer. Configured Preferences with basic user information.
  • Only a few miscellaneous eBooks were scattered around: calibre Quick-Start Guide, Kindle User Guide, and a few PDF formats of user guides or reference material.

Plan.
  • Add books very slowly.
  • Learn, and integrate into my own use: CSS, HTML, Sigil and other relevant editors or tools, other calibre features.
  • Continue revisions of KISS/Workflow posts corresponding to my own experience, trying to integrate suggestions made by others into my use of eBooks, calibre, and relevant tools and then into the next revised KISS/Workflow post. Incorporate changes after testing.

Request Comment. Request comment after each revision is posted. The posts are offered not as advice but as examples of what one relatively new user is doing, struggling with, or trying to do. I hope this may be useful to new users, increasing in usefulness over time as it is refined in successive iterations. Feedback, input, and discussion will be helpful in correcting or improving any assumptions, strategies, methods, practices, and workflows.

Thank You. Thanks to everyone who posted on MobileRead, where I learned most of the content contained in this post. Particular thanks to those of you who posted in the KISS thread.


Version History.


Workflow Map for Managing eBooks with calibre:

KISS for New calibre Users:

Last edited by unboggling; 09-24-2011 at 11:40 AM. Reason: Link to newer version.
Old 09-16-2011, 04:41 PM   #244
unboggling
by the bootstraps
Posts: 1,055
Karma: 858115
Join Date: Jan 2011
Location: Southeast US
Device: PRS-T2, Nexus 7, KindleT, iPad1, Kindle3KB
Request comment, discussion, criticism, ideas, other tools or info sources you recommend I include in links, notice of any glitches in the workflow that I missed.

Old 09-18-2011, 07:51 AM   #245
travger
Evangelist
Posts: 469
Karma: 270594
Join Date: Aug 2010
Device: palm tx, Windows XP, Windows7
I agree on your reasons for separate libraries but would like to comment on the single one:

* Copy and Paste. Copying and pasting metadata across Libraries involves using Library/Quick Switch and search to find the item to copy, copy it, Quick Switch again, and then search to find desired paste location. So it means more time spent or more typing of metadata such as series information when adding new books.

If both libraries have exactly the same columns, all metadata will be copied. To find differences, you can just type something into all fields, copy (without deleting!) to other library and look closely at the fields that are empty. There must be some difference in column properties.
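An alternative to the type-and-copy test is to compare the custom column definitions of the two libraries directly. The sketch below is only an illustration: it assumes each library's metadata.db contains a custom_columns table with label, datatype, and is_multiple fields, and the function names are made up for this example. It is safest to run it against copies of the databases, with calibre closed.

```python
import sqlite3


def column_properties(conn):
    """Map each custom column label to (datatype, is_multiple).

    Assumes a calibre-style custom_columns table; this is an assumption
    about the database layout, not a documented interface.
    """
    rows = conn.execute(
        "SELECT label, datatype, is_multiple FROM custom_columns"
    )
    return {label: (datatype, bool(multi)) for label, datatype, multi in rows}


def column_differences(conn_a, conn_b):
    """Return columns that are missing from one library or defined differently."""
    a = column_properties(conn_a)
    b = column_properties(conn_b)
    return {
        label: (a.get(label), b.get(label))
        for label in sorted(set(a) | set(b))
        if a.get(label) != b.get(label)
    }
```

Typical use would be something like column_differences(sqlite3.connect("LibraryA/metadata.db"), sqlite3.connect("LibraryB/metadata.db")), where both paths are placeholders. An empty result suggests the column structures match on these properties.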

* Library Restructuring - Yes, could be painful if you have several libraries. One reason for taking it slowly for several months to find out exactly what and how you need.

* Edit Metadata in Bulk. This only works across whatever books are selected in one library. For example, to delete an author's middle name, or to change a large multi-author multi-level series name. It's easier when all books by that author or in that complex series are all in the same library.

If I already have an author in the main library, it's under the correct (preferred) name. Any changes I make in the 'new books' library will result in the same name. Series: some books in my 'new books' library just have a note like 'yes, something' in the series field. I'll deal with them in time, but for now it means "do not read before further research". After all, I have enough reading material, so I'd rather avoid unfinished series.

I don't have much experience with searches, or with the content server.
When I'm not home, my computer (and Calibre) is off anyway. (I'm thinking about migrating library to Dropbox)
Old 09-18-2011, 08:17 AM   #246
unboggling
by the bootstraps
Posts: 1,055
Karma: 858115
Join Date: Jan 2011
Location: Southeast US
Device: PRS-T2, Nexus 7, KindleT, iPad1, Kindle3KB
travger, thanks for the feedback.

Regarding using Copy to Library, I try to avoid using it because there are so many "ifs". If the column structure is the same. If you don't care that the metadata internal to the format doesn't get updated. If you remember which ones you've Copied To Library already without deleting, and which you didn't yet. All I want to do is copy/paste the series name (especially for complicated multi-sub-series series) or correct authors (especially if there are lots of authors). Copy To Library is extra iffy steps for me. I also like to complete all the metadata shortly after Adding, rather than waiting on some of it.

Whatever ways work for someone, that they're comfortable with, are good.



Edit: btw, the workflow is still rough, has some awkward parts, too much detail in some parts, not enough in others, needs refining. I've already got 2 pages of notes for next revision.

Last edited by unboggling; 09-18-2011 at 08:23 AM. Reason: Added comment re workflow needs work
Old 09-18-2011, 09:13 AM   #247
travger
Evangelist
Posts: 469
Karma: 270594
Join Date: Aug 2010
Device: palm tx, Windows XP, Windows7
About format quality - I got so many different mistakes that it is easier to add a txt file to the book record.

For example (all concerning mobi or prc)
- some places/chapters are indented
- indent too big (for my small screen)
- headings not centered
- * * * not centered
- missing empty line (such as between chapters)
- paragraphs missing
- paragraph breaks in places where they should not be (middle of a sentence)
- no italics
- no space between some words
- too many ads and other stuff before the actual book starts
- typos

Just some that came easily to mind. Most of them I will not know about before reading the converted book. And if I don't note them while reading, I'll probably forget something.

Also I can use txt file for notes about what steps I took and how successful I was. So Quality=4 says to me 'some minor things I couldn't fix', clicking on txt gives more detailed info. If I decide to cut things like "CO<sub><span style="font-size:xx-small">2</span></sub>" in favor of "CO<sub>2</sub>", I'll make note of it - just in case I want to change it back in the far future.
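That kind of cleanup can be scripted so the change count is available for the notes file. A minimal sketch, assuming the markup always looks exactly like the example above (the function name is hypothetical, and real files may differ in attribute order or spacing):

```python
import re

# Matches <sub><span style="font-size:xx-small">TEXT</span></sub>, where TEXT
# contains no further tags. Only this exact style attribute is handled.
SUB_SPAN = re.compile(
    r'<sub>\s*<span style="font-size:xx-small">([^<]*)</span>\s*</sub>',
    re.IGNORECASE,
)


def simplify_subscripts(html):
    """Return (cleaned_html, number_of_replacements)."""
    return SUB_SPAN.subn(r"<sub>\1</sub>", html)
```

The returned count can go straight into the txt note, so the change is easy to find again if it ever needs to be reversed.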
Old 09-18-2011, 09:40 AM   #248
travger
Evangelist
Posts: 469
Karma: 270594
Join Date: Aug 2010
Device: palm tx, Windows XP, Windows7
"If you don't care that the metadata internal to the format doesn't get updated. "

For me - if I have internal metadata once, I have it forever. What's there to update?
I am still not very sure what I want to see there and how to get it. Once, when I tried to add a first page to an epub, I got too much info; I ended up exploding the epub and modifying the first page. In mobi the result is not so nice anyway; it's much better to add things directly to the html.
I don't want much - cover, author (BY-DY), first pubdate, awards won.

When I'm massaging the html anyway, it's not hard to slip those changes in.
Old 09-18-2011, 09:49 AM   #249
unboggling
by the bootstraps
Posts: 1,055
Karma: 858115
Join Date: Jan 2011
Location: Southeast US
Device: PRS-T2, Nexus 7, KindleT, iPad1, Kindle3KB
Quote:
Originally Posted by travger
About format quality - I got so many different mistakes that it is easier to add a txt file to the book record.
I do most of the problem listing with abbreviated format-problem tags in the same new column as the format quality rating, rather than using an associated txt format. I assess and fix a format before I read it. If I fix right away, I tag the problems I couldn't fix; if I delay fixing, I tag the ones I noticed during assessment. Later, after I've read the book and am adding my rating to the metadata in the booklist, I add tags for any other format problems I came across, unless I got disgusted while reading and paused to fix them on the fly.

So I don't keep track of the things I've fixed, just the ones I've noticed that annoy me that I haven't fixed yet. I do tags for paragraph jams (no paras within the jam, often of dialog by different people), and missing text, missing quotes, missing spaces (word jams), bad margins or indents, lack of bold/italic, and a bunch of other things that have no fixes or I haven't learned how to fix yet. If there are too many I trash the format as not worth the trouble, during assessment or trying to read it. If a particular download source has enough of those for me to notice, I'll avoid using that source again.

Last edited by unboggling; 09-18-2011 at 09:52 AM.
Old 09-18-2011, 10:03 AM   #250
unboggling
by the bootstraps
Posts: 1,055
Karma: 858115
Join Date: Jan 2011
Location: Southeast US
Device: PRS-T2, Nexus 7, KindleT, iPad1, Kindle3KB
Quote:
Originally Posted by travger
"If you don't care that the metadata internal to the format doesn't get updated. "

For me - if I have internal metadata once, I have it forever. What's there to update?
I'm under the impression that the various format wrappers carry their own internal metadata fields, such as title, author or document creator, etc. TXT has none; converting to PDF in the past set only title and author (that may have changed in an update by now); EPUB has more fields. Copy To Library doesn't update those internal fields. Convert, Save, and Send all do update them, but only in the new copy they produce, not in the original format. <-- as I understand it, and I may have one or two bad assumptions in that. This is different from converting with the option to add metadata to a new page inserted in the format.

Edit: I distinguish between four places where metadata lives: (1) the OPFs, which get auto-updated shortly after anything is changed in the booklist or Edit Metadata and are later used by the Restore command; (2) the metadata.db, which is what the unrestricted booklist and the Edit Metadata form reflect; (3) the internal format fields; and (4) the metadata page that can be inserted during conversion.

Edit2: All of those metadata places get updated in different processes at different timing.
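To see what the internal format fields actually hold, as opposed to what metadata.db says, one can look inside the file itself. A rough sketch for EPUB only, using just the Python standard library; the function name is made up, and it assumes a well-formed EPUB whose META-INF/container.xml points at the OPF package file:

```python
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_epub_internal_metadata(epub):
    """Return the titles and creators stored in the EPUB's own OPF file.

    `epub` may be a path or a binary file-like object.
    """
    with zipfile.ZipFile(epub) as zf:
        # container.xml names the location of the OPF package file.
        container = ET.fromstring(zf.read("META-INF/container.xml"))
        opf_path = container.find(".//c:rootfile", NS).attrib["full-path"]
        opf = ET.fromstring(zf.read(opf_path))
    return {
        "title": [e.text for e in opf.findall(".//dc:title", NS)],
        "creator": [e.text for e in opf.findall(".//dc:creator", NS)],
    }
```

This reads the OPF inside the EPUB, not the metadata.opf that calibre keeps next to the book files, so comparing its output with the booklist shows whether the internal fields have fallen behind.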

Last edited by unboggling; 09-18-2011 at 10:23 AM. Reason: Add Edit: paragraphs
Old 09-18-2011, 10:25 AM   #251
kiwidude
calibre/Sigil Developer
Posts: 4,230
Karma: 1345754
Join Date: Oct 2010
Location: London, UK
Device: Kindle Paperwhite 3G, iPad 3, iPad Air
@unboggling - you seem to have a "bit of an obsession" about the format internal metadata. Personally I don't concern myself with it. As you have mentioned above, there are way too many restrictions over which formats support it and to what extent. So rather than relying on it, use Calibre the way it is meant to be used - storing its metadata internally in the database, backed up to the opf files. Hence imho copy to library is the "best" way to transfer books between libraries.

The only time the internal metadata should matter is when you make use of something that relies on it - such as some devices. However provided you "properly" transfer to your device either using Save to Disk or Send to Device, then your metadata will get updated for it at that point. And more importantly you can override the metadata to do things like prefixing the metadata title with series etc in the plugboards.

Personally I don't think recommending people use Save to Disk instead of Copy to Library is the right thing. I understand it may be a preference you have but I don't see a valid reason for it when transferring between your own local libraries and there are too many downsides depending on the format.
Old 09-18-2011, 10:33 AM   #252
kacir
Wizard
Posts: 2,858
Karma: 3164175
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
Quote:
Originally Posted by unboggling
Request comment, discussion, criticism, ideas,
I think that the Regular Expression you use as an example for adding books is not optimal, because the series info isn't optional, so it wouldn't work for the majority of books, which don't have the format author - series # - title. For example, author - title wouldn't be processed.
Old 09-18-2011, 10:44 AM   #253
DoctorOhh
US Navy, Retired
Posts: 8,908
Karma: 12755553
Join Date: Feb 2009
Location: North Carolina
Device: Nexus 7
Quote:
Originally Posted by kacir
I think that the Regular Expression you use as an example for adding books is not optimal, because the series info isn't optional, so it wouldn't work for the majority of books, which don't have the format author - series # - title. For example, author - title wouldn't be processed.
I know nothing about regex, but I tested his regex with this book:

Alan Jacobson - The 7th Victim.epub

and it filled in the proper title and author. This test says you may be incorrect in what you said above.

Personally I use the following pilfered regex (in various formats):
Code:
^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?((?P<series>[^0-9\-]+)(\s*-\s*)?(?P<series_index>[0-9.]+)\s*-\s*)?(?P<title>[^\-_0-9]+)
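For anyone who wants to try this pattern outside calibre before trusting it with a batch of files, here is a small sketch in plain Python. The pattern is copied verbatim from above; the function name and the sample filenames are invented for illustration, and calibre's own handling of the captured groups may differ from this.

```python
import re

# Pattern copied verbatim from the post, split across lines for readability.
PATTERN = re.compile(
    r"^((?P<author>([^\-_0-9]+)(?=\s*-\s*)(?!\s*-\s*[0-9.]+)|\b))(\s*-\s*)?"
    r"((?P<series>[^0-9\-]+)(\s*-\s*)?(?P<series_index>[0-9.]+)\s*-\s*)?"
    r"(?P<title>[^\-_0-9]+)"
)


def parse_name(stem):
    """Return the named groups that matched, with surrounding spaces trimmed."""
    m = PATTERN.match(stem)
    if m is None:
        return None
    return {key: value.strip() for key, value in m.groupdict().items() if value}


# Hypothetical filenames (extension already removed).
for stem in ("Jane Doe - First Light", "Jane Doe - Star Saga 2 - First Light"):
    print(stem, "->", parse_name(stem))
```

Note that the title class in this pattern excludes digits, hyphens, and underscores, so it is worth checking how titles containing those characters come out before relying on it.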

Last edited by DoctorOhh; 09-18-2011 at 10:47 AM.
Old 09-18-2011, 10:55 AM   #254
unboggling
by the bootstraps
Posts: 1,055
Karma: 858115
Join Date: Jan 2011
Location: Southeast US
Device: PRS-T2, Nexus 7, KindleT, iPad1, Kindle3KB
Quote:
Originally Posted by kiwidude
@unboggling - you seem to have a "bit of an obsession" about the format internal metadata....

Personally I don't think recommending people use Save to Disk instead of Copy to Library is the right thing. I understand it may be a preference you have but I don't see a valid reason for it when transferring between your own local libraries and there are too many downsides depending on the format.
OK. I hear you. I'll make it less of a "recommendation" in the next KISS/workflow posts. You're right, most people don't want to worry or think about all that, so I'll tone it down. But I do think about it. I did this as just good hygiene when going from one library to another, to set the metadata in those internal fields. If those books are Copied to Library rather than Saved and re-Added, that metadata doesn't get set when going from the fix-it library to the storage library, so in a sense they are in an incomplete state. And they will stay that way forever, because once in the storage library, any convert, save, or send will only update those internal fields on a copy, not on the original. I understand that for the way most people use calibre, that doesn't matter. But it bothers me.
Old 09-18-2011, 11:05 AM   #255
kacir
Wizard
Posts: 2,858
Karma: 3164175
Join Date: May 2006
Device: PocketBook 360, before it was Sony Reader, cassiopeia A-20
Quote:
Originally Posted by dwanthny
I know nothing about regex, but I tested his regex with this book:

Alan Jacobson - The 7th Victim.epub

and it filled in the proper title and author. This test says you may be incorrect in what you said above.
Yes it does. Sorry. Should have tested it.
It is an interesting RE, because my REs
Code:
^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?(?P<title>[^(]+)(?:\(.*\))?
OR
(?P<author>[^-]+)(( - | *-- *)[[(]?(?P<series>[^-]+)[[( ]+(?P<series_index>[0-9.]+)?[])]?)?( - | *-- *)(?P<title>.+)
make the series optional much more explicitly.
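The effect of wrapping the whole series part in an optional group can be checked with a short script. This is only a sketch in plain Python using the first of the two patterns above, copied verbatim; the helper names and the two "Jane Doe" filenames are made up, and the third filename is the one tested earlier in the thread.

```python
import re
from pathlib import Path

# First pattern from the post. The series, its index, and the trailing
# separator all sit inside one optional non-capturing group (?: ... )?
SERIES_OPTIONAL_RE = re.compile(
    r"^(?P<author>((?!\s-\s).)+)\s-\s"
    r"(?:(?:\[\s*)?(?P<series>.+)\s(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?"
    r"(?P<title>[^(]+)(?:\(.*\))?"
)


def _clean(value):
    """Trim a captured group; groups that did not take part stay None."""
    return value.strip() if value else None


def parse(filename):
    """Return (author, series, series_index, title) for a filename."""
    m = SERIES_OPTIONAL_RE.match(Path(filename).stem)
    if m is None:
        return None
    return tuple(
        _clean(m.group(g)) for g in ("author", "series", "series_index", "title")
    )


for name in (
    "Alan Jacobson - The 7th Victim.epub",
    "Jane Doe - Star Saga 2 - First Light.epub",
    "Jane Doe - [Star Saga 2] - First Light.epub",
):
    print(name, "->", parse(name))
```

When the optional group does not match, the series and series_index groups simply stay empty and the title takes everything after the first " - ".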
All times are GMT -4.