Old 01-17-2012, 02:05 PM   #1
Ian_Stott
[GUI Plugin] Find Similar Stories

This plug-in helps you to find other books within your Calibre library that are similar to your target book. It does this by examining the full text of the books in your library, as opposed to using tags or other metadata.

Main Features:
  • Indexes all selected books and compares their word "fingerprint" with the target book.
  • Adds a similarity score to the user-identified "Custom Column".
  • Full HTML help available through the "Help about plugin" option.
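
As a rough illustration of what a word "fingerprint" could look like (a minimal sketch only, not the plug-in's actual code; the helper name `word_fingerprint` and the tokenising rule are assumptions), a book's text can be reduced to a table of word counts:

```python
import re
from collections import Counter

def word_fingerprint(text):
    """Hypothetical helper: map each word in `text` to how often it occurs."""
    # Assumption: lower-case everything and treat runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

fp = word_fingerprint("The cat sat on the mat. The end.")
print(fp["the"])  # number of times "the" occurs in the sample text
```

Two books can then be compared by comparing their count tables rather than their tags or other metadata.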

Version History:
  • v1.0.0
    • Initial Release
  • v1.0.5
    • Updated to add two new similarity measures:
      • Cosine: calculates the cosine of the angle between the word vectors for a pair of books.
      • Tanimoto (binary): uses the Tanimoto similarity metric on a binary fingerprint representation of a book, i.e. noting just the presence or absence of a word within a book, as opposed to its frequency of occurrence.
  • v1.0.53
  • v1.0.57
    • Updated to fix a bug that caused an incompatibility with the find_duplicates plugin.
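
The two measures added in v1.0.5 can be sketched roughly as follows. This is an illustration under assumptions, not the plug-in's own code: each book is assumed to be a `collections.Counter` of word counts, and the function names are made up for the example.

```python
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (hypothetical helper)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty book is similar to nothing
    return dot / (norm_a * norm_b)

def tanimoto_binary(a, b):
    """Tanimoto on presence/absence only: shared words / all distinct words."""
    words_a, words_b = set(a), set(b)
    union = words_a | words_b
    if not union:
        return 0.0
    return len(words_a & words_b) / len(union)

book_a = Counter("a b b c".split())
book_b = Counter("b c c d".split())
print(cosine_similarity(book_a, book_b))
print(tanimoto_binary(book_a, book_b))  # 2 shared words out of 4 distinct -> 0.5
```

Note that the binary version ignores how often a word appears, so two books sharing the same vocabulary score 1 even if their word frequencies differ.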


Special Notes:
  • Requires Calibre v0.8.17 or later.
  • Currently only uses books in MOBI or EPUB formats. You can select which format is the preferred choice.
  • There is full documentation within the plug-in of each of the methods used to assess the similarity of two books. The methods are either described in detail or references are provided, should you wish to examine them further.

    To view the documentation, select the "Help about plugin" menu option once the plug-in is installed.
  • Methods currently implemented are:
    • Tanimoto
    • Euclid
    • Cosine
    • Tanimoto (binary)
    • PMRA (PubMed related articles)
    See the Help for details on each method.

    Let me know if you would like a particular method implemented (along with the suitable source material).
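
For readers unfamiliar with the count-based measures in the list above, here is a hedged sketch of Tanimoto and a Euclid-style score on word-count vectors. The formulas are the common textbook forms and may differ from what the plug-in actually does (its Help is the authority); in particular, turning a Euclidean distance into a score with 1 / (1 + distance) is an assumption made here so that identical books score 1. PMRA is not shown.

```python
import math
from collections import Counter

def tanimoto(a, b):
    """Count-based Tanimoto: dot / (|a|^2 + |b|^2 - dot) (hypothetical helper)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    denom = sum(c * c for c in a.values()) + sum(c * c for c in b.values()) - dot
    return dot / denom if denom else 0.0

def euclid_similarity(a, b):
    """Euclidean distance mapped to (0, 1]; the mapping is an assumption."""
    # Counter returns 0 for words missing from one of the books.
    dist = math.sqrt(sum((a[w] - b[w]) ** 2 for w in a.keys() | b.keys()))
    return 1.0 / (1.0 + dist)

book_a = Counter("a b b c".split())
book_b = Counter("b c c d".split())
print(tanimoto(book_a, book_b))
print(euclid_similarity(book_a, book_b))
```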


Installation & Usage:
  • Download the attached zip file, install the plugin, add it to the context menu or toolbar, and restart Calibre, as described in the Introduction to plugins thread.
  • The first time you use the plug-in, you need to identify a "Custom Column" that will hold the results.
    • If you have not already created a suitable Custom Column, do this in the usual way by using the "Add your own columns" dialogue box (found by right-clicking on a column heading in the main view).
    • Select a column type of "floating point numbers" and leave the "Format for numbers" section empty.
    • The first time you run the plug-in, or when you configure it, you will be able to select this column to hold your results.
  • Select the target book. All other selected books will be compared to this one.
  • Add to your selection all of the other books that you wish to compare to your target book. If you are having trouble doing this while ensuring that the target book is the first selected, you can edit the Similarity score of your target book (in your selected custom column) so that it is set to 1, then sort by the Similarity score.
  • Run the plug-in.
  • Once it has run, it will ask if you want the results loaded up into your selected custom column. You will have to respond with "Yes" to see the results.
  • The results will appear in your selected Custom Column. The higher the score, the better the match that book is to the target. A similarity score of 1 means that the book is a perfect match (and probably the same book). This means that the target book will have a score of 1.
  • I find it easiest if you sort the books by your Similarity score, with the highest (most similar) at the top.
  • Full help is available through the "Help about plugin" option from the plugin menu.
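
The workflow above (score every selected book against the target, then sort with the highest score first) can be pictured with a small stand-alone sketch. It does not use Calibre's API; the titles, the `rank_by_similarity` helper and the choice of cosine as the measure are all assumptions made for illustration.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two word-count Counters (hypothetical helper)."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(c * c for c in a.values())) * math.sqrt(sum(c * c for c in b.values()))
    return dot / norm if norm else 0.0

def rank_by_similarity(target, others):
    """Return (title, score) pairs, most similar to `target` first."""
    scores = {title: cosine(target, fp) for title, fp in others.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Made-up miniature "library" of word fingerprints.
target = Counter("the ship sailed the sea".split())
library = {
    "Sea Tale": Counter("the ship crossed the sea".split()),
    "Cookbook": Counter("chop the onion and fry".split()),
}
ranked = rank_by_similarity(target, library)
for title, score in ranked:
    print(f"{score:.3f}  {title}")
```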
Attached Files
File Type: zip similar_stories_plugin - v158.zip (55.7 KB, 207139 views)

Last edited by Ian_Stott; 03-08-2012 at 01:45 PM. Reason: V1.0.47, fixed bug leading to incompatibility with find_duplicates plug-in