Old 09-07-2014, 03:41 AM   #1
rebl
r.eads e.njoys b.ooks lol
Posts: 76
Karma: 580748
Join Date: Mar 2010
Location: It's time to get this Book a Rest
Device: Kindle 4 NT
using re() and shorten() for custom comments column?

Hello all,

I'm trying to build a custom comments column to display in the library list that would show text from the comments field, stripping any HTML tags and shortening the string to a predefined length.
I've been looking all over for more information, but what I found is not enough.
In particular, I don't understand a few things:
1. When building a custom column from other columns, according to the tooltip in the GUI, "Field template uses the same syntax as save templates",
but the first Google search for the calibre re() function takes me to the calibre documentation (template reference), where the re() function has three arguments: re(val, pattern, replacement)
In the end I found the correct page: http://manual.calibre-ebook.com/template_lang.html where only two arguments are needed.
But what is the difference between the two? Where is the 3-argument version applied if custom columns and save templates use the 2-argument one?

2. I suppose there is no (simple) way to combine two functions in a custom field? I've read somewhere here on the forum about the simple mode being the default and about some programming or advanced mode(s), but I haven't figured out how to use the other two.

If I use
Code:
{comments:shorten(50,...,0)}
I get a shortened string but also html tags.

But I would like to remove the HTML tags too, so I'd need a function like:
Code:
{comments:re((\<.+\>)?,):shorten(50,...,0)}
Please ignore for the moment how efficiently the re() function is written.
I know the above does not work, but how exactly could I apply two functions to the same input field (comments) for a custom column?

3. I have a strange problem with the re() function and I would really appreciate your help.
If I use
Code:
{comments:re((\<.+\>)?,)}
The various html tags (
Code:
<hr> <div> </div>
etc.) are removed, but in some cases (when a comma is present) the resulting string is reversed around the comma and I can't figure out why. It's like switching \2 \1 when swapping author surname/name.
Also, in some other cases the whole text is removed (like greedy behavior) and the output is empty.
This behavior is not 100% reproducible, which puzzles me: after a minor metadata edit in the comment field, the output became the expected one. But at other times, after a minor edit like adding a space in the HTML source of the comment, the whole output disappeared.
I had a comment similar to this (for a Thomas Mann book in case you're wondering):
Code:
<hr> 
A portrait of the German bourgeois society, throughout several decades of the 19th century.
When it was in Romanian (it was like
Code:
<hr> Some text 1, some text 2.
) the function's result was:
Code:
some text 2., some text 1
...and I get this kind of result for most if not all of the books.
But after replacing the comment with the English translation while keeping the tags intact (just to do some tests and to paste a more meaningful comment here), it suddenly started working, and now the result is the expected text without HTML.
My questions are why the editing solved the problem, since I have not altered the text pattern, and why re() does not work for the other books and instead reverses the text fragments around the comma.

Another example of the strange behavior of this custom column:
comments field:
Code:
<p class="description">Premiile British SF Association şi John W. Campbell Memorial, 2013.</p>
initial output:
Code:
2013., Premiile British SF Association şi John W. Campbell Memorial
after editing the comment (I've just inserted an additional space between W. and C) the result was just EMPTY, no text at all!!!

4. Regarding the (\<.+\>)? part, I wanted a "non-greedy" match: strings that start with < and end with >, matching the shortest possible string. I'm not sure it works though.

Thank you all for your time and for any help on this!

Last edited by rebl; 09-07-2014 at 04:35 AM. Reason: (\<.+\>)?
Old 09-07-2014, 04:23 AM   #2
eschwartz
Ex-Helpdesk Junkie
Posts: 19,421
Karma: 85400180
Join Date: Nov 2012
Location: The Beaten Path, USA, Roundworld, This Side of Infinity
Device: Kindle Touch fw5.3.7 (Wifi only)
General program mode is far more flexible and allows multiple functions to be combined in arbitrary ways. Unlike single-function mode, it does not assume you are operating on the base template (because there is none), so the field you wish to use must be stated explicitly as the first parameter.

Try:
Code:
program:

shorten(
    re(
        field('comments'),
        'some-regex',
        'replacement-text is empty'
    ),
    50,
    '...',
    0
)
Whitespace is optional. I use it to lay out the template and what I am doing more clearly.
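
For anyone who wants to check the order of operations outside calibre, here is a minimal Python sketch of the nested call above. The shorten() and strip_tags() helpers are stand-ins written for this example, assuming the documented behaviour of calibre's shorten() (left characters, then the middle text, then right characters, with the value left intact when it is already short enough). The tag pattern is just one possible choice for 'some-regex'. This is not calibre's own code.

```python
import re


def shorten(val, left, middle, right):
    """Rough stand-in for calibre's shorten(); an assumption, not the real code."""
    if len(val) < left + right + len(middle):
        return val  # already short enough, keep intact
    tail = val[-right:] if right > 0 else ""  # val[-0:] would be the whole string
    return val[:left] + middle + tail


def strip_tags(html):
    """Remove anything that looks like <...> with no brackets inside."""
    return re.sub(r"<[^<>]*>", "", html)


comment = (
    '<p class="description">A portrait of the German bourgeois society, '
    "throughout several decades of the 19th century.</p>"
)

# Strip first, then shorten: 50 visible characters plus the "..." marker.
strip_first = shorten(strip_tags(comment), 50, "...", 0)

# Shorten first, then strip: the opening tag eats part of the 50 characters.
shorten_first = strip_tags(shorten(comment, 50, "...", 0))

print(strip_first)
print(shorten_first)
```

Stripping inside and shortening outside, as in the template above, keeps the full 50 characters of visible text. Doing it the other way round spends part of the budget on markup that is removed afterwards.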

Last edited by eschwartz; 09-07-2014 at 04:31 AM.
Old 09-07-2014, 04:32 AM   #3
rebl
Thank you eschwartz, so in program mode the re() function is used with three arguments, while in single-function mode it is used with just two arguments?
But how do I enter program mode? I can't see any options in the add custom column dialog.
Old 09-07-2014, 04:32 AM   #4
chaley
Grand Sorcerer
Posts: 12,365
Karma: 8012652
Join Date: Jan 2010
Location: Notts, England
Device: Kobo Libra 2
@rebl: You cannot nest functions in single function mode, what you called "simple" mode. It is also difficult to distinguish between an argument containing a comma and the argument separator comma. Because of this, you *really* want to use template program mode (TPM) or general program mode (GPM). Both give you the possibility of nesting function calls. Both distinguish between commas in strings and commas used as argument separators. TPM is better for small expressions needed in larger templates. GPM is better when complexity is high, when you want to be able to more easily read the code, or when constant strings in the template contain { and } characters.

Your example in GPM:
Code:
program:
	re(shorten(field('comments'), 50, '...', 50), '<.*?>','')
Your example in TPM:
Code:
{comments:'re(shorten($, 50, '...', 50), '<.*?>','')'}
In both cases HTML stripping won't work if the tag contains a string containing a > character. That situation is rather hard to deal with in regular expressions.

EDIT: trumped (somewhat) by eschwartz. To answer your question about how you "get into program mode", you simply begin the template with the word "program:"

Last edited by chaley; 09-07-2014 at 04:35 AM.
Old 09-07-2014, 04:36 AM   #5
eschwartz
The dot plus will match all text including tags, thus matching everything from the opening of the first tag to the closing of the last tag.

Use:
Code:
<[^<>]*>
to restrict the match using a set which excludes tag opening and closing brackets.
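
calibre's re() uses Python regular expressions, so the difference can be checked in plain Python. The sketch below is only an illustration with made-up sample strings modelled on the comments in the first post. It compares the pattern as posted, where the trailing ? makes the whole group optional rather than making .+ lazy, with a lazy variant and with the negated set.

```python
import re

# Sample comments modelled on the ones in the first post (shortened).
one_tag = "<hr> A portrait of the German bourgeois society."
wrapped = '<p class="description">Premiile British SF Association, 2013.</p>'

patterns = {
    "optional_greedy": r"(\<.+\>)?",  # as posted: greedy .+ in an optional group
    "lazy": r"<.+?>",                 # the ? directly after + makes it lazy
    "negated_set": r"<[^<>]*>",       # no brackets allowed inside the tag
}

# For each pattern: (result on one_tag, result on wrapped)
stripped = {
    name: (re.sub(pat, "", one_tag), re.sub(pat, "", wrapped))
    for name, pat in patterns.items()
}

for name, (a, b) in stripped.items():
    print(f"{name:16} {a!r} | {b!r}")
```

With a single tag all three behave the same. Once the text is wrapped in an opening and a closing tag, the posted pattern swallows everything from the first < to the last > and the result is empty, which would account for the "whole output disappeared" cases. It does not reproduce the reversal around the comma.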

chaley, if the html contains a bracket in the text or class there are other problems bigger than a failed regex...

Last edited by eschwartz; 09-07-2014 at 04:41 AM.
Old 09-07-2014, 04:37 AM   #6
chaley
Quote:
Originally Posted by rebl View Post
Thank you eschwartz, so in program mode the re() function is used with three arguments, while in the single-function mode is used with just two arguments?
In single function mode the contents of the field referenced in the template are passed as a hidden first parameter. So in the template
Code:
{comments:shorten(50,...,50)}
shorten has
Code:
field('comments')
as its first parameter.
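
A toy model may make this easier to picture. The Python below is NOT how calibre parses templates. expand_single_function() is a hypothetical helper that naively splits the argument text on commas and puts the field value in front. It only shows why the 2-argument and 3-argument forms of re() are the same function, and why a comma inside an argument is a problem in single function mode.

```python
def expand_single_function(field_value, call):
    """Toy model only: turn 'name(a,b)' into (name, [field_value, a, b])."""
    name, _, rest = call.partition("(")
    raw_args = rest[:-1]  # drop the closing parenthesis
    return name, [field_value] + raw_args.split(",")


value = "<hr> Some text 1, some text 2."

# Three visible arguments become four: the field value is the hidden first one.
print(expand_single_function(value, "shorten(50,...,50)"))

# re() with pattern and empty replacement: two visible arguments, three in total.
print(expand_single_function(value, r"re((\<.+\>)?,)"))

# A comma inside the pattern is indistinguishable from a separator here.
print(expand_single_function(value, "re(a{1,3},x)"))
```

In the last call the naive split produces three visible arguments instead of two, which is the ambiguity that the program modes avoid by quoting strings.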
Old 09-07-2014, 04:40 AM   #7
chaley
Quote:
Originally Posted by eschwartz View Post
The dot plus will match all text including tags, thus matching everything from the opening of the first tag to the closing of the last tag.

Use:
Code:
<[^<>]*>
to restrict the match using a set which excludes tag opening and closing brackets.
The dot+? will match up to the first >, not the last one.

And neither pattern will work for something like
Code:
<a href="...", alt="demonstrate that a > b">foo</a>
Yes, it will work if you use entities, but not everyone does.
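
The same case in plain Python, as a quick check rather than a fix. It uses the tag above, once with the raw > and once with the &gt; entity, and both patterns discussed in this thread.

```python
import re

raw = '<a href="...", alt="demonstrate that a > b">foo</a>'
escaped = '<a href="...", alt="demonstrate that a &gt; b">foo</a>'

lazy = r"<.+?>"
negated = r"<[^<>]*>"

# Both patterns stop at the first >, which sits inside the attribute value.
raw_lazy = re.sub(lazy, "", raw)
raw_negated = re.sub(negated, "", raw)

# With the entity there is no stray > left to trip over.
escaped_lazy = re.sub(lazy, "", escaped)
escaped_negated = re.sub(negated, "", escaped)

print(repr(raw_lazy), repr(raw_negated))
print(repr(escaped_lazy), repr(escaped_negated))
```

With the raw > both patterns leave the tail of the attribute behind in the output. With the entity both return just the link text.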
Old 09-07-2014, 04:44 AM   #8
eschwartz
Quote:
Originally Posted by chaley View Post
The dot+? will match up to the first >, not the last one.

And neither pattern will work for something like
Code:
<a href="...", alt="demonstrate that a > b">foo</a>
Yes, it will work if you use entities, but not everyone does.
cross-posting wars

bigger problems... nothing I can do about that save to say, "use them. Learn what they are and use them."

Didn't notice the "?"
Old 09-07-2014, 04:58 AM   #9
rebl
chaley and eschwartz - Thank you guys, very much! You are awesome.
You've put me on the right track and from now on I think I can manage.
Of course, a little trial and error can't hurt.
I don't have that many non-empty comments fields and I doubt many of them contain "un-ampersanded" > or < characters... (&gt; is an HTML entity, right?)
I need the comments field because it was the only field I could think of to store the "version" of a book when reading metadata from the file name:
Code:
john milton - paradise lost from guttenberg v1.1.rtf
I'm using a modified regex from the "quick preferences" plugin:
Code:
^(?P<author>((?!\s-\s).)+)\s-\s(?:(?:\[\s*)?(?P<series>[^\[\]\(\)\d]+)(?:\s*\])?\s?(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?(?P<title>.+)\s?v\.?(?P<comments>[^\[\]\(\)]+)(?:\s*\]\))?$
to get the version number and put it in the comments field.
The regex above puts 1.1 in the comments field.
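
Here is a small Python check of that regex against the file name above. parse_filename() is a helper made up for this test, and it assumes the pattern is applied to the name without its extension. Whether calibre itself does exactly that is an assumption of this sketch.

```python
import os
import re

# The regex from above, split over three lines only for readability.
FILENAME_RE = re.compile(
    r"^(?P<author>((?!\s-\s).)+)\s-\s"
    r"(?:(?:\[\s*)?(?P<series>[^\[\]\(\)\d]+)(?:\s*\])?\s?(?P<series_index>[\d\.]+)(?:\s*\])?\s-\s)?"
    r"(?P<title>.+)\s?v\.?(?P<comments>[^\[\]\(\)]+)(?:\s*\]\))?$"
)


def parse_filename(filename):
    """Hypothetical helper: match the stem and return the named groups."""
    stem = os.path.splitext(filename)[0]  # assumption: extension is dropped first
    match = FILENAME_RE.match(stem)
    if match is None:
        return None
    # Greedy .+ can leave a trailing space on the title, so strip the values.
    return {k: (v.strip() if v else v) for k, v in match.groupdict().items()}


fields = parse_filename("john milton - paradise lost from guttenberg v1.1.rtf")
print(fields)
```

The series groups stay empty for this name because there is no second " - " separator, and the text after the last "v" ends up in comments.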
Then I just need to display part of the comments field in a custom column, in order to be able to sort and see which comments contain a version number.
But the fact that comments is an HTML field complicates things.
Since I am not allowed to use custom fields such as #vers, I needed to use a built-in one.
But I guess it would be simpler to use a different predefined field that I have no (initial) use for to temporarily store the "version number", for example publisher... it doesn't matter which, because I will have to use "search and replace" on the metadata anyway....

That is assuming I am definitely not allowed to use something like ?P<#version> when adding books...

Last edited by rebl; 09-07-2014 at 05:05 AM.