MobileRead Forums - View Single Post - Using templates for file names, plugboards, composite columns, searches, tooltips

chaley · 10-13-2013, 03:32 AM

@Ayack:

Try

Code:

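def evaluate(self, formatter, kwargs, mi, locals, val, is_read_pct,
             is_reading_str, no_page_read_str):
    # is_read_pct is the threshold above which the book counts as read.
    try:
        test_val = int(is_read_pct)
    except (TypeError, ValueError):
        return 'is_read_pct is not a number'

    import re
    # Group 1 is the date, group 2 the percentage read. Backslashes are
    # doubled so the u'' literal is valid in both Python 2 and Python 3.
    # (?<!\d) stops the greedy .* from swallowing the date's leading digits.
    pattern = (u'.*(?<!\\d)(\\d+[-/]\\d+[-/]\\d+).*?'
               u'Dernière page lue : Emplacement \\d+ \\((\\d+)%\\)')
    mg = re.match(pattern, val, re.U | re.I | re.DOTALL)
    if mg is None:
        return no_page_read_str
    date = mg.group(1)
    pct = mg.group(2)
    try:
        f = int(pct)
    except ValueError:
        return no_page_read_str
    if f > test_val:
        # Read past the threshold: show the date.
        return date
    if f > 0:
        # Started but not finished: show the progress.
        return is_reading_str + ': ' + pct + '%'
    return no_page_read_str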

The differences:
1) Your sample string from the comment has no space before the date. The original function required that space (the '\s' before the first '(' in the pattern), so it did not match your sample.
2) You must tell Python that the pattern is a unicode string (the u prefix on the pattern), because it contains non-ASCII characters.
3) You should pass re.U to re.match so that the regular expression applies Unicode rules when matching.
4) The character after 'lue' is very strange. It is not an ordinary space. I had to change it to a real space character. You might be able to change it back if you put that same character into the pattern.
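
For point 4, one way to avoid depending on the exact character is to match any whitespace around the colon. This is only a sketch for Python 3. It assumes the odd character is a no-break space (U+00A0) or a narrow no-break space (U+202F), which the post does not confirm. The names PATTERN and extract and the two sample strings are made up for illustration.

```python
import re

# Assumption: the character after 'lue' is some kind of Unicode space.
# \s matches U+00A0 and U+202F as well as an ordinary space in Python 3.
# (?<!\d) keeps the greedy .* from eating the first digits of the date.
PATTERN = re.compile(
    r'.*(?<!\d)(\d+[-/]\d+[-/]\d+).*?'
    r'Dernière page lue\s*:\s*Emplacement \d+ \((\d+)%\)',
    re.U | re.I | re.DOTALL,
)


def extract(comment):
    """Return (date, percent) from a comment, or None if nothing matches."""
    mg = PATTERN.match(comment)
    if mg is None:
        return None
    return mg.group(1), int(mg.group(2))


# Hypothetical comments: one with a plain space, one with U+00A0.
plain = '13/10/2013 Dernière page lue : Emplacement 120 (45%)'
nbsp = '13/10/2013 Dernière page lue\u00a0: Emplacement 120 (45%)'

print(extract(plain))
print(extract(nbsp))
```

Both sample strings should give the same result. If the character turns out to be something that is not whitespace at all, it would have to be added to the pattern explicitly.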

I cannot guarantee that this function will work with your data because the comments are stored as HTML. What you see on screen might not be what is actually in the comment.
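
If the HTML does get in the way, one option is to reduce the comment to plain text before running the regular expression. The following is a rough sketch using only the Python 3 standard library. It is not part of calibre, and the names _TextExtractor, html_to_text and the sample markup are invented for the example. What the real comment markup looks like is an assumption.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collect the text between tags and ignore the tags themselves."""

    def __init__(self):
        # convert_charrefs defaults to True, so entities such as &nbsp;
        # arrive in handle_data already decoded (here as U+00A0).
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def html_to_text(markup):
    """Return the text content of an HTML fragment, tags removed."""
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()
    return ''.join(parser.parts)


# Hypothetical comment markup; the real comment may be laid out differently.
sample = ('<div><p>13/10/2013</p>'
          '<p>Dernière page lue&nbsp;: Emplacement 120 (45%)</p></div>')

print(repr(html_to_text(sample)))
```

Printing the repr of the result is also a quick way to see which characters are really in the comment, including any unusual space after 'lue'.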

10-13-2013, 03:32 AM	#331