|  07-28-2014, 09:52 AM | #16 | |
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | Quote: 
  http://www.regular-expressions.info/lookaround.html They provide a thorough explanation, and break down the examples. For my example:
Code: <span class="none2">((?:(?!<span).)*?)</span>
Code: <span class="none2">inner text</span>
Code: ((?:(?!<span).)*?)
Code: (?:(?!<span).)*?
) is:
Code: (?:(?!<span).)
(plus a confusing "?" after the star; the star already makes the group optional, so the "?" only makes the repetition lazy, and I seem to have copied it from the original source.)
This group contains the negative lookahead (a zero-length assertion)
Code: (?!<span)
So, putting it all back together, the dot-match-all must be preceded by the negative lookahead, and this "any character other than part of a span tag" is then grouped and repeated zero or more times, then captured as "\1" to produce the "inner text" which should be saved from in between the span. | |
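To see the pattern above in action outside the editor, here is a minimal sketch in Python. The function name `strip_none2_spans` and the sample strings are made up for illustration, and `re.DOTALL` is an assumption standing in for the "dot-match-all" setting mentioned in the post.

```python
import re

# Pattern from the post: capture everything inside a class="none2" span,
# but refuse to step over the start of any other "<span".
# re.DOTALL is assumed here so that "." also matches newlines.
PATTERN = re.compile(r'<span class="none2">((?:(?!<span).)*?)</span>', re.DOTALL)


def strip_none2_spans(html: str) -> str:
    """Replace each matching span with its captured inner text (group 1)."""
    return PATTERN.sub(r"\1", html)


if __name__ == "__main__":
    flat = '<p><span class="none2">inner text</span> and <span class="none2">more</span></p>'
    print(strip_none2_spans(flat))

    # A none2 span that contains another span is left untouched, because the
    # lookahead stops the match before the nested "<span".
    nested = '<span class="none2">a <span class="x">b</span> c</span>'
    print(strip_none2_spans(nested))
```

The second case shows the limit of this approach: the lookahead keeps the regex from removing the wrong closing tag, but it does so by skipping nested spans entirely.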
|   |   | 
|  07-28-2014, 10:26 AM | #17 | |
| Resident Curmudgeon            Posts: 80,671 Karma: 150249619 Join Date: Nov 2006 Location: Roslindale, Massachusetts Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3 | 
			
			Is there a way to find a proper span when the code looks like  Quote: 
 | |
|   |   | 
|  07-28-2014, 12:18 PM | #18 | 
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | |
|   |   | 
|  07-28-2014, 03:24 PM | #19 | 
| Resident Curmudgeon            Posts: 80,671 Karma: 150249619 Join Date: Nov 2006 Location: Roslindale, Massachusetts Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3 | 
			
			But that's the problem. You don't know ahead of time. You don't know how many or which ones. The idea is for regex to get it right, which it doesn't. This is why we need an update to regex that gives us more control, so we can do things like this. Regex should be able to say "I want the next instance of </span> after the <span>", but we cannot do it. So regex should be updated to handle such things.
		 | 
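For what it's worth, a lazy quantifier already expresses "the next </span> after the <span>"; where it falls down is nesting, because the next closing tag may belong to an inner span. A rough sketch in Python, with hypothetical names (`NEXT_CLOSE`, `first_span_body`) and made-up sample strings:

```python
import re

# Lazy match: stop at the first "</span>" that follows the opening tag.
NEXT_CLOSE = re.compile(r"<span[^>]*>(.*?)</span>", re.DOTALL)


def first_span_body(html):
    """Return the text between the first <span ...> and the next </span>, or None."""
    match = NEXT_CLOSE.search(html)
    return match.group(1) if match else None


if __name__ == "__main__":
    flat = '<span class="a">one</span> <span class="b">two</span>'
    print(first_span_body(flat))    # the body of the first span only

    nested = '<span class="a">outer <span class="b">inner</span> tail</span>'
    print(first_span_body(nested))  # cut short at the inner span's closing tag
```

In the nested case the "next" closing tag is the inner one, so the result stops too early. Counting opens and closes to find the matching tag is what a regex of this kind cannot do in general.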
|   |   | 
|  07-28-2014, 03:33 PM | #20 | |
| Wizard            Posts: 3,720 Karma: 1759970 Join Date: Sep 2010 Device: none | Quote: 
 I'd do it for you, but I'm feeling a little lazy myself right now. Wasn't one presented a few posts back? | |
|   |   | 
|  07-28-2014, 04:11 PM | #21 | |
| Wizard            Posts: 2,306 Karma: 13057279 Join Date: Jul 2012 Device: Kobo Forma, Nook | Quote: 
 You would be using the wrong tool for the job, and what you want is a parser! Here is some discussion on why Regex isn't recommended for parsing HTML: https://stackoverflow.com/questions/...e-html-why-not There is a reason why they are separate beasts.   | |
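As an illustration of the parser route, here is a small sketch using Python's standard-library `html.parser`. The class name `SpanText`, the helper `span_texts`, and the sample markup are invented for this example; it simply counts span depth so that the closing tag it stops at is the one that really matches.

```python
from html.parser import HTMLParser


class SpanText(HTMLParser):
    """Collect the text of every span with a given class, nested spans included."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.depth = 0      # span nesting depth inside a wanted span; 0 = outside
        self.current = []   # text pieces of the span being collected
        self.results = []   # finished span texts

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self.depth:
            # Already inside a wanted span: just track the nesting.
            self.depth += 1
        elif self.wanted_class in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self.current = []

    def handle_endtag(self, tag):
        if tag == "span" and self.depth:
            self.depth -= 1
            if self.depth == 0:
                # This is the close that matches the wanted opening tag.
                self.results.append("".join(self.current))

    def handle_data(self, data):
        if self.depth:
            self.current.append(data)


def span_texts(html, wanted_class):
    parser = SpanText(wanted_class)
    parser.feed(html)
    parser.close()
    return parser.results


if __name__ == "__main__":
    sample = '<p><span class="none2">outer <span class="x">inner</span> tail</span></p>'
    print(span_texts(sample, "none2"))
```

Because the parser sees tags as tags, nesting needs only a counter, which is the part the regex attempts struggle with.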
|   |   | 
|  07-28-2014, 05:05 PM | #22 | |
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | Quote: 
 Also fun link: http://stackoverflow.com/questions/6...lanation-in-la The second answer does a good job explaining why regex is a bad tool for html parsing (and when it is a good tool!). | |
|   |   | 
|  07-28-2014, 07:40 PM | #23 | 
| Wizard            Posts: 1,090 Karma: 447222 Join Date: Jan 2009 Location: Valley Forge, PA, USA Device: Kindle Paperwhite | |
|   |   | 
|  07-28-2014, 07:50 PM | #24 | 
| Grand Sorcerer            Posts: 13,684 Karma: 79983758 Join Date: Nov 2007 Location: Toronto Device: Libra H2O, Libra Colour | 
			
			You need to look for the version posted by Reverend Bob.
		 | 
|   |   | 
|  07-28-2014, 09:56 PM | #25 | 
| Wizard            Posts: 1,090 Karma: 447222 Join Date: Jan 2009 Location: Valley Forge, PA, USA Device: Kindle Paperwhite | 
			
			Sorry, but I still don't see anything close to that title or by Rev Bob. Can you point me in the right direction please? | 
|   |   | 
|  07-28-2014, 10:00 PM | #26 | |
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | Quote: 
 | |
|   |   | 
|  07-28-2014, 10:09 PM | #27 | |
| Wizard            Posts: 1,090 Karma: 447222 Join Date: Jan 2009 Location: Valley Forge, PA, USA Device: Kindle Paperwhite | Quote: 
 I only had a chance to browse the last 5 or 6 of the 53 pages, but it looks like a very useful tool. Is there a brief description of the options and features that will be in the final version? | |
|   |   | 
|  07-28-2014, 10:14 PM | #28 | 
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | 
			
			Buried, "somewhere".   It should add methods for 
 | 
|   |   | 
|  07-28-2014, 10:23 PM | #29 | 
| null operator (he/him)            Posts: 22,006 Karma: 30277294 Join Date: Mar 2012 Location: Sydney Australia Device: none | |
|   |   | 
|  07-28-2014, 10:27 PM | #30 | 
| Ex-Helpdesk Junkie            Posts: 19,421 Karma: 85400180 Join Date: Nov 2012 Location: The Beaten Path, USA, Roundworld, This Side of Infinity Device: Kindle Touch fw5.3.7 (Wifi only) | |
|   |   | 