| 
			
			 | 
		#1 | 
| 
			
			
			
			 Groupie 
			
			Posts: 196 
				Karma: 1003498 
				Join Date: Jun 2010 
				
				
				
				Device: none 
				
				
				 | 
	
	
	
		
		
			
			 
				
				pdf regex question - regex that wraps to a new line
			 
			
			
I'm trying to eliminate chapter titles that show up as headers in a PDF.

The PDF text looks something like this:

Blah blah blah blah Blah blah blah blah <br>
Blah blah blah blah Blah blah blah blah blah blah <br>
Blah blah blah blah Blah blah blah blah <br>
Blah blah blah blah Blah blah blah blah blah <br>
<hr/>
<a id="p55"></a>Some Chapter Title<br>
55<br>
blah blah blah blah <br>
Blah blah blah blah blah.<br>

Using the following regex, I'm able to select the chapter title line:

regex: <a id="p[0-9]*"></a>[A-Z][^<]*<br>
matches: <a id="p55"></a>Some Chapter Title<br>

But what I really want to match is that same line AND the page number on the next line:

<a id="p55"></a>Some Chapter Title<br>
55<br>

The reason is not just to get rid of the page numbers. The regex above sometimes also captures actual sentences of the book, but those sentences are not followed by page numbers; the page numbers only follow the chapter title headers, in this particular sequence.

The problem is that the regex won't wrap to the next line, so if I try:

regex: <a id="p[0-9]*"></a>[A-Z][^<]*<br>[0-9]*

I get zero matches. Any ideas?  | 
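One way to see where the attempt stalls is to run it against a small sample. The sketch below uses Python's `re` module, which is an assumption: the thread does not say which tool runs the regex, and other engines may report results differently. The newline after each `<br>` is inferred from the post. In Python, the pattern as posted still matches the title line, because `[0-9]*` is allowed to match nothing; it just stops at `<br>`, since the next character is a line break rather than a digit. Requiring at least one digit (`[0-9]+`, a variant not in the post) makes the match fail outright.

```python
import re

# Sample modeled on the post; the "\n" after each <br> is an assumption.
SAMPLE = (
    'Blah blah blah blah <br>\n'
    '<hr/>\n'
    '<a id="p55"></a>Some Chapter Title<br>\n'
    '55<br>\n'
    'blah blah blah blah <br>\n'
)

# The attempted pattern from the post.
attempt = re.compile(r'<a id="p[0-9]*"></a>[A-Z][^<]*<br>[0-9]*')

# Hypothetical variant that requires at least one digit right after <br>.
strict = re.compile(r'<a id="p[0-9]*"></a>[A-Z][^<]*<br>[0-9]+')

attempt_match = attempt.search(SAMPLE)
strict_match = strict.search(SAMPLE)

# The first pattern stops before the line break; the page number is not included.
print(repr(attempt_match.group(0)))
# The second finds nothing, because a newline sits between <br> and the digits.
print(strict_match)
```

Either way, the page number is never reached: nothing in the pattern is allowed to consume the line break that separates `<br>` from `55`.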
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 
| 
			
			 | 
		#2 | 
| 
			
			
			
			 Groupie 
			
			Posts: 196 
				Karma: 1003498 
				Join Date: Jun 2010 
				
				
				
				Device: none 
				
				
				 | 
	
	
	
		
		
		
		
		 
			
Figured it out.

regex: <a id="p[0-9]*"></a>[^<]*<br>[\r\n]*[0-9]*<br>

Will match:

<a id="p55"></a>Some Chapter Title<br>
55<br>  | 
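The `[\r\n]*` in this pattern is what lets it cross the line break: it consumes any carriage returns or newlines between the title's `<br>` and the page number. Here is a minimal sketch of using it to strip the headers, assuming Python's `re` module (the thread does not name the tool) and a newline after each `<br>`. The `p56` line is a made-up example of a real book sentence that starts a page and should be left alone.

```python
import re

# Sample modeled on the post; newlines after <br> are assumed.
SAMPLE = (
    'Blah blah blah blah <br>\n'
    '<hr/>\n'
    '<a id="p55"></a>Some Chapter Title<br>\n'
    '55<br>\n'
    'blah blah blah blah <br>\n'
    # Hypothetical page that opens with book text, not a header.
    '<a id="p56"></a>An actual sentence of the book <br>\n'
    'that continues on the next line.<br>\n'
)

# The pattern from the post: title line, optional line breaks, page number.
HEADER = re.compile(r'<a id="p[0-9]*"></a>[^<]*<br>[\r\n]*[0-9]*<br>')

matches = HEADER.findall(SAMPLE)

# Remove each header together with its page number.
cleaned = HEADER.sub('', SAMPLE)

print(matches)
print(cleaned)
```

The sentence on the hypothetical `p56` page survives because its `<br>` is followed by more text instead of a bare number and a second `<br>`. Note that `[0-9]*` also accepts an empty page number, so a title followed directly by an empty `<br>` line would match as well; `[0-9]+` would be stricter, though that change is a suggestion and not part of the posted solution.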
| 
		 | 
	
	
	
		
		
		
		
			 
		
		
		
		
		
		
		
			
		
		
		
	 | 