|
|
#1 |
|
Junior Member
Posts: 4
Karma: 10
Join Date: Feb 2013
Device: kobo touch
|
Hello,
I am trying to match a string (with regex, in Sigil) that doesn't end with .</p>. I tried for an hour, including googling. Thanks, Martin |
|
|
|
|
|
#2 |
|
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
|
|
|
|
|
|
#3 |
|
Well trained by Cats
Posts: 31,240
Karma: 61360164
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Sigil 0.7 has a saved search, "Join paragraphs", that looks for paragraphs that end with a letter or comma (not a period).
It does not find all the other, sometimes valid, cases. For those, you need to carefully craft your search and step through the matches, either skipping (Find) or replacing (Replace+Find). Never use Replace All with the others.
|
|
|
|
|
|
#4 |
|
Junior Member
Posts: 4
Karma: 10
Join Date: Feb 2013
Device: kobo touch
|
Sorry for the late reply!
I am using Sigil 0.6.2 because Sigil 0.7.1 starts a few times and then refuses to start (when I launch it, nothing happens; I found a lot of material for Linux but nothing for Windows). Maybe someone knows where the problem is.
Back to the topic: I have paragraphs like
<p>blah some words </p> <p>next words and so on.</p>
and I want
<p>blah some words next words and so on.</p>
Regards, and thanks for the help |
|
|
|
|
|
#5 |
|
Well trained by Cats
Posts: 31,240
Karma: 61360164
Join Date: Aug 2009
Location: The Central Coast of California
Device: Kobo Libra2,Kobo Aura2v1, K4NT(Fixed: New Bat.), Galaxy Tab A
|
Quote:
Code:
(?sm)([a-z,]) </p>\s+<p>
Do note: your example has a trailing space (the capture discards it); the replace inserts this. This S&R deliberately does not include a trailing hyphen/mdash; you need to review each of those as join candidates. |
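For anyone who wants to see what that search does outside Sigil, here is a minimal sketch in C. It is not how Sigil implements it. The name `join_paragraphs` is made up for this example, and the replacement (captured character plus one space) is an assumption based on the note about the trailing space, since the replace string itself is not shown in the post.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst (capacity cap) and joins two
   paragraphs wherever the text matches  [a-z,] " </p>" whitespace+ "<p>".
   Assumed replacement: the captured character followed by one space.
   Returns 0 on success, -1 if dst is too small. */
int join_paragraphs(const char *src, char *dst, size_t cap)
{
    static const char close_tag[] = " </p>";   /* note the leading space */
    static const char open_tag[] = "<p>";
    const size_t close_len = sizeof close_tag - 1;
    const size_t open_len = sizeof open_tag - 1;
    size_t out = 0;

    if (cap == 0)
        return -1;

    while (*src != '\0') {
        char c = *src;
        if (((c >= 'a' && c <= 'z') || c == ',') &&
            strncmp(src + 1, close_tag, close_len) == 0) {
            const char *ws = src + 1 + close_len;  /* start of the \s+ run */
            const char *p = ws;
            while (*p != '\0' && isspace((unsigned char)*p))
                p++;
            if (p > ws && strncmp(p, open_tag, open_len) == 0) {
                /* Match: keep the captured character, insert one space,
                   and drop " </p>", the whitespace and "<p>". */
                if (out + 2 >= cap)
                    return -1;
                dst[out++] = c;
                dst[out++] = ' ';
                src = p + open_len;
                continue;
            }
        }
        if (out + 1 >= cap)
            return -1;
        dst[out++] = *src++;
    }
    dst[out] = '\0';
    return 0;
}
```

Like the regex, this only joins when there is a space before </p>, and it leaves paragraphs ending in a period, hyphen or dash alone, so those still need reviewing by hand.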
|
|
|
|
|
|
#6 |
|
Wizard
Posts: 2,306
Karma: 13057279
Join Date: Jul 2012
Device: Kobo Forma, Nook
|
I answered this same exact situation last month in this topic (with the two regular expressions I use to clean this):
https://www.mobileread.com/forums/sho...89#post2446589
Last edited by Tex2002ans; 03-24-2013 at 07:13 PM. |
|
|
|
|
|
#7 |
|
actually it is /var/log
Posts: 341
Karma: 2994236
Join Date: Sep 2012
Location: usually Europa
Device: prs t1
|
For me, the regex solution was not really comfortable enough. As I usually start cleaning from pure text, I thought I'd do something really nice and wrote this small piece:
Code:
/*
  stripLF.c - remove false line breaks
  compile with: cc stripLF.c -o stripLF
  usage:
    stripLF -h
    stripLF [-C][-p][-H] < file.in > file.out
    cat file.in | stripLF [-C][-p][-H] > file.out
  options:
    -h: print help text and exit
    -C: line break before [C]apitals is legitimate too (for poetry)
    -p: change line breaks into </p> LF <p>
    -H: change text into bare html page (implies -p)
  removes all carriage returns
  removes all line breaks which are not
    preceded by . _ ! ? * ' " ] > or another line break,
    or followed by another line break
    or (with option -C) by a capital letter
  removes multiple spaces
  (c)varlog 2013
  LICENSE: FREE FOR ALL
*/
#include <stdio.h>

#define LF 0x0A
#define CR 0x0D
#define SPACE 0x20
#define SINGLE_QUOTE 0x27
#define DOPPEL_QUOTE 0x22
#define VERSION 1.02

void usage(void)
{
    printf("\n**********************************************************\n");
    printf("stripLF: remove false line breaks \n");
    printf("usage: \n");
    printf("stripLF [-h] \n");
    printf("stripLF [-C] [-p] [-H] < file.in > file.out \n");
    printf("cat file.in |stripLF [-C] [-p] [-H] > file.out \n");
    printf("options:\n");
    printf("-h: print this help text and exit\n");
    printf("-C: line break before [C]apitals is legitimate too\n");
    printf("-p: change line breaks into </p> LF <p>\n");
    printf("-H: add <html><body>......</body></html> tokens, implies -p \n");
    printf("v %.2f 2013 (c) varlog\n", VERSION);
    printf("***********************************************************\n");
}

int main(int argc, char **argv)
{
    int ch, pch = LF, nch = 0;  // current, previous and next (look-ahead) character
    int i;
    int Cflag = 0;
    int Hflag = 0;
    int pflag = 0;

    for (i = 1; i < argc; i++) {
        if (argv[i][0] == '-') {
            switch (argv[i][1]) {
            case 'C':           // capitals
                Cflag = 1;
                break;
            case 'p':           // LF --> </p><p>
                pflag = 1;
                break;
            case 'h':           // help: print it and exit, as documented
                usage();
                return 0;
            case 'H':           // --> html
                Hflag = 1;
                pflag = 1;
                break;
            default:
                break;
            }
        }
    }

    if (Hflag) printf("<html>\n<body>\n");
    if (pflag) printf("<p>");

    while ((ch = getchar()) != EOF) {
        if (ch == SPACE && pch == SPACE) {
            continue;           // remove space if more than one by ignoring it
        }
        if (ch != LF && ch != CR) {
            putchar(ch);        // just next letter
        } else if (ch == CR) {
            ch = pch;           // remove CR by ignoring it (pch stays unchanged)
        } else {
            // get next char ignoring SPACE; CR is skipped as well, so that
            // CRLF input does not leak a CR into the output
            while ((nch = getchar()) == SPACE || nch == CR)
                ;
            if (nch == EOF) {
                putchar(ch);
                break;
            }
            if (                // it is line break!
                pch == ']' ||
                pch == '>' ||
                pch == '*' ||
                pch == '_' ||
                pch == '.' ||
                pch == '!' ||
                pch == '?' ||
                pch == SINGLE_QUOTE ||
                pch == DOPPEL_QUOTE ||
                pch == LF ||
                nch == LF ||
                (Cflag == 1 && nch >= 0x40 && nch <= 0x5A)  // capitals and @
            ) {
                if (pflag) printf("</p>");
                putchar(ch);
                if (pflag) printf("<p>");
                putchar(nch);
                ch = nch;
            } else {            // phony line break
                putchar(SPACE); // change LF into space
                putchar(nch);
                ch = nch;
            }
        }
        pch = ch;
    } // end while

    if (pflag) printf("</p>");
    if (Hflag) printf("\n</body>\n</html>");
    return 0;
}
- but it works for me.
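If you want to adjust the rule that decides which line breaks survive, it may help to look at it in isolation. The following is only a sketch of that one condition pulled out into a function; the name `keeps_line_break` and its parameters are made up here and do not appear in stripLF.c.

```c
#include <stdbool.h>

/* Hypothetical helper mirroring the break test in stripLF.c.
   prev:        last character seen before the line feed
   next:        first character after it, spaces already skipped
   capitals_ok: non-zero behaves like the -C option
   Returns true when the line feed should be kept as a real break. */
static bool keeps_line_break(int prev, int next, int capitals_ok)
{
    switch (prev) {
    case ']': case '>': case '*': case '_':
    case '.': case '!': case '?':
    case '\'': case '"': case '\n':
        return true;            /* line ends like a finished sentence or block */
    default:
        break;
    }
    if (next == '\n')
        return true;            /* blank line follows: paragraph boundary */
    /* Same range as the program: 0x40..0x5A, that is '@' and 'A'..'Z'. */
    return capitals_ok && next >= 0x40 && next <= 0x5A;
}
```

Adding another terminator, for example a closing parenthesis, would then be a one-line change in the switch, and the rule can be checked without piping whole files through the program.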