#1 |
Zealot
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
|
regex replace???
Hi all,
I've run into a new regex problem while converting my books. Any suggestions?
____________________________________
der neuen Rechtschreibung. <br> <br> 4 <br> <hr> <A name=5></a><i>Vorwort </i><br> <br> Katzen haben in meinem
____________________________________
The result is:
der neuen Rechtschreibung. Vorwort Katzen haben in meinem Leben
but I want it to be:
____________________________________
der neuen Rechtschreibung.
Vorwort
Katzen haben in meinem Leben
____________________________________
Is this possible? I have tried many patterns, but none of them does what I want. How can I replace <i> and </i> with the same tag followed by a newline? Or simply insert newlines at the tags?
Thanks for the help, olaf |
#2 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
It's not entirely clear to me what you want. Do you want to remove all markup from that snippet? Or are the results you posted shown without the markup, as they would be rendered in the reader?
Also, without knowing what regexes you tried, it's hard to know what to suggest. |
#3 |
Zealot
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
|
Hi Manichean,
the results shown are without any regex applied; this is simply what the EPUB conversion produces. I want to insert a newline at every <i> and </i> tag. In the original, the word "Vorwort" stands alone on its own line; after conversion it is part of the next line. I hope that explains it (sorry about my bad English, school was a long time ago). olaf |
#4 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Oh, alright, I understand. However, in your conversion results, the number "4" is missing, which I presume to be a page number removed by a regex, right?
What source format are you converting from and what format are you converting to? |
#5 |
Zealot
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
|
No,
the number is not the problem. I want to insert a newline after each <i> tag so that the tagged word ends up on a separate line. Removing the page numbers is not the problem yet. Source: PDF, target: EPUB |
#6 |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
I don't know if that is possible. The problem is, I believe, that the source document, judging from the snippet you posted, doesn't seem to be marked up using paragraph tags, but rather "dumb" linebreaks. You could try enabling the preprocessing facility (found in the structure detection part of the conversion settings) and see if Calibre fixes the markup to include paragraph tags. However, I cannot say if that will necessarily occur for every italics tag.
Another, probably better, solution that comes to mind would be using Sigil. Assuming that the italics tags are preserved, as they should be, you could do a search and replace in Sigil on the XHTML and add linebreaks (or paragraph tags) before and/or after the italics as you like. |
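To make the search-and-replace idea above concrete, here is a minimal sketch that wraps each italic run in its own paragraph. It uses Python's `re` module as a stand-in for Sigil's dialog, which is an assumption; the function name `italics_to_paragraphs` is made up for illustration, and the exact replacement syntax in Sigil may differ.

```python
import re

# Snippet taken from the first post of this thread.
SNIPPET = (
    'der neuen Rechtschreibung. <br> <br> 4 <br> <hr> '
    '<A name=5></a><i>Vorwort </i><br> <br> Katzen haben in meinem'
)

def italics_to_paragraphs(html: str) -> str:
    """Wrap every <i>...</i> run in <p>...</p> (hypothetical helper)."""
    # Non-greedy .*? stops at the first closing tag, so two separate
    # italic runs are not merged into one match.
    return re.sub(
        r'(<i>.*?</i>)',
        r'<p>\1</p>',
        html,
        flags=re.IGNORECASE | re.DOTALL,
    )

if __name__ == '__main__':
    print(italics_to_paragraphs(SNIPPET))
```

Paragraph tags are used here rather than a bare newline because a reader normally collapses plain whitespace in XHTML; swapping the replacement for `<br/>\1<br/>` would give linebreaks instead.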
#7 |
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Wait for the next release. The header and footer regexes are being replaced with a true regex search and replace. You will be able to specify that <i> be replaced with <i>\n.
|
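As a rough illustration of the replacement described above (`<i>` becoming `<i>\n`), the following sketch uses Python's `re.sub`. It is not calibre's own code; the helper name `newline_after_italic_tags` is hypothetical, and extending the pattern to the closing tag is an assumption based on the original question.

```python
import re

def newline_after_italic_tags(html: str) -> str:
    """Append a newline after every <i> and </i> tag (hypothetical helper)."""
    # \g<0> re-inserts the whole match (the tag itself); \n in a Python
    # replacement template is turned into a real newline character.
    return re.sub(r'</?i>', r'\g<0>\n', html)

if __name__ == '__main__':
    print(newline_after_italic_tags('<i>Vorwort </i><br>'))
```

Note that this only changes the source text; whether the reader shows a visible line break depends on the surrounding markup.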
#8 |
Zealot
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
|
Thanks for the info, user_none.
I hope the next release is coming soon, so that I can edit my books quickly. Do you have a date for the release? olaf |
#9 |
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
There is never a firm date for a release. However, they typically happen once a week.
|
#10 |
Zealot
Posts: 119
Karma: 100
Join Date: Jan 2011
Location: Germany / NRW /Köln
Device: prs-650 / prs-350 /kindle 3
|
I'll wait patiently
|
#11 | |
Avid reader
Posts: 19
Karma: 10
Join Date: Feb 2009
Location: Argentina
Device: Kindle 3 wifi
|
Quote:
Wolf. |
|
#12 | |
Avid reader
Posts: 19
Karma: 10
Join Date: Feb 2009
Location: Argentina
Device: Kindle 3 wifi
|
Quote:
Thanks, Wolf. |
|
#13 | |
Wizard
Posts: 3,130
Karma: 91256
Join Date: Feb 2008
Location: Germany
Device: Cybook Gen3
|
Quote:
|
|
#14 | |
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Quote:
They have been replaced with the search and replace. Put the regex in the regex field and leave the replace field blank if you want it to delete the matched content. |
|
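Leaving the replace field blank amounts to substituting the empty string for every match. A small sketch of that idea in Python follows; the pattern `PAGE_NUMBER` is only a guess at what a page-number regex for the snippet in the first post might look like, and the helper name `strip_page_numbers` is hypothetical.

```python
import re

# Assumed pattern: a lone number between linebreaks, followed by a rule,
# as in "<br> 4 <br> <hr>" from the first post. Real books may need
# a different expression.
PAGE_NUMBER = re.compile(r'<br>\s*\d+\s*<br>\s*<hr>', re.IGNORECASE)

def strip_page_numbers(html: str) -> str:
    """Delete every match by replacing it with an empty string."""
    return PAGE_NUMBER.sub('', html)

if __name__ == '__main__':
    sample = ('der neuen Rechtschreibung. <br> <br> 4 <br> <hr> '
              '<A name=5></a><i>Vorwort </i>')
    print(strip_page_numbers(sample))
```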
#15 |
Avid reader
Posts: 19
Karma: 10
Join Date: Feb 2009
Location: Argentina
Device: Kindle 3 wifi
|
Thanks Manichean, user_none for the confirmation. I'll go & craft my regex removal then. Thanks! Wolf
|