02-25-2023, 06:47 AM   #1
HenryHutton
Junior Member
Posts: 3
Karma: 10
Join Date: Feb 2023
Device: Kobo Nia (Not so happy, so far) [Formerly Kindle (7th gen)]
[Regex] How to remove a whole final section from a blog post?

I used "Fetch news" to grab some posts from the blog of an author whose work I would like to have on my e-reader as an ebook.

Now I am trying to use the Editor to cut out the last section of each post, which contains links and information I don't need in the final ebook.

All posts end with a signature, a motto, which is:
Code:
<p class="calibre10"><span>[Il mondo è bello, siamo noi ad esser ciechi]</span></p>
So my aim is to get an expression that captures this signature paragraph (as a (group) to feed the "replace" field) and then matches everything after it, down to
Code:
</body>

</html>
(ideally as a second (group)), so as to trim out all the links and unneeded info.
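To make the goal concrete, here is a minimal sketch in plain Python of the kind of two-group pattern described above. The `sample` string is a made-up stand-in for one fetched post (the real file has more markup), and the lazy `.*?` with "dot matches newline" is just one way to skip the unwanted part, not necessarily the best one.

```python
import re

# Hypothetical sample standing in for one fetched post.
sample = """<html>
<body>
<p class="calibre10">Post text.</p>
<p class="calibre10"><span>[Il mondo è bello, siamo noi ad esser ciechi]</span></p>
<p class="calibre10"><a href="http://example.com/">Unwanted link</a></p>
</body>

</html>
"""

pattern = re.compile(
    # Group 1: the signature paragraph, which should be kept.
    r'(<p class="calibre10"><span>'
    r'\[Il mondo è bello, siamo noi ad esser ciechi\]'
    r'</span></p>)'
    # Everything between the signature and the closing tags (dropped).
    r'.*?'
    # Group 2: the closing tags, allowing any whitespace between them.
    r'(</body>\s*</html>)',
    re.DOTALL,  # let "." match newlines too
)

# Keep both groups, discard what was between them.
trimmed = pattern.sub(r'\1\n\2', sample)
print(trimmed)
```

In the Editor's search panel the same idea would presumably be the pattern with `(?s)` in front (or the "Dot all" option ticked) in the "find" field and `\1\n\2` in the "replace" field, but I have not checked that against the attached file.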

Well, so far I haven't achieved much...

My BEST guess was...
Code:
(\[Il mondo è bello, siamo noi ad esser ciechi\])*</body>\w+</html>
but of course it doesn't work.
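A small Python check suggests two reasons the attempt above finds nothing. The `tail` string is a hypothetical, shortened ending of a post, and the `fixed` pattern is only one possible adjustment.

```python
import re

# Hypothetical, shortened ending of one post.
tail = (
    "<span>[Il mondo è bello, siamo noi ad esser ciechi]</span></p>\n"
    "<p>links</p>\n"
    "</body>\n"
    "\n"
    "</html>"
)

# The original attempt, unchanged.
attempt = r"(\[Il mondo è bello, siamo noi ad esser ciechi\])*</body>\w+</html>"

# 1. "(...)*" repeats the motto itself zero or more times; it does not
#    skip over the text that follows the motto.
# 2. "\w+" needs at least one letter, digit or underscore between the two
#    closing tags, but only whitespace (newlines) sits there.
print(re.search(attempt, tail))  # None

# One possible adjustment: a lazy ".*?" to skip the unwanted part, and
# "\s*" for the whitespace between the closing tags.
fixed = r"(\[Il mondo è bello, siamo noi ad esser ciechi\]).*?(</body>\s*</html>)"
match = re.search(fixed, tail, re.DOTALL)
print(match.group(1))
print(repr(match.group(2)))
```

`re.DOTALL` is what lets `.` run across line breaks; without it the lazy `.*?` stops at the first newline and the pattern still fails.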

Any other functions/tricks that would achieve the same output are welcome!

I am running short of time, which is why I am asking for hints instead of reading and learning more (or editing them all out manually).



I attach one of the fetched HTML files.

The blog is reachable here, for the record:
http://www.salvatorebrizzi.com/
Attached Files
File Type: zip index_u10.html.zip (7.4 KB, 135 views)