All you need is something like this in preprocess_raw_html:
Code:
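import json  # needed once, at the top of the recipe
# Parse the raw page and find the embedded JSON-LD metadata block.
soup = self.index_to_soup(raw)
script = soup.find('script', type="application/ld+json")
# The tag's first child is the JSON text itself.
data = json.loads(str(script.contents[0]))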
Then you simply extract the data from the JSON, convert it to simple HTML, and return that.
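
The conversion step could look roughly like the sketch below. The helper name jsonld_to_html is made up for illustration, and the keys 'headline' and 'articleBody' are an assumption based on the schema.org NewsArticle type, so check what the site's JSON-LD actually contains and adjust the names.

```python
from html import escape


def jsonld_to_html(data):
    # Hypothetical helper: builds a minimal HTML page from a JSON-LD dict.
    # 'headline' and 'articleBody' are assumed schema.org NewsArticle keys;
    # the site you are scraping may use different ones.
    if isinstance(data, list):
        # Some sites wrap the object in a list; take the first entry.
        data = data[0]
    title = escape(data.get('headline', ''))
    body = data.get('articleBody', '')
    # Treat each non-empty line of the body text as one paragraph.
    paras = ''.join(
        '<p>{}</p>'.format(escape(line.strip()))
        for line in body.splitlines() if line.strip()
    )
    return (
        '<html><head><title>{0}</title></head>'
        '<body><h1>{0}</h1>{1}</body></html>'
    ).format(title, paras)
```

In the recipe you would then end preprocess_raw_html with something like return jsonld_to_html(data), so calibre gets the simplified page instead of the original markup.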