Okay, so I thought I had everything figured out. I've generated the proper mappings in create_opf without any issue.
And in postprocess_book, I can even find every HTML file and fix the hrefs, for example:
Code:
import os

def postprocess_book(self, oeb, opts, log):
    feed_dir = os.path.join(self.output_dir, 'feed_0')
    for output in self.path_remappings.values():
        path = os.path.join(feed_dir, output)
        # Load the HTML file in
        with open(path) as f:
            soup = bs(f)
        # Replace all the anchors that point at a remapped /case/ URL
        # (href=True skips anchors that have no href attribute at all)
        for anchor in soup.findAll('a', href=True):
            href = anchor['href']
            if '/case/' in href and href in self.path_remappings:
                anchor['href'] = '../' + self.path_remappings[href]
        # Write it back out; the with block closes the file for us
        with open(path, 'wb') as f:
            f.write(unicode(soup).encode('utf-8'))
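To rule out the lookup itself as the culprit, the href rewrite can be pulled out into a small standalone function. This is only a minimal sketch: remap_href is a made-up helper name and the sample mapping is invented for illustration; neither is part of calibre or of my actual recipe.

```python
def remap_href(href, path_remappings, prefix='../'):
    """Return the remapped href, or the original one if no mapping applies.

    Hypothetical helper: mirrors the check done inside postprocess_book.
    """
    if '/case/' in href and href in path_remappings:
        return prefix + path_remappings[href]
    return href


# Sample mapping, assumed for illustration only
path_remappings = {'/case/174': 'article_5/index.html'}

print(remap_href('/case/174', path_remappings))  # mapped entry
print(remap_href('/case/999', path_remappings))  # no mapping, left unchanged
print(remap_href('/about', path_remappings))     # not a /case/ link, left unchanged
```

With this split out, the mapping step can be checked without touching any files, which at least suggests the problem is in what happens to the files afterwards rather than in the lookup.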
Looking at the Soup, I see that the href went from:
their <a href="/case/174">newly appointed</a> master-in-training Zjing decided that they should work in separate shifts -- Landhwa by day, Wangohan by night.</p>
To:
their <a href="../article_5/index.html">newly appointed</a> master-in-training Zjing decided that they should work in separate shifts -- Landhwa by day, Wangohan by night.</p>
However, in the very final file (article_5/index_u1.html), it ends up like this:
their <a href="../..//case/174">newly appointed</a> master-in-training Zjing decided that they should work in separate shifts -- Landhwa by day, Wangohan by night.</p>
Am I going about this the wrong way, by messing with the HTML files in the output directory? Should I instead be mucking around with some internal structure in oeb?