Originally Posted by ldolse
Cool, I'll check out html.py and open some bugs. I think the first two and the third regexes are pretty conservative, so I'll see how they work across some other novels. The third one is a bit more questionable: it works well with what I've tested so far, but I'm not sure it's worth the time expense for most users with the existing greedy matches.
@StefTeamEdward, I haven't run into a PDF with page numbers that need dealing with yet, so I haven't put much thought into that, sorry. So far Calibre doesn't recognize any chapters at all in the PDFs I've converted.