04-07-2009, 01:45 PM   #5
ldolse
Cool, I'll check out html.py and open some bugs. I think the first two regexes are pretty conservative, so I'll see how they work across some other novels. The third one is a bit more questionable: it works well with what I've tested so far, but I'm not sure it's worth the time cost for most users with the existing greedy matches.

@StefTeamEdward, I haven't run into a PDF with page numbers that need dealing with yet, so I haven't put much thought into that, sorry. So far Calibre doesn't recognize any chapters at all in the PDFs I've converted.