Brilliant, thanks Kovid. I got close; it appears I should have just skipped experimenting with PDFDocument. Two further questions, if I may...
(1) Correct me if I'm wrong, but pdfreflow seems to write its output to the current working directory. I can't find "the magic" that would allow using a PersistentTemporaryDirectory as the current working directory. I looked for something like os.chdir in the Calibre code but couldn't spot anything. What is the recommended way?
(2) With your changes to pdfreflow, will it be possible either to know how many pages there are, or to specify in some way a range for the end pages? Unfortunately, I have found quite a number of PDFs where the ISBN is at the end, so in an ideal world I would scan, say, the first 10 pages and the last 5. If there is no way to do that without scanning the whole document then so be it.
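
To make (2) concrete, here is a rough sketch of the page selection I have in mind, assuming the total page count were available from somewhere. `pages_to_scan` is just a hypothetical helper of my own, not anything in pdfreflow or Calibre, and the defaults of 10 and 5 are the numbers mentioned above.

```python
def pages_to_scan(total_pages, first=10, last=5):
    """Return the 1-based page numbers to search for an ISBN.

    Hypothetical helper: takes the first `first` pages and the last
    `last` pages, without duplicates when the document is short.
    """
    if total_pages <= 0:
        return []
    # Leading pages: 1 .. first, capped at the document length.
    head = range(1, min(first, total_pages) + 1)
    # Trailing pages: the last `last` pages, never starting before page 1.
    tail_start = max(total_pages - last + 1, 1)
    tail = range(tail_start, total_pages + 1)
    # Merge the two ranges so overlapping pages appear only once.
    return sorted(set(head) | set(tail))
```

For a short document the two ranges overlap, so the helper simply returns every page once.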