MobileRead Forums - View Single Post - HIGHLY RECOMMEND - for non-PDF

Philosopher · 04-01-2015, 05:40 PM

Quote:

Originally Posted by JSWolf

You can't do PDF>another format all that well because you will have errors. The only way to avoid errors in the conversion is to A/B compare the entire PDF to the entire other format. Only after you've fixed the errors can you say you've done a good conversion.

Yes - I have been trying to "perfect" a system for converting PDF to mobi for my Kindle, but editing the little details takes so much time. I have been reading a few suggestions on streamlining the process, such as getting rid of headers and footers. So I am keeping it on the back burner while I see if I can come up with a "set" of standard procedures. (The other problem is that not all PDFs share THE SAME layout, so the steps will vary by file.)
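As a rough illustration of the header/footer cleanup mentioned above, here is a minimal Python sketch. It is not a procedure from this thread. It assumes the PDF text has already been extracted into one list of lines per page (by whatever tool you use), and the names `normalize`, `find_repeated_edges`, `strip_edges`, `depth`, and `min_share` are made up for the example. The idea is to treat a line as a header or footer if it shows up in the top or bottom few lines of most pages, once page numbers are masked out.

```python
import re
from collections import Counter


def normalize(line):
    """Mask digits and collapse whitespace so 'Page 12' and 'Page 13' compare equal."""
    return re.sub(r"\d+", "#", " ".join(line.split())).lower()


def find_repeated_edges(pages, depth=2, min_share=0.6):
    """Return normalized lines that recur near the top or bottom of most pages.

    pages     -- list of pages, each a list of text lines (assumed input shape)
    depth     -- how many non-blank lines at each edge of a page to inspect
    min_share -- fraction of pages a line must appear on to count as repeated
    """
    counts = Counter()
    for lines in pages:
        body = [l for l in lines if l.strip()]
        # A set, so a line is counted at most once per page.
        counts.update({normalize(l) for l in body[:depth] + body[-depth:]})
    threshold = max(2, int(len(pages) * min_share))
    return {text for text, n in counts.items() if n >= threshold}


def strip_edges(pages, depth=2, min_share=0.6):
    """Drop repeated header/footer lines (and blank lines) from every page."""
    repeated = find_repeated_edges(pages, depth, min_share)
    cleaned = []
    for lines in pages:
        body = [l for l in lines if l.strip()]
        last = len(body) - depth
        kept = [
            l for i, l in enumerate(body)
            # Only lines in an edge position are candidates for removal.
            if not ((i < depth or i >= last) and normalize(l) in repeated)
        ]
        cleaned.append(kept)
    return cleaned
```

Because not every PDF is laid out the same way, `depth` and `min_share` would likely need adjusting per file, and it is worth spot-checking a few pages of the result against the original before converting the cleaned text.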

I hope someone comes up with a fairly EASY (GUI) program for converting PDF to other formats. It would be a great program. (Perhaps the people at A-PDF, who have so many great PDF tools, might consider it.)

Ideally it would have a dual window: the ORIGINAL PDF on the left, showing its formatting, and on the right the NEW FORMAT as it WILL APPEAR.

That way, when you make a particular change, you can SEE whether it will work universally throughout the whole file. You would see the final product before doing the processing, and so be fairly certain of the quality of the outcome.

That seems doable, and would be great. (I guess it could go the other way too, TO PDF, but I really have no problem with the general result of converting mobi or epub to PDF. It would be nice if it did a better job of PAGINATING the PDF, but the output is still easily readable without that, so when I need it I can use it fairly productively.)

On that note, ONE OTHER idea I wish could be developed is a way to convert a DJVU file (with text, not just image) to a PDF and keep the text, rather than just the image, which is a LIMITATION of every tool I know of. I guess that's probably a bit harder, given that there are actually two layers: the image, and behind it a text layer that is "linked" to the image (sometimes not that well, with the highlight not quite covering the actual line, but this is not the norm). The converter would have to create a new image in the PDF, take the background text layer from the DJVU, and know how to "link" it to the "place" on the page where the image shows each line of text.

I guess it's possible, but perhaps too difficult to make it a really viable idea. That leaves me the task of converting the DJVU (with text) to PDF (as image), and then having Acrobat (or another program) OCR the PDF. That's not going to give you the "perfect" output that the DJVU had. But at least OCR has been getting better and better every year.