I have 1400+ PNG files of book pages. The images do not appear to be scanned; they are probably computer generated, since there is no color corruption, distortion, etc. They are probably from an official e-book version of the book. They are all 1519 x 2459 pixels (I believe the printed book has 105 x 170 mm pages).
- I tried to merge them with Adobe Acrobat Pro, but after merging I couldn't save the final PDF; it somehow broke the save function. I tried a couple of times, then tried doing it in halves. The first 700 pages could be saved, but the second half broke it again.
- I tried to merge them with ABBYY FineReader. However, no matter which settings I tried, it produced a PDF with inconsistent page sizes (even though all images have the same resolution), and some pages came out zoomed in.
- I finally merged them with PDF Shaper Professional. It gave me a 2GB+ PDF file in which every page has the same size, 401.9 x 650.6 mm (for comparison, A4 is 210 x 297 mm). Then I processed this PDF with ABBYY and converted it to a searchable PDF. Finally, I compressed it with Acrobat Pro and ended up with a 250MB+ searchable PDF.
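As a sanity check on the page sizes above, here is a minimal sketch of the pixel-to-millimeter arithmetic. It assumes the merged PDF placed each image at 96 DPI, which is only my guess from the numbers and not something the tool reported; the function names are just illustrative.

```python
MM_PER_INCH = 25.4

def page_size_mm(width_px, height_px, dpi):
    """Physical page size (in mm) of an image placed at the given DPI."""
    return (width_px / dpi * MM_PER_INCH, height_px / dpi * MM_PER_INCH)

def implied_dpi(pixels, length_mm):
    """DPI needed for `pixels` to span exactly `length_mm` millimeters."""
    return pixels * MM_PER_INCH / length_mm

# Assumption: the merged PDF placed the 1519 x 2459 px images at 96 DPI.
w_mm, h_mm = page_size_mm(1519, 2459, 96)
print(round(w_mm, 1), round(h_mm, 1))  # 401.9 650.6

# DPI at which the same images would fill a 105 x 170 mm page.
dpi_w = implied_dpi(1519, 105)
dpi_h = implied_dpi(2459, 170)
print(round(dpi_w), round(dpi_h))  # 367 367
```

If that assumption holds, the pages are oversized only because of the DPI used at placement: the same pixels at roughly 367 DPI would give the 105 x 170 mm page I expect, without resampling the images.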
This PDF is still heavy. ABBYY crashes when I try to convert it to EPUB. MS Word can't handle the file size, and after I split the PDF it can't handle the page size either. Kindle and Google Play Books don't accept it because of its size.
I need to reduce its page size to make it lighter. However, every solution I tried produced image-only PDFs and deleted the OCR data. I could run it through ABBYY again to make it searchable, but I suppose the OCR would be less accurate since I would have lowered the quality of the page images. I want a light, searchable PDF of this book that I can then convert to EPUB so I can read it easily on my phone. As far as I have seen, there are no graphics in the pages except for the cover page; it is an anthology of short stories. I'm using Windows 10. The OCR language is Turkish, but the book is old and uses old words, so dictionary-based OCR is less accurate. What do you suggest I do?