|  07-07-2024, 07:35 AM | #1 | 
| Junior Member  Posts: 5 Karma: 10 Join Date: Jun 2024 Device: desktop | 
				
				Conversion from pdf to txt yields (near) empty file
			 
			
			Hi there, I'm facing a small problem: I am trying to convert a PDF to TXT. The PDF lets me select and copy-paste text, so it's not just images. I first tried:
			Code:
			pdftotext source.pdf source.txt
			and then:
			Code:
			ebook-convert source.pdf source.txt
			Either way, the resulting file is (nearly) empty. Does anybody know if there are options I could try in this case, or is it likely hopeless (e.g. because of the complexity of the PDF)? I'm using ebook-convert (calibre 7.13.0). Thanks a lot in advance! | 
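For anyone hitting the same symptom, here is a rough sketch of how the pdftotext attempt above could be repeated in a few modes while measuring how much text actually comes out. It assumes poppler's pdftotext is installed and that `source.pdf` is the file from the post; `count_text_chars` and `try_pdftotext_modes` are hypothetical helper names, and `-layout` and `-raw` are simply alternative extraction modes worth trying, not a known fix for this PDF.

```shell
#!/bin/sh
# Count non-whitespace characters in a text file. A value near 0 means the
# extraction is effectively empty (form feeds and newlines are not counted).
count_text_chars() {
    tr -d '[:space:]' < "$1" | wc -c | tr -d ' '
}

# Run pdftotext in its default, -layout and -raw modes and report how much
# text each mode yields. Output file names (out.txt, out-layout.txt,
# out-raw.txt) are illustrative only.
try_pdftotext_modes() {
    pdf=$1
    for mode in "" "-layout" "-raw"; do
        out="out${mode}.txt"
        # $mode is left unquoted on purpose so the empty mode adds no argument.
        pdftotext $mode "$pdf" "$out" &&
            printf '%s\t%s chars\n' "${mode:-default}" "$(count_text_chars "$out")"
    done
}

# Only run when the tool and the input file are actually present.
if command -v pdftotext >/dev/null 2>&1 && [ -f source.pdf ]; then
    try_pdftotext_modes source.pdf
fi
```

If every mode reports close to zero characters, the text layer is probably not something pdftotext can decode, and different flags are unlikely to help.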
|   |   | 
|  07-07-2024, 10:07 AM | #2 | 
| creator of calibre            Posts: 45,600 Karma: 28548974 Join Date: Oct 2006 Location: Mumbai, India Device: Various | 
			
			If you can select, you can possibly select all and copy paste to get the text out.
		 | 
|   |   | 
|  07-07-2024, 10:25 AM | #3 | 
| Zealot            Posts: 142 Karma: 642206 Join Date: Mar 2021 Device: Kindle Voyage | 
			
			If copy/paste does not work, then you could try OCR. Mind you, it didn't work for me when I tried it. I had the same issue with a PDF of a really old public domain book, and when I took a closer look at the PDF, I came to the conclusion that it was just a collection of photographs of the pages of a physical book. Some of the pages were not exactly straight and some were more faded than others. Because I prefer using my Kindle, and because reading such PDFs on my Kindle is such a pain, I tried a few things to get an EPUB text version, including OCR. But although the quality of the images was good enough to make the PDF easily readable, it obviously was not good enough for OCR, so I basically gave up and read the PDF on my computer. | 
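A minimal sketch of what such an OCR attempt could look like from the command line, in case it is useful. It assumes poppler's pdftoppm and the tesseract CLI are installed; the poster above did not say which OCR tool was used, so this is a swapped-in example rather than that method. `have_tools` and `ocr_pdf` are hypothetical helper names, and 300 dpi is only a common starting point, not a tuned value. As noted above, skewed or faded scans may still give poor results.

```shell
#!/bin/sh
# Return success only if every named command is available on PATH.
have_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || return 1
    done
}

# OCR a PDF: render each page to PNG, OCR the images, concatenate the text.
ocr_pdf() {
    pdf=$1
    out=$2
    workdir=$(mktemp -d) || return 1
    # Render pages as page-<n>.png inside the temporary directory.
    pdftoppm -r 300 -png "$pdf" "$workdir/page" || return 1
    : > "$out"
    for img in "$workdir"/page-*.png; do
        # tesseract appends .txt to the output base name it is given.
        tesseract "$img" "${img%.png}" >/dev/null 2>&1 || continue
        cat "${img%.png}.txt" >> "$out"
    done
    rm -rf "$workdir"
}

# Only run when both tools and the input file are actually present.
if have_tools pdftoppm tesseract && [ -f source.pdf ]; then
    ocr_pdf source.pdf source-ocr.txt
fi
```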
|   |   | 
|  07-07-2024, 10:37 AM | #4 | 
| Junior Member  Posts: 5 Karma: 10 Join Date: Jun 2024 Device: desktop | 
			
			I tried, but only the text of the current page gets copied (both in calibre and in macOS Preview). After Select All, the current page's text is selected, but the other pages only appear to be selected as whole pages (I'm unsure how to describe this), so their content is not captured by the copy.
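Since the selection seems to behave differently per page, one way to narrow this down might be to extract one page at a time and see which pages yield any text at all. The sketch below assumes poppler's pdfinfo and pdftotext are installed; `page_count` and `chars_per_page` are hypothetical helper names.

```shell
#!/bin/sh
# Read `pdfinfo` output on stdin and print the page count.
page_count() {
    awk '/^Pages:/ { print $2 }'
}

# Print one line per page: the page number and how many non-whitespace
# characters pdftotext extracts from that page alone.
chars_per_page() {
    pdf=$1
    total=$(pdfinfo "$pdf" | page_count)
    [ -n "$total" ] || return 1
    page=1
    while [ "$page" -le "$total" ]; do
        # "-" sends the text to stdout; -f/-l restrict output to one page.
        chars=$(pdftotext -f "$page" -l "$page" "$pdf" - | tr -d '[:space:]' | wc -c | tr -d ' ')
        printf 'page %s: %s chars\n' "$page" "$chars"
        page=$((page + 1))
    done
}

# Only run when the tools and the input file are actually present.
if command -v pdfinfo >/dev/null 2>&1 && command -v pdftotext >/dev/null 2>&1 && [ -f source.pdf ]; then
    chars_per_page source.pdf
fi
```

Pages that report zero characters are candidates for being image-only, even if other pages in the same PDF carry a real text layer.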
		 | 
|   |   | 