MobileRead Forums - View Single Post

adrian_loetscher · 01-03-2012, 08:28 AM

Thank you for your response. In that case I will try to edit the files in the parsed subdirectory, since the pre-conversion of the PDF file gives a good result and recognizes the paragraphs correctly.

I was thinking of the following approach:

First conversion (pdf->epub): run with the option -d DEBUG_PIPELINE and replace each page break with the marker "*****".
Manual processing: edit the generated HTML file in the parsed subdirectory with sed, using the marker to locate the page breaks, which I need in order to detect headings.
Second conversion (opf->epub): convert the OPF file in the parsed subdirectory to EPUB.
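
The manual processing step could look roughly like the sketch below. It uses Python instead of sed, only to make the matching easier to read. It assumes (this is a guess, not something checked against the real output) that each "*****" marker ends up alone in its own <p> element and that the paragraph right after it is the heading. The function name promote_headings and the choice of <h2> are made up for illustration.

```python
import re

MARKER = "*****"  # the page-break marker introduced in the first conversion


def promote_headings(html: str, marker: str = MARKER) -> str:
    """Turn the paragraph that follows each marker paragraph into a heading.

    Assumption: the marker sits alone in a <p> element and the next <p>
    holds the heading text. The marker paragraph itself is dropped.
    """
    pattern = re.compile(
        r"<p\b[^>]*>\s*" + re.escape(marker) + r"\s*</p>"  # marker paragraph
        r"\s*<p\b[^>]*>(.*?)</p>",                          # following paragraph
        re.DOTALL,
    )
    # Hypothetical choice: emit the heading as <h2>; adjust the level as needed.
    return pattern.sub(lambda m: "<h2>" + m.group(1).strip() + "</h2>", html)


if __name__ == "__main__":
    sample = (
        "<p>End of a page.</p>\n"
        "<p>*****</p>\n"
        '<p class="calibre1">Chapter 2</p>\n'
        "<p>Body text.</p>"
    )
    print(promote_headings(sample))
```

In practice the edited text would be written back to the HTML file in the parsed subdirectory before the second conversion. If the marker turns out to be embedded inside a longer paragraph, the pattern would have to be adapted.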
