10-31-2010, 03:35 PM   #1
Aia
PDF to epub conversion grief; keeping indentation

Converting PDF to epub for my nook with Calibre is easy, except that I cannot find a way to keep the proper spacing (indentation) when the PDF contains code examples, such as Python, that require proper block indentation.
How can I make a conversion to epub where the output file displays the Python indentation properly?

Instead of the resulting output
Code:
def _find_note(self, note_id):
'''Locate the note with the given id.'''
for note in self.notes:
if str(note.id) == str(note_id):
return note
return None
I would like to have
Code:
def _find_note(self, note_id):
    '''Locate the note with the given id.'''
    for note in self.notes:
        if str(note.id) == str(note_id):
            return note
    return None
I have tried
Code:
p { white-space=pre; }
in the Look and Feel -> Extra CSS box of the conversion wizard. Even so, the indentation is not preserved.

Is there anything else I can do, or is this the final state of affairs in the conversion world?
10-31-2010, 04:57 PM   #2
kovidgoyal
creator of calibre
No, the whitespace is clobbered in the PDF input stage itself.
10-31-2010, 05:09 PM   #3
frostschutz
In general, PDF has very little knowledge about formatting and contents of a document; instead it is a set of instructions like "draw line from point A to B" or "place letter X in size Y on coordinates Z". So you're lucky to even get simple things such as paragraphs or chapters or headings out of a PDF file. While the indentation is certainly visible to you as a human, this information is not actually readily available in a PDF file since it's more an image of a page layout, rather than the information about the formatting rules that led to this particular page layout.

Converting it is not impossible, but you'd have to write a custom script to do it. The script would have to be smart enough to recognize the Python snippets and deduce the indentation from how the text is positioned.
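
A minimal sketch of the "deduce the indentation from the position" part, assuming some PDF text extractor has already produced the left x coordinate of each line and that the snippet is set in a monospace font of known character width. The function name `reindent`, its parameters, and the coordinates in the example are all hypothetical; recognizing which lines are Python in the first place is not attempted here.

```python
def reindent(lines, char_width, indent_unit=4):
    """Rebuild leading spaces from the x position of each line.

    lines: list of (x, text) pairs, where x is the left edge of the line
           in page units (hypothetical input from a PDF text extractor).
    char_width: width of one monospace character in the same units
                (assumed to be known or measured beforehand).
    indent_unit: number of spaces per indentation level.
    """
    if not lines:
        return []
    # Treat the leftmost line as indentation level 0.
    left_margin = min(x for x, _ in lines)
    result = []
    for x, text in lines:
        # Character columns between the left margin and this line.
        columns = (x - left_margin) / char_width
        # Snap to the nearest indentation level to absorb small jitter.
        level = round(columns / indent_unit)
        result.append(" " * (level * indent_unit) + text.strip())
    return result


# Made-up coordinates for the snippet from the first post:
# margin at x=72, 6 units per character, 4 characters per level.
snippet = [
    (72.0, "def _find_note(self, note_id):"),
    (96.0, "'''Locate the note with the given id.'''"),
    (96.0, "for note in self.notes:"),
    (120.0, "if str(note.id) == str(note_id):"),
    (144.0, "return note"),
    (96.0, "return None"),
]
print("\n".join(reindent(snippet, char_width=6.0)))
```

With a real PDF the margin and character width would probably have to be measured per page or per code block, and text set in a proportional font may not fit this simple column arithmetic.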

This is one of the two things OCR has to do; one is recognizing the characters - you can skip that step with most (but not all) PDFs; the other is recognizing the layout.

If there aren't too many snippets in the book, it'd probably be faster to just reindent them manually.