10-18-2010, 10:09 AM   #1
alanjay
Member
alanjay began at the beginning.
 
Posts: 16
Karma: 42
Join Date: Oct 2010
Device: kindle
Converting a film script (PDF) to EPUB with Calibre

Hi,

I've been using Calibre for a few months and think it is a great tool.

I was looking for help and discovered this forum.

I keep being sent film scripts as PDFs, and although I can make a stab at converting them, I haven't yet ended up with a format that I can comfortably read on my Kindle.

So the route is unfortunately PDF -> MOBI.

Scripts are an odd format that is very well suited to e-readers, but there doesn't seem to be any easy way to convert scripts into EPUB format.

If anyone has any tips on this, they would be gratefully received.

I assume that by playing with the CSS you can try to mimic the way a script is formatted.

Thanks in advance for any comments, suggestions and pointers.
10-18-2010, 02:29 PM   #2
jackie_w
Wizard
jackie_w ought to be getting tired of karma fortunes by now.
 
Posts: 2,835
Karma: 4199513
Join Date: Sep 2009
Location: UK
Device: Sony PRS-350, PB360, Kobo Glo/AuraHD/Aura6"/AuraH2O
There are some things you could try in this recent thread
10-18-2010, 03:35 PM   #3
alanjay
Jackie_W, thanks for the pointer. Very useful, though it doesn't quite solve the problem.

I have been looking at working on a set of CSS options to try to make the script readable (if not perfect) and my work so far has led me to this:

Code:
/* Everything uses 12pt Courier, with a generic monospace fallback. */
body { font-weight: normal; font-size: 12pt; font-family: courier, monospace }
p { font-weight: normal; font-size: 12pt; font-family: courier, monospace }
h1 { font-weight: normal; font-size: 12pt; font-family: courier, monospace; margin-left: 25em }
h2 { font-weight: normal; font-size: 12pt; font-family: courier, monospace }
/* calibre1..calibre5 are class names, so they need a leading dot to match. */
.calibre1 { font-weight: normal; font-size: 12pt; font-family: courier, monospace; margin-left: 10em }
.calibre2 { font-weight: normal; font-size: 12pt; font-family: courier, monospace; margin-left: 25em }
.calibre3 { font-weight: normal; font-size: 12pt; font-family: courier, monospace; margin-left: 20em }
br { font-weight: normal; font-size: 12pt; font-family: courier, monospace; margin-left: 20em }
/* Italic is set with font-style; font-weight does not accept "italic". */
.calibre4 { font-weight: normal; font-style: italic; font-size: 12pt; font-family: courier, monospace }
i { font-weight: normal; font-style: italic; font-size: 12pt; font-family: courier, monospace }
.calibre5 { font-weight: normal; font-size: 12pt; font-family: courier, monospace }
h3 { font-weight: normal; font-size: 12pt; font-family: courier, monospace }
h4 { font-weight: normal; font-size: 12pt; font-family: courier, monospace }
It is far from perfect and doesn't properly get the indented character dialogue but is just about readable.

Does anyone know if there is any flexibility in the PDF import routines when the file format is unusual (i.e. not like a book)?

Film scripts from Final Draft and other film / TV screenplay tools have a very tightly defined format. I'm sure that, if one could find the right place in the process to work on, it would be easy to analyse because it is so different from ordinary book text.

For example: centered dialogue, character names in capitals, left-aligned descriptive passages, and left-aligned capitals for scene names. Quite a small number of conventions define the way a script is formatted, but so far every way I have found of getting a screenplay / script onto a Kindle 3 has proven less than satisfactory or consistent.
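
To make those conventions concrete, here is a minimal Python sketch, assuming the script is already available as plain text with its leading spaces intact. The indent thresholds and the class names (scene, character, dialogue, action) are hypothetical and not anything Calibre produces; a real script would probably need them tuned.

```python
import html

# Hypothetical indent thresholds (in characters) for a monospaced text dump
# of a script; real scripts may need different values.
CHARACTER_INDENT = 20
DIALOGUE_INDENT = 10


def classify(line: str) -> str:
    """Guess the screenplay element for one line of plain text."""
    text = line.strip()
    if not text:
        return "blank"
    indent = len(line) - len(line.lstrip(" "))
    if indent >= CHARACTER_INDENT and text.isupper():
        return "character"          # deeply indented capitals
    if indent >= DIALOGUE_INDENT:
        return "dialogue"           # indented text under a character name
    # Flush-left text: capitals are scene names, anything else is description.
    return "scene" if text.isupper() else "action"


def to_html(lines):
    """Wrap each non-blank line in a <p> tagged with its guessed class."""
    out = []
    for line in lines:
        kind = classify(line)
        if kind != "blank":
            out.append(f'<p class="{kind}">{html.escape(line.strip())}</p>')
    return "\n".join(out)
```

Each class could then get its own margin in the CSS, instead of relying on the generated calibre1, calibre2 and so on.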
10-18-2010, 06:39 PM   #4
jackie_w
The problem is that (I think) any conversion of format X to format Y (e.g. PDF to MOBI) is going to remove leading spaces and multiple spaces. I believe this is because conversions use HTML as the intermediate file format and HTML does not (normally) believe in leading/multiple spaces.
  • The only way you can retain the spaces in HTML, and hence the layout, is if you wrap <pre>...</pre> tags around your text. This is why I outlined a method of doing it this way in the thread I pointed you to. However, <pre> tags can be problematic if your lines are too long as the text will disappear off the right-hand edge of your reader.

  • Another option is to copy the entire text from the PDF and paste into a .TXT file. Manually edit the TXT to remove any 'waste' leading spaces if you want. Then, don't convert it, just send the .TXT file to your reader and see what it looks like (do Kindles read TXT files?). I found this to be perfectly readable on my Sony, but it did display in the reader's default serif font not monospace. Is this a problem? Sample attached below.

  • Another way to retain layout is to read the original file as a PDF on your reader. The problem is usually that 'wasteful' margins leave the text as a tiny area in the middle of your 6-inch screen. You can try to get round this by using a software utility to crop off the whitespace from the PDF, then send the cropped PDF to the reader. There are quite a lot of candidates around; I have had some success with both BRISS and soPDF.
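
As a rough illustration of the <pre> approach in the first bullet, here is a minimal Python sketch. It escapes the text, wraps it in <pre>...</pre>, and reports which lines are likely to run off the right-hand edge. The function name and the max_cols default are assumptions for the example, not a known screen width for any particular reader.

```python
import html


def wrap_in_pre(text: str, max_cols: int = 60):
    """Return (html_fragment, long_lines) for a block of plain text.

    max_cols is a hypothetical guess at how many monospace characters fit
    on the reader's screen; long_lines lists the 1-based numbers of lines
    that are wider than that.
    """
    lines = text.splitlines()
    long_lines = [n for n, line in enumerate(lines, start=1)
                  if len(line.rstrip()) > max_cols]
    # Escape &, < and > so script text cannot be mistaken for markup.
    body = html.escape("\n".join(lines))
    return f"<pre>{body}</pre>", long_lines
```

If long_lines comes back non-empty, those are the lines worth trimming by hand before sending the file to the reader.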

If you try to do a conversion of a TXT file, I notice there is a Convert - TXT Input - Preserve spaces option, but the spaces weren't preserved for me. I don't know if I was using it wrongly. I didn't pursue it because I settled for one of the previous options.

Unless you want to splash out on an iPad or some other 'big screen' reader I can't think of any other options. Perhaps someone else will come up with a bright idea.
[Attached image: Script_txt.jpg]
10-18-2010, 07:14 PM   #5
jackie_w
[Update:]

As soon as I'd submitted my reply I had another thought.

I took the .TXT file and manually edited it so that, on every line that starts with leading spaces, the first space is replaced by a dot, e.g.

Code:
               This is my dialogue

becomes

.              This is my dialogue
I was then able to get the TXT conversion with 'Preserve spaces' to work.

i.e. convert TXT to EPUB/MOBI with:
  • Convert - TXT Input - 'Preserve spaces' checked
  • Convert - TXT Input - 'Treat each line as a paragraph' checked
  • Convert - Look&Feel - 'ExtraCSS' set to p {font-family:monospace; font-weight:bold; margin:0;}
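
The manual dot edit could probably be automated. A minimal Python sketch, assuming the script is already saved as plain text; the function name and the marker parameter are made up for the example:

```python
def mark_leading_spaces(text: str, marker: str = ".") -> str:
    """Replace the first space of every indented line with a marker."""
    out = []
    for line in text.splitlines():
        # Only touch lines that start with a space and contain real text.
        if line.startswith(" ") and line.strip():
            line = marker + line[1:]
        out.append(line)
    return "\n".join(out)
```

Run it over the contents of the .TXT file, save the result, and convert that file with the settings listed above.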

My resulting EPUB doesn't look too bad, despite the dots. See below.
[Attached image: script_epub.jpg]

Last edited by jackie_w; 10-18-2010 at 09:51 PM. Reason: updated ExtraCSS
10-19-2010, 05:04 AM   #6
alanjay
Thanks for the ideas. The dots don't look too bad. If only I could copy and paste the text on my Mac without losing the spaces; I'm not sure why that happens.

I suppose I come at this from the other perspective. Both HTML and (I assume) EPUB were originally designed to be agnostic about the size and shape of the output device's screen.

Where I am able to get input in text form directly from Final Draft, I can manipulate things sufficiently to make the transition work.

Unfortunately, the hard part of this process is the conversion from PDF to HTML. If it were possible in Calibre to tweak how this is performed, so that the various styles were tagged with separate CSS class names, one could then manipulate the output to produce something more suitable for the display screen while still respecting the rules for changing font sizes and rotation.

There are tools to find headers and footers in Calibre, and Calibre already tags things with multiple CSS styles, so I'm sure there must be a way to crack this.

Though I suspect that finding an intermediate route from PDF to some format that can be manipulated might be the way forward.

For now your solution looks pretty good and I'll try it today...

Thanks.

Alan
10-19-2010, 05:20 AM   #7
ldolse
Wizard
ldolse is an accomplished Snipe hunter.
 
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
The reason you're having the problem with spacing is because the spacing doesn't exist in the pdf document. PDFs are basically just a series of draw commands, which say 'Start drawing xyz at these coordinates'. There are no spaces involved, just coordinates. When Calibre converts pdf to html it doesn't look at where on the page the draw command was started, it just converts that line of text to a paragraph (there is no margin information).

There are other pdftohtml converters out there which do retain some of this information - one I saw used a combination of divs and css to retain where everything starts/ends. That type of conversion goes to the opposite extreme though, so the document retains hard line breaks, isn't very compatible across ebook formats, etc. Google pdf to html and look at the different online converters to see if one gets you closer to what you want, and you could use that as a source instead of the pdf.
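
To illustrate the point about coordinates, here is a minimal Python sketch of how leading spaces could be rebuilt from draw positions. It assumes you already have the text fragments as (x, y, text) tuples in points from some PDF tool. That tuple format and the left_margin default are hypothetical; char_width=7.2 is the glyph width of 12pt Courier. This is not how Calibre's own PDF input works.

```python
from collections import defaultdict


def layout_to_text(fragments, char_width=7.2, left_margin=72.0):
    """Rebuild indented text lines from (x, y, text) draw commands.

    char_width is the width of one 12pt Courier glyph in points;
    left_margin is an assumed page margin that maps to column 0.
    """
    rows = defaultdict(list)
    for x, y, text in fragments:
        rows[round(y)].append((x, text))   # group fragments on one baseline

    lines = []
    # PDF y coordinates grow upwards, so the largest y is the top line.
    for y in sorted(rows, reverse=True):
        line = ""
        for x, text in sorted(rows[y]):
            col = max(0, round((x - left_margin) / char_width))
            line = line.ljust(col) + text  # pad with spaces up to the column
        lines.append(line)
    return "\n".join(lines)
```

The output is plain text with real leading spaces, which could then go through the TXT route described earlier in the thread.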
10-19-2010, 10:41 AM   #8
alanjay
Thanks for that, ldolse. Any recommendations for PDF to HTML converters as a starting point?

The ones that I have played with so far don't seem to be any better than Calibre, so any tips from people's experience would be gratefully received.