Old 10-14-2010, 03:37 PM   #16
kidblue
Connoisseur
Posts: 79
Karma: 10
Join Date: Oct 2010
Device: Kindle 3
I think my problem is that both Preview and Adobe Reader (OS X) seem to be highlighting and copying the text and not retaining the formatting. You're on a PC, so there may be a difference in how to retain the indentations, but it's basically copying the text to a clipboard and then spitting it out left-justified.
Old 10-14-2010, 03:42 PM   #17
jackie_w
Grand Sorcerer
Posts: 6,212
Karma: 16534894
Join Date: Sep 2009
Location: UK
Device: Kobo: KA1, ClaraHD, Forma, Libra2, Clara2E. PocketBook: TouchHD3
Does that mean the first MOBI did look OK on your Kindle, because I just realised that I had Convert - PageSetup - OutputProfile set to Sony instead of Kindle when I created the MOBI?
Old 10-14-2010, 03:51 PM   #18
kidblue
Yes, they both look fine - I just can't figure out how to copy the PDF text with the margins/formatting. Highlighting and copying just results in left-justified text.
Old 10-14-2010, 04:01 PM   #19
jackie_w
I have absolutely no Mac experience, so I can't help with this. If any Mac users are reading, maybe they can offer some help. It's probably something really easy.

Meanwhile I'll see if I can think of a way (free utility) to get the text out of the PDF without copy/paste. Ile's XPDF suggestion worked for me but using it on OS X also seemed to be in question.
Old 10-14-2010, 04:02 PM   #20
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
For one thing, Calibre's existing PDF conversion engine can't preserve that information during conversion. Some other free pdftohtml utilities can. I found a decent online one a while back but can't remember the link right now; there are numerous options if you just google it:
http://www.google.com.my/search?hl=e...=&oq=&gs_rfai=

For another thing, the Kindle's margin handling is a mess. Based on the screenshot, those margins are possible using nested blockquotes, but the whole document would need to be formatted by hand to get similar indenting. It can't easily be done with standard margins, because the Kindle/MOBI format doesn't support those.

There is some discussion/links at this bug:
http://bugs.calibre-ebook.com/ticket/7015#comment:20
Old 10-14-2010, 04:48 PM   #21
jackie_w
@ldolse, I'm confused by your reply. Unless I've totally got the wrong end of the stick, kidblue seemed to confirm that the MOBI I attached read OK on his Kindle, and that the HTML I attached converted OK to MOBI, which also read OK on his Kindle. His only problem is within his OS (OS X), where manually copying the text from the PDF and pasting it into a text file didn't retain the leading spaces (all spaces?) correctly. I don't understand this problem because on Windows it just works.
Old 10-14-2010, 05:00 PM   #22
ldolse
I might be the one confused actually - I thought I read that the problem was on the Kindle display, but upon re-reading I don't see that.
Old 10-14-2010, 06:20 PM   #23
jackie_w
@kidblue,

In the absence of a solution to your copy/paste problem in OS X, you could try the following:
  • Import the screenplay PDF into Calibre.

  • Convert PDF to MOBI with the Convert - Debug feature enabled (ignore the resulting MOBI).

  • Go to the Debug directory's Input subdir. There is an HTML file called index.html. Copy this file to a temporary working area (or it will get overwritten if you run another conversion with debug enabled).

  • Open index.html in a text editor.

  • Change the <BODY bgcolor=... ...> tag to a simple <BODY>

  • Enter an opening <pre> tag on the line after the <BODY> tag

    Enter a closing </pre> tag on the line before the closing </BODY> tag.

  • Then you need to do a number of find-and-replaces:
    Find: &nbsp; ReplaceAll: a single space
    Find: <br> ReplaceAll: nothing
    Find: <hr> ReplaceAll: nothing
    Find: &quot; ReplaceAll: " (a double quote)

    Remove a fixed number of leading spaces on each line which represents the wasteful left margin (looks like 8). Remove any other "waste" (excessive blank lines) if you want.

  • Completely remove the lines that begin <A name=... ...
    (You can do this in one hit if your text editor supports Regular Expressions.)

  • This HTML file should now look very much like the one I sent you earlier. Save it.

The above replaces steps 1-5 in the earlier instructions. Continue with step 6 onwards as before.
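
For anyone who would rather script the manual edits listed above than do them in a text editor, here is a minimal Python sketch. It is only an illustration, not part of calibre: the function name `clean_debug_html` is made up, the default margin of 8 comes from the "looks like 8" estimate above, and the markup in your own index.html may differ from what this assumes.

```python
import re


def clean_debug_html(html: str, margin: int = 8) -> str:
    """Apply the manual clean-up steps to the debug index.html text."""
    # Simplify the opening body tag and open a <pre> block right after it.
    html = re.sub(r"<BODY[^>]*>", "<BODY>\n<pre>", html, count=1, flags=re.IGNORECASE)
    # Close the <pre> block just before the closing body tag.
    html = re.sub(r"</BODY>", "</pre>\n</BODY>", html, count=1, flags=re.IGNORECASE)

    # The find-and-replace steps.
    html = html.replace("&nbsp;", " ")
    html = re.sub(r"<br\s*/?>", "", html, flags=re.IGNORECASE)
    html = re.sub(r"<hr\s*/?>", "", html, flags=re.IGNORECASE)
    html = html.replace("&quot;", '"')

    pad = " " * margin
    kept = []
    for line in html.splitlines():
        # Drop the page-anchor lines that begin with <A name=...
        if re.match(r"\s*<A name=", line, flags=re.IGNORECASE):
            continue
        # Remove the fixed left margin, but only where it is fully present.
        if line.startswith(pad):
            line = line[margin:]
        kept.append(line)
    return "\n".join(kept) + "\n"
```

To use it, read index.html into a string, pass it through `clean_debug_html`, and write the result to a new file. Check the output in a text editor before converting, since the margin width is a guess.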

If this doesn't work I may start throwing things!

[Edit:] I'm joking - please report back with results.

Last edited by jackie_w; 10-14-2010 at 06:35 PM.
Old 10-14-2010, 06:38 PM   #24
jackie_w
Oops, I missed a Find/Replace. I've edited the above post and added the blue line.

This cobbled-together potential solution is not ideal. You would still be better off finding the answer to your OS X copy/paste problem. As you can see, Page 1 is not quite right but the dialogue looks OK.

Last edited by jackie_w; 10-14-2010 at 06:44 PM.
Old 10-18-2010, 03:34 PM   #25
alanjay
Member
Posts: 16
Karma: 42
Join Date: Oct 2010
Device: kindle
film script / screen play issues on Mac to Kindle

Thanks for this very interesting thread. I had already tried some of the suggested routes without much success on the Mac - the problem seems to be that the formatting is being lost out of the PDF for some reason.

Even starting with the Calibre conversion and looking at the HTML it seems to have lost all the formatting and spaces which is slightly frustrating.

I have been looking at working on a set of CSS options to try to make the script readable (if not perfect) and my work so far has led me to this:

Code:
/* Base text: Courier at 12pt, as in a screenplay */
body { font-weight: normal; font-size: 12pt; font-family: courier }
p { font-weight: normal; font-size: 12pt; font-family: courier }
h1 { font-weight: normal; font-size: 12pt; font-family: courier; margin-left: 25em }
h2 { font-weight: normal; font-size: 12pt; font-family: courier }
/* Assuming calibre1..calibre5 are class names in the converted HTML, they need a leading dot */
.calibre1 { font-weight: normal; font-size: 12pt; font-family: courier; margin-left: 10em }
.calibre2 { font-weight: normal; font-size: 12pt; font-family: courier; margin-left: 25em }
.calibre3 { font-weight: normal; font-size: 12pt; font-family: courier; margin-left: 20em }
/* A margin on br may have no effect in many renderers */
br { font-weight: normal; font-size: 12pt; font-family: courier; margin-left: 20em }
/* Italics are set with font-style; "italic" is not a valid font-weight */
.calibre4 { font-weight: normal; font-style: italic; font-size: 12pt; font-family: courier }
i { font-weight: normal; font-style: italic; font-size: 12pt; font-family: courier }
.calibre5 { font-weight: normal; font-size: 12pt; font-family: courier }
h3 { font-weight: normal; font-size: 12pt; font-family: courier }
h4 { font-weight: normal; font-size: 12pt; font-family: courier }
It is far from perfect and doesn't properly get the indented character dialogue but is just about readable.

Does anyone know if there is any flexibility in the PDF import routines when the file format is unusual (i.e. not like a book)?

Film scripts from Final Draft and other film/TV screenplay tools have a very strictly defined format. Because it is so distinctive, I'm sure it would be easy to analyse if one could find the right place in the process to work on it.

For example: centered dialogue and character names in capitals, left-aligned descriptive passages, and left-aligned capitals for scene names. Quite a small number of conventions define the way a script is formatted, but so far finding a way to convert a screenplay/script for a Kindle 3 has proven less than satisfactory or consistent.
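
To illustrate how few rules those conventions need, here is a rough Python sketch that labels each line of a script by its indentation and capitalisation. It is a guess at an approach, not an existing calibre feature: the function name `classify_line`, the label names, and the indent threshold of 10 spaces are all hypothetical, and it only works on text where the leading spaces have survived extraction.

```python
def classify_line(line: str, center_indent: int = 10) -> str:
    """Label one screenplay line using indentation and capitalisation only."""
    stripped = line.strip()
    if not stripped:
        return "blank"
    # Count the leading spaces that position the line on the page.
    indent = len(line) - len(line.lstrip(" "))
    # A line counts as capitals if it has letters and none are lower case.
    is_caps = stripped == stripped.upper() and any(c.isalpha() for c in stripped)
    if indent >= center_indent:
        # Indented block: character names are in capitals, dialogue is not.
        return "character" if is_caps else "dialogue"
    # Left-aligned: capitals mark a scene name, anything else is description.
    return "scene" if is_caps else "description"
```

Each label could then be mapped to its own CSS class. The threshold would need tuning per script, since different tools indent by different amounts.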

I'm glad other people are contemplating this as well. Maybe together we can find a solution.
Old 06-24-2012, 01:47 PM   #26
Analoggab
Member
Posts: 21
Karma: 10
Join Date: Dec 2011
Device: Sony T1
Hey guys, great thread!!
I've been looking for a way to do this for a long time.

Jackie_w's solution works well but, strangely, it only works when I import an HTML file into calibre, convert it to MOBI, and then convert that to epub: html > Mobi > epub

If I convert directly from HTML to epub, it doesn't retain the formatting the way the technique and the MOBI file Jackie_w shared with us do.

Any ideas?
It must be somewhere in the conversion process.


edit: I know this thread is old but I hope some of you are still subscribed or will take a look.

Last edited by Analoggab; 06-25-2012 at 09:04 PM.
Old 06-28-2012, 05:31 PM   #27
jackie_w
A blast from the past...

Funnily enough, the 'screenplay question' also came up again very recently and I posted an updated option. See this thread, my contribution starts at post #8.

This method involves creating a TXT file (containing all the formatting spaces) from the PDF using the XPDF (free) Windows utilities. Google should find it quickly.

Once you have the nicely formatted TXT it is simple to run a TXT-->epub conversion to get something which reads very nicely on a PRST1 in landscape orientation. The linked thread shows some pictures.

It all hinges on whether you can get the relevant XPDF commandline utility to run in your Mac-Windows environment.

If it works, the rest is easy, but if you get stuck with the necessary 'TXT Input' calibre conversion options, I can be more specific.

[Edit:] Dear, oh dear, I really should get my facts straight before posting. The TXT to epub calibre conversion does work, but I got a better result by dropping the TXT file into a boilerplate HTML file and doing a ZIP to epub conversion. Let me relook at my test files and I'll post further details later.

Last edited by jackie_w; 06-28-2012 at 05:48 PM.
Old 06-28-2012, 06:17 PM   #28
jackie_w
Follow up...

OK, try again...
  1. Create the TXT file with XPDF's pdftotext.exe program
    Code:
    pdftotext -layout screenplay_in.pdf screenplay_out.txt
  2. Open the output TXT file in a text editor. Make sure the text visually resembles the PDF layout (leading spaces etc)
  3. Add this html code before the first line of the screenplay text
    Code:
    <html>
    <head>
        <title>Screenplay title</title>
        <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    </head>
    <body>
    <pre>
  4. Add this html code after the last line of the screenplay text
    Code:
    </pre>
    </body>
    </html>
  5. Save the file as screenplay.html
  6. Import screenplay.html into calibre (which will turn it into a ZIP file)
  7. Convert zip --> epub. No special conversion options required. The <pre> tags should make sure the epub displays in your monospace font.
  8. Send epub to PRST1 and display in landscape orientation for best results.
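
Steps 3 to 5 above can also be scripted. The following Python sketch wraps the extracted text in the same HTML boilerplate. The names `wrap_screenplay` and `convert_file` are made up for illustration, and the UTF-8 encoding is an assumption: what pdftotext actually writes depends on your version and options, so adjust it if the characters come out wrong. One addition to the manual steps: the sketch escapes any `<` or `&` in the script so they can't be mistaken for HTML.

```python
import html
from pathlib import Path

# Same boilerplate as in steps 3 and 4 above.
TEMPLATE = """<html>
<head>
    <title>{title}</title>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
<body>
<pre>
{body}</pre>
</body>
</html>
"""


def wrap_screenplay(text: str, title: str = "Screenplay title") -> str:
    """Return the screenplay text wrapped in <pre> inside a minimal HTML page."""
    # Escape &, < and > so the text is shown literally inside <pre>.
    body = html.escape(text, quote=False)
    if not body.endswith("\n"):
        body += "\n"
    return TEMPLATE.format(title=html.escape(title, quote=False), body=body)


def convert_file(txt_path: str, html_path: str, title: str = "Screenplay title") -> None:
    """Read the TXT file and write the wrapped HTML file (UTF-8 assumed)."""
    text = Path(txt_path).read_text(encoding="utf-8")
    Path(html_path).write_text(wrap_screenplay(text, title), encoding="utf-8")
```

For example, `convert_file("screenplay_out.txt", "screenplay.html")` would produce the file to import in step 6.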

P.S. This is one way of doing it and should not be considered as the 'One True Way'

Last edited by jackie_w; 06-28-2012 at 06:27 PM. Reason: more detail
Old 06-29-2012, 03:17 PM   #29
Analoggab
Hey Jackie! Nice to see you back.
I'll test out the XPDF program when I get home.
I had tested various online pdftotext solutions (like this) but with mixed results. Perhaps the conversion engine is not as good.

Have you had success going from the edited html with the <pre> tags directly to epub?
Strangely, using your attached html in this thread, I only had success going from html > Mobi > epub. But html > epub directly never seems to work. Default settings.
Old 06-29-2012, 06:01 PM   #30
jackie_w
Quote:
Originally Posted by Analoggab View Post
Hey Jackie! Nice to see you back.
I'll test out the XPDF program when I get home.
I had tested various online pdftotext solutions (like this) but with mixed results. Perhaps the conversion engine is not as good.
I think many (most?) of them get rid of leading spaces, which is exactly what you don't want. The XPDF pdftotext utility, using the -layout option, does exactly what you need, if you can make it work on the Mac. The reason I switched to this utility is that a simple PDF Select all>Copy>Paste into a TXT file retained leading spaces only for some PDFs -- even on Windows. It probably depends on how the PDF was created in the first place.
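
If you want to check quickly whether a given extraction kept the leading spaces, something like the Python sketch below might help. It assumes a `pdftotext` executable is on your PATH and uses only the `-layout` option already mentioned in this thread. The function names and the idea of measuring the fraction of indented lines are my own illustration, not part of XPDF or calibre.

```python
import shutil
import subprocess


def extract_with_layout(pdf_path: str, txt_path: str) -> None:
    """Run pdftotext -layout, assuming the executable is on PATH."""
    exe = shutil.which("pdftotext")
    if exe is None:
        raise FileNotFoundError("pdftotext not found on PATH")
    subprocess.run([exe, "-layout", pdf_path, txt_path], check=True)


def indented_fraction(text: str) -> float:
    """Fraction of non-blank lines that start with a space."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return sum(ln.startswith(" ") for ln in lines) / len(lines)
```

A screenplay that kept its layout should have a large share of indented lines (dialogue and character names), while a copy/paste that lost the formatting should score close to zero.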

Quote:
Originally Posted by Analoggab View Post
Have you had success going from the edited html with the <pre> tags directly to epub?
Strangely, using your attached html in this thread, I only had success going from html > Mobi > epub. But html > epub directly never seems to work. Default settings.
Yes, a direct zip-->epub calibre conversion works for me. No interim mobi required. If you can't make it work, if you like you can send me a PM with a link to your manually-created html file and we'll take it from there.