Old 10-16-2008, 11:11 AM   #1
JGB
Groupie
Posts: 168
Karma: 1010000
Join Date: Jul 2008
Device: PRS505
What should I convert my .pdf files to?

So far I've been opening them in Acrobat and saving as .rtf, then importing into calibre and converting to EPUB.
But it takes 40 minutes to convert a single book.
And the formatting is only OK, not nearly as nice as .lit to .epub.


Also, I'm stuck with a file that is at least 5 times larger if I want to keep a format other than the .epub.
Is there an export or save option in Acrobat that is more efficient?
Would I be better off using a converter like PDFRead?
Or is that more for converting scanned PDF files that are all images?

Would saving to HTML work better, and should I clean it up after that?
If I save it to HTML, should I use 3.2 or 4.0?

Is there anywhere with an in-depth explanation of how to use the calibre command line? Every time I use it I get lost or it does nothing.


Is there a good resource for starting to dig into ebook conversion and related information?
I've tried searching here but it's hard to find things.

Last edited by JGB; 10-16-2008 at 11:45 AM.
Old 10-16-2008, 11:44 AM   #2
DaleDe
Grand Sorcerer
Posts: 11,470
Karma: 13095790
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2 & Air 2, iPhone 7
Just keep the ePUB. If you rename the extension to .zip, you will find you can unzip your ePUB and get to the HTML if you ever need to.

Dale
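
An EPUB is an ordinary ZIP archive, so the same thing can be done from a script without renaming the file. The following is only a minimal Python sketch using the standard zipfile module; the function names list_epub_html and extract_epub, and the file name in the usage note, are made up for illustration.

```python
import zipfile


def list_epub_html(epub_path):
    """Return the names of the HTML/XHTML files stored inside an EPUB."""
    with zipfile.ZipFile(epub_path) as book:
        return [name for name in book.namelist()
                if name.lower().endswith((".html", ".xhtml", ".htm"))]


def extract_epub(epub_path, target_dir):
    """Unpack every file in the EPUB into target_dir (like unzipping it)."""
    with zipfile.ZipFile(epub_path) as book:
        book.extractall(target_dir)
```

For example, list_epub_html("mybook.epub") would list the chapter files, and extract_epub("mybook.epub", "mybook_unpacked") would write them to disk, assuming a DRM-free file of that (hypothetical) name exists.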
Old 10-16-2008, 12:05 PM   #3
nrapallo
GuteBook/Mobi2IMP Creator
Posts: 2,958
Karma: 2530691
Join Date: Dec 2007
Location: Toronto, Canada
Device: REB1200 EBW1150 Device: T1 NSTG iLiad_v2 NC Device: Asus_TF Next1 WPDN
Quote:
Originally Posted by JGB
So far I've been opening them in Acrobat and saving as .rtf, then importing into calibre and converting to EPUB.
But it takes 40 minutes to convert a single book.
And the formatting is only OK, not nearly as nice as .lit to .epub.


Also, I'm stuck with a file that is at least 5 times larger if I want to keep a format other than the .epub.
Is there an export or save option in Acrobat that is more efficient?
Would I be better off using a converter like PDFRead?
Or is that more for converting scanned PDF files that are all images?
PDFRead will only create images from the .pdf pages; no words or text will be extracted. It's meant mainly as a last resort, when all else fails.

For extracting text, I've had GREAT success using Mobipocket Creator to convert a (Word created) .pdf directly into .prc as explained in this (similar) thread. Then you can use calibre's mobi2lrf to convert from .prc to .lrf.

Quote:
Would saving to HTML work better, and should I clean it up after that?
If I save it to HTML, should I use 3.2 or 4.0?
If possible, I usually save to HTML 3.2 (simpler and easier to manipulate afterwards). Saving to HTML 4.0 usually produces more complex positioning tags, which are harder to strip or make work in an ebook, IMHO.
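
To give an idea of what "stripping" can involve, here is a rough Python sketch using only the standard library. It assumes, and this is only an assumption about what the exported HTML looks like, that the positioning is carried in inline style attributes; it re-emits the markup with those attributes dropped. The names StyleStripper and strip_style_attributes are invented for this example, and tag names come out lower-cased.

```python
from html import escape
from html.parser import HTMLParser


class StyleStripper(HTMLParser):
    """Re-emit HTML, dropping every inline style attribute (assumed to hold positioning)."""

    def __init__(self):
        # Keep entities such as &amp; as written instead of decoding them.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def _render(self, tag, attrs, closing=""):
        kept = []
        for name, value in attrs:
            if name == "style":
                continue  # this is the attribute we want to get rid of
            if value is None:
                kept.append(f" {name}")
            else:
                kept.append(f' {name}="{escape(value, quote=True)}"')
        return f"<{tag}{''.join(kept)}{closing}>"

    def handle_starttag(self, tag, attrs):
        self.parts.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self._render(tag, attrs, closing=" /"))

    def handle_endtag(self, tag):
        self.parts.append(f"</{tag}>")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append(f"&{name};")

    def handle_charref(self, name):
        self.parts.append(f"&#{name};")

    def handle_comment(self, data):
        self.parts.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.parts.append(f"<!{decl}>")


def strip_style_attributes(html_text):
    """Return html_text with all style="..." attributes removed."""
    parser = StyleStripper()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.parts)
```

Real exports may also position text with tables, layers or class-based stylesheets, which this sketch does not touch, so treat it as a starting point only.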

Quote:
Is there anywhere with an in-depth explanation of how to use the calibre command line? Every time I use it I get lost or it does nothing.
Though I'm no expert here and don't want to spread false information, I usually cd to the calibre install directory, copy my files there, and use command-line syntax like:
Code:
html2lrf -o Outputfilename.lrf -t "My Title" -a "My author"  --category="My Cat" --publisher="NR" --verbose Inputfilename.html
Quote:
Is there a good resource for starting to dig into ebook conversion and related information?
I've tried searching here but it's hard to find things.
Have you tried looking at the programs listed in our E-book conversion wiki?

Last edited by nrapallo; 10-16-2008 at 02:46 PM. Reason: typo
Old 10-17-2008, 04:24 AM   #4
jaffab
Enthusiast
Posts: 27
Karma: 70
Join Date: Jul 2008
Device: Sony PR505
Quote:
Originally Posted by DaleDe
Just keep the ePUB. If you rename the extension to .zip, you will find you can unzip your ePUB and get to the HTML if you ever need to.

Dale

Hi,

There is a good discussion on converting PDFs to other formats (and the word-wrapping problems they have), including links to tools and options, here:

https://www.mobileread.com/forums/showthread.php?t=30263

Hope this helps

Jaffa
Old 10-17-2008, 07:35 AM   #5
nrapallo
Quote:
Originally Posted by JGB
Is there anywhere with an in-depth explanation of how to use the calibre command line? Every time I use it I get lost or it does nothing.
On Windows, open a command prompt from the Start menu and type the name of the calibre command-line tool you want to use, followed by any options. Then drag a file onto the window, and it simply inserts the file's path, say "f:/folder1/folder2/folder3/file.lit".

In summary, type lit2oeb [options] "f:/folder1/folder2/folder3/file.lit"; it then does stuff for a while and reports that it output to directory "." (a single dot).

Unfortunately, that means it uses the current directory (the short form of which is just a dot, "."). However, the calibre command lit2oeb has [options] that may go between the command's name and the input filename. When I say "may", I mean they are optional (the short form for this is to put square brackets "[ ]" around whatever is optional).

So putting all this together, open the cmd prompt and change to the calibre install directory i.e. issue:
Code:
C:\> cd "C:/Program Files/calibre"
and then type the calibre command line tool's name without any other text. You will get the following:
Code:
C:\Program Files\calibre>lit2oeb
Usage: lit2oeb [options] LITFILE

Whenever you pass arguments to lit2oeb that have spaces in them, enclose the
arguments in quotation marks.

Options:
  --version             show program's version number and exit

  -h, --help            show this help message and exit

  -o OUTPUT_DIR, --output-dir=OUTPUT_DIR
                        Output directory. Defaults to current directory.

  -p, --pretty-print    Legibly format extracted markup. May modify meaningful
                        whitespace.

  --verbose             Useful for debugging.


Created by Kovid Goyal <kovid@kovidgoyal.net>
Now give it your command, with the required options in the middle as follows:

Code:
C:\Program Files\calibre>lit2oeb -o "MyDirectory" --verbose f:/folder1/folder2/folder3/file.lit
Again, the square brackets in [options] just mean that the options are all optional, i.e. they don't need to be specified (but should be if you want the command-line tool to do something other than its default behaviour).

Just try the above lit2oeb command and see if it creates "MyDirectory" in the calibre install directory with your results inside. Be sure to surround any input filename with quotes if it contains spaces! (The same holds true for any option value you may use, e.g. -o "New Directory of mine"!)
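
The quoting rule is easy to get wrong by hand, so here is a small Python sketch of one way to let a script do it. It only uses the options shown in the lit2oeb help text above (-o, -p, --verbose); the helper name build_lit2oeb_command is made up for illustration, and nothing here checks that lit2oeb is actually installed.

```python
import subprocess


def build_lit2oeb_command(lit_file, output_dir=None, pretty_print=False, verbose=False):
    """Build the argument list: options go between the tool name and the input file."""
    command = ["lit2oeb"]
    if output_dir is not None:
        command += ["-o", output_dir]
    if pretty_print:
        command.append("-p")
    if verbose:
        command.append("--verbose")
    command.append(lit_file)
    return command


if __name__ == "__main__":
    cmd = build_lit2oeb_command("f:/folder1/folder2/folder3/file.lit",
                                output_dir="New Directory of mine", verbose=True)
    # list2cmdline shows the line as you would type it at a Windows prompt,
    # with quotes added around any argument that contains spaces.
    print(subprocess.list2cmdline(cmd))
    # To actually run it (requires the tool to be installed and on PATH):
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list means each one reaches the tool intact, spaces and all, so no manual quoting is needed when the command is run from the script.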

You really need to understand how the DOS/command prompt (terminal in Linux) works to truly harness the power that lies in the calibre command-line tools.

Last edited by nrapallo; 10-17-2008 at 11:44 AM. Reason: typo
Old 10-17-2008, 07:50 AM   #6
nrapallo
Quote:
Originally Posted by nrapallo
Though I'm no expert here and don't want to spread false information, I usually cd to the calibre install directory, copy my files there, and use command-line syntax like:
Code:
html2lrf -o Outputfilename.lrf -t "My Title" -a "My author"  --category="My Cat" --publisher="NR" --verbose Inputfilename.html
After installing version 0.4.96, I now see that the calibre install directory has been added to my PATH environment variable, so I DON'T need to be in the calibre install directory to get its command-line tools to work properly!

GREAT NEWS!

Just work in your local directory; you can issue the calibre command line tools from anywhere!
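
If you are not sure whether your install added the tools to PATH, a quick check is possible from Python with the standard shutil.which function. This is just a sketch; the helper name report_tools is invented, and the tool names in the usage note are simply the ones mentioned in this thread.

```python
import shutil


def report_tools(tool_names):
    """Map each tool name to its full path on PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in tool_names}


if __name__ == "__main__":
    for name, path in report_tools(["lit2oeb", "html2lrf"]).items():
        print(name, "->", path if path else "not found on PATH")
```

A None result means the command prompt will not find that tool either, so you would still need to cd to the install directory or fix PATH yourself.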
Old 11-12-2008, 04:22 AM   #7
Peto
Legal Alien
Posts: 288
Karma: 105
Join Date: Jan 2008
Device: Sony PRS-505/T1/Kindle PW2
Quote:
Originally Posted by JGB
So far I've been opening them in Acrobat and saving as .rtf, then importing into calibre and converting to EPUB.
But it takes 40 minutes to convert a single book.
And the formatting is only OK, not nearly as nice as .lit to .epub.


Also, I'm stuck with a file that is at least 5 times larger if I want to keep a format other than the .epub.
Is there an export or save option in Acrobat that is more efficient?
Would I be better off using a converter like PDFRead?
Or is that more for converting scanned PDF files that are all images?

Would saving to HTML work better, and should I clean it up after that?
If I save it to HTML, should I use 3.2 or 4.0?

Is there anywhere with an in-depth explanation of how to use the calibre command line? Every time I use it I get lost or it does nothing.


Is there a good resource for starting to dig into ebook conversion and related information?
I've tried searching here but it's hard to find things.
Right now I have dumped LRF and EPUB. I convert PDFs to DOC, edit them to fit (the Sony Reader) and then convert them back to PDF. File sizes are bigger than with other file types, but you can embed fonts, edit the text as much as you want and add pictures. It will all be perfectly preserved (except TOCs, dunno why). Page turns are really fast and books don't need repaginating when first opened, so you can just dump them in straight from the PC. File size is no problem for me so...

On the other hand, once PDF to EPUB is fully functional, it should be quite straightforward to convert these files and save some memory.

You can take a look at the link below called "Libro"; it is in Spanish, but it will give you an idea of what you can achieve and how it performs. Font type, size, paragraphs and the like are edited to my liking. Anything different will work equally well.
Old 11-23-2008, 01:02 AM   #8
JGB
Quote:
Originally Posted by Peto
Right now I have dumped LRF and EPUB. I convert PDFs to DOC, edit them to fit (the Sony Reader) and then convert them back to PDF. File sizes are bigger than with other file types, but you can embed fonts, edit the text as much as you want and add pictures. It will all be perfectly preserved (except TOCs, dunno why). Page turns are really fast and books don't need repaginating when first opened, so you can just dump them in straight from the PC. File size is no problem for me so...

On the other hand, once PDF to EPUB is fully functional, it should be quite straightforward to convert these files and save some memory.

You can take a look at the link below called "Libro"; it is in Spanish, but it will give you an idea of what you can achieve and how it performs. Font type, size, paragraphs and the like are edited to my liking. Anything different will work equally well.
I like the idea, but what happens with those PDFs when you want to change the size?
You'd have to reconvert them all over again (I'm thinking that in the next 2-3 years we will see some screen size changes).
Old 11-23-2008, 05:38 AM   #9
Peto
Quote:
Originally Posted by JGB
I like the idea, but what happens with those PDFs when you want to change the size?
You'd have to reconvert them all over again (I'm thinking that in the next 2-3 years we will see some screen size changes).
Could be. I store the docs too, just in case I want to change something or a friend wants to customize the PDF to his own liking. The reader is perfect for correcting. What I do is bookmark the pages where I find errors while reading and, once finished, correct them, working backwards, in the original doc.

Keeping the doc file, almost all corrections are minor changes. (When changing sizes or fonts, you have to recheck the page jumps, but that is about it).
Old 11-23-2008, 09:51 PM   #10
ProDigit
Karmaniac
Posts: 2,553
Karma: 11499146
Join Date: Oct 2008
Location: Miami FL
Device: PRS-505, Jetbook, + Mini, +Color, Astak Ez Reader Pro, PPW1, Aura H2O
Why not convert them to LRF?
Old 11-23-2008, 10:57 PM   #11
RWood
Technogeezer
Posts: 7,233
Karma: 1601464
Join Date: Nov 2006
Location: Virginia, USA
Device: Sony PRS-500
To convert PDFs to anything, the best tool I have found is ABBYY PDF Transformer. At $99 it will convert text-based PDFs (like those made with Word) as well as graphic-based PDFs (such as scans) to RTF files. I can then use these files as they are or convert them to HTML (3.2, never 4), DOC, TXT, or whatever format I need for their future use. I keep a copy in RTF as I like to have a copy as close as possible to the original source. (Disk space is cheap these days and I can always burn a CD or DVD of the project later.)

Once I have cleaned the file (to correct OCR errors, for example), I may delete the work files. Sometimes I keep these if I feel that I may someday revisit the files to do more work on them. (For example, I keep threatening to revise the Harvard Classics to correct formatting errors that others or I have found since they were published. Nick has offered to convert these to the multiple forms of IMP for me when I do complete the next series, but I am still backlogged and this has just not come to the surface yet.)

As Nick mentioned, PDFRead should be a last resort. I have used it a few times for things that I wanted to read but not keep or convert. To me ePub sounds nice but I have no use for it as my reader does not support it. I do keep the LRF of many of the files I create; however, I view it as a terminal format rather than one that I would use to archive or create other things from.
Old 11-23-2008, 11:01 PM   #12
nrapallo
Quote:
Originally Posted by RWood
(For example, I keep threatening to revise the Harvard Classics to correct formatting errors that others or I have found since they were published. Nick has offered to convert these to the multiple forms of IMP for me when I do complete the next series, but I am still backlogged and this has just not come to the surface yet.)
Thanks for the update, RWood.

I'll set the "nag" reminder clock ahead a few months now...
Old 11-24-2008, 02:44 AM   #13
Peto
Quote:
Originally Posted by ProDigit
Why not convert them to LRF?
First, because you'll lose a big part of the editing work.
Second, you'll lose the embedded fonts, and if you do get them embedded you'll lose page-turn speed.

You can embed your favorite fonts on the reader itself thanks to Valloric, but that only recovers the lost page-turn speed, and you'll still have just three fonts. And the editing work will remain all but lost.

That LRF will only work on your Sony, and if you replace it with another brand you won't be able to read the files, even on a 6" screen.

I really don't know what's so hot about LRF any more.
Old 11-24-2008, 09:59 AM   #14
ProDigit
My apologies,
I saw you had the Sony PRS-505, and assumed you wanted to convert documents for that reader.
Old 11-24-2008, 11:30 AM   #15
Peto
Quote:
Originally Posted by ProDigit
My apologies,
I saw you had the Sony PRS-505, and assumed you wanted to convert documents for that reader.
Of course. That is what I am doing. But the PDFs I create for the 505 happen to work in any e-reader with the same screen size. That is another advantage, but not the reason for me to use PDF.
All times are GMT -4.