01-12-2010, 06:32 PM   #1
Dark123
Zealot
Dark123 doesn't litter
 
Posts: 111
Karma: 105
Join Date: Jan 2010
Device: Kindle 3 WiFi
PDF to ePub problem

I converted a PDF to ePub (this only happened with 1 file).
This is how it came out:
Hello this is a
test.


Anyone know how to fix this?
01-12-2010, 07:08 PM   #2
ac4lt
Connoisseur
ac4lt began at the beginning.
 
Posts: 61
Karma: 36
Join Date: Jan 2010
Location: Reston, Virginia, US
Device: ipad
I suspect you can fix this by going to the "pdf input" section on the conversion dialog and changing the line unwrapping value to 0.5. That worked for me.
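
For anyone who would rather script this than click through the dialog, here is a minimal sketch of the same setting applied through calibre's command-line tool. It assumes `ebook-convert` is installed and on the PATH, and that your calibre version spells the PDF input option `--unwrap-factor` (check the tool's help output to be sure). The function names and file names are made up for illustration.

```python
import shutil
import subprocess


def build_convert_command(pdf_path, epub_path, unwrap_factor=0.5):
    """Build the argument list for a PDF -> EPUB conversion.

    Assumption: the line unwrapping value from the "pdf input" section
    corresponds to the --unwrap-factor option of ebook-convert.
    """
    if not 0.0 <= unwrap_factor <= 1.0:
        raise ValueError("unwrap_factor must be between 0 and 1")
    return [
        "ebook-convert",
        pdf_path,
        epub_path,
        "--unwrap-factor",
        str(unwrap_factor),
    ]


def convert(pdf_path, epub_path, unwrap_factor=0.5):
    """Run the conversion; raises if calibre's CLI is not available."""
    if shutil.which("ebook-convert") is None:
        raise RuntimeError("ebook-convert not found on PATH")
    subprocess.run(
        build_convert_command(pdf_path, epub_path, unwrap_factor),
        check=True,
    )
```

Calling `convert("book.pdf", "book.epub")` would then run the conversion with the 0.5 value; if the result still has broken lines, trying a few nearby values is probably the quickest way to tune it.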
01-13-2010, 12:05 AM   #3
Dark123
Zealot
Dark123 doesn't litter
 
Posts: 111
Karma: 105
Join Date: Jan 2010
Device: Kindle 3 WiFi
Thank you ac4lt. That worked.
01-13-2010, 10:55 AM   #4
Vargagy
Junior Member
Vargagy began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Jan 2010
Device: HTC PDA-phone
Broken lines in the epub output

Hi,

I am a new user of Calibre and a new member of the forum. Sorry, my English isn't very good; my native language is Hungarian.

My task is to convert books that originate in Adobe InDesign CS2 to epub format. CS3 and CS4 include epub export, but CS2 does not. To work around this, I exported the text from InDesign as PDF and converted the PDF with Calibre.

The epub file converted by Calibre contains broken lines: the last word of a paragraph sits alone on its own line, and the next paragraph begins on a new line after it. The broken line doesn't appear in every paragraph, only in a few, and I can't find any pattern in it.

If I change something in the layout (e.g. shorter lines), and with it the structure of the paragraphs, the error disappears from the earlier place but shows up in a new one.

Have you seen this effect before? What can I do?
01-13-2010, 11:01 AM   #5
JSWolf
Resident Curmudgeon
JSWolf ought to be getting tired of karma fortunes by now.
 
Posts: 38,543
Karma: 19637653
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Aura H2O, Sony PRS-650, Sony PRS-T1, nook STR, iPad 1, iPhone 5
The problem is there is NO program that can convert PDF to any other format without errors. So once you convert the PDF, the only choice you have is to a/b compare the PDF to the output and fix all the errors in the conversion. I know it's tedious. But, there's nothing else you can do.
01-20-2010, 09:51 PM   #6
bthoven
Evangelist
bthoven will become famous soon enough
 
Posts: 452
Karma: 544
Join Date: Aug 2009
Location: Bangkok, Thailand
Device: Kindle'r 3, iPhone 3Gs, iPad 2, Galaxy Tab Wifi
Quote:
Originally Posted by ac4lt
I suspect you can fix this by going to the "pdf input" section on the conversion dialog and changing the line unwrapping value to 0.5. That worked for me.
Hi Ac4lt...thanks for your tip, it works for me too!

By the way, what does this setting do?
01-20-2010, 10:49 PM   #7
ac4lt
Connoisseur
ac4lt began at the beginning.
 
Posts: 61
Karma: 36
Join Date: Jan 2010
Location: Reston, Virginia, US
Device: ipad
I *think* it's deciding at what percentage of a line length it needs to pull in the next line, but I'm not exactly sure how it's implemented.
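
To make that guess concrete, here is a rough illustration of the idea. It is not calibre's actual implementation, just a sketch under the assumption that a line counts as "wrapped" when it is at least `factor` times as long as the longest line, in which case the next line gets pulled in. The function name is hypothetical.

```python
def unwrap_lines(lines, factor=0.5):
    """Join lines that look like they were wrapped mid-paragraph.

    Assumed heuristic (not calibre's real code): a line at least
    factor * longest_line characters long is treated as wrapped, so the
    following line is appended to it. A shorter line ends the paragraph.
    """
    lengths = [len(line.strip()) for line in lines if line.strip()]
    if not lengths:
        return []
    threshold = factor * max(lengths)

    paragraphs = []
    current = ""
    for line in lines:
        text = line.strip()
        if not text:
            # A blank line always closes the current paragraph.
            if current:
                paragraphs.append(current)
                current = ""
            continue
        current = f"{current} {text}" if current else text
        if len(text) < threshold:
            # Short line: assume the paragraph really ends here.
            paragraphs.append(current)
            current = ""
    if current:
        paragraphs.append(current)
    return paragraphs
```

With the sample from the first post, `unwrap_lines(["Hello this is a", "test."], 0.5)` joins the two lines, while a factor that is too high for the text leaves "test." stranded on its own line, which would explain why lowering the value helps.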
01-20-2010, 11:09 PM   #8
AnemicOak
Bookaholic
AnemicOak ought to be getting tired of karma fortunes by now.
 
Posts: 10,428
Karma: 28936355
Join Date: Oct 2007
Location: Minnesota
Device: HDX 8.9, AuraHD, Nook HD+, Kindle 2,3,T , Opus, Nexus7, iPhone5, etc
Quote:
Originally Posted by Vargagy
My task is to convert books that originate in Adobe InDesign CS2 to epub format. CS3 and CS4 include epub export, but CS2 does not. To work around this, I exported the text from InDesign as PDF and converted the PDF with Calibre.
Does CS2 have export to XHTML? I can't remember if that was added in CS2 or CS3. If so you could do that and use Sigil to format your ePub. Or perhaps exporting to XML and converting that to HTML and going from there or something? Not sure if there's a converter that will go from XML straight to ePub.
01-21-2010, 12:33 PM   #9
alecE
Evangelist
alecE ought to be getting tired of karma fortunes by now.
 
Posts: 401
Karma: 546196
Join Date: Mar 2009
Location: UK canal boat
Device: sony prs505, prs650 liseuses
Fwiw, I've converted 20+ pdf titles to epub. I haven't found any foolproof, quick way of doing this. My rather laborious procedure now is:
1. If necessary, remove the pdf restrictions which prevent exporting as text;
2. Use the File | Save As Text menu option to create a .txt version;
3. Using the text editor of your choice, remove page headers, page numbers, extraneous front matter. Then find and change quotes, apostrophes, accents, etc. to HTML named entities (e.g., " --> &ldquo;);
4. Create an epub file using Sigil (thank you so, so much Valloric)
5. Load into Calibre (thank you Kovid);
6. Transfer to Sony reader;
7. Read, enjoy, bookmark the typos;
8. Back to Sigil, correct all the typos etc., repeat steps 4 to 7.

This works, it's labour intensive, and during the process you may lose the will to live.
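
The entity replacement in step 3 can be partly automated. Below is a minimal sketch, assuming the text already contains the typographic characters (curly quotes, accented letters) and only needs them rewritten as named entities; turning straight quotes into curly ones needs context and is not attempted here. The function name is hypothetical.

```python
from html.entities import codepoint2name


def to_named_entities(text):
    """Replace non-ASCII characters with HTML named entities where one exists.

    ASCII characters (including plain quotes and ampersands) are left
    untouched; non-ASCII characters with no named entity are kept as-is.
    """
    pieces = []
    for ch in text:
        name = codepoint2name.get(ord(ch))
        if ord(ch) > 127 and name:
            pieces.append(f"&{name};")
        else:
            pieces.append(ch)
    return "".join(pieces)
```

Running the exported .txt through something like this before loading it into Sigil would cover the mechanical part of step 3; the headers, page numbers and front matter still need a human eye.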
01-21-2010, 01:39 PM   #10
kjk
.
kjk ought to be getting tired of karma fortunes by now.
 
Posts: 3,408
Karma: 5647231
Join Date: Oct 2008
Device: never enough
Quote:
Originally Posted by alecE
Fwiw, I've converted 20+ pdf titles to epub. I haven't found any foolproof, quick way of doing this. My rather laborious procedure now is:
1. If necessary, remove the pdf restrictions which prevent exporting as text;
2. Use the File | Save As Text menu option to create a .txt version;

This works, it's labour intensive, and during the process you may lose the will to live.
ahahaha I feel your pain.
I am curious though, why do you go all the way back to text? Don't you lose italics, bold, etc.? Or do you put that back in during editing?
01-21-2010, 02:30 PM   #11
chaley
"chaley", not "charley"
chaley ought to be getting tired of karma fortunes by now.
 
Posts: 5,909
Karma: 1217216
Join Date: Jan 2010
Location: France
Device: Many android devices
I have had good luck converting single-column PDFs by:
1. Crop margins of the PDF to the text area.
2. Save as RTF
3. Use Calibre to convert RTF to EPUB
4. Use Sigil to fix line breaks across page breaks (and a few others)

This method conserves all the character formatting, and (with one exception) the 10 or so files I have converted haven't resulted in a large amount of repair work.

The one case involved fixing up chapter headings. The original document had chapters in the form
ROMAN_NUMERAL.
CHAPTER TITLE
followed by a few empty lines. The easiest way for me to fix this was to use VIM's global search & replace on the .html that came out of the epub conversion. I used a regexp that matched the two lines and replaced them with a single line wrapped in <h1></h1> tags. I tried using Sigil, but couldn't figure out how to make a multi-line match expression (I admit I didn't look for a long time).
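
For those without VIM to hand, the same kind of two-line match can be sketched in Python. This is not the exact expression used above (that one isn't shown); it assumes the numeral and the title sit on two consecutive plain lines with no surrounding tags, so the pattern would need adapting to whatever markup the conversion actually produces. The names are made up for illustration.

```python
import re

# Line 1: a Roman numeral followed by a period.
# Line 2: the chapter title (any text without tags).
CHAPTER_RE = re.compile(
    r"^[ \t]*(?P<num>[IVXLCDM]+)\.[ \t]*\n"
    r"[ \t]*(?P<title>[^\n<]+?)[ \t]*$",
    re.MULTILINE,
)


def merge_chapter_headings(text):
    """Replace each numeral/title line pair with a single <h1> line."""
    return CHAPTER_RE.sub(
        lambda m: f"<h1>{m.group('num')}. {m.group('title')}</h1>",
        text,
    )
```

The `re.MULTILINE` flag is what lets `^` and `$` anchor at each line, and the explicit `\n` in the pattern is what spans the two lines.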
01-22-2010, 03:49 AM   #12
Vargagy
Junior Member
Vargagy began at the beginning.
 
Posts: 2
Karma: 10
Join Date: Jan 2010
Device: HTC PDA-phone
HTML is my friend

Quote:
Originally Posted by AnemicOak
Does CS2 have export to XHTML? I can't remember if that was added in CS2 or CS3. If so you could do that and use Sigil to format your ePub. Or perhaps exporting to XML and converting that to HTML and going from there or something? Not sure if there's a converter that will go from XML straight to ePub.
Yes, the actual solution to the problem is simply exporting the text in HTML format and converting that in Calibre. The resulting ePub is correct in the Calibre reader and some other ePub readers but, and I don't understand why, it is not correct in Adobe Digital Editions: all the accented characters used in Eastern Europe (e.g. á, é, ö, ü) are replaced with "?".
01-22-2010, 04:11 AM   #13
hidari
MR Drone
hidari ought to be getting tired of karma fortunes by now.
 
Posts: 1,604
Karma: 15260410
Join Date: Oct 2007
Location: DRONEZONE
Device: OPUS/PB360,Nexus 7,GzONE, Kobo Mini
Quote:
Originally Posted by Vargagy
Yes, the actual solution to the problem is simply exporting the text in HTML format and converting that in Calibre. The resulting ePub is correct in the Calibre reader and some other ePub readers but, and I don't understand why, it is not correct in Adobe Digital Editions: all the accented characters used in Eastern Europe (e.g. á, é, ö, ü) are replaced with "?".


Just in case you want to do it a simpler way: I myself have converted quite a few books from PDF to epub. Like alecE said, convert it to a text format.

Then I pop it into Calibre, convert it to epub, and it's ready to read.

Not sure about you, but I don't care if there are mistakes in quotes, indentation, etc. Yes, I am a heathen, I drink instant coffee. BUT for me content is more important than the bits and bobs...

To sum up:

1. convert PDF to text
2. Convert .txt format to epub with Calibre
3. Read..........
01-24-2010, 07:59 AM   #14
alecE
Evangelist
alecE ought to be getting tired of karma fortunes by now.
 
Posts: 401
Karma: 546196
Join Date: Mar 2009
Location: UK canal boat
Device: sony prs505, prs650 liseuses
Quote:
Originally Posted by kjk
ahahaha I feel your pain.
I am curious though, why do you go all the way back to text? Don't you lose italics, bold, etc.? Or do you put that back in during editing?
I like to go all the way back to text as that gives me the greatest control. With respect to italics etc., yes, I then have to put those back in, but, so many titles are poorly edited to begin with that very often there are errors in the formatting which have to be corrected anyway.
With respect to the pain, whilst aspirin helps, a single malt whisky is even better.
01-25-2010, 04:22 PM   #15
avresbo
Member
avresbo began at the beginning.
 
Posts: 13
Karma: 10
Join Date: Jan 2010
Device: cool-er
Does calibre recognize *.doc?
Why replace special characters when using *.txt? The original is correct.
Thank you for your help.

Last edited by avresbo; 01-25-2010 at 04:28 PM.

All times are GMT -4.