MobileRead Forums > E-Book Formats > Workshop

Old 06-25-2011, 06:30 AM   #1
Dillinquent
eBook pro
Posts: 65
Karma: 5634
Join Date: Jan 2011
Location: Hertford, UK
Device: PC, iPad, Kindle, Kindle Fire, Galaxy Ace
Italics not exporting from PDF

I am having problems exporting italicisation from PDFs.

I have been working on a series of books where the source supplied is PDF, so I am exporting from Acrobat X as HTML. The HTML is a mess, but a few minutes' work in Dreamweaver makes it usable with Sigil.

So far so good.

But now the problem. After doing 40 books, the 41st refuses to export the italics. The text is there; it's just not italicised (no <i> tags or italic styles in the CSS). The italics are visible in the PDF and there is an italic font embedded in the PDF. I have tried loads of online and standalone conversion tools, but none of them solves my problem.

Any ideas anyone?
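
For anyone wanting to confirm the same symptom on their own export, a rough way to check is to scan the HTML for <i>/<em> tags and for font-style declarations. Below is a minimal Python sketch using only the standard library. The helper name count_italics is made up for illustration and is not part of Acrobat, Sigil, or any tool mentioned in this thread. It only looks at inline markup, style attributes, and <style> blocks, so it will miss italics defined in an external stylesheet.

```python
from html.parser import HTMLParser
import re

# Tags and CSS declarations that normally signal italic text.
ITALIC_TAGS = {"i", "em"}
ITALIC_CSS = re.compile(r"font-style\s*:\s*(italic|oblique)", re.IGNORECASE)


class ItalicScanner(HTMLParser):
    """Counts italic tags and italic CSS declarations in an HTML string."""

    def __init__(self):
        super().__init__()
        self.italic_tags = 0
        self.italic_styles = 0
        self._in_style = False

    def handle_starttag(self, tag, attrs):
        if tag in ITALIC_TAGS:
            self.italic_tags += 1
        if tag == "style":
            self._in_style = True
        # Inline style="..." attributes.
        for name, value in attrs:
            if name == "style" and value and ITALIC_CSS.search(value):
                self.italic_styles += 1

    def handle_endtag(self, tag):
        if tag == "style":
            self._in_style = False

    def handle_data(self, data):
        # Text inside a <style> block: count italic declarations.
        if self._in_style:
            self.italic_styles += len(ITALIC_CSS.findall(data))


def count_italics(html_text):
    """Hypothetical helper: returns (italic tag count, italic CSS count)."""
    scanner = ItalicScanner()
    scanner.feed(html_text)
    scanner.close()
    return scanner.italic_tags, scanner.italic_styles
```

If both counts come back as zero for a book whose PDF clearly shows italics, the converter has dropped the styling rather than just encoding it oddly.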
Old 06-26-2011, 03:46 AM   #2
Jellby
frumious Bandersnatch
Posts: 6,055
Karma: 4571547
Join Date: Jan 2008
Location: Spaniard in Sweden
Device: Cybook Orizon, Kobo Aura
Try with OCR, treating the PDF as scanned images.
Old 06-27-2011, 05:08 AM   #3
Ciprian
Junior Member
Posts: 5
Karma: 10
Join Date: Feb 2011
Device: Iphone
Have you tried RTF export?
Old 06-30-2011, 03:39 PM   #4
Dillinquent
eBook pro
Quote:
Originally Posted by Ciprian View Post
Have you tried RTF export?
Quote:
Originally Posted by Jellby View Post
Try with OCR, treating the PDF as scanned images.
Yep, tried RTF, XML, Word .doc, Word .docx, HTML, HTML (filtered), and OCR. Nothing works except PDFtoEPUB, which makes the worst HTML I've ever seen.
Old 07-01-2011, 04:26 AM   #5
Faster
Connoisseur
Posts: 60
Karma: 12096
Join Date: Sep 2010
Location: Tasmania
Device: Sony PRS 650
I know nothing of Acrobat X, so forgive me if my post seems silly.
When I have a similar problem with Adobe Reader, I copy a section of the text that includes the problem text and paste it into Word.
From the Format menu I select Styles and Formatting to open the panel. Next I put the insertion point inside the problem word(s) and see what the panel tells me in the 'Formatting of selected text' box.
Hover the mouse over the box, then click the down arrow on the right.
Select 'Reveal Formatting'.
This may give you some clue as to what's happening.
Old 07-02-2011, 07:04 AM   #6
mrmikel
Book Twiddler
Posts: 1,984
Karma: 1405001
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
Sometimes in PDFs I have noticed text being listed twice, once in plain text and another time as part of a caption box or something of that sort. Sometimes a conversion program will get the one, but not the other.

Is the text always associated with something else, like an illustration? If it is, finding and fixing those spots by hand might be the least stressful way to approach it.
Old 07-04-2011, 11:37 AM   #7
Michael Grossman
Posts: 8
Karma: 10
Join Date: Jun 2011
Location: Rhode Island
Device: ipad
Dillinquent - I'm having a similar problem when I cut and paste books in Word .doc format into InDesign to begin creating an EPUB. It leaves out all the italics and I have to style them all manually. Has anyone else run into this going from MS Word into InDesign? Thanks - Michael
Old 07-12-2011, 07:26 PM   #8
wannabee
Media Bloke
Posts: 2,377
Karma: 113956855
Join Date: Sep 2010
Location: NSW - Australia
Device: iOS
Quote:
Originally Posted by Dillinquent View Post
Yep, tried RTF, XML, Word .doc, Word .docx, HTML, HTML (filtered), and OCR. Nothing works except PDFtoEPUB, which makes the worst HTML I've ever seen.
Dillinquent - I think Jellby may have meant OCR'ing hard copies rather than OCR from inside Acrobat. I also found that if you open that "really bad" HTML from the Acrobat export in a browser, you can copy from the rendered display rather than the source into another program that exports cleaner HTML. For example, I exported from Acrobat to HTML with CSS, opened the file in Firefox, copied from Firefox into InDesign, and didn't get a paragraph character at the end of every line, which you do with most exports from PDF.

Quote:
Originally Posted by Michael Grossman View Post
Dillinquent - I'm having a similar problem when I cut and paste books in Word .doc format into InDesign to begin creating an EPUB. It leaves out all the italics and I have to style them all manually. Has anyone else run into this going from MS Word into InDesign? Thanks - Michael
Try importing the Word doc rather than copying and pasting. That might work.
Old 07-14-2011, 12:04 PM   #9
DaleDe
Grand Sorcerer
Posts: 9,540
Karma: 4597554
Join Date: Aug 2007
Location: Grass Valley, CA
Device: EB 1150, EZ Reader, Literati, iPad 2
Quote:
Originally Posted by wannabee View Post
Dillinquent - I think Jellby may have meant OCR'ing hard copies rather than OCR from inside Acrobat. I also found that if you open that "really bad" HTML from the Acrobat export in a browser, you can copy from the rendered display rather than the source into another program that exports cleaner HTML. For example, I exported from Acrobat to HTML with CSS, opened the file in Firefox, copied from Firefox into InDesign, and didn't get a paragraph character at the end of every line, which you do with most exports from PDF.
Actually, many OCR programs can OCR directly from PDF files. This is generally the best method, if available.

Dale
Old 07-15-2011, 05:44 PM   #10
frabjous
Wizard
Posts: 1,213
Karma: 12890
Join Date: Feb 2009
Location: Amherst, Massachusetts, USA
Device: Sony PRS-505
Have you tried pdfreflow?
Old 07-21-2011, 07:14 AM   #11
Dillinquent
eBook pro
Bingo!

Quote:
Originally Posted by frabjous View Post
Have you tried pdfreflow?


Thanks frabjous; pdfreflow works a treat.

Tags
conversion error, export, italics, pdf, styling

All times are GMT -4.


MobileRead.com is a privately owned, operated and funded community.