07-06-2011, 06:41 PM   #1
Snauzoo
Enthusiast
Posts: 31
Karma: 282
Join Date: May 2010
Device: iPad
Two format (text) questions

One, I have a book in which a black bar replaces every apostrophe and quotation mark. I am not sure whether the original used single quote marks like 'this' instead of "this". When I looked at this book in ADE, the black bars appeared as ? marks (question marks, that is). I cannot see a way to fix this, since a search/replace would also remove the ? marks from real questions. I used Smarten Punctuation in my conversion, to no avail.

ETA: I have located an unconverted .zip form of this particular book, and am willing to give it a whirl to figure this out. It turns out it is not just apostrophes that have gone bad, but also a lot of what are called soft hyphens.

----------------------------------------------------------------------------------------

Two, this is an iPad question. When I have a book with two authors in Calibre, such as Douglas Preston and Lincoln Child, they are entered in the author field as Douglas Preston & Lincoln Child. When I add these to iTunes, the ampersand gets changed to &amp or something equally weird. How can I fix this? I have ten books by them, plus a bunch of cookbooks with multiple authors, and I have had to edit each one individually.
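For what it's worth, `&amp;` is how an ampersand is written in HTML and XML, so the odd author string looks like an escaped `&` that was never unescaped somewhere along the way. Where that happens between Calibre and iTunes is not established in this thread. The Python sketch below only illustrates the escaping itself, and the variable names are made up for the example.

```python
import html

# What the author field is supposed to contain.
author = "Douglas Preston & Lincoln Child"

# Escaping for HTML/XML turns the bare ampersand into the entity "&amp;".
escaped = html.escape(author)

# Unescaping reverses it and recovers the original author string.
restored = html.unescape(escaped)

print(escaped)
print(restored)
```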

Thanks in advance. My library project is coming along well. Just added about 40 books to the NOOK! And many more to the iPad.

Last edited by Snauzoo; 07-06-2011 at 07:43 PM.
07-06-2011, 09:11 PM   #2
dwig
Wizard
Posts: 1,613
Karma: 6718479
Join Date: Dec 2004
Location: Paradise (Key West, FL)
Device: Current:Surface Go & Kindle 3 - Retired: DellV8p, Clie UX50, ...
Quote:
Originally Posted by Snauzoo
...when I looked at this book in ADE, the black marks appeared as ? marks...
This happens when there is confusion about the character encoding during format conversion. It often happens when text files are converted to ebooks and the conversion software either guesses the encoding wrong or needs the encoding set manually and the user does not set it, or sets it incorrectly. Most ebook reading software displays a question mark or a special glyph (e.g. a question mark in a box) when it encounters a character it doesn't understand. The vertical bar is apparently what the software you are using on your iPad shows in such cases.
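Here is a minimal Python sketch of the failure described above. It assumes, purely for the demonstration, that the source bytes are cp1252 and that the reader decodes them as UTF-8. The sample sentence is invented; it contains a curly apostrophe and a soft hyphen, the two characters reported as going bad.

```python
SOFT_HYPHEN = "\u00ad"       # invisible hyphenation hint
REPLACEMENT = "\ufffd"       # the "unknown character" marker, often drawn as ? or a box

# Invented sample text with a curly apostrophe and a soft hyphen.
original = "It\u2019s a co\u00adoperative effort."

# The bytes as a cp1252 file would store them.
raw = original.encode("cp1252")

# Decoding with the wrong encoding: each byte the decoder cannot
# interpret becomes a replacement character.
wrong = raw.decode("utf-8", errors="replace")

# Decoding with the right encoding recovers the text.
right = raw.decode("cp1252")

print(wrong.count(REPLACEMENT), "characters were lost in the wrong decoding")
print(right.replace(SOFT_HYPHEN, ""))
```

Both the apostrophe and the soft hyphen are damaged by the wrong decoding, which matches the symptom of more than just apostrophes going bad.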

When you try to convert the contents of your ZIP file, pay attention to the encoding setting in the conversion dialogs. Many encoding schemes are referred to as "code pages" and are often listed as "cp1252" or with other numbers.

If you don't know the source encoding, it's a matter of trial-and-terror. Try conversions with different encoding settings until you hit on the correct one.
07-07-2011, 12:29 AM   #3
DoctorOhh
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
Quote:
Originally Posted by dwig
Quote:
Originally Posted by Snauzoo
ETA: I have located an unconverted .zip form of this particular book, and am willing to give it a whirl to figure this out. It turns out it is not just apostrophes that have gone bad, but also a lot of what are called soft hyphens.
When you try to convert the contents of your ZIP file, pay attention to the encoding setting in the conversion dialogs. Many encoding schemes are referred to as "code pages" and are often listed as "cp1252" or with other numbers.
Zip is not a book format, so it is hard to give advice. If this zip file is HTML, then the encoding has to be entered in the HTML to Zip file type plugin before the book is added to calibre. Read #2 in this section of the manual. If it isn't HTML, then follow #1 in that section, per dwig's suggestion.

Last edited by DoctorOhh; 07-07-2011 at 12:32 AM.
07-07-2011, 12:44 AM   #4
Snauzoo
Okay. I tried every single option for encoding that was listed. After each attempt I removed the book and added the HTML book back. I can open the file with Notepad and it appears perfectly normal. I can also see it just fine in Sigil, but Sigil does state that it can render text that ADE cannot. Some of the encodings actually made it worse: I got either a series of BBBB for each indent or a series of boxes, and one of those preceded every hyphen or apostrophe.
07-07-2011, 12:51 AM   #5
DoctorOhh
Quote:
Originally Posted by Snauzoo
Okay. I tried every single option for encoding that was listed. After each attempt I removed the book, and added back the html book.
Did you change the encoding in the htmltozip plugin prior to adding the book back each time?
07-07-2011, 12:58 AM   #6
Snauzoo
Actually, Walt, I posted my reply to dwig before I saw yours. So what I did was per his suggestion, not yours.

Now, looking at my HTML document, there is no encoding stated alongside the doctype. My understanding is that XHTML 1.1 requires an encoding to be stated unless it is UTF-8 or UTF-16. So I have absolutely no idea what the encoding is, and trying a dozen of them over and over seems like a crazy wild goose chase.
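One note on where to look: in (X)HTML the encoding is not part of the doctype itself. It is normally declared in the XML declaration on the first line or in a `<meta>` element in the head. Below is a rough Python sketch for checking a file for either one. `declared_encoding` is a hypothetical helper written for this example, and the regular expressions are a quick approximation, not a full HTML parser.

```python
import re

# <?xml version="1.0" encoding="..."?>
XML_DECL = re.compile(rb"""<\?xml[^>]*encoding=["']([A-Za-z0-9._-]+)["']""", re.I)
# <meta charset="..."> or <meta http-equiv=... content="text/html; charset=...">
META_CHARSET = re.compile(rb"""<meta[^>]*charset=["']?([A-Za-z0-9._-]+)""", re.I)

def declared_encoding(raw: bytes):
    """Return the encoding name declared near the top of the file, or None."""
    head = raw[:2048]  # declarations are expected near the start of the file
    for pattern in (XML_DECL, META_CHARSET):
        match = pattern.search(head)
        if match:
            return match.group(1).decode("ascii")
    return None

# Invented sample documents.
with_xml_decl = b'<?xml version="1.0" encoding="windows-1252"?><html></html>'
with_meta = (b'<html><head><meta http-equiv="Content-Type" '
             b'content="text/html; charset=iso-8859-1"></head></html>')
doctype_only = (b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" '
                b'"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd"><html></html>')

print(declared_encoding(with_xml_decl))
print(declared_encoding(with_meta))
print(declared_encoding(doctype_only))
```

A result of `None`, as with the doctype-only sample, is the situation described here: nothing is declared, so the encoding has to be guessed.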

On a whim, since it is HTML, I popped it open in Firefox, and the browser chose ISO 8859-1 (it's on autodetect since I work with a lot of UTF-8 and CJK stuff at work). I played around with it in Firefox, and the Windows encoding also displays it correctly, so I decided to try that one next. I tried using ISO 8859-1 in the plugin and it did not work; I did manage to crash Calibre good. Then the light went on! Firefox calls cp1252 "Western Windows 1252". So I thought, well, maybe someone used Word to create this document. I tried that and voilà, it worked.
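A likely explanation for why both choices looked right in Firefox but only cp1252 worked in the plugin: web browsers treat the label ISO 8859-1 as Windows-1252, whereas a strict ISO 8859-1 decoder maps bytes 0x80-0x9F to invisible control characters. cp1252 puts curly quotes and apostrophes in exactly that range. Python's `latin1` codec is the strict kind, so it shows the difference. The sample bytes and the `reencode_to_utf8` helper are made up for this sketch.

```python
# Invented sample: bytes 0x92, 0x93 and 0x94 are the curly apostrophe
# and curly double quotes in cp1252.
raw = b"It\x92s a \x93test\x94"

as_cp1252 = raw.decode("cp1252")   # curly punctuation, as intended
as_latin1 = raw.decode("latin1")   # same bytes become C1 control characters

def reencode_to_utf8(data: bytes, source_encoding: str = "cp1252") -> bytes:
    """Decode with the known source encoding and write the text back as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")

utf8_bytes = reencode_to_utf8(raw)

print(repr(as_cp1252))
print(repr(as_latin1))
print(utf8_bytes.decode("utf-8") == as_cp1252)
```

Once the real encoding is known, saving a UTF-8 copy like this is one way to avoid having to guess again later.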

Thanks for the help. I never would have figured that out on my own.

Last edited by Snauzoo; 07-07-2011 at 01:15 AM.
07-07-2011, 01:04 AM   #7
DoctorOhh
Quote:
Originally Posted by Snauzoo
Now, looking at my HTML document, there is no encoding stated alongside the doctype. My understanding is that XHTML 1.1 requires an encoding to be stated unless it is UTF-8 or UTF-16. So I have absolutely no idea what the encoding is, and trying a dozen of them over and over seems like a crazy wild goose chase.
Starson17 suggests the following guesses in this thread.

Quote:
Most people don't do it that way, however. They just try reasonable options until one seems to work. Here are the ones I usually try:
cp1252
cp1251
latin1
utf-8
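The try-a-few approach with the list above can be sketched in Python. This is only an illustration, not how calibre does it: `preview_decodings` is a hypothetical helper and the sample text is invented. Note that a decode finishing without an error does not prove the encoding is right, so the previews still have to be read by eye.

```python
CANDIDATES = ["cp1252", "cp1251", "latin1", "utf-8"]

def preview_decodings(raw: bytes, candidates=CANDIDATES, width=60):
    """Decode raw with each candidate; map name -> preview text, or None on failure."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)[:width]
        except UnicodeDecodeError:
            results[name] = None  # these bytes are not valid in this encoding
    return results

# Invented sample with curly quotes and an accented letter, stored as cp1252.
text = "\u201cCaf\u00e9,\u201d she said. \u201cDon\u2019t.\u201d"
raw = text.encode("cp1252")

results = preview_decodings(raw)
for name, preview in results.items():
    print(f"{name:8} {preview!r}")
```

With this sample, UTF-8 fails outright, latin1 yields control characters where the quotes should be, and cp1251 gets the quotes right but turns the accented letter into a Cyrillic one. Only cp1252 reproduces the text.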
All times are GMT -4.