Old 08-30-2010, 11:46 PM   #1
Dresden
Junior Member
Posts: 2
Karma: 10
Join Date: Aug 2010
Device: Ipad
Punctuation

I have been using Calibre for a while now with mixed results. When I get a .lit file and convert it, I have 100 percent success, but when I get an HTML file, it can be good or it can be horrible.

My current problem is that I have an HTML file that, when I convert it, loses all the upper punctuation (the quotation marks).

Example:

"Bob, what are you doing?"

Will end up:

Bob, what are you doing?

The HTML code starts each sentence with <p width="4%" height="0%">, and I thought that might be the issue I am having. I assume that if I edit the entire document to take out the width and height, I would be fine, but then again, I could just be chasing my tail.

I know this is ultra trivial, but it has been causing me to pull my hair out for the last 4 hours trying to figure it out... Anyone have any suggestions? Any help is appreciated.
Old 08-31-2010, 01:10 AM   #2
DoctorOhh
US Navy, Retired
Posts: 9,864
Karma: 13806776
Join Date: Feb 2009
Location: North Carolina
Device: Icarus Illumina XL HD, Nexus 7
Quote:
Originally Posted by Dresden View Post
I have been using Calibre for a while now with mixed results... When I get a .lit file and convert that... I have 100 Percent success...
Count yourself lucky. I usually have great success with .lit files but there are plenty of garbage files out there in .lit format.

Quote:
Originally Posted by Dresden View Post
My current problem is that I have an html file, that when I convert it... It will take away all the upper punctuation.
This is a character encoding problem that has to be adjusted in a plugin before adding the book. Read this section of the FAQ. Calibre attempts to figure out the proper encoding of an HTML file, but sometimes you have to enter the encoding into the plugin manually and then re-add the HTML to calibre (see attached).

Encodings you might try: cp1252, cp1251, latin1, iso-8859-1, utf-8.
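
To see why the choice of encoding matters, here is a minimal Python sketch (not calibre's own code) that decodes the same bytes with each of the encodings listed above. The sample bytes are an assumption: they are what the line from the first post would look like if the file had been saved as cp1252 with curly quotes. Note that in Python, latin1 and iso-8859-1 are two names for the same codec, and that codec never fails; it turns the quote bytes into invisible control characters, which may be how the quotes "disappear".

```python
# Hypothetical sample: the line from the first post as a cp1252-encoded
# file would store it, with curly quotes as the single bytes 0x93 and 0x94.
raw = b"\x93Bob, what are you doing?\x94"

CANDIDATES = ["cp1252", "cp1251", "latin1", "iso-8859-1", "utf-8"]


def try_encodings(data: bytes, candidates=CANDIDATES) -> dict:
    """Decode data with each candidate; map encoding name to text or None."""
    results = {}
    for name in candidates:
        try:
            results[name] = data.decode(name)
        except UnicodeDecodeError:
            results[name] = None  # the bytes are not valid in this encoding
    return results


results = try_encodings(raw)
for name, text in results.items():
    # ascii() shows invisible control characters as visible escapes
    print(f"{name:>10}: {ascii(text) if text is not None else 'decode error'}")
```

A successful decode does not prove the guess was right (cp1251 also decodes these bytes), so the output still needs a visual check.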
Attached image: Character_encoding.jpg

Last edited by DoctorOhh; 08-31-2010 at 01:16 AM.
Old 08-31-2010, 02:32 AM   #3
ldolse
Wizard
Posts: 1,337
Karma: 123455
Join Date: Apr 2009
Location: Malaysia
Device: PRS-650, iPhone
Alternatively just open your file in a text editor and re-save it as UTF-8. Make sure to delete any encoding statements at the start of the file, as they may not agree with the actual encoding it's been saved as.

The width and height probably wouldn't have anything to do with it. I've never seen this problem with an html file, though typically I'll be working with html that was extracted from one of the crappy lit files before I send it to Calibre. I think Lit files may all use UTF-8...
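
The re-save approach can also be scripted. The following is a rough Python sketch under stated assumptions: the function name `resave_as_utf8` is made up for illustration, the source encoding must already be known (cp1252 in this example), and the "encoding statements" are taken to be `<meta ... charset=...>` tags and an XML declaration with an `encoding` attribute.

```python
import re

# Patterns for the "encoding statements" that could contradict the new file:
# a <meta ...charset=...> tag and an XML declaration that names an encoding.
META_CHARSET = re.compile(r"<meta[^>]*charset[^>]*>\s*", re.IGNORECASE)
XML_DECL = re.compile(r"<\?xml[^>]*encoding[^>]*\?>\s*", re.IGNORECASE)


def resave_as_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode with the known source encoding, drop declarations, emit UTF-8."""
    text = raw.decode(source_encoding)
    text = META_CHARSET.sub("", text)
    text = XML_DECL.sub("", text)
    return text.encode("utf-8")


# Hypothetical input: a cp1252 file whose header wrongly claims ISO-8859-1.
sample = (
    b'<html><head><meta http-equiv="Content-Type" '
    b'content="text/html; charset=ISO-8859-1" /></head>'
    b"<body><p>\x93Bob, what are you doing?\x94</p></body></html>"
)
converted = resave_as_utf8(sample, "cp1252")
print(converted.decode("utf-8"))
```

For a real file, the bytes would come from `pathlib.Path(...).read_bytes()` and go back with `write_bytes()`. If the source encoding passed in is wrong, the damage is simply baked into the UTF-8 output.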
Old 08-31-2010, 03:39 AM   #4
DoctorOhh
US Navy, Retired
Quote:
Originally Posted by ldolse View Post
Alternatively just open your file in a text editor and re-save it as UTF-8.
The plugin I mentioned automatically converts every HTML file added to calibre to UTF-8 during the add book process. I don't see how that is any different than saving it in a text editor. Either way, if the editor doesn't know the initial encoding, things will get lost.

That's why, after converting, I always check that my quotes and apostrophes are present. I have converted around 300 HTML books and have only needed to change the encoding a handful of times.
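
That check can be partly automated. Here is a small Python sketch, with made-up helper names, that flags two common signs of a wrong encoding guess in already-decoded text: C1 control characters (what cp1252 quote bytes become when read as latin1) and the Unicode replacement character. It is a heuristic only and will not catch every case.

```python
def suspicious_characters(text: str) -> list:
    """Return characters that often indicate a wrong encoding guess."""
    found = []
    for ch in text:
        if "\x80" <= ch <= "\x9f":
            found.append(ch)  # C1 control: e.g. a cp1252 byte read as latin1
        elif ch == "\ufffd":
            found.append(ch)  # replacement character from a lossy decode
    return found


def has_quotes(text: str) -> bool:
    """True if any straight or curly quote or apostrophe is present."""
    return any(ch in text for ch in "\"'\u2018\u2019\u201c\u201d")


good = "\u201cBob, what are you doing?\u201d"
bad = b"\x93Bob, what are you doing?\x94".decode("latin1")  # wrong guess

print(has_quotes(good), suspicious_characters(good))
print(has_quotes(bad), [hex(ord(c)) for c in suspicious_characters(bad)])
```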
Old 08-31-2010, 04:01 AM   #5
phenomshel
ZCD BombShel
Posts: 4,793
Karma: 8293322
Join Date: Jan 2009
Location: The Frozen North (aka Illinois, USA)
Device: iPad, STB Kindle Oasis
I've run into it more than a handful of times, and although I realize it is in no way Calibre's fault, it is extremely annoying. I just wish everyone would get on the same page with encoding!
I've run into a related problem with Popelli Reader and pReader - both ebook reading programs for WebOS. Between the two programs, they say they accept just about every format you can think of. In practice, though, they send everything to their servers to be converted to HTML and then back to your WebOS device. This means that if it's encoded differently than the converter thinks it should be, you either get no apostrophes and quotation marks, or you get a string of symbols where apostrophes and quotation marks should be. And do you know how difficult it is to read a book where every direct quote with a contraction looks like this?:

’Police are all too willing to believe it’s Lindsey’s handwriting’

It's enough to make you want to pull out your hair, for sure. I'm just grateful there's a relatively easy solution for Calibre. I wish it were that simple for the other two programs; as yet I haven't figured out a solution other than to pre-convert everything to HTML, check it, and then load it on my phone.
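
For anyone wondering where a "string of symbols" comes from, here is a short Python sketch of one common pattern, assuming the text was saved as UTF-8 and then read as cp1252. Whether that is exactly what those two readers do is a guess. Each curly apostrophe is three bytes in UTF-8, and each byte gets shown as its own cp1252 character.

```python
# A curly apostrophe (U+2019) is stored in UTF-8 as the bytes E2 80 99.
original = "it\u2019s Lindsey\u2019s handwriting"
utf8_bytes = original.encode("utf-8")

# Reading those bytes as cp1252 turns each apostrophe into three symbols.
garbled = utf8_bytes.decode("cp1252")
print(garbled)

# If nothing was lost along the way, the mistake can be reversed.
repaired = garbled.encode("cp1252").decode("utf-8")
print(repaired)
```

The reversal only works while every garbled character still maps back to a cp1252 byte; once characters have been dropped or replaced, the original text cannot be recovered this way.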
Old 08-31-2010, 05:08 AM   #6
DoctorOhh
US Navy, Retired
Quote:
Originally Posted by phenomshel View Post
It's enough to make you want to pull out your hair, for sure. I'm just grateful there's a relatively easy solution for Calibre....wish it was that simple for the other two programs; as yet I haven't figured out a solution other than to pre-convert everything to html and check it, and then load it on my phone.
It might be that simple for those two programs. You would do the same thing you do for calibre, which is to specify an encoding, by adding a line to the head section of your HTML page. If you view the source for this page, you will see this near the top of the head section:
Code:
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
This tells any browser what character encoding was used on this page.

If those programs can't tell the encoding, they might be guessing ISO-8859-1. I bet that if you added one of the following to the HTML head, the text would display normally. View here for the acceptable forms of this code for XML, HTML, HTML5, etc.

Code:
<meta http-equiv="Content-Type" content="text/html; charset=Windows-1252" />

or

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

or

<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
Disclaimer: The previous advice is based on conjecture. I'm not a web developer and have never played one on TV.

Good Luck.
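
In the same conjectural spirit, adding such a line can be scripted. This Python sketch uses invented helper names (`declared_charset`, `add_charset_meta`) and plain regular expressions rather than a real HTML parser, so treat it as an illustration only. It inserts the meta tag right after the opening head tag when no charset is declared yet. The declared charset must match how the file's bytes are actually encoded, or the declaration will make things worse.

```python
import re

CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([\w-]+)", re.IGNORECASE)
HEAD_RE = re.compile(r"<head(\s[^>]*)?>", re.IGNORECASE)


def declared_charset(html: str):
    """Return the first declared charset name, or None if there is none."""
    match = CHARSET_RE.search(html)
    return match.group(1) if match else None


def add_charset_meta(html: str, charset: str) -> str:
    """Insert a Content-Type meta tag after <head> unless one is declared."""
    if declared_charset(html) is not None:
        return html  # leave an existing declaration alone
    meta = (
        '<meta http-equiv="Content-Type" '
        f'content="text/html; charset={charset}" />'
    )
    # A function replacement avoids any backslash handling in the new text.
    return HEAD_RE.sub(lambda m: m.group(0) + "\n" + meta, html, count=1)


page = "<html><head><title>Book</title></head><body></body></html>"
tagged = add_charset_meta(page, "Windows-1252")
print(tagged)
```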
Old 08-31-2010, 05:12 AM   #7
Dresden
Junior Member
Going into the plugins and changing it to cp1252 worked perfectly. Thank you so very much!
Old 08-31-2010, 05:14 AM   #8
DoctorOhh
US Navy, Retired
Quote:
Originally Posted by Dresden View Post
Going into the plugins and changing it to cp1252 worked perfect... Thank you so very much!
Don't forget to change it back to blank, or it will bite you again by forcing calibre to assume that every file is cp1252.