#1 |
Captain Courageous
Posts: 239
Karma: 102
Join Date: Apr 2009
Device: calibre, PRS 505
|
After converting a txt file to EPUB, first from the command line and then from the GUI, I see a weird character: a black diamond with a white question mark in the middle. It seems to replace mostly the apostrophes in the sentences. This is supposed to be an ASCII txt file, so I don't understand why the apostrophe can't be displayed. Before conversion, the file looks fine.
Thanks, Paul
#2 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
Probably not ASCII
Sometimes texts that are advertised as ASCII are not really ASCII. If you look at the text, the apostrophes, quotes, etc. may be written with an & in front followed by a series of numbers, such as &#8216;.
As I understand it, calibre tries to guess the encoding of the text, and when it guesses wrong you get these odd symbols. You can go through with search and replace to eliminate them, which is a lot of work, or you can specify the code page or encoding and see if that makes them go away. A lot of the time cp1252 will work. Some day this will all be settled, but by then most of us will no longer be making books for ourselves, just as many of us no longer write BASIC programs to get what we want out of a computer.
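For anyone curious what a wrong guess looks like, here is a minimal Python sketch. It assumes the file contains the cp1252 byte 0x92 (the curly apostrophe); the sample bytes and the helper name `try_encodings` are made up for illustration and are not anything calibre itself uses.

```python
# Hypothetical sample: 0x92 is the right single quote in cp1252,
# but on its own it is not a valid UTF-8 sequence and it is not ASCII.
raw = b"It\x92s a fine day"

# A wrong guess (UTF-8) substitutes U+FFFD, which most viewers draw
# as a black diamond with a white question mark.
wrong = raw.decode("utf-8", errors="replace")

# The encoding the bytes were really written in recovers the apostrophe.
right = raw.decode("cp1252")


def try_encodings(data, candidates=("ascii", "utf-8", "cp1252")):
    """Report which candidate encodings decode the bytes without error."""
    results = {}
    for name in candidates:
        try:
            data.decode(name)
            results[name] = True
        except UnicodeDecodeError:
            results[name] = False
    return results


print(wrong)
print(right)
print(try_encodings(raw))
```

A clean decode is only a hint, not proof: iso-8859-1, for instance, accepts every byte, so it never raises an error even when it is the wrong choice.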
#3 |
Captain Courageous
Posts: 239
Karma: 102
Join Date: Apr 2009
Device: calibre, PRS 505
|
Yeah, I'm going to play with this some. The original file was in RTF, and I just copied and pasted it into Notepad. After that I ran it through Textify, and it looked good. When you copy and paste like that, does the text retain its encoding? If so, how could I re-encode it?
Thanks
#4 |
Color me gone
Posts: 2,089
Karma: 1445295
Join Date: Apr 2008
Location: Central Oregon Coast
Device: PRS-300
|
weird ascii
In calibre, under "look and feel", "input character encoding", you can try different encodings to see if one matches what you have. cp1252, utf-8, and iso-8859-1 are some possibilities. There is another thread on encodings that I happened to see the other day; you might search for it. I did not find an official list of the encodings calibre accepts or the exact format for their names.
Or you can open the document in a Notepad-like program and see what it uses for these odd characters. You can search for the & symbol and replace every character encoded this way with its keyboard equivalent. Then you will be closer to ASCII. If you find an odd character while you are in your Notepad program, copy it, paste it into the search box of your browser, and hit Enter. Google, at least, will gladly print it for you. Then you will know what to search for and replace, being sure to wrap around so you get all of them.
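The search-and-replace on & codes can also be scripted. This is a rough Python sketch, assuming the text really does contain numeric codes like &#8217; (the sample string and the `TO_KEYBOARD` table are made up for illustration). It uses `html.unescape` from the standard library to turn the codes into characters, then swaps the curly quotes for their keyboard equivalents.

```python
import html

# Hypothetical sample containing numeric character references.
text = "It&#8217;s a &#8220;quoted&#8221; word"

# Step 1: turn &#NNNN; codes into the characters they stand for.
decoded = html.unescape(text)

# Step 2: map curly quotes to plain keyboard quotes to get closer to ASCII.
TO_KEYBOARD = {
    0x2018: "'",  # left single quote
    0x2019: "'",  # right single quote / apostrophe
    0x201C: '"',  # left double quote
    0x201D: '"',  # right double quote
}
plain = decoded.translate(TO_KEYBOARD)

print(decoded)
print(plain)
```

The table only covers the four quote characters; dashes, ellipses, and so on would need their own entries.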
#5 |
Captain Courageous
Posts: 239
Karma: 102
Join Date: Apr 2009
Device: calibre, PRS 505
|
Thanks, I saved your suggestions to a tip file I keep on calibre.
I did solve it by loading the file into Word, saving it as HTML, and then saving that as a text file. I then converted the resulting text file to EPUB. Long way around, but it worked!
#6 |
Wizard
Posts: 4,553
Karma: 950151
Join Date: Nov 2008
Device: Sony PRS-950, iphone/ipad (Marvin/iBooks/QuickReader)
|
You should have been able to convert the HTML file directly to EPUB.
|
#7 |
Captain Courageous
Posts: 239
Karma: 102
Join Date: Apr 2009
Device: calibre, PRS 505
|
Quote:
Thanks, Paul
UPDATE: The solution was right under my nose! I just loaded the Textified version into Word, saved it as txt (Unicode UTF-8), and then converted that. Result: a perfectly formatted EPUB file! Thanks for all your contributions.
Last edited by p3aul; 10-14-2009 at 05:10 PM.
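Re-saving as UTF-8 can also be done without Word. Here is a minimal Python sketch, assuming the original file is in cp1252 (that is a guess, not something confirmed in this thread); the function name `reencode` and any file names passed to it are hypothetical.

```python
from pathlib import Path


def reencode(src, dst, source_encoding="cp1252"):
    """Read src using the assumed source encoding and write dst as UTF-8."""
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")
```

For example, `reencode("book.txt", "book-utf8.txt")` would write a UTF-8 copy next to the original. If the cp1252 guess is wrong, the copy will contain the wrong characters, so it is worth checking a few apostrophes in the output before converting.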
|
#8 |
Sigil & calibre developer
Posts: 2,487
Karma: 1063785
Join Date: Jan 2009
Location: Florida, USA
Device: Nook STR
|
Specify the encoding of the input file when you are converting.
--input-encoding cp1252
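If you script your conversions, the same option can be passed from Python. This is only a sketch: it assumes calibre's `ebook-convert` command-line tool is installed and on your PATH, and the file names and the helper names `build_convert_command` and `convert` are hypothetical.

```python
import subprocess


def build_convert_command(src, dst, input_encoding="cp1252"):
    """Build the argument list for ebook-convert with an explicit input encoding."""
    return ["ebook-convert", src, dst, "--input-encoding", input_encoding]


def convert(src, dst, input_encoding="cp1252"):
    """Run the conversion; raises CalledProcessError if ebook-convert fails."""
    subprocess.run(build_convert_command(src, dst, input_encoding), check=True)


# Show the command that would be run for a hypothetical book.txt.
print(" ".join(build_convert_command("book.txt", "book.epub")))
```

Nothing is converted until `convert(...)` is actually called; the last line just prints the command so you can compare it with what you would type by hand.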