Old 10-17-2009, 08:31 AM   #1
crutledge
eBook FANatic
Posts: 18,301
Karma: 16071131
Join Date: Apr 2008
Location: Alabama, USA
Device: HP ipac RX5915 Wife's Kindle
Encoding of Emdash

I have been running into a series of PG books encoded in charset=iso-8859-1. The emdash is encoded as #8212 followed by a soft hyphen #173. I go into the PG file in the editor and replace these with #151.

I would like to add this to the Book Cleaner. The emdash seems to be handled by 2.bcf showing:
find what: uni(137)
replace with: uni(151)

I must be mis-interpreting something because I cannot reference uni(137) with the endash.

Would someone point me in the right direction?

Charlie
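
The three numbers in the post above can be checked against Python's built-in codecs. This is only a sketch for orientation: it assumes the numbers are decimal character codes, which the thread does not state outright, and it says nothing about how Book Designer itself reads them.

```python
import unicodedata

EM_DASH = "\u2014"

# 8212 taken as a Unicode code point
code_8212 = chr(8212)
# 151 taken as a single byte in windows-1252
code_151_cp1252 = bytes([151]).decode("cp1252")
# 151 taken as a single byte in iso-8859-1 (latin-1)
code_151_latin1 = bytes([151]).decode("latin-1")
# 173 taken as a Unicode code point
code_173 = chr(173)

print(unicodedata.name(code_8212))            # name of code point 8212
print(code_151_cp1252 == EM_DASH)             # byte 151 in windows-1252
print(unicodedata.category(code_151_latin1))  # byte 151 in iso-8859-1 is a control code
print(unicodedata.name(code_173))             # name of code point 173
```

In other words, 151 only means "emdash" when the byte is read as windows-1252; in true iso-8859-1 that position is a control character, while 8212 is the emdash's Unicode code point.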
Old 10-17-2009, 08:48 AM   #2
HarryT
eBook Enthusiast
Posts: 85,544
Karma: 93383043
Join Date: Nov 2006
Location: UK
Device: Kindle Oasis 2, iPad Pro 10.5", iPhone 6
Quote:
Originally Posted by crutledge View Post
I have been running into a series of PG books encoded in charset=iso-8859-1. The emdash is encoded as #8212 followed by a soft hyphen #173. I go into the PG file in the editor and replace these with #151.

I would like to add this to the Book Cleaner. The emdash seems to be handled by 2.bcf showing:
find what: uni(137)
replace with: uni(151)

I must be mis-interpreting something because I cannot reference uni(137) with the endash.

Would someone point me in the right direction?

Charlie
You need to look at "1.bcf" (which runs as the file is loaded) and "2.bcf" (which runs after BD has done all its initial default processing to the file) as a pair, Charlie.

By default, BD converts dashes into hyphens, so this is a "workaround" to stop it from doing so. If you look at "1", you'll see that it replaces #151 with #137, and then "2" replaces #137 with #151 again. The effect of this is to make BD "preserve" dashes.

What you need to do is to edit "1.bcf" and tell it to replace your character sequence with #137.
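
The placeholder round trip described above can be modelled in a few lines of Python. This is a toy model, not Book Designer's actual code: the function names, the rule lists, and the `default_processing` step (which simply turns dashes into hyphens) are all assumptions made for illustration, and the real .bcf file format is not reproduced here.

```python
PLACEHOLDER = chr(137)   # stand-in character used by the two .bcf files
DASH_151 = chr(151)      # the dash as the existing rules refer to it
DASH_8212 = chr(8212)    # the dash as it appears in the problem files
SOFT_HYPHEN = chr(173)

# Role of "1.bcf": hide dashes before BD's default processing runs.
DEFAULT_PRE_RULES = [(DASH_151, PLACEHOLDER)]
# Hypothetical extra rule for the sequence described in post #1.
EXTENDED_PRE_RULES = [(DASH_8212 + SOFT_HYPHEN, PLACEHOLDER)] + DEFAULT_PRE_RULES
# Role of "2.bcf": restore the dash afterwards.
POST_RULES = [(PLACEHOLDER, DASH_151)]


def apply_rules(text, rules):
    """Apply find/replace pairs in order, like rows of a rule table."""
    for find, replace in rules:
        text = text.replace(find, replace)
    return text


def default_processing(text):
    """Assumed stand-in for BD's built-in dash-to-hyphen conversion."""
    for dash in (DASH_151, DASH_8212):
        text = text.replace(dash, "-")
    return text


def round_trip(text, pre_rules):
    """Pre-pass, default processing, then post-pass."""
    text = apply_rules(text, pre_rules)
    text = default_processing(text)
    return apply_rules(text, POST_RULES)


sample = "wait" + DASH_8212 + SOFT_HYPHEN + "here"
print(repr(round_trip(sample, DEFAULT_PRE_RULES)))   # dash lost to a hyphen
print(repr(round_trip(sample, EXTENDED_PRE_RULES)))  # dash preserved as chr(151)
```

With only the default rule, the 8212 dash is never hidden, so the middle step flattens it to a hyphen. Adding a pre-pass rule for the new sequence lets it ride through as the placeholder and come back out as #151.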
Old 10-17-2009, 11:47 AM   #3
crutledge
Harry,
Is this correct?
find: uni(8212) replace with uni(137)

Charlie
Old 10-17-2009, 11:53 AM   #4
HarryT
Sounds right to me, Charlie, but the best way to find out is to try it!
Old 10-17-2009, 03:31 PM   #5
crutledge
It works!
Thanks, Harry.

Charlie
Old 10-17-2009, 05:41 PM   #6
JSWolf
Resident Curmudgeon
Posts: 73,660
Karma: 127838196
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Quote:
Originally Posted by crutledge View Post
Harry,
Is this correct?
find: uni(8212) replace with uni(137)

Charlie
Should I add that into the Book Cleaner files?
Old 10-18-2009, 11:24 AM   #7
crutledge
Jon,
I don't see how it could hurt. I don't know how many folks have run into it. I guess it could be the way I process the PG files.

I download the complete HTM and the text file. I use the Gutenberg Prettifier to convert the text file to HTML. This gets rid of page numbers and other annoying junk. I then load the HTML file into BD, place the BD display alongside the original complete HTM, and move down page by page to format the BD file.

The HTM file shows charset=windows-1252. This should work fine with the BCF as it stands. The HTML file specifies no charset.

It seems to be the Gutenberg Prettifier inserting the different codes. I guess only those using the Prettifier will run into this. As to why the Prettifier is suddenly producing these codes, I have yet to determine.

I'm not sure it is worth your time and effort in making the change and then distributing the results.

Charlie
Old 10-19-2009, 09:43 AM   #8
crutledge
Jon,
Somewhere I have totally screwed up. The attempt to use the BCF to catch the 8212 really doesn't work! I don't know how, but I did.

I made the following change:

ENTRY IN 1.BCF
find what: uni(8212) replace by: uni(137)


I built the following test file:

HTML TEST FILE
Code:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
  <title></title>
  <meta http-equiv="content-type" content="text/html;charset=us-ascii" />
</head>
<body>

<p>Test of emdash code 151:  &#151;</p>
<br/>
<br/>
<p>Test of emdash code 8212:  &#8212;</p>

</body>
</html>

IE displays the following:
Test of emdash code 151: —
Test of emdash code 8212: —

BOOK DESIGNER displays:
Test of emdash code 151: —
Test of emdash code 8212: -
This file was created with BookDesigner program
bookdesigner@the-ebook.org
10/19/2009

I am obviously into something I don't understand. Comments please.
Charlie
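
One way to see why a browser shows the same dash for both lines: assuming the test file used the numeric character references `&#151;` and `&#8212;` (the charset is us-ascii, so a literal dash could not be stored in it), HTML parsers treat references in the 128-159 range as windows-1252 positions. Python's standard `html.unescape` follows the same HTML5 rule, so it can serve as a rough stand-in for what IE and Firefox do. Whether Book Designer applies that rule is not known; its output in the post above suggests it treats the two references differently.

```python
import html

# Both references as a browser-style HTML parser would decode them.
ref_151 = html.unescape("&#151;")
ref_8212 = html.unescape("&#8212;")

print(ref_151 == ref_8212)   # same character after decoding
print(ord(ref_151))          # code point the browser ends up with
print(ord(ref_8212))
```

So a rule that looks for 151 and a rule that looks for 8212 can both be "right" depending on whether the tool sees the text before or after this kind of remapping.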
Old 10-19-2009, 10:07 AM   #9
JSWolf
On my screen (with Firefox), both encodings look the same. Do you have a file you know is properly encoded? If so, can you zip it and attach it here?
Old 10-19-2009, 10:30 AM   #10
crutledge
Couldn't upload as html file. Change .txt to .html.
Charlie
Attached Files
File Type: txt emdash.txt (302 Bytes, 263 views)
Old 10-27-2009, 08:31 PM   #11
crutledge
Jon,
As the old folks used to say "Even a blind hog will find an acorn sometimes."

I have continued to play with the BCF file and think I have it working. I have tested it on several files with no problems.

After looking at the BCF file every way I could, including HEX format, I finally inserted a row as the third row of the table instead of adding uni(8212) as the last row. What will be interesting will be to see what happens on the next one.

Perhaps it did work at the beginning and somehow I screwed up. Anyway, give it a try if you like. I'll let you know what happens.

Charlie
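
The thread never establishes why moving the row made a difference, and the contents of 1.bcf are not shown. One possible mechanism, offered purely as an assumption, is that the rows run top to bottom and an earlier row consumes the dash before a row appended at the end can see it. The sketch below illustrates that idea with a made-up earlier rule; `early_rule` is hypothetical and does not come from any real .bcf file.

```python
EM_DASH = chr(8212)
PLACEHOLDER = chr(137)


def apply_rules(text, rules):
    """Apply find/replace pairs strictly in table order."""
    for find, replace in rules:
        text = text.replace(find, replace)
    return text


# Hypothetical existing row that turns the dash into a hyphen.
early_rule = (EM_DASH, "-")
# The new row from this thread: dash to placeholder.
new_rule = (EM_DASH, PLACEHOLDER)

appended_last = [early_rule, new_rule]    # new row added at the end
inserted_early = [new_rule, early_rule]   # new row placed ahead of the other

sample = "wait" + EM_DASH + "here"
print(repr(apply_rules(sample, appended_last)))   # new row never matches
print(repr(apply_rules(sample, inserted_early)))  # new row wins
```

If something like this is going on, the position of the row matters more than its content, which would fit the behaviour reported above.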
All times are GMT -4.