Old 03-04-2008, 08:30 PM   #1
llasram
Reticulator of Tharn
Posts: 618
Karma: 400000
Join Date: Jan 2007
Location: EST
Device: Sony PRS-505
Obelisk -- legal distribution of format-shifted copyrighted works

I love curly quotation marks. They're so round and inviting. I also love free e-books, and so have been delighted by Tor's current free–e-book–each–week program. Perhaps by Tor my loves may be joined? But alas not – the HTML versions Tor provides have ASCII quotation marks, and when I asked if this could be rectified, I was told “I'm afraid the quotation-mark conversion has to stay.”

So for Robert Charles Wilson’s Spin I rolled up my crazy-sleeves, pulled out my regexps, and fixed them myself. Every last one. And modified the CSS and some of the markup to much more closely resemble the formatting in the PDF version. Then wrapped it up as a valid .epub book. Then converted/tweaked to produce a great-looking Sony Reader BBeB book.

And they’re all for only me! Nope, can’t give them to you. The power of copyright compels me! I can add those curly quotes myself because I have the source HTML to start with. If I start handing people my curly-quoted version I have no means to stop it from falling into new hands which didn’t already have the straight-from-Tor edition.

Or do I?

I could provide you with a grid of just the byte offsets of the various curly quotes. Some extreme variant of diff/patch in which nothing of the original copyrighted text persists. It would contain just my curly quotes, owned by me under copyright law and free to give you as I wish. You provide the straight-from-Tor e-book, mix in my curly quotes and poof! – you have a be-curled edition of Spin. But this doesn’t work for format-shifting over compression, encoding changes, etc., where “put a curly quote here” ceases to make sense.

Unless we distill the idea down to the lowest level – what is XOR but the difference between two bits?
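
To make the XOR idea concrete, here is a minimal Python sketch. It is not obelisk.py, whose source is not shown in this thread; the function name `xor_bytes` and the sample strings are invented for illustration. It only shows that XORing a derived file against the original yields a "difference" that is undone by XORing against the same original again.

```python
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with key, repeating key if it is shorter than data."""
    if not key:
        raise ValueError("key must not be empty")
    return bytes(d ^ k for d, k in zip(data, cycle(key)))

# Hypothetical stand-ins for Tor's file and the curly-quoted version.
original = b'He said "hello" to me.'
derived = 'He said \u201chello\u201d to me.'.encode("utf-8")

delta = xor_bytes(derived, original)    # the difference one would hand out
restored = xor_bytes(delta, original)   # holders of the original can undo it
assert restored == derived
```

Because XOR is its own inverse, the same function serves for producing the difference and for applying it.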

Let’s try an experiment, which I’m calling Obelisk[1]. Download the following files:

Then get your copy of WilsonSpin_HTML.zip handy, pop open your favorite shell, and run:

Code:
python obelisk.py Mohm5pei WilsonSpin_HTML.zip Mohm5pei#WilsonSpin_HTML.zip#Spin.epub.obelisk Spin.epub
python obelisk.py AhZe5shu WilsonSpin_HTML.zip AhZe5shu#WilsonSpin_HTML.zip#Spin.lrf.obelisk Spin.lrf
The results should be curly-quoted .epub and BBeB versions of Spin, seamlessly merging Tor’s bits with mine into unified wholes.

Let me know what you think.

[1] Obelisk is similar to and inspired by a “project” called Monolith, although with rather different goals.
Old 03-04-2008, 08:46 PM   #2
kovidgoyal
creator of calibre
Posts: 43,843
Karma: 22666666
Join Date: Oct 2006
Location: Mumbai, India
Device: Various
Assuming the source file has an even number of quotes, shouldn't replacing them with curly quotes be as simple as

Code:
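def curl_quotes(text):
    """Alternate left/right curly quotes for each '"' outside of HTML tags."""
    data = list(text)  # strings are immutable, so work on a list of characters
    intag = False
    inquote = False
    for i, ch in enumerate(data):
        if ch == '<':
            intag = True
        elif ch == '>':
            intag = False
        elif not intag and ch == '"':
            # Close the quotation if one is open, otherwise open a new one
            data[i] = '\u201d' if inquote else '\u201c'
            inquote = not inquote
    return ''.join(data)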
Or is there something about curly quotes I'm missing?
Old 03-04-2008, 09:07 PM   #3
llasram
Quote:
Originally Posted by kovidgoyal
Assuming the source file has an even number of quotes, shouldn't replacing them with curly quotes be as simple as
It’s mostly mechanizable, but not quite that simply. For example:
“This quotation-marked bit goes on for more than one paragraph. It doesn’t end with a double quote.

“And here I have some ‘examples’ of single quotes. I’ve got several of ’em. The examples’ quotation marks point in all kinds of directions.

“And here ends the quote.”
So pretty much the rules are:

Code:
<ws>" == “
"<ws> == ”
\w'\w == ’
'<ws> == ’
<ws>' == ‘
Where <ws> is whitespace plus ( ) [ ] - – —.

But then you have to manually check all the instances of “<ws>‘” and probably start by looking for any quotation marks with white space on both sides (usually found when doing "something like 'this' ").

So anyway. Mostly mechanizable, but still some manual labor to get it perfect. And can’t automate improving the CSS. :-)
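
The five rules above can be sketched in Python with regular expressions. This is one possible reading of them, not the regexps actually used on Spin. The names `curl_quotes` and `leftovers` are hypothetical, the sketch works on plain text rather than HTML, and treating the start and end of the text as <ws> is an added assumption.

```python
import re

# <ws> as defined in the post: whitespace plus ( ) [ ] - and the en and em dashes.
WS = r"[\s()\[\]\-\u2013\u2014]"

# Rules are applied in order. Counting start/end of text as <ws> is an assumption.
RULES = [
    (re.compile(rf'(?:^|(?<={WS}))"'), "\u201c"),  # <ws>"  -> left double quote
    (re.compile(rf'"(?={WS}|$)'), "\u201d"),       # "<ws>  -> right double quote
    (re.compile(r"(?<=\w)'(?=\w)"), "\u2019"),     # \w'\w  -> apostrophe
    (re.compile(rf"'(?={WS}|$)"), "\u2019"),       # '<ws>  -> right single quote
    (re.compile(rf"(?:^|(?<={WS}))'"), "\u2018"),  # <ws>'  -> left single quote
]

def curl_quotes(text: str) -> str:
    """Apply each rule in turn; quotes that match no rule stay straight."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text

def leftovers(text: str) -> list[int]:
    """Offsets of straight quotes that still need a manual decision."""
    return [m.start() for m in re.finditer(r"[\"']", text)]

print(curl_quotes("The examples' marks and 'these' ones"))
```

Anything `leftovers` reports, plus cases the rules get wrong (a leading apostrophe as in 'em comes out as a left single quote), is the manual labor mentioned above.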
Old 03-04-2008, 09:16 PM   #4
kovidgoyal
Ah, I see. Well, let's see if Tor starts beating on your door in the middle of the night.
Old 03-04-2008, 10:38 PM   #5
JSWolf
Resident Curmudgeon
Posts: 73,894
Karma: 128597114
Join Date: Nov 2006
Location: Roslindale, Massachusetts
Device: Kobo Libra 2, Kobo Aura H2O, PRS-650, PRS-T1, nook STR, PW3
Um... it won't work because it never came zipped. And how do we know the filename to use in the ZIP file or even if we have the exact same contents?
Old 03-04-2008, 11:21 PM   #6
llasram
Quote:
Originally Posted by JSWolf
Um... it won't work because it never came zipped. And how do we know the filename to use in the ZIP file or even if we have the exact same contents?
The e-mails actually contain links to two separate HTML versions. One is the HTML content served directly, the other is a ZIP archive which contains the images used in the book, a (broken) OPF file, etc.

Last edited by llasram; 03-04-2008 at 11:21 PM. Reason: Fix typo.
Old 03-04-2008, 11:34 PM   #7
JSWolf
Quote:
Originally Posted by llasram
The e-mails actually contain links to two separate HTML versions. One is the HTML content served directly, the other is a ZIP archive which contains the images used in the book, a (broken) OPF file, etc.
Yes, you are correct. My apologies. I'll give your script another go and see how it works out.
Old 03-04-2008, 11:39 PM   #8
JSWolf
How do I use your script to generate a diff file for other content? I'd love to do one for Mistborn based on the PDF to make the LRF from it.
Old 03-04-2008, 11:52 PM   #9
JSWolf
I've taken the EPUB edition and built an LRF to my specification. Looks nice. Now all I need to do is build a proper ToC and I'll be all set.
Old 03-05-2008, 08:10 AM   #10
llasram
Quote:
Originally Posted by JSWolf
How do I use your script to generate a diff file for other content? I'd love to do one for Mistborn based on the PDF to make the LRF from it.
It's symmetric, so:

Code:
python obelisk.py SALT KEYFILE INFILE OUTFILE
For both decryption and encryption. The SALT parameter is some string of your choosing but should not be reused for a particular KEYFILE. For example:

Code:
python obelisk.py sai3sahS 9780765350381.zip Mistborn.lrf sai3sahS#9780765350381.zip#Mistborn.lrf.obelisk
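
As a rough illustration of why one command can both encrypt and decrypt, here is a Python sketch of a salted XOR scheme. It is a guess at the general shape only: the internals of obelisk.py are not shown in this thread, and the `keystream` and `transform` functions, the use of SHA-256, and the sample byte strings are all assumptions made for the example.

```python
import hashlib

def keystream(salt: bytes, key: bytes, length: int) -> bytes:
    """Expand salt + key into `length` bytes by hashing a seed with a counter."""
    seed = hashlib.sha256(salt + b"\x00" + hashlib.sha256(key).digest()).digest()
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])

def transform(salt: str, key: bytes, data: bytes) -> bytes:
    """XOR data with the salted keystream; applying it twice restores data."""
    stream = keystream(salt.encode("utf-8"), key, len(data))
    return bytes(d ^ s for d, s in zip(data, stream))

# Hypothetical stand-ins for the KEYFILE and INFILE contents.
key = b"contents of the original zip"
book = b"contents of the converted book"

blob = transform("sai3sahS", key, book)           # "encrypt"
assert transform("sai3sahS", key, blob) == book   # the same call "decrypts"
```

In any scheme of this shape, reusing a salt with the same key file yields the same keystream, so XORing two outputs together cancels it and exposes the XOR of the two inputs. That is one plausible reason for the advice not to reuse a SALT with a given KEYFILE.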
HOWEVER – I am not a lawyer. This certainly seems reasonable given that one needs the original file to reconstitute the derived file, but I don’t really know if Tor and/or your nation’s legal system will see it that way. This is an experiment – use at your own risk.
Old 03-05-2008, 10:11 AM   #11
JSWolf
I got it to work. Thank you. This will make it a lot easier now to post conversions without having to post the converted file.
Old 03-05-2008, 10:38 AM   #12
NatCh
Gizmologist
Posts: 11,615
Karma: 929550
Join Date: Jan 2006
Location: Republic of Texas Embassy at Jackson, TN
Device: Pocketbook Touch HD3
Heh, we may have to have another category in the Book Uploads area.
Old 03-05-2008, 10:39 AM   #13
JSWolf
Quote:
Originally Posted by NatCh
Heh, we may have to have another category in the Book Uploads area.
That is a good idea.
Old 03-05-2008, 12:59 PM   #14
llasram
Quote:
Originally Posted by kovidgoyal
Ah, I see. Well, let's see if Tor starts beating on your door in the middle of the night.
I decided to – you know – just go ahead and ask them. One Laurence Hewitt first says no, but I ask:

Quote:
Just to clarify -- you are asking me not to distribute the derivative work even in the form which requires the recipient of the derived work already have a copy of the original?

Which to clarify one step further, what I'm actually wishing to distribute is instructions for transforming the original version of the e-book into my modified version. So not directly distributing either the original or the derived version, but the means to transform the original into the derived version.
And he responds:

Quote:
Sorry, I utterly misunderstood your message. I've gotten into the bad habit of merely skimming my e-mail rather than actually reading it. Yes, of course you can distribute.
So Tor at least is fine with this. Hooray! :-)
Old 03-05-2008, 01:07 PM   #15
JSWolf
Nice! I don't see any issue because, as you said, without the source the diff file is useless. If you've downloaded the PDF of Mistborn, you can get my LRF conversion over at https://www.mobileread.com/forums/sho...356#post156356
All times are GMT -4.