Quote:
Originally Posted by j.p.s
(US specific) 80% of books published 1924-63 allegedly in public domain
archive.org has a human readable list of copyright registrations.
The NY Public Library encoded the list into XML. Data mining revealed 80% of the copyrights were never renewed.
|
I'm not sure why this news is making the rounds again now...
Stanford's "Copyright Renewal Database" has been around for many years now.
You can check whether a book's copyright was renewed by searching here:
https://exhibits.stanford.edu/copyri...s?forward=home
You can even download the raw CSV data.
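If you'd rather search the downloaded CSV locally than use the web form, something like the following rough Python sketch might do. I'm assuming the export has a header row with a column called `title`; I haven't checked the actual column names, so adjust `title_field` to match the real file. The sample rows are made-up placeholders, not real renewal records.

```python
import csv
import io


def find_renewals(csv_file, title_query, title_field="title"):
    """Return rows whose title contains title_query (case-insensitive).

    title_field is an assumed column name; change it to whatever the
    real export uses.
    """
    needle = title_query.casefold()
    reader = csv.DictReader(csv_file)
    return [
        row
        for row in reader
        if needle in (row.get(title_field) or "").casefold()
    ]


# Made-up placeholder rows standing in for the real export.
SAMPLE = """title,author,renewal_id
Example Book One,A. Author,R000001
Another Example,B. Writer,R000002
"""

matches = find_renewals(io.StringIO(SAMPLE), "example book")
for row in matches:
    print(row["title"], row["renewal_id"])
```

With the real file you'd pass something like `open("renewals.csv", newline="", encoding="utf-8")` instead of the `StringIO` (the filename is hypothetical). Keep in mind that finding no match doesn't prove a book is in the public domain: the database only covers Class A (book) renewals, and titles can be entered differently from what you expect.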
Here's what they've had to say about copyright renewals from that period:
Quote:
Originally Posted by Copyright Renewal Database
The period from 1923 to 1963 is of special interest for US copyrights: starting in 1964, works' copyrights were automatically renewed by statute; and works published before 1923 have generally fallen into the public domain. But between those dates, a renewal registration was required to prevent the expiration of copyright. However, it's challenging to determine whether a work's registration has been renewed. Renewals received by the Copyright Office after 1977 are searchable in an online database, but renewals received between 1950 and 1977 were announced and distributed only in a semi-annual print publication. The Copyright Office does not have a machine-searchable source for this renewal information, and the only public access is through the card catalog in their DC offices.
|
And here's what they say on their About page:
Quote:
Stanford's Copyright Renewal Database compiles all US Class A (book) renewal registrations for works published between 1923 and 1963. These renewals were received by the US Copyright Office between 1950 and 1993.
[...]
Building on work done by Project Gutenberg to transcribe the 1950-1977 renewals, and on early conversion efforts by Michael Lesk, we have converted the published renewal announcements to machine-readable form, and combined them with the renewals for later years made available on the Copyright Office's website. Note that this database covers only renewals, not original registrations, and is limited to books (Class A registrations) published in the US. Note that the Catalog of Copyright Entries, and therefore this database, does not include entries for assignments, and so cannot be used for searches involving the ownership of rights. Please also note that copyrights restored under Section 104(a) of the copyright act are not represented in this database.
Stanford has performed two rounds of testing in order to assess the accuracy of this database. In each round, we pulled a minimum of 500 book titles published in the US between 1923 and 1963 from the Stanford library catalog. The works were checked manually in the CCE, and, in the first round, a subset of 100 records was also sent to the Copyright Office to be checked by their in-house staff. Each of these items was then separately searched by project staff in the Copyright Renewals Database. In each round, the error rate for the database was found to be less than 1%, although in practice there is significant opportunity for user error or other problems in searching. Details of these issues can be found in Stanford's final report to Hewlett on the project [...]
|
Edit: Ahh, now that I've clicked through the links in the article from the original post, it makes more sense. The actual source article at NYPL is much better and goes into much more detail:
"U.S. Copyright History 1923–1964" by Sean Redmond