Quote:
Originally Posted by salamanderjuice
LLMs can't recreate the entirety of the source texts.
I just went to chatgpt.com and asked for the text of
Moby Dick. It noted that the book is public domain and provided the text.
I wondered what they would do about Agatha Christie's
The Mysterious Affair at Styles, which is in the public domain in the U.S. but hardly anywhere else.
Answer: They told me it was public domain in the U.S., and provided the text.
Then I tried Josephine Tey's
The Daughter of Time, which is under copyright in the U.S. but hardly anywhere else. They declined to provide the text, citing copyright.
So they do follow copyright rules when it comes to providing the full text of works used to train the model, even if they have a very expansive idea of fair use.
As for learning, this depends on the definition of learning. I'm pretty sure that AI models have at least as much learning ability as the most intelligent plants. Concerning plants, see:
Learning in Plants: Lessons from Mimosa pudica