#1 | |
Grand Sorcerer
Posts: 7,415
Karma: 43514536
Join Date: Jun 2008
Location: near Philadelphia USA
Device: Kindle Kids Edition, Fire HD 10 (11th generation)
|
Artificial Intelligence and Publishing
The End of Publishing as We Know It
Quote:
P.S. Apologies if I put this in the wrong area of Mobileread. The topic might be seen as slightly political, but I do not perceive a left-right divide.
Last edited by SteveEisenberg; 06-25-2025 at 09:00 PM. |
|
#2 |
Readaholic
Posts: 5,277
Karma: 90000484
Join Date: Sep 2011
Location: South Georgia
Device: Surface Pro 6 / Galaxy Tab A 8"
|
Book and music publishers said the same thing when downloading music and books started to become more popular. They are still around, and would have had fewer problems if they had embraced the technology rather than fighting it.
Apache |
#3 |
Grand Sorcerer
Posts: 7,921
Karma: 70186655
Join Date: Feb 2009
Device: Kobo Clara 2E
|
"LLMs can hoover up data from books, judge rules"
https://www.theregister.com/2025/06/...m_training_ok/ |
#4 | |
Grand Sorcerer
Posts: 7,415
Karma: 43514536
Join Date: Jun 2008
Location: near Philadelphia USA
Device: Kindle Kids Edition, Fire HD 10 (11th generation)
|
Quote:
To keep this concrete, suppose that, fifteen years from now, I put this request in the latest version of ChatGPT: “Take the most recent university press history of Mexico and, avoiding legally defined plagiarism, rewrite it in the style of Robert Caro as edited by Robert Gottlieb.” How would this work with or without the publisher having embraced the technology? |
|
#5 | |
Addict
Posts: 320
Karma: 2228060
Join Date: Dec 2013
Location: LaVernia, Texas
Device: kindle epub readers on android
|
Quote:
|
|
#6 |
Still reading
Posts: 13,943
Karma: 103895653
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
|
Except those are false comparisons with paint spray and MP3. Those work. LLMs don't "work". The marketing is a lie, and it relies on other people's work (which is essentially being stolen).
Buying a S/H or retail copy for LLM training isn't fair use. Those are for one-off consumption by ordinary people, not to fuel a corporation. |
#7 |
Grand Sorcerer
Posts: 7,415
Karma: 43514536
Join Date: Jun 2008
Location: near Philadelphia USA
Device: Kindle Kids Edition, Fire HD 10 (11th generation)
|
I agree that buying a single retail copy for a Large Language Model (LLM) should not be legal. But what is S/H?
Last edited by SteveEisenberg; 06-26-2025 at 09:04 AM. |
#8 |
Grand Sorcerer
Posts: 13,472
Karma: 78880114
Join Date: Nov 2007
Location: Toronto
Device: Libra H2O, Libra Colour
|
Second hand
|
#9 |
Grand Sorcerer
Posts: 28,548
Karma: 204127028
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
I am bored with all "AI is gonna <insert FUD/SEMANTICS/DOGMA here>" threads.
Last edited by DiapDealer; 06-26-2025 at 03:13 PM. |
#10 | |||
Grand Sorcerer
Posts: 7,415
Karma: 43514536
Join Date: Jun 2008
Location: near Philadelphia USA
Device: Kindle Kids Edition, Fire HD 10 (11th generation)
|
DiapDealer might not agree, but this Publishers Weekly take, out today, seems to me to give some new meat for these discussions:
Judge Writes Roadmap for Authors’ Revenge
Quote:
"Summarize the life of Lyndon Baines Johnson in approximately 100,000 words."
The response was:
Quote:
"Summarize the life of Lyndon Baines Johnson in approximately 200,000 words."
The response was:
Quote:
Last edited by SteveEisenberg; 06-27-2025 at 07:05 AM. |
|||
#11 |
Member
Posts: 11
Karma: 12990
Join Date: Jun 2019
Location: My own private reality
Device: Nook Classic, Kindle Fire 7, Kindle Gen 7 Paper White, Fire 8 HD
|
TL;DR
First, LLMs and books are not at all like downloading music. A better comparison would be sampling music to create new music. Imagine where the music publishers would be today if 20(?) years ago people could have lifted instrument/vocal tracks from a bunch of songs, mixed them into new songs, and sold the new songs without giving any money back to the original artists. Just imagine a seamless duo of Alice Cooper & Karen Carpenter singing an Easy Listening version of "Bulls on Parade".

The copyright battles between the samplers and the music publishers were easier than the coming copyright battles between LLM producers and book publishers will be; the written word space is so much larger than the music space. I'm not actually worried about the publishers; with ebooks and self-publishing getting easier, their current marketing model has a limited lifespan.

The danger of LLMs is their lack of creativity. They can 'fake' creativity if they have a large enough data set to come up with a response that looks creative, hence the need to consume huge amounts of creatively generated content. This is also why LLMs consuming LLM-generated content is bad for the LLM: it has 0 creative content.

The judge in the copyright case of LLM vs Publishers/authors missed the point. LLMs, and their non-text-based cousins, consume and store structured data, like text, in such a way as to make the structured data easier to scan and relate to other consumed structured data. In effect they are storing the entire content of a consumed book, or at least most of it, in a database, probably in some form of a relationship graph, and then using that data to answer questions from the public for free or for a cost. This could be considered to be like a model of a research librarian, except that the LLM is answering multiple questions at the same time.

If multiple questions use the same relationship edge, then this is like having multiple librarians using the same copy of the book that created that relationship. This is not yet a problem, as the librarians are very fast, know exactly where in the book they need to look, and can put the book back very quickly. However, a single book will generate a multitude of relationships, and it's likely that for a given question, if one relationship from a book is used, more than one will be used. With a large contextual memory space available, generally at more dollar cost, more of those relationships can be tracked. This leads to the idea that each question that uses a relationship effectively creates a copy of that relationship, in essence creating a copy of the book that generated it.

Consuming a large number of texts about a broad range of subjects increases the chance that more than one text will create the same relationship. But if the LLM doesn't track how many texts contributed to the relationship, that acts as a multiplier, increasing the number of virtual texts being created.

This exposes the actual value a text, or piece of structured data, has to an LLM. It is not the words of the text that are important; it is the relationship between those words that is important. Want to poison an LLM? Feed it a large text composed of grammatically correct sentences of random words. |
#12 | ||
Addict
Posts: 267
Karma: 5500000
Join Date: Sep 2024
Device: Kobo Clara BW
|
Quote:
It's certainly a subject that needs discussing, and one that a lot of people have questions and concerns about. I personally don't want to see the publishing/ebook world end up flooded with endless AI-generated fluff (to put it kindly).

Look at the current state of the Amazon ebook store in particular: there are plenty of good books, but there is also tonnes of absolute garbage, seemingly cobbled together by people just looking to make a quick buck and not interested in writing something good. Imagine how bad it could get if people didn't even need to go to the trouble of sitting down and typing something rubbish but could just ask an AI to do it for them. Amazon users could be swamped with an avalanche of utter drivel. Who would want to wade through a market saturated with AI drivel?

As District Judge Vince Chhabria said (as linked in SteveEisenberg's post):
Quote:
Last edited by Graham44; 06-27-2025 at 02:54 AM. Reason: Typo |
||
#13 |
Somewhat clueless
Posts: 772
Karma: 9999999
Join Date: Nov 2008
Location: UK
Device: Kindle Oasis
|
This is the vital point that many commentators are missing. I've spent some time asking ChatGPT questions to which I know the answers, and almost always it returns something that sounds convincing and plausible, but is just plain wrong (often ludicrously so).
The danger is that most people asking ChatGPT something will not know the answer (that's why they're asking), so will accept the response as truth. |
#14 |
Grand Sorcerer
Posts: 7,415
Karma: 43514536
Join Date: Jun 2008
Location: near Philadelphia USA
Device: Kindle Kids Edition, Fire HD 10 (11th generation)
|
Here are examples to try at chatGPT.com:
Write a memoir of a heart surgeon in the style of William Manchester.
Produce a chapter on Porfirio Díaz’s regime writing in the style of Robert Caro as edited by Robert Gottlieb.
Rewrite "France in the Middle Ages," published by Wiley-Blackwell, in the style of "An Army at Dawn" by Rick Atkinson.

You'll see that they limit the length of the output, either to try to tamp down the copyright issues or because I am using the free version of ChatGPT, or both. But if you could create brief excerpts and run them through over and over, maybe with a software script, it shows a problem for non-fiction publishers going forward.

As mentioned previously, there also are issues for self-publishing, maybe worse ones. When I read a memoir, I want to think I am getting in touch with an actual person. If published by Random House, maybe most of the work was done by the editorial team, but it feels like there is a real person there. AI will not change that. But if self-published, and with ChatGPT even slightly improved, I would have no idea.

There never will be a shortage of good fiction, so I do not post in a thread like this about that. |
#15 | |
Grand Sorcerer
Posts: 28,548
Karma: 204127028
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
|
Quote:
|