Old 02-06-2025, 07:24 PM   #1
Frogm4n
Addict
Posts: 369
Karma: 3003003
Join Date: Jul 2023
Device: Scribe, OA2, Glo HD, PRS-350
Meta admits to training LLM AI with terabytes of torrented copyrighted works.

https://arstechnica.com/tech-policy/...i-authors-say/

Quote:
Last month, Meta admitted to torrenting a controversial large dataset known as LibGen, which includes tens of millions of pirated books. But details around the torrenting were murky until yesterday, when Meta's unredacted emails were made public for the first time. The new evidence showed that Meta torrented "at least 81.7 terabytes of data across multiple shadow libraries through the site Anna’s Archive, including at least 35.7 terabytes of data from Z-Library and LibGen," the authors' court filing said. And "Meta also previously torrented 80.6 terabytes of data from LibGen."
Old 02-06-2025, 07:52 PM   #2
DNSB
Bibliophagist
Posts: 43,486
Karma: 165170834
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Reading that did make me laugh, again, at OpenAI's hypocrisy in complaining that DeepSeek may, in part, have been trained on data distilled from them. Their attitude, that using data they mined from the Internet regardless of rights is okay but someone now mining their data is absolutely horrific, appeals to my sense of humour.
Old 02-06-2025, 08:50 PM   #3
ownedbycats
Custom User Title
Posts: 10,333
Karma: 73404781
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
Also pirated work irony.
Old 02-07-2025, 11:33 AM   #4
SleepyBob
Evangelist
Posts: 425
Karma: 8522810
Join Date: Dec 2010
Location: Wisconsin, USA
Device: Kindle PW3
The torrented part is pretty damning. I think there's a reasonable argument in favor of them being able to use copyrighted works as fair use if they had legally obtained them.
Old 02-08-2025, 09:15 PM   #5
jackm8
Addict
Posts: 276
Karma: 3000000
Join Date: Nov 2015
Device: none
On a side note, I find it curious that the complete archive is just 81.7 terabytes. 1500 euro for five 18TB hard drives, and anyone can have a home backup with plenty of space to spare.
Old 02-08-2025, 09:41 PM   #6
DNSB
Bibliophagist
Posts: 43,486
Karma: 165170834
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by jackm8 View Post
On a side note, I find it curious that the complete archive is just 81.7 terabytes. 1500 euro for five 18TB hard drives, and anyone can have a home backup with plenty of space to spare.
I'd go for 6 or 7 drives to allow a RAID 5 or RAID 6 array. Our new backup array at work used 18TB drives (Western Digital Red Pro NAS) and we had several failures in the first 6 months. RAID allowed us to simply swap the failed drive and then the array rebuilt itself. Not exactly fast but all automated. With its full 12 drives, it gave us about 170TB of usable storage over a 10Gb fibre connection.
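
As a rough check of the drive counts discussed here, the sketch below computes usable capacity for a few layouts. It assumes decimal terabytes, identical 18 TB drives, one drive's worth of parity for RAID 5 and two for RAID 6, and it ignores filesystem overhead and hot spares. The function name usable_tb and the "none" label are made up for illustration.

```python
# Hypothetical helper: usable capacity of an array of identical drives.
# Assumes decimal TB; ignores filesystem overhead and hot spares.
PARITY_DRIVES = {"none": 0, "raid5": 1, "raid6": 2}
MIN_DRIVES = {"none": 1, "raid5": 3, "raid6": 4}

ARCHIVE_TB = 81.7  # figure quoted in the thread


def usable_tb(drives: int, drive_tb: float, level: str) -> float:
    """Return usable capacity in TB for the given layout."""
    if level not in PARITY_DRIVES:
        raise ValueError(f"unknown level: {level}")
    if drives < MIN_DRIVES[level]:
        raise ValueError(f"{level} needs at least {MIN_DRIVES[level]} drives")
    # Parity consumes the equivalent of one (RAID 5) or two (RAID 6) drives.
    return (drives - PARITY_DRIVES[level]) * drive_tb


if __name__ == "__main__":
    for drives, level in [(5, "none"), (6, "raid5"), (7, "raid6"), (12, "raid6")]:
        cap = usable_tb(drives, 18, level)
        print(f"{drives} x 18 TB, {level}: {cap:.0f} TB usable, "
              f"fits archive: {cap >= ARCHIVE_TB}")
```

Under these assumptions, five bare drives, six in RAID 5, and seven in RAID 6 all come to 90 TB, which leaves roughly 8 TB of headroom over the 81.7 TB figure.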
Old 02-08-2025, 09:48 PM   #7
ownedbycats
Custom User Title
Posts: 10,333
Karma: 73404781
Join Date: Oct 2018
Location: Canada
Device: Kobo Libra H2O, formerly Aura HD
Quote:
Originally Posted by jackm8 View Post
On a side note, I find it curious that the complete archive is just 81.7 terabytes. 1500 euro for five 18TB hard drives, and anyone can have a home backup with plenty of space to spare.
Books are small and easily compressible.
Old 02-09-2025, 06:58 AM   #8
Quoth
Still reading
Posts: 13,438
Karma: 102739837
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
Quote:
Originally Posted by DNSB View Post
I'd go for 6 or 7 drives to allow a RAID 5 or RAID 6.
I used to use RAID 5 with ultra wide & fast 10,000 rpm 20G SCSI drives. HW EISA RAID controller with onboard RAM, on board battery and a big UPS. Later PCI. Any server needs a UPS. Workstations too, really; that's the advantage of a laptop. The server was very noisy and power hungry. It lived in the attic above the bathroom and the noise baffled visitors.

The RAID 5 rebuild time gets excessive with 250G+ drives. A mirror is simpler. Also decent HW RAID controllers for modern drives are not everyday things.
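
To put a rough number on the rebuild-time point: a rebuild has to write the whole replacement drive, so drive capacity divided by sustained write speed gives a lower bound. The sketch below assumes decimal units, and the throughput figures (50 MB/s and 200 MB/s) are picked purely for illustration, not measured on any real array. Real rebuilds on a busy array take longer.

```python
# Hypothetical lower bound on rebuild time: the replacement drive must be
# written end to end, so time >= capacity / sustained write throughput.
def min_rebuild_hours(drive_gb: float, mb_per_s: float) -> float:
    """Best-case hours to fill a drive of drive_gb at mb_per_s (decimal units)."""
    if drive_gb <= 0 or mb_per_s <= 0:
        raise ValueError("capacity and throughput must be positive")
    seconds = drive_gb * 1000 / mb_per_s  # GB -> MB, then MB / (MB/s)
    return seconds / 3600


if __name__ == "__main__":
    # Throughput values are assumptions for illustration only.
    for label, gb, speed in [("250 GB drive", 250, 50), ("18 TB drive", 18_000, 200)]:
        hours = min_rebuild_hours(gb, speed)
        print(f"{label} at {speed} MB/s: at least {hours:.1f} hours")
```

The estimate scales linearly with capacity, which is why large drives push people toward RAID 6 or mirrors: the array runs with reduced redundancy for the whole rebuild window.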

The highest-end thing I built was for a college in 1998 or 1999. It had two shelves, each with its own UPS, and each end of each SCSI bus connected to a SCSI buffer to a Pentium Pro with a dual-channel HW RAID controller. The two Pentium Pro servers were also on separate UPSes. It ran NT 4.0 Enterprise with the first non-beta MS Cluster SW developed by DEC. It was some sort of combo RAID so you could lose a shelf. Maybe RAID 5 per shelf, mirrored?

Now our server lives in a fireproof, waterproof shed with its own UPS, which is fed from the main Solar + Grid UPS (6000+ Wh). Just a single drive and backups, because downtime no longer matters. No live services. In the old days the server handled the Internet proxy, all mail, Windows Update, printing, and a VPN for people away from home to securely do email etc.
Old 02-09-2025, 07:05 AM   #9
Quoth
Still reading
Posts: 13,438
Karma: 102739837
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
Quote:
Originally Posted by SleepyBob View Post
The torrented part is pretty damning. I think there's a reasonable argument in favor of them being able to use copyrighted works as fair use if they had legally obtained them.
Legally as in having obtained a licence. But any publisher is really violating the rights of the author if it grants one without the author's permission. Likewise, a publisher that gets the rights to a book can't, on that basis alone, produce an audiobook, TV series, play, or film, sell prints & models of characters, or produce a translation.

Considering what LLMs do, the idea of fair use even via a licence is obnoxious. Most authors I know would never agree. It's a "licence" to let the LLM users plagiarise. Really ANYTHING other than using PD works is a violation of the rights of ANY content creator, including bloggers and forum posters. It's a use that was never intended. They can't be forbidden from using PD content.
Old 02-09-2025, 09:32 AM   #10
rcentros
eReader Wrangler
Posts: 7,771
Karma: 50741051
Join Date: Mar 2013
Location: Boise, ID
Device: PB HD3, GL3, Tolino Vision 4, Voyage, Clara HD
Quote:
Originally Posted by DNSB View Post
Reading that did make me laugh, again, at OpenAI's hypocrisy in complaining that DeepSeek may, in part, have been trained on data distilled from them. Their attitude, that using data they mined from the Internet regardless of rights is okay but someone now mining their data is absolutely horrific, appeals to my sense of humour.
I have to agree with you on this. OpenAI's hypocrisy is funny. I've only been on the periphery of the DeepSeek vs OpenAI news but, apparently, DeepSeek is throwing a monkey wrench into OpenAI's world domination plans. I hope they "monkey wrench" each other to death.
Old 02-09-2025, 01:07 PM   #11
DNSB
Bibliophagist
Posts: 43,486
Karma: 165170834
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by Quoth View Post
I used to use RAID 5 with ultra wide & fast 10,000 rpm 20G SCSI drives. HW EISA RAID controller with onboard RAM, on board battery and a big UPS. Later PCI. Any server needs a UPS. Workstations too, really; that's the advantage of a laptop. The server was very noisy and power hungry. It lived in the attic above the bathroom and the noise baffled visitors.
The drives we used were SAS 7.2K rpm, with four 4TB SSDs (two per controller) as the cache. The controllers were slide-in modules so, in theory, if one failed you could hot swap it. When I dug below the GUI, the controllers were running a modified Linux with an i5 CPU and 64GB of RAM per controller. It made a great backup device for the virtualization servers. I'm still not overly fond of iSCSI, though it is a lot easier to configure than a Fibre Channel connection.
Old 02-10-2025, 03:36 PM   #12
SleepyBob
Evangelist
Posts: 425
Karma: 8522810
Join Date: Dec 2010
Location: Wisconsin, USA
Device: Kindle PW3
Quote:
Originally Posted by Quoth View Post
Legally as in having obtained a licence. But any publisher is really violating the rights of the author if it grants one without the author's permission. Likewise, a publisher that gets the rights to a book can't, on that basis alone, produce an audiobook, TV series, play, or film, sell prints & models of characters, or produce a translation.

Considering what LLMs do, the idea of fair use even via a licence is obnoxious. Most authors I know would never agree. It's a "licence" to let the LLM users plagiarise. Really ANYTHING other than using PD works is a violation of the rights of ANY content creator, including bloggers and forum posters. It's a use that was never intended. They can't be forbidden from using PD content.
But when a publisher sells a book, I can buy it. And then I can write and publish a "Cliff's Notes" version using the book that discusses important themes, characters, plot points and memorable quotes. And if I can do it as a person, there's a reasonable argument that an LLM should be allowed to do something similar. I don't see that as fundamentally different from what they already do. The publisher doesn't get to say "don't use this book to write a book about the book."

If I could get ChatGPT to quote me a full chapter out of Harry Potter word for word, that would be one thing. But I'm pretty sure it can't.
Old 02-10-2025, 05:20 PM   #13
DNSB
Bibliophagist
Posts: 43,486
Karma: 165170834
Join Date: Jul 2010
Location: Vancouver
Device: Kobo Sage, Libra Colour, Lenovo M8 FHD, Paperwhite 4, Tolino epos
Quote:
Originally Posted by SleepyBob View Post
But when a publisher sells a book, I can buy it. And then I can write and publish a "Cliff's Notes" version using the book that discusses important themes, characters, plot points and memorable quotes. And if I can do it as a person, there's a reasonable argument that an LLM should be allowed to do something similar. I don't see that as fundamentally different from what they already do. The publisher doesn't get to say "don't use this book to write a book about the book."

If I could get ChatGPT to quote me a full chapter out of Harry Potter word for word, that would be one thing. But I'm pretty sure it can't.
Perhaps the difference is that you legally obtained the book, while the companies responsible for LLM training have very rarely paid anything to anybody, and then only when the courts have ordered such payment. Copyright and other legal doctrines seem to mean little to them other than being annoyances.

You may also want to take a look at the Fair Use/Fair Dealing laws in your area.
Old 02-11-2025, 09:54 AM   #14
Quoth
Still reading
Posts: 13,438
Karma: 102739837
Join Date: Jun 2017
Location: Ireland
Device: All 4 Kinds: epub eink, Kindle, android eink, NxtPaper
Quote:
Originally Posted by SleepyBob View Post
But when a publisher sells a book, I can buy it. And then I can write and publish a "Cliff's Notes" version using the book that discusses important themes, characters, plot points and memorable quotes. And if I can do it as a person, there's a reasonable argument that an LLM should be allowed to do something similar. I don't see that as fundamentally different from what they already do. The publisher doesn't get to say "don't use this book to write a book about the book."

If I could get ChatGPT to quote me a full chapter out of Harry Potter word for word, that would be one thing. But I'm pretty sure it can't.
You can do a review. You might need publisher permission for such extensive quotes; that's not fair use in many countries. Some don't even have a "fair use" law. Even in a review, many countries limit what you can quote without written permission.

It's not the same thing. Also, the big corps didn't even buy a copy. They used pirate copies.

Totally false analogy that also reveals you don't understand how LLMs work, or copyright.
Old 02-11-2025, 10:11 AM   #15
DiapDealer
Grand Sorcerer
Posts: 28,271
Karma: 203719142
Join Date: Jan 2010
Device: Nexus 7, Kindle Fire HD
Quote:
Originally Posted by Quoth View Post
Totally false analogy that also reveals you don't understand how LLMs work, or copyright.
Sorry. But I think what you actually meant to say was "that also reveals that your interpretation of the legal ramifications of how LLMs work does not match my own." Because let's face it: your interpretations of AI and LLMs in general are a bit more philosophical (not to mention semantic) than most people's.