TL;DR: Big tech AI companies' LLMs have gobbled up all of our data, but the damage they have done to open source and free culture communities is particularly insidious. By taking advantage of those who share freely, they destroy the bargain that made free software spread like wildfire.

  • illusionist@lemmy.zip
    26 days ago

    I may not be up to date

    • What damage did the big tech AI companies do to open source?
    • How do they take advantage of us?
    • ooterness@lemmy.world
      26 days ago

      A lot of open-source software uses copyleft licenses like the GPL. If a company uses that code to build its own products, then some or all of its new code may also have to be released as open source. This is an important part of how open-source projects stay open. Organizations like the FSF have taken big companies to court over this and won.

      AI companies trained their slop-generators on that open-source code. In many cases, the models will reproduce it line-for-line. But courts currently hold that the generated code is no longer subject to the original copyright restrictions. It’s nearly impossible to publish open-source software without being scraped for AI training.

    • N.E.P.T.R@lemmy.blahaj.zone
      26 days ago
      • Most “open source” LLMs are really just open weights, which are useless without the training data. This dilutes the definition of OSS. There is no way for a normal person (i.e. anyone who is not Google, Meta, etc.) to train the model.
      • LLM producers don’t credit the OSS they trained on: there is no attribution. Most models violate the licenses of all their training data (e.g. GPL).
      • LLM scraper bots put heavy load on server infrastructure, which amounts to a DDoS attack.