A user on the online forum 4chan has leaked a massive 270GB of data purportedly belonging to The New York Times. This leak includes what is claimed to be the source code for the newspaper’s digital operations.

  • muntedcrocodile@lemm.ee
    25 days ago

    That's a lot of data, but surely it's not all their articles, cos I'd very much like to train Mixtral 8x7B on it along with 4chan data and shit from the dark web. Surely there is a project where such a model is public and being trained on literally everything regardless of legality.

    EDIT: Why am I getting downvoted?

    • reddithalation@sopuli.xyz
      24 days ago

      You’re getting downvoted because LLMs are simply not very good, they consume lots of energy (bad for the climate), and seemingly most people involved in the AI hype want to replace human creativity or something.

      How about, instead of training a not very trustworthy or useful LLM on lots of NYT, 4chan, and “dark web” content, you go read lots of NYT, 4chan, and dark web content to train your own (much better) model (your brain)?

      • muntedcrocodile@lemm.ee
        23 days ago

        They are very good; they exceed the capability of many humans in many tasks. If consuming energy = bad for the environment, then all electric vehicles are bullshit, cos they have energy inefficiencies that petrol cars don’t (thermodynamics is a bitch). You do realise the argument about whether asking an AI to create an image is art is literally the same argument that was had about whether photography is art?

        LLMs are decently trustworthy, especially with chain-of-thought reasoning and tool capabilities. And they are extraordinarily useful; people wouldn’t be using them and creating a market for them if they weren’t. I can’t train my brain and then share it for free for everyone on the internet to download. I can with an AI, though.