ChatGPT Answers Programming Questions Incorrectly 52% of the Time: Study

ForgottenFlux@lemmy.world · 1 month ago

efstajas@lemmy.world · 1 month ago

Yeah it’s wrong a lot but as a developer, damn it’s useful. I use Gemini for asking questions and Copilot in my IDE personally, and it’s really good at doing mundane text editing bullshit quickly and writing boilerplate, which is a massive time saver. Gemini has at least pointed me in the right direction with quite obscure issues or helped pinpoint the cause of hidden bugs many times. I treat it like an intelligent rubber duck rather than expecting it to just solve everything for me outright.

InternetPerson@lemmings.world · 1 month ago

That’s a good way to use it. Like every technological evolution it comes with risks and downsides. But if you are aware of that and know how to use it, it can be a useful tool.
And as always, it only gets better over time. One day we will probably rely more heavily on such AI tools, so it’s a good idea to adapt quickly.

BeatTakeshi@lemmy.world · edited 1 month ago

Who would have thought that an artificial intelligence trained on human intelligence would be just as dumb

capital@lemmy.world · edited 1 month ago

Hm. This is what I got.

I think about 90% of the screenshots we see of LLMs failing hilariously are doctored. Lemmy users really want to believe it’s that bad though.

Edit:

gravitas_deficiency@sh.itjust.works · 1 month ago

Holy fuck did it just pass the Turing test?

gravitas_deficiency@sh.itjust.works · 1 month ago

C-suites:

tHis iS inCReDibLe! wE cAn SavE sO MUcH oN sTafFiNg cOStS!

NounsAndWords@lemmy.world · 1 month ago

GPT-2 came out a little more than 5 years ago; it answered 0% of questions accurately and couldn’t string a sentence together.

GPT-3 came out a little less than 4 years ago and was kind of a neat party trick, but I’m pretty sure it answered ~0% of programming questions correctly.

GPT-4 came out a little less than 2 years ago and can answer 48% of programming questions accurately.

I’m not talking about morality, or creativity, or good/bad for humanity, but if you don’t see a trajectory here, I don’t know what to tell you.

14th_cylon@lemm.ee · 1 month ago

Seeing the trajectory is not the ultimate answer to anything.

systemglitch@lemmy.world · 1 month ago

That comes off as disingenuous in this instance.