  • ClusterBomb@lemmy.blahaj.zone · 11 hours ago

    “My hammer is not well suited to cut vegetables” 🤷

    There is so much to say about AI; can we move on from “it can’t count letters and do math”?

    • ReallyActuallyFrankenstein@lemmynsfw.com · 8 hours ago

      I get that it’s usually just a dunk on AI, but it’s still a valid demonstration that AI has pretty severe and unpredictable gaps in functionality, in addition to failing to properly indicate confidence (or lack thereof).

      People who understand that it’s a glorified autocomplete will know how to disregard or prompt around some of these gaps, but this remains a litmus test because it succinctly shows you cannot trust an LLM response even in many “easy” cases.

  • daniskarma@lemmy.dbzer0.com · 12 hours ago (edited)

    That happens when you don’t understand what an LLM is, or what its use cases are.

    This is like not being impressed by a calculator because it cannot give a word synonym.

    • xigoi@lemmy.sdf.org · 10 hours ago

      Sure, maybe it’s not capable of producing the correct answer, which is fine. But it should say “As an LLM, I cannot answer questions like this” instead of just making up an answer.

      • daniskarma@lemmy.dbzer0.com · 10 hours ago

        I’ve thought a lot about it. The LLM per se would not know whether the question is answerable, as it doesn’t know whether its own output is good or bad.

        So there are various approaches to this issue:

        1. The classic approach, and the one used for censoring: keywords. When the LLM receives a certain keyword, or derives one by digesting the text input, it gives back a hard-coded answer. The problem is that while censoring targets are limited, hard-to-answer questions are unlimited, so it’s impractical to hard-code them all.

        2. Self-checking answers. For every question, the LLM could process it 10 times with different seeds, then analyze the results and see whether they are equivalent. If they are not, it would just answer that it’s unsure (a rough sketch of this follows below). Problem: a multiplication of resource usage. And for some questions, like the one in the post, it’s possible that the multiple randomized answers give equivalent results anyway, so it would still have a decent failure rate.
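        A rough sketch of what approach 2 could look like, assuming a hypothetical ask_llm(prompt, seed) helper standing in for whatever real LLM API you'd actually use:

        ```python
        from collections import Counter

        def ask_llm(prompt: str, seed: int) -> str:
            # Hypothetical stand-in for a real LLM API call; replace with your client.
            raise NotImplementedError

        def self_checked_answer(prompt: str, n_samples: int = 10, threshold: float = 0.8) -> str:
            # Ask the same question several times with different seeds.
            answers = [ask_llm(prompt, seed=i) for i in range(n_samples)]
            # Count how often each normalized answer appears.
            counts = Counter(a.strip().lower() for a in answers)
            best, votes = counts.most_common(1)[0]
            # Only return the majority answer when the samples largely agree.
            if votes / n_samples >= threshold:
                return best
            return "I'm not sure about the answer to this question."
        ```

        As said above, this multiplies resource usage by the number of samples, and agreement between samples still doesn’t guarantee the majority answer is correct.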

        • xigoi@lemmy.sdf.org · 9 hours ago

          Why would it not know? It certainly “knows” that it’s an LLM and it presumably “knows” how LLMs work, so it could piece this together if it was capable of self-reflection.

  • Allero@lemmy.today · 12 hours ago (edited)

    Here’s my guess, aside from highlighted token issues:

    We all know LLMs train on human-generated data. And when we ask something like “how many R’s” or “how many L’s” are in a given word, we don’t usually mean to count them all - we normally mean something like “how many consecutive letters are there, so I can spell it right”.

    Yes, the word “strawberry” has 3 R’s. But what most people are interested in is whether it is “strawberry” or “strawbery”, and their “how many R’s” refers to this exactly, not the entire word.
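    For contrast, the literal count is trivial for ordinary deterministic code, which is part of why the failure looks so silly. A quick Python check:

    ```python
    word = "strawberry"
    # str.count counts every occurrence: one "r" in "straw", two in "berry".
    print(word.count("r"))  # 3
    ```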

    • Opisek@lemmy.world · 12 hours ago

      But to be fair, as people we would not ask “how many Rs does strawberry have”, but “with how many Rs do you spell strawberry” or “do you spell strawberry with 1 R or 2 Rs”.

  • zipzoopaboop@lemmynsfw.com · 15 hours ago

    I asked Gemini if the Quest has an SD slot. It doesn’t, but Gemini said it did. Checking the source, it was pulling info from the Vive user manual.

  • rumba@lemmy.zip · 19 hours ago

    Yeah, and you know, I always hated this: screwdrivers make really bad hammers.

  • gerryflap@feddit.nl · 22 hours ago (edited)

    These models don’t see single characters but rather tokens representing multiple characters. While I also don’t like the “AI” hype, this image is very one-dimensional hate and misrepresents the usefulness of these models by picking one adversarial example.

    Today ChatGPT saved me a fuckton of time by linking me to the exact issue on GitLab that discussed the problem I was having (full system freezes using Bottles installed with flatpak on Arch). This was the URL it came up with after explaining the problem and giving it the first error I found in dmesg: https://gitlab.archlinux.org/archlinux/packaging/packages/linux/-/issues/110

    This issue is one day old. When I looked this shit up myself I found exactly nothing useful on both DDG and Google. After this, ChatGPT also provided me with the information that the LTS kernel exists and how to install it. Obviously I verified that stuff before using it, because these LLMs have their limits. Now my system works again, and figuring this out myself would’ve cost me hours because I had no idea what broke. Was it flatpak, Nvidia, the kernel, Wayland, Bottles, some random shit I changed in a config file 2 years ago? Well, thanks to ChatGPT, I know.

    They’re tools, and they can provide new insights that can be very useful. Just don’t expect them to always tell the truth, or to actually be human-like

    • lennivelkant@discuss.tchncs.de · 14 hours ago

      Just don’t expect them to always tell the truth, or to actually be human-like

      I think the point of the post is to call out exactly that: people preaching AI as replacing humans

  • eggymachus@sh.itjust.works · 1 day ago

    A guy is driving around the back woods of Montana and he sees a sign in front of a broken down shanty-style house: ‘Talking Dog For Sale.’

    He rings the bell and the owner appears and tells him the dog is in the backyard.

    The guy goes into the backyard and sees a nice looking Labrador Retriever sitting there.

    “You talk?” he asks.

    “Yep,” the Lab replies.

    After the guy recovers from the shock of hearing a dog talk, he says, “So, what’s your story?”

    The Lab looks up and says, “Well, I discovered that I could talk when I was pretty young. I wanted to help the government, so I told the CIA. In no time at all they had me jetting from country to country, sitting in rooms with spies and world leaders, because no one figured a dog would be eavesdropping. I was one of their most valuable spies for eight years running… but the jetting around really tired me out, and I knew I wasn’t getting any younger, so I decided to settle down. I signed up for a job at the airport to do some undercover security, wandering near suspicious characters and listening in. I uncovered some incredible dealings and was awarded a batch of medals. I got married, had a mess of puppies, and now I’m just retired.”

    The guy is amazed. He goes back in and asks the owner what he wants for the dog.

    “Ten dollars,” the owner says.

    “Ten dollars? This dog is amazing! Why on Earth are you selling him so cheap?”

    “Because he’s a liar. He’s never been out of the yard.”

  • Grandwolf319@sh.itjust.works · 1 day ago (edited)

    There is an alternative reality out there where LLMs were never marketed as AI and were instead marketed as random text generators.

    In that world, tech-savvy people would embrace this tech instead of having to constantly educate people that it is in fact not intelligence.

    • Static_Rocket@lemmy.world · 17 hours ago

      That was this reality. Very briefly. Remember AI Dungeon and the other clones that were popular prior to the mass ML marketing campaigns of the last 2 years?

  • whotookkarl@lemmy.world · 1 day ago

    I’ve already had more than one conversation where people quote AI as if it were a source, like quoting Google as a source. When I showed them how it can sometimes lie and explained that it’s not a primary source for anything, I just got that blank stare like I have two heads.

  • Tgo_up@lemm.ee · 1 day ago

    This is a bad example… If I ask a friend “is strawberry spelled with one or two R’s”, they would think I’m asking about the last part of the word.

    The question seems to be specifically made to trip up LLMs. I’ve never heard anyone ask how many of a certain letter are in a word. I’ve heard people ask how you spell a word and whether it’s with one or two of a specific letter, though.

    If you think of LLMs as something with actual intelligence you’re going to be very unimpressed… It’s just a model to predict the next word.

    • Grandwolf319@sh.itjust.works · 1 day ago

      If you think of LLMs as something with actual intelligence you’re going to be very unimpressed

      Artificial sugar is still sugar.

      Artificial intelligence implies there is intelligence in some shape or form.

      • Scubus@sh.itjust.works · 14 hours ago

        That’s because it wasn’t originally called AI. It was called an LLM. Techbros trying to sell it and articles wanting to fan the flames started calling it AI, and eventually it became common parlance. No one in the field seriously calls it AI; they generally save that term for general AI, or at least narrow AI, of which an LLM is neither.

      • corsicanguppy@lemmy.ca · 1 day ago

        Artificial sugar is still sugar.

        Because it contains sucrose, fructose or glucose? Because it metabolises the same and matches the glycemic index of sugar?

        Because those are all wrong. What’s your criteria?

      • JohnEdwa@sopuli.xyz · 1 day ago (edited)

        Something that pretends to be or looks like intelligence, but actually isn’t at all, is a perfectly valid interpretation of the word artificial: fake intelligence.

    • renegadespork@lemmy.jelliefrontier.net · 1 day ago

      If you think of LLMs as something with actual intelligence you’re going to be very unimpressed… It’s just a model to predict the next word.

      This is exactly the problem, though. They don’t have “intelligence” or any actual reasoning, yet they are constantly being used in situations that require reasoning.

      • Tgo_up@lemm.ee · 11 hours ago

        What situations are you thinking of that require reasoning?

        I’ve used LLMs to create software I needed but couldn’t find online.

      • sugar_in_your_tea@sh.itjust.works · 1 day ago

        Maybe if you focus on pro- or anti-AI sources, but if you talk to actual professionals or hobbyists solving actual problems, you’ll see very different applications. If you go into it looking for problems, you’ll find them; likewise, if you go into it looking for use cases, you’ll find them.

        • renegadespork@lemmy.jelliefrontier.net · 17 hours ago

          Personally I have yet to find a use case. Every single time I try to use an LLM for a task (even ones they are supposedly good at), I find the results so lacking that I spend more time fixing its mistakes than I would have just doing it myself.

          • Scubus@sh.itjust.works · 14 hours ago

            So you’ve never used it as a starting point to learn about a new topic? You’ve never used it to look up a song when you can only remember a small section of the lyrics? What about when you want to write a block of code that is simple but monotonous to write yourself? Or to suggest plans for how to create simple structures/inventions?

            Anything with a verifiable answer that you’d ask on a forum can generally be answered by an LLM, because they’re largely trained on forums and there’s a decent chance the training data included someone asking the question you are currently asking.

            Hell, ask ChatGPT what use cases it would recommend for itself; I’m sure it’ll have something interesting.

  • Fubarberry@sopuli.xyz · 1 day ago

    I asked Mistral/Brave AI and got this response:

    How Many Rs in Strawberry

    The word “strawberry” contains three "r"s. This simple question has highlighted a limitation in large language models (LLMs), such as GPT-4 and Claude, which often incorrectly count the number of "r"s as two. The error stems from the way these models process text through a process called tokenization, where text is broken down into smaller units called tokens. These tokens do not always correspond directly to individual letters, leading to errors in counting specific letters within words.
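    You can see that token split directly with OpenAI’s tiktoken library (a quick sketch; the exact split depends on the model’s encoding):

    ```python
    import tiktoken

    # cl100k_base is the encoding used by GPT-4-era models.
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode("strawberry")
    # Decode each token individually to see the chunks the model actually "sees".
    print([enc.decode([t]) for t in tokens])  # e.g. ['str', 'aw', 'berry']
    ```

    None of those chunks is a single letter, which is why letter-counting questions sit awkwardly with how the model reads text.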

  • Lazycog@sopuli.xyz · 1 day ago

    I can already see it…

    Ad: CAN YOU SOLVE THIS IMPOSSIBLE RIDDLE THAT AI CAN’T SOLVE?!

    With OP’s image. And then it will have the following once you solve it: “congratz, send us your personal details and you’ll be added to the hall of fame at CERN Headquarters”

  • otp@sh.itjust.works · 1 day ago

    From a linguistic perspective, this is why I am impressed by (or at least, astonished by) LLMs!