• theneverfox
    7 months ago

    it is a little funny to me that they’re talking about using AI to detect AI garbage as a mechanism of preventing the sort of model/data collapse that happens when data sets start to become poisoned with AI content. because it seems reasonable to me that if you start feeding your spam-or-real classification data back into the spam-detection model, you’d wind up with exactly the same degradations of classification and your model might start calling every article that has a sentence starting with “Certainly,” a machine-generated one. maybe they’re careful to only use human-curated sets of real and spam content, maybe not

    Ultimately, LLMs don’t use words, they use tokens. Tokens aren’t just words - they’re nodes in a high-dimensional graph… Their location and connections in information space are data invisible to humans.

    LLM responses are basically paths through the token space, they may or may not overuse certain words, but they’ll have a bias towards using certain words together

    So I don’t think this is impossible… Humans struggle to grasp these kinds of hidden relationships (consciously at least), but neural networks are good at that kind of thing

    I too think it’s funny/sad how AI is being used… It’s good at generation, that’s why we call it generative AI. It’s incredibly useful to generate all sorts of content when paired with a skilled human, it’s insane to expect common sense out of something easier to gaslight than a toddler. It can handle the tedious details while a skilled human drives it and validates the output

    The biggest, if rarely used, use case is education - they’re an infinitely patient tutor that can explain things in many ways and give you endless examples. Everyone has different learning styles - you could so easily take an existing lesson and create more concrete or abstract versions, versions for people who need long explanations and ones for people who learn through application

    • ebu@awful.systems
      7 months ago

      Ultimately, LLMs don’t use words,

      LLM responses are basically paths through the token space, they may or may not overuse certain words, but they’ll have a bias towards using certain words together

      so they use words but they don’t. okay

      this is about as convincing a point as “humans don’t use words, they use letters!” it’s not saying anything, just adding noise

      So I don’t think this is impossible… Humans struggle to grasp these kinds of hidden relationships (consciously at least), but neural networks are good at that kind of thing

      i can’t tell what the “this” is that you think is possible

      part of the problem is that a lot of those “hidden relationships” are also noise. knowing that “running” is typically an activity involving your legs doesn’t help one parse the sentence “he’s running his mouth”, and part of participating in communication is being able to throw out these spurious and useless connections when reading and writing, something the machine consistently fails to do.

      It’s incredibly useful to generate all sorts of content when paired with a skilled human

      so is a rock

      It can handle the tedious details while a skilled human drives it and validates the output

      validation is the hard step, actually. writing articles is actually really easy if you don’t care about the legibility, truthiness, or quality of the output. i’ve tried to “co-write” short-format fiction with large language models for fun and it always devolved into me deleting large chunks – or even the whole – output of the machine and rewriting it by hand. i was more “productive” with a blank notepad.exe. i’ve not tried it for documentation or persuasive writing but i’m pretty sure it would be a similar situation there, if not even more so, because in nonfiction writing i actually have to conform to reality.

      this argument always baffles me whenever it comes up. as if writing is 5% coming up with ideas and then the other 95% is boring, tedium, pen-in-hand (or fingers-on-keyboard) execution. i’ve yet to meet a writer who believes this – all the writing i’ve ever done required more-or-less constant editorial decisions from the macro scale of format and structure down to individual choices. have i sufficiently introduced this concept? do i like the way this sentence flows, or does it need to go earlier in the paragraph? how does this tie with the feeling i’m trying to convey or the argument i’m trying to put forward?

      writing is, as a skill, that editorial process (at least to one degree or another). sure, i can defer all the choice to the machine and get the statistically-most-expected, confusing, factually dubious, aimless, unchallenging, and uncompelling text out of it. but if i want anything more than that (and i suspect most writers do), then i am doing 100% of that work myself.

      • froztbyte@awful.systems
        7 months ago

        this is about as convincing a point as “humans don’t use words, they use letters!” it’s not saying anything, just adding noise

        I’m sorry I communicate exclusively in mouthnoises, optionally delivered as Ritual Sigils Riding Beams Of Light

      • ebu@awful.systems
        7 months ago

        at least if it was “vectors in a high-dimensional space” it would be like. at least a little bit accurate to the internals of llm’s. (still an entirely irrelevant implementation detail that adds noise to the conversation, but accurate.)

      • theneverfox
        7 months ago

        That’s literally what a neural network is… It’s a graph, a bunch of nodes and the edges that connect them, and weights on those edges. It’s not three dimensional, it’s n-dimensional, where n is based on the size of the network

        Here’s a tool that visualizes it

        We can click on one of these points (“iceland”) to see its nearest neighbors in the high-dimensional space (mostly other countries!) as well as other points that belong to the same cluster (Cluster 18 is this red cluster).
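
        A minimal sketch of what “nearest neighbors in the high-dimensional space” means, assuming cosine similarity as the distance measure. The three-dimensional vectors below are made up for illustration (real embeddings have hundreds or thousands of dimensions), and `nearest_neighbors` is a hypothetical helper, not part of the linked tool:

```python
import math

# Toy "embeddings": every number here is invented for illustration only.
EMBEDDINGS = {
    "iceland": [0.90, 0.10, 0.00],
    "norway": [0.85, 0.15, 0.05],
    "denmark": [0.80, 0.20, 0.10],
    "banana": [0.00, 0.90, 0.30],
    "running": [0.10, 0.20, 0.95],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(word, k=2):
    """Return the k words whose vectors point most nearly the same way."""
    query = EMBEDDINGS[word]
    scored = [
        (cosine_similarity(query, vector), other)
        for other, vector in EMBEDDINGS.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


print(nearest_neighbors("iceland"))
```

        With these toy numbers the other two country vectors rank closest to “iceland”, but only because they were chosen to point in a similar direction.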

    • blakestacey@awful.systems
      7 months ago

      The biggest, if rarely used, use case is education - they’re an infinitely patient tutor that can explain things in many ways and give you endless examples.

      No. They’re not.

      • theneverfox
        7 months ago

        They’re famously terrible at math, you can relatively easily offload that to a conventional program

        I didn’t mean for children (aside from generating learning materials). They can be wrong - it’s crippling to teach the fundamentals wrong, and children probably lack the nuance to keep from asking leading questions

        I meant more for high school, college, and beyond. I’ve been using it for programming this way - the docs for what I’m using suck and are very dry, getting chat gpt to write an explanation and examples is far more digestible. If you ask correctly, it’ll explain very technical topics in a relatable way

        Even with math, you could probably get a better calculus education than I got… It’ll be able to explain concepts and their application - I had zero interest in calculus because I got little explanation of why I should learn it or what it’s good for; I only really started to learn it when it came up in Kerbal Space Program and I had a reason

        But you should never trust its math answers lol
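
        One way “offload that to a conventional program” could look, as a rough sketch: instead of trusting the model’s arithmetic, pass the expression to a small evaluator that only understands numbers and basic operators. The `evaluate` function is a hypothetical helper written for this example, not any particular product’s mechanism:

```python
import ast
import operator

# Only these operators are allowed; anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate(expression):
    """Evaluate a plain arithmetic expression without using eval()."""

    def _eval(node):
        # Numeric literals (bools are excluded even though they subclass int).
        if isinstance(node, ast.Constant):
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
        elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


print(evaluate("2 + 3 * 4"))
```

        Names, function calls, and attribute access all raise `ValueError`, so the evaluator does arithmetic and nothing else.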

      • theneverfox
        7 months ago

        Try reading something like Dijkstra’s algorithm on Wikipedia, then ask one to explain it to you. You can ask for a theme, ask it to explain like you’re 5, or provide an example to check if you understood and have it correct any mistakes

        It’s fantastic for technical or dry topics, if you know how to phrase things you can get quick lessons tailored to be entertaining and understandable for you personally. And of course, you can ask follow up questions
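
        For reference, a compact sketch of Dijkstra’s algorithm itself, assuming a graph stored as a dict of dicts with non-negative edge weights. The example graph and node names are invented for illustration:

```python
import heapq


def dijkstra(graph, source):
    """Shortest distance from source to every reachable node.

    graph maps each node to a dict of {neighbor: edge_weight};
    all weights are assumed to be non-negative.
    """
    distances = {source: 0}
    queue = [(0, source)]  # min-heap of (distance so far, node)
    while queue:
        dist, node = heapq.heappop(queue)
        # Skip stale entries: a shorter path to this node was already found.
        if dist > distances.get(node, float("inf")):
            continue
        for neighbor, weight in graph.get(node, {}).items():
            candidate = dist + weight
            if candidate < distances.get(neighbor, float("inf")):
                distances[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distances


# Hypothetical example graph.
GRAPH = {
    "A": {"B": 1, "C": 4},
    "B": {"C": 2, "D": 5},
    "C": {"D": 1},
    "D": {},
}

print(dijkstra(GRAPH, "A"))
```

        Here the direct edge from A to C costs 4, but going through B costs 3, so the shorter route wins; that relaxation step is the part explanations most often need to get right.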

        • V0ldek@awful.systems
          7 months ago

          Try reading something like Dijkstra’s algorithm on Wikipedia, then ask one to explain it to you.

          I did! I feel entitled to compensation now!

          • Eccitaze@yiffit.net
            7 months ago

            I am increasingly convinced that the people who claim AIs are useful for any given subject of any import (coding, art, math, teaching, etc.) should immediately be regarded as having absolutely zero knowledge in that subject, even (and especially) if they claim otherwise.

            From what I can see in my interactions with LLMs, the only thing they are actually decent at is summarizing blocks of text, and even then if it’s important you should parse the summary carefully to make sure they didn’t miss important details.

          • theneverfox
            7 months ago

            I mean… Yeah? Most explanations aren’t great compared to a comprehensive understanding in your head, you already understand it - it would have to be extremely insightful to impress me at that point

            The results vary greatly based on the prompt too - not only that, it changes based on the back and forth you’ve already had in the session

            It’s not a god, it’s not a human expert, but it’s always available, and it’s interactive.

            It doesn’t give you amazing writeups, but (at least for me) it makes things click in minutes that I might need an hour or two to understand through reading up on it. I can get a short summary with key terms, ask about key terms I don’t know, ask for an example in a given context, challenge the example for an explanation of how the example can be generalized, and every once in a while along the way I learn about a blind spot I never realized I had

            It’s like talking to a librarian - it gives you the broad strokes of a topic well, which prepares you well enough that you’re ready for deeper reading to fill in the details.

            It doesn’t replace a teacher, a tutor, further reading, or anything else - but it’s still a fantastic education tool that can make learning easier and faster

            • gerikson@awful.systems
              7 months ago

              To be honest, I think the world would be a better place if all the money now poured into “AI” would be spent on expanding access to libraries and librarians for everyone.

            • self@awful.systems
              7 months ago

              so the LLM is worthless if you already understand the topic because its explanations are terrible, but if you don’t know the topic the LLM’s explanations are worthless because you don’t know when it’ll be randomly, confidently, and extremely wrong unless you luck into the right incantation

              what a revolutionary technology. thank fuck we can endanger the environment and funnel money into the pockets of a bunch of rich technofascists so we can have fancy autocomplete tell us about a basic undergrad CS algorithm in very little detail and with a random chance of being utterly but imperceptibly wrong

              • theneverfox
                7 months ago

                I don’t find the explanations bad at all… But it’s extremely useful if you know nothing or not enough about a topic

                FWIW, I’m a strong proponent of local AI. The big models are cool and approachable. But a model that runs on my 5 year old budget gaming PC isn’t that much less useful.

                We needed the big, expensive AI to get here… But the reason I’m such an advocate is because this technology can do formerly impossible things. It can do so much good or harm - which is why we need as many people as possible to learn how to use it for what it is, not to mindlessly chase the promise of a replacement for workers.

                AI is here to stay, and it’ll change everything for better or worse. Companies aren’t going to use it for better, they’re going to chase bigger profits until the world burns. They’re already ruining the web and society, with both AI and enshittification

                Individuals skillfully using AI can do more than they can without it - we need every advantage we can get.

                It’s not “AI or no AI”, it’s “AI everywhere or only FAANG controlled AI”

                  • froztbyte@awful.systems
                    7 months ago

                    that statement being true is quite probable: it was likely impossible before this point to set this much money on fire this pointlessly!

                  • theneverfox
                    7 months ago

                    These kinds of things. For a simple example, captioning pictures… Basically impossible, now you can convince a home computer to do it in a weekend. All running locally with a few hundred lines of code

                    Imagine what that would do for someone using a screen reader

                • self@awful.systems
                  7 months ago

                  yeah, you’re still doing everything you can to dodge the articles we’ve linked and points we’ve made showing that the fucking things just make up plausible bullshit and are therefore worthless for learning, and you’ve taken up more than enough of this thread repeating basic shit we already know. off you fuck

                  • theneverfox
                    7 months ago

                    Do you ever learn something on lemmy? I do all the time

                    Do you trust random Internet strangers at their word? I sure as hell don’t

                    You can definitely learn even with the risk that the info might be made the fuck up… It’s easy, you don’t trust the LLM.

                    Do you really not see any value in a tool that can introduce you to endless topics, even if you have to verify it wasn’t made the fuck up?

    • froztbyte@awful.systems
      7 months ago

      Education? Really? You think that a good use for the essentially-unverifiable synthesis engine that generates without provenance or reference is good for education? Really?

      I guess you must’ve learned that stance from an LLM