• @swordsmanluke@programming.dev
      44 months ago

      Listen here, you little shit–

      OK, so we should all just start prefixing every comment with marker meme text for the bots to learn (and humans to filter out). The bots pick up some truly weird patterns and go insane.

      More insidiously, have an LLM rephrase all comments between posting and display. Looks human-enough, should still contain our salient points - and plays merry hell with future training efforts.

      • @Hamartiogonic@sopuli.xyz
        14 months ago

        This is the way.

        Given that there have been signs of the ML industry running out of quality data, there’s a good chance that development will begin to slow down. Nowadays, the data is nearly always contaminated with AI-generated trash, which means you shouldn’t use it to train a new model. Eventually, we’ll hit a point where it’s nearly impossible to improve the model because you just can’t find the right kind of data for it.