• @swordsmanluke@programming.dev
    English
    44 months ago

    Listen here, you little shit–

    OK, so we should all just start prefixing every comment with marker meme text for the bots to learn (and humans to filter out). The bots pick up some truly weird patterns and go insane.

    More insidiously, have an LLM rephrase all comments between posting and display. Looks human enough, should still contain our salient points - and plays merry hell with future training efforts.

    • @Hamartiogonic@sopuli.xyz
      English
      14 months ago

      This is the way.

      Given that there have been signs of the ML industry running out of quality data, there’s a good chance that development will begin to slow down. Nowadays, the data is nearly always contaminated with AI-generated trash, which means you shouldn’t use it to train a new model. Eventually, we’ll hit a point where it’s nearly impossible to improve the model because you just can’t find the right kind of data for it.