• BluesF@lemmy.world
    9 months ago

    I don’t think that using regex to do regex things on strings that happen to also be HTML really counts as parsing HTML.

    • Breve
      9 months ago

      I guess it depends on your definition of “parse”, but let me tell you, it’s still very painful to deal with things like attributes appearing in any order inside a tag. So I’m definitely not advocating using regex to “read” (or whatever you want to call it) HTML.
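
To illustrate the attribute-order point, here is a minimal sketch using Python's standard-library `html.parser`. The names `ClassFinder` and `find_by_class` and the sample markup are made up for illustration. Because the parser hands over attributes as name/value pairs, the order they appear in the tag stops mattering.

```python
from html.parser import HTMLParser


class ClassFinder(HTMLParser):
    """Collect start tags whose class attribute equals a target value."""

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.matches = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; turning it into a dict
        # makes the lookup independent of the order in the source markup.
        attr_map = dict(attrs)
        if attr_map.get("class") == self.wanted_class:
            self.matches.append((tag, attr_map))


def find_by_class(html, wanted_class):
    """Hypothetical helper: return (tag, attributes) for each matching start tag."""
    finder = ClassFinder(wanted_class)
    finder.feed(html)
    finder.close()
    return finder.matches


# Same element, attributes in a different order (sample data, not from the thread).
first = '<div id="x" class="price">1</div>'
second = '<div class="price" id="x">1</div>'
same_result = find_by_class(first, "price") == find_by_class(second, "price")
```

Both inputs produce the same match, which is the part that gets painful to express with a single regex.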

      • fuckwit_mcbumcrumble@lemmy.world
        9 months ago

        My regex at work is full of (<[^>]+\s*){0,5} because we don’t care about 90 percent of the attributes. All we care about is that it’s class="data I want", and that it eventually takes me to that data.
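
A rough sketch of that "skip the attributes, find the class" idea in Python follows. This is not the regex quoted above. The function name `text_for_class` and the sample markup are assumptions for illustration, and the pattern assumes double-quoted attribute values that never contain `>`.

```python
import re


def text_for_class(html, class_value):
    """Return the text right after each opening tag with the exact class value."""
    pattern = re.compile(
        r'<\w+\b'                                    # opening tag name
        r'[^>]*?'                                    # skip attributes before class
        r'\sclass="' + re.escape(class_value) + r'"'  # the one attribute we care about
        r'[^>]*>'                                    # skip attributes after class
        r'([^<]*)'                                   # text up to the next tag
    )
    return [m.group(1).strip() for m in pattern.finditer(html)]


# Sample markup (made up): the class attribute sits in different positions.
sample = (
    '<td id="a" class="data I want" title="t">42</td>'
    '<td class="other">x</td>'
    '<td class="data I want">43</td>'
)
values = text_for_class(sample, "data I want")
```

It only captures text up to the next `<`, so nested markup inside the element would need a real parser.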