• dmtalon@infosec.pub · 1 year ago

    It has seemed to work on fewer and fewer sites for me recently, to the point that I don't visit it as often as I used to.

    But that tweet does sound like pretty bad news…

        • Cinner@lemmy.world · 1 year ago

          I also get infinitely CAPTCHA-blocked on Android when trying to connect to archive.md and the other domains. It doesn't matter if I use Firefox, Chrome, or the Samsung mobile browser.

            • Cinner@lemmy.world · 1 year ago

              Same. It started a few months ago. VPNs don't work either, so I guess it's some bug with Cloudflare, but it only seems to happen on their site.

                • Cinner@lemmy.world · 1 year ago

                  Interesting, thanks for sharing that. I don't use Cloudflare as my DNS resolver, though.

                  EDIT: That’s not true. I just double-checked my DNS settings for this network and it WAS using Cloudflare after all! I guess the private DNS settings weren’t working. Let’s hope this fixes the issue because it’s been a major PITA for months. I will update after the cache has had some time to clear. Thank you!!!

  • empireOfLove@lemmy.one · 1 year ago

    It stopped working on any of the sites I ever bothered to use it on anyway: most of them wised up to the crawler bypass and simply made a two-sentence tagline that hits the SEO terms visible to crawlers, with everything else hidden. Soooo, nothing of value lost, and Capital comes to claim its pie once again.

  • tun@lemm.ee · 1 year ago

    I used to use 12ft.io whenever I needed to read a paywalled article.

    Is the “Bypass Paywalls Clean” extension better than 12ft.io?

  • hedgehog@ttrpg.network · 1 year ago

    Some extra context and clarification from the thread regarding Vercel: they did warn him, starting two weeks ago. They've also stated he has a line open with customer support to get his other projects restored, but that hasn't happened yet.

  • MigratingtoLemmy@lemmy.world · 1 year ago

    Technically, if one were to disable the JS used for said paywall on a site, they would never see it again. I haven't personally done this, but has anyone tried?

      • Baleine@jlai.lu · 1 year ago

        On a majority of sites, all of the page's content will be present anyway, at least for SEO. And you get the added bonus that they don't ask for cookies, etc.

          • MigratingtoLemmy@lemmy.world · 1 year ago

            There are multiple scripts in use on almost every website. You need to find the one that pops up the paywall. Use NoScript or uBlock Origin (I use both), and with some trial and error it'll work just fine.
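
            For instance, many paywalls load their scripts from a third-party vendor's domain, and a uBlock Origin static filter can block scripts from that domain. The domain below is made up for illustration; the real one varies per site:

                ! Hypothetical filter - inspect the site to find the real vendor domain
                ||paywall-vendor.example^$script,third-party
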

            • killeronthecorner@lemmy.world · 1 year ago

              This is an oversimplification. Paywalls are generally designed to resist simplistic “remove the popover” approaches. Sites like 12ft.io and paywall-removal extensions use a myriad of solutions across different sites to circumvent both local and server-side paywalls.

            • myliltoehurts@lemm.ee · 1 year ago

              It would only work if they specifically bundled the functions that produce the paywall into a separate file (it is very unlikely for this to be the case). It also relies on the assumptions that the paywall is entirely front-end and that the “default” content is served without the paywall (as opposed to the default content being paywalled, with JavaScript required to load the actual content).

              • MigratingtoLemmy@lemmy.world · 1 year ago

                Not a specific file but a domain. And yes, if the processing is done server-side, then there is very little we can do about that. Note that I'm not asking anyone to disable every script on the page, just the specific script responsible for the paywall's pop-up/blurring.

                • myliltoehurts@lemm.ee · 1 year ago

                  I think I understood what you were suggesting: try disabling the script tags on a website one by one until we either try them all or get through the paywall.

                  My point is that it’s very unlikely to be feasible on most modern websites.

                  I mention files because very little functionality tends to live in inline scripts these days; 90-95% of the JavaScript will be loaded from separate .js files that the script tags reference.

                  In modern webapps the JavaScript usually goes through some sort of build system, like webpack, which does a number of things, but the important one for this case is that it restructures how the code is distributed into .js files referenced from script tags in the HTML. This makes it very difficult to target a specific bit of functionality to disable, since the paywall code is likely loaded from the same file as a hundred other bits of code that make other features work. Hence my point: sites would have to actively go out of their way to make their build process separate the paywall code from the other functionality in their codebase, which is probably not something they would do.

                  On top of this, the same build system may output differently named files after each build, since they're often named after a hash of their content: if any code changes in any of the sources, the output file name changes as well, in an unpredictable way. This is likely a much smaller issue, since I can't imagine them actively working on all parts of their codebase all the time.
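
                  To illustrate, the script tags such a build emits tend to look something like this (names and hashes invented for the example); nothing in them says “paywall”, and the hashes change whenever any source file does:

                      <!-- Hypothetical output of a webpack-style build -->
                      <script src="/static/js/main.3b2f1c9a.js"></script>
                      <script src="/static/js/vendor.8e04d6f2.js"></script>
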

                  Lastly, if the way a website works is that it loads the content and then some JavaScript hides it behind a paywall, then it's much simpler to either hide the elements in front of the content or make the content visible again just by using CSS and HTML, i.e. the way adblockers remove entire ad elements from pages.
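
                  When that is the case, a couple of lines of user CSS are often enough. The selectors below are hypothetical; you would inspect the page to find the real ones:

                      /* Hypothetical selectors - every site names these differently */
                      .paywall-overlay { display: none !important; }  /* hide the overlay */
                      body { overflow: auto !important; }             /* re-enable scrolling */
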

    • tias@discuss.tchncs.de · 1 year ago

      If the website developer is worth their salt, the article contents won’t be delivered from the web server until the reader has been authorized. So it doesn’t matter how much JS code you disable.
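
      Sketched in PHP with invented stand-ins, the shape is roughly this; because the check runs on the server, the full text never leaves it for an unauthorized reader:

          <?php
          // Invented stand-ins - in reality these would come from a database
          $teaser       = 'First two sentences of the article...';
          $fullArticle  = $teaser . ' The rest of the story.';
          $isSubscriber = false;  // in reality: a session/entitlement lookup

          // Only entitled readers are ever sent the full text, so no amount
          // of client-side tinkering can reveal it.
          echo $isSubscriber ? $fullArticle : $teaser;
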

  • satan@r.nf · 1 year ago

    It's just a glorified web scraper; I didn't know it was this popular. You could build a barebones scraper that outputs the page in less than 10 lines of PHP with curl. And 12ft.io used to inject its own code into the output; it's funny how people were okay with that.
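
    Something like this sketch, say; the URL and the crawler user agent are placeholders, not whatever 12ft.io actually ran:

        <?php
        // Barebones sketch: fetch a page while presenting a crawler user agent,
        // which is the trick such services rely on. Everything here is a placeholder.
        $ch = curl_init('https://news.example.com/some-article');
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
        curl_setopt($ch, CURLOPT_USERAGENT,
            'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)');
        echo curl_exec($ch);
        curl_close($ch);
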

    Anyone who has ever done web scraping knew that serving it from their own public domain was going to be a problem.

    Boy, people are lazy.