• @bassomitron@lemmy.world
    English
    25 months ago

    Do you know if there are any plans to quantize it? I’d love to test it, but my 3090 can’t handle 70b models without quantization, unfortunately.

    • midnight
      5 months ago

      There are quantized versions on Hugging Face. There’s a q2 version, but idk how well it performs.

    • ffhein
      English
      25 months ago

      Only quantized versions of the model were leaked. Any unquantized version you see was recreated from those quants and is not the original model. People have also requantized it from GGUF to EXL2, and probably to other formats too.
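
For a rough sense of why a 24 GB card like the 3090 needs quantization for a 70B model, here is a back-of-the-envelope sketch. It counts weights only (no KV cache, activations, or runtime overhead), and the bit widths are nominal: real quant formats store extra scale data, so their effective bits per weight run somewhat higher. The function name `weight_memory_gib` and the list of bit widths are just illustrative assumptions, not anything from a particular library.

```python
def weight_memory_gib(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    total_bytes = n_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


def fits_in_vram(n_params_billion: float, bits_per_weight: float, vram_gib: float) -> bool:
    """True if the weights alone would fit; real usage needs extra headroom."""
    return weight_memory_gib(n_params_billion, bits_per_weight) < vram_gib


if __name__ == "__main__":
    VRAM_GIB = 24  # RTX 3090
    # Nominal bit widths (assumption); actual quant formats add some overhead.
    for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4), ("2-bit", 2)]:
        gib = weight_memory_gib(70, bits)
        verdict = "fits" if fits_in_vram(70, bits, VRAM_GIB) else "does not fit"
        print(f"70B @ {label}: ~{gib:.1f} GiB of weights, {verdict} in {VRAM_GIB} GiB")
```

By this estimate only the roughly 2-bit range squeezes a 70B model's weights onto a single 24 GB card, which is why the q2 quant is the one that comes up here. Higher-bit quants would need layers offloaded to CPU RAM or a second GPU.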