Cohere Drops Command-R 35B 08-2024 Update, Just About a Perfect Local LLM for 24GB GPUs.

huggingface.co

cross-posted to:
fosai@lemmy.world

brucethemoose@lemmy.world to AI@lemmy.ml · English · 3 months ago

CohereForAI/c4ai-command-r-08-2024 · Hugging Face

cross-posted from: https://lemmy.world/post/19242887

On a 24GB GPU, I can run the full 131K context with a 3.75bpw quantization, and still a very long context at 4bpw. It should also be just barely fine-tunable in unsloth.

It’s pretty much perfect! Unlike the last iteration, they’re using very aggressive GQA, which makes the context cache (the KV cache) small, and it feels really smart at long-context stuff like storytelling, RAG, and document analysis (whereas Gemma 27B and Codestral 22B are probably better suited to short chats/code).
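
For anyone who wants to sanity-check the memory math, here is a rough back-of-the-envelope sketch. Every number in it is an assumption on my part, not something read from the model card: 35B parameters (from the name), 40 layers, 8 KV heads versus 64 without GQA, a head dimension of 128, and a cache quantized to 4 bits per element. Check the model's config.json and your backend's cache settings before trusting the output. It also ignores activations and runtime overhead.

```python
# Rough VRAM estimate: quantized weights + KV cache.
# All model dimensions below are ASSUMPTIONS for illustration only.

def weights_gb(n_params: float, bits_per_weight: float) -> float:
    """Size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def kv_cache_gb(n_layers: int, n_kv_heads: int, head_dim: int,
                n_tokens: int, bits_per_elem: float) -> float:
    """Size of the KV cache in GB: one K and one V vector per layer, KV head and token."""
    elems = 2 * n_layers * n_kv_heads * head_dim * n_tokens
    return elems * bits_per_elem / 8 / 1e9


if __name__ == "__main__":
    ctx = 131072  # "131K" context, assumed to mean 128 * 1024 tokens

    w = weights_gb(35e9, 3.75)               # assumed 35B params at 3.75bpw
    kv_gqa = kv_cache_gb(40, 8, 128, ctx, 4)  # assumed GQA layout, 4-bit cache
    kv_mha = kv_cache_gb(40, 64, 128, ctx, 4) # same model without GQA

    print(f"weights:        {w:.1f} GB")
    print(f"KV cache (GQA): {kv_gqa:.1f} GB")
    print(f"KV cache (MHA): {kv_mha:.1f} GB")
    print(f"total with GQA: {w + kv_gqa:.1f} GB")
```

With those assumed numbers, the weights come to roughly 16.4 GB and the GQA cache to roughly 5.4 GB, which is how the full context can squeeze into 24GB. The same cache without GQA would be 8x larger, around 43 GB.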
