NVIDIA Nemotron 3 (Hub)
NVIDIA Nemotron 3 open model checkpoints (dense and MoE) on Hugging Face for reasoning, coding, and agentic workloads at scale.
Why it is included
NVIDIA’s Nemotron 3 line ranks near the top of Hub `text-generation` traffic among large open MoE and dense stacks in 2025–2026.
Best for
Datacenter GPU deployments optimizing for TensorRT-LLM and NVIDIA reference recipes.
Strengths
- Large-scale open weights from NVIDIA
- MoE options
- Checkpoints aligned across Hugging Face and NVIDIA NGC
Limitations
- Optimization story is tied to NVIDIA’s stack; licensing may warrant legal review before redistribution
Good alternatives
Llama · Qwen3 MoE · Mixtral
Related tools
TensorRT-LLM
NVIDIA TensorRT–based library for optimized LLM inference on GPUs with multi-GPU and speculative decoding features.
Mistral AI (open models)
Mistral’s open-weight checkpoints (e.g. the 7B line and Mixtral MoE) and Apache-2.0-licensed code, alongside proprietary flagship lines; verify the license on each checkpoint.
vLLM
High-throughput LLM serving with PagedAttention, continuous batching, and OpenAI-compatible APIs for GPU clusters.
OpenAI gpt-oss (Hub)
OpenAI’s open-weight GPT-OSS checkpoints (e.g. 20B, 120B) hosted on Hugging Face for local inference and fine-tuning.
GPT-2 (Hugging Face)
Historic decoder-only LM family (124M–1.5B) under `openai-community` on the Hub—still a default tutorial and pipeline test target.
OPT (Hugging Face)
Meta’s Open Pretrained Transformer suite (125M–175B) released with reproducible logbooks—canonical Hub org `facebook` / `facebook/opt-*`.
