Local LLM runner and model library with simple CLI and API for workstation inference.
Plain C/C++ inference for LLaMA-class models with broad community backends.
High-throughput LLM serving with PagedAttention, continuous batching, and OpenAI-compatible APIs for GPU clusters.
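Because the server speaks the OpenAI wire format, any HTTP client can talk to it. A minimal stdlib-only sketch of the request shape, assuming a server is running locally on port 8000 (the model name and port are placeholders, not recommendations):

```python
import json
import urllib.request

# Sketch of an OpenAI-compatible /v1/chat/completions request. The model
# name below is an assumption: use whatever model the server actually loaded.
payload = {
    "model": "my-local-model",
    "messages": [{"role": "user", "content": "Say hello in one word."}],
    "max_tokens": 16,
    "temperature": 0.0,
}
req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# Uncomment once a server is actually running:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The same request shape works against any OpenAI-compatible endpoint, which is why several tools in this list interoperate.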
Structured generation language for fast serving: RadixAttention, constrained decoding, and multi-turn batching for frontier-class workloads.
Unified OpenAI-compatible proxy and SDK for 100+ model providers (local, cloud, Bedrock, Azure) with budgets, fallbacks, and logging.
Apple MLX-based LLM inference and training on Apple silicon: efficient Metal-backed transformers and examples for local chat models.
Single-file distributable combining LLM weights with a llama.cpp runtime: run large models from one executable across operating systems, on CPU or GPU.
Universal deployment stack compiling models to Vulkan, Metal, CUDA, and WebGPU via TVM/Unity for phones, browsers, and servers.
Memory-efficient CUDA inference kernels for quantized Llama-class models—popular in consumer GPU chat UIs.
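The core trick behind such kernels is group quantization: each small group of weights shares one float scale, and the weights themselves are stored as 4-bit integers. A pure-Python toy sketch of the idea (real kernels pack two 4-bit values per byte and dequantize on the fly inside the matmul):

```python
def quantize_group(weights, group_size=4):
    """Quantize a flat list of floats to (scales, int4 codes) per group."""
    scales, codes = [], []
    for i in range(0, len(weights), group_size):
        group = weights[i:i + group_size]
        # One shared scale per group; 7 is the max magnitude of a signed int4.
        scale = max(abs(w) for w in group) / 7 or 1.0
        scales.append(scale)
        codes.append([max(-8, min(7, round(w / scale))) for w in group])
    return scales, codes

def dequantize(scales, codes):
    return [c * s for s, group in zip(scales, codes) for c in group]

w = [0.10, -0.32, 0.05, 0.21, 1.4, -0.7, 0.0, 0.9]
scales, codes = quantize_group(w)
w_hat = dequantize(scales, codes)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
```

The reconstruction error is bounded by half the group scale, which is why smaller groups (at the cost of more scales) give better fidelity.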
NVIDIA TensorRT–based library for optimized LLM inference on GPUs with multi-GPU and speculative decoding features.
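Speculative decoding lets a cheap draft model propose several tokens that the expensive target model verifies in one pass. A toy greedy-variant sketch (real systems accept or reject probabilistically; both "models" here are made-up deterministic functions, not anything from TensorRT-LLM):

```python
def draft_model(prefix, k):
    # Assumption: a toy drafter that just counts upward mod 10.
    out, last = [], prefix[-1]
    for _ in range(k):
        last = (last + 1) % 10
        out.append(last)
    return out

def target_model(prefix):
    # Assumption: a toy target that also counts upward, but "disagrees"
    # after seeing the token 5.
    last = prefix[-1]
    return 0 if last == 5 else (last + 1) % 10

def speculative_step(prefix, k=4):
    """Accept the drafted tokens up to the first target disagreement."""
    proposals = draft_model(prefix, k)
    accepted = []
    for tok in proposals:
        expected = target_model(prefix + accepted)
        if tok == expected:
            accepted.append(tok)
        else:
            accepted.append(expected)  # take the target's token and stop
            break
    return accepted
```

When the draft agrees with the target, one verification pass yields several tokens, which is where the speedup comes from.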
YAML-configured fine-tuning for LLMs: LoRA, QLoRA, FSDP, and many architectures on top of Hugging Face trainers.
Optimized fine-tuning library claiming 2× faster LoRA/QLoRA with less VRAM via custom kernels and Hugging Face compatibility.
Meta’s Llama family of open **weights** (subject to the Llama license) with reference code, tooling, and downloads via the `meta-llama` org on Hugging Face.
Mistral’s open-weight checkpoints (e.g. 7B era, Mixtral MoE) and Apache-2.0–licensed **code** alongside proprietary flagship lines—verify each checkpoint.
Alibaba’s Qwen family (dense and MoE) with strong multilingual and coding variants; weights and code on Hugging Face under stated licenses per release.
DeepSeek open-weight models (e.g. V3/R1 lineage) with MIT or custom terms per release—high capability coding and reasoning checkpoints.
Google’s smaller open-**weights** Gemma line (Gemma 2/3, etc.) under the Gemma license terms, plus `gemma.cpp` for lightweight CPU inference.

Small language model family (Phi-3/4 lineage) emphasizing strong quality per parameter; weights on Hugging Face under Microsoft licenses per release.
Technology Innovation Institute’s Falcon open weights (7B–180B era), with Apache-2.0-licensed weights for many releases—landmark UAE-led open model line.
RNN-meets-transformer linear-attention LM architecture running with O(n) memory—unique open line for long-context and embedded inference.
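The reason such models run in O(n) memory is that linear attention can be computed as a recurrence: keep a running state S = Σ k_t ⊗ v_t and a normalizer z = Σ k_t, so each new token costs O(1) in sequence length. A heavily simplified pure-Python sketch of that family of ideas (not RWKV's actual formulation, which adds time decay and gating):

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def outer_add(S, k, v):
    # S += k ⊗ v, accumulated in place.
    for i in range(len(k)):
        for j in range(len(v)):
            S[i][j] += k[i] * v[j]

def linear_attention(qkv):
    """qkv: list of (q, k, v) vectors per token; k entries assumed positive."""
    d = len(qkv[0][0])
    S = [[0.0] * d for _ in range(d)]  # running sum of k ⊗ v
    z = [0.0] * d                      # running sum of k
    outputs = []
    for q, k, v in qkv:
        outer_add(S, k, v)
        z = [zi + ki for zi, ki in zip(z, k)]
        num = [dot(q, [S[i][j] for i in range(d)]) for j in range(d)]
        outputs.append([n / dot(q, z) for n in num])
    return outputs
```

Each output is a convex combination of past values weighted by k·q, but the whole history is folded into the fixed-size state S, never re-scanned.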
01.AI Yi open-weight bilingual models (EN/ZH focus) with Apache-2.0 or Yi license per checkpoint on Hugging Face.
1.1B-parameter Llama-architecture model trained on ~3T tokens—Apache-2.0 weights for fast experiments and teaching.
Allen AI fully open LLM **pipeline**: weights, training code, data mixes, and evaluation—research transparency flagship.
BigScience 176B multilingual causal LM—landmark collaborative open training effort on Jean Zay (weights under BigScience Responsible AI License).
EleutherAI framework and 20B-class models for training large autoregressive LMs with 3D parallelism—Apache-2.0 training stack.
Hugging Face TB small LM family (135M–1.7B) with Apache-2.0 weights, aimed at strong quality per parameter for on-device and edge use.
OpenAI’s open-weight GPT-OSS checkpoints (e.g. 20B, 120B) hosted on Hugging Face for local inference and fine-tuning.
Historic decoder-only LM family (124M–1.5B) under `openai-community` on the Hub—still a default tutorial and pipeline test target.
Meta’s Open Pretrained Transformer suite (125M–175B) released with reproducible logbooks—canonical Hub org `facebook` / `facebook/opt-*`.
Early open chat models fine-tuned from Llama-class bases by LMSYS—widely mirrored on the Hub (e.g. Vicuna-7B v1.5).
Z.ai GLM-5–generation checkpoints (e.g. FP8 builds) distributed on the Hub for text generation and agent-style use cases.
EleutherAI’s public scaling suite: matched GPT-NeoX–architecture models from 70M–12B with public datasets for interpretability research.
Alibaba’s Qwen2.5 Coder 7B instruct checkpoint on Hugging Face—optimized for code completion, synthesis, and tooling workflows.
Apple’s OpenELM family—openly released efficient language models with layer-wise scaling and Hub-hosted instruct variants.
NVIDIA Nemotron 3 open model checkpoints (dense and MoE) on Hugging Face for reasoning, coding, and agentic workloads at scale.
BigScience instruction-tuned BLOOM derivatives (e.g. BLOOMZ-560M–176B) for multilingual zero-shot instruction following on the Hub.
Data framework for LLM applications: ingestion, indexing, retrieval, and agents over documents and APIs.
Open-source embedding database focused on developer ergonomics for LLM apps: local dev, server mode, and simple APIs.
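Under the hood an embedding database stores (id, vector, document) triples and answers nearest-neighbor queries. A toy in-memory stand-in using cosine similarity (this is a sketch of the concept, not Chroma's actual API):

```python
import math

class TinyVectorStore:
    def __init__(self):
        self.items = []  # (id, vector, document)

    def add(self, item_id, vector, document):
        self.items.append((item_id, vector, document))

    def query(self, vector, top_k=1):
        def cosine(a, b):
            return dot_product(a, b) / (math.hypot(*a) * math.hypot(*b))
        ranked = sorted(self.items, key=lambda it: cosine(vector, it[1]),
                        reverse=True)
        return ranked[:top_k]

def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))

store = TinyVectorStore()
store.add("a", [1.0, 0.0], "cats and dogs")
store.add("b", [0.0, 1.0], "quarterly earnings")
best = store.query([0.9, 0.1], top_k=1)[0]
```

Real stores add persistence, metadata filtering, and approximate-nearest-neighbor indexes so queries stay fast beyond a few thousand vectors.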
Parameter-efficient fine-tuning methods (LoRA, adapters, prompt tuning) integrated with Transformers models.
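The LoRA idea in these methods is to keep the frozen weight W and add a trainable low-rank update (α/r)·BA, so only A and B are updated. A tiny pure-Python sketch of the forward pass (real implementations apply this inside attention and MLP projections):

```python
def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def lora_forward(W, A, B, x, alpha=1.0):
    """y = W x + (alpha / r) * B (A x), with A of shape r x d_in."""
    r = len(A)                       # rank = number of rows of A
    base = matvec(W, x)              # frozen path: W x
    delta = matvec(B, matvec(A, x))  # low-rank path: B (A x)
    scale = alpha / r
    return [b + scale * d for b, d in zip(base, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen weight (toy identity)
A = [[1.0, 1.0]]               # r = 1, so A is 1 x 2
B = [[0.5], [0.0]]             # B is 2 x 1
y = lora_forward(W, A, B, [2.0, 3.0])
```

With rank r much smaller than the weight dimensions, the trainable parameter count drops by orders of magnitude, which is the whole point of the method.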
Transformer Reinforcement Learning: train LLMs with RLHF, DPO, ORPO, and related preference optimization recipes.
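Of these recipes, DPO has the simplest core: a logistic loss on the log-probability margin between chosen and rejected responses, measured relative to a frozen reference model. A sketch on a single preference pair using the standard formula (the log-prob numbers are made up for illustration):

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.1):
    """-log sigmoid(beta * ((logp_w - ref_w) - (logp_l - ref_l)))."""
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# Policy prefers the chosen response more than the reference does: low loss.
low = dpo_loss(-10.0, -14.0, -11.0, -12.0)   # margin = +3
# Policy prefers the rejected response: higher loss.
high = dpo_loss(-14.0, -10.0, -12.0, -11.0)  # margin = -3
```

Because the reference model anchors the margin, the policy is rewarded for moving toward the preferred response without drifting arbitrarily far from its starting point.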
Hugging Face library for large shared datasets: memory mapping, streaming, Arrow-backed columns, and Hub integration.
Alibaba’s lightweight inference engine for mobile and edge—used for on-device LLMs and classic CV models with aggressive optimization.
Alibaba’s high-performance LLM inference engine (CUDA-focused) for production serving of diverse decoder architectures.
NVIDIA research-oriented toolkit for LLM KV-cache compression to stretch context within fixed VRAM budgets.
Open-source Svelte/TypeScript app that powers HuggingChat—multi-model chat, tools, and self-hostable UI patterns.
Curated recipes and code for aligning language models (preference optimization, DPO-style flows) on open stacks.
Rust LSP server that plugs LLM-backed completions into editors—designed to pair with local or API models.
Google library to extract structured fields from unstructured text with LLMs, source grounding, and visualization helpers.
ByteDance open agent harness for long-horizon research, coding, and creation with tools, memory, and subagents.
OpenAI’s MIT-licensed Python kit for multi-agent workflows, handoffs, guardrails, and tracing with the Responses API.
DeepSeek Janus series: unified multimodal understanding and generation models with MIT-licensed research code.
Framework for building LLM applications with chains, tools, and agents.
Open toolkit for browser automation driven by LLM agents.
LLM red-teaming framework for jailbreak and prompt-injection testing.
