Ollama
Local LLM runner and model library with a simple CLI and HTTP API for on-workstation inference.
Why it is included
Lowers the friction of privacy-preserving inference and offline experimentation with open-weight models.
Best for
Developers and power users testing models without cloud API bills.
If you use Windows, macOS, or paid tools
A local-model alternative to the ChatGPT, Claude, and Gemini cloud APIs for on-machine experimentation.
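The CLI and API mentioned above are easy to exercise: `ollama run <model>` gives an interactive session, and the daemon exposes a JSON-over-HTTP endpoint. The sketch below assumes a running Ollama server on its default port (11434) and an already-pulled model; the model name `llama3.2` is illustrative, substitute whatever you have pulled.

```python
import json
import urllib.request

# Ollama's default local generation endpoint.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt: str, model: str = "llama3.2") -> dict:
    """Assemble the JSON body for a single non-streaming generation."""
    return {"model": model, "prompt": prompt, "stream": False}

def generate(prompt: str, model: str = "llama3.2") -> str:
    """POST to the local Ollama server and return the generated text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt, model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running server and `ollama pull llama3.2` beforehand):
#   print(generate("Why is the sky blue? Answer in one sentence."))
```

Because the API is plain JSON over HTTP, the same request works from curl or any HTTP client; no SDK is required for quick experiments.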
Strengths
- Simple UX
- Model packaging
- Local-first
Limitations
- Constrained by local hardware (RAM/VRAM)
- Model licenses must be verified separately
Good alternatives
llama.cpp · vLLM
Related tools
AI & Machine Learning
PyTorch
Deep learning framework with strong research-to-production paths.
llama.cpp
Plain C/C++ inference for LLaMA-class models with broad community backends.
MLX LM
Apple MLX-based LLM inference and training on Apple silicon: efficient Metal-backed transformers and examples for local chat models.
llamafile
Single-file distributable LLM weights + llama.cpp runtime: run large models from one executable with broad OS CPU/GPU support.
ExLlamaV2
Memory-efficient CUDA inference kernels for quantized Llama-class models: popular in consumer GPU chat UIs.
vLLM
High-throughput LLM serving with PagedAttention, continuous batching, and OpenAI-compatible APIs for GPU clusters.
