Cross-platform inference accelerator for ONNX models: CPU, GPU, and mobile execution providers with graph optimizations.
Tags: inference, onnx, deployment, optimization
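To make execution providers and graph optimizations concrete, here is a minimal sketch using ONNX Runtime, one widely used accelerator of this kind; the model path and input shape are hypothetical placeholders. Providers are tried in the order listed, so a GPU build uses CUDA and other builds fall back to CPU.

```python
# Minimal sketch assuming ONNX Runtime; "model.onnx" and the
# (1, 3, 224, 224) float32 input are hypothetical placeholders.
import numpy as np
import onnxruntime as ort

# Enable the full set of graph optimizations (constant folding,
# node fusion, layout tweaks) before the session is created.
opts = ort.SessionOptions()
opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL

# Request providers in priority order, keeping only those this build
# actually ships with, so the same script runs on CPU-only hosts.
wanted = ["CUDAExecutionProvider", "CPUExecutionProvider"]
providers = [p for p in wanted if p in ort.get_available_providers()]

session = ort.InferenceSession("model.onnx", sess_options=opts, providers=providers)

# Feed a dummy input and run the graph (None = return all outputs).
name = session.get_inputs()[0].name
x = np.random.rand(1, 3, 224, 224).astype(np.float32)
outputs = session.run(None, {name: x})
print(outputs[0].shape)
```

Mobile builds expose the same session API, with platform providers such as NNAPI (Android) or Core ML (iOS) taking the place of CUDA in the provider list.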
Related tools:
- Unified model serving and deployment toolkit: package models as APIs, ship to Kubernetes, and manage runtimes (a sketch of the serving pattern follows this list).
- Retargetable MLIR-based compiler and runtime that lowers ML graphs from multiple frontends to CPUs, GPUs, and accelerators (a conceptual lowering sketch also follows).
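For the serving toolkit entry, "package models as APIs" is easiest to see as a pattern rather than a product feature. Below is a generic sketch of that pattern using FastAPI, not the toolkit's own interface; the endpoint, weights, and three-feature layout are invented for illustration.

```python
# Generic "model as an API" pattern (illustrative, not a specific
# toolkit's interface). The linear scorer stands in for a real artifact.
import numpy as np
from fastapi import FastAPI

app = FastAPI()

# Hypothetical trained weights; a real service would load a saved model.
WEIGHTS = np.array([0.4, -0.2, 0.1], dtype=np.float32)

@app.post("/predict")
def predict(features: list[float]) -> dict:
    # Expects exactly three features to match the toy weight vector.
    x = np.asarray(features, dtype=np.float32)
    return {"score": float(x @ WEIGHTS)}

# Serve locally with: uvicorn main:app --port 8080
# Shipping to Kubernetes then means containerizing this server and
# deploying it behind a Service; serving toolkits automate those steps.
```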
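For the compiler entry, the core idea is lowering: a target-neutral op graph is rewritten, stage by stage, into target-specific kernel calls. The toy IR and kernel tables below are invented for illustration and compress the many dialect-to-dialect passes a real MLIR-based retargetable compiler runs.

```python
# Conceptual sketch of lowering (invented IR, not a real compiler's API):
# the same graph maps to different kernel tables per target.
from dataclasses import dataclass

@dataclass
class Node:
    op: str             # target-neutral op name
    inputs: list[str]   # tensor names consumed
    output: str         # tensor name produced

# High-level graph for y = relu(x @ w).
graph = [Node("matmul", ["x", "w"], "t0"), Node("relu", ["t0"], "y")]

# Retargeting = swapping the kernel table the lowering consults.
KERNELS = {
    "cpu": {"matmul": "cpu.sgemm", "relu": "cpu.max0"},
    "gpu": {"matmul": "gpu.gemm_tensorcore", "relu": "gpu.eltwise_max0"},
}

def lower(graph: list[Node], target: str) -> list[tuple]:
    # One kernel launch per node; a real pipeline also fuses and tiles.
    table = KERNELS[target]
    return [(table[n.op], n.inputs, n.output) for n in graph]

for instr in lower(graph, "gpu"):
    print(instr)
```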