Browse & filter

Filter by platform, license text, maturity, maintenance cadence, and editorial tags like privacy-focused or self-hosted. Search matches names, summaries, tags, and use cases.

MLC LLM

Also strong

Universal deployment stack that compiles models to Vulkan, Metal, CUDA, and WebGPU via Apache TVM Unity, targeting phones, browsers, and servers.

llm, edge, webgpu, mobile, compilation

AI & Machine Learning

Google Gemma

Also strong

Google’s smaller open-weights Gemma line (Gemma 2, Gemma 3, and so on) under the Gemma license terms, plus `gemma.cpp` for lightweight CPU inference.

llm, open-weights, google, edge, foundation-model

AI & Machine Learning

Microsoft Phi

Also strong

Small language model family (Phi-3/Phi-4 lineage) emphasizing strong quality per parameter; weights are on Hugging Face, with Microsoft setting the license per release.

llm, slm, microsoft, onnx, edge

AI & Machine Learning

TinyLlama

Honorable mention

1.1B-parameter Llama-architecture model trained on ~3T tokens, with Apache-2.0 weights for fast experiments and teaching.

llm, slm, apache-2, education, edge

AI & Machine Learning

SmolLM

Honorable mention

Small language model family (135M–1.7B) from Hugging Face's HuggingFaceTB team, with Apache-2.0 weights, aimed at strong quality per parameter for on-device and edge use.

llm, slm, edge, apache-2, huggingface

AI & Machine Learning

OpenVINO

Also strong

Intel toolkit to optimize and deploy deep learning on Intel CPUs, GPUs, and NPUs with model conversion and runtime APIs.

inference, intel, edge, optimization

AI & Machine Learning

MediaPipe

Also strong

Google’s cross-platform pipelines for perception: face/hand pose, segmentation, and on-device ML graphs for mobile and desktop.

computer-vision, edge, mobile, real-time

AI & Machine Learning

MNN

Also strong

Alibaba’s lightweight inference engine for mobile and edge, used for on-device LLMs and classic CV models with aggressive optimization.

inference, edge, mobile, llm, taaft-repositories