OpenCatalog, curated by FLOSSK

AI & Machine Learning

PyTorch and TensorFlow for training; local inference tools where privacy matters.

Tools in this category (181)

Local LLM runner and model library with simple CLI and API for workstation inference.

llm, local, inference
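
The catalog does not name the runner, so this is only an illustrative sketch: local LLM runners of this kind typically expose an Ollama-style HTTP API on `localhost:11434` with a `/api/generate` endpoint. The endpoint path, port, and model name below are assumptions; the snippet only builds the request so it runs without a server.

```python
import json

# Assumed Ollama-style local endpoint; adjust for your runner.
BASE_URL = "http://localhost:11434"

def build_generate_request(model: str, prompt: str) -> tuple[str, bytes]:
    """Return the URL and JSON body for a non-streaming generate call."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return f"{BASE_URL}/api/generate", json.dumps(payload).encode("utf-8")

url, body = build_generate_request("llama3", "Why is the sky blue?")
# To actually send it (requires the local server to be running):
#   import urllib.request
#   req = urllib.request.Request(url, data=body,
#                                headers={"Content-Type": "application/json"})
#   print(json.loads(urllib.request.urlopen(req).read())["response"])
```

Because the API is plain HTTP with JSON, any language's standard HTTP client is enough for workstation scripting.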

High-throughput LLM serving with PagedAttention, continuous batching, and OpenAI-compatible APIs for GPU clusters.

llm, inference, serving, gpu, api
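
A minimal sketch of the paged-KV-cache idea behind PagedAttention (illustrative only, not the engine's real code): each sequence maps logical token positions to fixed-size physical blocks allocated on demand, so memory grows with actual sequence length instead of reserving the maximum up front. Block size and counts below are made up.

```python
BLOCK_SIZE = 16  # tokens per physical cache block (illustrative)

class PagedKVCache:
    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))
        self.block_tables: dict[int, list[int]] = {}  # seq_id -> physical blocks

    def append_token(self, seq_id: int, pos: int) -> int:
        """Return the physical block holding position `pos`, allocating on demand."""
        table = self.block_tables.setdefault(seq_id, [])
        if pos // BLOCK_SIZE >= len(table):       # crossed a block boundary
            table.append(self.free_blocks.pop())  # grab any free block
        return table[pos // BLOCK_SIZE]

cache = PagedKVCache(num_blocks=64)
for pos in range(40):              # a 40-token sequence...
    cache.append_token(seq_id=0, pos=pos)
# ...occupies only ceil(40/16) = 3 blocks; the rest stay free for other sequences
```

This block-table indirection is also what lets continuous batching admit new requests whenever blocks free up.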

Structured generation language for fast serving: RadixAttention, constrained decoding, and multi-turn batching for frontier-class workloads.

llm, inference, serving, gpu, structured-output

Unified OpenAI-compatible proxy and SDK for 100+ model providers (local, cloud, Bedrock, Azure) with budgets, fallbacks, and logging.

llm, api, proxy, multi-provider, gateway

Apple MLX-based LLM inference and training on Apple silicon: efficient Metal-backed transformers and examples for local chat models.

llm, apple-silicon, inference, metal, local

Single-file distributable LLM weights + llama.cpp runtime: run large models from one executable with broad OS CPU/GPU support.

llm, local, inference, portable

Universal deployment stack compiling models to Vulkan, Metal, CUDA, and WebGPU via TVM/Unity for phones, browsers, and servers.

llm, edge, webgpu, mobile, compilation

Memory-efficient CUDA inference kernels for quantized Llama-class models—popular in consumer GPU chat UIs.

llm, inference, cuda, quantization, local

NVIDIA TensorRT–based library for optimized LLM inference on GPUs with multi-GPU and speculative decoding features.

llm, inference, nvidia, tensorrt, gpu

YAML-configured fine-tuning for LLMs: LoRA, QLoRA, FSDP, and many architectures on top of Hugging Face trainers.

llm, fine-tuning, lora, training, huggingface

Optimized fine-tuning library claiming 2× faster LoRA/QLoRA with less VRAM via custom kernels and Hugging Face compatibility.

llm, fine-tuning, lora, training, optimization

Meta’s Llama family of open **weights** (subject to Llama license) with reference code, tooling, and downloads via Hugging Face and meta-llama org.

llm, open-weights, meta, foundation-model

Mistral’s open-weight checkpoints (e.g. 7B era, Mixtral MoE) and Apache-2.0–licensed **code** alongside proprietary flagship lines—verify each checkpoint.

llm, open-weights, moe, europe, foundation-model

Alibaba’s Qwen family (dense and MoE) with strong multilingual and coding variants; weights and code on Hugging Face under stated licenses per release.

llm, open-weights, coding, multilingual, foundation-model

DeepSeek open-weight models (e.g. V3/R1 lineage) with MIT or custom terms per release—high capability coding and reasoning checkpoints.

llm, open-weights, reasoning, coding, foundation-model

Google’s smaller open **weights** Gemma line (Gemma 2/3, etc.) with Gemma license terms, plus `gemma.cpp` for lightweight CPU inference.

llm, open-weights, google, edge, foundation-model

Small language model family (Phi-3/4 lineage) emphasizing strong quality per parameter; weights on Hugging Face under Microsoft licenses per release.

llm, slm, microsoft, onnx, edge

Technology Innovation Institute's Falcon open weights (7B–180B era), with Apache-2.0 weight licenses for many releases: a landmark UAE-led open model line.

llm, open-weights, apache-2, foundation-model
Honorable mention

RNN-meets-transformer linear-attention LM architecture running with O(n) memory—unique open line for long-context and embedded inference.

llm, architecture, linear-attention, open-weights
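
To make the O(n)-memory claim concrete, here is a toy scalar sketch of a linear-attention-style recurrence (illustrative only; the decay value and normalization are made up, not the project's actual formulation): instead of attending over all past tokens, a fixed-size running state is updated once per token.

```python
def linear_attention(keys, values, decay=0.9):
    """Toy per-token recurrence: constant memory regardless of context length."""
    state = 0.0    # fixed-size summary of key-weighted values seen so far
    norm = 0.0     # matching normalizer, decayed the same way
    outputs = []
    for k, v in zip(keys, values):
        state = decay * state + k * v
        norm = decay * norm + k
        outputs.append(state / norm if norm else 0.0)
    return outputs

out = linear_attention([1.0, 1.0, 1.0], [2.0, 4.0, 6.0])
# each output is a decayed average of past values; no n x n attention matrix
```

Full softmax attention must keep every past key/value pair, so its per-step cost grows with position; the recurrence above keeps only two scalars per channel, which is what makes long-context and embedded inference attractive.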
Honorable mention

01.AI Yi open-weight bilingual models (EN/ZH focus) with Apache-2.0 or Yi license per checkpoint on Hugging Face.

llm, open-weights, multilingual, chinese, english

1.1B-parameter Llama-architecture model trained on ~3T tokens—Apache-2.0 weights for fast experiments and teaching.

llm, slm, apache-2, education, edge

Allen AI fully open LLM **pipeline**: weights, training code, data mixes, and evaluation—research transparency flagship.

llm, open-science, training, research, transparent

BigScience 176B multilingual causal LM—landmark collaborative open training effort on Jean Zay (weights under BigScience Responsible AI License).

llm, multilingual, open-weights, research, history

EleutherAI framework and 20B-class models for training large autoregressive LMs with 3D parallelism—Apache-2.0 training stack.

llm, training, distributed, research, eleutherai

Hugging Face TB small LM family (135M–1.7B) with Apache-2.0 weights aimed at on-device and edge quality per size.

llm, slm, edge, apache-2, huggingface

OpenAI’s open-weight GPT-OSS checkpoints (e.g. 20B, 120B) hosted on Hugging Face for local inference and fine-tuning.

llm, huggingface, open-weights, openai, text-generation

Historic decoder-only LM family (124M–1.5B) under `openai-community` on the Hub—still a default tutorial and pipeline test target.

llm, huggingface, gpt-2, education, text-generation

Meta’s Open Pretrained Transformer suite (125M–175B) released with reproducible logbooks—canonical Hub org `facebook` / `facebook/opt-*`.

llm, huggingface, meta, research, text-generation

Early open chat models fine-tuned from Llama-class bases by LMSYS—widely mirrored on the Hub (e.g. Vicuna-7B v1.5).

llm, huggingface, chat, instruction-tuning, lmsys

Z.ai GLM-5–generation checkpoints (e.g. FP8 builds) distributed on the Hub for text generation and agent-style use cases.

llm, huggingface, glm, text-generation, z.ai

EleutherAI’s public scaling suite: matched GPT-NeoX–architecture models from 70M–12B with public datasets for interpretability research.

llm, huggingface, research, eleutherai, interpretability

Apple’s OpenELM family—openly released efficient language models with layer-wise scaling and Hub-hosted instruct variants.

llm, huggingface, apple, efficient, text-generation

NVIDIA Nemotron 3 open model checkpoints (dense and MoE) on Hugging Face for reasoning, coding, and agentic workloads at scale.

llm, huggingface, nvidia, moe, text-generation

BigScience instruction-tuned BLOOM derivatives (e.g. BLOOMZ-560M–176B) for multilingual zero-shot instruction following on the Hub.

llm, huggingface, multilingual, instruction, bigscience

Open platform for the ML lifecycle: experiment tracking, model registry, packaging, evaluation, and production monitoring.

mlops, experiment-tracking, registry, python

Data version control for ML: version datasets and models with Git, cloud storage, and reproducible pipelines.

mlops, data-versioning, reproducibility, pipelines
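
The core idea behind Git-based data versioning can be sketched in a few lines (illustrative only; the real tool's metafile format and hash choices differ): large files are tracked by content hash, and only a small pointer file is committed to Git, so any change to the data shows up as a pointer diff.

```python
import hashlib

def content_address(data: bytes) -> str:
    """Hash file contents so versions are identified by what they contain."""
    return hashlib.md5(data).hexdigest()

dataset_v1 = b"label,image\ncat,img001.png\n"
dataset_v2 = dataset_v1 + b"dog,img002.png\n"

# Hypothetical pointer files: these small dicts are what Git would track,
# while the dataset bytes live in cloud or local cache storage.
pointer_v1 = {"path": "data.csv", "md5": content_address(dataset_v1)}
pointer_v2 = {"path": "data.csv", "md5": content_address(dataset_v2)}
# The pointers differ, so Git records the dataset change without storing the bytes.
```

Pipelines built on the same hashes become reproducible: if inputs and code hash the same, a stage's cached output can be reused.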

Kubernetes-native toolkit for ML: notebooks, training jobs, pipelines, tuning, and serving components you compose on-cluster.

mlops, kubernetes, pipelines, platform

Distributed compute framework for Python: scale data loading, training, hyperparameter search, and online serving (Ray Serve).

distributed, python, serving, training

Composable transformations (grad, vmap, pmap) plus NumPy-like API for high-performance ML research on accelerators.

framework, research, gpu, tpu, python

High-level multi-backend deep learning API (TensorFlow, JAX, PyTorch) focused on ergonomics and fast iteration.

framework, deep-learning, python, education

Cross-platform inference accelerator for ONNX models: CPU, GPU, and mobile execution providers with graph optimizations.

inference, onnx, deployment, optimization

Intel toolkit to optimize and deploy deep learning on Intel CPUs, GPUs, and NPUs with model conversion and runtime APIs.

inference, intel, edge, optimization

Microsoft library for extreme-scale model training: ZeRO optimizer states, pipeline parallelism, and inference kernels.

training, distributed, large-models, pytorch

Automatic hyperparameter optimization framework with pruning, distributed search, and lightweight integration hooks.

hyperparameter-tuning, automl, python, optimization
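
What such a framework automates can be sketched with a toy search loop (an illustrative stand-in: the objective function is made up, real frameworks use smarter samplers than seeded random search, and real pruning compares against other trials): sample a configuration, evaluate it step by step, and abandon clearly hopeless trials before they finish.

```python
import random

random.seed(0)  # deterministic for the demo

def objective(lr: float, step: int) -> float:
    # Pretend validation loss: best around lr = 0.1, improving with more steps.
    return (lr - 0.1) ** 2 + 1.0 / (step + 1)

best = (float("inf"), None)
for trial in range(20):
    lr = 10 ** random.uniform(-4, 0)          # log-uniform sample over [1e-4, 1]
    for step in range(10):
        loss = objective(lr, step)
        if step == 4 and loss > 1.0:          # prune a hopeless trial mid-run
            break
    else:                                     # trial ran to completion
        if loss < best[0]:
            best = (loss, lr)

best_loss, best_lr = best
```

The division of labor is the point: the user supplies only the objective; sampling, pruning, and (in real frameworks) distributed coordination come from the library.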

Computer vision library: classic CV, DNN module for running exported models, camera pipelines, and real-time processing.

computer-vision, cpp, python, robotics

Google’s cross-platform pipelines for perception: face/hand pose, segmentation, and on-device ML graphs for mobile and desktop.

computer-vision, edge, mobile, real-time

YOLO-family detection/segmentation/pose training and deployment toolkit with CLI and Python API.

computer-vision, object-detection, yolo, training

OpenAI’s open-source speech recognition model family with multilingual transcription and translation checkpoints.

speech, asr, audio, multilingual

CTranslate2 reimplementation of Whisper for faster CPU/GPU inference with lower memory use than reference PyTorch.

speech, asr, inference, optimization

Python library to spin up shareable web UIs for models—inputs, outputs, and multi-step demos with minimal code.

ui, demo, python, prototyping

Python-first framework for data and ML apps: reactive widgets, charts, and model calls in a single script.

ui, dashboard, python, data-apps

Data framework for LLM applications: ingestion, indexing, retrieval, and agents over documents and APIs.

rag, llm, agents, retrieval

Deepset framework for production-ready search and RAG: pipelines, document stores, and evaluation for QA systems.

rag, search, nlp, pipelines

Open-source embedding database focused on developer ergonomics for LLM apps: local dev, server mode, and simple APIs.

vector-database, embeddings, rag, llm

Vector search engine with filtering, REST/gRPC APIs, and cloud or self-hosted deployment for embeddings at scale.

vector-database, search, rag, embeddings

Open-source vector database built for billion-scale similarity search with distributed deployment options.

vector-database, search, scale-out, embeddings

Hugging Face library for diffusion models: training, inference, schedulers, and community pipelines in PyTorch.

diffusion, generative, pytorch, images

Parameter-efficient fine-tuning methods (LoRA, adapters, prompt tuning) integrated with Transformers models.

fine-tuning, lora, transformers, llm
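
The arithmetic behind LoRA's efficiency is worth spelling out: a full update to a d_out x d_in weight matrix is replaced by two low-rank factors B (d_out x r) and A (r x d_in), shrinking trainable parameters from d_out * d_in to r * (d_out + d_in). The layer sizes below are illustrative, chosen to match a common 4096-wide transformer projection.

```python
# Hypothetical layer dimensions and LoRA rank for the back-of-envelope count.
d_in = d_out = 4096
r = 8

full_update = d_out * d_in              # params to train the full weight delta
lora_update = r * (d_out + d_in)        # params to train B and A instead
reduction = full_update / lora_update   # how many times fewer parameters
```

At r = 8 this one layer drops from ~16.8M to ~65K trainable parameters, a 256x reduction, which is why adapters fit comfortably in consumer VRAM and merge back into the base weights after training.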

Transformer Reinforcement Learning: train LLMs with RLHF, DPO, ORPO, and related preference optimization recipes.

rlhf, dpo, alignment, llm
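
A sketch of the DPO objective on a single preference pair (the log-probability numbers below are made up for illustration): the loss pushes the policy's log-probability margin between the chosen and rejected responses above the reference model's margin, scaled by beta.

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """-log sigmoid(beta * margin) on one (chosen, rejected) pair of log-probs."""
    margin = (policy_chosen - policy_rejected) - (ref_chosen - ref_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# If the policy already prefers the chosen response more strongly than the
# reference does, the margin is positive and the loss falls below log(2):
loss = dpo_loss(policy_chosen=-1.0, policy_rejected=-3.0,
                ref_chosen=-1.5, ref_rejected=-2.5)
```

The appeal over classic RLHF is visible in the formula: no reward model is trained and no sampling loop is needed; preference pairs are consumed directly with ordinary gradient descent.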

Hugging Face library to run PyTorch training on CPU, single GPU, multi-GPU, or TPU with minimal code changes.

distributed, training, pytorch, huggingface

Hugging Face library for large shared datasets: memory mapping, streaming, Arrow-backed columns, and Hub integration.

data, nlp, llm, huggingface

Unified model serving and deployment toolkit: package models as APIs, ship to Kubernetes, and manage runtimes.

serving, deployment, mlops, api

Alibaba’s lightweight inference engine for mobile and edge—used for on-device LLMs and classic CV models with aggressive optimization.

inference, edge, mobile, llm, taaft-repositories

Alibaba’s high-performance LLM inference engine (CUDA-focused) for production serving of diverse decoder architectures.

llm, inference, serving, gpu, taaft-repositories

Physics-ML / scientific deep learning framework: neural operators, PINNs, and domain-parallel training on GPUs.

physics-ml, simulation, pytorch, gpu, taaft-repositories

NVIDIA library for FP8/FP4 and fused kernels on Hopper/Ada-class GPUs to accelerate Transformer training and inference.

training, transformers, fp8, nvidia, taaft-repositories

NVIDIA research-oriented toolkit for LLM KV-cache compression to stretch context within fixed VRAM budgets.

llm, kv-cache, compression, inference, taaft-repositories

Flexible, high-performance serving system for TensorFlow (and related) models with versioning, batching, and gRPC/REST.

serving, tensorflow, inference, grpc, taaft-repositories

Retargetable MLIR-based compiler and runtime to lower ML graphs to CPUs, GPUs, and accelerators from multiple frontends.

compiler, mlir, runtime, deployment, taaft-repositories

AutoTrain Advanced: low-code training flows for classification, LLM fine-tunes, and diffusion tasks tied to the Hub.

fine-tuning, automl, huggingface, taaft-repositories

TypeScript/JavaScript libraries to call Inference API, manage Hub assets, and build browser or Node AI features.

huggingface, javascript, typescript, inference, taaft-repositories

Open-source Svelte/TypeScript app that powers HuggingChat—multi-model chat, tools, and self-hostable UI patterns.

chat, ui, llm, self-hosted, taaft-repositories

Rust LSP server that plugs LLM-backed completions into editors—designed to pair with local or API models.

lsp, ide, llm, developer-tools, taaft-repositories

Contrastive vision–language pretraining reference implementation: map images and text to a shared embedding space.

multimodal, vision, nlp, embeddings, taaft-repositories
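
A toy illustration of what a shared image-text embedding space buys you (the vectors below are made up, not real model outputs): after contrastive pretraining, an image and its matching caption point in similar directions, so cosine similarity ranks the right caption highest.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

image_emb = [0.9, 0.1, 0.2]                 # hypothetical embedding of a dog photo
text_embs = {
    "a photo of a dog": [0.8, 0.2, 0.1],    # nearby direction -> should win
    "a photo of a cat": [0.1, 0.9, 0.3],
}
best_caption = max(text_embs, key=lambda t: cosine(image_emb, text_embs[t]))
```

This ranking trick is exactly how zero-shot classification works with such models: class names are embedded as captions and the nearest one is the prediction.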

Google Research pretrained time-series foundation model for forecasting with open Apache-2.0 code and checkpoints.

time-series, forecasting, foundation-model, taaft-repositories

Google library to extract structured fields from unstructured text with LLMs, source grounding, and visualization helpers.

llm, extraction, structured-output, taaft-repositories

ByteDance open agent harness for long-horizon research, coding, and creation with tools, memory, and subagents.

agents, orchestration, llm, taaft-repositories

DeepSeek Janus series: unified multimodal understanding and generation models with MIT-licensed research code.

multimodal, vision, llm, deepseek, taaft-repositories

Open-source TypeScript ‘AI coworker’ framework with memory, tool use, and agent workflows for product integration.

agents, typescript, memory, taaft-repositories

Apple’s Python utilities to convert, compress, and validate models for Core ML deployment on Apple devices.

coreml, on-device, apple, conversion, taaft-repositories

Desktop app to orchestrate multiple AI CLI agents with Kanban-style flows.

agents, desktop, orchestration

Run agents that work for you in the background based on what you do.

definitive-opensource, commercial, disruptive

Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web. Make your own persistent autonomous agent on top!

definitive-opensource, cli, manual

Open-source text-to-SQL and text-to-chart GenBI agent with a semantic layer. Ask your database questions in natural language — get accurate SQL, charts, and BI insights. Supports 12+ data sources (PostgreSQL, BigQuery, Snowflake, etc.) and any LLM (OpenAI, Claude, Gemini, Ollama).

definitive-opensource

Create agents that monitor and act on your behalf. Your agents are standing by!

definitive-opensource

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools so that you can focus on what matters.

definitive-opensource

Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.

definitive-opensource, cli-plus

The most powerful and modular diffusion model GUI, API, and backend, with a graph/nodes interface.

definitive-opensource

Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies. The solution offers an industry-leading WebUI and serves as the foundation for multiple…

definitive-opensource

Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion WebUI (based on Gradio) to make development easier, optimize resource management, speed up inference, and study experimental features.

definitive-opensource

SD.Next: All-in-one WebUI for AI generative image and video creation, captioning and processing

definitive-opensource

Visualizer for neural network, deep learning and machine learning models

definitive-opensource

All-in-one LLM CLI tool featuring Shell Assistant, Chat-REPL, RAG, AI Tools & Agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.

definitive-opensource, cli

The ultimate space for work and life — to find, build, and collaborate with agent teammates that grow with you. We are taking the agent harness to the next level — enabling multi-agent collaboration, effortless agent team design, and introducing agents as the unit of work interaction.

definitive-opensource

Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.

definitive-opensource

LLM-Driven Extraction of Unstructured Data — Built for API Deployments & ETL Pipeline Workflows

definitive-opensource

A CLI tool to convert your codebase into a single LLM prompt with source tree, prompt templating, and token counting.

definitive-opensource, cli
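
The repo-to-prompt idea above can be sketched in a few lines (illustrative only; the actual tool's output format, templating, and tokenizer differ — whitespace splitting is a crude stand-in for real token counting):

```python
import pathlib
import tempfile

def pack_repo(root: pathlib.Path) -> str:
    """Concatenate a source tree listing plus all file contents into one prompt."""
    files = sorted(p for p in root.rglob("*") if p.is_file())
    tree = "\n".join(str(p.relative_to(root)) for p in files)
    bodies = "\n".join(f"--- {p.relative_to(root)} ---\n{p.read_text()}"
                       for p in files)
    return f"Source tree:\n{tree}\n\n{bodies}"

# Demo on a throwaway two-file "repo":
with tempfile.TemporaryDirectory() as d:
    root = pathlib.Path(d)
    (root / "main.py").write_text("print('hi')\n")
    (root / "README.md").write_text("demo repo\n")
    prompt = pack_repo(root)
    rough_tokens = len(prompt.split())  # crude budget check, not real tokenization
```

The value of the real tool is everything this sketch omits: ignore rules, prompt templates, and accurate per-model token counts so you know the result fits the context window.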

📦 Repomix is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or other AI tools like Claude, ChatGPT, DeepSeek, Perplexity, Gemini, Gemma, Llama, Grok, and more.

definitive-opensource, cli, manual

GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.

definitive-opensource

Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.

definitive-opensource

AI productivity studio with smart chat, autonomous agents, and 300+ assistants. Unified access to frontier LLMs

definitive-opensource

✨ Light and fast AI assistant. Supports: Web | iOS | macOS | Android | Linux | Windows.

definitive-opensource, cli-plus

Enhanced ChatGPT Clone: Features Agents, MCP, DeepSeek, Anthropic, AWS, OpenAI, Responses API, Azure, Groq, o1, GPT-5, Mistral, OpenRouter, Vertex AI, Gemini, Artifacts, AI model switching, message search, Code Interpreter, langchain, DALL-E-3, OpenAPI Actions, Functions, Secure

definitive-opensource
Honorable mention

Build, Evaluate, and Optimize AI Systems. Includes evals, RAG, agents, fine-tuning, synthetic data generation, dataset management, MCP, and more.

definitive-opensource

The all-in-one AI productivity accelerator. On-device and privacy-first, with no annoying setup or configuration.

definitive-opensource

Retrieval Augmented Generation (RAG) chatbot powered by Weaviate

definitive-opensource

RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs

definitive-opensource

High accuracy RAG for answering questions from scientific documents with citations

definitive-opensource, cli, manual
Honorable mention

SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.

definitive-opensource, cli

pingcap/autoflow is a Graph-RAG-based conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tidb.ai

definitive-opensource

Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.

definitive-opensource

Open-source Deep Research alternative to reason and search on private data. Written in Python.

definitive-opensource, cli

openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.

definitive-opensource
Honorable mention

AI wearables. Put it on, speak, and it transcribes automatically.

definitive-opensource

Privacy-first AI meeting assistant with 4× faster Parakeet/Whisper live transcription, speaker diarization, and Ollama summarization, built on Rust. 100% local processing; no cloud required. Meetily (Meetly AI, https://meetily.ai) is the #1 self-hosted, open-source AI meeting…

definitive-opensource

Replace 'hub' with 'ingest' in any GitHub URL to get a prompt-friendly extract of a codebase

definitive-opensource, manual
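
The URL trick above is literally a one-character-class substitution, shown here on a hypothetical example repository:

```python
repo_url = "https://github.com/octocat/Hello-World"
# Swap the first "hub" for "ingest": github.com -> gitingest.com
ingest_url = repo_url.replace("hub", "ingest", 1)
# -> "https://gitingest.com/octocat/Hello-World"
```

The count argument of 1 matters: it guards against a repository or path name that happens to contain "hub" further along the URL.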
Honorable mention

Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).

definitive-opensource, manual

Low-code framework for building custom LLMs, neural networks, and other AI models

definitive-opensource, manual

The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra

definitive-opensource
Honorable mention

Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.

definitive-opensource

ClaraVerse is an open-source, privacy-focused ecosystem to replace ChatGPT, Claude, n8n, and ImageGen with your own hosted LLM, keys, and compute. With desktop, iOS, and Android apps.

definitive-opensource

Fully local Manus AI. No APIs, no $200 monthly bills. Enjoy an autonomous agent that thinks, browses the web, and codes for the sole cost of electricity. 🔔 Official updates only via Twitter @Martin993886460 (beware of fake accounts).

definitive-opensource, manual

An open-source AI agent that brings the power of Gemini directly into your terminal.

definitive-opensource, tui, manual

An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.

definitive-opensource, manual

A powerful GUI app and toolkit for Claude Code: create custom agents, manage interactive Claude Code sessions, run secure background agents, and more.

definitive-opensource

AI Agent that handles engineering tasks end-to-end: integrates with developers’ tools, plans, executes, and iterates until it achieves a successful result.

definitive-opensource

Autoware - the world's leading open-source software project for autonomous driving

definitive-opensource

Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.

definitive-opensource

An open-source, privacy-focused alternative to NotebookLM for teams, with no data limits. Join our Discord: https://discord.gg/ejRNvftDp9

definitive-opensource

SwarmUI (formerly StableSwarmUI): a modular Stable Diffusion web UI with an emphasis on making power tools easily accessible, high performance, and extensibility.

definitive-opensource, manual
Honorable mention

5ire is a cross-platform desktop AI assistant and MCP client. It is compatible with major service providers and supports local knowledge bases and tools via Model Context Protocol servers.

definitive-opensource
Honorable mention

Open-source AI platform: AI chat with advanced features that works with every LLM.

definitive-opensource

🐬DeepChat - A smart assistant that connects powerful AI to your personal world

definitive-opensource

Context-aware AI assistant for your desktop. Ready to respond intelligently, seamlessly integrating multiple LLMs and MCP tools.

definitive-opensource

Own your AI. The native macOS harness for AI agents -- any model, persistent memory, autonomous execution, cryptographic identity. Built in Swift. Fully offline. Open source.

definitive-opensource