AI & Machine Learning
PyTorch and TensorFlow for training; local inference tools where privacy matters.
Tools in this category (181)
Deep learning framework with strong research-to-production paths.
Local LLM runner and model library with simple CLI and API for workstation inference.
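A runner like this exposes a simple local HTTP API. The sketch below targets Ollama's documented default endpoint (`http://localhost:11434/api/generate`); the model name is illustrative, and only the payload builder runs without a live server.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_payload(model: str, prompt: str, stream: bool = False) -> dict:
    """Assemble the JSON body the /api/generate endpoint expects."""
    return {"model": model, "prompt": prompt, "stream": stream}


def generate(model: str, prompt: str) -> str:
    """POST a prompt to a locally running server and return the reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]


# Build (but don't send) a request; calling generate() needs a running server.
payload = build_payload("llama3.2", "Why is the sky blue?")
```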
Plain C/C++ inference for LLaMA-class models with broad community backends.
High-throughput LLM serving with PagedAttention, continuous batching, and OpenAI-compatible APIs for GPU clusters.
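The core idea behind PagedAttention is splitting the KV cache into fixed-size blocks tracked per sequence, so memory is allocated on demand instead of reserved for the maximum context. A toy sketch of that bookkeeping (block size and allocator are illustrative, not vLLM's actual implementation):

```python
BLOCK_SIZE = 16  # tokens per KV block (illustrative granularity)


class BlockAllocator:
    """Toy allocator: hands out physical KV-cache blocks on demand."""

    def __init__(self, num_blocks: int):
        self.free = list(range(num_blocks))

    def allocate(self) -> int:
        return self.free.pop()


def blocks_needed(seq_len: int) -> int:
    """Ceiling division: one block per BLOCK_SIZE tokens."""
    return -(-seq_len // BLOCK_SIZE)


# A 45-token sequence occupies 3 blocks; its block table maps logical
# positions to whichever physical blocks happened to be free.
allocator = BlockAllocator(num_blocks=64)
block_table = [allocator.allocate() for _ in range(blocks_needed(45))]
```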
Structured generation language for fast serving: RadixAttention, constrained decoding, and multi-turn batching for frontier-class workloads.
Unified OpenAI-compatible proxy and SDK for 100+ model providers (local, cloud, Bedrock, Azure) with budgets, fallbacks, and logging.
Apple MLX-based LLM inference and training on Apple silicon: efficient Metal-backed transformers and examples for local chat models.
Single-file distributable LLM weights + llama.cpp runtime: run large models from one executable with broad OS CPU/GPU support.
Universal deployment stack compiling models to Vulkan, Metal, CUDA, and WebGPU via TVM/Unity for phones, browsers, and servers.
Memory-efficient CUDA inference kernels for quantized Llama-class models—popular in consumer GPU chat UIs.
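The quantization these kernels accelerate can be illustrated with a toy symmetric per-group scheme; real GPTQ/EXL2 formats are considerably more involved, so treat this as the arithmetic skeleton only.

```python
def quantize_group(weights, bits=4):
    """Symmetric per-group quantization: scale floats into small signed ints."""
    qmax = 2 ** (bits - 1) - 1  # 7 for 4-bit signed
    scale = max(abs(w) for w in weights) / qmax or 1.0
    return [round(w / scale) for w in weights], scale


def dequantize_group(q, scale):
    """Recover approximate floats from quantized ints and the group scale."""
    return [v * scale for v in q]


q, s = quantize_group([0.7, -0.35, 0.14, 0.0])
restored = dequantize_group(q, s)
```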
NVIDIA TensorRT–based library for optimized LLM inference on GPUs with multi-GPU and speculative decoding features.
YAML-configured fine-tuning for LLMs: LoRA, QLoRA, FSDP, and many architectures on top of Hugging Face trainers.
Optimized fine-tuning library claiming 2× faster LoRA/QLoRA with less VRAM via custom kernels and Hugging Face compatibility.
Meta’s Llama family of open **weights** (subject to Llama license) with reference code, tooling, and downloads via Hugging Face and meta-llama org.
Mistral’s open-weight checkpoints (e.g. 7B era, Mixtral MoE) and Apache-2.0–licensed **code** alongside proprietary flagship lines—verify each checkpoint.
Alibaba’s Qwen family (dense and MoE) with strong multilingual and coding variants; weights and code on Hugging Face under stated licenses per release.
DeepSeek open-weight models (e.g. V3/R1 lineage) with MIT or custom terms per release—high capability coding and reasoning checkpoints.
Google’s smaller open **weights** Gemma line (Gemma 2/3, etc.) with Gemma license terms, plus `gemma.cpp` for lightweight CPU inference.
Small language model family (Phi-3/4 lineage) emphasizing strong quality per parameter; weights on Hugging Face under Microsoft licenses per release.
Technology Innovation Institute Falcon open weights (7B–180B era), Apache-2.0–licensed for many releases—landmark UAE-led open model line.
RNN-meets-transformer linear-attention LM architecture running with O(n) memory—unique open line for long-context and embedded inference.
01.AI Yi open-weight bilingual models (EN/ZH focus) with Apache-2.0 or Yi license per checkpoint on Hugging Face.
1.1B-parameter Llama-architecture model trained on ~3T tokens—Apache-2.0 weights for fast experiments and teaching.
Allen AI fully open LLM **pipeline**: weights, training code, data mixes, and evaluation—research transparency flagship.
BigScience 176B multilingual causal LM—landmark collaborative open training effort on Jean Zay (weights under BigScience Responsible AI License).
EleutherAI framework and 20B-class models for training large autoregressive LMs with 3D parallelism—Apache-2.0 training stack.
Hugging Face TB small LM family (135M–1.7B) with Apache-2.0 weights aimed at on-device and edge quality per size.
OpenAI’s open-weight GPT-OSS checkpoints (e.g. 20B, 120B) hosted on Hugging Face for local inference and fine-tuning.
Historic decoder-only LM family (124M–1.5B) under `openai-community` on the Hub—still a default tutorial and pipeline test target.
Meta’s Open Pretrained Transformer suite (125M–175B) released with reproducible logbooks—canonical Hub org `facebook` / `facebook/opt-*`.
Early open chat models fine-tuned from Llama-class bases by LMSYS—widely mirrored on the Hub (e.g. Vicuna-7B v1.5).
Z.ai GLM-5–generation checkpoints (e.g. FP8 builds) distributed on the Hub for text generation and agent-style use cases.
EleutherAI’s public scaling suite: matched GPT-NeoX–architecture models from 70M–12B with public datasets for interpretability research.
Alibaba’s Qwen2.5 Coder 7B instruct checkpoint on Hugging Face—optimized for code completion, synthesis, and tooling workflows.
Apple’s OpenELM family—openly released efficient language models with layer-wise scaling and Hub-hosted instruct variants.
NVIDIA Nemotron 3 open model checkpoints (dense and MoE) on Hugging Face for reasoning, coding, and agentic workloads at scale.
BigScience instruction-tuned BLOOM derivatives (e.g. BLOOMZ-560M–176B) for multilingual zero-shot instruction following on the Hub.
Open platform for the ML lifecycle: experiment tracking, model registry, packaging, evaluation, and production monitoring.
Data version control for ML: version datasets and models with Git, cloud storage, and reproducible pipelines.
Kubernetes-native toolkit for ML: notebooks, training jobs, pipelines, tuning, and serving components you compose on-cluster.
Distributed compute framework for Python: scale data loading, training, hyperparameter search, and online serving (Ray Serve).
Composable transformations (grad, vmap, pmap) plus NumPy-like API for high-performance ML research on accelerators.
High-level multi-backend deep learning API (TensorFlow, JAX, PyTorch) focused on ergonomics and fast iteration.
Cross-platform inference accelerator for ONNX models: CPU, GPU, and mobile execution providers with graph optimizations.
Intel toolkit to optimize and deploy deep learning on Intel CPUs, GPUs, and NPUs with model conversion and runtime APIs.
Multi-framework inference server for TensorRT, ONNX, PyTorch, Python backends—dynamic batching, ensembles, and GPU sharing.
Microsoft library for extreme-scale model training: ZeRO optimizer states, pipeline parallelism, and inference kernels.
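The memory win ZeRO targets can be estimated with the usual mixed-precision accounting (16 bytes of fp16 params, fp16 grads, and fp32 Adam states per parameter, sharded across GPUs at stage 3). This is the standard back-of-envelope formula, not DeepSpeed's exact bookkeeping:

```python
def zero_stage3_bytes_per_gpu(n_params: float, n_gpus: int) -> float:
    """ZeRO-3 shards params (2B), grads (2B), and Adam states (12B) across GPUs."""
    per_param_bytes = 2 + 2 + 12  # fp16 params + fp16 grads + fp32 Adam states
    return n_params * per_param_bytes / n_gpus


# A 7B-parameter model on 8 GPUs: ~14 GB of model state per GPU
# instead of ~112 GB replicated on every GPU.
per_gpu_gb = zero_stage3_bytes_per_gpu(7e9, 8) / 1e9
```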
Automatic hyperparameter optimization framework with pruning, distributed search, and lightweight integration hooks.
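What such a framework automates can be sketched as plain random search over a toy objective; the objective, search space, and trial count here are all made up for illustration, and a real study would add pruning and smarter samplers.

```python
import random

def objective(lr: float, depth: int) -> float:
    """Stand-in validation loss; a real objective would train a model."""
    return (lr - 0.01) ** 2 + 0.001 * depth


random.seed(0)  # deterministic for the example
best = None
for _ in range(50):                      # 50 trials of random search
    lr = 10 ** random.uniform(-4, -1)    # log-uniform, as with a log-scaled float
    depth = random.randint(1, 8)         # integer-valued hyperparameter
    loss = objective(lr, depth)
    if best is None or loss < best[0]:
        best = (loss, lr, depth)
```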
Computer vision library: classic CV, DNN module for running exported models, camera pipelines, and real-time processing.
Google’s cross-platform pipelines for perception: face/hand pose, segmentation, and on-device ML graphs for mobile and desktop.
YOLO-family detection/segmentation/pose training and deployment toolkit with CLI and Python API.
OpenAI’s open-source speech recognition model family with multilingual transcription and translation checkpoints.
CTranslate2 reimplementation of Whisper for faster CPU/GPU inference with lower memory use than reference PyTorch.
Python library to spin up shareable web UIs for models—inputs, outputs, and multi-step demos with minimal code.
Python-first framework for data and ML apps: reactive widgets, charts, and model calls in a single script.
Data framework for LLM applications: ingestion, indexing, retrieval, and agents over documents and APIs.
Deepset framework for production-ready search and RAG: pipelines, document stores, and evaluation for QA systems.
Open-source embedding database focused on developer ergonomics for LLM apps: local dev, server mode, and simple APIs.
Vector search engine with filtering, REST/gRPC APIs, and cloud or self-hosted deployment for embeddings at scale.
Open-source vector database built for billion-scale similarity search with distributed deployment options.
Hugging Face library for diffusion models: training, inference, schedulers, and community pipelines in PyTorch.
Parameter-efficient fine-tuning methods (LoRA, adapters, prompt tuning) integrated with Transformers models.
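The LoRA update at the heart of these methods is just a scaled low-rank product added to a frozen weight: delta_W = (alpha / r) * B @ A. A tiny pure-Python sketch with made-up rank-1 matrices:

```python
def matmul(A, B):
    """Plain list-of-lists matrix multiply."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def lora_delta(B, A, alpha, r):
    """LoRA weight update: delta_W = (alpha / r) * B @ A, B is (d x r), A is (r x k)."""
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(B, A)]


# Rank-1 example: d=2, k=2, r=1, alpha=2 -> scale 2.
B = [[1.0], [0.5]]
A = [[2.0, 0.0]]
delta = lora_delta(B, A, alpha=2, r=1)
```

Only B and A are trained; the base weight W stays frozen, which is where the parameter savings come from.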
Transformer Reinforcement Learning: train LLMs with RLHF, DPO, ORPO, and related preference optimization recipes.
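The DPO objective reduces to a logistic loss on log-probability margins between chosen and rejected completions, measured relative to a frozen reference model. A minimal sketch with illustrative numbers:

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO: push the policy's (chosen - rejected) margin past the reference's."""
    margin = (pi_chosen - ref_chosen) - (pi_rejected - ref_rejected)
    return -math.log(1 / (1 + math.exp(-beta * margin)))


# When the policy prefers the chosen answer more strongly than the reference,
# the margin is positive and the loss drops below log(2).
loss = dpo_loss(pi_chosen=-10.0, pi_rejected=-14.0,
                ref_chosen=-11.0, ref_rejected=-12.0)
```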
Hugging Face library to run PyTorch training on CPU, single GPU, multi-GPU, or TPU with minimal code changes.
Hugging Face library for large shared datasets: memory mapping, streaming, Arrow-backed columns, and Hub integration.
Unified model serving and deployment toolkit: package models as APIs, ship to Kubernetes, and manage runtimes.
Alibaba’s lightweight inference engine for mobile and edge—used for on-device LLMs and classic CV models with aggressive optimization.
Alibaba’s high-performance LLM inference engine (CUDA-focused) for production serving of diverse decoder architectures.
Physics-ML / scientific deep learning framework: neural operators, PINNs, and domain-parallel training on GPUs.
NVIDIA library for FP8/FP4 and fused kernels on Hopper/Ada-class GPUs to accelerate Transformer training and inference.
NVIDIA research-oriented toolkit for LLM KV-cache compression to stretch context within fixed VRAM budgets.
Flexible, high-performance serving system for TensorFlow (and related) models with versioning, batching, and gRPC/REST.
Retargetable MLIR-based compiler and runtime to lower ML graphs to CPUs, GPUs, and accelerators from multiple frontends.
AutoTrain Advanced: low-code training flows for classification, LLM fine-tunes, and diffusion tasks tied to the Hub.
Official Python client for the Hugging Face Hub: upload/download models, datasets, and manage tokens and repos.
TypeScript/JavaScript libraries to call Inference API, manage Hub assets, and build browser or Node AI features.
Rust-based high-throughput server for sentence-transformers–class embedding models with GPU/CPU backends.
Open-source Svelte/TypeScript app that powers HuggingChat—multi-model chat, tools, and self-hostable UI patterns.
Curated recipes and code for aligning language models (preference optimization, DPO-style flows) on open stacks.
Rust LSP server that plugs LLM-backed completions into editors—designed to pair with local or API models.
Contrastive vision–language pretraining reference implementation: map images and text to a shared embedding space.
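The shared embedding space this pretraining produces is typically used by ranking captions with cosine similarity against an image embedding; the vectors below are made-up stand-ins for encoder outputs.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)


# Toy embeddings standing in for image/text encoder outputs.
image_vec = [0.6, 0.8, 0.0]
captions = {"a dog": [0.7, 0.7, 0.1], "a car": [0.0, 0.1, 0.99]}
best_caption = max(captions, key=lambda c: cosine(image_vec, captions[c]))
```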
Google Research pretrained time-series foundation model for forecasting with open Apache-2.0 code and checkpoints.
Google library to extract structured fields from unstructured text with LLMs, source grounding, and visualization helpers.
ByteDance open agent harness for long-horizon research, coding, and creation with tools, memory, and subagents.
OpenAI’s MIT-licensed Python kit for multi-agent workflows, handoffs, guardrails, and tracing with the Responses API.
DeepSeek Janus series: unified multimodal understanding and generation models with MIT-licensed research code.
Open-source TypeScript ‘AI coworker’ framework with memory, tool use, and agent workflows for product integration.
Apple’s Python utilities to convert, compress, and validate models for Core ML deployment on Apple devices.
Framework for building LLM applications with chains, tools, and agents.
State-of-the-art pretrained models for PyTorch, TensorFlow, and JAX.
Open toolkit for browser automation driven by LLM agents.
Desktop app to orchestrate multiple AI CLI agents with Kanban-style flows.
End-to-end platform for machine learning and deployment.
A natural language interface for computers
Run agents that work for you in the background, based on what you do.
Your agent in your terminal, equipped with local tools: writes code, uses the terminal, browses the web. Make your own persistent autonomous agent on top!
Open-source text-to-SQL and text-to-chart GenBI agent with a semantic layer. Ask your database questions in natural language — get accurate SQL, charts, and BI insights. Supports 12+ data sources (PostgreSQL, BigQuery, Snowflake, etc.) and any LLM (OpenAI, Claude, Gemini, Ollama)
Open-source framework for conversational voice AI agents
Create agents that monitor and act on your behalf. Your agents are standing by!
AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
Agent for collecting, processing, aggregating, and writing metrics, logs, and other arbitrary data.
Stable Diffusion web UI
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
Invoke is a leading creative engine for Stable Diffusion models, empowering professionals, artists, and enthusiasts to generate and create visual media using the latest AI-driven technologies, with an industry-leading WebUI.
Stable Diffusion WebUI Forge is a platform on top of Stable Diffusion WebUI (based on Gradio) to make development easier, optimize resource management, speed up inference, and study experimental features.
Focus on prompting and generating
SD.Next: All-in-one WebUI for AI generative image and video creation, captioning and processing
Visualizer for neural network, deep learning and machine learning models
All-in-one LLM CLI tool featuring Shell Assistant, Chat-REPL, RAG, AI Tools & Agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.
A space for work and life: find, build, and collaborate with agent teammates that grow with you. Enables multi-agent collaboration, effortless agent team design, and agents as the unit of work interaction.
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.
🧠 Leon is your open-source personal assistant.
LLM-Driven Extraction of Unstructured Data — Built for API Deployments & ETL Pipeline Workflows
🔥 The Web Data API for AI - Power AI agents with clean web data
Crawl a site to generate knowledge files to create your own custom GPT from a URL
A CLI tool to convert your codebase into a single LLM prompt with source tree, prompt templating, and token counting.
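The packing such a tool performs can be sketched naively; the header format and the 4-characters-per-token heuristic here are assumptions for illustration, not the tool's actual behavior.

```python
def pack_files(files: dict) -> str:
    """Concatenate a source tree into one prompt with per-file headers."""
    parts = []
    for path in sorted(files):
        parts.append(f"### {path}\n{files[path]}")
    return "\n\n".join(parts)


def rough_token_count(text: str) -> int:
    """Crude heuristic: roughly 4 characters per token for English-like text."""
    return len(text) // 4


prompt = pack_files({"src/main.py": "print('hi')", "README.md": "# Demo"})
```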
📦 Repomix is a powerful tool that packs your entire repository into a single, AI-friendly file. Perfect for when you need to feed your codebase to Large Language Models (LLMs) or other AI tools like Claude, ChatGPT, DeepSeek, Perplexity, Gemini, Gemma, Llama, Grok, and more.
Get your documents ready for gen AI
User-friendly AI Interface (Supports Ollama, OpenAI API, ...)
GPT4All: Run Local LLMs on Any Device. Open-source and available for commercial use.
Jan is an open source alternative to ChatGPT that runs 100% offline on your computer.
Powerful AI Client
The original local LLM interface. Text, vision, tool-calling, training, and more. 100% offline.
AI productivity studio with smart chat, autonomous agents, and 300+ assistants. Unified access to frontier LLMs
LLM Frontend for Power Users.
✨ Light and Fast AI Assistant. Support: Web | iOS | MacOS | Android | Linux | Windows
Enhanced ChatGPT clone featuring agents, MCP, DeepSeek, Anthropic, AWS, OpenAI, Responses API, Azure, Groq, o1, GPT-5, Mistral, OpenRouter, Vertex AI, Gemini, artifacts, AI model switching, message search, Code Interpreter, LangChain, DALL·E 3, OpenAPI Actions, and Functions.
Multi-Platform Package Manager for Stable Diffusion
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Build, run, manage agentic software at scale.
Build, Evaluate, and Optimize AI Systems. Includes evals, RAG, agents, fine-tuning, synthetic data generation, dataset management, MCP, and more.
The all-in-one AI productivity accelerator. On device and privacy first with no annoying setup or configuration.
An open-source RAG-based tool for chatting with your documents.
Retrieval Augmented Generation (RAG) chatbot powered by Weaviate
RAGFlow is a leading open-source Retrieval-Augmented Generation (RAG) engine that fuses cutting-edge RAG with Agent capabilities to create a superior context layer for LLMs
High accuracy RAG for answering questions from scientific documents with citations
SoTA production-ready AI retrieval system. Agentic Retrieval-Augmented Generation (RAG) with a RESTful API.
pingcap/autoflow is a Graph RAG–based conversational knowledge base tool built with TiDB Serverless Vector Storage. Demo: https://tidb.ai
Private AI platform for agents, assistants and enterprise search. Built-in Agent Builder, Deep research, Document analysis, Multi-model support, and API connectivity for agents.
Open Source Deep Research Alternative to Reason and Search on Private Data. Written in Python.
An autonomous agent that conducts deep research on any data using any LLM providers
openpilot is an operating system for robotics. Currently, it upgrades the driver assistance system on 300+ supported cars.
Train your AI self, amplify you, bridge the world
AI wearables. Put it on, speak, transcribe, automatically
Privacy-first AI meeting assistant with 4x faster Parakeet/Whisper live transcription, speaker diarization, and Ollama summarization, built on Rust. 100% local processing, no cloud required. Meetily (https://meetily.ai) is self-hosted and open source.
Replace 'hub' with 'ingest' in any GitHub URL to get a prompt-friendly extract of a codebase
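The URL trick is literally a string substitution, since replacing 'hub' with 'ingest' turns github.com into gitingest.com; the repo path below is hypothetical.

```python
def to_ingest_url(github_url: str) -> str:
    """Swap the domain: github.com -> gitingest.com (i.e. 'hub' -> 'ingest')."""
    return github_url.replace("github.com", "gitingest.com", 1)


url = to_ingest_url("https://github.com/user/repo")
```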
Run GGUF models easily with a KoboldAI UI. One File. Zero Install.
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
Open-source infrastructure for Computer-Use Agents. Sandboxes, SDKs, and benchmarks to train and evaluate AI agents that can control full desktops (macOS, Linux, Windows).
Low-code framework for building custom LLMs, neural networks, and other AI models
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
Open-source context retrieval layer for AI agents
Universal memory layer for AI Agents
ClaraVerse is an open-source, privacy-focused ecosystem to replace ChatGPT, Claude, n8n, and ImageGen with your own hosted LLMs, keys, and compute. Desktop, iOS, and Android apps.
chat with private and local large language models
Fully local Manus AI. No APIs, no $200 monthly bills. An autonomous agent that thinks, browses the web, and codes for the sole cost of electricity. 🔔 Official updates only via Twitter @Martin993886460 (beware of fake accounts).
ArduPlane, ArduCopter, ArduRover, ArduSub source
An open-source AI agent that brings the power of Gemini directly into your terminal.
An LLM-powered knowledge curation system that researches a topic and generates a full-length report with citations.
A powerful GUI app and Toolkit for Claude Code - Create custom agents, manage interactive Claude Code sessions, run secure background agents, and more.
Digital Mind Extension
AI Agent that handles engineering tasks end-to-end: integrates with developers’ tools, plans, executes, and iterates until it achieves a successful result.
Glamourous agentic coding for all 💘
The open source coding agent.
Open-source simulator for autonomous driving research.
Autoware - the world's leading open-source software project for autonomous driving
An open autonomous driving platform
PX4 Autopilot Software
Open-Source Chrome extension for AI-powered web automation. Run multi-agent workflows using your own LLM API key. Alternative to OpenAI Operator.
An open-source, privacy-focused alternative to NotebookLM for teams, with no data limits. Join our Discord: https://discord.gg/ejRNvftDp9
SwarmUI (formerly StableSwarmUI), a modular Stable Diffusion web UI with an emphasis on making power tools easily accessible, high performance, and extensibility.
5ire is a cross-platform desktop AI assistant and MCP client. It is compatible with major service providers and supports local knowledge bases and tools via Model Context Protocol servers.
Open Source AI Platform - AI Chat with advanced features that works with every LLM
Toolkit for linearizing PDFs for LLM datasets/training
🐬DeepChat - A smart assistant that connects powerful AI to your personal world
Context-aware AI assistant for your desktop. Ready to respond intelligently, seamlessly integrating multiple LLMs and MCP tools.
AI Chatbots in terminal for free
Own your AI. The native macOS harness for AI agents -- any model, persistent memory, autonomous execution, cryptographic identity. Built in Swift. Fully offline. Open source.
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
