Video creation, editing, avatars, clipping, and production automation.
Give your AI agent eyes to see the entire internet. Read & search Twitter, Reddit, YouTube, GitHub, Bilibili, XiaoHongShu — one CLI, zero API fees.
AI Video
Open-source voice-AI SDK. The Vapi/Retell alternative for builders who want to own the stack. Give your AI agent a phone number in 4 lines — Python and TypeScript, MIT.
AI Video
A secure persistent personal agent server in Rust. One binary, sandboxed execution, multi-provider LLMs, voice, memory, Telegram, WhatsApp, Discord, Teams, and MCP.
Canopy Labs is developing digital humans that are indistinguishable from real ones.
AI Video
This project lets you create your own AI desktop companion with customizable characters and voice conversations that respond in just 1 second. Features include.
Open-source, desktop-grade AI agent that gets real work done — data analysis, slides, docs, video & web research. Built on OpenClaw; runs tools on your real desktop.
AI Video
🤱🏻 Turn any webpage into a desktop app with one command.
AI Video
A generative speech model for daily dialogue.
Translate full-length books and documents with Ollama, OpenAI (compatible), Gemini, Mistral, Poe or OpenRouter. Preserves formatting. Resumes where you left off. No.
Programmatic video for coding agents — HTML to video on your laptop. Turn HTML, CSS & data into real MP4s with pluggable render engines, 21 templates, AI soundtrack..
AI Video
The Open Source Alternative to Cluely - A lightning-fast, privacy-first AI assistant that works seamlessly during meetings, interviews, and conversations without.
AI Video
MOSS‑TTS Family is an open‑source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high‑fidelity, high‑expressiveness,.
Open-Source AI Camera Skills Platform, AI NVR & CCTV Surveillance. Local VLM video analysis with Qwen, DeepSeek, SmolVLM, LLaVA, YOLO26. LLM-powered agentic security.
13 Claude Code skills for video production (transcribe / translate / dub / multicam / subtitles / reframe) + WeChat publishing. Compatible with Claude Code, OpenAI.
Open-source AI design agent — alternative to Lovart AI, Runway Agent, Luma Labs Agent, Krea Agent, Pika Agent, Galileo AI, Magic Patterns. Autonomous multi-step.
Claude Code plugin + MCP server for ComfyUI — 88 tools, 14 AI skills (Flux, WAN, LTX video, Qwen), live graph editing from your Claude session. Generate images &.
AI Video
Gp.nvim (GPT prompt) Neovim AI plugin: ChatGPT sessions & Instructable text/code operations & Speech to text [OpenAI, Ollama, Anthropic,..]
AI-based productivity tools (Chat, Draw, RAG, Workflow, MCP marketplace, ASR, TTS, long-term memory, etc.)
🍦 Speech-AI-Forge is a project built around TTS generation models, implementing an API server and a Gradio-based WebUI.
The self-hosted AI workstation. Autonomous screen agents, 3-tier neural routing, parallel agent swarms, video generation, 4K/8K upscaling, RAG, voice interface, 70+.
AI Video
AI Inference Operator for Kubernetes. The easiest way to serve ML models in production. Supports VLMs, LLMs, embeddings, and speech-to-text.
AI Video
AutoClip: AI-powered video clipping and highlight generation
Free, high-quality text-to-speech API endpoint to replace OpenAI, Azure, or ElevenLabs
AI Video
Open-source AI video workspace powered by AI Agents, Nano Banana 2 & Veo 3.1 / Grok / Seedance / OpenAI
AI Video
Open-source, accurate and easy-to-use video speech recognition & clipping tool. LLM-based AI clipping integrated.
ASR/STT subtitle generator. Uses Qwen3-ASR, local LLM, Whisper, TEN-VAD. Noise-robust for JAV
Social media scheduling CLI and OpenClaw skill for AI agents posting to X, LinkedIn, Instagram, Facebook Pages, TikTok, Discord, Telegram, YouTube, Reddit, WordPress,.
AI Video
Generate audiobooks from EPUBs, PDFs and text with synchronized captions.
iOS & watchOS speech-to-text app with AI voice keyboard, on-device RAG, and chat with your notes - powered by Apple Foundation Models, WhisperKit, NVIDIA Parakeet, and.
AI Video
The media player for language learning, with dual subtitles, AI-generated subtitles, real-time translation, and more!
AI Video
GPT-4o, ASR+LLM+TTS, DeepSeek R1, openClaw, 800ms, Mac
AI Video
Infinity is a high-throughput, low-latency serving engine for text-embeddings, reranking models, clip, clap and colpali
A cross-platform video structuring (video analysis) framework. If you find it helpful, please give it a star.
AI Video
QVAC - Local AI SDK and libraries for building private, cross-platform, peer-to-peer AI applications. Run LLMs, speech-to-text, translation, and more locally on Linux,.
AI Video
Spring Boot 3.X microservices framework: Spring Boot 3.X, Spring Cloud Alibaba / Spring Cloud Tencent + React. ChatGPT (RAG, TTS, STT, LLM)
Real-time web cockpit for OpenClaw: voice conversations, agent automated kanban board, workspace/file control, sub-agent sessions, inline charts, and usage visibility.
AgentCall lets AI Agents join meetings with voice, video & screen-share to build together. Supports Google Meet, Teams, Zoom (Beta)
Production-grade Go SDK for building AI agents with long-term memory, knowledge retrieval, and voice — runnable as a library, a daemon, or a real-time pipeline.
Clip any video into a narration recap with a Claude Code skill.
OpenBiliClaw Agent: Local-first private AI discovery agent for Bilibili, Xiaohongshu, Douyi...
The open-source, self-hosted video conferencing software. Scalable, customizable, and with a powerful AI Meeting Agent.
AI Video
Voice notes for iPhone and macOS - 100% Rust, Dioxus, local-first (SQLite + LanceDB + RIG)
AI Video
HeyGen AI agent skills — avatar creation and video production via the v3 Video Agent pipeline
Production-ready code examples for Telnyx AI Communications Infrastructure — Voice AI, SMS, SIP, and IoT APIs
AI Video
The video search layer for AI agents. Search video by meaning — across speech, visuals, and on-screen text.
Iron-Man-style voice assistant + holographic HUD for Hermes Agent. Local Whisper STT, ElevenLabs voice, agent-summoned media panels, runs on your own hardware.