Best 13 multimodal AI tools in 2026
GLM, Swipeer, Gemini Omni, XChat AI, Seedance AI, Kinovi, C Dance ai, Seedance 2 Pro, Seedance, Tianpu Le, Gemini, MetaMirror, and VoxDeck are among the best paid and free tools tagged "multimodal AI".

Google's multimodal large model supporting text, image, and code tasks.

AI video generation tool supporting multimodal creative storyboard services for diverse content creation.

AI presentation tool that redefines creative presentations using multimodal AI technology.

The first multimodal music generation model launched by the Changya team, capable of creating music through multiple input methods.

Kinovi is an AI platform for generating videos and images using multimodal references, top-tier models, and a public REST API. It offers free access to start creating.

Seedance 2.0 by Seedance is a multimodal AI video generator that transforms text, images, and audio into cinematic video content for professional creators.

Seedance 2 Pro by Seedance enables creators to produce high-quality AI videos from text, images, and audio, featuring multi-shot scene control and multimodal references for cinematic results.

C Dance ai, developed by Seedance, is a versatile video generation tool supporting text, image, audio, and video inputs. It offers multimodal reference, editing, and director-level control for creativ…

ByteDance’s official platform for generating cinematic videos from text prompts using a powerful multimodal AI video engine.

An AI character platform by XChat AI for creating and chatting with virtual personas. Generate images, videos, and more using advanced models like GPT, Claude, Gemini, FLUX, Kling, and ByteDance.

Google's unified multimodal video model for creating, remixing, and editing videos with realistic motion, scene control, and advanced text rendering.

Swipeer by Swipeer AI is a productivity platform for task management, offering swipe-based navigation, advanced chat, multimodal capabilities, and seamless integration to help users unlock their potential.

Zhipu AI's GLM-5V Turbo is a multimodal vision-language model designed for complex image analysis, visual reasoning, and text generation from visual inputs.