AI Video Glossary - Terms & Definitions

A-D

AI Avatar

A digital human presenter generated by AI. Can be based on stock characters or cloned from real people. Used in training videos, marketing, and presentations.

CFG Scale (Classifier-Free Guidance)

A parameter that controls how strongly the AI follows your prompt. Higher values give a more literal interpretation; lower values leave the model more creative freedom. Very high values tend to produce oversaturated or artifact-prone output.
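
As a rough illustration, classifier-free guidance is commonly written as a blend of two noise predictions, one made without the prompt and one made with it: guided = uncond + scale * (cond - uncond). The sketch below applies that formula to plain lists of numbers. The function name `apply_cfg` and the sample values are made up for illustration; real tools apply the same idea to large tensors inside the sampler.

```python
def apply_cfg(uncond, cond, scale):
    """Blend an unconditional and a prompt-conditioned noise prediction.

    Hypothetical helper: uncond and cond are equal-length lists of floats.
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0, 2.0]   # prediction that ignores the prompt
cond = [1.0, 1.0, 0.0]     # prediction that uses the prompt

print(apply_cfg(uncond, cond, 1.0))   # [1.0, 1.0, 0.0], identical to cond
print(apply_cfg(uncond, cond, 7.5))   # [7.5, 1.0, -13.0], pushed past cond
```

A scale of 1 simply returns the prompt-conditioned prediction, while larger values exaggerate the difference between the two, which is why high settings look more literal.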

Diffusion Model

The class of generative model behind most modern video generators. It starts from random noise and removes that noise step by step until coherent images or video emerge.
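
The step-by-step idea can be shown with a deliberately simplified toy loop. This is not a real sampler: in an actual diffusion model the noise prediction comes from a trained neural network, whereas here it is an "oracle" that already knows the target, used only as a stand-in so the loop is runnable. The name `toy_denoise` and the step schedule are assumptions for illustration.

```python
import random


def toy_denoise(target, steps=50, seed=0):
    """Toy illustration of iterative denoising on a list of floats."""
    rng = random.Random(seed)
    # Start from pure Gaussian noise, one value per target element.
    x = [rng.gauss(0.0, 1.0) for _ in target]
    for step in range(steps):
        # Stand-in for the trained network: the "noise" is whatever
        # currently separates x from the target (an oracle, not a model).
        predicted_noise = [xi - ti for xi, ti in zip(x, target)]
        # Remove a growing share of the remaining noise each step,
        # and all of it on the final step.
        fraction = 1.0 / (steps - step)
        x = [xi - fraction * ni for xi, ni in zip(x, predicted_noise)]
    return x


target = [0.2, 0.8, 0.5]
result = toy_denoise(target)
print([round(v, 6) for v in result])
```

The structure is the part worth noticing: begin with noise, repeatedly estimate what is noise, and subtract a little of it each pass.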

DiT (Diffusion Transformer)

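A newer architecture that uses a transformer as the denoising network inside a diffusion model, in place of the convolutional U-Net used by earlier systems. Used in state-of-the-art models like Sora and, reportedly, newer Synthesia avatars.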

F-I

Frame Interpolation

AI technique to generate intermediate frames between existing ones. Creates smoother slow motion and increases frame rate (e.g., 24fps to 60fps).
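
To make the 24fps to 60fps example concrete, the sketch below works out where each output frame falls between two source frames and fills it with a plain cross-fade. The cross-fade is a naive baseline chosen for simplicity, not what AI interpolators do: they estimate motion rather than blending pixels. Frames are modeled as flat lists of pixel values, and both function names are hypothetical.

```python
def source_position(out_index, src_fps=24, dst_fps=60):
    """Return (source frame index, blend weight) for one output frame."""
    pos = out_index * src_fps / dst_fps
    base = int(pos)
    return base, pos - base


def blend_frames(frame_a, frame_b, t):
    """Linear cross-fade between two frames; t=0 gives frame_a, t=1 frame_b."""
    return [(1 - t) * a + t * b for a, b in zip(frame_a, frame_b)]


# Output frame 5 at 60fps lands exactly on source frame 2 at 24fps.
print(source_position(5))                              # (2, 0.0)
# Halfway between two frames, each pixel is the average of both.
print(blend_frames([0, 100], [100, 200], 0.5))         # [50.0, 150.0]
```

Cross-fading produces ghosting on fast motion, which is the main problem motion-aware AI interpolation is meant to solve.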

Image-to-Video (I2V)

Generating video from a still image. The AI animates the image, adding motion while maintaining the original appearance.

Inpainting

Filling in or replacing parts of an image/video. Used for removing objects, changing backgrounds, or fixing artifacts.
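
Inpainting is usually driven by a mask that marks which pixels may change. The sketch below shows only the final compositing idea, under the assumption that some model has already produced replacement pixels: masked positions take the generated value and everything else keeps the original. The function name `composite` and the flat-list image format are illustrative choices, not any tool's API.

```python
def composite(original, generated, mask):
    """Keep original pixels where mask is 0, use generated pixels where it is 1."""
    return [g if m else o for o, g, m in zip(original, generated, mask)]


original = [10, 20, 30, 40]
generated = [99, 98, 97, 96]   # stand-in for model output
mask = [0, 1, 1, 0]            # only the middle two pixels are replaced

print(composite(original, generated, mask))   # [10, 98, 97, 40]
```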

K-O

Keyframe

A reference point in video generation. Some tools let you set start and end keyframes, with AI generating the motion between them.

Lip Sync

Matching mouth movements to audio. AI lip sync can dub videos into new languages while making the speaker appear to speak that language.

Motion Brush

A tool (popularized by Runway) that lets you paint areas of an image to indicate where and how motion should occur.

P-T

Prompt

The text instruction you give to an AI video generator. Specific, well-structured prompts generally produce better results. A prompt often includes subject, action, style, and camera movement.
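
One way to keep prompts consistent is to assemble them from the four parts listed above. The helper below is a minimal sketch; the name `build_prompt`, the comma-separated layout, and the example wording are assumptions, and individual tools may respond better to other phrasings.

```python
def build_prompt(subject, action, style="", camera=""):
    """Join subject, action, style, and camera movement into one prompt string."""
    parts = [f"{subject} {action}".strip(), style, camera]
    # Skip any part that was left empty.
    return ", ".join(p for p in parts if p)


print(build_prompt(
    subject="a red fox",
    action="running through snow",
    style="cinematic lighting",
    camera="slow tracking shot",
))
# a red fox running through snow, cinematic lighting, slow tracking shot
```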

Temporal Consistency

How stable and coherent objects remain across video frames. Poor consistency causes "jitter" or morphing. A key quality metric.
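
A very crude way to spot jitter is to measure how much pixels change from one frame to the next. The sketch below averages the absolute per-pixel difference between consecutive frames. It is an illustrative proxy only, not an established benchmark: it cannot tell flicker apart from legitimate motion, which is why more careful evaluations compensate for motion first. The function name and the tiny two-pixel "frames" are made up for the example.

```python
def mean_frame_difference(frames):
    """Average absolute per-pixel change between consecutive frames."""
    if len(frames) < 2:
        raise ValueError("need at least two frames")
    diffs = []
    for prev, cur in zip(frames, frames[1:]):
        diffs.append(sum(abs(a - b) for a, b in zip(prev, cur)) / len(prev))
    return sum(diffs) / len(diffs)


steady = [[10, 10], [10, 10], [11, 11]]
flicker = [[10, 10], [50, 50], [10, 10]]

print(mean_frame_difference(steady))    # 0.5
print(mean_frame_difference(flicker))   # 40.0
```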

Text-to-Video (T2V)

Generating video directly from a text description. The core capability of tools like Runway, Pika, and Kling.

Turbo Mode

Faster generation at potentially lower quality. Many tools offer standard vs. turbo options to balance speed and quality.

U-Z

Upscaling

Increasing video resolution using AI. Can take 720p footage to 4K or even 8K. The added detail is synthesized by the model to look realistic; it is not recovered from the original footage.
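
A quick pixel count shows how much of an upscaled frame has to be invented. The sketch below assumes the common 16:9 sizes for each label (1280x720, 1920x1080, 3840x2160 for 4K UHD, 7680x4320 for 8K UHD); the table and function name are illustrative.

```python
# Assumed 16:9 frame sizes for each label.
RESOLUTIONS = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "4K": (3840, 2160),
    "8K": (7680, 4320),
}


def pixel_ratio(src, dst):
    """How many output pixels exist for every source pixel."""
    src_w, src_h = RESOLUTIONS[src]
    dst_w, dst_h = RESOLUTIONS[dst]
    return (dst_w * dst_h) / (src_w * src_h)


print(pixel_ratio("720p", "4K"))   # 9.0
print(pixel_ratio("720p", "8K"))   # 36.0
```

Going from 720p to 4K means 8 of every 9 output pixels have no direct counterpart in the source, so the upscaler is generating most of what you see.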

Video-to-Video (V2V)

Transforming existing video with AI. Can change style, add effects, or modify content while maintaining the original motion.

Voice Cloning

Creating a synthetic version of someone's voice from samples. Used in avatar tools to maintain consistent voice across languages.