AI Actions

AI actions hand work to a model: chat, image generation, image recognition, moderation, and speech conversion. Each one runs against a model you choose and writes its result back into the context.

Model and provider selection

Every AI action carries a model selection that names the provider and model to use for that call. The available providers and models, and how selection is resolved, are covered on AI Models & Providers. Inputs to these actions are usually value inputs (a constant, a context key, or a variable); outputs are written to a named context key.

AI Chat

Sends a prompt to a chat/completion model and stores the text reply. Use it for classification, extraction, summarization, drafting, and any natural-language reasoning step.

Field	Role
System Message	Input — optional instructions that set the model's behavior.
User Message	Input — the main prompt sent to the model.
Output Text	Output — context key that receives the model's reply.

AI Image Generation

Generates an image from a text description and stores the result for later use.

Field	Role
Prompt	Input — a description of the image to generate.
Output	Output — context key that receives the generated image.

AI Image Recognition

Asks a vision-capable model a question about one or more images and returns a text answer.

Field	Role
Images	Input — one or more images to analyze.
Prompt	Input — what to ask the model about the image(s).
Output Text	Output — context key that receives the answer.

AI Moderation

Evaluates text against content policy and reports whether it should be flagged. Use it to gate user-supplied content before acting on it.

Field	Role
Text	Input — the text to moderate.
Flagged Output	Output — context key that receives the flagged result.

Speech-to-Text

Transcribes audio into text.

Field	Role
Audio	Input — context key holding the audio to transcribe.
Prompt	Input — optional hint to guide transcription.
Output	Output — context key that receives the transcript.

Text-to-Speech

Synthesizes spoken audio from text.

Field	Role
Text	Input — the text to speak.
Speech Output	Output — context key that receives the generated audio.

Chaining AI steps Because outputs land in context keys, you can pipe one AI action into another — for example, transcribe audio with Speech-to-Text, then feed the transcript into AI Chat for summarization.