Using GPT-5
Learn best practices, features, and migration guidance for GPT-5.
GPT-5 is our most intelligent model yet, trained to be especially proficient in:
- Code generation, bug fixing, and refactoring
- Instruction following
- Long context and tool calling
This guide covers key features of the GPT-5 model family and how to get the most out of GPT-5.
Explore coding examples
Click through a few demo applications generated entirely with a single GPT-5 prompt, without writing any code by hand.
Quickstart
Faster responses
By default, GPT-5 produces a medium length chain of thought before responding to a prompt. For faster, lower-latency responses, use low reasoning effort and low text verbosity.
This behavior will more closely (but not exactly!) match non-reasoning models like GPT-4.1. We expect GPT-5 to produce more intelligent responses than GPT-4.1, but when speed and maximum context length are paramount, you might consider using GPT-4.1 instead.
Fast, low latency response options
import OpenAI from "openai";
const openai = new OpenAI();

const result = await openai.responses.create({
  model: "gpt-5",
  input: "Write a haiku about code.",
  reasoning: { effort: "low" },
  text: { verbosity: "low" },
});

console.log(result.output_text);
from openai import OpenAI

client = OpenAI()

result = client.responses.create(
    model="gpt-5",
    input="Write a haiku about code.",
    reasoning={"effort": "low"},
    text={"verbosity": "low"},
)

print(result.output_text)
curl https://api.openai.com/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-5",
    "input": "Write a haiku about code.",
    "reasoning": { "effort": "low" },
    "text": { "verbosity": "low" }
  }'
Coding and agentic tasks
GPT-5 excels at reasoning through complex problems. For demanding tasks like coding and multi-step planning, use high reasoning effort.
Use this configuration for the kinds of tasks you might previously have used o3 to tackle. We expect GPT-5 to produce better results than o3 and o4-mini under most circumstances.
Slower, high reasoning tasks
import OpenAI from "openai";
const openai = new OpenAI();

const result = await openai.responses.create({
  model: "gpt-5",
  input: "Find the null pointer exception: ...your code here...",
  reasoning: { effort: "high" },
});

console.log(result.output_text);
from openai import OpenAI

client = OpenAI()

result = client.responses.create(
    model="gpt-5",
    input="Find the null pointer exception: ...your code here...",
    reasoning={"effort": "high"},
)

print(result.output_text)
curl https://api.openai.com/v1/responses \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-5",
    "input": "Find the null pointer exception: ...your code here...",
    "reasoning": { "effort": "high" }
  }'
Meet the models
There are three models in the GPT-5 series. In general, gpt-5 is best for your most complex tasks that require broad world knowledge. The smaller mini and nano models trade off some general world knowledge for lower cost and lower latency, and tend to perform best on well-defined tasks.
To help you pick the model that best fits your use case, consider these tradeoffs:
| Variant | Best for |
|---|---|
| gpt-5 | Complex reasoning, broad world knowledge, and code-heavy or multi-step agentic tasks |
| gpt-5-mini | Cost-optimized reasoning and chat; balances speed, cost, and capability |
| gpt-5-nano | High-throughput tasks, especially simple instruction-following or classification |
Model name reference
The GPT-5 system card uses different names than the API. Use this table to map between them:
| System card name | API alias |
|---|---|
| gpt-5-thinking | gpt-5 |
| gpt-5-thinking-mini | gpt-5-mini |
| gpt-5-thinking-nano | gpt-5-nano |
| gpt-5-main | gpt-5-chat-latest |
| gpt-5-main-mini | [not available via API] |
New API features in GPT-5
Alongside GPT-5, we're introducing a few new parameters and API features designed to give developers more control and flexibility: the ability to control verbosity, a minimal reasoning effort option, custom tools, and an allowed tools list. The sections below walk through each of these features and how to get the most out of them.
Minimal reasoning effort
The reasoning.effort parameter controls how many reasoning tokens the model generates before producing a response. Earlier reasoning models like o3 supported only low, medium, and high: low favored speed and fewer tokens, while high favored more thorough reasoning.
The new minimal setting produces very few reasoning tokens, for cases where you need the fastest possible time-to-first-token. We often see better performance when the model can produce at least a few reasoning tokens rather than none at all. The default effort is medium.
The minimal setting performs especially well in coding and instruction following scenarios, adhering closely to given directions. However, it may require prompting to act more proactively. To improve the model's reasoning quality, even at minimal effort, encourage it to “think” or outline its steps before answering.
Minimal reasoning effort
curl --request POST --url https://api.openai.com/v1/responses --header "Authorization: Bearer $OPENAI_API_KEY" --header 'Content-type: application/json' --data '{
  "model": "gpt-5",
  "input": "How much gold would it take to coat the Statue of Liberty in a 1mm layer?",
  "reasoning": {
    "effort": "minimal"
  }
}'
import OpenAI from "openai";
const openai = new OpenAI();

const response = await openai.responses.create({
  model: "gpt-5",
  input: "How much gold would it take to coat the Statue of Liberty in a 1mm layer?",
  reasoning: {
    effort: "minimal"
  }
});

console.log(response);
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="How much gold would it take to coat the Statue of Liberty in a 1mm layer?",
    reasoning={
        "effort": "minimal"
    }
)

print(response)
Verbosity
Verbosity determines how many output tokens are generated. Lowering the number of tokens reduces overall latency. While the model's reasoning approach stays mostly the same, the model finds ways to answer more concisely—which can either improve or diminish answer quality, depending on your use case. Here are some scenarios for both ends of the verbosity spectrum:
- High verbosity: Use when you need the model to provide thorough explanations of documents or perform extensive code refactoring.
- Low verbosity: Best for situations where you want concise answers or simple code generation, such as SQL queries.
Models before GPT-5 have used medium verbosity by default. With GPT-5, we make this option configurable as one of high, medium, or low.
When generating code, medium and high verbosity levels yield longer, more structured code with inline explanations, while low verbosity produces shorter, more concise code with minimal commentary.
Control verbosity
curl --request POST --url https://api.openai.com/v1/responses --header "Authorization: Bearer $OPENAI_API_KEY" --header 'Content-type: application/json' --data '{
  "model": "gpt-5",
  "input": "What is the answer to the ultimate question of life, the universe, and everything?",
  "text": {
    "verbosity": "low"
  }
}'
import OpenAI from "openai";
const openai = new OpenAI();

const response = await openai.responses.create({
  model: "gpt-5",
  input: "What is the answer to the ultimate question of life, the universe, and everything?",
  text: {
    verbosity: "low"
  }
});

console.log(response);
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="What is the answer to the ultimate question of life, the universe, and everything?",
    text={
        "verbosity": "low"
    }
)

print(response)
You can still steer verbosity through prompting after setting it to low in the API. The verbosity parameter defines a general token range at the system prompt level, but the actual output is flexible to both developer and user prompts within that range.
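For example, here is a quick sketch of pairing the low verbosity setting with a prompt that nudges the model toward slightly more detail within that range (the instructions and prompt text are illustrative):

```python
from openai import OpenAI

client = OpenAI()

# Verbosity is capped at "low" by the API parameter, but the prompt can
# still steer the model toward a bit more or less detail within that range.
response = client.responses.create(
    model="gpt-5",
    instructions="Answer concisely, but include one short example.",
    input="Explain what a Python list comprehension is.",
    text={"verbosity": "low"},
)
print(response.output_text)
```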
Custom tools
With GPT-5, we're introducing a new capability called custom tools, which lets models send any raw text as tool call input but still constrain outputs if desired.
[Function calling guide](/docs/guides/function-calling): Learn about custom tools in the function calling guide.
Freeform inputs
Define your tool with type: custom to enable models to send plaintext inputs directly to your tools, rather than being limited to structured JSON. The model can send any raw text—code, SQL queries, shell commands, configuration files, or long-form prose—directly to your tool.
{
  "type": "custom",
  "name": "code_exec",
  "description": "Executes arbitrary python code"
}
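As a rough sketch of how this definition might be used end to end, the request below passes the tool and reads back the model's freeform input. The custom_tool_call item type and its fields follow the function calling guide, but treat the exact shapes as assumptions to verify against that guide:

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="Use the code_exec tool to print the first ten prime numbers.",
    tools=[
        {
            "type": "custom",
            "name": "code_exec",
            "description": "Executes arbitrary python code",
        }
    ],
)

# Custom tool calls carry raw text (here, Python source) instead of JSON arguments.
for item in response.output:
    if item.type == "custom_tool_call":
        print(item.name)   # "code_exec"
        print(item.input)  # the freeform text the model wants to run
```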
Constraining outputs
GPT-5 supports context-free grammars (CFGs) for custom tools, letting you provide a Lark grammar to constrain outputs to a specific syntax or DSL. Attaching a CFG (e.g., a SQL or DSL grammar) ensures the assistant's text matches your grammar.
This enables precise, constrained tool calls or structured responses and lets you enforce strict syntactic or domain-specific formats directly in GPT-5's function calling, improving control and reliability for complex or constrained domains.
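Below is a minimal sketch of what attaching a grammar might look like. The format block (its type, syntax, and definition fields) is an assumption to confirm against the function calling guide, and the grammar itself is a toy read-only SQL subset:

```python
# Toy Lark grammar for a restricted SELECT statement (illustrative only).
sql_grammar = r"""
start: "SELECT " column " FROM " table (" WHERE " condition)? ";"
column: NAME
table: NAME
condition: NAME " = " VALUE
NAME: /[a-z_]+/
VALUE: /'[^']*'/
"""

# Custom tool definition with the grammar attached, so the model's
# tool input must parse under the grammar above.
sql_tool = {
    "type": "custom",
    "name": "run_sql",
    "description": "Runs a read-only SQL query against the analytics database.",
    "format": {
        "type": "grammar",
        "syntax": "lark",
        "definition": sql_grammar,
    },
}
```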
Best practices for custom tools
- Write concise, explicit tool descriptions. The model chooses what to send based on your description; state clearly if you want it to always call the tool.
- Validate outputs on the server side. Freeform strings are powerful but require safeguards against injection or unsafe commands, as in the sketch below.
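For instance, a purely illustrative guard for a SQL execution tool that rejects anything other than a single SELECT statement before running it:

```python
import re

SELECT_ONLY = re.compile(r"^\s*SELECT\b", re.IGNORECASE)

def run_sql_tool(raw_query: str) -> str:
    """Server-side guard for freeform tool input: allow a single read-only SELECT."""
    query = raw_query.strip().rstrip(";")
    if not SELECT_ONLY.match(query) or ";" in query:
        raise ValueError("Rejected tool input: only a single SELECT statement is allowed.")
    # ...execute against a read-only connection and return the rows...
    return "ok"
```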
Allowed tools
The allowed_tools parameter under tool_choice lets you pass N tool definitions but restrict the model to only M (< N) of them. List your full toolkit in tools, and then use an allowed_tools block to name the subset and specify a mode—either auto (the model may pick any of those) or required (the model must invoke one).
[Function calling guide](/docs/guides/function-calling): Learn about the allowed tools option in the function calling guide.
By separating all possible tools from the subset that can be used now, you gain greater safety, predictability, and improved prompt caching. You also avoid brittle prompt engineering, such as hard-coded call order. GPT-5 dynamically invokes or requires specific functions mid-conversation while reducing the risk of unintended tool usage over long contexts.
| | Standard Tools | Allowed Tools |
|---|---|---|
| Model's universe | All tools listed under "tools": […] | Only the subset under "tools": […] in tool_choice |
| Tool invocation | Model may or may not call any tool | Model restricted to (or required to call) chosen tools |
| Purpose | Declare available capabilities | Constrain which capabilities are actually used |
"tool_choice": {
"type": "allowed_tools",
"mode": "auto",
"tools": [
{ "type": "function", "name": "get_weather" },
{ "type": "mcp", "server_label": "deepwiki" },
{ "type": "image_generation" }
]
}
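Putting it together, here is a sketch of a full request that declares two tools but only allows one of them on this turn (the get_weather function and its schema are illustrative placeholders):

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    input="What's the weather like in Paris today?",
    # The full toolkit the model knows about.
    tools=[
        {
            "type": "function",
            "name": "get_weather",
            "description": "Gets the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
        {"type": "image_generation"},
    ],
    # The subset the model may actually use on this turn.
    tool_choice={
        "type": "allowed_tools",
        "mode": "auto",
        "tools": [{"type": "function", "name": "get_weather"}],
    },
)
print(response.output)
```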
For a more detailed overview of all of these new features, see the accompanying cookbook.
Preambles
Preambles are brief, user-visible explanations that GPT-5 generates before invoking any tool or function, outlining its intent or plan (e.g., “why I'm calling this tool”). They appear after the chain-of-thought and before the actual tool call, providing transparency into the model's reasoning and enhancing debuggability, user confidence, and fine-grained steerability.
By letting GPT-5 “think out loud” before each tool call, preambles boost tool-calling accuracy (and overall task success) without bloating reasoning overhead. To enable preambles, add a system or developer instruction—for example: “Before you call a tool, explain why you are calling it.” GPT-5 prepends a concise rationale to each specified tool call. The model may also output multiple messages between tool calls, which can enhance the interaction experience—particularly for minimal reasoning or latency-sensitive use cases.
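As a sketch, a developer instruction like the one below is enough to elicit preambles before tool calls (the weather function is a placeholder):

```python
from openai import OpenAI

client = OpenAI()

response = client.responses.create(
    model="gpt-5",
    # This instruction is what turns preambles on.
    instructions="Before you call a tool, explain why you are calling it.",
    input="What's the weather in Tokyo right now?",
    tools=[
        {
            "type": "function",
            "name": "get_weather",
            "description": "Gets the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        }
    ],
)

# Expect a short visible message (the preamble) followed by the function call item.
for item in response.output:
    print(item.type)
```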
For more on using preambles, see the GPT-5 prompting cookbook.
Migration guidance
GPT-5 is our best model yet, and it works best with the Responses API, which supports passing chain of thought (CoT) between turns. Read below to migrate from your current model or API.
Migrating from other models to GPT-5
We see improved intelligence because the Responses API can pass the previous turn's CoT to the model. This leads to fewer generated reasoning tokens, higher cache hit rates, and lower latency. To learn more, see our in-depth guide on the benefits of the Responses API.
When migrating to GPT-5 from an older OpenAI model, start by experimenting with reasoning levels and prompting strategies. Based on our testing, we recommend using our prompt optimizer—which automatically updates your prompts for GPT-5 based on our best practices—and following this model-specific guidance:
- o3: `gpt-5` with `medium` or `high` reasoning is a great replacement. Start with `medium` reasoning and prompt tuning, then increase to `high` if you aren't getting the results you want.
- gpt-4.1: `gpt-5` with `minimal` or `low` reasoning is a strong alternative. Start with `minimal` and tune your prompts; increase to `low` if you need better performance.
- o4-mini or gpt-4.1-mini: `gpt-5-mini` with prompt tuning is a great replacement.
- gpt-4.1-nano: `gpt-5-nano` with prompt tuning is a great replacement.
Migrating from Chat Completions to Responses API
The biggest difference, and main reason to migrate from Chat Completions to the Responses API for GPT-5, is support for passing chain of thought (CoT) between turns. See a full comparison of the APIs.
Passing CoT exists only in the Responses API, and we've seen improved intelligence, fewer generated reasoning tokens, higher cache hit rates, and lower latency as a result of doing so. Most other parameters remain at parity, though the formatting is different. Here's how new parameters are handled differently between Chat Completions and the Responses API:
Reasoning effort
Responses API
Generate response with minimal reasoning
curl --request POST \
  --url https://api.openai.com/v1/responses \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  --header 'Content-type: application/json' \
  --data '{
    "model": "gpt-5",
    "input": "How much gold would it take to coat the Statue of Liberty in a 1mm layer?",
    "reasoning": {
      "effort": "minimal"
    }
  }'
Chat Completions
Generate response with minimal reasoning
curl --request POST \
  --url https://api.openai.com/v1/chat/completions \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  --header 'Content-type: application/json' \
  --data '{
    "model": "gpt-5",
    "messages": [
      {
        "role": "user",
        "content": "How much gold would it take to coat the Statue of Liberty in a 1mm layer?"
      }
    ],
    "reasoning_effort": "minimal"
  }'
Verbosity
Responses API
Control verbosity
curl --request POST \
  --url https://api.openai.com/v1/responses \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  --header 'Content-type: application/json' \
  --data '{
    "model": "gpt-5",
    "input": "What is the answer to the ultimate question of life, the universe, and everything?",
    "text": {
      "verbosity": "low"
    }
  }'
Chat Completions
Control verbosity
curl --request POST \
  --url https://api.openai.com/v1/chat/completions \
  --header "Authorization: Bearer $OPENAI_API_KEY" \
  --header 'Content-type: application/json' \
  --data '{
    "model": "gpt-5",
    "messages": [
      { "role": "user", "content": "What is the answer to the ultimate question of life, the universe, and everything?" }
    ],
    "verbosity": "low"
  }'
Custom tools
Responses API
Custom tool call
curl --request POST --url https://api.openai.com/v1/responses --header "Authorization: Bearer $OPENAI_API_KEY" --header 'Content-type: application/json' --data '{
  "model": "gpt-5",
  "input": "Use the code_exec tool to calculate the area of a circle with radius equal to the number of r letters in blueberry",
  "tools": [
    {
      "type": "custom",
      "name": "code_exec",
      "description": "Executes arbitrary python code"
    }
  ]
}'
Chat Completions
Custom tool call
curl --request POST --url https://api.openai.com/v1/chat/completions --header "Authorization: Bearer $OPENAI_API_KEY" --header 'Content-type: application/json' --data '{
  "model": "gpt-5",
  "messages": [
    { "role": "user", "content": "Use the code_exec tool to calculate the area of a circle with radius equal to the number of r letters in blueberry" }
  ],
  "tools": [
    {
      "type": "custom",
      "custom": {
        "name": "code_exec",
        "description": "Executes arbitrary python code"
      }
    }
  ]
}'
Prompting guidance
We specifically designed GPT-5 to excel at coding, frontend engineering, and tool-calling for agentic tasks. We also recommend iterating on prompts for GPT-5 using the prompt optimizer.
- [GPT-5 prompt optimizer](/chat/edit?optimize=true): Craft the perfect prompt for GPT-5 in the dashboard.
- [GPT-5 prompting guide](https://cookbook.openai.com/examples/gpt-5/gpt-5_prompting_guide): Learn full best practices for prompting GPT-5 models.
- [Frontend prompting for GPT-5](https://cookbook.openai.com/examples/gpt-5/gpt-5_frontend): See prompt samples specific to frontend development.
GPT-5 is a reasoning model
Reasoning models like GPT-5 break problems down step by step, producing an internal chain of thought that encodes their reasoning. To maximize performance, pass these reasoning items back to the model: this avoids re-reasoning and keeps interactions closer to the model's training distribution. In multi-turn conversations, passing a previous_response_id automatically makes earlier reasoning items available. This is especially important when using tools—for example, when a function call requires an extra round trip. In these cases, either include them with previous_response_id or add them directly to input.
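Here is a minimal sketch of carrying reasoning items across turns with previous_response_id (the prompts are illustrative):

```python
from openai import OpenAI

client = OpenAI()

first = client.responses.create(
    model="gpt-5",
    input="Outline a plan to migrate our API client from gpt-4.1 to gpt-5.",
)

# Passing the previous response ID makes the earlier turn's reasoning items
# available to the model, so it doesn't have to re-reason from scratch.
second = client.responses.create(
    model="gpt-5",
    previous_response_id=first.id,
    input="Expand step 2 into concrete engineering tasks.",
)
print(second.output_text)
```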
Learn more about reasoning models and how to get the most out of them in our reasoning guide.
Further reading
Comparison of Responses API vs. Chat Completions
FAQ
How are these models integrated into ChatGPT?
In ChatGPT, there are two models: `gpt-5-chat` and `gpt-5-thinking`. They offer reasoning and minimal-reasoning capabilities, with a routing layer that selects the best model based on the user's question. Users can also invoke reasoning directly through the ChatGPT UI.
Will these models be supported in Codex?
Yes, `gpt-5` will be available in Codex and Codex CLI.
What is the deprecation plan for previous models?
Any model deprecations will be posted on our deprecations page. We'll send advance notice of any model deprecations.