openai-chat-completions
// Create chat completions with messages array, role management, response_format, sampling controls, and cross-provider model names.
| name | openai-chat-completions |
| description | Create chat completions with messages array, role management, response_format, sampling controls, and cross-provider model names. |
| tech_stack | ["openai"] |
| language | ["python"] |
| capability | ["llm-client","prompt-engineering"] |
| version | openai-python unversioned |
| collected_at | "2026-01-12T00:00:00.000Z" |
Source: https://developers.openai.com/api/reference/python, https://deepwiki.com/openai/openai-python/4.1.1-parameters-and-configuration, https://github.com/openai/openai-python/blob/main/README.md?plain=1
Call client.chat.completions.create() to generate text from OpenAI and OpenAI-compatible models. The API accepts an array of role-tagged messages and returns one or more completion choices with configurable sampling, formatting, and reasoning behavior.
- messages, re-sending the full history each turn
- response_format
- reasoning_effort
- logprobs

from openai import OpenAI

client = OpenAI()
completion = client.chat.completions.create(
    model="gpt-5.2",
    messages=[
        {"role": "developer", "content": "Talk like a pirate."},
        {"role": "user", "content": "How do I check if a Python object is an instance of a class?"},
    ],
)
print(completion.choices[0].message.content)
messages = [
    {"role": "system", "content": "You are a helpful math tutor."},  # System instruction
    {"role": "user", "content": "What is the derivative of x^2?"},   # User turn
    {"role": "assistant", "content": "The derivative is 2x."},       # Model response (history)
    {"role": "user", "content": "Now the second derivative."},       # Next user turn
]
| Role | Purpose |
|---|---|
| developer | Highest-priority instructions (preferred for gpt-5.x; replaces system) |
| system | System-level instructions (legacy, still widely supported) |
| user | User input / conversation turn |
| assistant | Model response for multi-turn context |
| tool | Function call result in tool-calling loops |
The content field can be a plain string or an array of content parts (text, image_url for vision, etc.).
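As a minimal sketch of the array form, the helper below builds a user message that mixes a text part with an image_url part. The helper name `build_vision_message` and the example URL are illustrative assumptions, not part of the SDK; the code only assembles dictionaries and makes no API call.

```python
from typing import Any


def build_vision_message(text: str, image_url: str) -> dict[str, Any]:
    """Return a user message whose content is a list of content parts.

    Hypothetical helper: it builds plain dictionaries only.
    """
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


message = build_vision_message(
    "What is in this image?",
    "https://example.com/cat.png",  # placeholder URL, not a real asset
)
print([part["type"] for part in message["content"]])
```

The resulting dictionary can be appended to a messages list in the same way as a plain-string message.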
messages = [{"role": "system", "content": "You are a helpful assistant."}]
while True:
    user_input = input("You: ")
    messages.append({"role": "user", "content": user_input})
    response = client.chat.completions.create(model="gpt-4o", messages=messages)
    reply = response.choices[0].message
    messages.append({"role": "assistant", "content": reply.content})
    print(f"Assistant: {reply.content}")
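Because the full history is re-sent on every turn, each request grows with the conversation. One possible way to bound it, sketched here, is to keep the leading instruction messages plus the most recent turns. `trim_history` and `max_messages` are hypothetical names, and counting messages rather than tokens is a simplifying assumption.

```python
def trim_history(messages: list[dict], max_messages: int = 6) -> list[dict]:
    """Keep leading system/developer messages plus the last `max_messages` others."""
    head = []
    for message in messages:
        if message["role"] not in ("system", "developer"):
            break
        head.append(message)
    tail = messages[len(head):]
    # Guard against tail[-0:], which would return the whole list.
    recent = tail[-max_messages:] if max_messages > 0 else []
    return head + recent


history = [
    {"role": "system", "content": "You are a helpful math tutor."},
    {"role": "user", "content": "What is the derivative of x^2?"},
    {"role": "assistant", "content": "The derivative is 2x."},
    {"role": "user", "content": "Now the second derivative."},
    {"role": "assistant", "content": "The second derivative is 2."},
    {"role": "user", "content": "Now the third derivative."},
]
trimmed = trim_history(history, max_messages=3)
print([m["role"] for m in trimmed])
```

With an even `max_messages` the trimmed window may begin with an assistant message, so an odd value is one way to keep it starting on a user turn.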
# JSON object (legacy; requires the word "JSON" somewhere in messages)
client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Return JSON with name, age, city"}],
    response_format={"type": "json_object"},
)
# JSON Schema (Structured Outputs)
client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "List 3 popular Python libraries"}],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "library_list",
            "schema": {
                "type": "object",
                "properties": {
                    "libraries": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "name": {"type": "string"},
                                "purpose": {"type": "string"},
                            },
                            "required": ["name", "purpose"],
                            "additionalProperties": False,
                        },
                    }
                },
                "required": ["libraries"],
                "additionalProperties": False,
            },
            "strict": True,
        },
    },
)
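With either response format, the reply still arrives as a JSON string in the message content, so it has to be decoded before use. The sketch below decodes a hand-written sample shaped like the library_list schema. `sample_content` is made-up data standing in for `response.choices[0].message.content`, not real model output, and `parse_library_list` is a hypothetical helper.

```python
import json

# Made-up payload shaped like the library_list schema (an assumption for illustration).
sample_content = (
    '{"libraries": ['
    '{"name": "requests", "purpose": "HTTP client"}, '
    '{"name": "numpy", "purpose": "Numerical arrays"}'
    "]}"
)


def parse_library_list(content: str) -> list[tuple[str, str]]:
    """Decode the JSON string and return (name, purpose) pairs."""
    data = json.loads(content)
    return [(item["name"], item["purpose"]) for item in data["libraries"]]


libraries = parse_library_list(sample_content)
for name, purpose in libraries:
    print(f"{name}: {purpose}")
```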
# Creative generation
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Write a poem"}],
    temperature=1.5,        # 0-2, higher = more random
    # top_p=0.9,            # nucleus sampling threshold; an alternative to temperature, set one or the other
    frequency_penalty=0.5,  # -2.0 to 2.0, positive values penalize tokens by how often they have appeared
    presence_penalty=0.3,   # -2.0 to 2.0, positive values penalize any token that has already appeared
)
# Deterministic (best-effort)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Generate a random color name"}],
    seed=42,  # same seed + same system_fingerprint ≈ same output
    temperature=0,
)
# Constrained with stop sequences
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "List fruits: 1."}],
    stop=["\n", "4."],  # up to 4 stop sequences
    max_tokens=50,
)
Critical: adjust temperature or top_p, not both at once. OpenAI recommends altering only one of the two.
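A small pre-flight check can catch out-of-range sampling values before a request is sent. The sketch below is a hypothetical helper, not part of the SDK. The temperature and penalty ranges come from the comments in the examples above, while the 0 to 1 bound on top_p is an assumption.

```python
from typing import Optional


def check_sampling(
    temperature: Optional[float] = None,
    top_p: Optional[float] = None,
    frequency_penalty: Optional[float] = None,
    presence_penalty: Optional[float] = None,
) -> list[str]:
    """Return a list of problems found in the sampling settings (empty if none)."""
    problems = []
    if temperature is not None and not 0 <= temperature <= 2:
        problems.append("temperature must be between 0 and 2")
    if top_p is not None and not 0 <= top_p <= 1:  # assumed bound
        problems.append("top_p must be between 0 and 1")
    for name, value in (
        ("frequency_penalty", frequency_penalty),
        ("presence_penalty", presence_penalty),
    ):
        if value is not None and not -2.0 <= value <= 2.0:
            problems.append(f"{name} must be between -2.0 and 2.0")
    if temperature is not None and top_p is not None:
        problems.append("set temperature or top_p, not both")
    return problems


print(check_sampling(temperature=1.5, top_p=0.9))
```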
# For standard models (GPT-4o, etc.)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],
    max_tokens=500,
)
# For o-series reasoning models: MUST use max_completion_tokens
response = client.chat.completions.create(
    model="o3",
    messages=[...],
    max_completion_tokens=2000,  # NOT max_tokens, which is incompatible with o-series
)
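When the model name is only known at runtime, the token-limit keyword can be chosen programmatically. The helper below is a hypothetical sketch. Detecting o-series models by a name that starts with "o" followed by a digit is a naming assumption and not an official rule.

```python
def token_limit_kwargs(model: str, limit: int) -> dict[str, int]:
    """Pick the token-limit keyword argument for a given model name."""
    # Assumption: o-series names look like "o1", "o3", "o4-mini".
    is_o_series = len(model) > 1 and model[0] == "o" and model[1].isdigit()
    key = "max_completion_tokens" if is_o_series else "max_tokens"
    return {key: limit}


print(token_limit_kwargs("o3", 2000))
print(token_limit_kwargs("gpt-4o", 500))
```

The returned dictionary can be unpacked into the call, for example `client.chat.completions.create(model=model, messages=messages, **token_limit_kwargs(model, 500))`.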
response = client.chat.completions.create(
    model="gpt-5.2",
    messages=[{"role": "user", "content": "Solve this complex puzzle..."}],
    reasoning_effort="high",  # "none"|"minimal"|"low"|"medium"|"high"|"xhigh"
)
Supported values depend on the model. Lower effort returns faster but reasons less thoroughly. gpt-5-pro supports only "high".
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Say hello world"}],
    logprobs=True,
    top_logprobs=5,  # 0–20, requires logprobs=True
)
# response.choices[0].logprobs.content[...].top_logprobs
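Log probabilities are natural logarithms, so exponentiating them recovers ordinary probabilities. The sketch below works on a made-up stand-in for one entry of `response.choices[0].logprobs.content`. The `SimpleNamespace` objects and their values are illustrative assumptions that mimic the `token`, `logprob`, and `top_logprobs` attributes, not real API output.

```python
import math
from types import SimpleNamespace

# Made-up stand-in for a single token entry with its top alternatives.
token_entry = SimpleNamespace(
    token="Hello",
    logprob=-0.05,
    top_logprobs=[
        SimpleNamespace(token="Hello", logprob=-0.05),
        SimpleNamespace(token="Hi", logprob=-3.2),
    ],
)


def to_probabilities(entry) -> dict[str, float]:
    """Map each alternative token to exp(logprob)."""
    return {alt.token: math.exp(alt.logprob) for alt in entry.top_logprobs}


probs = to_probabilities(token_entry)
for token, p in probs.items():
    print(f"{token!r}: {p:.3f}")
```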
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Suggest a startup idea"}],
    n=3,
)
for i, choice in enumerate(response.choices):
    print(f"Choice {i}: {choice.message.content}")
from openai import OpenAI

client = OpenAI(
    api_key="sk-your-deepseek-key",
    base_url="https://api.deepseek.com",
)
response = client.chat.completions.create(
    model="deepseek-v4-pro",  # or "deepseek-v4-flash"
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello!"},
    ],
    extra_body={"thinking": {"type": "enabled"}},  # provider-specific params
)
Provider model names: DeepSeek (deepseek-v4-flash, deepseek-v4-pro), Qwen (qwen-turbo, qwen-plus, qwen-max), Zhipu (glm-4-plus, glm-4-flash).
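When several providers are in play, one option is to keep their connection settings in a small lookup table. The sketch below is a hypothetical arrangement: `PROVIDERS` and `client_kwargs` are illustrative names, and only the DeepSeek base URL is taken from the example above. Other providers would need their own base URLs from their documentation.

```python
from typing import Optional

# Hypothetical registry; a base_url of None means the SDK default endpoint.
PROVIDERS: dict[str, Optional[str]] = {
    "openai": None,
    "deepseek": "https://api.deepseek.com",
}


def client_kwargs(provider: str, api_key: str) -> dict[str, str]:
    """Build keyword arguments intended for OpenAI(**kwargs)."""
    if provider not in PROVIDERS:
        raise ValueError(f"unknown provider: {provider}")
    kwargs = {"api_key": api_key}
    base_url = PROVIDERS[provider]
    if base_url is not None:
        kwargs["base_url"] = base_url
    return kwargs


print(client_kwargs("deepseek", "sk-your-deepseek-key"))
```

A client would then be created with `OpenAI(**client_kwargs("deepseek", api_key))`, leaving the request code unchanged apart from the model name.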
- max_tokens breaks on o-series models. Always use max_completion_tokens for o3, o4-mini, and similar reasoning models.
- developer > system for gpt-5.x. Newer models prefer the developer role for highest-priority instructions. Legacy system still works but may be less effective.
- functions / function_call are deprecated. Migrate to tools / tool_choice. The old parameters are still accepted but subject to removal.
- The user parameter is deprecated. Use safety_identifier for abuse detection and prompt_cache_key for prompt caching.
- response_format: json_object needs "JSON" in your prompt. If the messages don't mention JSON, the API can reject the request or the model can return unusable output.
- seed is best-effort, not guaranteed. The system_fingerprint in the response tracks backend state. Same seed + same fingerprint ≈ reproducible output, but backend changes can break reproducibility.
- Provider-specific parameters go in extra_body. DeepSeek thinking, Qwen enable_search, etc. are not in the OpenAI schema. The SDK passes them through as-is.
- store=True silently drops images larger than 8MB from stored completions.
- Don't pass a Pydantic model to create() through response_format. For structured outputs with Pydantic, use client.chat.completions.parse() instead.

See also:

- openai-async-client for async chat (AsyncOpenAI().chat.completions.create(...)).
- For streaming, set stream=True and use openai-streaming patterns.
- For function calling, use tools and tool_choice; see openai-tool-calling.
- For Pydantic-based parsing, see openai-structured-outputs (the parse() method).
- For images, use image_url content blocks; see openai-vision.
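The migration from functions / function_call to tools / tool_choice is largely a mechanical re-wrapping of the same definitions. The sketch below shows one way to convert existing values. The helper names and the `get_weather` definition are hypothetical, and the code only reshapes dictionaries.

```python
from typing import Any, Union


def functions_to_tools(functions: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Wrap each legacy function definition in a tools entry."""
    return [{"type": "function", "function": fn} for fn in functions]


def function_call_to_tool_choice(function_call: Union[str, dict]) -> Union[str, dict]:
    """Translate a legacy function_call value into a tool_choice value."""
    if isinstance(function_call, str):  # "auto" or "none" carry over unchanged
        return function_call
    return {"type": "function", "function": {"name": function_call["name"]}}


legacy_functions = [
    {
        "name": "get_weather",  # hypothetical function for illustration
        "description": "Look up the weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]
tools = functions_to_tools(legacy_functions)
tool_choice = function_call_to_tool_choice({"name": "get_weather"})
print(tools[0]["type"], tool_choice)
```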