Published: August 6, 2025

Try it Out!

  • Try it in the AI Studio playground (120B model)
  • Try it in the AI Studio playground (20B model)
  • Try with the API: gpt_oss_1.ipynb
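If you just want a feel for the API before opening the notebook, a call is shaped like a standard OpenAI-compatible chat completion. The model name, the `reasoning_effort` field, and the message layout below are illustrative assumptions; consult the linked notebook (gpt_oss_1.ipynb) for the exact request shape your provider expects.

```python
import json

# Hypothetical OpenAI-compatible chat payload for gpt-oss. The model name
# and the "reasoning_effort" parameter are assumptions for illustration;
# check gpt_oss_1.ipynb for the real endpoint and field names.
payload = {
    "model": "gpt-oss-120b",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Generate an SVG of a pelican riding a bicycle"},
    ],
    # gpt-oss exposes adjustable reasoning effort (low / medium / high)
    "reasoning_effort": "medium",
}
print(json.dumps(payload, indent=2))
```

POSTing this JSON to your provider's chat-completions endpoint (with your API key in the headers) returns the model's reply in the usual `choices[0].message.content` slot.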

TL;DR

GPT-OSS is a hugely anticipated open-weights release from OpenAI, designed for powerful reasoning, agentic tasks, and versatile developer use cases.
  • Released: Aug 2025
  • Two models:
    • GPT‑OSS‑120B: 36 layers, ~117B total parameters, but only ~5.1B activated per token. It uses 128 experts, with just 4 activated per token.
    • GPT‑OSS‑20B: 24 layers, ~21B total parameters, ~3.6B active per token; 32 experts, 4 activated per token.
  • Both are mixture-of-experts (MoE) models and use a 4-bit quantization scheme (MXFP4), enabling fast inference thanks to the small number of active parameters.
  • Text-only reasoning models, with chain-of-thought and adjustable reasoning-effort levels.
  • Instruction following and tool-use support.
  • License: Apache 2.0, with a small complementary use policy.
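The "few experts active per token" idea above can be sketched as a top-k router: every expert gets a score for the token, only the top 4 actually run, and their outputs are mixed using the renormalized router weights. A minimal pure-Python sketch (illustrative only; the real gpt-oss routing internals may differ):

```python
import math
import random

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def topk_route(router_logits, k=4):
    """Pick the k highest-scoring experts and renormalize their weights."""
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([router_logits[i] for i in chosen])
    return list(zip(chosen, weights))

# Toy setup mirroring gpt-oss-20b's 32 experts, 4 active per token.
random.seed(0)
NUM_EXPERTS = 32
# Each "expert" is a trivial scalar function standing in for a feed-forward block.
experts = [lambda x, s=random.uniform(0.5, 2.0): x * s
           for _ in range(NUM_EXPERTS)]

token_repr = 1.0
logits = [random.gauss(0, 1) for _ in range(NUM_EXPERTS)]
routing = topk_route(logits, k=4)   # only 4 of 32 experts run for this token
output = sum(w * experts[i](token_repr) for i, w in routing)
print(len(routing), round(output, 3))
```

This is why the 120B model can answer with only ~5.1B active parameters per token: the other 124 experts simply never execute for that token.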

Fun Facts

GPT-OSS is OpenAI’s first open-weights model release since GPT-2 (released in 2019). That’s over six years!
Benchmark beaters:
  • GPT‑OSS‑120B rivals or outperforms OpenAI’s proprietary o4‑mini in coding (Codeforces) and general knowledge (MMLU, HLE), and even exceeds it on math (AIME) and health (HealthBench).
  • GPT‑OSS‑20B, despite its smaller size, matches or surpasses o3‑mini on many benchmarks, and likewise shines on math and health tasks.

Performance and Benchmarks

Official benchmarks

(Chart: official benchmark results. See more here.)

Artificial Analysis Benchmark

(Chart: Artificial Analysis Intelligence Index, 25 Aug ’25. See AA’s analysis and more from AA here.)

Fun Benchmark: “Pelican riding a bicycle”

Inspired by Simon Willison’s fun experiment (see here), this benchmark is all about how well models generate quirky, imaginative responses. Prompt:
Generate an SVG of a pelican riding a bicycle
You can see our full pelican tests here. So how does gpt-oss do? Let’s see.
(Images: pelican renders from gpt-oss-20b (left) and gpt-oss-120b (right).)

Comparing against other SOTA open source models

(Images: pelican renders from qwen3-235B-2507 (left) and deepseek-r1-0528 (right).)

References