Both are mixture-of-experts (MoE) models and use a 4-bit quantization scheme (MXFP4); together with the small number of parameters active per token, this enables fast inference.
GPT‑OSS‑120B: 36 layers, ~117B total parameters, ~5.1B active per token; 128 experts, 4 activated per token.
GPT‑OSS‑20B: 24 layers, ~21B total parameters, ~3.6B active per token; 32 experts, 4 activated per token.
Reasoning, text-only models, with chain-of-thought and adjustable reasoning effort levels (see the usage sketch after this list).
Instruction following and tool use support.
License: Apache 2.0, with a small complementary use policy.
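To make the reasoning-effort and inference points concrete, here is a minimal sketch of running the smaller model locally. It assumes the Hugging Face `transformers` library (plus `accelerate`), the Hub repo id `openai/gpt-oss-20b`, and the convention of setting reasoning effort (low / medium / high) via the system prompt; treat these as assumptions rather than official usage.

```python
# Minimal local-inference sketch (assumptions: repo id and system-prompt convention).
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",  # assumed Hub repo id; the 120B variant follows the same pattern
    torch_dtype="auto",          # let transformers pick an appropriate dtype
    device_map="auto",           # shard the model across available GPUs
)

messages = [
    # Assumed convention: reasoning effort is set in the system prompt.
    {"role": "system", "content": "Reasoning: high"},
    {"role": "user", "content": "Summarize the trade-offs of activating only 4 of 32 experts per token."},
]

result = generator(messages, max_new_tokens=256)
# With chat-style input the pipeline returns the conversation; the last message is the model's reply.
print(result[0]["generated_text"][-1]["content"])
```

The same pattern applies to GPT‑OSS‑120B, given enough GPU memory to hold the MoE weights.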
GPT-OSS is OpenAI’s first open-source model release since GPT-2 (released in 2019). That’s over 6 years!

Benchmark beaters:
GPT‑OSS‑120B rivals or outperforms OpenAI’s proprietary o4‑mini on coding (Codeforces) and general knowledge (MMLU, HLE), and even exceeds it on math (AIME) and health (HealthBench).
GPT‑OSS‑20B, despite its much smaller size, matches or surpasses o3‑mini on many benchmarks, and likewise shines on math and health tasks.