For the first time since GPT‑2, OpenAI has released fully open‑weight language models that you can download, run and own. The long‑awaited gpt‑oss‑20B and gpt‑oss‑120B ship under the Apache 2.0 license, putting state‑of‑the‑art AI directly in your hands.
This release is a turning point for AI infrastructure. Organisations can now deploy, fine‑tune and scale frontier‑level models on‑premises or in private clouds, with full control over performance, cost and data privacy.
In this blog, we discuss GPT-OSS and show how Hyperstack makes running these open models effortless at enterprise scale.
OpenAI recently released gpt‑oss‑120B and gpt‑oss‑20B, its first open‑weight language models since GPT‑2 in 2019. These are distributed under the Apache 2.0 license, meaning anyone can download, inspect, deploy, fine‑tune or redistribute the models freely for commercial or research use. These models give organisations the power to run state‑of‑the‑art AI models without relying on a proprietary API.
GPT‑OSS models are Transformers leveraging Mixture‑of‑Experts (MoE) to optimise efficiency. Instead of activating all parameters for every token, MoE enables the model to use only a fraction of the parameters, reducing memory and computation costs while preserving performance.
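To make the routing idea concrete, here is a toy top‑k MoE layer in NumPy. This is an illustrative sketch only, not the actual GPT‑OSS architecture: the dimensions, router and expert count are invented for the example.

```python
import numpy as np

def moe_forward(x, expert_weights, router_weights, k=2):
    """Toy Mixture-of-Experts layer: route one token to its top-k experts.

    x:              (d,) single token embedding
    expert_weights: list of (d, d) matrices, one per expert
    router_weights: (num_experts, d) router projection
    k:              number of experts activated per token
    """
    # The router produces one logit per expert, softmaxed into a distribution
    logits = router_weights @ x
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # Only the top-k experts are evaluated; the rest are skipped entirely,
    # which is where the memory and compute savings come from
    top_k = np.argsort(probs)[-k:]
    gate = probs[top_k] / probs[top_k].sum()

    out = np.zeros_like(x)
    for g, idx in zip(gate, top_k):
        out += g * (expert_weights[idx] @ x)
    return out, top_k

rng = np.random.default_rng(0)
d, num_experts = 8, 4
experts = [rng.standard_normal((d, d)) for _ in range(num_experts)]
router = rng.standard_normal((num_experts, d))
token = rng.standard_normal(d)

out, active = moe_forward(token, experts, router, k=2)
print(len(active), "of", num_experts, "experts active for this token")
```

Only 2 of the 4 experts run for the token, yet the output keeps the full model dimension; scaled up, this is how GPT‑OSS keeps per‑token compute far below its total parameter count.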
GPT‑OSS is available in two sizes:
- gpt‑oss‑20B: 21 billion parameters, light enough to run on smaller GPUs for frequent inference and local fine‑tuning
- gpt‑oss‑120B: 117 billion parameters, built for high‑throughput reasoning and long‑context applications on H100‑class GPUs
OpenAI benchmarked the models across reasoning, coding, health and competition mathematics, comparing them against their proprietary models.
While GPT‑OSS is open and flexible, it still requires high‑performance compute to run efficiently.
OpenAI’s release of gpt‑oss‑20B and 120B under Apache 2.0 licensing gives you complete freedom to self‑host:
- Download the weights and run them on infrastructure you control, on‑premises or in a private cloud
- Fine‑tune, modify and redistribute the models for commercial or research use
- Serve inference without depending on a proprietary API
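As a sketch of what self‑hosting can look like, an OpenAI‑compatible endpoint can be stood up with an open inference server such as vLLM. The checkpoint id `openai/gpt-oss-20b` and the port are assumptions for illustration; adjust them to your environment:

```shell
# Install an inference server (vLLM shown here as one option)
pip install vllm

# Serve gpt-oss-20B behind an OpenAI-compatible HTTP API
# (model id assumed to be openai/gpt-oss-20b on Hugging Face)
vllm serve openai/gpt-oss-20b --port 8000

# Query the self-hosted endpoint with plain curl
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "openai/gpt-oss-20b", "messages": [{"role": "user", "content": "Hello"}]}'
```

Because the API shape matches OpenAI's, existing client code can typically be pointed at the self-hosted endpoint by changing only the base URL.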
Hyperstack is the first European‑owned GPU cloud to enable this at enterprise scale. We provide on‑demand access to NVIDIA H100 GPUs, high-speed networking and ultra‑fast NVMe storage, ensuring your GPT‑OSS workloads run smoothly and efficiently.
Deployment choice matters most for organisations handling sensitive or regulated data. With Hyperstack, you can:
- Choose the region where your models run and your data resides
- Keep inference and fine‑tuning within Europe on European‑owned infrastructure
- Avoid routing sensitive data through third‑party APIs
This means compliance, latency control and data sovereignty are built into your AI workflow, giving you complete confidence in where your models and data reside.
Unlike SaaS‑wrapped AI platforms, GPT‑OSS models are fully open‑weight: you can inspect, fine‑tune and redistribute the weights under Apache 2.0, with no vendor lock‑in on how or where you deploy them.
Thinking of trying the latest GPT‑OSS models? Here’s why Hyperstack is your ideal platform to run them at scale.
Hyperstack offers on‑demand access to high‑performance GPUs to run the latest gpt‑oss‑20B and 120B models. The 20B model runs comfortably on smaller GPUs such as the NVIDIA RTX A6000, making it ideal for frequent inference, local fine‑tuning and edge‑ready workloads.
For the larger model, deploy gpt‑oss‑120B on NVIDIA H100 GPUs for high‑throughput reasoning and long‑context applications. The H100 GPUs on Hyperstack support high‑speed networking of up to 350 Gbps, ensuring fast data transfer and minimal bottlenecks for large‑scale inference.
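A quick back‑of‑envelope calculation shows why the 120B model calls for H100‑class memory. This is illustrative only: it counts raw bf16 weights and ignores KV cache, activations and quantisation (released checkpoints may be quantised to fit on fewer GPUs):

```python
def weight_memory_gb(params_billions, bytes_per_param=2):
    """Rough memory needed just to hold the weights (bf16/fp16 = 2 bytes/param)."""
    return params_billions * 1e9 * bytes_per_param / 1024**3

# gpt-oss-20B: 21B parameters; gpt-oss-120B: 117B parameters
for name, params in [("gpt-oss-20B", 21), ("gpt-oss-120B", 117)]:
    gb = weight_memory_gb(params)
    h100s = -(-gb // 80)  # ceiling division against 80 GB per H100
    print(f"{name}: ~{gb:.0f} GB of bf16 weights -> at least {h100s:.0f}x H100 (80 GB)")
```

Even before accounting for the KV cache that long‑context inference needs, the 120B weights alone exceed a single 80 GB card at bf16 precision.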
Organisations can run GPT‑OSS entirely within Europe, keeping their data sovereign and compliant. We are also SOC 2 Type 1 certified, ensuring your workloads meet enterprise‑grade security and operational standards.
AI workloads can be spiky and unpredictable, but Hyperstack helps you save costs without compromising performance:
- On‑demand pricing with per‑minute billing, so you pay only for the minutes your GPUs run
- Hibernation options to pause idle VMs and stop paying for compute between runs
- NVIDIA H100 GPUs from $1.90 per hour on demand
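As an illustration of how per‑minute billing plays out at the on‑demand H100 rate of $1.90 per hour:

```python
def on_demand_cost(minutes, rate_per_hour=1.90):
    """Per-minute billing: cost scales with the exact minutes used."""
    return round(minutes * rate_per_hour / 60, 2)

# A 100-minute inference job on one H100 is billed as 100 minutes,
# not rounded up to 2 full hours
print(f"100 minutes: ${on_demand_cost(100)}")
print(f"8-hour fine-tune: ${on_demand_cost(8 * 60)}")
```

For spiky workloads, the difference between per‑minute and per‑hour rounding compounds quickly across many short runs.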
Hyperstack supports the full AI lifecycle:
- Inference: serve gpt‑oss‑20B or gpt‑oss‑120B from your own endpoints
- Fine‑tuning: adapt the models to your domain on dedicated GPUs
- Scaling: grow from a single GPU to multi‑GPU deployments as demand increases
Spin up RTX A6000 and NVIDIA H100 GPUs to run your choice of gpt‑oss‑20B or gpt‑oss‑120B with ease. Deploy in minutes and keep your data fully in your control with Hyperstack’s enterprise‑grade GPU cloud.
GPT‑OSS is OpenAI’s new open‑weight language model series, freely downloadable under Apache 2.0, allowing full control over hosting and usage.
Two models are available: gpt‑oss‑20B with 21B parameters and gpt‑oss‑120B with 117B parameters for advanced AI tasks.
gpt‑oss‑120B requires high‑performance NVIDIA H100 GPUs, delivering high throughput for enterprise‑scale reasoning, long‑context inference and model fine‑tuning.
Hyperstack offers NVIDIA H100 GPUs for $1.90 per hour on-demand with per‑minute billing and hibernation cost‑saving options.