<img alt="" src="https://secure.insightful-enterprise-intelligence.com/783141.png" style="display:none;">
Reserve here

NVIDIA H100 SXMs On-Demand at $2.40/hour - Reserve from just $1.90/hour. Reserve here

Reserve here

Deploy 8 to 16,384 NVIDIA H100 SXM GPUs on the AI Supercloud. Learn More

alert

We’ve been made aware of a fraudulent website impersonating Hyperstack at hyperstack.my.
This domain is not affiliated with Hyperstack or NexGen Cloud.

If you’ve been approached or interacted with this site, please contact our team immediately at support@hyperstack.cloud.

close
|

Updated on 19 Nov 2025

How to Integrate Hyperstack AI Studio as a Provider in LiteLLM

With the rapid evolution of AI-driven development tools, integrating large language models (LLMs) into software systems has become increasingly accessible and modular. Developers are no longer restricted to a single provider: they can build hybrid AI systems by combining inference backends, model management tools, and application-layer SDKs.

Two such powerful tools that make this process seamless are:

  • Hyperstack AI Studio: a complete platform for training, deploying, and managing LLMs.
  • LiteLLM: a unified SDK and proxy layer that lets you call any OpenAI-compatible API with a simple, standard interface.

In this guide, we will explore three main areas:

  1. What LiteLLM is and why it’s useful.
  2. Why Hyperstack AI Studio is the perfect match for it.
  3. Step-by-step integration of the two tools, using the Python SDK method and the Proxy Server method.

Let's get started!

Understanding LiteLLM

What is LiteLLM?

LiteLLM is an open-source Python SDK and proxy server designed to simplify communication with OpenAI-compatible model APIs. It provides a unified interface that allows you to connect multiple providers, such as OpenAI, Anthropic, Mistral, or Hyperstack, using the same function calls.

Essentially, LiteLLM acts as a universal translator for AI models.

Instead of learning different SDKs and authentication formats for each provider, LiteLLM lets you write once and run anywhere.
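For example, switching providers is just a matter of changing the model string passed to completion(). A minimal sketch (the model names are illustrative, and each provider needs its own API key set in the environment):

from litellm import completion

messages = [{"role": "user", "content": "Hello!"}]

# Same function call, different providers; only the model string changes.
# Each provider expects its own API key (e.g. OPENAI_API_KEY, ANTHROPIC_API_KEY).
openai_reply = completion(model="gpt-4o-mini", messages=messages)
anthropic_reply = completion(model="anthropic/claude-3-5-sonnet-20240620", messages=messages)

print(openai_reply["choices"][0]["message"]["content"])
print(anthropic_reply["choices"][0]["message"]["content"])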

Why Use LiteLLM?

  • Unified Interface: Call any OpenAI-compatible model using one function, completion().
  • Provider Agnostic: Works with 100+ LLMs from different providers (OpenAI, Anthropic, Hyperstack, etc.).
  • Proxy Mode: Optionally run as a local or hosted server to manage API routing, usage monitoring, and team-level model sharing.
  • Secure & Configurable: Manage keys, API bases, and routing with simple .env configuration.
  • Scalable: Integrate with apps, dashboards, and backends without vendor lock-in.

In short, LiteLLM helps teams use any LLM through a single SDK, while still allowing advanced setups like caching, load balancing, and proxying.

For more details, you can visit LiteLLM Official Website for the latest updates and features, and check out the LiteLLM Documentation.

Importance of Hyperstack AI Studio

Hyperstack AI Studio is a full-stack AI development platform that enables developers and enterprises to train, fine-tune, deploy, and benchmark LLMs without managing GPU infrastructure.

Hyperstack enhances the AI workflow with several developer-focused benefits:

  • OpenAI-Compatible API: Works directly with LiteLLM and other OpenAI clients.
  • Fine-Tuning Support: Train models on your custom data to meet specific application needs.
  • Evaluation Frameworks: Benchmark and test models after fine-tuning.
  • Playground Interface: Experiment interactively with different models before deploying.
  • Flexible Model Catalog: Supports multiple open-source models (e.g., gpt-oss-120b, Llama-3.1-8B-Instruct, etc.).
  • Secure and Private: Keep data isolated while leveraging managed GPU infrastructure.
  • Cost-Effective and Scalable: Pay only for what you use and scale as needed.

For more details, visit the Hyperstack AI Studio and check out the Hyperstack Documentation.

Why Hyperstack AI Studio + LiteLLM Is a Perfect Match

Integrating Hyperstack AI Studio with LiteLLM gives developers two powerful tools that complement each other perfectly:

| Hyperstack AI Studio | LiteLLM |
| --- | --- |
| AI backend: runs inference, fine-tuning, evaluation, and hosting | Interface layer: routes requests and provides SDK/proxy access |
| OpenAI-compatible API | Supports the OpenAI schema out of the box |
| Scalable, secure model infrastructure | Unified function calls and proxy endpoints |
| Fine-tuned custom model hosting | Easy local or cloud integration |

In short, Hyperstack provides the model training and hosting capabilities, while LiteLLM offers the developer-friendly interface and API management.

Integration of Hyperstack AI Studio with LiteLLM

There are two ways to integrate Hyperstack AI Studio with LiteLLM:

  1. Using the Python SDK (direct code-based integration)
  2. Using the LiteLLM Proxy Server (self-hosted API gateway)

Let’s go through both step-by-step.

Integration Using Python SDK

Step 1: Install LiteLLM

First, install the LiteLLM package via pip:

pip install litellm

When you run this command, it will download and install the LiteLLM library along with its dependencies.

This installs the main SDK that provides the completion() function and other utilities to interact with LLMs.

Step 2: Retrieve Hyperstack API Details

  1. Go to the Hyperstack Console and log in with your credentials.
  2. Navigate to the AI Studio Playground to explore available models before integrating them with LiteLLM.

Hyperstack AI Studio Playground

In the playground, select your desired model after quickly testing it in the interface. We are going with llama-3.1-8b-instruct for this integration.

Then click on the API section to get the Base URL and Model ID.

You can check the available models on the base model documentation page. Copy the model ID and base URL from there; we will need them in the next step.

Step 3: Generate an API Key

To authenticate, we will need a valid API key from Hyperstack AI Studio.

  1. Go to the API Keys section in the Hyperstack console.

  2. Click Generate New Key.

  3. Give it a name (e.g., LiteLLM-integration-key).

  4. Copy the generated key; we will use it in the LiteLLM setup.

Hyperstack API key generation

Now that we have the required details, let's put them to use in LiteLLM.

Step 4: Connect Hyperstack AI Studio to LiteLLM

We first have to import the necessary libraries; import litellm and os as shown below:

# Import the 'completion' function from the litellm library
from litellm import completion

# Import the 'os' module for interacting with the operating system environment
import os

Next, we need to set the environment variables for the Hyperstack API key and base URL. Replace "your_hyperstack_api_key" with the actual API key you obtained from the Hyperstack console.

# Set the Hyperstack API key as an environment variable
# Replace 'your_hyperstack_api_key' with the actual API key from your console
os.environ["OPENAI_API_KEY"] = "your_hyperstack_api_key"

# Set the base URL for Hyperstack's OpenAI-compatible API
os.environ["OPENAI_API_BASE"] = "https://console.hyperstack.cloud/ai/api/v1"

We can now call the completion() function to interact with the Hyperstack model. Here’s a simple example that sends a message and prints the response:

# Try executing the following block
try:
    # Call the LiteLLM completion function
    # Here, we specify:
    # - model: The name of the Hyperstack-hosted model (example: Llama 3.1 8B)
    # - messages: The chat-like input structure, similar to OpenAI’s ChatCompletion API
    response = completion(
        model="openai/meta-llama/Llama-3.1-8B-Instruct",
        messages=[{"role": "user", "content": "hi there"}],
    )

    # Print the assistant's response from the output
    print(response['choices'][0]['message']['content'])

# Catch any exceptions (e.g., invalid API key, network issues)
except Exception as e:
    # Print the error message for debugging
    print("Error:", e)

Note that we have specified the model name as openai/meta-llama/Llama-3.1-8B-Instruct. The openai/ prefix tells LiteLLM to route the request through its OpenAI-compatible provider, which is what Hyperstack exposes.

When you run this code, it should print a response from the model.

How can I assist you today?

This confirms that our LiteLLM SDK is successfully communicating with the Hyperstack API endpoint.
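LiteLLM also supports streaming with the same interface. A minimal sketch, assuming the same environment variables as above are set; passing stream=True makes completion() yield chunks instead of a single response:

from litellm import completion

# Stream the reply token by token instead of waiting for the full response.
stream = completion(
    model="openai/meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Write one sentence about GPUs."}],
    stream=True,
)

for chunk in stream:
    # Chunks follow the OpenAI streaming schema; content can be None
    # (e.g. in the final chunk), so guard before printing.
    content = chunk.choices[0].delta.content
    if content:
        print(content, end="", flush=True)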

Integration Using LiteLLM Proxy Server

We normally use the LiteLLM Python SDK for development, but for production or team environments, the LiteLLM Proxy Server is a more robust solution.

The LiteLLM Proxy Server acts as a local gateway that can route API requests to any compatible backend, including Hyperstack. It also provides a web dashboard for managing models, API keys, and requests.

Step 1: Get the Proxy Server Code

The LiteLLM proxy runs via Docker, so make sure Docker is installed first. You can download the Docker Desktop application for your operating system.

Once Docker is installed, download the setup files for the LiteLLM proxy with the following commands:

# Download the official docker-compose file
curl -O https://raw.githubusercontent.com/BerriAI/litellm/main/docker-compose.yml

# Download the Prometheus configuration file for metrics
curl -O https://raw.githubusercontent.com/BerriAI/litellm/main/prometheus.yml

We are downloading two files:

  1. docker-compose.yml: This file contains the configuration for running the LiteLLM proxy server and its dependencies (like Prometheus for monitoring).
  2. prometheus.yml: This file contains the configuration for Prometheus, which is used to monitor the performance and usage of the LiteLLM proxy server.

Step 2: Configure Environment Variables

We need to set up a couple of environment variables for the LiteLLM proxy server to work correctly.

# Create a .env file and add master key for admin access
echo 'LITELLM_MASTER_KEY="sk-1234"' > .env

# Add a random salt key used for encryption of credentials
# You can generate one using any password generator, e.g. 1Password
echo 'LITELLM_SALT_KEY="sk-1234"' >> .env

# Source the environment variables
source .env

We are setting two important variables here:

  1. LITELLM_MASTER_KEY: This is the master key used for admin access to the LiteLLM dashboard.
  2. LITELLM_SALT_KEY: This is a random salt key used for encrypting credentials.

Make sure to replace "sk-1234" with your own secure keys.
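If you do not have a password generator at hand, one quick option is Python's secrets module (a minimal sketch; the sk- prefix is only a naming convention used in the LiteLLM examples):

python3 -c 'import secrets; print("sk-" + secrets.token_urlsafe(32))'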

Step 3: Start the Proxy Server

Run the following command to start LiteLLM via Docker Compose:

docker compose up

Once we run this command, Docker will pull the necessary images and start the LiteLLM proxy server along with Prometheus for monitoring.

Once started, you can access the LiteLLM web interface at: http://0.0.0.0:4000

Step 4: Login to the Admin Panel

Use the master key you defined earlier (LITELLM_MASTER_KEY) to log in to the LiteLLM Admin Dashboard.

Once logged in, you should see the main dashboard.

Step 5: Add Hyperstack Model to the Proxy Server

Now we need to add our Hyperstack model to the LiteLLM proxy server.

  1. Navigate to the Models section from the left sidebar.
  2. Click on the “Add Model” button.

Select “OpenAI Compatible Model”, since Hyperstack exposes an OpenAI-compatible API.

We have to provide the same details we used in the Python SDK method:

  • Model Name: meta-llama/Llama-3.1-8B-Instruct
  • Base URL: https://console.hyperstack.cloud/ai/api/v1
  • API Key: Your generated Hyperstack API key

Then click Add Model.

Now that the model is added, you should see it listed in the Models section.

Step 6: Test the Model in the Playground

Go to the test playground to verify that the model is working correctly.

  1. Select the model you just added.
  2. Enter a prompt such as:
    "Capital of Nepal"

The model responds with the correct answer:

The capital of Nepal is Kathmandu.

Great! This confirms that the LiteLLM proxy server is successfully communicating with the Hyperstack model.

Step 7: Access via cURL or Python

Once verified, you can call the same model via API requests. For example, send a cURL request to the LiteLLM proxy server, authenticating with your LiteLLM master key (or a virtual key generated from the dashboard):

curl --location 'http://0.0.0.0:4000/chat/completions' \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer sk-1234' \
    --data '{
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "messages": [
        {
        "role": "user",
        "content": "what llm are you"
        }
    ]
}'

When we execute this command, we get a response from the model hosted on Hyperstack via the LiteLLM proxy server:

{
    "id": "chatcmpl-7Jt1mQkz4b5c6d7e8f9g0h1i2j3k4l",
    "object": "chat.completion",
    "created": 252351225662,
    "model": "meta-llama/Llama-3.1-8B-Instruct",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "I am a large language model developed by Meta AI, based on the LLaMA architecture."
            },
            "finish_reason": "stop"
        }
    ],
    "usage": {
        "prompt_tokens": 10,
        "completion_tokens": 20,
        "total_tokens": 30
    }
}

This confirms that your LiteLLM Proxy Server is successfully routing traffic through Hyperstack AI Studio!
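Since this step also mentions Python, here is a minimal sketch of the same request made with the OpenAI Python SDK pointed at the proxy (assuming the proxy is running locally and you authenticate with your LiteLLM master key or a virtual key):

# pip install openai
from openai import OpenAI

# Point the client at the LiteLLM proxy instead of api.openai.com.
# The api_key here is a LiteLLM key (master key or a virtual key from the
# dashboard), not your Hyperstack API key; the proxy holds that credential.
client = OpenAI(base_url="http://0.0.0.0:4000", api_key="sk-1234")

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",
    messages=[{"role": "user", "content": "what llm are you"}],
)

print(response.choices[0].message.content)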

Monitoring and Next Steps

Hyperstack provides built-in monitoring tools to track the usage and cost of your API calls. Go to the Usage Dashboard in Hyperstack to see your consumption metrics.

The LiteLLM Admin Panel also provides a high-level overview of the inference endpoints in use, so you can review model performance, latency, and error logs from the dashboard.

Next Steps:

We can go one step further with a few additional steps:

  • Fine-tune a model in Hyperstack and redeploy it via LiteLLM.
  • Add multiple models to compare responses and latency.
  • Use LiteLLM caching or load-balancing features for production workloads (see the Router sketch after this list).
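For the load-balancing item above, LiteLLM ships a Router class that spreads traffic across one or more deployments of the same logical model. A minimal sketch, assuming the Hyperstack details from earlier (the "hyperstack-llama" alias is just an illustrative name):

from litellm import Router

# One logical model name ("hyperstack-llama") backed by one or more deployments.
router = Router(
    model_list=[
        {
            "model_name": "hyperstack-llama",  # illustrative alias used by callers
            "litellm_params": {
                "model": "openai/meta-llama/Llama-3.1-8B-Instruct",
                "api_key": "your_hyperstack_api_key",
                "api_base": "https://console.hyperstack.cloud/ai/api/v1",
            },
        },
        # Add more entries here (e.g. another model or deployment) and the
        # Router will load-balance requests across them.
    ]
)

response = router.completion(
    model="hyperstack-llama",
    messages=[{"role": "user", "content": "hi there"}],
)
print(response["choices"][0]["message"]["content"])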

Conclusion

By integrating Hyperstack AI Studio with LiteLLM, we achieve a flexible and scalable AI pipeline:

  • Hyperstack AI Studio: acts as the backend model provider (fine-tuning, hosting, benchmarking).
  • LiteLLM: acts as the frontend bridge (SDK, API, proxy, and management dashboard).

This combination provides:

  • A unified interface for inference calls.
  • Full control over custom LLMs via Hyperstack.
  • An OpenAI-compatible and cost-efficient deployment architecture.

FAQ

1. Do I need a Hyperstack account?

Yes. Sign up at Hyperstack Console to generate API keys and access the AI Studio playground.

2. Can I use my fine-tuned models from Hyperstack with LiteLLM?

Absolutely. Once fine-tuned, simply use your model’s API endpoint and name in LiteLLM’s configuration.

3. Can I connect multiple models?

Yes. LiteLLM lets you add multiple Hyperstack endpoints (e.g., gpt-oss-120b, Llama-3.1-8B-Instruct, etc.) and switch between them easily.
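For example, with the SDK setup from earlier, switching models is just a different model string per call (a minimal sketch; the gpt-oss-120b ID is illustrative, so copy the exact IDs from the base model documentation page):

from litellm import completion

messages = [{"role": "user", "content": "Summarise LiteLLM in one sentence."}]

# Same Hyperstack credentials, different hosted models.
llama_reply = completion(model="openai/meta-llama/Llama-3.1-8B-Instruct", messages=messages)
# Illustrative model ID; check the base model documentation page for the exact value.
oss_reply = completion(model="openai/gpt-oss-120b", messages=messages)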

4. Where can I learn more?

To learn more about LiteLLM, visit the LiteLLM Documentation. For Hyperstack, check out the Hyperstack Documentation.
