Build a Python SDK with Streaming
This tutorial includes the following SDK languages and versions:
| TypeScript v1 | TypeScript v2 | Java | Python v1 | Python v2 | C# | Go | PHP |
| --- | --- | --- | --- | --- | --- | --- | --- |
| ❌ | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ |
Streaming data is a common pattern when calling AI Large Language Models (LLMs) like ChatGPT, Claude, Llama, or Mistral.
In this post we will explore how to seamlessly create an SDK for your favorite LLM using the liblab SDK generator.
The example in this tutorial uses Ollama to host an LLM locally on your computer; however, you can apply the same principles to any LLM API that provides an OpenAPI file.
The OpenAPI file describes the API: its paths, parameters, security schemes, and other details.
Prerequisites
- A liblab account
- The liblab CLI installed and you are logged in
- Python ≥ 3.9
- Ollama installed and running
- A Large Language Model, such as Llama 3.1
Steps
- Setting up Ollama and installing Llama 3.1
- Setting up the liblab CLI
- Generating the SDK
- Using the SDK
- How to enable streaming endpoints
- Conclusion
1. Setting up an example Llama API
First, go to the Ollama homepage to download and install the latest version.
Then, once Ollama is installed and running, execute the following command on a console to download the latest version of Llama 3.1:
ollama pull llama3.1
To verify that Llama 3.1 is installed, run the following command:
ollama run llama3.1
If all is well, you will be able to send prompts to the model and receive responses. For example:
ollama run llama3.1
>>> tell me a joke
Here's one:
What do you call a fake noodle?
(wait for it...)
An impasta!
Hope that made you laugh! Do you want to hear another?
>>>
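Under the hood, ollama run talks to a local REST API on port 11434, and that API is what our generated SDK will call. To see the raw streaming behavior before we generate anything, here is a minimal Python sketch (assuming the requests package is installed); when streaming, Ollama returns one JSON object per line:
import json
import requests  # assumption: installed via `pip install requests`

# Send a prompt to the local Ollama server (default port 11434).
# With "stream": True (Ollama's default), the response body is a stream
# of newline-delimited JSON objects, each carrying a piece of the answer.
with requests.post(
    "http://localhost:11434/api/generate",
    json={"model": "llama3.1", "prompt": "tell me a joke", "stream": True},
    stream=True,
) as response:
    for line in response.iter_lines():
        if line:
            chunk = json.loads(line)
            print(chunk["response"], end="", flush=True)
This is exactly the request and response shape we will describe in the OpenAPI spec below.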
2. Setting up the liblab CLI
First, ensure you have the liblab CLI installed. If not, you can install it via npm:
npm install -g @liblab/cli
Once installed, you need to log in to your liblab account. Run the following command and follow the prompts to log in:
liblab login
After logging in, you can configure the CLI for your project. We want a new directory for the SDK, so let's create one called streaming:
mkdir -p streaming
cd streaming
liblab init
This will generate a liblab.config.json file.
Before we edit the generated config file, let's create the API spec for which we will generate this SDK.
ℹ️ Usually we don't need to create the API spec ourselves, since most APIs provide one. However, we are running Ollama locally, and it does not provide an OpenAPI spec.
Create a new file called ollama-open-api.yaml and paste the following content into it:
openapi: 3.0.0
info:
  title: Ollama API
  description: This is an OpenAPI spec for Ollama, created internally by liblab. This is not an official API spec.
  version: 1.0.0
paths:
  /api/generate:
    post:
      description: Send a prompt to an LLM.
      operationId: generate
      x-liblab-streaming: true
      requestBody:
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/GenerateRequest'
      responses:
        '200':
          description: OK
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/GenerateResponse'
components:
  schemas:
    GenerateRequest:
      type: object
      required:
        - model
        - prompt
      properties:
        model:
          type: string
        prompt:
          type: string
        stream:
          type: boolean
    GenerateResponse:
      type: object
      required:
        - model
        - created_at
        - response
      properties:
        model:
          type: string
        created_at:
          type: string
        response:
          type: string
        done:
          type: boolean
        done_reason:
          type: string
        context:
          type: array
          items:
            type: integer
        total_duration:
          type: integer
        load_duration:
          type: integer
        prompt_eval_count:
          type: integer
        prompt_eval_duration:
          type: integer
        eval_count:
          type: integer
        eval_duration:
          type: integer
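Note the x-liblab-streaming: true extension on the /api/generate operation; this flags the operation as streaming for the SDK generator (the config below sets the same thing via endpointCustomizations). Before moving on, it's worth checking that the file is valid YAML. A quick optional sketch, assuming PyYAML is installed:
import yaml  # assumption: installed via `pip install pyyaml`

# Parse the spec and print a couple of fields as a sanity check.
with open("ollama-open-api.yaml") as f:
    spec = yaml.safe_load(f)

print(spec["info"]["title"])       # Ollama API
print(list(spec["paths"].keys()))  # ['/api/generate']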
Now let's update the liblab.config.json file to use our new spec. Copy and paste the following JSON to overwrite the liblab.config.json file:
{
  "sdkName": "ollama-sdk",
  "apiVersion": "1.0.0",
  "apiName": "ollama-api",
  "specFilePath": "./ollama-open-api.yaml",
  "languages": ["python"],
  "auth": [],
  "customizations": {
    "baseURL": "http://localhost:11434",
    "includeOptionalSnippetParameters": true,
    "devContainer": false,
    "generateEnv": true,
    "inferServiceNames": false,
    "injectedModels": [],
    "license": {
      "type": "MIT"
    },
    "responseHeaders": false,
    "retry": {
      "enabled": true,
      "maxAttempts": 3,
      "retryDelay": 150
    },
    "endpointCustomizations": {
      "/api/generate": {
        "post": {
          "streaming": "true"
        }
      }
    }
  },
  "languageOptions": {
    "python": {
      "alwaysInitializeOptionals": false,
      "pypiPackageName": "",
      "githubRepoName": "",
      "ignoreFiles": [],
      "sdkVersion": "1.0.0",
      "liblabVersion": "2"
    }
  },
  "publishing": {
    "githubOrg": ""
  }
}
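Notice the endpointCustomizations block: it marks the POST /api/generate operation as streaming, matching the x-liblab-streaming flag we set in the spec. Before generating, you can optionally confirm that the config parses and that Ollama is reachable at the configured baseURL. A small sketch, assuming Ollama's root endpoint replies with the text "Ollama is running" (its usual health response):
import json
import urllib.request

# Load the generator config and read the base URL we pointed at Ollama.
with open("liblab.config.json") as f:
    config = json.load(f)

base_url = config["customizations"]["baseURL"]
print("Spec file:", config["specFilePath"])

# Hit the server root; a running Ollama instance answers with a short text body.
with urllib.request.urlopen(base_url) as resp:
    print(resp.read().decode())  # expected: "Ollama is running"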