Accelerating KBY-AI SDKs with Kubernetes Configuration
KBY-AI's server SDKs can run on a Kubernetes configuration to enable acceleration and handle multiple requests efficiently. With a Kubernetes deployment, you can send multiple requests in parallel and receive the responses simultaneously, which significantly reduces API response time.
To validate the performance, we tested the KBY-AI ID Document Liveness SDK by measuring the response time when sending multiple requests in parallel.
Creating EKS Cluster
EKS stands for Elastic Kubernetes Service. It is a managed Kubernetes service offered by Amazon Web Services (AWS) that allows users to run Kubernetes without having to manage the underlying infrastructure, and it simplifies deploying, managing, and scaling containerized applications on AWS.
AWS provides official documentation for creating an EKS cluster. You can follow their step-by-step guide to create an EKS cluster in the AWS console.
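As an alternative to the console flow, a cluster can also be created from the command line with eksctl. A minimal sketch of a cluster definition (the cluster name is a placeholder, not a value from this test; `us-east-1` matches the region of the load balancer endpoint used in the script further below):

```yaml
# cluster.yaml -- minimal eksctl cluster definition (illustrative values)
apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig
metadata:
  name: kby-ai-cluster   # placeholder name
  region: us-east-1
```

Applying it with `eksctl create cluster -f cluster.yaml` provisions the control plane.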

Adding Node Group To Cluster
Once you have created the EKS cluster, you need to add a node group to it.
We added a node group with 20 nodes to the cluster to measure the response time of the KBY-AI ID Document Liveness Detection SDK under multi-threading conditions. Each node was configured with 2 CPU cores and 8GB of RAM.

We allocated CPU, RAM, and pods to each node as shown in the diagram below.
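With eksctl, a node group like the one described above can be declared in the same ClusterConfig file. A sketch assuming `m5.large` instances (an AWS instance type with 2 vCPUs and 8 GiB of RAM, matching the per-node spec used in this test; the group name is a placeholder):

```yaml
# Node group matching the test setup: 20 nodes, 2 CPU cores / 8GB RAM each
managedNodeGroups:
  - name: kby-ai-nodes       # placeholder name
    instanceType: m5.large   # 2 vCPUs, 8 GiB RAM
    desiredCapacity: 20
```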

Preparing Python Script To Measure API Response Time
We provided a Python script to measure API response time when sending 1,000 requests simultaneously using a Base64-encoded image.
To run the script, you need to prepare the Base64 image data (a base64.txt file).
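One way to produce that file is to Base64-encode an image yourself. A small helper sketch (this function is our own illustration, not part of the SDK):

```python
import base64

def image_to_base64_file(image_path: str, output_path: str = "base64.txt") -> None:
    """Read an image file and write its Base64 encoding to a text file."""
    with open(image_path, "rb") as img:
        encoded = base64.b64encode(img.read()).decode("ascii")
    with open(output_path, "w") as out:
        out.write(encoded)
```

For example, `image_to_base64_file("id_card.jpg")` writes base64.txt next to the script, ready for the measurement script below to read.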
import requests
import time
from concurrent.futures import ThreadPoolExecutor

# API Endpoint
api_url = "http://ad90b13a512d447b38c87a1570b3660c-657463739.us-east-1.elb.amazonaws.com:80/process_image_base64"
# api_url = "http://44.218.215.247:9001/process_image_base64"

# Load Base64 Data from File
base64_file = "base64.txt"
try:
    with open(base64_file, "r") as file:
        base64_data = file.read().strip()  # Read and remove any extra spaces/newlines
except FileNotFoundError:
    print(f"Error: {base64_file} not found.")
    exit(1)

# Example Payload
payload = {
    "base64": base64_data
}

# Number of concurrent threads
num_threads = 1000

# Function to make an API call
def call_api(thread_id):
    start_time = time.time()
    try:
        response = requests.post(api_url, json=payload, timeout=1000)  # Set timeout
        elapsed_time = time.time() - start_time
        print(f"Thread {thread_id}: {response.status_code} - {response.text[:100]} (Time: {elapsed_time:.2f}s)")
    except requests.exceptions.RequestException as e:
        print(f"Thread {thread_id}: Request failed - {e}")

# Execute concurrent requests
if __name__ == "__main__":
    start_time = time.time()
    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        futures = [executor.submit(call_api, i) for i in range(num_threads)]
        for future in futures:
            future.result()  # Ensure all tasks complete
    total_time = time.time() - start_time
    print(f"Total Execution Time: {total_time:.2f}s")
Result
We ran the Python script against the API from both the EKS cluster and an EC2 instance, to compare sending requests simultaneously to the cluster against sending them serially from the EC2 instance.
It took 1,381.77 seconds to receive all 1,000 responses through the API from the EC2 instance when sending requests sequentially.

In contrast, it took 57.25 seconds to receive all 1,000 responses through the API from the EKS Kubernetes cluster when sending all requests simultaneously.
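Taken together, the two measured totals work out to roughly a 24x speedup for the parallel EKS run:

```python
# Speedup of the parallel EKS run over the serial EC2 run,
# using the totals measured in this test.
serial_ec2_s = 1381.77   # 1,000 sequential requests from EC2
parallel_eks_s = 57.25   # 1,000 concurrent requests against EKS
speedup = serial_ec2_s / parallel_eks_s
print(f"Speedup: {speedup:.1f}x")  # prints "Speedup: 24.1x"
```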

As you can see, deploying the KBY-AI ID Document Liveness Detection SDK to an EKS cluster with a node group significantly reduced API response time.
To learn more about accelerating our SDKs, please contact us.