Deploying Serge on GCP: A Step-by-Step Guide
Serge is a chat interface that allows you to run Alpaca models without the need for any API keys. It is entirely self-hosted, fits in 4GB of RAM, and can run on the CPU. In this guide, we’ll walk you through the steps required to deploy Serge on Google Cloud Platform (GCP).
Before we get started, make sure you have a GCP account set up and are familiar with the basic concepts of GCP, including creating and managing virtual machines, and setting up firewall rules.
Step 1: Create a Virtual Machine
The first step in deploying Serge on GCP is to create a virtual machine. We recommend a machine type with at least 4GB of RAM; no GPU is needed, since Serge runs on the CPU. You can choose any operating system you like, but we recommend a Linux-based distribution such as Ubuntu or Debian.
To create a virtual machine in GCP, follow these steps:
- Go to console.cloud.google.com and create a new project (e.g. serge)
- Navigate to the Compute Engine section and click on the Create Instance button
- Choose the machine type you want to use, and choose the region and the zone for your instance. In this example, we are picking the us-central1 region and the us-central1-c zone. Also, we are gonna use an e2-standard-8 machine (8 vCPUs, 32GB of RAM) so that we can run models up to 30 billion parameters.

- Select the operating system you want to run and configure the disk space. In this example, we are gonna use Ubuntu 22.04 LTS for the OS and a 250GB SSD as our boot disk. Click on the SELECT button.

- Configure the firewall to allow HTTP/HTTPS traffic.

- Leave everything else as it is and click on the CREATE button to create your virtual machine
- Navigate to VPC networks and click on default
- From the sidebar menu click on Firewall and then click on the Create a firewall rule button
- Choose a name (e.g. serge-firewall) and in the Target tags field fill in the http-server tag. In Source IPv4 ranges, allow all addresses by filling in 0.0.0.0/0. In the Protocols and ports section click on TCP and open port 8008 (we will need it later). Click on the SAVE button.
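If you prefer the command line, the console steps above can be approximated with the gcloud CLI. The following is a minimal sketch, not the method used in this guide. It assumes gcloud is installed and authenticated with your project selected, that the instance is named serge (a hypothetical name), that the SSD boot disk corresponds to the pd-ssd disk type, and that the HTTP/HTTPS checkboxes correspond to the http-server and https-server network tags. By default the script only prints the commands; set DRY_RUN=0 to run them.

```shell
#!/bin/sh
# Sketch: create the VM and the firewall rule with gcloud instead of the console.
# Assumptions (not from the article): instance name, pd-ssd disk type,
# and an already authenticated gcloud with the target project selected.
set -eu

INSTANCE="${INSTANCE:-serge}"                 # hypothetical instance name
ZONE="${ZONE:-us-central1-c}"
MACHINE_TYPE="${MACHINE_TYPE:-e2-standard-8}"
FIREWALL_RULE="${FIREWALL_RULE:-serge-firewall}"
PORT="${PORT:-8008}"

# Print the command when DRY_RUN=1 (the default); execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Ubuntu 22.04 LTS, 250GB SSD boot disk, HTTP/HTTPS network tags.
create_vm() {
  run gcloud compute instances create "$INSTANCE" \
    --zone="$ZONE" \
    --machine-type="$MACHINE_TYPE" \
    --image-family=ubuntu-2204-lts \
    --image-project=ubuntu-os-cloud \
    --boot-disk-size=250GB \
    --boot-disk-type=pd-ssd \
    --tags=http-server,https-server
}

# Open the Serge port to all IPv4 addresses for instances tagged http-server.
create_firewall_rule() {
  run gcloud compute firewall-rules create "$FIREWALL_RULE" \
    --network=default \
    --direction=INGRESS \
    --allow="tcp:$PORT" \
    --source-ranges=0.0.0.0/0 \
    --target-tags=http-server
}

create_vm
create_firewall_rule
```

Keep in mind that 0.0.0.0/0 exposes port 8008 to the whole internet, exactly as in the console steps; passing a narrower range in --source-ranges is a reasonable alternative if only you need access.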

Step 2: Install Dependencies
To get started with serge, follow these steps:
- Navigate to VM Instances and click on the SSH button
- Clone the serge repository and change into it:
    git clone https://github.com/nsarrazin/serge.git
    cd serge
- Start the Docker container:
    sudo docker-compose up -d
- Download the required tokenizers:
    sudo docker-compose exec serge python3 /usr/src/app/api/utils/download.py tokenizer 7B
    sudo docker-compose exec serge python3 /usr/src/app/api/utils/download.py tokenizer 13B
    sudo docker-compose exec serge python3 /usr/src/app/api/utils/download.py tokenizer 30B
  Please note that the models occupy the following storage space: 7B requires 4.21G, 13B requires 8.14G, and 30B requires 20.3G (about 32.7G for all three).
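The whole of Step 2 can be collected into one script. This is a rough sketch under a few assumptions that are not in the steps above: it assumes the Ubuntu 22.04 image does not ship with Docker, so it installs the docker.io and docker-compose packages from the Ubuntu repositories first, and it introduces a MODELS variable (hypothetical) so you can skip sizes you do not need. As a safety net it only prints the commands unless you set DRY_RUN=0.

```shell
#!/bin/sh
# Sketch: install Docker, start serge, and download the selected model sizes.
# Assumptions (not from the article): Docker is not preinstalled and is taken
# from the Ubuntu packages docker.io and docker-compose.
set -eu

MODELS="${MODELS:-7B 13B 30B}"   # hypothetical variable: sizes to download

# Print the command when DRY_RUN=1 (the default); execute it when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_serge() {
  run sudo apt-get update
  run sudo apt-get install -y docker.io docker-compose
  run git clone https://github.com/nsarrazin/serge.git
  # docker-compose has to be run from inside the cloned repository.
  [ "${DRY_RUN:-1}" = "1" ] || cd serge
  run sudo docker-compose up -d
  # $MODELS is intentionally unquoted so it splits into one word per size.
  for size in $MODELS; do
    run sudo docker-compose exec serge python3 \
      /usr/src/app/api/utils/download.py tokenizer "$size"
  done
}

setup_serge
```

For example, running the script with MODELS="7B" DRY_RUN=0 would download only the smallest model, which keeps disk usage close to the 4.21G listed above.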
Step 3: Access the API
Once you’ve installed the dependencies and started the Docker container, you can access the serge API by following these steps:
- Navigate to VM Instances and copy the External IP of your machine
- Open your web browser and navigate to http://external-ip:8008/ and voilà!

- You should see the serge homepage, which means that the API is up and running!
- To use the API, make requests to http://external-ip:8008/api/
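You can also check from a terminal that the homepage is reachable. The sketch below is only an illustration: the instance name serge and the helper names are hypothetical, the IP address in the usage note is a documentation placeholder, and it assumes the homepage answers with HTTP 200 once the container is up. It does not call any specific endpoint under /api/, since those are not covered here.

```shell
#!/bin/sh
# Sketch: look up the VM's external IP and check that serge answers on port 8008.
# Assumptions (not from the article): instance name and an HTTP 200 homepage.
set -eu

INSTANCE="${INSTANCE:-serge}"    # hypothetical instance name
ZONE="${ZONE:-us-central1-c}"
PORT="${PORT:-8008}"

# Build the homepage URL for a given IP address.
serge_url() {
  echo "http://$1:${PORT}/"
}

# Ask gcloud for the external (NAT) IP of the instance.
lookup_external_ip() {
  gcloud compute instances describe "$INSTANCE" --zone="$ZONE" \
    --format='get(networkInterfaces[0].accessConfigs[0].natIP)'
}

# Request the homepage and report whether it returned HTTP 200.
check_serge() {
  url="$(serge_url "$1")"
  # -s: silent, -o /dev/null: discard the body, -w: print only the status code.
  status="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url" || true)"
  if [ "$status" = "200" ]; then
    echo "Serge is reachable at $url"
  else
    echo "Serge is not reachable at $url (HTTP status: $status)" >&2
    return 1
  fi
}

# Only run the check when an IP address is passed on the command line.
if [ "$#" -gt 0 ]; then
  check_serge "$1"
fi
```

Saved as check_serge.sh (a hypothetical file name), it could be run as sh check_serge.sh 203.0.113.10, replacing the placeholder with your machine's External IP. A failed check usually points at the firewall rule for port 8008 or at a container that has not started yet.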
That’s it! You’re now ready to use serge on GCP. Happy coding!
