How to quickly deploy a 500-node multi-region proxy network
Once upon a time, while dealing with rate limits on a distributed network, everyone in the industry was competing for the same limited resources by deploying proxy nodes that try to blend in with natural traffic.
TL;DR of the final solution:
- GCP Compute Instances with the container option, plus Managed Instance Groups and a load balancer in front: great for a POC and a fast go-to-market.
- Vultr using Flatcar + cloud init + REST API via CLI for cost effectiveness.
While working on this, the first instinct was of course to do it with Ansible, the same way the rest of the infra is managed, but it quickly became evident that this would be too slow and a bottleneck.
Ansible is fine for our bare-metal servers, which are deployed once with many services working together and kept for months or years, but it is overkill for this simple use case where we just need to deploy a “swarm” of a single stateless app that can easily scale out.
Naturally, I started looking at packaging the application as a container and considering where and how to deploy it.
Initially some options were considered like:
- AWS Lambda/GCP Cloud Functions/etc.
- GCP App Engine/DigitalOcean Apps
- Fly.io
These were ruled out because too many instances of the application would be running behind a NAT and sharing the same IP address.
So we need virtual machines, each with its own unique IP address.
Enter Google Cloud Compute Instances…
One of the things that stands out from similar services is that they provide an easy way to deploy a container on each VM without too much fuss.
Simply select a container image instead of an OS base image like Ubuntu: Google will automatically use a Container-Optimized OS image, set up a service to launch the container, and give you a fully hands-off experience.
This can be used to quickly validate the deployment before moving to automation and scaling out.
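For reference, the same single-VM validation can also be done from the CLI; a minimal sketch, assuming the container image shown later in this post (instance name, zone, and machine type are placeholders):

gcloud compute instances create-with-container proxy-test \
  --zone=us-east4-a \
  --machine-type=e2-small \
  --container-image=quay.io/<repo>/<project>:latest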
Now this is where GCP shines. Instead of a single instance, head to Instance Templates and specify a template that can be deployed widely.
Then head to Instance Groups, create a “Managed Instance Group (stateless)” using the template you created above, remove the autoscaling option, and simply select the number of proxies that you need.
Also make sure to define a health check, like TCP on port 8000 or HTTP to :8000/health, whatever suits your application.
So now you have a region with 100+ proxies ready to go, with automatic self-healing, that can easily be scaled up or down at the press of a button.
Repeat the steps above for every region you need a presence in.
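If you would rather script these steps than click through the console, the equivalent gcloud commands look roughly like this; a sketch with illustrative names, region, and sizes (the health check matches the :8000 example above):

# instance template that runs the proxy container
gcloud compute instance-templates create-with-container proxy-tpl \
  --machine-type=e2-small \
  --container-image=quay.io/<repo>/<project>:latest

# health check used by the MIG for autohealing
gcloud compute health-checks create tcp proxy-hc --port=8000

# regional managed instance group with a fixed size (no autoscaling)
gcloud compute instance-groups managed create proxy-mig-nyc \
  --region=us-east4 \
  --template=proxy-tpl \
  --size=100 \
  --health-check=proxy-hc \
  --initial-delay=120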
Finally, the one thing missing is the load balancer, so your proxy instances can be reached remotely. Create one with the params below:
- Application Load Balancer
- Public Facing
- Global Load Balancer
Once done, open the newly created load balancer, head to Backend Services, and create a new one targeting each of the Instance Groups you created before.
Then head to the routing rules to specify how requests are mapped to each of the instance groups.
In my use case, the client software was multiplexing requests to different endpoints, so I have endpoints like nyc.example.com that point to the Instance Group in NYC. Do this for every region.
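For completeness, wiring one region's Instance Group into the LB from the CLI looks roughly like this; a sketch that reuses the illustrative names above and assumes the LB's frontend and URL map already exist:

# name the app port on the MIG so the backend service can reference it
gcloud compute instance-groups set-named-ports proxy-mig-nyc \
  --region=us-east4 --named-ports=http:8000

# backend service for the NYC instance group
gcloud compute backend-services create proxy-backend-nyc \
  --global --protocol=HTTP --port-name=http --health-checks=proxy-hc
gcloud compute backend-services add-backend proxy-backend-nyc \
  --global --instance-group=proxy-mig-nyc --instance-group-region=us-east4

# route nyc.example.com to the NYC backend
gcloud compute url-maps add-path-matcher proxy-lb-url-map \
  --path-matcher-name=nyc \
  --default-service=proxy-backend-nyc \
  --new-hosts=nyc.example.com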
Once the LB is deployed and updated, you’re finally ready to serve 1000s of RPS from as many locations as you require with little to no effort.
After all this was done and we were serving traffic, I looked into automating some of the steps to make updating all the regions easier. One GCP feature stood out: the “Equivalent Command Line” shown in the GUI when setting up several of the components above. Unfortunately it was broken (outdated config?) for at least some of the params we were using.
Then the real problem surfaced: we're on a public cloud, so costs quickly grow out of proportion, into the $1000s a day, and the provider will happily send an invoice for >$100k at the end of the month without thinking twice.
So the actual next step was redeploying this solution outside GCP. Other providers don't have as handy a GUI, so it took an extra day to build something similar in a more generic way that could work with different providers.
Some of the options being considered are the popular VPS providers like:
- DigitalOcean
- Linode
- Vultr
Vultr won in this use case due to the availability of locations: they currently have 32 locations, versus only 12 from DigitalOcean and even fewer from Linode.
Simply moving to one of these means that each instance includes 1 to 4 TB of bundled bandwidth, and any overage is priced at roughly 1/10 of any public cloud. A clear win.
Now that we've chosen the provider, the goal is to reproduce the same setup we had working on GCP. I ended up with:
- Flatcar Container Linux as the container-optimized OS
- A Butane YAML config file that's converted to an Ignition config using the Butane tool (command shown further below)
Your Butane config may look something like this:
variant: flatcar
version: 1.0.0
passwd:
  users:
    - name: core
      ssh_authorized_keys:
        - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAIKzxQgNsaYBXX5C7/YYk3ZJZqWn5/qJGP7wTPYX8g1Xa
storage:
  directories:
    - path: /root/.docker
  files:
    # read-only token with access to the container registry
    - path: /root/.docker/config.json
      contents:
        inline: |
          {"auths":{"quay.io":{"auth":"<redacted>"}}}
    - path: /etc/fstab
      contents:
        inline: |
          /swapfile swap swap defaults 0 0
    - path: /etc/app.env
      contents:
        inline: |
          PORT=8000
          RUST_LOG=info
systemd:
  units:
    - name: swap.service
      enabled: true
      contents: |
        [Unit]
        Description=Enable swap file
        After=local-fs.target
        [Service]
        Type=oneshot
        RemainAfterExit=yes
        ExecStart=/usr/bin/dd if=/dev/zero of=/swapfile bs=1M count=4096
        ExecStart=/usr/bin/chmod 0600 /swapfile
        ExecStart=/sbin/mkswap /swapfile
        ExecStart=/sbin/swapon /swapfile
        [Install]
        WantedBy=default.target
    - name: app.service
      enabled: true
      contents: |
        [Unit]
        Description=My App
        After=docker.service
        Requires=docker.service
        [Service]
        ExecStart=/usr/bin/docker run --restart=always --network host --env-file /etc/app.env --name=app quay.io/<repo>/<project>:latest
        ExecStop=/usr/bin/docker stop app
        ExecStopPost=/usr/bin/docker rm app
        [Install]
        WantedBy=multi-user.target
You can then convert it to the final Ignition config file by running:
docker run --rm -i quay.io/coreos/butane:latest --pretty --strict < butane.yaml > ignition.json
You can easily define extra services like Prometheus exporters, Fluentd/Fluent Bit, the Grafana Agent, or something else to ship metrics/logs to your solution of choice.
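For example, a node_exporter sidecar can be added as one more unit in the same Butane file; a sketch using the upstream image with its standard flags, adjust to whatever your monitoring setup expects:

    - name: node-exporter.service
      enabled: true
      contents: |
        [Unit]
        Description=Prometheus node exporter
        After=docker.service
        Requires=docker.service
        [Service]
        ExecStart=/usr/bin/docker run --rm --network host --pid host \
          -v /:/host:ro,rslave --name=node-exporter \
          quay.io/prometheus/node-exporter:latest --path.rootfs=/host
        ExecStop=/usr/bin/docker stop node-exporter
        [Install]
        WantedBy=multi-user.target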
Now for deployments we’re simply going to use the REST API via a regular command line to quickly deploy 100s of VMs per region and add them behind a load balancer.
Unlike GCP we won’t have a Global Load Balancer, but since we’re addressing each region by a different DNS endpoint it really makes no difference.
Create all the instances:
export VULTR_API_KEY=<redacted>
# if you're familiar with airport codes, get all regions by doing
curl "https://api.vultr.com/v2/regions" -X GET -H "Authorization: Bearer ${VULTR_API_KEY}" -s | jq -r .regions[].id
# all the regions we're planning to deploy to
regions=("ams" "atl" "bom" "cdg" "ewr" "fra" "jnb" "lax" "lhr" "mia" "nrt" "ord" "sao" "sea" "sgp" "syd" "tlv" "waw" "yto")
for region in "${regions[@]}"; do
  # deploy 100 instances in every region
  for i in $(seq 1 100); do
    curl "https://api.vultr.com/v2/instances" \
      -X POST \
      -H "Authorization: Bearer ${VULTR_API_KEY}" \
      -H "Content-Type: application/json" \
      --data "{
        \"region\" : \"${region}\",
        \"plan\" : \"vhp-2c-4gb-amd\",
        \"label\" : \"app-${region}-${i}\",
        \"os_id\" : 2077,
        \"user_data\" : \"$(cat ignition.json | base64 -w0)\",
        \"backups\" : \"disabled\",
        \"hostname\": \"app-${region}-${i}\",
        \"tags\": [\"app\"]
      }"
    sleep 1
    echo "done"
  done
done
# get all the instances in the region
curl "https://api.vultr.com/v2/instances?tag=app&region=${region}" -X GET -H "Authorization: Bearer ${VULTR_API_KEY}" -s | jq -r '.instances[].id'
curl "https://api.vultr.com/v2/load-balancers/<lb-id>" \
-X PATCH \
-H "Authorization: Bearer ${VULTR_API_KEY}" \
-H "Content-Type: application/json" \
--data '{
"instances": [
"<instance-id>",
"<instance-id>"
]
}'
In this case, all the LBs were created in advance with a health check defined on each, so we're simply attaching all the relevant VMs to each LB.
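Creating each regional LB up front is also just one API call; a rough sketch, with a TCP forwarding rule and health check on the app port (adjust region, protocols and ports to your application):

curl "https://api.vultr.com/v2/load-balancers" \
  -X POST \
  -H "Authorization: Bearer ${VULTR_API_KEY}" \
  -H "Content-Type: application/json" \
  --data '{
    "region": "ewr",
    "label": "app-ewr-lb",
    "forwarding_rules": [
      { "frontend_protocol": "tcp", "frontend_port": 8000,
        "backend_protocol": "tcp", "backend_port": 8000 }
    ],
    "health_check": { "protocol": "tcp", "port": 8000 }
  }'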
A bit more shell glue would put it all together.
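Something like this, for instance; a sketch that assumes one pre-created LB per region (the IDs in LB_IDS are placeholders) and fewer instances per region than the API's per-page limit:

# map each region to its pre-created load balancer id (placeholders)
declare -A LB_IDS=( ["ewr"]="<lb-id-ewr>" ["ams"]="<lb-id-ams>" )
for region in "${!LB_IDS[@]}"; do
  # collect every tagged instance in the region as a JSON array of ids
  ids=$(curl -s "https://api.vultr.com/v2/instances?tag=app&region=${region}&per_page=500" \
    -H "Authorization: Bearer ${VULTR_API_KEY}" | jq -c '[.instances[].id]')
  # attach them all to that region's LB
  curl -s "https://api.vultr.com/v2/load-balancers/${LB_IDS[$region]}" \
    -X PATCH \
    -H "Authorization: Bearer ${VULTR_API_KEY}" \
    -H "Content-Type: application/json" \
    --data "{\"instances\": ${ids}}"
done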
Every time we need to update the images, we simply reinstall each VM; since all requests are multiplexed across all regions anyway, we don't really need to worry about draining or playing too nice with in-flight requests.
So we just update each VM with the new config, then request a reinstall.
instances=$(curl "https://api.vultr.com/v2/instances?tag=app&region=<region>" -X GET -H "Authorization: Bearer ${VULTR_API_KEY}" -s | jq -r '.instances[].id')
for i in $instances; do
  echo "reconfiguring $i..."
  curl "https://api.vultr.com/v2/instances/$i" \
    -X PATCH \
    -H "Authorization: Bearer ${VULTR_API_KEY}" \
    -H "Content-Type: application/json" \
    --data "{
      \"user_data\" : \"$(base64 -w0 < ignition.json)\"
    }"
  echo "reinstalling $i..."
  curl "https://api.vultr.com/v2/instances/$i/reinstall" \
    -X POST \
    -H "Authorization: Bearer ${VULTR_API_KEY}" \
    -H "Content-Type: application/json" \
    --data '{}'
  sleep 30
done
In the end this solution was only used for a couple of months, and after the initial iterations there wasn't much need for updates, improvements, or better tooling, as this was never intended to be a permanent solution.
Infrastructure-as-code tools are nice, but when you want to move quickly, simply using the provider APIs directly can get you a long way without spending time on tooling.