We have been working on an application with complex matching logic that runs against 30M+ Elasticsearch records to find matches. After trying different optimization techniques, we had pretty solid code and could no longer bring the execution time down. Now it was time to see whether we could run multiple such processes in a scalable environment.
The current deployment architecture was simple: a 4-CPU VM for application execution and another 4-CPU VM hosting the multi-node Elasticsearch cluster. Of course, all of this runs in Docker containers. Celery, the job runner, takes care of offline task execution. With 4 CPUs on the VM host, we had configured it to use no more than 2 CPUs, which limited concurrent task execution to two at a time.
We toyed for a bit with implementing each process execution as a serverless Azure Function, but the variable execution duration of each job exposed us to the possibility of a single instance running beyond the allowable processing time. So that idea was a no-go.
Next to try was an Azure scale set. A scale set allows us to add additional VM instances based on a pre-crafted VM image. Here’s the high-level deployment architecture.
By default, an Azure scale set allows a scale-out trigger at a configurable high CPU utilization for the cluster. When utilization crosses that high-water mark, a preset number of new VMs is added to the cluster. Such an implementation would not suit us for a couple of reasons:
1. The jobs we have are not constantly CPU bound. The CPU load is spread over a period of time. Such variable load does not measure well against a scale-out trigger that expects “above X percent for Y minutes”. Essentially, the scale-out event would not reliably trigger while there are pending jobs in the queue.
2. The other issue is that we could specify scale-in (removal of instances) to use FIFO or LIFO when ramping down, but there is no provision to specify that only instances doing no work may be taken out, no matter when they were added.
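To make the first point concrete, here is a small simulation of a threshold rule. It is a simplified model, not Azure's actual autoscale engine: the function name `cpu_rule_fires`, the 75% threshold, the 5-minute window, and the CPU readings are all made up for illustration.

```python
from typing import Sequence


def cpu_rule_fires(cpu_samples: Sequence[float], threshold_pct: float, window: int) -> bool:
    """Return True if the average of any `window` consecutive samples exceeds the threshold."""
    for start in range(len(cpu_samples) - window + 1):
        chunk = cpu_samples[start:start + window]
        if sum(chunk) / window > threshold_pct:
            return True
    return False


# Hypothetical per-minute CPU readings (percent) while jobs are still queued.
bursty = [90, 20, 15, 85, 10, 25, 95, 20, 15, 30]
# Hypothetical readings for a constantly CPU-bound workload.
steady = [80] * 10

print(cpu_rule_fires(bursty, threshold_pct=75, window=5))  # False
print(cpu_rule_fires(steady, threshold_pct=75, window=5))  # True
```

The bursty series peaks at 95% but never averages above 47% over any 5-minute window, so a rule like this stays silent even though work is waiting.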
Hence I decided to take matters into our own hands and code custom functions for scale-out and scale-in. Here’s how it works…
1. Celery uses a Redis database to keep the queue records, so the right place to look for pending work is this queue.
2. When an orchestration process monitoring the queues finds that jobs are pending, it triggers the scale-out event for the cluster. Sample code below.
3. Once the VM comes up, it dequeues a pending job and proceeds to run it. Once that job is complete, it checks for more pending jobs and processes them until no jobs remain in the Celery queue.
4. Each scale-set VM instance inspects itself every 5 minutes. If it finds that no job has been pulled in, it shuts itself down. Sample code below:
5. The same orchestration process inspects the scale set for any shut-down VMs and, if it finds any, takes them out of commission. This allows us to remove VMs that are no longer needed. Here’s the sample code:
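As a rough illustration of the orchestration pass in steps 2 and 5 (a minimal sketch, not the original implementation), the decision logic can be kept as a pure function. The names `Instance` and `plan_scaling`, the `"running"`/`"stopped"` labels, and the `max_instances` cap are assumptions made for this example. The default of two jobs per VM mirrors the two concurrent tasks mentioned earlier. The calls that actually resize the scale set are left out.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Instance:
    instance_id: str
    power_state: str  # assumed labels for this sketch: "running" or "stopped"


def plan_scaling(pending_jobs: int, instances: List[Instance],
                 jobs_per_vm: int = 2, max_instances: int = 10) -> Tuple[int, List[str]]:
    """Return (number of VMs to add, ids of shut-down VMs to remove)."""
    # Step 5: only VMs that already shut themselves down are removal candidates.
    to_remove = [i.instance_id for i in instances if i.power_state == "stopped"]
    running = len(instances) - len(to_remove)

    # Step 2: size the scale-out from the backlog, not from CPU utilization.
    # Pending jobs have not been picked up by any worker yet.
    needed = -(-pending_jobs // jobs_per_vm)  # ceiling division
    to_add = max(0, min(needed, max_instances - running))
    return to_add, to_remove


# With a Redis broker, the backlog could be read with redis-py, assuming
# Celery's default queue name "celery":
#   pending = redis.Redis(host="...").llen("celery")
fleet = [Instance("0", "running"), Instance("1", "stopped")]
print(plan_scaling(pending_jobs=5, instances=fleet))  # (3, ['1'])
```

Keeping the decision separate from the cloud calls makes it easy to test the policy and to tune the cap without touching the scale set.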
Essentially, scale-out (addition of VMs) is now based on the number of pending jobs in the Redis queue. Scale-in (removal of VMs) happens only when the job queues are empty, and it happens only to VMs that are not running any job.
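The self-inspection on each VM could look something like the following. This is a hedged sketch rather than the original code: `IdleWatchdog` and its methods are hypothetical names, and the way the worker reports job start and finish, and the command that actually shuts the VM down, are left to the deployment.

```python
import time
from typing import Callable

IDLE_CHECK_SECONDS = 5 * 60  # the 5-minute self-inspection interval


class IdleWatchdog:
    """Tracks job activity on one VM and decides when it is safe to shut down."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._active_jobs = 0
        self._last_activity = clock()

    def job_started(self) -> None:
        self._active_jobs += 1
        self._last_activity = self._clock()

    def job_finished(self) -> None:
        self._active_jobs = max(0, self._active_jobs - 1)
        self._last_activity = self._clock()

    def should_shut_down(self) -> bool:
        # Never shut down while a job is running, so no job gets aborted.
        if self._active_jobs > 0:
            return False
        # Idle for a full interval means no job was pulled in since the last check.
        return self._clock() - self._last_activity >= IDLE_CHECK_SECONDS
```

A periodic task on the VM would call `should_shut_down()` every 5 minutes and power the machine off when it returns `True`. The clock is injectable only so the logic can be tested without waiting.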
Such an implementation also allowed us to prevent any aborted jobs, providing precise control over VM utilization and hence the cost. One word of caution: adding scale-set VMs does have a ripple effect on the other components they depend on. We had to experiment and tune how many instances we could scale out to, given the capacity of the ELK stack and MySQL.