In-depth look at a Scalable & Robust Data Stream Processing pipeline using Golang

magicpin Engineering
5 min read · Jan 22, 2018

Over the last 6 months we have seen tremendous growth at magicpin. We have grown 6x in both users and merchants, and the average session time is up more than 50%. All this growth has resulted in an exponential increase in the data generated: transactions, reviews, likes, selfies, check-ins, voucher redemptions and more. Right from the outset we realised how important this data is in ensuring an engaging user experience. From surfacing the best restaurants based on a user's interests to personalising cashback deals, this is the data that powers these features and more. In this post we will focus on a critical component in building them: a robust and scalable system to ingest and process diverse data streams and serve the recommendations generated from them.

Like a lot of consumer tech companies, we have a number of data streams with different origins and different magnitudes of scale. We wanted to build a system that could handle all of them, from the clickstream data generated by the app and site, to transactions performed by users and the vouchers and deals booked by those users. We also wanted to insulate upstream systems from this stream ingestion system and let them operate in a 'fire and forget' mode. Given our usage and growth, we needed the system to handle thousands of requests per second in parallel at best (concurrently at worst) and support async calls.

An architecture that really impressed us was the Keystone Pipeline at Netflix. Since stream processing was central to this, we quickly settled on Kafka as the central queue. Further, as in the blog above, we preferred upstream systems to push data over HTTP rather than write directly to Kafka, since direct writes would mean a lot of changes to the upstream services, especially around error handling, retries and making calls async. Additionally, we introduced a web socket endpoint so the app can send real-time data to the system without the overhead of HTTP. On the data-storage side, we decided to use Cassandra and Elastic. On top of these data stores we built our personalisation and recommendation engine, which generates recommendations using Spark's ML library. We will go into detail about our recommendation engine in a separate post.
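To make the 'fire and forget' contract concrete, here is a minimal sketch of how an upstream service might hand an event to the ingestion endpoint over HTTP. The endpoint URL, the Event struct and the emit helper are illustrative assumptions rather than our actual API:

```go
package main

import (
	"bytes"
	"encoding/json"
	"log"
	"net/http"
	"time"
)

// Event is a hypothetical payload an upstream service might emit.
type Event struct {
	UserID    string    `json:"user_id"`
	Type      string    `json:"type"` // e.g. "transaction", "review", "check-in"
	Timestamp time.Time `json:"timestamp"`
}

var client = &http.Client{Timeout: 2 * time.Second}

// emit posts the event to the ingestion service in a goroutine and deliberately
// ignores the response body: the caller only logs failures and moves on.
func emit(e Event) {
	go func() {
		body, err := json.Marshal(e)
		if err != nil {
			log.Printf("marshal event: %v", err)
			return
		}
		resp, err := client.Post("http://ingest.internal/events", "application/json", bytes.NewReader(body))
		if err != nil {
			log.Printf("emit event: %v", err)
			return
		}
		resp.Body.Close()
	}()
}

func main() {
	emit(Event{UserID: "u42", Type: "transaction", Timestamp: time.Now()})
	time.Sleep(time.Second) // give the goroutine time to fire in this toy example
}
```

The upstream caller only needs to know the endpoint; retries, batching and the eventual Kafka write all live inside the ingestion service.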

At magicpin we have a lot of experience in scaling Java-based MVC apps. At the same time, we also know how finicky it can be to tune the JVM, and as a company we make every effort to stay on top of new and emerging technologies and continuously ask ourselves: can we be doing this better? As we debated the implementation of this system, we realised that a language designed with parallelism and concurrency in mind from the ground up might be a better fit. Researching further led us to the excellent talk 'Concurrency Is Not Parallelism' and got us excited to explore Go. The fundamentals of Golang, and how constructs like channels make concurrent programming more elegant, won us over, and after reading the blog post Handling-1-million-requests-per-minute-with-golang we were convinced that Golang was the way to go!

A couple of weeks later, after an initial crash course in Golang, we came away even more impressed with how elegant and effective (both in lines of code and in performance) Golang was. We started modeling our architecture, drew a lot of inspiration from the excellent Malwarebytes Golang article, and came up with a worker pool architecture that streams data from upstream sources through Kafka into ES and Cassandra. We wrote Spark jobs on top of this data and wrote the resulting recommendations back to Elastic/Cassandra.

The ingestion service is built around an HTTP worker pool and a delegator/router that passes incoming jobs to idle workers. Each worker runs in its own goroutine, pushing the data onto a Kafka topic, and once a job is completed the worker returns to the pool. Simple and effective. Coupled with an efficient NoSQL store (Cassandra worked well for our use case), we are able to process 5k req/s on each 4 vCPU machine. In addition to ingesting data, different consumers are colocated with our HTTP servers. These consumers perform a number of operations, like cleaning, massaging and aggregating the data, before pushing it into the recommendation pipeline and/or other data sinks (e.g. ES).
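For a sense of what that looks like in code, here is a heavily simplified sketch of the dispatcher/worker-pool shape, closely following the Malwarebytes article. The Publisher interface stands in for whichever Kafka client is plugged in, and names like ingestHandler and the "events" topic are illustrative rather than our production code:

```go
package ingest

import (
	"io"
	"log"
	"net/http"
)

// Job carries one raw payload destined for a Kafka topic.
type Job struct {
	Topic   string
	Payload []byte
}

// Publisher abstracts whichever Kafka client is used (sarama, confluent-kafka-go, ...).
type Publisher interface {
	Publish(topic string, payload []byte) error
}

// Worker owns a job channel and, whenever it is idle, re-registers that
// channel with the shared pool so the dispatcher can hand it more work.
type Worker struct {
	pool chan chan Job
	jobs chan Job
	pub  Publisher
}

func NewWorker(pool chan chan Job, pub Publisher) *Worker {
	return &Worker{pool: pool, jobs: make(chan Job), pub: pub}
}

func (w *Worker) Start() {
	go func() {
		for {
			w.pool <- w.jobs // announce availability
			job := <-w.jobs
			if err := w.pub.Publish(job.Topic, job.Payload); err != nil {
				log.Printf("publish to %s failed: %v", job.Topic, err)
			}
		}
	}()
}

// Dispatcher routes each incoming job to whichever worker is idle.
type Dispatcher struct {
	pool  chan chan Job
	queue chan Job
}

func NewDispatcher(workers int, queue chan Job, pub Publisher) *Dispatcher {
	d := &Dispatcher{pool: make(chan chan Job, workers), queue: queue}
	for i := 0; i < workers; i++ {
		NewWorker(d.pool, pub).Start()
	}
	go d.dispatch()
	return d
}

func (d *Dispatcher) dispatch() {
	for job := range d.queue {
		idle := <-d.pool // blocks until a worker frees up
		idle <- job
	}
}

// ingestHandler enqueues the raw request body and acknowledges immediately,
// keeping the HTTP path decoupled from the Kafka write.
func ingestHandler(queue chan Job) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "bad request", http.StatusBadRequest)
			return
		}
		queue <- Job{Topic: "events", Payload: body}
		w.WriteHeader(http.StatusAccepted)
	}
}
```

Because the dispatcher only pulls from the queue when a worker is free, a burst that exhausts the pool applies back-pressure all the way up to the HTTP handler.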

The REST API that exposes the output of the recommendation/personalisation engine uses Golang's Gorilla mux package for routing and dispatch. The net/http package, which Gorilla (and practically every other Go web framework) builds on, ensures that each HTTP request is served in its own goroutine. On the API side we could get a single 1 vCPU instance to serve 600 req/s.
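As an illustration, a stripped-down route with Gorilla mux might look like the sketch below; the path and response shape are made up for the example and are not our actual API contract:

```go
package main

import (
	"encoding/json"
	"log"
	"net/http"

	"github.com/gorilla/mux"
)

// recommendationsHandler is a placeholder: in the real service it would read
// precomputed recommendations for the user from Cassandra/Elastic.
func recommendationsHandler(w http.ResponseWriter, r *http.Request) {
	userID := mux.Vars(r)["userID"]
	// net/http already runs this handler in its own goroutine per request,
	// so no extra concurrency plumbing is needed here.
	json.NewEncoder(w).Encode(map[string]interface{}{
		"user":      userID,
		"merchants": []string{}, // hypothetical payload shape
	})
}

func main() {
	r := mux.NewRouter()
	r.HandleFunc("/recommendations/{userID}", recommendationsHandler).Methods("GET")
	log.Fatal(http.ListenAndServe(":8080", r))
}
```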

Server — Requests processed per second
API — Requests served per second

All of our Golang apps are built using the twelve-factor app methodology, which makes them very easy to containerize with Docker. To further optimise our hardware cost, we deploy these containers on Kubernetes and leverage auto-scaling to spin up containers on demand as traffic spikes.
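As a simplified illustration of the twelve-factor approach, configuration in such a service comes entirely from the environment rather than baked-in files; the variable names below are made up for the example:

```go
package main

import (
	"log"
	"os"
)

// Config is read entirely from the environment, twelve-factor style, so the
// same container image can run unchanged across environments.
type Config struct {
	KafkaBrokers  string
	CassandraHost string
	HTTPPort      string
}

func loadConfig() Config {
	get := func(key, def string) string {
		if v := os.Getenv(key); v != "" {
			return v
		}
		return def
	}
	return Config{
		KafkaBrokers:  get("KAFKA_BROKERS", "localhost:9092"),
		CassandraHost: get("CASSANDRA_HOST", "localhost"),
		HTTPPort:      get("HTTP_PORT", "8080"),
	}
}

func main() {
	cfg := loadConfig()
	log.Printf("starting with brokers=%s cassandra=%s port=%s",
		cfg.KafkaBrokers, cfg.CassandraHost, cfg.HTTPPort)
}
```

Because all environment-specific state lives in variables like these, the same image can be promoted across environments and scaled out by Kubernetes without rebuilds.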

Our transition to and experiments with Golang have been very fruitful. We have started to look at whether we can move other micro-services and middleware in our stack to Golang. At the same time, we realise that Golang is not a silver bullet. We carefully look at our services and keep asking one fundamental question: can this be better? It's a constantly evolving process and one that takes careful research and planning. As we continue to grow, we are excited to figure out how to scale these services to 10x the current level. Ping us if you have questions about what we did, or if you would like to tell us how we could have done something better. If you want to come and build with us, even better!
