The Uber Node.js Story

Uber is an ambitious and innovative project, one that aims to make transportation as reliable as any other basic need in life and accessible to everyone, everywhere. A tough task, a challenge! Uber needed a system that would keep running no matter what, and that’s why its engineers chose Node.js.

The Uber Node.js story actually starts at Joyent, the home of Node. Back in 2011, when Uber was just a startup, Curtis Chambers, one of Uber's first employees, and Amos Barreto, its Director of Engineering, went to Joyent for some Node.js help. There they met Tom Croucher, who is now a Site Reliability Engineer at Uber. They wanted to use Node.js, which was still in its early stages, for Uber. They took a huge chance: it was risky to build and base almost an entire business on this new JavaScript backend thing that people kept talking about. At the time, Node.js was used for a few things here and there, nothing very important, and not at such a large scale.

Uber was one of the first big businesses to adopt Node at such a level, right from the start. Netflix and PayPal, both big companies, also took the Node road, but they were already established when they made the change to Node.js from Java. Right now Uber is running an older version of Node, 0.10. There are a few reasons why Uber went with Node and why Node works so well for its business:

1 Node.js handles asynchronous I/O requests with a non-blocking, single-threaded event loop. It is particularly well-suited to distributed systems that make a lot of network requests.

“Node.js is particularly well-suited to writing systems that have all their state in memory”  Kris Kowal, Software Engineer at Uber.
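To make the first point concrete, here is a minimal TypeScript sketch of non-blocking I/O on a single thread. The `fakeRequest` helper, its names, and its delays are hypothetical stand-ins for real network calls, not Uber code. The idea is only that all three requests are started together, so they complete in order of latency rather than in the order they were issued.

```typescript
// Hypothetical stand-in for a network call: resolves with `name` after `ms` milliseconds.
function fakeRequest(name: string, ms: number, done: string[]): Promise<string> {
  return new Promise<string>((resolve) => {
    setTimeout(() => {
      done.push(name); // record the order in which requests complete
      resolve(name);
    }, ms);
  });
}

// Start every request immediately; the event loop interleaves them on one thread.
// Promise.all keeps the results in the order the requests were issued.
async function fetchAll(done: string[]): Promise<string[]> {
  return Promise.all([
    fakeRequest("rider", 50, done),
    fakeRequest("driver", 5, done),
    fakeRequest("map", 25, done),
  ]);
}
```

No request waits for another to finish, so the total time is roughly that of the slowest call, not the sum of all three.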

2 Node.js, and JavaScript in general, is excellent for quick iteration; programs can be inspected and errors can be addressed on the fly without requiring a restart, so developers can publish and deploy new code constantly.

“One of the things that makes Node.js uniquely suited to running in production is that you can inspect and change a program without restarting it. So very few other languages offer that capability. Not a lot of people seem to know that ability exists, but indeed you can inspect and even change your program while it’s running without restarting it.”

Kris Kowal, Software Engineer at Uber.

3 The active open source community continuously optimizes the technology; it gets better, all the time, practically on its own.

“By building on Node.js’s actively-developed, open-source system, we get the benefit of lots of people making the software better”

Kris Kowal, Software Engineer at Uber.

The amazing thing is that even though they’re such a huge company, they invest in the community and cherish it. At the last Node Interactive, they made the Node.js systems that basically make Uber work available to the community, which was great. Tom Croucher's presentation is worth checking out.

Besides this, they also created three pieces of software to keep their matching system running all the time at the massive scale required: Ringpop, TChannel, and Hyperbahn.

We’re going to talk a bit more about Ringpop and the other Uber-made Node.js products.

Ringpop is an open-source Node.js library that brings application-layer sharding to many of their dispatching platform services. It has three core features: a membership protocol, a consistent hash ring, and request forwarding. Here are a few more details about it:

  • Ringpop is an AP system that trades consistency for availability, something very important for Uber.

  • Ringpop is an embeddable module that’s included in each Node process.

  • Node instances gossip around a membership set. Once all the nodes agree who each other are, they can make lookup and forwarding decisions independently and efficiently.

  • Ringpop is very scalable: you can add more processes and get more things done.

  • The gossip protocol is based on SWIM. A few changes have been made to improve convergence time.

  • A list of members that are up is gossiped around, and this keeps scaling as more nodes are added. The ‘S’ in SWIM stands for scalable, and it really does work: it has scaled to thousands of nodes so far.

  • SWIM combines health checks with membership changes as part of the same protocol.

  • In a Ringpop system there are many Node processes, each containing a Ringpop module. They gossip around the current membership.

  • Ringpop is built on Uber’s own RPC mechanism called TChannel.

  • TChannel is a bidirectional request/response protocol that was inspired by Twitter’s Finagle.

  • Uber is getting out of the HTTP and JSON business. Everything is moving to Thrift over TChannel.

  • Ringpop does all its gossip over TChannel-based persistent connections. These same persistent connections are used to fan out or forward application traffic. TChannel is also used to talk between services.
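Two of the features listed above, the consistent hash ring and request forwarding, can be illustrated with a short TypeScript sketch. This is not Ringpop's API: the `HashRing` class, the `route` helper, the FNV-1a hash, the replica count of 50, and the member addresses are all assumptions chosen for illustration. It assumes membership has already been agreed on through gossip.

```typescript
// FNV-1a 32-bit hash; a simple stand-in, not the hash function Ringpop uses.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

class HashRing {
  // Ring points sorted by hash; each member owns several points (replicas).
  private points: { hash: number; member: string }[] = [];
  private replicas: number;

  constructor(members: string[], replicas: number = 50) {
    this.replicas = replicas;
    for (const m of members) this.add(m);
  }

  add(member: string): void {
    for (let r = 0; r < this.replicas; r++) {
      this.points.push({ hash: fnv1a(`${member}#${r}`), member });
    }
    this.points.sort((a, b) => a.hash - b.hash);
  }

  remove(member: string): void {
    this.points = this.points.filter((p) => p.member !== member);
  }

  // The owner is the first point clockwise from the key's hash, wrapping around.
  lookup(key: string): string {
    if (this.points.length === 0) throw new Error("ring is empty");
    const h = fnv1a(key);
    let lo = 0;
    let hi = this.points.length;
    while (lo < hi) {
      const mid = (lo + hi) >>> 1;
      if (this.points[mid].hash < h) lo = mid + 1;
      else hi = mid;
    }
    return this.points[lo % this.points.length].member;
  }
}

// Handle the key locally if this process owns it, otherwise forward to the owner.
function route(ring: HashRing, self: string, key: string): string {
  const owner = ring.lookup(key);
  return owner === self ? `handled by ${self}` : `forwarded to ${owner}`;
}
```

Because every process holds the same ring, each one can make the same lookup and forwarding decision on its own. When a member is removed, only the keys it owned move to other members; every other key keeps its owner.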


TChannel is a network multiplexing and framing protocol for RPC. It uses a request/response model with out-of-order responses, where slow requests at the head of the line will not block subsequent faster requests. It also provides a high-performance forwarding path for existing requests.
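The out-of-order behaviour described above comes down to tagging each request with an id and matching responses by that id. The TypeScript sketch below shows the idea only; the `Multiplexer` class and its frame shape are hypothetical and do not reflect TChannel's actual wire format or API.

```typescript
type Callback = (body: string) => void;
type Frame = { id: number; body: string };

// Hypothetical multiplexer: many in-flight requests share one connection.
class Multiplexer {
  private nextId = 1;
  private pending = new Map<number, Callback>();

  // Register a request and return the frame that would go on the wire.
  send(body: string, onResponse: Callback): Frame {
    const id = this.nextId++;
    this.pending.set(id, onResponse);
    return { id, body };
  }

  // Match an incoming response frame to its request by id, in any order.
  receive(frame: Frame): void {
    const cb = this.pending.get(frame.id);
    if (!cb) return; // unknown or already-answered id; ignored in this sketch
    this.pending.delete(frame.id);
    cb(frame.body);
  }

  inFlight(): number {
    return this.pending.size;
  }
}

// Usage: the slow request is sent first, but the fast response is delivered first.
const mux = new Multiplexer();
const log: string[] = [];
const slow = mux.send("slow query", (b) => { log.push(b); });
const fast = mux.send("fast query", (b) => { log.push(b); });
mux.receive({ id: fast.id, body: "fast result" });
mux.receive({ id: slow.id, body: "slow result" });
```

Since nothing ties a response to its position in the queue, a slow request at the head of the line never holds up the faster ones behind it.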

Hyperbahn is a service-to-service discovery and routing system. It allows Uber to adapt in real time to ensure user requests get where they need to go to be resolved, no matter what’s happening in the system. It’s elastic, massively distributed, and fault tolerant.


Uber means transportation that’s always available. This is their core value, and as such they needed a system that scales quickly, especially given how fast their business is growing: it is doubling in size almost every six months. Uber is now doing over two million RPCs per second at peak across the Node.js fleet. They also want to add 1000 engineers to their team this year to sustain the company's growth, so don’t forget to follow them.

This quote from Kris Kowal, Software Engineer at Uber, sums up the Uber Node.js story very well:

“Somewhere in the world, someone needs to get where they need to go, and without the flexibility and reliability that Node.js provides that someone might be left thirsty for another transportation option.”
