A Startup Engineering Journey
In the early days, Airbnb built a Ruby on Rails monolith, which they called Monorail.
But as the company grew, they realized that the architecture would not scale.
To solve this issue, they shifted to a service-oriented architecture.
They did not yet have much infrastructure capacity or in-house expertise, so they went with AWS. They divided Monorail into two major services: Hyperloop and Treehouse.
While the new design solved tight coupling issues, it couldn't solve the upcoming scaling issues.
So they looked at further options: building in-house versus adopting a third-party solution.
They considered migrating to Kubernetes, aka k8s.
The engineering team started to use YAML templates to minimize complexity and abstract configurations.
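To make the idea of templated configuration concrete, here is a minimal sketch of how one shared template can be filled in per environment. It is not Airbnb's actual tooling: the template fields, the `ENVIRONMENTS` values, the `render_manifest` helper, and the service and image names are all hypothetical, and it uses only Python's standard `string.Template`.

```python
from string import Template

# Hypothetical shared template; field values are illustrative only.
DEPLOYMENT_TEMPLATE = Template("""\
apiVersion: apps/v1
kind: Deployment
metadata:
  name: $service
  namespace: $environment
spec:
  replicas: $replicas
  selector:
    matchLabels:
      app: $service
  template:
    metadata:
      labels:
        app: $service
    spec:
      containers:
        - name: $service
          image: $image
""")

# Per-environment overrides (assumed values, not real figures).
ENVIRONMENTS = {
    "staging": {"replicas": 2},
    "production": {"replicas": 20},
}


def render_manifest(service: str, image: str, environment: str) -> str:
    """Fill the shared template with the values for one environment."""
    values = {"service": service, "image": image, "environment": environment}
    values.update(ENVIRONMENTS[environment])
    # substitute() raises KeyError if any placeholder is left without a value.
    return DEPLOYMENT_TEMPLATE.substitute(values)


if __name__ == "__main__":
    print(render_manifest(
        "example-service",
        "registry.example.com/example-service:1.0.0",
        "staging",
    ))
```

The service owner only supplies a handful of values, while the repetitive structure lives in one place. That is the kind of abstraction the templates were meant to provide.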
They started using Git to store configuration files.
It helped them streamline the process for making, reviewing, and rolling out configuration changes across multiple staged environments. Later that year, they made the move and migrated to k8s, but their deployments started failing because the etcd cluster was running out of memory.
Fortunately, k8s halted any in-progress deployment and scaling whenever etcd failed. The issue was finally resolved by upgrading to the v3 data format of etcd.
When the number of cluster nodes doubled again, they started facing more issues.
They developed SmartStack, an in-house service mesh that enabled the transition to a multi-cluster environment.
I'd assume Consul, Istio and other service meshes were not out there yet.
They then created kube-system, at the time an in-house method for deploying clusters, which allowed them to deploy while ensuring equal cluster performance.
With the migration to k8s and numerous operational optimisations, they reached no less than 125,000 production deployments per year.
This is one of the biggest success stories of k8s. They have not only benefited from it but have also contributed back major improvements and new components, thanks to open source.
This story tells us about engineering operations, deployment, configuration and scalability. Migrating to a microservices architecture is one of those tales that start with "it was a total mess, but the teams learned from their mistakes". It took years, but compounded incremental improvements led the system to reach new heights.
They could have taken many other routes, but microservices and containerized applications were what made sense for Airbnb's use cases and demand at the time.
They could have skipped the Git config solution, but etcd was probably not ready.
They could have tried to plan this out and design for scale from the beginning.
But the Monorail app enabled Airbnb to develop and ship early, getting the product out there early enough to gather user feedback and some revenue, then more funding and the justification to prepare for further expansion.
Infrastructure and DevOps technology will evolve even faster in the coming decade than in previous ones. The engineering teams that succeed will be the ones that adapt their architecture, design, frameworks and tooling, and adjust their operational processes, to solve the challenges their environment throws at them.
Staying agile and pushing incremental improvements is necessary to remain lean while the business grows. Pace and balance, and a willingness to unlace with patience, are what take us from where we are to where we should be.