As N26 grew fast, they had to scale their technology to keep up. This meant scaling not only their infrastructure, but also their teams; for instance, they had to decide how to distribute work over teams and what technology to use or not use. Folger Fonseca, software engineer and Tech Lead at N26, shared his experience from scaling technology at N26 at QCon London 2020.
One of the technology challenges that N26 faced during hypergrowth concerned microservices:
"After starting with microservices, teams were allowed to run whatever tech stack or programming language they wanted to use. This proved to be really hard to maintain, as we realized that it was hard to move engineers freely between teams. Soon enough there were services written in some language that none of our engineers would want to be bothered with learning. We therefore decided to keep our tech stack under control by using an internal tech radar, which allowed us to keep a smaller set of technologies in which to build up experience and knowledge, while still keeping an organized way of driving innovation."
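To make the tech radar idea concrete, here is a minimal sketch in Java of how a declared service stack could be checked against a radar. It is an illustration under stated assumptions, not N26's tooling: the class `TechRadar`, its methods, and the ring names (borrowed from the commonly used Adopt/Trial/Assess/Hold vocabulary) are all hypothetical.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: none of these names come from N26.
final class TechRadar {

    // Ring names follow the widely used tech radar vocabulary (an assumption here).
    enum Ring { ADOPT, TRIAL, ASSESS, HOLD }

    // Keys are expected to be lowercase technology names, e.g. "kotlin".
    private final Map<String, Ring> entries;

    TechRadar(Map<String, Ring> entries) {
        this.entries = Map.copyOf(entries);
    }

    // A technology is allowed for new services only if it sits in ADOPT or TRIAL.
    // Technologies missing from the radar are treated as not allowed.
    boolean isAllowed(String technology) {
        Ring ring = entries.get(technology.toLowerCase(Locale.ROOT));
        return ring == Ring.ADOPT || ring == Ring.TRIAL;
    }

    // Returns, in alphabetical order, the technologies in a service's declared
    // stack that the radar does not allow.
    List<String> violations(Set<String> declaredStack) {
        return declaredStack.stream()
                .filter(tech -> !isAllowed(tech))
                .sorted()
                .collect(Collectors.toList());
    }
}
```

A check like this could run when a new service is created, so that a stack outside the agreed set is flagged early while the radar itself remains the place where new technologies are proposed and trialled.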
Another challenge was that certain domain teams grew from having 2 to 12 engineers. Soon enough the domain needed to be split into multiple teams. Creating and improving a good onboarding process is critical to enabling the healthy and efficient growth of teams without entirely removing their capacity to deliver, as Fonseca explained:
"As part of our onboarding process, we make sure everyone is paired with an onboarding buddy who can support them as the first point of contact in case they have any doubts or questions during the first month. Clear goals are shared during the new engineers’ probation period so that they have a clear understanding of what they are expected to achieve during the first few months at N26. Services should be properly documented. We also run company and team-specific onboarding trainings with specific guidelines on how to create and maintain backend microservices."
Fonseca mentioned that one thing they learned during hypergrowth was that sometimes if you want to go faster, you first have to slow down. It’s important to take a step back and review practices and processes to look for opportunities for improvement, he said. Having an open culture to receive and provide feedback really helps to identify where you can improve as an organization and as an engineering team.
"If you don’t take the time to tackle automation, this will soon come back and have a huge drawback on your team’s delivery capacity," Fonseca stated.
InfoQ interviewed Folger Fonseca after his talk at QCon London.
InfoQ: What stages did N26 go through while hyper-growing?
Folger Fonseca: I’ve been working for the company for the last three years. In startup time this may sound like an eternity, because of how much a startup can change even in a short period of time. At the very beginning, it was quite dynamic. There was zero coordination in the sense that things were not formally expressed, but rather handled informally, and you relied a lot on individual skills. Then we went through hyper-growth. This meant that our engineering group grew three or four times the initial size. So we went from having something like 40 to 50 people to having 300.
There are a lot of really interesting challenges when you are scaling not only your infrastructure, but also your teams.
InfoQ: How did you deal with the challenges?
Fonseca: There were challenges at multiple levels, but one great example is what we did with our CI/CD pipelines and how we invested in automation. Currently we have over 180 microservices and we deploy them to production around 300 times a week. If you have something inside your release processes that requires a manual review, soon that manual review becomes a bottleneck. We try to avoid manual processes as much as possible and automate them instead. This may look expensive at first glance, but it has proven worthwhile.
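As an illustration of replacing a manual review with an automated check, the following Java sketch models a release gate that a pipeline could evaluate before each deployment. It is a sketch under assumptions rather than a description of N26's pipeline: `ReleaseGate`, `BuildReport`, the three checks, and the 0.8 coverage threshold are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of an automated release gate; names and thresholds are
// illustrative assumptions, not N26's actual pipeline.
final class ReleaseGate {

    // What the pipeline is assumed to know about a build once CI has finished.
    static final class BuildReport {
        final boolean testsPassed;
        final double lineCoverage;          // fraction between 0.0 and 1.0
        final int criticalVulnerabilities;  // count from a dependency scan

        BuildReport(boolean testsPassed, double lineCoverage, int criticalVulnerabilities) {
            this.testsPassed = testsPassed;
            this.lineCoverage = lineCoverage;
            this.criticalVulnerabilities = criticalVulnerabilities;
        }
    }

    // Named checks, kept in insertion order so failures are reported predictably.
    private final Map<String, Predicate<BuildReport>> checks = new LinkedHashMap<>();

    // Registers a check and returns this gate so calls can be chained.
    ReleaseGate require(String name, Predicate<BuildReport> check) {
        checks.put(name, check);
        return this;
    }

    // Returns the names of all failed checks; an empty list means the build may deploy.
    List<String> failedChecks(BuildReport report) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Predicate<BuildReport>> entry : checks.entrySet()) {
            if (!entry.getValue().test(report)) {
                failed.add(entry.getKey());
            }
        }
        return failed;
    }

    // Example policy; the 0.8 coverage threshold is an arbitrary placeholder.
    static ReleaseGate defaultGate() {
        return new ReleaseGate()
                .require("tests", r -> r.testsPassed)
                .require("coverage", r -> r.lineCoverage >= 0.8)
                .require("security", r -> r.criticalVulnerabilities == 0);
    }
}
```

Because the gate is plain code, every one of the roughly 300 weekly deployments can be evaluated the same way without waiting for a person, and any review criterion that can be expressed as a predicate becomes one more `require` call.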
InfoQ: How about the technology stack at N26? What does that look like at a high level?
Fonseca: We use a microservices architecture, which means that potentially each microservice can have its own stack. But we try to be consistent across our ecosystem. We started early on with Java because it’s one of the most popular languages. As we grew and discovered new languages, we adopted Kotlin because of its simplicity and all that it brings to the table as a modern language.
Currently, on our backend services, we only use Java and Kotlin, and for most of our services we only use Kotlin. We use Spring Boot as our web framework, and for storage we rely mostly on SQL; for some services, we use document-based storage. We use REST for communication between services, but we also rely a lot on asynchronous communication using the AWS messaging systems.
InfoQ: How do those technology choices have an impact on the culture and your ability to scale the teams and keep your teams performing well?
Fonseca: I think it’s really important to pick the right technology stack for a couple of reasons. One of the things that you want to have with your stack is its ability to actually enable you to deliver faster, because the end goal is that your users get more feature sets.
And then there are other factors at play, such as how many engineers you can hire for that stack. How many experienced people are available in the market to grow your teams? Java is one of the most established languages; there is a huge pool of talent available to hire. At the same time, Kotlin is a modern language. By including Kotlin, we can hire engineers with a Java background who can then work in Kotlin as well.
Adopting a microservices architecture was definitely one of the key decisions that we took, because it really enables you to work on features in parallel, to organize your teams and to require less coordination between those teams. At N26, we divide teams into segments; every segment is like its own startup targeting a given goal, e.g. one segment can have the goal of driving more card transactions while another segment’s goal is to increase the number of signups. Segments are entirely autonomous in deciding which features they want to work on, as long as they target their goals. Each team within a segment has a set of microservices which they take care of; service ownership is end-to-end, which means that teams take care of their services’ infrastructure, quality standards, documentation, testability, scalability and new feature development.
The idea of this model is to have a clear segregation of responsibilities in order for people to have the same mental model when they describe what their team is working on.
And then the last one is to be able to grow dynamically. So this is what the cloud provides. We use AWS and our application is 100 percent hosted in the cloud. And this really helps you to scale up or scale down, depending on the situation.