Transcript
Yadav: When I was asked to come here and give a talk, I was thinking about how to share what we have been through, because at Celonis we went through a lot of trouble to get to where we are right now. It's a journey, and I'll cover the tools we used along the way to help us deliver faster. Nx is one of the most important tools in our ecosystem, which is why I mentioned it in the talk title. Let's see what we are going to talk about.
First, I want to show you how our application looks like. This is our application. If you see the nav bar, actually each nav bar is actually an application. It's a separate application which we load inside our shell. Even inside the shell, there can be multiple components which we can combine together and create a new dashboard for our end users. It means there are different teams which are building these applications. This is where we are right now.
Problem Statement
How did we start? I want to show you what the problem statement was: what issue were we trying to resolve before we ended up here? This was our old approach. I was speaking to some friends who have the same issue: they have multiple repositories and are thinking about moving all the code into a single monorepo, but they are not able to do it, or they are struggling because they know there will be challenges. This is where we were three years ago. We had separate apps in separate repositories, and we loaded each app using URL routing. It was not like an SPA, module federation, or micro-frontends, which we know today, because tools have added more capabilities in the past few years.
For example, webpack added support for module federation, which was not there earlier. Everyone was solving the same problem in different ways, just not in the right way. This is another issue we had. We had close to 40 different repositories, and we built that code with GitHub Actions. We would build the code and push it to an artifact store or a database, because that was the one way to load the application: we pushed the entire build into the database and then loaded it on the frontend. The only problem is that we were doing it X times. The same process, same thing, just 30 times. Of course, it costs a lot of money. The other issue we had was our design system.
Which company doesn't have a design system? The first thing which a company decides to do is, let's have a design system. We don't have a product, but we should have a design system. This was another issue which we had. Now this became a problem. Of course, we had a design system, but now different applications started using different versions of design system, because sometimes they had time. Some teams started pushing back, we don't have frontend developers or we don't have time to upgrade it right now. This was, of course, a big pain. How should we do it? This caused another issue.
Some of our customers were using the same product, but as soon as they moved to a different application or part of the application, they saw a different design system. Think of a dark theme versus a light theme, just as an example, or a different text box: one user sees one text box and another user sees a different one.
What were the issues we went through? Page reloads, for example, of course, now with HTML5, everyone knows of course, the experience should be smooth. As soon as I click on the URL, there should not be page refresh. That's the expectation of today's users. This is not early '90s or 2000 where we can just click on a URL and wait for one hour to download my page. This is the thing of the past. Our users were facing this issue. Every page, every app, it reloads the entire thing. Bundle size, of course, we could not actually tree shake anything, or there was no lazy loading. Of course, there was a huge bundle size which we used to download.
Of course, when we have to upgrade Angular or any other framework, this can be any other framework which you are using in your enterprise. We are using Angular, of course. We had too much effort upgrading Angular because we have to do it 30 times. Plus, our reusables and design system. Maintaining multiple versions of shared libraries and design system became a pain because we cannot move ahead and adopt the new things which are available in Angular or any other ecosystem because it's always about backward compatibility.
Everyone knows, backward compatibility is not a thing. It's just a compromise. It's a compromise we do that, ok, we have to support this. That's why we are just still here. Now, as we said, we had 30-plus apps and then we used to deploy them separately. We had to sync our design system, which we saw in the previous slide. Which was, again, very difficult because for a few seconds or a few minutes, if your releases are not synchronized, you will see different UIs.
What Is Nx?
Then came Nx. Of course, we started adopting Nx almost three years back. Let's see what is Nx. It's a build tool. It's an open-source build tool, which is available for everyone. You can just start using it for free. There's no cost needed. It also supports monorepo. Monorepo is just an extra thing which you get. The main thing is it's a build tool. It's a build tool you can use. Let's see. It actually provides build cache for tasks like build and test. As of today, one thing which we all are doing is we are building the same code again and again. Nx takes another approach. The founders actually are from Google. Everyone knows Google has different tools.
If you have a colleague from Google, you keep hearing about, we had this tool and we had that tool, and how crazy it was. Of course, these people used to work on the Angular team. They took the idea from Bazel, because Google uses it a lot, and they built the entire Nx based on it. They launched it for Angular first, and now it's platform and technology independent. As I said, it's framework and technology agnostic. You can use it for anything. It's plugin based, so you can bring your own framework. If there is no support for an existing technology, you can just add it. Or if you have a homegrown framework which you built on your own, you can also bring it as a plugin, as part of Nx, and you start getting all the features which Nx offers.
For example, build cache. It supports all the major frameworks out of the box. For example, Angular, React, Vue. On top of it, it supports micro-frontend. If you want to do micro-frontend with React or Angular, it's just easy. I'll show you the commands. It also supports backend technologies. They have support for .NET, Java, Spring. They have support for Python. They also added support for Gradle recently. As I said, it's wide.
Celonis Codebase
This is our codebase as of today. We have 2 million lines of code. We have close to 40-plus applications. We have 200 projects. Why are applications not projects? Because we also have libraries. We try to split our code into smaller chunks using libraries, so that's why we have close to 200 projects. Then, more than 40-plus teams which are contributing to this codebase. We get close to 100 PRs per day. That's average. There are some times where we get more. With module federation, this is what we do today. We are not loading those applications via URL routing. It's just the Angular application loads natively. We have multiple applications here. Shell app is something which just renders your nav bar.
Then you can access any apps. It just feels like you're a single page application. There is no reload. We can do tree shaking. We can actually do code splitting. We can also share or reduce the need to share our design system across the application, because now we have to do it only once. These are some tasks which we run for each and every PR. Of course, we do build. Once you write your code, the first thing which you do is you build your project. Then we write unit tests. We use Jest. We also have Cypress component test to write our test. Then we, of course, run it on the CI as well. Before we merge our PR, we also run end-to-end test. We are using Playwright for writing our end-to-end test or user journey.
Then, let's see how to start using module federation with Angular. You can just use this command: nx generate. For any framework, you will find nx generate. Then you specify nx and the framework name. Here, for example, you can replace Angular with React, and you get your module federated app or micro-frontend app for your React application. These remotes are actually applications which will be loaded when you route through your URLs. For example, home, about, and blogs can be different URLs which we have. They are actually different applications. It means three teams can work on three different applications but, at the end, they will be loaded together.
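The relationship between the shell and its remotes can be pictured with a small sketch. This is not the Nx or webpack Module Federation API: `RemoteDefinition`, `createShell`, and the synchronous `load` functions are hypothetical stand-ins for the dynamic import of a remote entry, used only to show that a remote is loaded the first time its route is visited and reused afterwards.

```typescript
// Hypothetical model of a shell that lazily loads remotes by route.
// This is NOT the Nx or webpack Module Federation API.
interface RemoteDefinition {
  route: string;      // URL path segment served by this remote
  load: () => string; // stand-in for dynamically importing the remote entry
}

interface Shell {
  navigate: (route: string) => string;
  loadedRoutes: () => string[];
}

function createShell(remotes: RemoteDefinition[]): Shell {
  // Remotes that have already been loaded, keyed by route.
  const loaded: Record<string, string> = {};

  return {
    navigate(route: string): string {
      if (Object.prototype.hasOwnProperty.call(loaded, route)) {
        return loaded[route]; // already loaded: nothing is fetched again
      }
      for (const remote of remotes) {
        if (remote.route === route) {
          loaded[route] = remote.load(); // load only on first visit
          return loaded[route];
        }
      }
      throw new Error(`No remote registered for route "${route}"`);
    },
    loadedRoutes(): string[] {
      return Object.keys(loaded);
    },
  };
}

// Three remotes, as in the talk's example; each could be owned by a different team.
let homeLoads = 0;
const shell = createShell([
  { route: "home", load: () => { homeLoads++; return "Home app"; } },
  { route: "about", load: () => "About app" },
  { route: "blogs", load: () => "Blogs app" },
]);
```

Calling `shell.navigate("home")` twice runs the home loader once; the about and blogs remotes stay unloaded until someone navigates to them.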
Feature Flags
We use feature flags a lot because when we started migrating all of the codebase, it became a mess. Of course, a lot of teams started pushing their code in a single codebase. We were coming from different teams. A different team had their different ways to write code. We had feature flags for backend. Of course, that was something which was taken care of. At the frontend, we were seeing a lot of errors. We thought of creating a feature flag framework for our frontend application. This is how it feels like without feature flag. I've seen this meme many times. This always says, this is fine. We believe this is not fine. If your organization is always on fire, this is not fine. This is not fine for everyone. You should not do 24 by 7 just monitoring your systems because you just did a release. This is where we started. Of course, we had a lot of fires.
Then we decided, of course, we will have our own feature flag framework for frontend applications. This is what we used to think before we had a feature flag. We're used to, ok, backend, frontend, we will merge it. Then everything goes fine. We'll do a release, and everyone is happy. This is not the reality. This looks good on paper but, in reality, this is what happens once you merge your code. Everything just collapses. We started with this. We started creating our frontend feature flag to do this. We now have the ability to ship a feature based on a user, based on a cluster. We can also define how many percentages of users or customers we want to ship this feature to. Or we can also ship a specific build. We generally try to avoid this. This is something which we use for our POCs.
Let's say if you want to do a POC for a particular customer, we can say, just use this build. That customer will do its POC, and if they're fine or they're happy with this, we can go ahead and write for the code. For example, of course, we have to still write tests. We have to write user journey test. This is just for POC. We can also combine all of the above. We ended up with this. We started seeing, now there are less bugs, because now the bugs are isolated, because they are behind a feature flag. We also have the ability to roll back a feature flag if anything goes wrong. We don't have to roll back the entire release, which was the case earlier. Now we are shipping features with more confidence, which we need.
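The targeting options described above (by user, by cluster, by percentage) can be sketched roughly as follows. The `FlagRule` shape, the field names, and the hashing scheme are assumptions made for illustration; the talk does not show the actual Celonis schema or evaluation logic.

```typescript
// Hypothetical flag rule; field names are illustrative, not the Celonis schema.
interface FlagRule {
  enabledUsers: string[];    // ship to specific users
  enabledClusters: string[]; // ship to specific clusters
  rolloutPercent: number;    // 0-100: share of the remaining users who get the feature
}

interface FlagContext {
  userId: string;
  cluster: string;
}

// Deterministic bucket in [0, 100): the same user always lands in the same
// bucket, so the feature does not flip on and off between page loads.
function bucketFor(flagName: string, userId: string): number {
  const key = flagName + ":" + userId;
  let hash = 0;
  for (let i = 0; i < key.length; i++) {
    hash = (hash * 31 + key.charCodeAt(i)) % 1000003;
  }
  return hash % 100;
}

function isEnabled(flagName: string, rule: FlagRule, ctx: FlagContext): boolean {
  if (rule.enabledUsers.indexOf(ctx.userId) !== -1) return true;
  if (rule.enabledClusters.indexOf(ctx.cluster) !== -1) return true;
  return bucketFor(flagName, ctx.userId) < rule.rolloutPercent;
}

// Example rule: one named user, one cluster, no percentage rollout yet.
const newDashboard: FlagRule = {
  enabledUsers: ["alice"],
  enabledClusters: ["eu-1"],
  rolloutPercent: 0,
};
```

In this model, rolling a feature back means emptying the lists and setting `rolloutPercent` to 0, without touching the release itself.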
Before you ask me which feature flag solution we are using, I'm not here to sell anything. We built our own. We decided to build our own. How? Again, Nx comes into the picture. Because Nx, as I said, is plugin based. You can build anything and just create it as a plugin. You get everything out of the box. It feels native. It feels like you are still working with Nx. This is the command. You can just say, nx add and a new plugin. You can define where you want to put that plugin into. For our feature flag solution, we use a lot of YAML files. We added all the code to read those YAML files as part of our plugin. It's available for everyone.
One thing which you have to focus more on, in case you are creating a custom solution, is developer experience. Otherwise, no one will use it. We also added the ability to enable/disable flags. Developers can just raise a PR and enable and disable a feature flag. We also added some checks that no one should disable a flag in case it's already being used, and no one knows about it. There are some checks. Like, for example, your release manager or your team lead has to approve it.
Otherwise, someone just does it by mistake. Then we also have a dashboard where you can see which features are enabled and in which environment. Our developers can also see that. We also have a weekly alert, just in case there is a feature flag which is GA now, and it's available for everyone. We also send a weekly alert so developers can go ahead and remove those feature flags. This is fine, because we know where the fire is, and we can just roll it back.
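The weekly alert could be driven by a check along these lines. This is a sketch under stated assumptions: the `FlagStatus` shape is hypothetical, and treating a flag as GA when it is enabled in every environment is a simplification chosen for the example, not the rule described in the talk.

```typescript
// Hypothetical record of where a flag is switched on; names are illustrative.
interface FlagStatus {
  name: string;
  enabledIn: string[]; // environments where the flag is fully enabled
}

// Assumption: a flag counts as "GA" when it is enabled in every known environment.
// Such flags no longer gate anything and can be removed from the code.
function findRemovableFlags(flags: FlagStatus[], knownEnvironments: string[]): string[] {
  const removable: string[] = [];
  for (const flag of flags) {
    let everywhere = true;
    for (const env of knownEnvironments) {
      if (flag.enabledIn.indexOf(env) === -1) {
        everywhere = false;
        break;
      }
    }
    if (everywhere) removable.push(flag.name);
  }
  return removable;
}

const deploymentEnvironments = ["dev", "staging", "production"];
const weeklyReport = findRemovableFlags(
  [
    { name: "new-dashboard", enabledIn: ["dev", "staging", "production"] },
    { name: "beta-export", enabledIn: ["dev"] },
  ],
  deploymentEnvironments
);
```

Here `weeklyReport` contains only `new-dashboard`, the one flag that is on everywhere.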
Proof of Concepts
Of course, when you have a monorepo, the other problem we have seen is that a lot of teams are not fans of monorepos, because they think they're being restricted from doing things their own way. This is where we came up with the idea: what if teams want to do a proof of concept? Recently, there were a few teams which said, we want to come into the monorepo, but the problem is our code is a POC. We don't want to write tests. And we do have checks. I think most of you might have checks for your test coverage: you should have 80%, or 90%, or whatever. I don't know why we keep it, but it's useful, just to see the numbers.
Then we said, let's give you a way so you can start creating POCs, and we will not be a blocker for you anymore. In Angular, you can just say, I'll define a new remote, and that's it. A new application is created. They can just do it. Another issue is, most of the enterprises, they have their own way of creating applications. They may need some customization. That, I want to create an application, but I need some extra files to be created when I create this application. Nx offers you that. Nx offers you the way to customize how your projects will be created. For example, in our use case, what we do is whenever we create an Angular application, we also add the ability to write component test. What we did is we just took the functionality from Nx, added all this into a single bundle or a single plugin, and we gave it to our developers.
That whenever you create a new application, you will also get component test out of the box. Or let's say it can be your Cypress, or it can be your Playwright, or it can be anything which you like. For example, you want to create some extra files, for example, maybe Dockerfile, or maybe something related to your deployment, which is mandatory for each and every app. You can customize the way your applications are created by using the generators. This is called Nx Generator. As I said, you can also create files. You can define the files wherever you want to. Generally, we use files as a folder. You can put all the files.
For example, as I said, Dockerfile, or any other files which you need for configuration. You can pass them as a parameter. It uses a format called EJS. I'm not sure how many people are aware of EJS. It uses a syntax called EJS to replace any variables into the actual file. Here, I'm talking about the actual file. This is not any temporary files. I'm talking about the actual files which will be written on the drive. You can all do this with the help of Nx Generator. This is what we do whenever someone creates a new application. We just add some things out of the box.
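To make the variable replacement concrete, here is a minimal sketch of EJS-style output tags. It is a hand-rolled stand-in, not the EJS library and not Nx's generator API: real EJS supports control flow, escaping, and more. The `renderTemplate` helper and the Dockerfile template contents are hypothetical.

```typescript
// Minimal stand-in for EJS output tags such as <%= appName %>.
// Real EJS supports much more; this only substitutes simple variables.
function renderTemplate(template: string, variables: Record<string, string>): string {
  return template.replace(
    /<%=\s*([A-Za-z_][A-Za-z0-9_]*)\s*%>/g,
    (_match: string, key: string): string => {
      if (!Object.prototype.hasOwnProperty.call(variables, key)) {
        throw new Error(`Template variable "${key}" was not provided`);
      }
      return variables[key];
    }
  );
}

// Hypothetical Dockerfile template that a generator might keep in its files folder.
const dockerfileTemplate = [
  "FROM nginx:alpine",
  "COPY dist/apps/<%= appName %> /usr/share/nginx/html",
  "EXPOSE <%= port %>",
].join("\n");

// The rendered text is what would be written to disk for the new application.
const renderedDockerfile = renderTemplate(dockerfileTemplate, {
  appName: "blogs",
  port: "8080",
});
```

Failing loudly on a missing variable is a deliberate choice in this sketch: a half-rendered Dockerfile is harder to notice than an error at generation time.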
Maintaining a Large Codebase
When it comes to maintaining a large codebase, because now we are here, we have 2 million lines of code in a single repository, there are a few things which we have to take care of. For example, refactoring. We do a lot of refactoring because we got the legacy code. I'm sure everyone loves legacy code, because you love to hate it. Then, we keep doing deprecations. This is one thing I think we are doing better, that we are doing deprecations. As soon as we see some old code, we start deprecating that code if it's not used. Then, migration. Of course, over the period of time, we have migrated multiple apps into our monorepo.
We still support, just in case anyone wants to migrate their code to our monorepo. It took us time. It took us close to two years. Now we are at the stage where I think we have only one app, which is outside our monorepo. This is not going to happen in a day, but you have to start someday. Then, adding linters and tools. Of course, this is very important for any project. You need to have linters today. You may need to add tools tomorrow. Especially with the JavaScript ecosystem, there is a tool every one hour, I think. Then, helping team members. This is very important in case you are responsible for managing your monorepo. I'm sure if you end up doing this, initially you will end up actually doing this a lot.
Most of the time, you'll be helping your new developers onboard into a monorepo. This is very important, again. Documentation, this is critical, because if you don't do this, then more developers will rely on you, which you don't want to. It will take your time away. Then the ability to upgrade Angular framework for everyone. Whatever framework you use, we use Angular, but in case you use React or Vue. This is what we wanted. This is what comes under the maintaining our monorepo. How do we do this? For example, Nx offers something called nx graph. If I run nx graph, I get this view, where I can see all the applications, all the projects.
I can figure out which library is dependent on which app. If I want to refactor something, I can just check if this is being used or not by using the nx graph. Or if there is something refactored which is required, I can just look at this graph and say, probably this UI should not be used in home, it should be used in blogs. Then you can just refactor your code. It helps a lot during refactoring and during deprecations as well.
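The questions answered by looking at the graph ("is this used, and by whom?") can also be expressed in code. The sketch below uses a plain object as a hypothetical project graph; the project names follow the talk's example plus an invented `legacy-utils` library, and the shape is not Nx's internal graph format.

```typescript
// Hypothetical project graph: each project lists the projects it depends on.
// The shape is illustrative and is not Nx's internal format.
type DependencyGraph = Record<string, string[]>;

const exampleGraph: DependencyGraph = {
  shell: ["home", "about", "blogs"],
  home: ["ui"],
  about: [],
  blogs: [],
  ui: [],
  "legacy-utils": [], // invented for the example: nothing imports it
};

// Projects that depend on the given project directly.
function directDependents(g: DependencyGraph, project: string): string[] {
  return Object.keys(g).filter((name) => g[name].indexOf(project) !== -1);
}

// Projects that nobody depends on, excluding known entry points such as the shell.
// These are candidates for deprecation.
function unusedProjects(g: DependencyGraph, entryPoints: string[]): string[] {
  return Object.keys(g).filter(
    (name) => entryPoints.indexOf(name) === -1 && directDependents(g, name).length === 0
  );
}
```

With this data, `directDependents(exampleGraph, "ui")` returns only `home`, and `unusedProjects(exampleGraph, ["shell"])` returns only `legacy-utils`.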
Now, talking about migrations. As I said, you may have to migrate a lot of code to your monorepo once you start, because all the code lives in different repositories. Nx offers a command called nx import, where you define your source repository and your destination, and it migrates your code along with its Git history. This command just came in the last release. For the past few years, we had been doing it manually. We did it for more than 30 repositories, all by hand. The same thing is now available as part of Nx. You can just run this command and do everything automatically. We deploy our documentation on Backstage.
This is what we do, so everyone is aware of where the documentation is. We use Slack for all the communications or any new initiatives or deprecations which we are announcing. We have a dedicated Slack channel, so just in case developers have any questions, they can ask on this channel. It actually improves the knowledge sharing as well, because if someone already knows something, we don't have to jump in and say, this is how you should do it. It reduced a lot of dependency from us, the core team. Education is important.
We started doing a lot of workshops initially when we moved to a monorepo, just to give the confidence to the developers that we are not taking anything from you. We are actually giving you more control over your codebase, and we are just here to support. We started educating. We did multiple workshops. Whenever we add a new tool, we do a workshop. That's very important.
Tools
As I said, every other hour, you are getting a new tool. What should you use? Which tool should you add? It's true that introducing a new tool into a codebase is very time consuming. You may end up spending two or three days just figuring out how to make the tool work. At the same time, sometimes adding a tool is easy, but maintaining it is hard. As soon as you add it, there is a new tool available the next hour which is much more powerful. Now you are stuck maintaining the old tool yourself, because it gets no more upgrades, and most of your code already uses it, so you cannot move away from it.
At the end of the day, you have to just maintain this code or maintain this tool. Nx makes it easy. It also makes it easy to introduce a new tool and maintain a new tool. Let's see how. Nx offers you support out of the box for the popular tools, for example, Cypress and Playwright. This is now a go-to tool for writing end-to-end tests. I'm not sure about the others, but it's widely used in the JavaScript ecosystem. Anyone who starts a new project probably now goes for Playwright, but there was a time that many people were going with Cypress. Nx, just a command, and then you can just start adding or start using this tool. You don't have to even invest time configuring this. You just start using it. That's what I'm talking about.
For unit tests, it gives you Jest and Vitest out of the box. You can just add one and start using it. No time is needed to configure the tool. What about upgrades? Nx offers something called migrate. With the migrate command, you can migrate everything to the latest version. For example, if you're using React and you want to move to the new React version, you can just say nx migrate latest, and it will migrate your React version. Same for Angular. This is what we do now. We don't invest a lot of time in manual upgrades. We just use nx migrate, and our code gets migrated to the new version. It works for all the frameworks and technologies which are supported by Nx, and you can also do it for your own plugins.
For example, let's say if you end up writing something for your own company, a new plugin, and you want to push some new updates, you can just write a migration, where this migration tool will just automate the migration for your codebase, and your developers don't have to even worry about what's happening. Of course, you have to make sure that you test it properly before shipping.
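As an illustration of what such a migration might do, the sketch below rewrites imports from an old package name to a new one. Nx migrations operate on a virtual file tree; here a plain object stands in for that tree, and `renameImport` and the `@acme/...` package names are hypothetical.

```typescript
// In-memory stand-in for a workspace: file path -> file contents.
type FileMap = Record<string, string>;

// Hypothetical migration: rewrite imports from an old package name to a new one.
// Mutates the file map and returns the paths of the files that changed.
function renameImport(files: FileMap, oldName: string, newName: string): string[] {
  const changed: string[] = [];
  for (const filePath of Object.keys(files)) {
    const updated = files[filePath]
      .split(`from "${oldName}"`)
      .join(`from "${newName}"`);
    if (updated !== files[filePath]) {
      files[filePath] = updated;
      changed.push(filePath);
    }
  }
  return changed;
}

// Invented example workspace with one file still using the old package.
const sampleWorkspace: FileMap = {
  "apps/home/src/main.ts": 'import { Button } from "@acme/ui-old";',
  "apps/about/src/main.ts": 'import { Card } from "@acme/ui";',
};
const touchedFiles = renameImport(sampleWorkspace, "@acme/ui-old", "@acme/ui");
```

Returning the list of touched files makes it easy to test the migration against a fixture before shipping it to every team.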
Demo
I'll show you a small demo, because everything we saw so far was a picture. Believe it when you see something running, otherwise, don't. This is what your nx graph looks like. Whenever you run nx graph, you can click on Show all projects. Then you can hover over any project and see how it is connected: how it's being used, and which project depends on which. For example, from shell you see dotted lines. A dotted line means lazy loading. It means the projects are not directly related, but they are related.
For example, Home and UI, it says that there is a direct dependency. You can figure out all this from nx graph. It also gives you the ability to see tasks, tasks like build or lint. Let's say if you make a code change, you can figure out what tasks will be run after my code change. Which builds will be running? Which applications will be affected? Everything you can figure out from this nx graph. This is free, so you don't have to pay. I'm just saying this is one of the best features which I have seen, which is available for free. Let me show you the build. I talked about caching. Let's run a build, nx run home:build. I'll just do production build. It's running the build. This line is important. It says, 1 read from cache. Let's say if you make some changes, like right now, one thing about monorepo, people think I have 40 projects.
Whenever I make changes, my 40 projects will be built. Monorepos have actually a bad name for this. I have done .NET, so I know. We used to have so many projects, and then rerun the same code or build the same code again and again, but not with Nx. Nx knows your dependency graph, so it can figure out what needs to be built again and what needs to be read from the cache. They do it really well. Here we can see one read from cache, because I already built it before. It just retrieved the same build from the cache. Now let's say, 40 teams working on 40 different apps, but one team makes changes to its own app, then 39 apps are not built again, because Nx knows from dependency graph that this application is not affected, so I don't have to build anything.
If I try to build it again, so next time it will just retrieve everything from cache. Now it's faster than before. It says now it took 3 seconds, which earlier was 10 seconds. This is what Nx offers you out of the box. Nx is available for your builds, your test, your component test, or your end-to-end test, anything. All the tasks can be cached. This is caching.
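The idea behind the cache hit in the demo can be sketched as follows. This is a simplified model, not Nx's implementation: Nx's real hash covers source files, dependencies, and configuration, while the hypothetical `hashInputs` and `createTaskRunner` below work on a list of strings.

```typescript
// Hypothetical task cache keyed by a hash of the task's inputs.
interface TaskResult {
  output: string;
  fromCache: boolean;
}

// Simple string hash over all inputs; any change to any input changes the key.
function hashInputs(inputs: string[]): string {
  const joined = inputs.join("\u0000");
  let hash = 5381;
  for (let i = 0; i < joined.length; i++) {
    hash = (hash * 33 + joined.charCodeAt(i)) % 4294967296;
  }
  return hash.toString(16);
}

function createTaskRunner(build: (inputs: string[]) => string) {
  const store: Record<string, string> = {};
  return function run(taskName: string, inputs: string[]): TaskResult {
    const key = taskName + ":" + hashInputs(inputs);
    if (Object.prototype.hasOwnProperty.call(store, key)) {
      return { output: store[key], fromCache: true }; // same inputs: reuse the result
    }
    const output = build(inputs); // inputs changed or never seen: do the real work
    store[key] = output;
    return { output, fromCache: false };
  };
}

// Count how often the expensive build actually runs.
let buildCount = 0;
const runTask = createTaskRunner((inputs) => {
  buildCount++;
  return "bundle(" + inputs.length + " files)";
});
```

Running `runTask("home:build", files)` twice with identical inputs performs one real build; the second call is served from the store, which is the behaviour shown by the "read from cache" line in the demo.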
CI/CD
Of course, CI/CD, there is always one guy in your team who is asking for faster builds. I was one of them. We use GitHub Actions with Nx, which gives us superpower. How do we do it? We use actually larger runners on GitHub Actions. We use our own machines. We used to use GitHub-provided machines, but it was too expensive for us. We moved on to using our own machines now. We use Merge Queue to run end-to-end tests. I'll talk about Merge Queue, because this is an amazing feature given by GitHub. This is only available for enterprises. We can cache build for faster build and test, which we saw on the local. What we saw was on the local. I'll show you how we do it on CI. Let's talk about Merge Queue and user journey test first.
One thing about user journey test is they are an excellent way to avoid bugs. Everyone knows. Because you are testing in a real-time simulation, because you are actually going to log in and click on a button to process something. We all know that if you try running user journey on every PR, it will be very expensive, because we are interacting with the real database. It may take a lot of time to complete your build. We also know that when you are running multiple branches, this is another issue. Because the next branch will soon go out of sync with the main branch because you already have latest changes in main branch.
Then running the user journey tests again on an old branch is pointless, because it doesn't have the latest changes. It means there is a chance that you introduce errors. This is why GitHub introduced Merge Queue. Let's see how it works. Let's say there are four PRs (pull requests) in your pipeline, and PR 4 fails, so it's removed from your queue. The other three, PR 1, PR 2, and PR 3, will be sent to your Merge Queue. Merge Queue is a feature provided by GitHub, which you can enable from your settings. You can define how many PRs to consider for Merge Queue. We do 10: ten PRs are pushed to Merge Queue at once. You can change that. With 100 PRs per day, we found that 10 works for us.
In your case, if you get more PRs, you can just increase the number of PRs which you want to push into Merge Queue. Then once it goes to Merge Queue, this is how it works. GitHub will create a new branch from your PR, the first PR, and the base branch will be main. Then it will rebase your changes from PR 1 to this new branch, which is created, but it will not do anything else. The branch is created. That's it. Then it creates another branch called PR 1, PR 2. Now the PR 1 branch is your base. Then it will merge PR 2 changes into this branch. Now it's latest code. Same with PR 3. Now it will create PR 1, PR 2, PR 3, take PR 1, PR 2 as base, and PR 3 changes will be merged to this branch.
After this, it will run all the tasks which are actually available on your CI/CD. For example, you run build, you run test, you run your component test, plus user journey test. Whenever you are running user journey test, you are running it on latest code. It's not the old code which is out of sync. Yes, it reduces the number of errors you have.
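The stacking of temporary branches described above can be modelled in a few lines. This sketch only mirrors the description in the talk; `QueueEntry` and `buildMergeGroups` are hypothetical names and do not correspond to GitHub's API.

```typescript
// Hypothetical model of a merge queue: each entry is tested on top of main
// plus every PR queued ahead of it.
interface QueueEntry {
  pr: string;
  testedWith: string[]; // main plus the PRs contained in this temporary branch
}

function buildMergeGroups(passingPrs: string[], maxGroupSize: number): QueueEntry[] {
  const entries: QueueEntry[] = [];
  const included: string[] = ["main"];
  // Only the first maxGroupSize PRs enter the queue in one batch.
  for (const pr of passingPrs.slice(0, maxGroupSize)) {
    included.push(pr);
    entries.push({ pr: pr, testedWith: included.slice() });
  }
  return entries;
}

// PR 4 failed its own checks, so only PR 1 to PR 3 are queued (group size 10).
const mergeGroups = buildMergeGroups(["PR 1", "PR 2", "PR 3"], 10);
```

The last entry is tested with main, PR 1, PR 2, and PR 3 together, which is why the user journey tests run against up-to-date code rather than a stale branch.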
Before I go on to affected, I want to give some stats on how we are doing today. With 2 million lines of code and 200 projects, our average time for each PR is 12 minutes. For an entire rebuild, it's 30 minutes. That is all possible because we make use of affected builds. Nx knows what has been affected, and this is what it does internally. For example, Lib1 and Lib2 are used by five different applications. You push new code which changes Lib1, which in turn affects App1 and App3. What we do is just run the affected tasks: we say, run affected and do build, lint, test. That's it. We retrieve the cache from an S3 bucket.
As of today, we are using S3 bucket to push our cache and then retrieve it back whenever there is a change. We just retrieve it back from the S3 bucket. You can do it if you have money. There is a paid solution by Nx, it's called Nx Cloud. You can just remove this. You don't have to do it on your own. Nx Cloud can take care of everything for you. It can actually even do cache hit distribution. I'm talking about cache hit distribution on your CI pipeline as well as on your developer's machine. Your developers can get the latest build, which is available on the cache, and they don't have to build even a single thing. It's very powerful, especially if you are onboarding new developers. They can just join your team on day one, within one hour, they are running your code without doing anything, because everything is already built.
As soon as they make changes, they are just building their own code and not everything. If you want to explore Nx Cloud, just go to nx.dev, and then you will find a link for Nx Cloud. As of today, we are not using Nx Cloud because it was probably too expensive for us and not a good fit, but if you have a big organization? As I said, Nx Cloud works for everyone. It's not only for frontend or backend: any technology, any framework. This is an example from our live code. We have our design system. For example, when I tried to run it for the first time, it took 48 seconds. The next run took us 0.72 seconds, not even a second. This is a crazy level of time which we save on every time we build something. Our developers are saving a lot of time. They are drinking less coffee.
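The affected calculation mentioned earlier in this section (a change to Lib1 affecting App1 and App3) can be sketched as a walk over the dependency graph. The graph contents, `affectedProjects`, and `affectedTasks` are assumptions for illustration and do not reproduce Nx's implementation.

```typescript
// Hypothetical graph: each project lists what it depends on.
type ProjectDeps = Record<string, string[]>;

const projectDeps: ProjectDeps = {
  App1: ["Lib1"],
  App2: ["Lib2"],
  App3: ["Lib1"],
  App4: ["Lib2"],
  App5: ["Lib2"],
  Lib1: [],
  Lib2: [],
};

// A project is affected if it changed or if it depends, directly or
// transitively, on an affected project.
function affectedProjects(deps: ProjectDeps, changed: string[]): string[] {
  const affected: string[] = changed.slice();
  let grew = true;
  while (grew) {
    grew = false;
    for (const name of Object.keys(deps)) {
      if (affected.indexOf(name) !== -1) continue;
      for (const dep of deps[name]) {
        if (affected.indexOf(dep) !== -1) {
          affected.push(name);
          grew = true;
          break;
        }
      }
    }
  }
  return affected.sort();
}

// Expand affected projects into the tasks CI has to run, e.g. "App1:build".
function affectedTasks(deps: ProjectDeps, changed: string[], targets: string[]): string[] {
  const tasks: string[] = [];
  for (const project of affectedProjects(deps, changed)) {
    for (const target of targets) {
      tasks.push(project + ":" + target);
    }
  }
  return tasks;
}
```

With this graph, a change to Lib1 yields App1, App3, and Lib1; the four other projects are skipped or served from cache.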
Release Strategy
The last thing is about the release strategy. One thing at Celonis, is we love our weekends. I'm sure everyone loves their weekend, but we really care about it. Our release strategy is actually built around the same, that we don't have to work on weekends. This is what we do. Of course, we have 40-plus apps, so we know that this is risky, so we don't do Friday releases. Because it's not fun, going home and working on Saturdays and Sundays to fix some bugs. What we do today, we create a new release candidate every Monday morning. Then we ask teams to run their test. It's a journey. There are teams who have automated tests. There are teams who don't have automated tests. They do manual or whatever way they are doing, or they just say, ok, it's fast. You should not do that, but, yes, that might be a possibility. They execute their tests, automated or manual.
If everything goes fine, we deploy by Wednesday or Thursday. Wednesday is our deadline: we ask every team to finish their tests by Wednesday or, worst case, Thursday. If something goes wrong, we say, no release this week. Because we are already on Thursday, if we do a release, it means our weekends are destroyed. We don't like that. We really care about our weekends, so we cancel the release and say, we'll come back on Monday and see if we can go ahead with a deployment. If everything is green, we just deploy, go home, and monitor it on Thursday or Friday, based on when we released. Everyone is happy. Then we do this again next week.
Of course, there are some manual interventions which are required here. This is where we want to be. Of course, every company has a vision. Every person has a vision. We also have a vision. This is what we want to do. We want to create a release candidate every day. If CI/CD is green, we want to deploy to production. That's it. If there's something which goes wrong, we want to cancel our deployment and do it next day. Renato accidentally mentioned 40 releases per week. We at least want to do five releases a week. That's our goal. Probably we will be there one day. We are probably very close to that, but it will take us some time.
Questions and Answers
Participant 1: I have a question about end-to-end test. As I understand you call it user journey test. How do you debug that in this huge setup of 40 teams? Let's say if test is red, how do I understand root causes? It can be a problematic red.
Yadav: Playwright actually has a very good way to debug the test. We use Playwright. It comes with a debug command. You can just pass --debug, and whichever application is giving us an error, you can just debug that particular application. You don't have to debug 40 applications. We also have insights. Whenever we run tests, we push the success and failure data to Datadog. We display it on our GitHub summary. Even the developer knows which test is failing. They don't have to look into the void and see, what's going wrong? They know, this is the application, and this is what I have to debug.
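To illustrate the idea of pointing a developer at the failing application, here is a minimal sketch that groups failed user-journey tests by application and renders a small Markdown table. The `JourneyResult` shape and the function name are hypothetical; the talk does not show the real report format or how the data reaches Datadog.

```typescript
// Hypothetical result shape; not the actual report format used in the talk.
interface JourneyResult {
  app: string; // application that owns the user-journey test
  test: string; // test title
  passed: boolean;
}

// Build a Markdown summary listing only the applications with failures.
function summarizeFailures(results: JourneyResult[]): string {
  const failedByApp = new Map<string, string[]>();
  for (const r of results) {
    if (r.passed) continue;
    const tests = failedByApp.get(r.app) ?? [];
    tests.push(r.test);
    failedByApp.set(r.app, tests);
  }

  if (failedByApp.size === 0) return "All user-journey tests passed.";

  const lines = ["| Application | Failing tests |", "| --- | --- |"];
  const entries = Array.from(failedByApp.entries()).sort((a, b) =>
    a[0].localeCompare(b[0])
  );
  for (const [app, tests] of entries) {
    lines.push(`| ${app} | ${tests.join(", ")} |`);
  }
  return lines.join("\n");
}
```

In GitHub Actions, a string like this could be appended to the file named by the `GITHUB_STEP_SUMMARY` environment variable so it shows up on the run summary. Once the failing application is known, only that application's tests need to be rerun with `--debug`.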
Participant 2: I was wondering if you also integrate backend systems into this monorepo, or if it was a conscious decision not to do so.
Yadav: It does support that. As I said, you can actually use your backend, like .NET Core. I think it supports Spring, as well as Maven. Now they added support for Gradle as well. You can bring whatever framework or whatever technology you want to. We are not using it because I think that's not a good use case for us. I think more teams will be happy with having the current setup where they own the backend, and the frontend is owned by a single team.
Participant 3: How do you handle major framework updates or, for example, design system updates? Because I think in the diagram you showed that you try to do it like every day release. I can imagine that with many breaking changes, this is not how it can work. You need more time to test and make sure it's still working.
Yadav: We actually recommend every developer write their own tests. It's not like another team is writing the tests. That's one thing. Of course, about the upgrades, this is what we do. We have the ability to push a specific build. For example, the Angular 14 upgrade, which was a really big upgrade for us, because after Angular 13, we were doing it for the first time, and there were some breaking changes. We realized very early that there are some breaking changes. We wanted to play safe. What we did is, with a feature flag, we started loading the Angular 14 build only for some customers to see how it goes. We rolled it out initially for our internal customers, like our users.
Then we ran it for a week. We saw, ok, everything is fine. Everything is good. Then we rolled it out for 20% of the users. Then we monitored it again for a week. Then 50%, and now we will go 100%. This is very safe. We don't have any unexpected issues. With design system, we do it weekly. It's like design system is owned by another team, so they make all the changes. They also do it on Monday. They get enough time, like four or five days, to test their changes, and then make it stable before the next release goes.
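The staged rollout described in this answer (internal users first, then 20%, 50%, and 100%) can be sketched as a deterministic percentage flag. This is an assumption about how such a flag might work, not the mechanism Celonis actually uses: the `User` shape, the build labels, and the FNV-1a bucketing are all chosen for illustration.

```typescript
// Hypothetical build labels and user shape, for illustration only.
type Build = "angular-13" | "angular-14";

interface User {
  id: string;
  internal: boolean;
}

// 32-bit FNV-1a hash, used only to spread user ids evenly over buckets.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash >>> 0;
}

// Map a user id to a stable bucket in the range 0..99.
function bucketOf(userId: string): number {
  return fnv1a(userId) % 100;
}

// rolloutPercent is the share of external users on the new build.
// Internal users get the new build from the first stage onward.
function pickBuild(user: User, rolloutPercent: number): Build {
  if (user.internal) return "angular-14";
  return bucketOf(user.id) < rolloutPercent ? "angular-14" : "angular-13";
}

// Stages from the talk: internal only, then 20%, 50%, 100%.
const rolloutStages = [0, 20, 50, 100];
```

Because the bucket depends only on the user id, a user who receives the new build at 20% keeps it at 50% and 100%, so nobody flips back and forth between builds while the percentage is raised week by week.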
Participant 4: You explained about the weekly release. How do you handle hotfix with so many teams?
Yadav: Of course, there will be hotfixes, we cannot avoid this. There will be some code which goes into a release by mistake. We try to capture hotfixes or any issues on the release before they go on to production. Just in case there is anything which needs to be hotfixed, they generally create a PR against the last release. Then we create a new hotfix. It's all automated. You just need to create a new release candidate from the last build which we had, and just push a new build again. Good thing is, with the setup, it's not like we have to roll back the entire release.