
Simplicity, Speed, and Re-Use. Shipping Threads in 5 Months

Key Takeaways

  • We reused key parts of Instagram to build Threads in five months.
  • Repurposing a code base for something it wasn't designed to serve brought in technical debt, which now needs to be paid down.
  • You need to know the legacy code base inside and out to customize it so that you can effectively repurpose it.
  • When we found out on launch day that our graph-copying system couldn't handle the scale, we raced to redesign it to horizontally scale and orchestrate a bunch of workers.
  • Cleaner, newer code isn't necessarily always better, as all the little learnings encoded into an old battle-tested code base add up. If you can help it, don't throw it away.

In January 2023, we received word that, within a couple of months, we'd need to build a microblogging service to compete with Twitter. A small team was assembled to take on that challenge, and we shipped a new social network in July. This article describes how we developed and launched the Threads app at Meta last year.

This is a summary of the talk 0 → 1, shipping Threads in 5 months, which I gave at QCon London 2024.

Values and milestones

There were four basic values of the new Threads product:

First, the format here was text. Instagram puts media front and center, but here every post starts with text.

Second, we wanted to carry over the design language and ethos of Instagram: the simplicity, the feel of the product that so many around the world love. We felt that was a good foundation.

Third, we knew that one of the values that helped early Twitter establish itself was its openness. By this I mean that the community was free to craft its own experience with an API, and public content was widely available on the web through embeds. People used these tools to share their Twitter feeds all over the place. While this is a new world, with GenAI on the rise, we felt a new walled-garden product just wouldn't fly.

Lastly, we felt we needed to prioritize the needs of creators. Every social network has a class of folks who produce the majority of the content that others consume; this usually follows a Zipfian, power-law distribution. On text-based networks the effect is exaggerated, and an even smaller proportion of the user base produces most of the content. It's hard to be interesting and funny in 500 characters. We knew that keeping the needs of this community in mind was critical for long-term success.
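To make the shape of that distribution concrete, here is a tiny illustrative sketch; the exponent and population size are made-up numbers, not Threads data:

    # Illustrative only: sample a Zipfian (power-law) number of posts per user
    # to show how a small head of users produces most of the content.
    import numpy as np

    rng = np.random.default_rng(42)
    posts_per_user = rng.zipf(a=2.0, size=100_000)   # posts authored per user
    posts_per_user = np.sort(posts_per_user)[::-1]   # rank users by output

    top_1pct = posts_per_user[: len(posts_per_user) // 100].sum()
    print(f"Top 1% of users produce {top_1pct / posts_per_user.sum():.0%} of all posts")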

With these values in mind, we sketched out what the absolute bare minimum product was and got to work building. An important objective was that we wanted to have a product ready to ship as soon as possible to give us options.

To give ourselves goalposts, we outlined four milestones we'd like to hit. Each milestone was designed to be a possible end state in itself, where we could ship if needed, with each one layering on the next most essential bit of functionality.

Milestone one was just about standing up the app. You can see how bare-bones that is: being able to log in and make a text post that gets associated with your account. Really just the basics.

Milestone two was our essentials bucket: giving the app its familiar look with tabs for feed, notifications, and profile, and getting the basics of integrity working, like being able to block another profile or report it.

Milestone three was what we called the lean launch candidate. Here we fleshed out services that had been neglected before, like a basic people search and a full-screen media viewer for photos and videos. We started to figure out what to do with conversations, the lifeblood of the app: how do you rank them as a unit? We added copying your follow graph from IG so that you can bootstrap your profile, and the ability to mute other profiles. That's a long laundry list.

Milestone four, the final ship candidate, was where we would enable interoperability with the fediverse. Anyone who knows the product might be laughing at the ambitiousness, because we are far from full fediverse interop even today. In one of my weaker moments, I promised we could have this ready within the month; it was going to be the last thing we tackled, in May. We took this very seriously.

As each milestone neared, we'd shift from a building-features mindset to a polish-for-launch mindset. This was honestly exhausting as an engineer, as the context switching between modes is taxing. In some ways, we shipped three products in those six months. Each time we'd call a war room, burn the midnight oil, and push through to get a complete app. Then we'd decide whether or not we were ready to ship.

With the benefit of hindsight, I now see there was a big upside to this strategy: it served as a strong forcing function to simplify the product. If all you get is three weeks to pick which features to add, and which will add the most incremental value, you really have to boil it down to just the basics. The actual product we shipped was something like M3.5. We never got anywhere near M4, but we did get a much-needed iteration on top of M3.

Speeding up development

It would have been extremely ambitious to build Threads from scratch in five months. We started in earnest in February, and we were promising our leadership that we'd have it ready by the summer. Our confidence in this timeline came from knowing we could repurpose parts of Instagram for the job.

When you take a wide-angle view, broadcast sharing on Instagram is fairly straightforward. You can follow profiles, access a feed of their posts, plus some recommendations. You can respond to one another, building up communities around interests. Coincidentally, this is exactly the set of features we wanted for Threads on day one, so we reused them.

Our first prototype added a mod to the Instagram feed that surfaced text-only posts, and we reused the ranking. For the post itself, the inside joke is that we just reordered the layout: we put the caption on top and the media on the bottom, and everything else is the same. This drastically reduced the technical scope.

We had now taken the intractable problem of "how do you build a new text-based social network?" and turned it into a very specific one: "how do I customize the Instagram feed to display a new text post format?"
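A minimal sketch of what that customization amounts to, with hypothetical names (this is not Instagram's actual rendering code): the existing post renderer gets one more layout branch rather than a whole new feed pipeline.

    from dataclasses import dataclass, field

    @dataclass
    class Post:
        caption: str
        media_urls: list[str] = field(default_factory=list)
        is_text_post: bool = False  # the new Threads-style format

    def render_post(post: Post) -> list[str]:
        """Return the ordered UI sections for a post."""
        if post.is_text_post:
            # Threads mod: caption first, media (if any) below it.
            return ["header", "caption", *(["media"] if post.media_urls else []), "actions"]
        # Classic Instagram layout: media front and center.
        return ["header", "media", "caption", "actions"]

    print(render_post(Post(caption="hello threads", is_text_post=True)))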

As engineers, you will undoubtedly note that this approach has major downsides. You accumulate massive tech debt by using a code base for something new that it wasn't designed to serve, and a whole series of paper cuts comes with it. You also need to know said legacy code base inside and out: it's millions of lines of code, and you need to understand it well enough to customize just a few of them and effectively repurpose it. My favorite one-liner about this is that it's harder to read someone else's code than it is to write your own.

We borrowed Instagram's design language. Your Instagram identity bootstrapped your Threads one. We carried the spirit throughout, surgically reusing what we could and rebuilding only where necessary.

It also means that the Threads team has much to be grateful for: Threads owes its existence to the foundation laid by Instagram's infra and product teams over the years. We'd be nowhere without it. I like to think that this focus on simplicity paid off; much of the praise we received at launch was for the sparkling simplicity of the app, without it feeling like it lacked essential features.

The Threads stack

Here’s a technical overview of the Threads stack.

Meta in general is big on monolithic binaries and its monorepo, for a number of reasons. The upshot is that much of Instagram's business logic lives in a Python binary called Distillery, which talks to Meta's bigger monolith, a massive PHP/Hack binary called WWW. Distillery is probably the largest Django deployment in the world, something I don't know if we're proud of, even if we run a pretty customized version of it.

The data for our users is stored in a variety of systems. The most important ones are TAO, a write-through cache that operates on a graph data model, and UDB, a sharded MySQL deployment that stores almost all our data. By graph data model, I mean that TAO natively understands links between nodes and has optimized operations to query them. There are indexing systems on top that let you annotate particular links with how you want to query them, and they build in-memory indexes to help. The whole model has evolved over many years of building social products at Meta, and it has served us well.
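To illustrate the shape of that graph model, here is a toy sketch loosely based on the operations described in Meta's published TAO paper (objects, typed associations, time-ordered range queries). It shows the API shape only, not TAO's implementation.

    from collections import defaultdict

    class TinyTao:
        """Toy model: objects plus typed, time-ordered associations."""
        def __init__(self):
            self.objects = {}                # id -> fields
            self.assocs = defaultdict(list)  # (id1, type) -> [(time, id2)]

        def obj_add(self, oid, **fields):
            self.objects[oid] = fields

        def assoc_add(self, id1, atype, id2, time):
            self.assocs[(id1, atype)].append((time, id2))
            self.assocs[(id1, atype)].sort(reverse=True)  # newest first

        def assoc_range(self, id1, atype, pos, limit):
            # Page through the newest edges, e.g. someone's latest followers.
            return self.assocs[(id1, atype)][pos : pos + limit]

        def assoc_count(self, id1, atype):
            return len(self.assocs[(id1, atype)])

    g = TinyTao()
    g.obj_add(1, username="zuck")
    g.assoc_add(1, "FOLLOWED_BY", 42, time=1688601600)
    print(g.assoc_range(1, "FOLLOWED_BY", pos=0, limit=10), g.assoc_count(1, "FOLLOWED_BY"))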

There are other systems involved too, and I can't do them all justice. There's a big Haskell service that fronts a lot of the rules behind our decisions around restrictions, like looking for inauthentic behavior in user actions. There's a key-value store called ZippyDB that's used to stash lots of transient data that's written to too often for MySQL to handle the load. There's a serverless compute platform, async, which was critical, and which I'll cover later. Then there's a Kubernetes-like system for managing, deploying, and scaling all these services. All of these need to work in concert.

The launch

On July 5th, anyone who wasn't an engineer was happily hanging out on the product with the celebs who had come on for early access; this was probably the closest anyone was going to get to a personal chat with Shakira. For engineers, it was all-hands-on-deck preparation for launch the next day: upsizing systems, sketching out the run of show, and the like.

In the middle of the day, our data engineer popped up in the chat and mentioned something odd: he was seeing tens of thousands of failed login attempts on our app. This was odd because no one, certainly not tens of thousands of people, should have had access to the app yet. We pivoted quickly and ruled out a data issue. Then we noticed that all of these attempts were coming from East Asian countries. Maybe you've figured it out already; it would take us another beat to realize that this was time zones. Specifically, we were using the App Store's preorder feature, where people sign up to download your app once it becomes available on a date you specify. Since we said July 6th, the app became available once it was past midnight in those countries, and people couldn't log in because of another gate on our end. This was an oh-shit moment.
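A quick sketch of the mechanics (the timezone list here is just an example): an App Store release date keys off local midnight, so users east of UTC see the app hours before a US-based team expects.

    from datetime import datetime
    from zoneinfo import ZoneInfo

    RELEASE_DAY = (2023, 7, 6)
    for tz in ["Asia/Tokyo", "Asia/Seoul", "Europe/London", "America/Los_Angeles"]:
        local_midnight = datetime(*RELEASE_DAY, tzinfo=ZoneInfo(tz))
        print(f"{tz:22} goes live at {local_midnight.astimezone(ZoneInfo('UTC'))} UTC")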

The warm, fuzzy feeling of being safely ensconced in internal-only limited testing was gone. We pulled together a war room and a Zoom call with close to 100 experts from around the company, covering all the various systems I mentioned. Given that it was the middle of the night in those countries, it was evident that demand was going to far exceed our estimates once this all went live.

We had a healthy preorder backlog built up, and everyone in it was going to get the app once the clock struck midnight in their local time. We chose a new target launch time, midnight UK time, which gave us a couple of hours to prepare. We spent that time upsizing all the systems Threads touched. I fondly remember a particular ZippyDB cache, essential for feed to function at all, that needed to be resharded to handle 100x the capacity it was provisioned for. That job finished minutes before Mark posted that Threads was open for signups. I don't think I'll ever forget the stress of those final moments.
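For a sense of why resharding under deadline is scary, here is a toy illustration (not how ZippyDB actually places data): with naive hash-modulo placement, almost every key moves when the shard count grows, and even smarter placement schemes still have to physically move data.

    import hashlib

    def shard_for(key: str, num_shards: int) -> int:
        # Naive placement: hash the key, take it modulo the shard count.
        return int(hashlib.md5(key.encode()).hexdigest(), 16) % num_shards

    keys = [f"feed:user:{i}" for i in range(100_000)]
    moved = sum(shard_for(k, 16) != shard_for(k, 1600) for k in keys)
    print(f"{moved / len(keys):.0%} of keys move when going from 16 to 1,600 shards")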

Another problem revolved around copying the follower graph from Instagram to Threads. The issue was that you could ask to follow people who hadn't signed up for Threads yet, and the audience waiting for someone popular is unbounded. When big celebrities, say former President Barack Obama, signed up for Threads, they had a backlog of millions of people waiting to follow them. The system we had originally designed simply couldn't handle that scale. We raced to redesign it to horizontally scale and orchestrate a bunch of workers, so it could eat through the humongous queues for these sorts of accounts.
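A minimal sketch of that redesign idea, with made-up names and sizes: instead of one job draining a celebrity's pending-follower backlog, split the queue into fixed-size chunks and fan them out across many workers in parallel.

    from concurrent.futures import ThreadPoolExecutor

    CHUNK = 10_000

    def apply_follows(target_user: int, follower_ids: list[int]) -> int:
        # Placeholder for the real work: write follow edges, fan out notifications.
        return len(follower_ids)

    def drain_backlog(target_user: int, backlog: list[int], workers: int = 32) -> int:
        chunks = [backlog[i : i + CHUNK] for i in range(0, len(backlog), CHUNK)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            return sum(pool.map(lambda c: apply_follows(target_user, c), chunks))

    # e.g. a multi-million-entry backlog for a celebrity account
    print(drain_backlog(target_user=1, backlog=list(range(2_000_000))))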

It was a hair-raising couple of days, but all told, I doubt I'll ever experience a product launch quite like that again. I learned a bunch from others about staying graceful in stressful situations. I'll carry that wherever I go.

Conclusion

Getting this off the ground was a unique experience for me. My biggest takeaway is the power of keeping things simple. It's certainly not easy; the aphorism about it taking longer to write a shorter letter applies to anything creative. If you're clear about the value you want to provide, that clarity can guide all the hard decisions about where to cut scope.

The other learning is that cleaner, newer code isn't necessarily better. All the little learnings encoded into an old, battle-tested code base add up; if you can help it, don't throw it away. The complement to this is our saying that code wins arguments, which is to say that building a product often answers questions better than abstract analysis does. We got through a lot of thorny questions quicker with prototypes than with slide decks.

Of course, none of this can be entirely generalized, because we have to acknowledge how lucky we were with the opportunity and with the reception once the product came out. None of that was guaranteed, and I feel very grateful for the community that embraced the product. We have a saying going back to the early days of Facebook that this journey is just 1% finished. It indeed feels that way to me, especially with Threads.
