Instagram + Android: Four Years Later

Instagram Engineering
10 min read · Jun 21, 2016

The first version of Instagram for Android was built in four months by a team of two engineers. It’s been four years since that launch, and in that time we’ve added features such as video, direct messaging, photo maps, advertiser support, and new ways to discover and explore the amazing content shared by users around the world. We’ve regularly released new filters, editing tools, and apps to unlock creative potential. Almost 30 engineers now work in our Android codebase every day. All this — yet Instagram for Android is still one of the fastest-starting apps on the platform, and is only a 16MB APK download for most users. How did we scale the team and create so many awesome new features while maintaining our best-in-class app size and performance? We’ve focused on providing the best possible experience for a small, well-scoped feature set, we have an extremely efficient UI layer, we take ownership of all the important code in our app, and we’ve invested heavily in maintaining our values as the team grows.

Do the Simple Thing First…

A core value of Instagram Engineering is to “Do the Simple Thing First”. We build for the use case that exists now, rather than the one that may exist later. We still care deeply about performance — but “doing the simple thing first” reminds us not to prematurely optimize our code or chase every small performance win. We think holistically and pragmatically, always considering the downside of increased complexity that often accompanies micro-optimization.

At the core of this principle is the idea that the Instagram app is simply a renderer of server-provided data, much like a web browser. Almost all complex business logic happens server-side, where it is easier to fix bugs and add new features. We rely on the server to be perfect, which we enforce through continuous integration testing, and we dispense with null-checking and data-consistency checking on the client. Because of this, the app crashes when it receives malformed data rather than remaining in a weird state. Our automated crash reporting then triggers alarms and an investigation, and we can fix the bug. It is much easier to fix a crash, with an attached stack trace, than to debug a weird state issue based on a user’s report.
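As a minimal sketch of the crash-over-weird-state idea (the `CommentRenderer` class and the field names are hypothetical, not Instagram's actual code), the client can dereference server fields directly, with no fallback values, so malformed data fails immediately with a stack trace:

```java
import java.util.Map;

// Hypothetical renderer; the class, method, and field names are assumptions.
final class CommentRenderer {
    // Trusts the server response completely: no null checks, no default values.
    // If "author" or "text" is missing, trim() throws NullPointerException
    // right here, at the point where the bad data is first used.
    static String renderLine(Map<String, String> serverFields) {
        return serverFields.get("author").trim() + ": " + serverFields.get("text").trim();
    }
}
```

With a well-formed map the method returns a line such as `alice: nice photo`. With the `text` field missing it throws, rather than quietly rendering a line that ends in `null`.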

Living in a fast-growing codebase for four years has made us value straightforward, readable, debuggable code, and so we do not heavily rely on opaque code-gen, runtime annotation processing, or other clever “magic”. The only annotation processing that we use happens at compile-time, generating Java source files that look and behave as if they were handwritten. We prefer code to be right there on the screen in front of us, not hiding behind a complex meta-processor. It is simple for new developers to ramp up in this environment; they can easily trace what’s happening in the app and track down bugs.

The original Instagram app was much simpler than what exists today. As a small team, building quickly to keep up with market pressure, we used a lot of inheritance to share code. This approach didn’t scale with the team’s growth: it led to a confusing, tightly coupled, brittle architecture, where execution bounced between different levels of a class hierarchy. Our desire to have small, simple, single-purpose classes has led us to embrace the principle of “composition over inheritance”. We’ve found that it often takes a little more thought to build things without falling back on inheritance, but as a team grows, more emphasis needs to be placed on architecture to give a solid foundation for future development.
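To make “composition over inheritance” concrete, here is a small sketch. All names (`ScreenComponent`, `ImpressionLogger`, `VideoAutoplayer`, `Screen`) are invented for illustration and are not taken from the Instagram codebase. Each behavior lives in a small single-purpose class, and a screen is assembled from the pieces it needs instead of inheriting them from a base class:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical single-purpose behavior that a screen can be composed from.
interface ScreenComponent {
    void onScreenShown(List<String> actions);
}

final class ImpressionLogger implements ScreenComponent {
    @Override
    public void onScreenShown(List<String> actions) {
        actions.add("log impression");
    }
}

final class VideoAutoplayer implements ScreenComponent {
    @Override
    public void onScreenShown(List<String> actions) {
        actions.add("start autoplay");
    }
}

// A screen holds a list of components rather than extending a deep hierarchy.
final class Screen {
    private final List<ScreenComponent> components;

    Screen(List<ScreenComponent> components) {
        this.components = new ArrayList<>(components);
    }

    // Runs every component in order and returns the actions they performed.
    List<String> show() {
        List<String> actions = new ArrayList<>();
        for (ScreenComponent component : components) {
            component.onScreenShown(actions);
        }
        return actions;
    }
}
```

A screen that needs autoplay passes both components to the constructor, while one that does not passes only the logger. Neither has to override or work around behavior inherited from a shared parent.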

…And Optimize What Matters

Equally important to us is to “Optimize What Matters”. We have a high bar for the performance of our most-used features, and as the product has grown and evolved we have needed to continually reevaluate our previous assumptions. Sometimes code must be rewritten to deal with new feature requirements or operating system capabilities. This is best illustrated with a series of examples:

  1. JSON Parsing Architecture. We originally used object-mapped parsing to deserialize JSON responses from the server. It was the easiest way to get up and running. Before we even shipped Instagram 1.0, we found that performance was lacking: it could take over 30 seconds to render the “explore” tab on lower-end devices. We replaced the object-mapped parsing code with handwritten stream-parsing code and saw that time go down to a few seconds. This worked for a year, but it was tedious and error-prone to hand-write parsing code. If done wrong, it could put the parsing thread in an infinite loop. At that point, we wrote an annotation-based parser generator. After it had stabilized, we deployed it for the entire app and entirely removed our dependence on object-mapped parsing.
  2. Comment Rendering. The addition of Emoji to the Android operating system tanked the performance of our comment rendering, causing the app to drop frames while scrolling the feed. We investigated the root cause and designed a sophisticated text layout caching mechanism to keep our feed fast.
  3. Activity Screen. We originally used a webview to display our “activity” screen, where you can see your likes and comments. Our theory was that we’d want to add new types of stories often, but this didn’t end up being the case in practice. And for a complex set of reasons, using a webview slowed down our cold start time by over 30%! We rewrote the screen in native code and saw our startup time drop to historic lows, while making the screen feel better than ever. Our cold-start time is now the second-fastest among the top 100 apps in the Android store.
  4. Feed. We are always adding features to make it easier to find awesome content on Instagram. For example, we recently added a full-screen, immersive video viewer to explore, and the ability to view top posts on hashtag and location feeds. As the different “feed” screens in the app have diverged, the architecture of our core feed code, which relied heavily on inheritance, made even the simplest features difficult to build cleanly. We removed all inheritance from feed Fragments and ListAdapters in a large, multi-month refactor, building a library of reusable components that could be easily used together to build new products. The resulting code was more robust and flexible, yet also dramatically simpler.
  5. Networking. We rewrote our inheritance-based HTTP request-generation and processing code in a declarative style in order to A/B test different low-level HTTP frameworks. We were able to deploy a client-side HTTP2 implementation with full confidence: our experimental data showed faster end-to-end request time on every API endpoint with no regressions.
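The comment-rendering example above mentions a text layout cache without describing it. The sketch below is not that mechanism. It only illustrates the general idea of remembering an expensive layout result per comment, using a small least-recently-used cache built on `java.util.LinkedHashMap`. The `MeasuredText` and `LayoutCache` classes and the line-count "layout" are stand-ins invented for this example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an expensive-to-compute text layout.
final class MeasuredText {
    final String text;
    final int lineCount;

    MeasuredText(String text, int lineCount) {
        this.text = text;
        this.lineCount = lineCount;
    }
}

// Small LRU cache keyed by comment text. Not thread-safe: it assumes all
// access happens on a single (UI) thread.
final class LayoutCache {
    private final int charsPerLine;
    private final Map<String, MeasuredText> entries;
    private int layoutsComputed = 0;

    LayoutCache(final int maxEntries, int charsPerLine) {
        this.charsPerLine = charsPerLine;
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.entries = new LinkedHashMap<String, MeasuredText>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, MeasuredText> eldest) {
                return size() > maxEntries; // evict the least recently used entry
            }
        };
    }

    MeasuredText layoutFor(String comment) {
        MeasuredText cached = entries.get(comment);
        if (cached != null) {
            return cached; // cache hit: skip the expensive work
        }
        layoutsComputed++;
        // Placeholder for real text measurement: ceil(length / charsPerLine).
        int lines = Math.max(1, (comment.length() + charsPerLine - 1) / charsPerLine);
        MeasuredText measured = new MeasuredText(comment, lines);
        entries.put(comment, measured);
        return measured;
    }

    int layoutsComputed() {
        return layoutsComputed;
    }
}
```

Scrolling back to a comment that is still in the cache costs a map lookup instead of a fresh layout pass, which is the kind of saving that keeps frames from being dropped.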

We have a sophisticated set of tools, mostly built by our counterparts working on Facebook for Android, which report on and analyze all sorts of data about our apps in the wild. We track scrolling performance, start time, data usage, stability, and bug reports to make sure that we’re never regressing on our commitment to provide the best experience to our users.

Efficient UI

The first version of Instagram was a luscious, skeuomorphic masterpiece with textures, shadows, and gradients everywhere. In early 2014, we embarked upon one of our largest optimizations yet: a project to overhaul Instagram’s look and feel on Android. Our goals were to make the app both faster and more beautiful. We designed and built an interface that makes use of flat colors, lines, and simple icons, combined with a subtle sense of space and layout to create a refined, efficient UI layer. We expected some performance gain, but the magnitude of our results surprised us. I’ll summarize them here:

  1. Startup time. Images take time to decode and occupy precious memory. We overhauled our app to remove textures, instead painting flat colors to the screen. We were efficient with our usage of icons, colorizing them in code to avoid loading multiple versions. Reducing our startup asset count from 29 to 8 reduced cold start by 120ms across devices. This is not only felt at app startup, but every time we show a new screen.
  2. App size. Converting image usage to trivial painting code and using ColorFilter to colorize assets programmatically allowed us to cut our total asset count in half, reducing the app’s size by multiple megabytes.
  3. Developer efficiency. A simpler UI is one that is faster to build. We have a library of components in the app that are easy to reuse, and new features don’t require difficult layout code to position background images and shadows correctly. We have a standard set of dimensions and colors, defined semantically, to reduce the amount of cognitive load and communication necessary between engineer and designer. Two years later, our engineers still love developing in this environment, and the app maintains a remarkable visual consistency between features.
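The idea of colorizing one icon in code, rather than shipping a separate image per color, can be sketched in plain Java. On Android this would be done with a `ColorFilter`, as described above. The `IconTinter` class below is a hypothetical stand-in that shows the underlying pixel operation on an array of ARGB values: keep each pixel's alpha (the icon's shape) and replace its color with the tint.

```java
// Hypothetical helper illustrating tinting; not an Android API.
final class IconTinter {
    // Returns a tinted copy of the given ARGB pixels. Each pixel keeps its own
    // alpha channel and takes the red, green, and blue of tintRgb.
    static int[] tint(int[] argbPixels, int tintRgb) {
        int[] tinted = new int[argbPixels.length];
        for (int i = 0; i < argbPixels.length; i++) {
            int alpha = argbPixels[i] & 0xFF000000;
            tinted[i] = alpha | (tintRgb & 0x00FFFFFF);
        }
        return tinted;
    }
}
```

One monochrome asset can then be drawn in any number of colors, which is how a single icon file can replace several pre-colored variants.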

Owning our Code

Over time, as we have refined our codebase via relentless optimization, we have reduced or eliminated dependencies on many third-party libraries that commonly appear in other Android apps, preferring to fully own our infrastructure code. One of my colleagues likes to describe our app as a “race car” — every single component is specialized for the job it needs to do.

Our image cache, for example, is homegrown, and comprises fewer than 1,500 lines of Java code. It is designed to download, decode, and display large images while the user is scrolling the feed, without dropping frames. It is not a general-purpose image library, so it eschews features that are not directly needed by the product, but it works extremely well for our use case.

As mentioned above, we developed a JSON parser/serializer generator which works with jackson-core (a low-level streaming JSON parser) to generate fast, memory-efficient parsing code. We do not use dependency injection, as we believe the code size, complexity, and performance hit do not justify the benefits. We use only a small subset of Guava, carefully evaluated for performance on mobile. We do not include the Play Services library, writing our own code to interface with GCM.

At a time when many popular Android apps are multidex, we still ship a single dex file. Secondary dexes incur a performance penalty on every method call, and loading too much code is generally bad because it eats up a lot of memory. We carefully track our method reference count, and it’s important that we do not unintentionally or carelessly add new dependencies. To that end, we created tests that run on every diff and check the app’s libraries against an approved set. If a diff adds a new library without updating the test (a change that triggers a review from engineers on our team), it cannot be committed. We have also specified method count “budgets” for internal libraries, and created tests to enforce them.
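A check of this kind could look roughly like the sketch below. The `DependencyPolicy` class, the library names other than jackson-core, and any budget numbers are invented for illustration, and how the dependency list and method counts are obtained from the build is left out entirely.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical policy check that a per-diff test could call.
final class DependencyPolicy {
    private final Set<String> approvedLibraries;
    private final Map<String, Integer> methodBudgets;

    DependencyPolicy(Set<String> approvedLibraries, Map<String, Integer> methodBudgets) {
        this.approvedLibraries = approvedLibraries;
        this.methodBudgets = methodBudgets;
    }

    // Libraries present in the build but missing from the approved set, sorted.
    List<String> unapproved(Collection<String> buildDependencies) {
        Set<String> extra = new TreeSet<>(buildDependencies);
        extra.removeAll(approvedLibraries);
        return new ArrayList<>(extra);
    }

    // Libraries whose measured method count exceeds their budget, sorted.
    // A library with no budget entry is treated as having a budget of zero.
    List<String> overBudget(Map<String, Integer> measuredMethodCounts) {
        Set<String> offenders = new TreeSet<>();
        for (Map.Entry<String, Integer> entry : measuredMethodCounts.entrySet()) {
            Integer budget = methodBudgets.get(entry.getKey());
            int allowed = (budget == null) ? 0 : budget;
            if (entry.getValue() > allowed) {
                offenders.add(entry.getKey());
            }
        }
        return new ArrayList<>(offenders);
    }
}
```

A test would fail whenever either method returns a non-empty list. Adding a library then requires an explicit edit to the approved set, and that edit is what puts the change in front of a reviewer.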

We recognize that writing so much custom code may not be a feasible approach for smaller teams which do not have the resources to write everything from scratch. For us, too, this was an iterative process: as our team grew and new features made more demands on size and performance, we started to be much more selective about the code we shipped. We removed third-party libraries that we could replace with our own code, tailored to our use case and therefore smaller.

Now that we have a great set of libraries, we’ve made sure that they are reusable amongst our various apps. Having an “app starter kit” shortened the development time of both Layout and Boomerang by months. Tests enforce that the “app starter kit” doesn’t depend on app-specific code.

Mentorship and Documentation

As the team size doubled, doubled, and then doubled again, it became important to teach new engineers about the codebase and our mobile engineering philosophy. We didn’t want people, simply for lack of awareness, to invent new solutions for problems we had already solved.

Our main teaching tool has always been code review. Every diff at Instagram (and Facebook) is reviewed by another engineer. We are especially thorough, asking people to conform to patterns already in the app and to make their code fairly robust, as well as checking for obvious errors. We pair each new engineer with an experienced mentor, who serves as their main reviewer and can answer most questions. The best, longest-tenured engineers are expected to make themselves available regularly to debate and inform technical architecture decisions, and provide specialized expertise to help solve problems — not just sequester themselves away writing code. The average engineer spends 20–40% of her day on various code review and mentorship activities.

We try to make our code as self-documenting as possible. We use annotations such as @Nullable, or for an even stronger guarantee, Guava’s Optional class, to document the null-contract of methods. We are ruthless about proper naming, believing that it prevents bugs and promotes readability. We strongly type everything, which, in addition to documenting the code, allows the compiler to do as much work as possible to prevent bugs. We use enums regularly because they are safer than Strings or ints and their performance downsides are minimal.
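A short sketch of these conventions follows. To stay dependency-free it uses the JDK’s `java.util.Optional` in place of Guava’s `Optional` (the two play the same documenting role here), and the `MediaType` and `Post` types are hypothetical examples, not Instagram classes.

```java
import java.util.Optional;

// An enum instead of a raw String or int: the compiler rejects invalid values.
enum MediaType {
    PHOTO("photo"),
    VIDEO("video");

    private final String serverValue;

    MediaType(String serverValue) {
        this.serverValue = serverValue;
    }

    // Converts the wire value once, at the edge; unknown values fail loudly.
    static MediaType fromServerValue(String value) {
        for (MediaType type : values()) {
            if (type.serverValue.equals(value)) {
                return type;
            }
        }
        throw new IllegalArgumentException("Unknown media type: " + value);
    }
}

final class Post {
    private final MediaType type;
    private final String locationName; // may legitimately be absent

    Post(MediaType type, String locationName) {
        this.type = type;
        this.locationName = locationName;
    }

    // Never null: the return type needs no further documentation.
    MediaType type() {
        return type;
    }

    // The Optional return type states the null-contract in the signature.
    Optional<String> locationName() {
        return Optional.ofNullable(locationName);
    }
}
```

Callers of `locationName()` are forced by the type to handle the missing case, while callers of `type()` can switch over the enum without guarding against stray strings.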

As the team has grown, we’ve constantly reevaluated how we work, doubling down where appropriate and shifting our tactics when things aren’t working. What hasn’t changed is the set of values that underlies all these decisions. A team that shares a set of values can work well even when decentralized. We spend a couple of hours every month presenting our mobile engineering values and culture to all new engineers who join Instagram.

Conclusion

Every team and codebase develops its own philosophy and strategy as it grows and matures. This is the one that has worked really well for us, built on many hours writing code and reading other people’s insights shared via blog posts like this one. We hope you can glean some useful strategies and ideas to incorporate into your development process.

Over the coming weeks, we’ll be sharing more specific details on a number of projects we’ve worked on recently that line up with the learnings above…stay tuned!

We recently moved our mobile infrastructure engineering teams (iOS and Android) to New York City. If this blog post got you excited about what we’re doing, we’re hiring — visit our careers page.

Tyler Kieft works on Android and iOS at Instagram
