Your source code is vulnerable: here’s what hackers are looking for

The biggest news stories of the spring so far have been a slap, a hostile takeover bid, a devastating series of source code dumps by a heretofore unknown hacking group, and now a stolen batch of OAuth tokens leading to yet more source code leaks. Of all of those, the source code leaks might have the most enduring impact.

What do we know?

The latest hack was reported by GitHub, which discovered the breach due to anomalous access patterns. The attack was on the tokens GitHub issues to other apps so their users can integrate with GitHub, but the original targets of the attack were most likely Heroku and Travis CI, the apps to which the abused authentication tokens were assigned. Attackers used the tokens to download the private code repos of dozens of companies. Additionally, GitHub confirmed that AWS keys found in a private npm module accessed with a compromised OAuth token were used to access S3. The attack demonstrates how supply chain relationships can be used to hop from one target to another, or to reach an ultimate target via a chain of connected targets.

Before that was Lapsus$, which stunned the world with the highly unusual ultimatum to Nvidia to open source its code. In the highly compressed timeline that will now be known as the Spring of Lapsus$, the group announced its attack, then days later dropped the 1TB archive via its Telegram channel. The group’s chaotic energy continued from there. It went on to drop substantial source code from Samsung and Microsoft. Lapsus$ also targeted digital transformation contractor Globant, along with source code for over two dozen of its clients, including DHL and Facebook. And the group’s attack on authentication provider Okta may have given it access to thousands of additional targets.

How do attackers use this data?

In the most recent attack, it’s not yet known how attackers gained access to the OAuth tokens or client secrets necessary to use them. But the attackers’ determination to get access to source code is especially revealing about how hackers are now mapping and gaining access to their targets.

Looking at recent source code leaks, including those from Lapsus$, we find the following categories of goodies:

  • Intellectual property – it’s not surprising that this is the thing many of the executives I talk with fear most. For them, code is how they turn their understanding of the business and their customers into revenue, and code exposure could nullify their differentiation if competitors were to use it.
  • Map of their systems – necessarily, code details what systems it connects to and how. From databases to remote APIs, hackers can trace the interconnections in the app just as developers need to.
  • The secrets to access those systems – the user and password to access that database, the keys to connect to that API, the gRPC/TLS client certificate used by that set of microservices, the certificate signing request and private key used to generate the SSL certificates that prove the company’s identity on the web, and the code-signing key used to prove the authenticity of their apps all fall into this category, along with far too many other bits that only work if they’re kept private.

While many executives most fear the loss of intellectual property, the biggest risk is actually the exposure of the systems their code connects to and the secrets used to connect to them. And that’s exactly why hackers were so intent on stealing tokens to access code in GitHub, and it might explain why the Lapsus$ group thought the source code they stole was so valuable—even if the executives didn’t realize why at the time.

Lapsus$’s rapid pace was greased at least in part by lateral movement between connected companies and systems. At least some of the group’s attacks originated on vendors to their target companies, and the group appeared effective at leveraging the secrets they learned to gain further access everywhere they went.

The effects might reverberate longer yet. Just as identity theft attacks can play out over years or decades, the unparalleled exposure of source code and secrets so far in 2022 could cast a long shadow on the security of not just these companies, but of their customers as well.

Responsible companies will investigate every leak thoroughly and quickly replace any affected secret (and more, just to be safe), but that process isn’t always straightforward, and it often goes far beyond rotating keys and passwords.

Companies like those whose code was targeted in the OAuth token attack will need to look at any built artifacts, such as public or private npm modules. A backdoor injected into an artifact might go unnoticed and be reused for years. Some of the companies targeted by Lapsus$ make hardware that is in the hands of hundreds of millions of customers. Securely updating those devices now that the companies’ private keys are compromised is an enormous undertaking. Even more difficult will be rebuilding customer trust.

What can we do?

The first step is to assume your code will leak. We all work and hope to avoid that, but the code leaks of 2022 suggest every company should be planning for it. Execs that used to think they were protected because all their code was in private repos are realizing they don’t have the protection they thought they had.

Mitigating a code breach requires looking at what’s in your code. Hackers might get the intellectual property, but do you need to give them the keys and passwords to all your systems? 

Ten years ago, early in the infrastructure-as-code revolution, there weren’t really any good solutions for how to manage secrets. But now those tools are common and companies should be looking to eliminate the use of secrets in code so that hackers can no longer use them as an attack vector. Using those tools and rotating any secrets currently in code is one of the most important steps any company can take in the current threat climate.
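As a rough illustration of what taking secrets out of code can look like, here is a minimal Python sketch that reads credentials from the process environment at runtime instead of hardcoding them. The variable names (DB_USER, DB_PASSWORD, DB_HOST), the helper functions, and the connection string format are all hypothetical, and a real deployment would more likely have a dedicated secrets manager populate these values.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a required secret is not present in the environment."""


def get_secret(name, env=os.environ):
    # Look the value up at runtime; nothing sensitive lives in the repo.
    value = env.get(name)
    if not value:
        # Fail loudly rather than falling back to a hardcoded default.
        raise MissingSecretError(f"secret {name!r} is not set")
    return value


def database_url(env=os.environ):
    # Hypothetical variable names, used here only for illustration.
    user = get_secret("DB_USER", env)
    password = get_secret("DB_PASSWORD", env)
    host = env.get("DB_HOST", "localhost")  # not secret, so a default is fine
    return f"postgresql://{user}:{password}@{host}/app"
```

With this shape, a leaked repo shows an attacker which secrets exist, but not their values, and rotating a secret means changing the environment rather than the code.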

Security is a process, not a product, but some of those processes can be automated. Automated scanning for secrets and other sensitive information is the first step. Good automation works with developers in their workflow to spot the secrets and PII in code. In our work with customers, just notifying a developer of a secret in a pull request is over 80% effective at preventing new secrets from getting into code.
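To make the idea of scanning a pull request concrete, here is a simplified Python sketch that checks only the lines a diff adds. It is not how any particular product works: the three patterns, the rule names, and the scan_added_lines function are assumptions chosen for illustration, and production scanners use far larger rule sets plus entropy checks and validation to cut false positives.

```python
import re
from typing import NamedTuple

# Illustrative rules only; real scanners ship many more patterns.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private-key-block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "hardcoded-credential": re.compile(
        r"(?:password|passwd|secret|api_key|token)\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


class Finding(NamedTuple):
    line_number: int  # 1-based position within the diff text
    rule: str


def scan_added_lines(diff_text):
    """Return findings for lines a unified diff adds (prefix '+')."""
    findings = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Skip removed/context lines and the '+++ b/file' header.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for rule, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append(Finding(number, rule))
    return findings
```

A check like this could run in CI and post its findings as a comment on the pull request, which puts the notification in front of the developer before the secret is merged.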

And giving developers visibility into the security health of their code gives them a way to track their progress removing and rotating secrets already in the code and watching the code health improve. On a tough day when nothing else is working, fixing a security issue can be the small win that helps pave the way to the next win. But all of this is only possible when you can see what’s in your code with the same clarity that hackers have when they dig through it.

tl;dr: The security landscape has been shifting for years, but 2022 has illuminated those changes in a way that few can deny. Private repos aren’t as private as they once were, and hackers are using the secrets they find in them to move quickly through adjacent systems. Effective processes and technology to mitigate the risk of code leaks are available now. The only challenge most companies face is in recognizing the problem.
