Tips for Building a Modern Security Engineering Organization
Over the past decade the world has fundamentally changed in a variety of ways, with huge implications for business. We’ve seen the rise of transformational new technologies such as cloud, mobile, and big data. When it comes to running a modern security engineering team that keeps your business secure, three changes have been particularly important.
For starters, things move a lot faster than they used to. Code that once took weeks or even months to deploy can now go into production almost instantaneously. Plus, we’ve got the added complexity of having more people with access to production systems than ever before as the responsibilities of development and operations teams merge. Last but not least, the cost of launching attacks has dropped significantly, making it a lot easier for hackers to target companies.
To adequately address these changes, today’s security engineering teams need to understand continuous deployment and DevOps. Not only that, they need to figure out ways to drive up attack costs to make themselves a harder target for attackers.
We’ve come a long way from the days of traditional waterfall, where deployment to production was often months or even years away. In my previous role at Etsy as Director of Security Engineering, we were pushing new code to production an average of 30 times a day. Additionally, we were constantly iterating in production using feature flags, ramp ups, and A/B testing. That has been a game changer for security, requiring everyone to adopt a completely new mindset.
In old deployment models like waterfall, security functioned as a blocker to the business, requiring sign-off before allowing anything to go into production. The shift to quicker deployment models is therefore often scary to security teams. It feels like code is now going to be flying out the door without any degree of control.
But here’s the thing. The control we thought we had was really just an illusion. Here’s why.
If you’ve ever lived through waterfall development methodologies or out-of-band patches, then you know how painful it can be when an emergency comes up. Whether it’s because of a security issue, a performance issue, or just a general bug fix, shipping any type of fix, especially for an emergency, has traditionally been incredibly hard. Most organizations that only release every 18 months just aren’t designed to rush something out the door in a matter of days or even weeks. With continuous deployment, by contrast, there’s no such thing as an out-of-band patch. An “emergency fix” is just one of the dozens of deployments that are already going to happen that day.
What makes continuous deployment safe?
In a word, safety comes from “visibility.” Over the past five years, DevOps teams have been focused on increasing visibility and awareness to facilitate informed decision-making. Although security is a few years behind the curve here, we’re finally headed in that direction now, too.
To explain why, let me draw an analogy to aviation. Security at present is like piloting a plane without any instruments. Sure, you can fly, but when there are bumps along the way you have no idea if it’s because you’ve just hit some turbulence or because your engines are on fire. In other words, it’s like living in a binary world where things are either fine or they’re not, when of course it’s never really that black or white.
Thankfully, with the shift to DevOps and continuous deployment we have the opportunity to gain far greater visibility and awareness than ever before so that we can make better decisions. Of course, to ensure the kind of visibility and awareness you need, you’ve got to actively share information with other teams and organizations. One way of doing this is by embracing the cultural change that the shift to DevOps/continuous deployment often triggers.
Greater communication is key
With continuous deployment, you no longer kick your code over to QA for six weeks and then on to staging for 12 more. Instead, you perform code reviews and tests and then ultimately deploy it to production yourself. Removing the old organizational blockers dramatically increases speed.
For security engineering teams, this means that if you’re a roadblock to development, it’s now easy for developers to work around and actively avoid you. A big part of the solution is better communication. Here are some key lessons learned:
- Don’t be a jerk. This should be obvious, but empathy needs to be a core part of your security team’s culture. People should want to talk to security, so make sure that you’re hiring with that in mind. Especially important is empathy with operations and development teams. Understanding their daily battles and commiserating gives you credibility, making you more successful in the long run.
- Make realistic tradeoffs. Don’t fall into the trap of thinking every issue is critical. If you prioritize the ones that really matter and agree to not hold up the works for those that don’t, you’ll find that teams will be much more willing to engage with you.
- Explain impact clearly. Telling colleagues in another department that “if an attacker did X and Y, our user data would be compromised” paints a clear picture. Telling them that “the input validation in this function is weak” doesn’t. Remove the security language barrier by speaking in plain English.
- Reward people who communicate with your team. Believe it or not, t-shirts, gift cards and high fives all work (shockingly) well. Creating a culture where interacting with security is seen as a positive thing will dramatically pay off.
- Take the false positive hit yourself. Wherever possible, avoid sending unverified issues to engineering/operations teams. When issues are discovered or reported, have the security engineering team verify them and potentially even make the first attempt at a patch. When security sends loads of unverified issues to engineering teams that turn out to be false positives, engineering will rightfully ignore future communications from the security team, which is exactly what you want to avoid.
- Scale via team leads. Build relationships with technical leads from other teams, encouraging them to make security part of their team’s culture. This ensures that when new engineers join their respective teams, security is emphasized to them even without your direct involvement.
While it may sound trivial, the best thing you can do to help ensure the success of your security team is to promote better communication.
Widespread access needs to be managed
Most startups begin with a pretty simple access control policy: everyone gets access to everything. That’s particularly true as development and operations teams merge. Of course, as organizations grow and scale, this becomes increasingly problematic and pressure starts to mount to put some policies and regulations around who can access what.
The key to getting it right is to avoid knee-jerk reactions that take capabilities away from people who are just trying to do their jobs. Instead, focus on building safe ways to perform needed job functions by taking the following steps:
- Figure out what the underlying function or capability is that your colleagues need. What is it that they require to get their job done?
- Create an alternative, safe way for them to perform the function or capability.
- Transition your entire organization over to the new, safer way of doing things.
- Begin soft-failing the old system, setting up alerts to notify you of any usage of the old unsafe way of doing things so you can correct those instances.
A common example is an organization where a large percentage of developers have SSH access to production systems. In this case, the steps are:
- Determine why SSH access to production systems is needed. Often it’s because people need to view the application’s error or access logs to debug issues.
- Provide an alternative safe way to access that data via a central logging system like Splunk, ELK, etc.
- Publicize the new alternative way to access the data.
- Begin alerting on SSH access to production systems so a reminder about the new approach can be sent.
- Restrict SSH access to only those who require it, e.g., sysops.
If you take this approach, everyone wins. Security doesn’t become a blocker by removing capabilities that people need to be effective; instead, people are given a safe way to perform the required tasks.
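The alerting step in the SSH example above might be sketched as follows. This is a minimal illustration, not a production detector: it assumes your hosts log successful logins in the standard OpenSSH `Accepted ... for USER from IP port N` form, and the allowlist, the sample log lines, and the reminder wording are all hypothetical.

```python
import re

# Matches the line OpenSSH's sshd writes for a successful login.
ACCEPTED = re.compile(
    r"Accepted (?P<method>\S+) for (?P<user>\S+) from (?P<ip>\S+) port \d+"
)

# Hypothetical allowlist: accounts that still legitimately need SSH.
ALLOWED_USERS = {"sysops"}


def find_unexpected_logins(log_lines, allowed_users=ALLOWED_USERS):
    """Return (user, ip) pairs for successful logins by non-allowlisted users."""
    unexpected = []
    for line in log_lines:
        match = ACCEPTED.search(line)
        if match and match.group("user") not in allowed_users:
            unexpected.append((match.group("user"), match.group("ip")))
    return unexpected


def reminder_for(user):
    """Build a friendly nudge rather than a block (soft-fail)."""
    return (
        f"Hi {user}, we noticed an SSH login to production. "
        "Application logs are now available in the central logging system."
    )


# Hypothetical sample data for illustration only.
SAMPLE_LINES = [
    "Jan 10 09:12:01 web1 sshd[2211]: Accepted publickey for alice from 10.0.0.5 port 52144 ssh2",
    "Jan 10 09:13:44 web1 sshd[2214]: Accepted publickey for sysops from 10.0.0.9 port 40022 ssh2",
    "Jan 10 09:15:02 web1 sshd[2219]: Failed password for bob from 10.0.0.7 port 50100 ssh2",
]

if __name__ == "__main__":
    for user, ip in find_unexpected_logins(SAMPLE_LINES):
        print(f"{ip}: {reminder_for(user)}")
```

Nothing here blocks the login; the point of the soft-fail phase is to learn who still depends on the old path and steer them to the new one before access is restricted.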
Increase the cost of attack
Although it has become cheaper and easier to conduct attacks, there are several ways to use this to your advantage as a defender. Some of the most effective approaches are to run realistic attack simulations against your organization, have a disclosure policy, and potentially even run a bug bounty program. The goals of these sorts of programs are to:
- Incent people to report issues to you
- Drive up the costs of vulnerability discovery and exploitation
- Provide external validation of where your security program is and isn’t working
If you’re worried about budgetary concerns, money is rarely the main motivation for researchers reporting issues (although it certainly helps!). Similarly, if you’re concerned about inviting attacks, the fact is that if you’re on the Internet you already get a free penetration assessment every single day; you just don’t receive the report.
Before launching a disclosure program or a bounty, one of the most effective things you can do is take note of which vulnerability classes you expect to see and which ones you don’t. You can then compare your expectations against the issues that actually wind up getting reported to provide extremely useful data on where your security program is working well and where it needs adjustment and iteration.
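One way to make that comparison concrete is a small tally like the sketch below. The function name, the class labels, and the three output buckets are assumptions made for illustration; how you interpret each bucket depends on your own program.

```python
from collections import Counter


def compare_expectations(expected, reported):
    """Compare predicted vulnerability classes with the classes actually reported.

    expected: set of class names you predicted before launch.
    reported: list with one class name per report received.
    """
    counts = Counter(reported)
    return {
        # Predicted and reported: your model of your weaknesses held up.
        "as_predicted": {c: n for c, n in counts.items() if c in expected},
        # Reported but not predicted: a possible blind spot worth investigating.
        "surprises": {c: n for c, n in counts.items() if c not in expected},
        # Predicted but never reported: controls may be working,
        # or nobody has looked there yet.
        "never_reported": sorted(expected - set(counts)),
    }


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    expected = {"xss", "csrf"}
    reported = ["xss", "xss", "ssrf"]
    for bucket, value in compare_expectations(expected, reported).items():
        print(bucket, value)
```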
The shift to DevOps and continuous deployment often feels scary to security teams because it represents such a significant departure from the way we’ve approached security in the past. However, instead of reducing security, this transition actually affords us a unique opportunity to fundamentally shift the position of security from being a blocker to enabling greater business velocity.