Conducting Architecture Reviews in Light of the New TSA Directives

TSA, the sector-specific agency for pipelines, released its first directive to the pipeline industry on May 27th and followed up with a second directive on July 20th. Although the second directive was announced, its particulars have not been made public. This has caused some confusion, especially for asset owners, who are struggling to understand the new directives and move toward meeting them. What we do know, from the TSA blog post, is that there are three high-level requirements:

Implement specific mitigation measures to protect against ransomware attacks and other known threats to information technology and operational technology systems (See Dragos whitepaper, Ransomware in ICS Environments).
Develop and implement a cybersecurity contingency and recovery plan (See Dragos whitepaper, Preparing for Incident Handling and Response in ICS).
Conduct a cybersecurity architecture design review and technical assessment.

This blog post centers on that last requirement to “[c]onduct a cybersecurity architecture design review and technical assessment.” The process that TSA calls out in the May 27th directive is a Validated Architecture Design Review (VADR). Dragos performs reviews aligned with VADR regularly, and they are often done early in a client’s cybersecurity journey. Some common findings from our architecture reviews, as reported in our Year in Review, include:

Lack of network segmentation,
Improper account management, and
Limited or no visibility within the OT environment.

Network Segmentation

Network segmentation is not as simple as just installing a firewall between your IT and OT environments. While that one piece of technology can help an organization protect that specific network boundary, the process an organization goes through to understand its environment and make risk-based decisions on how to properly segment its networks is much more involved. Before an organization installs any new equipment or creates any new segmentation rules, it should first be able to answer these questions:

What devices are supposed to be connected to the network?
How are they supposed to be connected to the network?
What devices are supposed to be talking on the network?
Why are those devices supposed to be talking?

These questions sound very basic, but we often find that organizations have outdated, incorrect, or incomplete information that makes these questions difficult to answer. The first question accounts for the organization having a reasonably good asset inventory. The second question relates to the organization having up-to-date network architecture diagrams and network configurations. The third and fourth questions both relate to the organization understanding their data flows and how those data flows relate to business needs.

These four questions are not the only ones an organization needs to ask itself, but sufficiently complete answers to them help it understand its systems before it works out how to restrict network traffic to the minimum needed to meet the business needs. Once it understands what, how, and why systems are communicating on the OT networks, an organization can apply network segmentation devices and rulesets to restrict network communications and segment its network zones. These zones improve the overall reliability of OT systems and reduce their attack surface.
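As a minimal sketch of how answers to these questions can feed a ruleset, the Python below treats each documented data flow as a candidate allow rule only if a business reason is recorded for it, and then compares observed traffic against that allowlist. The `Flow` structure, the asset names, the ports, and the input format are all assumptions made for illustration; they do not come from the TSA directives or from any particular firewall product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One network conversation: who talks to whom, and on which port."""
    src: str
    dst: str
    port: int
    protocol: str = "tcp"


def build_allowlist(documented_flows):
    """Keep only flows that have a stated business reason.

    documented_flows maps Flow -> reason string (a hypothetical format).
    """
    return {flow for flow, reason in documented_flows.items() if reason.strip()}


def review_observed(observed_flows, allowlist):
    """Split observed traffic into allowed flows and flows needing review."""
    allowed = [f for f in observed_flows if f in allowlist]
    unexplained = [f for f in observed_flows if f not in allowlist]
    return allowed, unexplained


# Hypothetical example data; asset names and ports are illustrative only.
documented = {
    Flow("hmi-01", "plc-01", 502): "Operator control of compressor station",
    Flow("historian", "plc-01", 502): "Process data collection",
    Flow("eng-ws", "plc-01", 44818): "",  # no business reason recorded yet
}
observed = [
    Flow("hmi-01", "plc-01", 502),
    Flow("eng-ws", "plc-01", 44818),
    Flow("office-pc", "plc-01", 502),
]

allowlist = build_allowlist(documented)
allowed, unexplained = review_observed(observed, allowlist)
for flow in unexplained:
    print("Needs review before segmentation rules are written:", flow)
```

Flows that land in the unexplained list are not necessarily malicious. They are the conversations for which the "why" question has no recorded answer yet, and that is the gap to close before rules are enforced.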

Account Management

Another common finding in our assessments and architecture reviews is improper account management for both human users and automated processes or services within the OT environment. Understanding how systems use those accounts, as well as how the accounts are established, modified, and removed, is an important process for an organization. As with network segmentation, a small number of questions can help an organization mature its understanding of account management within its OT environment:

Which systems utilize accounts?
How does the organization centrally manage accounts?
For systems that utilize accounts that can’t use centralized management, how does the organization manage those accounts?
Does the account management process interact with the HR and vendor/contractor management systems?

The first question relates to asset inventory, although it delves one layer deeper than the first question in the network segmentation section. Not only does the organization need to know the devices and systems that connect to its networks, but it also needs to understand how it needs to interact with those devices and systems. Also, when looking at which systems utilize accounts, it is important to consider normal operating conditions as well as abnormal conditions, such as initial development, installation, startup, shutdown, and maintenance. Accounts used during these times may be overlooked during a quick review of device and system accounts.

The second and third questions relate to how the organization manages the accounts it has within its OT environment. Central management of accounts can be a benefit if properly configured and a liability if improperly configured. Many systems within the OT environment still use local accounts, even those that may also have the capability to use centralized management. This will probably not change for decades, so an organization needs to understand how it manages those devices and systems.

The fourth question relates to how an organization administers accounts and if/how they link that to personnel management for those that have access to the OT environment. Since OT asset owners often rely on vendors or contractors to perform certain operations within the environment, it’s also important to consider the ways in which their processes interact with those of the asset owner.
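The four account questions can be turned into a simple reconciliation pass, sketched below in Python. It assumes the organization can export an account list per system and a roster of active employees, contractors, and service owners. The `Account` fields, the finding categories, and the sample data are hypothetical and chosen only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    system: str               # device or application holding the account
    name: str                 # account name on that system
    owner: Optional[str]      # responsible person or team; None if unknown
    centrally_managed: bool   # False for local-only accounts


def review_accounts(accounts, active_people):
    """Group accounts by the kind of follow-up they need."""
    findings = {"orphaned": [], "stale_owner": [], "local_only": []}
    for acct in accounts:
        label = f"{acct.system}/{acct.name}"
        if acct.owner is None:
            # Nobody is recorded as responsible for this account.
            findings["orphaned"].append(label)
        elif acct.owner not in active_people:
            # Owner no longer appears in the HR or vendor roster.
            findings["stale_owner"].append(label)
        if not acct.centrally_managed:
            # Needs a documented manual process for changes and removal.
            findings["local_only"].append(label)
    return findings


# Hypothetical example data.
accounts = [
    Account("hmi-01", "operator", "jsmith", True),
    Account("plc-01", "admin", None, False),
    Account("eng-ws", "vendor_maint", "acme-tech-7", False),
    Account("historian", "svc_collect", "ot-team", True),
]
active_people = {"jsmith", "ot-team"}

findings = review_accounts(accounts, active_people)
for category, labels in findings.items():
    print(category, labels)
```

An account can appear in more than one category. A local vendor account whose owner has left the roster, for example, needs both an ownership decision and a manual removal process.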

Visibility in the OT Environment

Lastly, establishing visibility in the OT environment is a key aspect of maintaining the overall security and reliability of the systems and networks. While this may seem like a self-serving finding for a vendor of OT monitoring technology, the concept of visibility is much more than any single thing. It really has to do with the realization that even the best defenses can fail. To be prepared for that possibility, it is important to understand what kind of information can be collected and to monitor that information to ensure that the devices and systems are operating properly. This has sometimes been called an “assume breach” model, but it can benefit normal operations as much as it helps in catching an attacker. To help an organization understand its visibility into the OT environment, some initial questions it should ask include:

Which devices and systems have host-based event and log information?
How can that host-based event and log information be obtained?
How and where should network-based monitoring be obtained?
Who needs to view and respond to the information?

The first and second questions relate to the realization that many OT systems may not have common event and log management systems. It is important for an organization to understand what they have and how they can get access to it, especially if they need to respond to an event or incident. A corollary to this is understanding how long that information is kept, since some systems may age it out or overwrite it on a periodic basis.
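One way to record the answers to the first two questions, including the retention corollary, is a small inventory of log sources that can be checked for gaps. The Python below is a sketch under stated assumptions: the `LogSource` fields, the collection method labels, and the 30-day requirement are illustrative values, not figures from the TSA directives.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogSource:
    asset: str
    has_host_logs: bool
    collection_method: Optional[str]  # how logs are obtained; None if unknown
    retention_days: Optional[int]     # how long the asset keeps logs; None if unknown


def visibility_gaps(sources, required_days):
    """Return a mapping of asset -> reason it falls short on host visibility."""
    gaps = {}
    for s in sources:
        if not s.has_host_logs:
            gaps[s.asset] = "no host-based logs"
        elif s.collection_method is None:
            gaps[s.asset] = "logs exist but no collection method is documented"
        elif s.retention_days is None:
            gaps[s.asset] = "retention period unknown"
        elif s.retention_days < required_days:
            gaps[s.asset] = (
                f"retention of {s.retention_days} days is below "
                f"the required {required_days} days"
            )
    return gaps


# Hypothetical example data; the 30-day requirement is an assumption.
sources = [
    LogSource("hmi-01", True, "syslog forwarding", 90),
    LogSource("plc-01", False, None, None),
    LogSource("eng-ws", True, None, 30),
    LogSource("historian", True, "agent", 14),
]

gaps = visibility_gaps(sources, required_days=30)
for asset, reason in gaps.items():
    print(asset, "->", reason)
```

Assets reported with no host-based logs are the ones where network-based monitoring has to carry the load.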

Since many OT systems may not have any event or log information, may be legacy, or may be considered riskier to the process, it is also important to consider how the devices and systems communicate on the network. Monitoring everywhere may not be possible, so an organization will need to pick key locations around the network based on the devices and systems involved and the risks to the process. One helpful guide for getting this process started is the “Collection Management Framework”.
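To make the idea of picking key locations concrete, the sketch below uses a simple greedy heuristic: at each step it selects the candidate monitoring point that adds the most not-yet-covered risk. This is not the Collection Management Framework itself, only one illustrative way to rank locations. The location names, the asset names, the risk scores, and the budget are all hypothetical inputs an organization would have to supply from its own inventory and risk assessment.

```python
def choose_monitoring_points(candidates, risk, budget):
    """Greedily pick up to `budget` locations that cover the most risk.

    candidates: dict of location -> set of assets whose traffic is visible there
    risk:       dict of asset -> numeric risk score (higher means more important)
    """
    chosen = []
    covered = set()
    for _ in range(budget):
        best, best_gain = None, 0
        for loc in sorted(candidates):  # sorted for a deterministic tie-break
            if loc in chosen:
                continue
            # Only count assets that no chosen location already covers.
            gain = sum(risk.get(a, 0) for a in candidates[loc] - covered)
            if gain > best_gain:
                best, best_gain = loc, gain
        if best is None:  # nothing left adds coverage
            break
        chosen.append(best)
        covered |= candidates[best]
    return chosen, covered


# Hypothetical example data.
candidates = {
    "core_switch": {"hmi-01", "plc-01", "historian"},
    "dmz_firewall": {"historian", "jump-host"},
    "field_switch": {"plc-01", "plc-02"},
}
risk = {"plc-01": 5, "plc-02": 5, "hmi-01": 3, "historian": 2, "jump-host": 4}

chosen, covered = choose_monitoring_points(candidates, risk, budget=2)
uncovered = set(risk) - covered
print("Monitor at:", chosen)
print("Still not covered:", sorted(uncovered))
```

A greedy choice like this is easy to explain and audit, but it is a heuristic. The uncovered set it reports is a useful prompt for deciding whether the budget, the candidate locations, or the accepted risk needs to change.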

Once an organization collects all its information, it needs to make use of it somehow. It will need to determine who needs to get access to that information, how they obtain access, and how they deal with that information. There may be many things in the OT environment that trigger alerts in common monitoring technologies, especially those that are designed for IT. Understanding how the information collected translates into potential issues for the OT processes takes monitoring systems that can make sense of it in an OT context, personnel that can properly understand the potential process impacts, and processes in place to respond appropriately without overburdening staff.

In Summary

Even though TSA is restricting access to the guidelines on a need-to-know basis, there are things that pipeline operators can do to prepare without knowing all the specifics. In addition, much of what an organization needs to do doesn’t require acquiring new technology. Much of it can be accomplished by understanding:

Which devices, systems, and networks do they have?
How can they configure those devices, systems, and networks?
Who needs access to those devices, systems, and networks?
How do the devices, systems, networks, and personnel relate to the process?

For more information on how Dragos can help your organization to understand and align with the latest TSA directives, see the related resources from this article, or reach out to one of our experts.