Chapter 6

Securing the Smart Grid

Information in this chapter:

• Implementing security control within Smart Grid endpoints

• Establishing strong boundaries and zone separation

• Protecting data and applications within the Smart Grid

• Situational awareness

• Use case: defending against Shamoon

The consequences of a cyber attack against the Smart Grid range from espionage to sabotage, and from petty theft to larger privacy concerns. One thing is certain: there is considerable risk of digital foul play. So, how can the Smart Grid be protected against this risk of cyber attack? By understanding Smart Grid architecture (see Chapter 2, “Smart Grid Network Architecture”), attack methodologies (see Chapter 3, “Hacking the Smart Grid”), and a basic cyber security model (see Chapter 5, “Security Models for SCADA, ICS, and Smart Grid”), we can start examining where and how to implement specific countermeasures. In this chapter, point security products and technologies are discussed for all areas of the Smart Grid architecture, from generation to metering and everything in between. While no one product or technology is certain to stop all attacks, using them together in a defense-in-depth posture across all areas of the Smart Grid can greatly reduce the risk of a successful cyber attack. This approach creates multiple layers of protection that enable the architecture to remain resilient even when a small number of security defenses are breached, effectively creating a sort of fault-tolerant security environment.

However, regardless of the degree of endpoint, network, and data security that is established, vulnerabilities will remain, as new exploits and more sophisticated, blended attacks continue to arise. It is therefore necessary, above all else, to establish appropriate monitoring of all Smart Grid systems to obtain situational awareness. Today’s log and event analysis tools are capable of looking at the bigger picture: collecting events from implemented security countermeasures and comparing them to network activity, user activity, application activity, and even global threat activity. By looking at the entirety of a system’s digital behaviors, areas of risk can be identified, blended attacks can be detected, and in many cases suspicious or dangerous trends can be recognized. The importance of establishing situational awareness cannot be overstated, given the sheer volume of relevant information, both within each discrete domain and zone and within the larger conglomerate system that is “the Smart Grid.”

Implementing security control within Smart Grid endpoints

There are several methods of securing a device or endpoint against a cyber attack. Common technology-based methods include the following:

• Access control/data access control.

• Anti-virus.

• Application whitelisting or dynamic whitelisting.

• Change control or configuration control.

• Database security.

• Endpoint encryption.

• Host data loss prevention (DLP).

• Host firewall.

• Host intrusion detection systems/host intrusion prevention systems (HIDS/HIPS).

• System hardening.

How and where these controls are implemented will vary based upon the endpoints that require protection. We can generalize this to a degree by zone: for example, field zones typically consist of embedded or fixed-function devices, which may not be accessible to an end user for the installation of new software. For this reason, it may be more appropriate to focus security controls on the conduit(s) that connect to these field zones. However, the protection of field devices remains an important challenge that needs to be addressed by the industry as a whole. This generalized approach is shown in Figure 6.1, which illustrates specific endpoint security technologies and where they are suitable within the Smart Grid reference architecture.

Figure 6.1 Applying endpoint security controls to the Smart Grid reference model.

Field zone protection

Field devices are often embedded systems, usually designed around low cost and low power consumption. These devices therefore typically require security countermeasures that consume less memory, utilize fewer CPU resources, and have a smaller footprint. While the protection of these devices is paramount, the Smart Grid owner or operator will (typically) be unable to alter these devices or install any commercially available cyber security countermeasure. Rather, the onus falls to the device manufacturer to build cyber security into these devices within the factory. Many device vendors are starting to investigate (and even implement) certain controls as part of their product development cycle. This is a positive trend, as the protection of the field devices would go a long way toward removing the inherent vulnerabilities that are present in most control environments. In addition, many commercially available technologies are highly applicable to field device protection. Consider application whitelisting, one of the more popular software security solutions for embedded devices. Unlike blacklist technology, which defines a rapidly growing list of what is bad, whitelisting defines a finite list of known good applications and blocks everything else. “Applications” here refers to executables, including DLLs and other system functions, which makes whitelisting extremely difficult for malware to circumvent. The biggest advantage in an embedded system is, of course, that the whitelist rarely changes, minimizing or even eliminating the need to update the security profile on the embedded device (anti-virus, in contrast, must be patched frequently, with over 8 million new malware samples collected in Q2 of 2012, and over 90 million unique malware samples known to date in total.1).
Application whitelisting is highly suitable for many embedded devices, which by their nature provide a fixed function: no application other than those that originate in the factory should ever be enabled on these devices (embedded devices using VxWorks or a similar real-time operating system are an exception due to the way “applications” function in these systems; however the concept of locking down authorized code and preventing the execution of unauthorized code still applies).
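The default-deny idea behind whitelisting can be sketched in a few lines. The example below is only an illustration, assuming that code is identified by a SHA-256 digest of the file contents; commercial whitelisting products use their own identification and enforcement mechanisms, and the function names here are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_allowlist(factory_images: list[Path]) -> frozenset[str]:
    """Record the digests of the known-good files shipped from the factory."""
    return frozenset(sha256_of(path) for path in factory_images)


def may_execute(path: Path, allowlist: frozenset[str]) -> bool:
    """Default deny: only files whose digest is on the list may run."""
    return sha256_of(path) in allowlist
```

Because the list is finite and tied to the factory image, it needs no signature updates; any file that is new, or that has been modified since the list was built, simply fails the check.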

Note

Too much, too little, too late?

Not a reference to the Johnny Mathis and Deniece Williams hit single from 1978, but to the even more difficult conundrum of balancing security risks and infrastructure spend. It is important to note that while many security measures are described herein, it is not necessary to implement them all. Rather, each measure taken will improve the overall security posture, both of individual systems and of the “Smart Grid” as a whole.

Ideally, every security control mentioned here would be implemented at every level. However, the reality is that in most systems, it will be difficult to justify this level of cyber security due to issues of cost and complexity. Smart Grid cyber security is further complicated because—as an infrastructure comprised of a conglomerate of different interconnected systems—there are different owners, different operators, and different stakeholders across these systems. So, the question remains a difficult one: how much security is enough? How much is too much? What is needed now, and what can wait?

The answer of course depends upon the specific drivers of the Smart Grid planners and operators. Depending upon where the Smart Grid is located, prescriptive controls may be applicable. In addition, a risk-based assessment of the Smart Grid should be performed to identify and prioritize those areas where cyber security controls should be implemented. Neither this book nor any other can describe the outcome of such an assessment; the assessment must be done. Only then can the most critical elements of a particular Smart Grid implementation be secured to the appropriate strength and priority using a repeatable risk management process.

However, it is important to remember that there are exceptions to all rules: some systems in the field, such as feature-rich gateways, may run on standard computing platforms using commercial operating systems, making it not only possible to install security software, but also much more important to do so.

It’s also important to watch the field device vendors closely, as more security countermeasures are being designed into these devices from the factory. Several device vendors, including the makers of the real-time operating systems (RTOS) that are often used on fixed-function devices, the industrial control vendors designing field assets, and even the chipset vendors who are designing the silicon used within these devices, are working with security vendors and researchers to implement a greater degree of protection. Many field device vendors have implemented secure coding and design practices, while others are adding compensatory features to limit vulnerabilities, protect against exploits, or provide secure communication capability. At the DHS’s 2012 Spring Industrial Control System Joint Working Group conference, RTOS maker Wind River and security research firm Wurldtech presented a joint initiative to embed vulnerability detection signatures directly into the RTOS, essentially building a host-based deep packet inspection capability into fixed-function devices.2 Another example is that of major ICS and electric utility vendor Siemens, which in summer 2012 released new enhanced communication processors for their Windows PCs and S7 line of PLCs, providing an effective method of point-to-point authentication between command and control devices. Other vendors are supporting this positive trend as well, and the more security is built into these devices by design, the less the onus falls on the end customer to secure the devices after the fact (if that is even possible at all).

Control zone protection

As shown in Figure 6.1, as we move further in from the field and into the substations, we see more sophisticated devices: SCADA servers, measurement and data management servers, AMI headends, and similar server-based systems. These devices are fully owned and operated by the end user and thus any security countermeasure—once fully vetted and tested by the vendor—can theoretically be installed. Again, however, there are certain technologies that fit better than others.

• Application whitelisting again solves the patching and resource problems associated with traditional anti-virus and is therefore recommended on all systems.

• Anti-virus is useful as well, although on many servers where application whitelisting technology is used, anti-virus serves more as an auditor of the whitelisting solution than as an active security measure, provided the whitelisting application is capable of detecting both memory- and filesystem-based malware. Anti-virus scans should never detect malware on a whitelisted endpoint, so clean scan reports provide useful evidence that the whitelist is working.

• A method of controlling configurations and system changes (often referred to as change control, change management, or configuration management) provides added protection by preventing the secure state of an endpoint, obtained through the use of the other countermeasures listed here, from being changed or manipulated back into an untrusted state. Some solutions will detect changes and report them to a security information and event management (SIEM) system, while others may actively prevent changes.

• Above and beyond the protection offered by the third-party security products discussed so far, full system hardening is recommended: remove all unnecessary applications and services from the host. Separation of services (either onto dedicated hardware or onto individual virtual machines if virtual data centers are utilized) is also recommended to prevent cross-contamination should a system become compromised. One of the fundamental principles used in hardened industrial systems is that of “least privilege,” which denies or removes everything except that which is specifically required to accomplish a given function.

• Host IDS or host IPS is also recommended. While signature-based host detection may still be challenging because of the same patching issues that anti-virus faces, the class of server typically found in these zones provides the greater horsepower needed to perform deep packet inspection, detect inbound anomalies, and apply other techniques common to HIDS and HIPS platforms, making them useful beyond “typical” DPI-based detection.

• Host data loss prevention (DLP) enables sensitive data to be tagged and monitored, so that attempts to exfiltrate these data via an application, via the network or by removable media can be alerted and/or blocked. This is important considering the risk that data within the substation can present: almost any information extracted from a SCADA system, for example, can be used as reconnaissance toward a larger threat.

• Event logging, though not a security control per se, is also an important consideration. Any information that a host can provide (system utilization, network activity, security events, authentication activity, etc.) can be used to great benefit by centralized monitoring tools such as security information and event management (SIEM) systems. The rule of thumb: if there’s any chance that a given piece of information may be relevant, log it.
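The change-detection and event-logging items in the list above can be illustrated together. The following is a minimal sketch, assuming that monitored configuration items are compared by SHA-256 digest against a trusted baseline and that each difference is emitted as a JSON log line for a central collector; the field names and the `config_change` event type are illustrative assumptions, not any SIEM product’s schema.

```python
import hashlib
import json
from datetime import datetime, timezone


def snapshot(configs: dict[str, bytes]) -> dict[str, str]:
    """Map each monitored item name to the SHA-256 digest of its content."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in configs.items()}


def detect_changes(baseline: dict[str, str], current: dict[str, str]) -> list[dict]:
    """Compare two snapshots and describe every added, removed, or modified item."""
    events = []
    for name in sorted(baseline.keys() | current.keys()):
        before, after = baseline.get(name), current.get(name)
        if before == after:
            continue
        if before is None:
            action = "added"
        elif after is None:
            action = "removed"
        else:
            action = "modified"
        events.append({"item": name, "action": action})
    return events


def to_log_line(event: dict, host: str) -> str:
    """Serialize one change event as a JSON line (hypothetical field names)."""
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "host": host,
        "type": "config_change",
        **event,
    }
    return json.dumps(record, sort_keys=True)
```

This sketch only detects and reports; a tool that actively prevents changes would need to hook the operating system, which is beyond what a short example can show.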

Caution

Some more advanced application whitelisting systems may also be able to prevent new data from being written directly into memory, which is useful for protection against memory attacks such as buffer overflows. As is typical in a Smart Grid, however, additional caution may need to be taken when implementing these types of advanced cyber security controls. Some Smart Grid communications may utilize application layer protocols that write directly to memory by design, and the prevention of this activity could cause a communication failure. Similarly, if software is utilized to prevent changes, extra consideration should be given to intentional changes in project files or other file structures that may be a necessary function of some systems. Remember that within the SCADA and control environments, availability is the primary objective, and any software that could potentially impact operations should always be fully tested and vetted by both the end user and the vendor of the system or server being protected.

Service zone protection and back-office systems

As we get to centralized SCADA systems, historians, data concentrators, and back-office systems such as billing and customer management systems, the capabilities of the servers increase, as does the value of the data created by (or utilized by) these systems. In these cases, the same security countermeasures utilized in the control zone apply. However, there is an increased reliance upon the integrity of data and less on availability since we are moving away from real-time control to more transaction-based information. The increased volume of information (as a result of centralization) also indicates the use of supporting database(s) to store and manage that data. It is therefore important to give additional consideration to data integrity and information assurance tools, including database security solutions, database auditing, and data loss prevention tools in these areas.

Establishing strong boundaries and zone separation

The term “boundary” or “perimeter” is misleading, as it implies that there is a single, specific thing that needs to be secured. In fact, a “boundary” is likely to consist of many network connections between many devices: the “wires” within the “conduit.” Ideally, there would be only one such physical connection between systems (the ISA-99 “conduit”), containing all the necessary communication paths (the “wires”). In this way, the conduit can be easily demarcated on both ends with a security gateway (a VPN, firewall, intrusion detection system, etc.). In a system as complex as a Smart Grid, however, that ideal is difficult to achieve: there are so many interconnected systems that the proverbial “perimeter” blurs, making network security a much greater challenge. There may be multiple wires and multiple conduits, enough that managing and controlling them all can be extremely difficult. It is therefore extremely important that any excess connectivity (any extraneous interfaces, ports, or services) be eliminated so that (a) the legitimate connections can be sufficiently protected, and (b) those protections cannot be inadvertently circumvented via an unintentional “back door.” Figure 6.2 illustrates specific network security technologies and where they are suitable within the Smart Grid Reference Architecture.
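Finding excess connectivity amounts to comparing what is actually reachable with what has been authorized. The sketch below assumes both are available as sets of (host, transport, port) tuples, for example from a scan and from a conduit design document; the host names and the function name are hypothetical.

```python
from collections import defaultdict

# One reachable service: (host, transport, port). Host names are hypothetical.
Service = tuple[str, str, int]


def find_excess_services(
    observed: set[Service], authorized: set[Service]
) -> dict[str, list[tuple[str, int]]]:
    """Group every observed but unauthorized service by host."""
    excess = defaultdict(list)
    for host, transport, port in sorted(observed - authorized):
        excess[host].append((transport, port))
    return dict(excess)


authorized = {("rtu-7", "tcp", 20000), ("hmi-1", "tcp", 443)}
observed = authorized | {("rtu-7", "tcp", 23), ("hmi-1", "udp", 161)}
print(find_excess_services(observed, authorized))
```

Each entry in the result is a candidate “back door” to be either removed or explicitly added to the authorized set and protected.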

Figure 6.2 Applying network security controls to the Smart Grid reference model.

The problem is that there is a need for strong network security, including encryption and authentication, between critical systems, but only a small subset of substation, control room, data center, and field devices support this natively. To further complicate matters, commercially available TLS isn’t always the best solution, because the added overhead may interfere with real-time communication (Layer 2 encryption makes more sense here, although TLS may be required by those intending to comply fully with IEC 62351). Even then, encryption can put blinders on security monitoring and situational awareness tools. The solution, therefore, lies in careful compromises and in compensatory measures.

One such compromise is of risk vs. visibility: if the network connection is critical, it should be secured as strongly as possible, using encryption and authentication. A strong network VPN or security gateway is a good choice here. However, if traffic is encrypted in transit, make sure that the traffic can still be monitored at either end. This will prevent a compromised device from gaining free rein to transmit exploits and malware across a connection that is “invisible” to security tools.

Compensating controls

Compensating network security controls have been around for a long time, and include devices such as firewalls, intrusion detection and prevention systems, network access control systems, and similar devices. Firewalls, either host- or network-based, should be used liberally to filter out unauthorized network connections wherever possible. However, authorized traffic can also be used to exploit a system, so firewalls should always be supplemented with some sort of monitoring or inspection technology as well:

• Industrial protocol filters monitor industrial protocols such as Modbus, DNP3, IEC 61850, and others, and filter traffic based upon the protocol being used (e.g. disallow Modbus traffic, allow DNP3), or the content of the protocol (e.g. allow DNP3 “read” commands, disallow DNP3 “writes”). Note that simple port filtering using a firewall can filter disallowed protocols as well. However, unless the firewall is content aware (i.e. it can decode the protocol), it will not be able to detect traffic masquerading under a spoofed TCP or UDP port. As of the time of publishing, only a very few devices support ICS protocol content inspection.

• Intrusion detection systems perform deep packet inspection (DPI) on network traffic and check it against a defined set of exploit or vulnerability signatures (a “rule set”). An IDS is a passive device: it may be deployed inline or on a mirror or span port, but it will not block traffic. It will only alert if and when a signature match is detected.

• Intrusion prevention systems function just like an IDS, except that they are deployed inline and can actively block traffic, often by dropping the offending packet or by resetting the TCP session.

• Application content inspection systems perform a hybrid function: like an IDS, they inspect and decode the contents of a packet (or of a series of packets within a given application session), and they analyze those contents much as an industrial protocol filter does. Application content inspection may be provided by a dedicated device such as a network DLP appliance, or it may be a supported feature of newer application-aware firewalls.

• Transport layer security (TLS) is increasingly supported by devices intended for use in substation automation, thanks to IEC 61850 and IEC 62351. However, TLS can also be established as a compensating measure around those devices that do not comply with the TLS requirements of IEC 62351. Network-based encryption (usually via a VPN appliance) also provides the benefit of facilitating lower-cost network inspection. That is, data across otherwise vulnerable communication paths can be encrypted (to protect against eavesdropping, replay attacks, man-in-the-middle attacks, etc.), while the decrypted traffic within the secure network can still be inspected using IDS or IPS technology to ensure the integrity of the communication (to protect against network-based exploits). Data integrity is also provided as a standard part of the encryption process, preventing potential manipulation of data in transit (effectively preventing “garbage” from being introduced into an encrypted payload).
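The difference between port filtering and content-aware protocol filtering can be made concrete with a short sketch. The list above uses DNP3 reads and writes as its example; the code below applies the same idea to Modbus/TCP instead, because its framing is simpler: a 7-byte MBAP header followed by a one-byte function code. The read and write groupings cover only the most common public function codes, and the default-deny policy is an assumption made for illustration.

```python
import struct

# Common Modbus function codes: 0x01-0x04 read coils/inputs/registers,
# 0x05, 0x06, 0x0F, 0x10 write coils/registers.
READ_FUNCTIONS = {0x01, 0x02, 0x03, 0x04}
WRITE_FUNCTIONS = {0x05, 0x06, 0x0F, 0x10}


def modbus_function_code(frame: bytes):
    """Return the function code of a well-formed Modbus/TCP frame, else None."""
    if len(frame) < 8:
        return None
    # MBAP header: transaction id, protocol id, length, unit id.
    _transaction, protocol_id, length, _unit = struct.unpack(">HHHB", frame[:7])
    # Protocol id is always 0; length counts every byte after the length field.
    if protocol_id != 0 or length != len(frame) - 6:
        return None
    return frame[7]


def allow_frame(frame: bytes) -> bool:
    """Default deny: pass read requests only; drop writes and anything undecodable."""
    return modbus_function_code(frame) in READ_FUNCTIONS


read_registers = bytes.fromhex("00010000000601030000000000a"[:24])
write_register = bytes.fromhex("0002000000060106000100ff")
not_modbus = b"GET / HTTP/1.1\r\n"  # other traffic sent to TCP port 502
```

A port-based rule would pass all three payloads if they arrived on TCP port 502; the decoder passes only the read request, because the write is rejected by policy and the HTTP text fails the header checks.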

It’s important to remember that, without costly hardware tools, encrypted traffic is extremely difficult to inspect. Therefore, if a communication path is encrypted using a network appliance or other tool, the traffic should be inspected before encryption (i.e. within the “safe” side of the demarcation). If the encryption occurs at the host, the host itself must be secured to ensure the integrity of the sessions it participates in; otherwise an infected endpoint could authenticate and send encrypted exploits without risk of detection. Application whitelisting is a good fit here, as it offers the best degree of protection against malware at the endpoint.

Of course, it won’t be possible to protect all connections, especially in remote areas. It’s also not always possible to implement just any commercial off-the-shelf (COTS) security tool. Many substations, for example, present extreme temperature conditions and high levels of electromagnetic interference (EMI). Space may be limited, or other physical or environmental conditions might prevent installation. Luckily, many network security tools today can be virtualized. This allows the firewall, IDS, IPS, and/or other tools to be installed as software upon an industrial computing system, ideally one that already exists and is implemented in the target area. While not as strong a solution as a dedicated, purpose-built appliance, virtualization offers more flexible deployment options and (typically) lower cost. This is an important consideration as you attempt to implement network security deeper into the grid, where it may not be cost-effective to deploy large numbers of dedicated hardware appliances but where virtual appliances may be easily deployed. Note that virtualized systems should not be allowed to violate established zone separation. Virtual machines should be assessed as if they were physical devices, and virtual connections should be assessed as if they were “normal” network connections.

Advanced network monitoring

A monitored network connection can be used to aid security in several ways. Often, anomalous behavior can indicate that some sort of cyber attack is underway. In other scenarios, anomalous behavior may be an indication that a breach has occurred and additional infection stages are in process. Network behavior and anomaly detection (NBAD) tools are devices dedicated to this type of analysis and detection, and can often block suspect traffic as well, like an IPS. Similarly, by feeding network flow information to a SIEM or log management tool, anomalies may be detected after the fact. One advantage of this method is that multiple flows can be compared, against each other and against other security events, to detect more complex threats such as multi-vector attacks, low-and-slow attacks, and blended attack scenarios.
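As a rough illustration of flow-based anomaly detection, the sketch below compares recent per-interval byte counts for a single flow against a historical baseline and flags values that deviate by more than a chosen number of standard deviations. The z-score test and the default threshold of 3.0 are assumptions chosen for simplicity; NBAD products and SIEM correlation rules use considerably richer models.

```python
from statistics import mean, pstdev


def flag_anomalies(history: list[float], recent: list[float], threshold: float = 3.0) -> list[int]:
    """Return indices of recent samples that deviate strongly from the baseline."""
    baseline_mean = mean(history)
    baseline_std = pstdev(history)
    if baseline_std == 0:
        # A perfectly constant baseline: any different value is anomalous.
        return [i for i, value in enumerate(recent) if value != baseline_mean]
    return [
        i
        for i, value in enumerate(recent)
        if abs(value - baseline_mean) / baseline_std > threshold
    ]


# Bytes per polling interval between one master and one field device (made-up values).
history = [100, 102, 98, 101, 99]
recent = [100, 103, 150]
print(flag_anomalies(history, recent))
```

Control traffic tends to be regular and repetitive, which is what makes even a simple baseline like this informative; low-and-slow activity, by contrast, is designed to stay under such thresholds and usually needs correlation across many flows and longer time windows.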

Another useful tool for network monitoring and analysis is a network forensics tool such as Netwitness Investigator or Solera Networks’ DeepSee products. These tools capture all network traffic, storing it as a base of forensic evidence that is extremely useful to investigators should a breach occur. Network packet capture does require an investment in storage, but the return can be well worth it, especially if used by an advanced cyber security team.

Data loss prevention (DLP) is better known for securing financial information in the banking industry, or patient information in the medical industry, but is just as applicable to the Smart Grid. DLP provides detection of sensitive information—both at rest and in motion—and can prevent this information from being stolen. In terms of the Smart Grid, DLP has a less obvious role. However, just like in other critical industries such as Smart Oilfields used by modern oil and gas companies, Smart Grids rely on sensitive information that must be protected. DLP can prevent it from being transmitted outside of a protected zone via the network, and can even prevent sensitive data from being saved to a USB drive or other removable media, printed, or otherwise extracted. This information could include the following:

• Distributed measurements from PMUs, PDCs, the metering infrastructure, etc.

• Power consumption information or other data with potential privacy concerns, from meter data management systems, AMI headends, HEMS, and other devices.

• Customer finance information from billing and payment systems.

• Energy production, load, and shed information used by trading systems.

• Purchasing records or other information containing specific device models of gateways, PLCs, and other substation and field devices, which could be used to research and develop a custom threat.

As can be seen in Figure 6.2, not every control fits in every area, and the needs of a DLP solution for back-office systems will be very different from those intended for SCADA systems. Still, there are numerous areas where security may be implemented to separate, segment and control network activity between the numerous and diverse systems of the Smart Grid.
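The detection half of a DLP control can be sketched as a content scan over outbound text. The example below assumes two detectors: a bracketed classification tag (a hypothetical tagging convention) and payment card numbers, recognized by digit pattern and confirmed with the standard Luhn checksum to reduce false positives. Real DLP products support far more data types and also handle blocking, removable media, and printing.

```python
import re

# Hypothetical tagging convention, e.g. "[CLASSIFIED:GRID-OPS]".
TAG_PATTERN = re.compile(r"\[CLASSIFIED:[A-Z-]+\]")
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(digits: str) -> bool:
    """Standard Luhn checksum: double every second digit from the right."""
    total = 0
    for index, char in enumerate(reversed(digits)):
        value = int(char)
        if index % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def scan_payload(text: str) -> list[str]:
    """Return the kinds of sensitive content found in an outbound payload."""
    findings = []
    if TAG_PATTERN.search(text):
        findings.append("tagged_data")
    for match in CARD_CANDIDATE.finditer(text):
        if luhn_valid(re.sub(r"\D", "", match.group())):
            findings.append("payment_card")
            break
    return findings
```

An enforcement point would alert on or block any payload for which `scan_payload` returns a non-empty list before it leaves the protected zone.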

Protecting data and applications within the Smart Grid

Protecting the data and applications being used within the Smart Grid means understanding a lot about how the Smart Grid works, and is one of the reasons this book has focused so heavily on the importance of the interconnectedness of the grid. In a Smart Grid, data protection requires the following:

• Being aware of all the data and applications that are being used within the Smart Grid, including the following:

• Where automation logic resides, what it controls, and how.

• Where measurements are being taken, and how those measurements are being used.

• Where management systems—including SCADA, EMS, and other systems—reside, what they manage, and how.

• What business applications are being used, how they utilize or depend upon grid operations or measurement data, and how they obtain that data.

• Being aware of where repositories of data reside, and how they are stored (e.g. in a database).

• Being able to collect that information in a format that is relevant to digital cyber security, even if the data spans multiple domains or zones.

• Being able to analyze and assess that data in a meaningful manner, to detect indications of cyber risk and threat.

• Being able to articulate that analysis back to the many stakeholders involved in Smart Grid operations.

This requires a very comprehensive understanding of the grid as a whole: a daunting task, but one that can be made somewhat easier through technology. This is because the same security countermeasures that protect data (SIEM, network DLP, etc.) are first and foremost monitoring tools designed to obtain and analyze information from the network and/or from the devices within the network. By using these tools purely for information gathering, a baseline can be established that can help identify where important data resides. For example, by monitoring network traffic with a SIEM, active information flows using industrial protocols can be discovered; this identifies the source and destination IP addresses of those communications, which in turn yields a list of active automation devices. Monitoring database activity, using a network- or host-based database activity monitoring tool, can identify systems that insert new data as well as systems that access stored data, again identifying dependent systems and applications (a network-based database monitoring tool may even be able to identify unknown databases by detecting SQL traffic to those databases, which is very useful for identifying databases that might have otherwise been overlooked).
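The discovery process described above can be sketched as a pass over flow records. The example assumes each record is a (source IP, destination IP, destination port) tuple and uses well-known default ports as a hint to the service; the port table is illustrative rather than exhaustive, and a port number alone does not prove which protocol is actually in use.

```python
from collections import defaultdict

# Default ports used as hints only; real deployments may differ.
PORT_HINTS = {
    502: "Modbus/TCP",
    20000: "DNP3",
    102: "IEC 61850 MMS",
    1433: "Microsoft SQL Server",
    1521: "Oracle",
    3306: "MySQL",
    5432: "PostgreSQL",
}


def build_inventory(flows):
    """Map each server IP to the services it appears to offer and its clients."""
    inventory = defaultdict(lambda: {"services": set(), "clients": set()})
    for src_ip, dst_ip, dst_port in flows:
        service = PORT_HINTS.get(dst_port)
        if service is None:
            continue  # not an industrial or database port we recognize
        inventory[dst_ip]["services"].add(service)
        inventory[dst_ip]["clients"].add(src_ip)
    return dict(inventory)


flows = [
    ("10.1.0.5", "10.1.2.20", 502),     # HMI polling a controller
    ("10.1.0.5", "10.1.2.21", 20000),   # HMI polling an RTU
    ("10.1.0.9", "10.1.3.40", 5432),    # application writing to a database
    ("10.1.0.5", "10.1.3.40", 5432),    # HMI reading from the same database
    ("10.1.0.9", "10.9.9.9", 80),       # ignored: no hint for this port
]
```

The resulting inventory lists both the automation devices and the databases that were seen in use, along with the systems that depend on them, which is the baseline the surrounding text calls for.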

Note

One of the greatest difficulties in securing industrial control system networks is that there is zero tolerance for any “false-positive” results that could negatively impact the performance of the network and its connected devices, either in terms of availability or integrity. Adding to this challenge, detailed knowledge of the network architecture, in terms of not only the connected devices but also the communications that occur between them, is often incomplete or unknown. There are many tools that can be deployed in more traditional office networks, but these same technologies could prove disastrous when implemented on sensitive, time-critical ICS networks.

To help solve this dilemma, a team of researchers from Idaho National Laboratory (INL) sponsored by the Department of Energy—Office of Electricity Delivery and Energy Reliability (DOE-OE) and funded by the National SCADA Test Bed Program, initiated the Sophia project. Sophia is a passive, real-time tool for inter-device communication discovery and monitoring of active elements in an ICS architecture. Sophia monitors network traffic and extracts the source, destination, and port sets between ICS components. These “conversations” are stored in real-time to establish a list of conversations that are valid, effectively “fingerprinting” the ICS network.

Once the initial fingerprint is identified and accepted, Sophia continues to monitor and capture conversations, and is able to generate alerts on any conversation or device that is not a part of the system fingerprint. This effectively creates a form of “network whitelisting” application, where it knows what is normally allowed, and raises alerts when anything outside the norm occurs.
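The learn-then-alert behavior described for this kind of tool can be sketched generically. The following is not Sophia’s implementation; it is a minimal illustration that assumes a conversation is identified by a (source, destination, port) tuple and that an operator explicitly accepts the fingerprint before alerting begins. The class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class NetworkFingerprint:
    """Learn the valid conversations first, then flag anything outside them."""

    known: set = field(default_factory=set)
    accepted: bool = False

    def accept(self) -> None:
        """Freeze the learned conversations as the approved fingerprint."""
        self.accepted = True

    def observe(self, src: str, dst: str, port: int) -> bool:
        """Record or check one conversation; return True if it should raise an alert."""
        conversation = (src, dst, port)
        if not self.accepted:
            self.known.add(conversation)  # still learning: never alert
            return False
        return conversation not in self.known


fingerprint = NetworkFingerprint()
fingerprint.observe("10.1.0.5", "10.1.2.20", 502)
fingerprint.observe("10.1.0.5", "10.1.2.21", 20000)
fingerprint.accept()
```

Because observation is purely passive and the only output is an alert, this style of “network whitelisting” cannot itself interfere with time-critical traffic, which addresses the false-positive concern raised earlier in this note.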

The data and application protection systems illustrated in Figure 6.3 include the following:

• SIEM or security information and event management systems are information management systems designed to collect information from devices, networks, and applications. SIEMs are often focused on security events from network and endpoint security products (as described above), but can also collect application logs, operating system logs, and network flows (an audit of network communication details from the switching or routing infrastructure). This allows a SIEM to identify risks and threats against applications and data based on analysis of both internal system data and external global threat data.

• Network DLP prevents the loss or theft of data across the network by detecting specific types of data (or specific data that has been tagged or flagged in some way) that are “in motion”; that is, data that is being used by applications and is traversing the network. For example, network DLP can detect when sensitive data is retrieved within a query from a remote database console, embedded in a file transfer, or attached to an email. Network DLP is very useful at the perimeters of data centers that house substation automation systems, energy management systems, and other systems that contain information that could in and of itself pose a threat to the larger operations of the grid: in other words, wherever data originates from or is used by a command and control system such as a SCADA server, where it can be used to manipulate operations and cause a direct threat to the infrastructure.

• Database Activity Monitoring or “DAM” does exactly what it says: it monitors activity to and from databases, either via in-line inspection of network traffic, or via a host-based agent that monitors database activity locally. DAM is extremely useful for both monitoring and controlling the access to the data stored within a database, but it can also protect against database-specific exploits, such as Slammer, that could compromise the database server entirely.

Figure 6.3 Applying data integrity and protection controls to the Smart Grid reference model.

Notice in Figure 6.3 that SIEM can be deployed either for local information management or centrally. This allows SIEM to be used within secure and isolated facilities, or in broadly distributed systems. This is crucial to obtaining situational awareness across zones and will be discussed further under “Situational Awareness,” below.

What types of data should be protected? Some examples include the following:

• SCADA project files: used by SCADA servers; certain gateways and controllers; the software development environments (SDEs) for HMI consoles, etc.

• Measurement data stored within: PDCs; meter data management systems; data historians, etc.

• Personal data about customers or end users, stored in: customer service/CRM systems; billing systems; the advanced metering infrastructure, etc.

• Information about the grid, including the following: specific device models of assets and infrastructure, from purchasing systems or from engineering diagrams; device node information obtained from industrial protocol traffic between devices; PMU location data obtained from GPS references stored at the PDC, etc.

This also warrants a discussion of intellectual property and the theft thereof, as certain IP could be used by an attacker to threaten specific systems. For example, information about grid operations and metering could allow a malicious actor to discover weaknesses in the equation of supply, transmission, distribution, and consumption of power. Is the disruption of bulk generation required to disrupt power delivery to a specific area, or could a distributed generation facility be targeted instead? If so, the attacker could cause as much damage to the overall grid operation via a (relatively) easier target. This is because bottlenecks and weak points can be found, and exploited, in any complex system. Examples of intellectual property that could be stolen and used for malicious purposes include the following:

• SCADA servers: automation logic and operations data, including process schematics and flow diagrams, that could be used to disrupt grid operation.

• EMS: load, outage, response, and similar data that could be used to identify periods of stress, and ultimately identify the best way to impact grid reliability and recovery.

• Demand response, AMI, and HEMS: data concerning how energy is delivered to the end user and how it is used, such as peak periods of use, that could be used to impact end users (for example, by issuing remote disconnects when demand for power is high).

• End-user information: consumption data, home privacy, HEMS, and similar data used to obtain personal data about in-home habits, up to and including what appliances are being used in-home and when. See Chapter 4, “Privacy Concerns With the Smart Grid.”

Situational awareness

“Situational Awareness” refers to the process of perception, decision, and action that enables the assessment of and reaction to a situation. In the context of cyber security, the first step (perception) requires the collection and aggregation of information from a variety of digital systems, usually by a security information and event management system (SIEM) or similar tool. The more information that can be collected, the greater the visibility the SIEM has into the environment it is protecting (i.e., the better the “perception” of the systems). Once perception is obtained, the next step is to make educated decisions based upon the situation. SIEM tools provide many automated mechanisms to make decisions about potential risks and threats to a system, and also provide a console through which a security professional can manually assess the situation. Automated capabilities include the following:

• Correlation of collected data against known threat patterns, to detect more complex or blended threats (e.g. multiple failed login attempts, followed by an eventually successful login attempt, followed by a network port scan originating from the same device, which may be indicative of a brute force attack).

• Calculation of baseline activity and trends, and the ability to alert when statistical deviations occur (very useful for detecting the symptoms of an attack that has affected operations).

• Tracking or “scoring” risk associated with specific assets, users, or applications and issuing an alert when risk is high.

• Filtering large amounts of data against defined criteria, and/or cross-referencing information against outside information sources such as threat feeds, CERT activity, and malware databases.
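The correlation example given in the first item above (repeated failed logins, then a successful login, then a port scan from the same device) can be sketched as a small rule. This is an illustration only and does not reflect any particular SIEM product; the `Event` fields, the event kind names, and the thresholds (three failures, a ten-minute window) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Event:
    timestamp: float  # seconds since an arbitrary start
    source: str       # address of the originating device
    kind: str         # hypothetical kinds: "login_failed", "login_ok", "port_scan"


def detect_brute_force(events: Iterable[Event],
                       min_failures: int = 3,
                       window: float = 600.0) -> List[str]:
    """Return sources showing failures, then a success, then a scan, each within `window`."""
    by_source: Dict[str, List[Event]] = {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        by_source.setdefault(ev.source, []).append(ev)

    suspects: List[str] = []
    for source, evs in by_source.items():
        failures: List[float] = []  # timestamps of failures since the last login_ok
        success_at = None           # time of a success that followed enough failures
        for ev in evs:
            if ev.kind == "login_failed":
                failures.append(ev.timestamp)
            elif ev.kind == "login_ok":
                recent = [t for t in failures if ev.timestamp - t <= window]
                success_at = ev.timestamp if len(recent) >= min_failures else None
                failures = []
            elif ev.kind == "port_scan":
                if success_at is not None and ev.timestamp - success_at <= window:
                    suspects.append(source)
                    break
    return suspects


sample = [
    Event(0, "10.0.0.5", "login_failed"),
    Event(10, "10.0.0.5", "login_failed"),
    Event(20, "10.0.0.5", "login_failed"),
    Event(30, "10.0.0.5", "login_ok"),
    Event(45, "10.0.0.5", "port_scan"),
    Event(5, "10.0.0.9", "login_failed"),   # a single mistyped password
    Event(15, "10.0.0.9", "login_ok"),
    Event(25, "10.0.0.9", "port_scan"),
]
suspects = detect_brute_force(sample)
print(suspects)
```

Only the first device matches the full pattern. The second also scans the network, but a single failed login before its successful one does not meet the assumed threshold, so no individual event from it would trigger this rule.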

Note

The assumption is made, of course, that a logging or security monitoring tool such as a SIEM will be used to facilitate and automate the volumes of information analysis required to obtain situational awareness in a Smart Grid. This is because the amount of data can be staggering and is definitely too much to depend entirely upon human assessment. Many of these data will originate from or be relevant to the industrial automation systems used within the Smart Grid. The monitoring and management of these data fall outside the scope of this book but are covered in the book “Industrial Network Security,” by one of this book’s authors, Eric Knapp.

What to monitor

What needs to be monitored? For effective situational awareness within a Smart Grid, a minimum of the following should be monitored:

• All endpoint activity of the servers, gateways, controllers, field devices, etc. Essentially, if it has a network interface and is connected in any way to the grid it should be monitored.

• All network activity between any and all of these devices, obtained from the network infrastructure itself (switches, routers) and/or from network probes (network IDS or IPS is useful for this function).

• All data produced by and/or utilized within the grid, especially readings from measurement devices, device status information, protection status, phasor data, etc.

In other words, abide by the 3 × 3 model referenced in Chapter 5, “Security Models for SCADA, ICS, and Smart Grid” by monitoring endpoint, network, and data layers across all domains.

To actually perform this level of monitoring, a combination of techniques is required: log collection (from those devices and applications producing logs), event collection (from those cyber security devices that produce security events), and the direct inspection of the communication paths between devices (using network-based detection tools such as IDS, IPS, DLP, and DAM). For those devices that do not produce logs, do not utilize security countermeasures that could create events, and cannot be inspected by a network-based tool (for example, the embedded devices and field devices used within substation automation, line protection, metering, etc.), another means of information collection is required. One example of how to obtain information from these devices would be to integrate with the PDCs, SCADA servers, historians, and/or other systems that are already monitoring these devices, so that all activity can be passed to the SIEM. This will typically require customization of the SIEM, to provide the necessary integration with these specialized systems. However, it should be noted that many of these “specialized systems” (DM, EMS, HEMS, Substation Gateways, SCADA servers, et al.) will most likely produce logs of their own, relevant to the operation of that device. These logs should be collected as well, to provide awareness of when users authenticate to the server, what activity is performed, what changes are made, etc.

Where to monitor

Based on the advice given so far, the trite answer is, “everywhere!” However, there’s a broader concept of “where” that applies to the grid. For example, what domains should be monitored? What zones? Well, the answer is still trite, and it’s still “everywhere!” However, the concept of domains and zones by definition means that the systems will be separated from each other, and often there will be (or at least should be) hard cyber security perimeters between them.

Therefore, in many cases the information will need to be collected locally within a domain or within a specific zone within a domain. In some cases, there will be no network path at all to a centralized facility (i.e., the mythical “air gap”), and in many cases, there will be hard security restrictions in place. For example, in nuclear generation facilities, there is a clear one-way communication requirement between secure zones and insecure zones: if a network connection is used at all, it must only support outbound communication, so that information can be obtained from a reactor for use by business systems, but malware or malicious control cannot be sent from the insecure location back to the reactor. Local information can therefore be collected by a local SIEM that is deployed within the secure facility. The collected data can then be sent one way, over a data diode or unidirectional network gateway, to a SIEM located centrally.

In most areas, however, establishing a secure connection for the bi-directional communication of collected data is enough. Fortunately, most SIEM tools on the market today support encrypted transfer of data when deployed in a distributed manner, and many are certified under standards such as Common Criteria (which evaluates the security of IT products) and FIPS 140-2 (which validates cryptographic modules and their boundaries).

Once information is being collected from within and between zones, simple correlation rules within the SIEM can be used to detect policy violations, for example, controllers communicating with IEDs in a separate zone, or administrators authenticating to servers in a separate domain. These types of deviations are clear indicators of risk (someone or something is behaving in a way that may seem normal but has not been explicitly allowed), but they are also indicators of threats, because most methods used by attackers will violate these strict policy definitions. The trick is to “teach” these policies to the SIEM so that it can detect the violation. This requires configuring the SIEM with domain and zone knowledge, as well as user roles and responsibilities, and any other relevant policy information. This can be done by the following:

• Establishing lists of IP addresses or IP ranges (subnets) that are authorized within a defined zone, typically by defining and populating variables within the SIEM.

• Defining access rules (this controller is only allowed to communicate with these field devices) within the SIEM, typically via correlation rules assessing network flow data.

• Establishing users, groups, and roles as actionable variables within the SIEM, typically done automatically through integration with LDAP or Active Directory, but sometimes requiring manual variable definitions.

• Defining allowed user/asset interactions, again using correlation rules to assess established user knowledge against defined zones.
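The first two steps above (zone address lists and access rules) can be sketched in a few lines. This is a minimal illustration rather than the configuration syntax of any SIEM; the zone names, subnets, and the `ZONES` and `ALLOWED_FLOWS` structures are hypothetical, and traffic within a single zone is assumed to be permitted.

```python
import ipaddress
from typing import Optional

# Hypothetical zone definitions: each zone is a list of authorized subnets.
ZONES = {
    "control_center": [ipaddress.ip_network("10.20.0.0/16")],
    "substation_a": [ipaddress.ip_network("10.10.1.0/24")],
}

# Hypothetical access rules: (source zone, destination zone) pairs that are allowed.
ALLOWED_FLOWS = {("control_center", "substation_a")}


def zone_of(address: str) -> Optional[str]:
    """Return the name of the zone containing the address, or None if it is in no zone."""
    ip = ipaddress.ip_address(address)
    for zone, networks in ZONES.items():
        if any(ip in net for net in networks):
            return zone
    return None


def check_flow(src: str, dst: str) -> str:
    """Classify one network flow against the zone policy."""
    src_zone, dst_zone = zone_of(src), zone_of(dst)
    if src_zone is None or dst_zone is None:
        return "violation: address outside any defined zone"
    if src_zone == dst_zone or (src_zone, dst_zone) in ALLOWED_FLOWS:
        return "allowed"
    return f"violation: {src_zone} -> {dst_zone} not permitted"


print(check_flow("10.20.3.4", "10.10.1.5"))    # control center to substation
print(check_flow("10.10.1.5", "10.20.3.4"))    # substation back to control center
print(check_flow("192.168.1.50", "10.10.1.5")) # unknown device
```

Because the rules are directional, a substation device initiating a connection to the control center is flagged even though the reverse flow is allowed. User and role checks could be layered on in the same way, keyed on identities obtained from a directory service rather than on addresses.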

Note

Segmented monitoring and situational awareness

Just as monitoring can be distributed within and between different systems, zones, and domains, security analysis can also be distributed. The security management tools that provide situational awareness (SIEM, log management systems, etc.) typically consist of separate components for the collection of data (“visibility”) and the analysis of data (“awareness”). Most commercial solutions allow either component to be distributed, feeding back into one or more centralized locations. For example, a collector appliance might be deployed across several plants to feed data back to a centralized analysis appliance.

But what about areas that require a first-line defense capability? In these areas, the “analysis” portion of the solution can be distributed as well, providing the critical facility with the ability to collect and analyze local data. At the same time, those data can still be fed back to a central location for further analysis; however, now there is localized situational awareness as well, to support incident detection, investigation, and remediation inside of the secure facility. This type of distributed situational awareness is widely adopted in high-security facilities such as nuclear generation plants, but is also applicable within the Smart Grid, providing localized situational awareness to crews within substations, dispatch facilities, etc.

This methodology can also be extended in the other direction: feeding data up even further, to provide remote situational awareness to a trusted third party who provides managed security services.

Use case: defending against Shamoon

Shamoon (W32.DistTrack) is malware that was first detected by Symantec on August 16, 2012. It was designed for data theft and destruction, and targeted the oil industry. Consisting of a Dropper (the original infection that installs the additional modules of the malware), a Reporter (responsible for sending information back to the attacker), and a Wiper (which overwrites the master boot record of its target to render the system useless),3 Shamoon was able to cause significant damage. Once the malware successfully breached a system, it spread quickly, stealing information along the way, and then leaving behind a wasteland of completely wiped systems. Because the master boot record was overwritten, system storage was effectively wiped beyond recovery: every infected machine needed to be completely reimaged, and the damage of data theft was compounded by data loss. Luckily, there were no known incidents where Shamoon interrupted the operations of a target company, but the damage to enterprise systems was severe. While Shamoon (at the time of this writing) has focused exclusively on the oil industry, the malware could easily have targeted the business servers in any industry, including the Smart Grid.

The ICS-CERT responded quickly, issuing a Joint Security Awareness Report (JSAR-12-241-01) about Shamoon, and offering a comprehensive mitigation strategy. Looking at that mitigation strategy here, we can apply the concepts of endpoint, network, and data protection to extend the ICS-CERT’s advice into a more actionable cyber security plan. Table 6.1 provides recommended countermeasures to help address each mitigation, as well as the expected result of employing that countermeasure.

Table 6.1

Mapping Cyber Security Countermeasures to the JSAR Mitigation Recommendations for the Shamoon Virus

Image

Summary

Developing a security plan based upon the Smart Grid cyber security reference model described in Chapter 5, “Security Models for SCADA, ICS, and Smart Grid” will highlight where security controls need to be deployed. At this phase, it is possible to assess these various Smart Grid systems and components to determine where the greatest risks lie, and develop a security plan that will prioritize the security of those areas at the highest level of risk. Common controls for endpoint, network, and data protection are available today to support such a plan. While popular controls (such as application whitelisting, vulnerability detection via network or host IDS, NAC, and SIEM) are often seen as “silver bullets,” no single control is sufficient; many controls are applicable, and they provide the best protection when used together as part of a defense-in-depth strategy. A comprehensive and mature security plan might include host and network DLP, application content monitoring, purpose-built industrial control system firewalls and protocol filters, and more.

References

1. McAfee Labs. McAfee threats report: second quarter 2012. Santa Clara, CA: McAfee, Inc.; 2012.

2. Kube Nate, Damisch Alexander. Mitigating industrial control systems vulnerabilities through intrusion detection systems/intrusion prevention systems signatures. In: Department of Homeland Security Industrial Control Systems Joint Working Group spring conference. Savannah, Georgia; May 9, 2012.

3. Industrial Control Systems Cyber Emergency Response Team (ICS-CERT). Joint security awareness report JSAR-12-241-01B: Shamoon/DistTrack malware, update B. US Department of Homeland Security; October 16, 2012.
