Control System Security: Security Basics: Jump Boxes

(This article was originally published on the Findings From the Field blog.)

The initial ballot on proposed revisions to NERC-CIP 005-4 is complete and the results and comments have been posted. The negative votes carried the day. I hope the proposed changes can be salvaged because they do have value. The revisions would require sites to use a “remote access server” or, more succinctly, a “jump box” to provide access to critical assets inside an electronic security perimeter. The measures, described in more detail in a Draft Guidance Document, are intended to address serious problems with remote access mechanisms observed at NERC-CIP sites. Industrial Defender security assessors report that they agree with NERC – they also routinely see weak and misconfigured remote access mechanisms, issues that the proposed regulations should help address.

Problems with Remote Access

Before diving into the solution, let’s understand some of the problems security assessors see routinely. The first problem can be summarized as “I have a VPN, so I’m safe, right?” Let’s say you have an engineering workstation on the corporate network connected by a VPN to the critical network. The VPN lets all ports on the engineering workstation connect to any host and any port on the critical network – it is as if the engineering workstation is “right there,” directly connected to the critical network when the VPN is active. Your problems:

Any malware which has compromised the engineering workstation can now propagate freely to any of the typically less-well-secured hosts on the critical network. You may trust the engineer, but should you trust the workstation?

A keystroke logger on the workstation can capture the engineer’s VPN user name and password and use them to access the critical network from another machine, at a later date.

Malware on the engineering workstation could open tunnels to command and control servers elsewhere in the enterprise or on the internet while the VPN connection is established, giving distant attackers full, real-time access to the critical network.

Even if you restrict which ports and which hosts on the critical network are available to the engineering workstation, malware can often still propagate directly to the more vulnerable critical assets.

Another common problem is remote control software like Remote Desktop or VNC. Let’s say the engineering workstation uses VNC to access a handful of hosts in the critical network. The perimeter firewall opens only the VNC ports to only the engineering workstation. What’s wrong with that?

Most open source VNC implementations do not encrypt the session – anything typed after connecting, including user names and passwords for the remote host, crosses the network in plain text and is easily captured.

Software on critical networks, the operating systems and remote control servers included, tends to be less well patched than software on enterprise networks. If the engineering workstation is compromised, vulnerabilities exposed through even that one open port can be exploited.

Once logged into a workstation on the critical network, an adversary who has stolen VNC credentials has access to the entire network.

A keystroke logger on the engineering workstation would, again, capture credentials.

Remote Access Servers/Jump Boxes

The solution the new regulations require is a “jump box.” A simple jump box is a workstation on a DMZ network segment with access to those critical hosts and ports the box needs access to. If a remote user wants to work with critical assets, they first log into the jump box over a remote access mechanism like Remote Desktop. Having logged into the jump box, the remote users connect to other machines on the critical network and do their work. Why is this better?

The jump box is heavily defended – most services are turned off, most software uninstalled, the box is fully patched, automatic-updated, anti-virus-ed, whitelisted/HIPS’ed, anti-spyware-ed, host-firewalled, etc. You can implement every meaningful security mechanism on the jump box and keep it current right up to the second with patches and signatures, because the control system does not need the jump box to be running. If a faulty patch or signature causes the jump box to malfunction for a day or a week, the grid doesn’t go down – you simply have less reliable remote access for that period of time.

The only port open to the enterprise network is the remote control port, and you can defend that further with a VPN.

The jump box is on a DMZ and you can control which hosts and services the jump box has access to on the critical network.
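The three points above amount to a default-deny policy with exactly two kinds of permitted flow. As a rough sketch only, the Python below models that policy; every address, port and host role in it is a hypothetical value chosen for illustration, and `flow_allowed` is an invented helper, not part of any firewall product.

```python
from ipaddress import ip_address, ip_network

# All addresses and ports below are hypothetical, for illustration only.
ENTERPRISE = ip_network("10.1.0.0/16")
JUMP_BOX = ip_address("192.168.50.10")   # sits on the DMZ segment
REMOTE_CONTROL_PORT = 3389               # assumed Remote Desktop port

# The only critical hosts and ports the jump box needs to reach.
JUMP_BOX_TARGETS = {
    (ip_address("172.16.0.20"), 3389),   # e.g. an operator workstation
    (ip_address("172.16.0.30"), 22),     # e.g. a server managed over ssh
}


def flow_allowed(src: str, dst: str, dst_port: int) -> bool:
    """Return True only for the two kinds of flow the jump box design permits."""
    s, d = ip_address(src), ip_address(dst)
    # 1. Enterprise clients may reach the jump box on the remote control port only.
    if s in ENTERPRISE and d == JUMP_BOX and dst_port == REMOTE_CONTROL_PORT:
        return True
    # 2. The jump box may reach only its listed critical hosts and ports.
    if s == JUMP_BOX and (d, dst_port) in JUMP_BOX_TARGETS:
        return True
    # Everything else is dropped (default deny).
    return False
```

Under this model an engineering workstation at, say, 10.1.4.7 can open a remote control session to the jump box, but `flow_allowed("10.1.4.7", "172.16.0.20", 3389)` is false: nothing on the enterprise network talks to a critical host directly.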

Other Measures

By now some astute readers are saying “Aha!” because they’ve noted that we still have the notorious keystroke logger scenario to solve – a keystroke logger can steal jump box credentials just as easily as VPN or remote control credentials. The CIP 005-4 “two factor authentication” requirement addresses this scenario. CIP requires at least two-factor authentication – you can do more if you want. What most people will implement is “something you know” (a password) and “something you have” (a smart card or hardware token). Either the perimeter VPN or the jump box itself must challenge users for all of user name, password, and something else – usually a constantly changing PIN from the card or token. A keystroke logger can catch the user name and password, but the PIN changes every time, so a captured PIN is worthless.
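To see why a captured PIN is worthless, it helps to look at how such a PIN can be generated. The proposed regulation does not name a scheme; the sketch below assumes the time-based one-time password construction of RFC 6238 (HMAC-SHA1, 30-second steps), which is one common choice. The `verify` helper and its `used_counters` set are illustrative inventions, not taken from any product.

```python
import hashlib
import hmac
import struct


def one_time_code(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Derive the code for the time step containing unix_time (RFC 6238 style)."""
    counter = unix_time // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def verify(secret: bytes, submitted: str, unix_time: int, used_counters: set, step: int = 30) -> bool:
    """Accept a code at most once per time step, so a replayed code is refused."""
    counter = unix_time // step
    if counter in used_counters:
        return False  # a code was already accepted in this time step
    if hmac.compare_digest(submitted, one_time_code(secret, unix_time, step)):
        used_counters.add(counter)
        return True
    return False
```

A keystroke logger that records the code sees only a value tied to one 30-second step. Replayed later, it no longer matches what the server computes, and replayed within the same step it is refused because that step has already been used.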

In addition, the Draft Guidance Document makes other recommendations which are not yet proposed as regulation. Where practical, the document recommends:

Configure the perimeter firewall to block all connection attempts from the jump box to the enterprise network or the open internet. The jump box is for accessing critical assets only.

Apply strong security measures to the workstations or laptops that are clients of the jump box – anti-virus, host firewall, etc. The harder the client workstations are to compromise, the harder it is to gain the foothold necessary to attack the jump box.

If you use a VPN, disallow “split tunneling” which would allow malware on a VPN client workstation to simultaneously establish connections to the open internet and the protected network.

Consider issuing hardened corporate laptops to employees who must work at home from time to time, or even to vendors. This way, you control the security mechanisms on equipment accessing critical assets remotely, not your employees or your vendors.

Consider booting those laptops from an ultra-hardened CD-ROM operating system, as was recently developed for the DoD. This way, even if your client laptop has been compromised, the compromise cannot propagate to the read-only CD-ROM used to boot the laptop when connecting to the jump server.
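The split tunneling recommendation above can be made concrete with a small check. The sketch below is a simplification under stated assumptions: it treats a client's routing table as a list of (destination, interface) pairs and flags any route that would carry traffic outside the tunnel while the VPN is up. The function name, interface names and addresses are all hypothetical, and real VPN clients enforce this in their own ways.

```python
from ipaddress import ip_network


def split_tunnel_routes(routes, vpn_interface, vpn_gateway):
    """Return the routes that bypass the tunnel while the VPN is connected.

    routes is an iterable of (destination_cidr, interface_name) pairs.
    The host route to the VPN gateway itself must use the physical
    interface, and loopback traffic never leaves the machine, so both
    are exempt.
    """
    gateway_route = ip_network(vpn_gateway + "/32")
    leaks = []
    for cidr, iface in routes:
        net = ip_network(cidr)
        if iface == vpn_interface or net == gateway_route or net.is_loopback:
            continue
        leaks.append((cidr, iface))
    return leaks


# Hypothetical routing tables, for illustration only.
full_tunnel = [("0.0.0.0/0", "tun0"), ("203.0.113.5/32", "eth0"), ("127.0.0.0/8", "lo")]
split_tunnel = [("172.16.0.0/24", "tun0"), ("0.0.0.0/0", "eth0"), ("203.0.113.5/32", "eth0")]
```

With the full-tunnel table the function returns an empty list. With the split-tunnel table it reports the default route on `eth0`, which is exactly the path malware would use to reach the open internet while the protected network is also reachable.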

To minimize damage from lost or stolen laptops, I would add to those recommendations BIOS passwords and strong hard disk encryption.

The Problem with Jump Boxes

With jump boxes though, as with most security measures, the old adage holds true: the more secure you make something, the less useful it becomes. There are problems with using jump boxes.

The first complaint will come from your remote users, about how slow everything has become. These users are often accustomed to running their applications on their own laptops or workstations and fetching a small amount of real-time information over a VPN. Their applications run quickly, pause briefly to fetch a little information from the critical network, and then run quickly again. With a jump box, you run the Remote Desktop or other client locally and you run the application remotely, on the jump box. Remote desktop applications send large amounts of information back and forth across the wide area network to the jump box, even with compression. These applications are really pulling bitmaps of portions of the jump box “screen” and rendering those images on the client workstation. Even with high-speed WAN links you face delays, because every mouse movement and every mouse click must be transmitted to the jump box and screen images must be sent back to the client before anything changes locally. Some remote control applications are better than others at minimizing these delays, but there really is no getting around this problem; your users will need to get used to it.
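A back-of-the-envelope model shows where the delay comes from. The sketch below assumes each interaction costs one network round trip plus the time to transfer the response; the round-trip time, link speed and payload sizes are invented figures for illustration, not measurements.

```python
def interaction_delay_ms(rtt_ms: float, response_bytes: int, link_bits_per_s: int) -> float:
    """Estimate the delay of one interaction: a round trip plus the transfer time."""
    transfer_ms = response_bytes * 8 * 1000 / link_bits_per_s
    return rtt_ms + transfer_ms


# Assumed figures: 80 ms round trip over a 10 Mbit/s WAN link.
RTT_MS = 80
LINK_BPS = 10_000_000

# Local application fetching a small reading over a VPN (about 2 KB).
vpn_query = interaction_delay_ms(RTT_MS, 2_000, LINK_BPS)

# Remote desktop session redrawing part of the screen (about 200 KB after compression).
screen_update = interaction_delay_ms(RTT_MS, 200_000, LINK_BPS)
```

With these assumed figures the small query costs about 82 ms and happens only occasionally, while the screen update costs about 240 ms and is paid on nearly every click. More bandwidth shrinks the transfer term but never the round trip, which is why the delay cannot be bought away entirely.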

The next complaint you’ll get from your purchasing people is cost. If you have many remote users who need simultaneous access to critical assets, each simultaneous user will need a jump box or a terminal server type session, or a virtual machine. With many simultaneous users, the costs add up. Further, once these users see how slow everything has become, you may need to purchase additional communications bandwidth. To reduce latency you may need to lease additional direct lines between locations, rather than using centrally-routed communications. If your corporate IT team has not already invested in a multi-factor authentication system, you will have that expense as well. The result will be a dramatically more secure remote access system, but there are costs.

There are additional software and labor costs as well: to be useful for troubleshooting the critical network, jump boxes typically need a broad variety of software installed on them. Licenses for that software must be purchased, and the software versions must be kept consistent with the versions of software being maintained on the critical network. Worse: you may have a large variety of applications that remote users need to run, and there may be compatibility problems between applications. It may be that you are not able to install all of the applications on the same jump box. You may need to set up even more jump boxes to accommodate incompatible software, and your users will then need to know which box to connect to when they need to use a particular tool.

Note: this discussion has focussed on “windows-style” jump boxes. The regulations apply just as much to command-line-oriented jump boxes like the ssh servers which are ubiquitous in the Unix/Linux world. Ssh jump boxes make sense for all of the security reasons that windows-style jump boxes make sense. Ssh jump boxes suffer from some of the same slowness in echoing keystrokes that is observed on windows-style jump boxes, but less so. Character echo slowness is rarely so serious that it is worth correcting through the purchase of additional bandwidth or dedicated lines.

The Future of the Regulation

As I said, I think the jump box concept has a lot of security value, but there are costs. The comments posted in response to the initial ballot contain a number of themes:

Many pointed out that regulations should describe “what but not how” and said that the proposed changes were too prescriptive,

Many pointed out that there was no definition of “remote access” or “intermediate system” and that the regulation was ambiguous without such definitions, and

Many pointed out technical objections about duplicating part of NERC-CIP 004 and other regulations in the proposed NERC-CIP 005-4.

The technical objections do make sense, and I hope they can be sorted out fairly easily. The concerns about definitions have merit too, especially the problem with defining “remote access.” If “remote access” was meant to include server-to-server communications, then the guidance document would need to add a section on requirements for proxies for industrial protocols like ICCP. Since no such guidance was provided, I think server-to-server communications were intended to be excluded from the scope of the new regulation and the wording should be adjusted to make this clear.

However, I can’t agree with the “what but not how” comments. The guidance document is quite prescriptive, but it is only guidance. The most prescriptive parts of the proposed regulations say that you must use “multi-factor authentication to establish remote access”, and that you must use an “intermediate device or system” for remote access to critical networks. Both concepts are state of the practice in enterprise security and I think would significantly improve the security of not only NERC-compliant sites, but sites in many industries.

We will need to wait and see what changes to the proposal NERC comes up with. There is an urgent need to strengthen remote access provisions in the regulation.

2010-11-03
