How Vulnerable Are We?

8/1/2008 August 2008

IMAGINE THESE THREE SCENARIOS: One, terrorists use a stolen nuclear weapon to destroy the George Washington Bridge—the country’s busiest—an iconic structure and a critical national transportation link. Two, terrorists launch a coordinated attack on the country’s electronic financial infrastructure, using previously unseen, self-propagating computer viruses to shut down credit card and checking account systems throughout the United States for one week. Three, terrorists introduce a botulinum toxin into milk at half a dozen dairy processing facilities around the country, causing hundreds of deaths throughout the United States.

Each of these three scenarios represents a serious threat, but which is the worst, and in what terms? Lives lost? Economic impact? Psychological effect? Which is most likely to occur? More to the point, if you had $1 billion to defend against all three, how would you spend the money?

These kinds of questions are familiar to anyone who has ever conducted a risk assessment. The difficulty of answering them provides a window into why analyzing risk is never easy, even if the assessment concerns only a single office building or data network. Now, consider the scope and challenge of assessing, ranking, and addressing interdependencies of the risk posed to the 18 sectors that own and operate the nation’s critical infrastructure. That is the daunting task now facing government, academia, and the private sector as they work together toward the goal of a national comparative risk assessment to help ensure that the country’s finite resources are best applied to mitigate risk. Complicating the mission is the continuing lack of consensus over everything from which methodologies yield the best risk analysis to the feasibility of comparing risk across sectors.

Methodologies

Soon after 9-11, the Port Authority of New York and New Jersey (PANYNJ) engaged the U.S. Department of Justice for help in developing unified methodologies to leverage existing risk assessments of port and mass transit infrastructures. The results were the Transit Risk Assessment Methodology (TRAM) and the Maritime Assessment and Strategy Toolkit (MAST). TRAM, for one, is now used by Amtrak and at least 15 major transit authorities.

Fast-forward to 2006, when the Department of Homeland Security (DHS) issued the National Infrastructure Protection Plan (NIPP), the country’s framework for critical infrastructure and key resource (CI/KR) risk mitigation. The NIPP calls for a six-element risk mitigation framework that begins with setting security goals, followed by asset identification, risk assessment, prioritization, implementation of protective programs, and finally, evaluation of effectiveness.

In the NIPP, DHS expressed its intent to let sectors leverage existing assessments, such as with TRAM and MAST, rather than re-invent the wheel. Earlier, DHS considered use of a single methodology. Shortly after DHS was formed in 2003, the agency hired ASME Innovative Technologies Institute, LLC, (ASME-ITI) to develop the methodology. The company called it Risk Analysis and Management for Critical Asset Protection (RAMCAP), and trademarked that term.

RAMCAP was developed as a start-to-finish risk mitigation tool, with comparison in mind, incorporating common metrics, common terminologies, and a common format for reporting results. The steps encompassed by RAMCAP include asset characterization, risk assessment, and risk management through security enhancement.

The RAMCAP model is being applied to major fixed critical infrastructures: nuclear power, nuclear waste, petrochemical refining, liquid natural gas, and chemical manufacturing, according to ASME-ITI. But it has come under criticism from some security professionals who consider it too difficult to apply.

Qualitative v. quantitative. Risk assessment methodologies rely on mathematical and logical constructs. Assessments based on “hard” metrics like financial cost or lives lost are referred to as quantitative. Assessments based on general or subjective terms like “high,” “medium,” or “low” risk are called qualitative risk assessments. Most, however, muddy the two, and are, therefore, considered “semi-quantitative,” relying on numerical metrics to gauge intangibles, like psychological impact.

Henry H. Willis is a policy researcher with the think tank RAND Corp. in Pittsburgh. He conducts cost/benefit analyses of protective measures, seeking to determine the exact return on investment for different options. Willis is firmly in the quantitative camp. He says that without a quantitative component, whatever you are doing isn’t truly a risk assessment.

Willis refers to qualitative methodologies not as risk assessments but as “qualitative risk scoring and sorting models,” and he explains that their value when it comes to trying to assess the value of protective measures is limited.

Mark Abkowitz—a civil and environmental engineering professor at Vanderbilt University in Nashville, Tennessee, and author of Operational Risk Management: A Case Study Approach to Effective Planning and Response—acknowledges the clumsiness of graded qualitative risk assessments, but he says that we should embrace them anyway, at least for the time being. He advises that security professionals charged with this responsibility select out the highest-risk sites and assets from each sector on a graded scale, and aggregate the risk “up to the national level,” rather than trying to work across complex methodologies.

“You’ve got to crawl before you can walk, and walk before you can run. It’s not that we can’t do it, it’s that we can’t be that precise about it right away. I just don’t think that it’s possible yet, and afterward, you’d have to defend that assessment with the rationale you used to do it,” Abkowitz says.

One criticism leveled at RAMCAP was that it was too quantitative and heavily reliant on complex algorithms.

Gathering so-called hard quantitative data can be tricky. Consider the challenge, for example, of trying to determine the financial cost of potential events, which is a critical step in determining the return on investment (ROI) from potential mitigation techniques. Subjective calculations are inevitably involved in the assignment of cost when predicting potential damages, such as human casualties and business disruptions.

To minimize the problem, the value assigned to consequences should be based on historical data, as is the case with the insurance industry’s actuarial tables. For example, Abkowitz and peers from Battelle and the Federal Motor Carrier Safety Administration relied on a fairly solid statistical foundation, including 50 years of federal statistics on high-consequence hazmat (hazardous materials) transportation incidents in developing their comparative risk assessment of hazardous material and nonhazmat truck shipments.

Abkowitz acknowledges, however, that it’s an “imperfect science.” Historical data is never absolute. In the case of hazmat accidents, he notes, by no means are all of the country’s hazardous materials accidents reported to authorities. Nor does even the most sophisticated actuarial table quantify the value of a human life, if that’s even possible. And the past is not always predictive of the future.

As everyone continues to look for ways to tackle the problem, DHS says that it will not favor any specific risk analysis methodologies. “The goal is not to grade methodologies, but to find the best practices so we can plug the divergent strategies into one another,” says Amy Kudwa, DHS spokesperson.

Scenarios. DHS conducts assessments with help from the Department of Defense and a framework developed by the Institute for Defense Analyses (IDA). Mirroring the familiar definition of risk as the product of consequence, vulnerability, and threat (R = C x V x T), which is referenced in the NIPP, the IDA methodology multiplies probability of attack (threat) by probability of success (vulnerability) by consequences (either monetary loss or mortality) to determine risk.
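The R = C x V x T formula can be illustrated with a minimal sketch. This is not the IDA framework itself, whose equations are not public; the function name, scenario labels, and every number below are hypothetical, chosen only to show how the three factors combine and how scenarios might then be ranked.

```python
def risk_score(threat, vulnerability, consequence):
    """Return R = C x V x T.

    threat        -- assumed probability of attack, between 0 and 1
    vulnerability -- assumed probability the attack succeeds, between 0 and 1
    consequence   -- loss if the attack succeeds (dollars or lives)
    """
    for name, p in (("threat", threat), ("vulnerability", vulnerability)):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"{name} must be a probability between 0 and 1")
    if consequence < 0:
        raise ValueError("consequence must be non-negative")
    return consequence * vulnerability * threat


# Hypothetical scenarios: (threat, vulnerability, consequence in dollars).
# These values are invented for illustration and describe no real asset.
scenarios = {
    "scenario A": (0.01, 0.5, 2_000_000_000),
    "scenario B": (0.10, 0.2, 50_000_000),
}

# Rank scenarios from highest to lowest expected loss.
ranked = sorted(
    scenarios,
    key=lambda name: risk_score(*scenarios[name]),
    reverse=True,
)

for name in ranked:
    print(f"{name}: expected loss of about ${risk_score(*scenarios[name]):,.0f}")
```

In this invented example, the less likely but far costlier scenario A outranks scenario B, which shows why the choice of consequence metric and the probability estimates supplied by analysts drive the final ranking.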

The first two elements of the equation—probability factors—were determined by analysts, chief among them former military personnel, tasked during their careers with both the destruction and protection of critical infrastructures.

While unable to discuss the content of the equations, Edmon Begoli, a researcher and lead application architect at Tennessee’s Oak Ridge National Labs, says that the values for each of the factors are “experientially based” and “empirical.”

The scenarios focus on high-likelihood events, Begoli says. “We do not, however, neglect things less likely to succeed, because we don’t want to miss anything,” he says.

The question of what type of scenario to include is one of the sticking points when analyzing risk. One security professional who had worked with companies trying to carry out some of what DHS was asking for in terms of risk assessments said that the scenarios DHS wanted tested were “ridiculous.” He went on to explain: “We like to use the term highest probability worst case scenario instead of worst case scenario. There’s a huge distinction there. And when they start dealing with worst-case scenarios…. They could double or triple the length of risk assessments and end up with iffy results.”

The complaint is a common one among risk assessment experts: that DHS has focused on imagined worst-case consequences—a necessary exercise as 9-11 illustrated—but that it has done so to the exclusion of threat, as determined by history and adversaries’ likely capabilities.

Terminology. The debate over what constitutes a threat scenario worth testing is just one example of the disagreements that surround definitions. When experts discuss risk assessments, they frequently invoke two clichés: the practice is “an art, not a science,” they say, and, “the devil is in the details.”

Those thorny details begin with semantics. What is risk? Every expert has a different definition. While the term is clearly differentiated from “security” and, to a lesser degree, “threat,” some assessment experts use “vulnerability” in its place. Experts further offer varying definitions for risk assessment, risk analysis, risk evaluation, and risk categorization.

Given this cacophony of terms and methodologies, the two-year-old Security Analysis and Risk Management Association (SARMA) has set out to formulate a common professional lexicon based on a wiki-software approach, a user-edited online reference. (To view SARMA’s wiki, visit www.securitymanagement.com)

Joel B. Bagnal, White House deputy assistant to the President for homeland security, said at a recent conference hosted by SARMA that President Bush may issue a directive ordering establishment of a “comprehensive risk management paradigm,” which, like the nascent SARMA lexicon, would attempt to provide a common set of concepts and definitions for risk assessment efforts in which the federal government holds a stake.

Such an order, likely to take the form of a homeland security presidential directive, would not lay out terms, but might direct DHS and relevant agencies like the departments of Energy and Defense—old hands at risk assessment—to devise a common federal paradigm.

Since 2003, ASIS International has offered security professionals The General Security Risk Assessment Guideline. The succinct 22-page guideline offers direction on qualitative assessment, equations for quantitative assessment of probability and cost, a brief lexicon, a process flow chart, and advisories for effective implementation. The guideline has been certified as qualifying for liability protection under the Supporting Anti-Terrorism by Fostering Effective Technologies (SAFETY) Act. (To view the risk assessment guideline, visit www.securitymanagement.com)

Transparency. DHS estimates that there are about 200 risk assessment methodologies in use. Many are not only esoteric, they are also opaque, because they are often proprietary. For that reason, practitioners typically refer to a methodology’s specific risk formula as “the black box.”

That’s problematic. Martin Clauberg of the University of Tennessee, author of Comparative Risk Assessment, Concepts, Problems, and Applications, and his co-authors, warn against basing public policy on any risk assessment methodology that neither the public nor policy makers can understand. With that in mind, Clauberg says, the best methodology is “completely transparent. You show how you got there, and you present the information as clearly as possible to decision makers.”

Some sectors have had to develop their own methodologies to measure the components of risk. The commercial facilities sector, which encompasses everything from apartment buildings and shopping malls to stadiums, now uses the Vulnerability Identification Self-Assessment Tool (ViSAT) to identify vulnerabilities, while in 2007 the food sector began use of the Food and Agriculture Sector Criticality Assessment Tool (FASCAT) which, as co-developer Lyle Jackson of DHS explained at a recent SARMA conference, is used to identify critical nodes and chokepoints in the system. It is serving as a prerequisite to implementing a mature methodology to gauge vulnerabilities and eventually risk.

In the NIPP, DHS officials stated that they would only consider existing sector risk assessments if those sectors opened up the books to share their methodologies, providing what they called “transparency” so that the underlying credibility of the method for assessing risk could be confirmed. At the same time, DHS does not plan to share what it is working on in terms of future methodologies: “Our adversaries would find a lot of value in that information,” Kudwa says.

“I don’t understand why you couldn’t reveal the methods you used to some sort of expert panel to do a third-party review and have that party identified, so you know the panel . . . [has] credibility,” Abkowitz says.

Yet even when fully transparent, nearly every element of a risk assessment is assailable, because nearly all bear the taint of human value judgments, notes Mary Lynn Garcia, a physical security researcher at Sandia National Laboratories in New Mexico.

Clauberg and his co-authors acknowledge that perfection in risk assessment is unattainable and that none would ever be accepted as scientific fact by a purist. In the case of exclusively quantitative risk assessments, none can incorporate a full universe of relevant factors and data. Qualitative risk assessments, at the other extreme, are governed by muddy semantics and arbitrary judgment.

Howard Safir, former New York City police commissioner and chairman and CEO of GlobalOptions Group’s security consulting firm SafirRosetti, says that in the private sector, smart firms can and do address the concern about a risk assessment’s validity by following the medical profession’s mantra and getting a second opinion: They seek peer review of outside risk assessments.

The Transportation Security Administration (TSA) also gets second opinions, as explained at the SARMA conference. TSA’s Matt McLean told attendees that the agency’s Risk Management Analysis Process (RMAP) is subject to “periodic expert reviews.”

Jay Robinson, a risk assessment manager with Alion Science and Technology, who also spoke at the SARMA event, said that his company takes a similar approach. Its “datasets,” or the specific factors and metrics applied in a given client’s risk assessment, are put before a panel consisting of company experts and, in some cases, outside subject matter experts, who can critique them.

Cross-Sector Comparability

In a sense, all risk assessments are comparative in that they consider two or more factors relative to one another. Sometimes the comparison is within a sector. For example, would a physical intrusion or a cyberattack pose greater risk to a single power plant? Sometimes it’s across sectors, as when the task is to determine which site carries greater risk of terrorist attack—a bridge or a chemical plant?

Clauberg and his co-authors refer to a familiar comparative risk assessment of the modern era: whether it’s safer to fly or drive. As comparative risk assessments go, it’s a fairly simple and straightforward formula: mortality rate per mile traveled. A mile is a mile, regardless of the means, and mortality is the ultimate “hard” data metric.
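The fly-or-drive comparison works because both modes share a single metric. A minimal sketch of that calculation follows; the mode labels and all figures are hypothetical placeholders, not actual transportation statistics, and the function name is an assumption made for illustration.

```python
def deaths_per_billion_miles(deaths, miles_traveled):
    """Mortality rate normalized to one billion miles traveled."""
    if miles_traveled <= 0:
        raise ValueError("miles_traveled must be positive")
    return deaths / miles_traveled * 1e9


# Hypothetical annual figures: (deaths, total miles traveled).
# Invented for illustration only; not real data for any mode of travel.
modes = {
    "mode X": (30_000, 3e12),
    "mode Y": (100, 5e11),
}

# Because both rates use the same unit, they can be compared directly.
rates = {name: deaths_per_billion_miles(*data) for name, data in modes.items()}
safer = min(rates, key=rates.get)

for name, rate in rates.items():
    print(f"{name}: {rate:.2f} deaths per billion miles")
print(f"Lower mortality per mile: {safer}")
```

The comparison holds only because the denominator is the same for both modes. Once sectors report consequences in different units, no such direct ranking is available.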

In the context of the national comparative risk assessment, however, the comparison is an “apples-to-oranges” one, contrasting disparate sectors, like national monuments versus agriculture, that, as explained, use vastly different risk assessment methodologies.

“It’s been kind of the bane of risk assessments, how you put consequences into equal terms, when a lot of these things don’t lend themselves to equal terms,” says Abkowitz.

DHS, as noted, indicates that it is committed to accommodating different sectors’ disparate methodologies into a system to rank national risks comparatively. The NIPP, and its sector-specific annexes completed in 2007, called for establishment of 18 sector coordinating councils (SCCs) to oversee NIPP implementation, including risk assessment and asset prioritization, within sectors, all coordinated nationally by DHS and the private-sector Partnership for Critical Infrastructure Security (PCIS).

Some experts, however, including Garcia, question the whole notion of cross-sector rankings based on disparate methodologies. Even apples-to-apples comparisons of different quantitative methodologies are tricky, she says, noting that even the most sophisticated quantitative or semi-quantitative risk assessment incorporates subjectivity and inference.

An overall national comparative risk assessment methodology, even one that has “handicapped” risk values from different sectors and methodologies by weighting them mathematically, would, in scientific terms, represent another finger on the scale.

“That’s placing subjectivity on top of subjectivity,” Garcia says, arguing that comparing assessments that rely on different methodologies is a nonstarter. “There’s no good scientific method that says, in the middle of the analysis, you can change tools,” she notes.

Garcia compares risk assessment to algebra, and says that in order to conduct a sound comparison across sectors, there must be at least one common factor metric to carry between equations. The simplest is mortality; slightly fuzzier: cost.

DHS’s existing assessments based on the IDA methodology are interoperable, Begoli says, because they are based on a common threat scenario across individual assessments. Begoli describes advanced comparative risk assessment, across sectors and models, as an “emerging” science.

Interdependency. Another complication concerns interdependencies among sectors. That refers to the follow-on impact of any disruption. For example, if a dam bursts, the direct impact is the flooding; the indirect impact would be the loss of electricity generated by the dam.

Willis of RAND says that there are those who believe the secondary, cascading effects of an attack across sectors around the country could prove more devastating than the initial physical impact. One point experts do agree on: The country cannot properly determine the value of an infrastructure asset or sector without considering its interdependence on others; none exists in a vacuum.

For answers, DHS has turned to big brains, and even bigger computers, with establishment of its National Infrastructure Simulation and Analysis Center (NISAC), run primarily out of Sandia and Los Alamos National Laboratory, also in New Mexico.

NISAC conducts its work using existing data to map and model critical infrastructures, based on knowledge of historical events and behavioral options to estimate the consequences of potential incidents, considering factors ranging from the realistic to the abstract.

Scenarios already modeled, according to the center, include pandemic influenza. They consider factors such as virulence, immunization rates, and drug effectiveness to estimate various levels of severity and the impacts on critical infrastructures. NISAC has also modeled disruptions to the nation’s oil supply network and its banking and finance systems.

Real-world events provide the foundation for the models, and they also inform existing modeling methodologies. Aspects of hurricanes Katrina and Rita and their aftermaths, for example, highlighted possible problems in event models, according to the center.

NISAC’s Agent-Based Laboratory for Economics (N-ABLE) specifically models the supply chain and economic events (and public policy) that affect infrastructure, while the Railroad Network Analysis System (R-NAS), and the Air Transport Optimization Model (ATOM) deal with the transportation sector, just to name a few of the constructs being explored.

Despite imperfections and daunting challenges, the goal of assessing risks nationally, across and among sectors and assets, is too critical an exercise to forgo. “We don’t have resources to protect all of the assets that make society operate,” Abkowitz says. “So you have to go through some sort of process to say, ‘These resources are more valuable than these, and what can we do to harden them.’”

Joe Straw is an assistant editor at Security Management.