Using M&S to Determine Cyber Survivability: Score Small and Let the Machines Do the Math

F-35A Lightning IIs assigned to the 95th Fighter Squadron, Tyndall Air Force Base, Fla., fly over the Gulf of Mexico.
U.S. Air Force Photo by TSgt. Betty Chevalier

The FY22 National Defense Authorization Act’s mandate to conduct full-spectrum survivability testing of U.S. air systems across all applicable threat domains—and within current resource constraints—necessitates the use of modeling and simulation (M&S) as a “universal integrator” between these domains (as illustrated in Figure 1) [1, 2]. This article focuses on the interaction of cyber analysis, cyber M&S, and cyber testing (represented by the bubble at the bottom left of the figure) and how to integrate these activities to enable and improve decision-making in cyber survivability assessments.

Figure 1. Campaign- and Mission-Level M&S as an Integrator for Full-Spectrum Survivability [2].
Unfortunately, despite widespread recognition of the importance of cyber survivability and of measuring cyber risk in contested environments, there is currently little agreement on how best to measure them [3, 4]. Arguably, the first question that needs to be answered is, What are the units of cyber risk or cyber survivability? According to the Department of Defense (DoD), risk is defined as the “probability and severity of loss linked to hazards” [5]. Because the severity of loss that matters to DoD operations is loss of mission capability, Expected Mission Loss (EML) (as defined in Equation 1) becomes our metric [6].

EML = Likelihood of Occurrence × Mission Impact

Equation 1. EML Defined.

With EML, we now know the two things we will need to measure: (1) the likelihood of a particular risk occurring and (2) the mission impact, expressed as the percentage of the mission lost, if the risk does occur. For example, an attack with a 50% likelihood of success that would cost 40% of mission capability carries an EML of 20%.

Doing math with likelihood, mission impact, and EML is uncontroversial once the inputs are agreed upon. The disagreement is typically over how to determine the likelihood and mission impact appropriate for a particular risk. One important, and often overlooked, factor is how to measure the amount of uncertainty in the inputs. Several methods exist, such as using 90% confidence intervals (90CI), that enable that uncertainty to be tracked throughout the calculations [6]. Additionally, risk scoring should be validated by ensuring that predicted results are congruent with actual results collected during testing and exercises. As the survivability community moves toward more cyber live fire test and evaluation (LFT&E), collecting quantitative metrics beyond simple pass/fail becomes vital to validating risk scoring. Finally, numerous well-known psychological biases and heuristics make it problematic to rely on human assessors to judge the overall risk of an attack end to end [7]. Asking a subject-matter expert for the likelihood of attack A forces the expert to integrate numerous elements of probability across a complex attack chain into an overall score. One way to simplify the scorer’s task is to break the problem down into smaller pieces.

Instead of asking the aforementioned question, which spans the entire attack chain from reconnaissance and tool development through multiple steps of execution, an expert could instead be asked the following question, What is the likelihood that an adversary will be able to execute specific attack A on component B, running operating system C, from compromised component D? Each of the small steps an attacker must take to execute an attack can likewise be scored, and a model can then connect those steps and do the appropriate math to determine the likelihood. Our proposition here is that we can more accurately assess the cyber survivability of our systems by scoring the likelihood of each discrete substep an attacker must take and then combining all those steps in a system-level model to calculate overall likelihood, while using mission- and campaign-level models to calculate mission impact.
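The core of this proposition can be illustrated with a minimal sketch (ours, not any particular tool’s): score each substep separately and let the model multiply the scores together, assuming the steps are independent. The step names and probabilities below are notional.

```python
# Minimal sketch: combine independently scored attack-chain substeps into
# an overall likelihood. Step names and values are notional.
from math import prod

# Each entry: (description, probability the attacker completes the step)
attack_chain = [
    ("Recon: identify component B and its OS C", 0.80),
    ("Develop tooling for specific attack A", 0.60),
    ("Pivot from compromised component D", 0.70),
    ("Execute attack A on component B", 0.50),
]

# Assuming the steps are independent, the chain succeeds only if every
# step succeeds, so the overall likelihood is the product of the steps.
overall = prod(p for _, p in attack_chain)
print(f"Overall likelihood: {overall:.1%}")  # 16.8%
```

Each factor is a small, concrete question an expert (or a test) can answer on its own, rather than a single end-to-end judgment.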

Overall Process Flow

To effectively score cyber risk using M&S, the process can be broken down into four main categories (as illustrated in Figure 2): (1) Engineering and Preparation, (2) Modeling and Simulation, (3) Test and Validation, and (4) Risk Management. Each of these stages is crucial in ensuring that the risk scoring is accurate, actionable, and integrated into decision-making processes.

Figure 2. Process Flow for Using M&S to Determine Cyber Risk Scoring.

In the Engineering and Preparation stage, information is gathered, developed, analyzed, and prepared for ingestion into the models. In the Modeling and Simulation stage, the system- and mission- or campaign-level models are built and run in simulations. In the Test and Validation stage, the model results are compared to real-world results from testing or exercises to validate that the models provide a reasonable basis for decision-making. Then in the Risk Management stage, senior leaders make decisions that either the current level of risk is acceptable or that changes need to be made to reduce the risk.

To illustrate the process flow, we can use a completely notional unmanned aerial system (UAS), the MQ-99 Berserker. The Berserker is at the conceptual design stage, and a basic concept of operations and architecture have been developed in model-based systems engineering (MBSE) tools.

The Berserker has both air-to-air and air-to-ground roles and is semi-autonomous in that it can operate independently or under the direct control of a ground station. The weapons bay is sized for either two Air Intercept Missile (AIM)-120 Advanced Medium-Range Air-to-Air Missiles (AMRAAMs) or six GBU-39/B Small Diameter Bombs (SDBs). The aircraft is envisioned to be rail-launched and parachute-recovered so that it does not depend on traditional airbases, and it is intended to be attritable: it is usually recovered, but for high-priority missions it may be deliberately expended without the intent to recover it.

Engineering and Preparation

Intelligence information is always crucial in cyber risk and survivability, but it is often extremely challenging to get, as cyber weapons are much easier to hide than large physical weapons such as battleships, aircraft, and tanks. Mission engineering is also critical to have, so that a system’s expected contribution to a mission, as well as its connections to and dependencies on other systems, is understood. Of course, the details of a system’s design are important inputs into any type of system-level model; however, if a system is early in the design life cycle, assumptions can be made and recorded about the design that can be updated as the design matures.

Numerous methods for developing attack scenarios are currently in use. Some, such as the Mission-based Risk Assessment Process for Cyber (MRAP-C), rely on “bottom-up” approaches that look across an entire attack surface. Others, such as Systems-Theoretic Process Analysis for Security (STPA-Sec), use a “top-down” method that focuses on the mission and what losses would be unacceptable. Either approach can work, and often a combination of approaches is the best way to develop a rich list of potential attack scenarios.

Criticality analysis determines which system components are essential for mission success. This analysis is a standard part of program protection and so is often already available to a program. In addition, criticality analysis is integrated into various frameworks, including MRAP-C and the Navy’s Cyber Survivability Risk Assessment (CSRA). A well-developed set of critical components can also serve as a key input into building a system-level model.

To build a system-level cyber model, we used the Cyber Operations Lethality and Effectiveness (COLE) tool, which is supported by both the Office of the Director, Operational Test and Evaluation (DOT&E) and the Joint Aircraft Survivability Program (JASP). Other tools could be used as well.

COLE was originally developed as a Joint Munitions Effectiveness Manual (JMEM) tool for determining the probability of success for traditional information technology (IT) cyber attacks, but it has been expanded to incorporate cyber elements from weapons systems such as 1553 buses and controllers. The tool does not model the actual data flowing through the system but instead tracks component-level hardware, software, and firmware down to the specific build. This functionality enables COLE to determine how likely a particular component is to be affected by a specific attack against a vulnerability, even if that software build has not been explicitly tested, based on the build’s similarity to a build that has been tested. Accordingly, this type of model gives us what we need to calculate overall probabilities later in the process.
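COLE’s internal data model is not published here, so the following is only a hypothetical Python sketch of the underlying idea: track components down to the specific build and discount a tested build’s observed susceptibility by its similarity to the fielded build. The Build class, the similarity measure, and all numbers are invented for illustration.

```python
# Hypothetical sketch of build-level similarity scoring (not COLE's actual
# data model): estimate how likely an untested software build is to be
# affected by a known vulnerability, discounted by its similarity to a
# build that was tested.
from dataclasses import dataclass

@dataclass
class Build:
    component: str   # e.g., "RWR controller"
    version: tuple   # e.g., (2, 4, 1)

def similarity(a: Build, b: Build) -> float:
    """Crude stand-in: same component required; closer versions score higher."""
    if a.component != b.component:
        return 0.0
    matching = sum(x == y for x, y in zip(a.version, b.version))
    return matching / max(len(a.version), len(b.version))

def p_affected(untested: Build, tested: Build, p_tested: float) -> float:
    """Discount the tested build's observed susceptibility by similarity."""
    return p_tested * similarity(untested, tested)

tested = Build("RWR controller", (2, 4, 1))
fielded = Build("RWR controller", (2, 4, 3))
print(f"{p_affected(fielded, tested, p_tested=0.9):.1%}")  # 0.9 * 2/3 = 60.0%
```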

Modeling and Simulation

With the system model in hand, we can now model the actual attacks. To illustrate this process, we used COLE to model a simple hypothetical two-step cyber attack against the Berserker’s radar warning receiver (RWR) controller. COLE enables this modeling via a Risk Assessment module that leverages the previously constructed system model. In that module, the first step of the attack is the attacker infiltrating the software development chain of the RWR controller and inserting malicious code.

COLE allows probabilities to be entered as 90CIs. In this case, our 90CI was 30–70%, reflecting high uncertainty around a mean of 50%. With the implant in place, the second step in the attack chain is the attacker sending a particular signal that triggers the implanted malware to disable the system so that it stops reporting threats. This step was modeled in COLE with a 90CI of 65–85%, representing the possibility that the signal may not be seen by the system in a complex electronic warfare environment.

COLE can now calculate the overall probability of the adversary succeeding with this attack, given the probabilities of the two steps, and determines the likelihood to be 21.9–53.1% (90CI), with a mean of 37.4%. Note that the tool assumes an attacker will launch a particular attack, but a probability of attack launch can easily be included in the calculation if desired. There is ongoing debate about whether it is useful to include a probability of attack launch in cyber risk, and this methodology allows either approach. The difference is that an EML that includes the probability of attack launch will obviously be smaller than one that does not (unless that probability is 100%). To simplify this example, the probability of attack launch was assumed to be 100%, with notional intelligence assumed to indicate definitively that the adversary is trying to accomplish this attack and has committed the necessary resources.
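To show how such a calculation can work, here is a Monte Carlo sketch that propagates the two 90CIs through the chain. The article does not state what distribution family COLE fits to a 90CI; fitting a normal distribution truncated to [0, 1] is our assumption, but it reproduces numbers close to the COLE results quoted above.

```python
# Sketch of propagating 90% confidence intervals through the two-step
# attack chain (the normal fit and truncation to [0, 1] are assumptions,
# not COLE's documented math).
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000
Z90 = 1.645  # a normal's 90CI spans the mean +/- 1.645 standard deviations

def sample_90ci(lo: float, hi: float) -> np.ndarray:
    """Fit a normal to the 90CI, sample, and truncate to valid probabilities."""
    mean, sd = (lo + hi) / 2, (hi - lo) / (2 * Z90)
    return np.clip(rng.normal(mean, sd, N), 0.0, 1.0)

step1 = sample_90ci(0.30, 0.70)  # implant inserted via the supply chain
step2 = sample_90ci(0.65, 0.85)  # trigger signal seen in an EW environment
chain = step1 * step2            # both steps must succeed

lo, hi = np.percentile(chain, [5, 95])
print(f"mean {chain.mean():.1%}, 90CI {lo:.1%}-{hi:.1%}")
# Prints approximately: mean 37.5%, 90CI 21.6%-53.4% (close to the COLE
# result of a 37.4% mean and a 21.9%-53.1% 90CI)
```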

With the system-level modeling of the cyber attack complete, we can now turn to modeling the cyber attack at the mission level. The Advanced Framework for Simulation Integration and Modeling (AFSIM) software was selected as the mission-level modeling tool for this example, and a simple scenario was built involving a set of targets, defenses, two Berserkers, and a controlling F-35. Figure 3 shows a screenshot of the baseline scenario without cyber attacks.

Figure 3. Baseline AFSIM Berserker Scenario.

In the baseline scenario, the Berserkers hit an average of 4.44 of 5 targets, and 10 Berserkers were lost over 50 runs of the simulation. Note that we ran AFSIM using a Monte Carlo approach so that we could ingest the probability distributions calculated using COLE. This approach allows us to include the uncertainty and confidence intervals in our calculations. With the supply chain RWR controller cyber attack included in the scenario, the Berserkers hit an average of 2.50 of 5 targets, and 42 Berserkers were lost over 50 runs of the simulation.
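As a notional illustration of that ingestion (this is not AFSIM’s interface), each simulated mission can first sample an attack-success probability from the COLE-derived distribution and then draw whether the attack fires on that run. The mission-outcome stand-ins below are invented purely to make the sketch runnable.

```python
# Notional sketch (not AFSIM's interface): each Monte Carlo run samples an
# attack-success probability from the COLE-derived distribution, then draws
# whether the attack fires on that run. Mission outcomes are stand-ins.
import random

random.seed(1)

def fly_mission(attack_fired: bool) -> tuple[float, int]:
    """Stand-in for one AFSIM run: returns (targets hit, Berserkers lost)."""
    if attack_fired:  # RWR controller disabled: fewer hits, more losses
        return 2.5, 1
    return 4.5, 0

runs = []
for _ in range(50):
    # 0.095 approximates the standard deviation implied by the 21.9-53.1% 90CI
    p = min(max(random.gauss(0.374, 0.095), 0.0), 1.0)
    runs.append(fly_mission(random.random() < p))

avg_hits = sum(h for h, _ in runs) / len(runs)
total_lost = sum(l for _, l in runs)
print(f"avg targets hit: {avg_hits:.2f}, Berserkers lost: {total_lost}")
```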

With these numbers, mission impact can be measured in terms of either how many more Berserkers were lost or how many fewer targets were destroyed as compared to the baseline case. Remember that likelihood is already embedded in the calculations, since the probability of the cyber attack succeeding was modeled in each individual simulation run. Thus, the mission impact based on aircraft lost can be calculated in EML as shown in Equation 2. Likewise, the mission impact based on targets destroyed can be calculated as shown in Equation 3.

EML (aircraft lost) = (42 − 10) / 100 = 32.0%
EML (targets destroyed) = (4.44 − 2.50) / 5 = 38.8%

Equations 2 and 3. Mission Impact Based on Aircraft Lost in EML (top) and Targets Destroyed (bottom).

Either of these EML values could be used, depending on whether target destruction or platform survival is the critical mission element. They can also be combined using any desired weighting, depending on which mission element is considered more significant. For example, if 50% weight were placed on each element, the EML for the mission would be 35.4%. For an unmanned but expensive system, an even split might be valid; for a manned system, platform survival might be weighted significantly more heavily than targets destroyed.
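As a check on these figures, the arithmetic can be written out directly (reading the scenario as 2 Berserkers × 50 runs = 100 sorties at risk, which reproduces the combined 35.4% figure):

```python
# Worked version of Equations 2 and 3 plus the weighted combination.
baseline_hit, attacked_hit = 4.44, 2.50   # avg targets hit (of 5) per run
baseline_lost, attacked_lost = 10, 42     # Berserkers lost over 50 runs
sorties = 2 * 50                          # assumed: 2 Berserkers x 50 runs

eml_aircraft = (attacked_lost - baseline_lost) / sorties  # Equation 2
eml_targets = (baseline_hit - attacked_hit) / 5           # Equation 3

w = 0.5  # equal weight on platform survival and target destruction
eml_mission = w * eml_aircraft + (1 - w) * eml_targets
print(f"{eml_aircraft:.1%}, {eml_targets:.1%}, {eml_mission:.1%}")
# 32.0%, 38.8%, 35.4% (matching the article's combined EML)
```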

While probability distributions were not calculated in this case due to the small number of simulation runs, if tracking uncertainty were determined to be important, enough simulation runs could be accomplished to calculate those distributions.

Test and Validation

For M&S to effectively inform decision-making, it must, of course, be validated to ensure its accuracy and reliability. This validation is primarily achieved through testing. The first step in this process involves using the risk scoring and calculated EML values to guide test planning. This approach allows testers to focus on specific risk scenarios that are most likely to be executable and those that would have the greatest impact on the mission if an adversary were to exploit them.

Given that the probabilities of the different steps of the attack are determined separately and then calculated through the kill chain, it is not necessary for testers to validate the entire kill chain of an attack. Instead, it may be more practical for testers to concentrate on those elements of the kill chain that have the highest uncertainty or those where different stakeholders have divergent views on the appropriate likelihood values. Test results should be fed back into both the system engineering and design processes and the M&S frameworks, allowing for continuous updates to values and uncertainties.
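One simple way to operationalize that prioritization is to rank the scored substeps by the width of their confidence intervals, so that test resources go to the least-certain inputs first. A sketch, using the notional intervals from the Berserker example:

```python
# Sketch of prioritizing test effort by uncertainty: rank scored substeps
# by 90CI width so testing shrinks the least-certain inputs first.
steps = {
    "supply-chain implant inserted": (0.30, 0.70),
    "trigger signal received":       (0.65, 0.85),
}

by_uncertainty = sorted(steps.items(),
                        key=lambda kv: kv[1][1] - kv[1][0],
                        reverse=True)
for name, (lo, hi) in by_uncertainty:
    print(f"{name}: 90CI width {hi - lo:.0%}")
# The implant step (width 40%) outranks the trigger step (width 20%),
# making it the better candidate for dedicated testing.
```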

The COLE application used to model the cyber attacks here underwent verification, validation, and accreditation (VV&A) by a tri-Service and U.S. Cyber Command-led Model Review Committee in 2020 and 2023. This VV&A effort involved comparing COLE’s probability-of-success/effect calculations against test data, ensuring completeness and reliability. Although the effort focused on using COLE for a JMEM and not specifically for risk assessments, the peer-reviewed and validated calculations in COLE are noteworthy. The next step in this process would be comparing COLE predictions to test and evaluation data on a blue system.

In some cases, exercises can help validate mission- or campaign-level models. This validation is typically feasible when systems are much later in the life cycle and can be integrated into exercises. One program that facilitates this type of work is DOT&E’s Cyber Assessment Program (CAP), which integrates cyber effects into selected full-scale exercises. If a mission simulation tool runs a particular mission scenario and that mission is then flown in an exercise, the results should be congruent. If not, the mission simulation needs to be analyzed and updated accordingly.

Risk Management

The final phase of the cyber survivability assessment and management process involves leaders determining whether the current level of risk is acceptable or if changes need to be made to the system or mission to reduce that risk. The approach outlined here provides clear, understandable metrics for decision-makers—such as how many additional Berserkers are expected to be destroyed (32 more Berserkers lost over 50 missions) or how many fewer targets are likely to be hit (97 fewer targets hit over 50 missions) due to a cyber attack. These metrics offer a more intuitive understanding compared to narrative descriptions of possible risks or qualitative risk matrices.

Both milestone decision authorities and authorizing officials can leverage this information to make their determinations on whether the risk is acceptable, as well as what needs to be done if it is not. EML values in particular can help decision-makers understand what specific risks should be addressed first as they give a clear priority order.

In addition to helping with system risk acceptance, this process can also help inform campaign-level risk acceptance by combatant commanders. These cyber effects can be modeled in campaign-level models and inform both resourcing and schemes of maneuver in different scenarios. If the overall results are unacceptable, combatant commanders can communicate that fact to system developers and the Services responsible for acquiring systems. Thus, the loop is finally closed—all the way from a vulnerability in a single component to how that vulnerability affects a theater-level campaign—which then can inform what, if anything, should be done about that vulnerability.

Conclusions

M&S significantly enhances the assessment of cyber risk by breaking down risk scoring into smaller, discrete components that can be both accurately scored and validated. This methodology allows for the combination of various risk elements into a comprehensive sequence necessary to simulate the execution of a cyber attack and its potential impact on mission success.

The example provided here, while based on a notional system, demonstrates the efficacy of this approach. The next logical step is to apply this methodology to a real system, thereby validating the theoretical models against practical outcomes. These results can then be integrated into a broader mission-based risk assessment that includes other threat areas (as shown in Figure 1). This comprehensive assessment framework enables a better understanding of full-spectrum survivability by incorporating analysis, modeling, simulation, and validation across multiple threat domains.

In practical application, this approach not only refines our understanding of cyber threats but also enhances decision-making processes by providing clear, quantifiable metrics, such as EML. As shown, these metrics are more intuitive and actionable for decision-makers compared to traditional narrative risk descriptions or qualitative risk matrices. For instance, knowing how many additional platforms are expected to be lost or how many fewer targets hit due to a particular cyber attack provides a concrete basis for prioritizing risk mitigation efforts.

Furthermore, the integration of M&S with rigorous test and validation processes ensures that the models reflect real-world scenarios. This feedback loop between cyber risk assessment and cyber testing forms a robust validation mechanism, enhancing the credibility and reliability of the risk assessments. As cyber testing validates the predictions made by the models, adjustments can be made to improve the accuracy of future assessments.

The methodology also supports campaign-level risk acceptance by combatant commanders, allowing for the modeling of cyber effects at a strategic level. This informs resourcing and maneuver strategies, providing a holistic view of how individual system vulnerabilities can impact broader operational objectives.

In conclusion, the structured approach outlined herein offers a significant advancement in cyber survivability assessment. By integrating detailed risk scoring, comprehensive M&S, and rigorous validation, this methodology provides a powerful tool for enhancing the security and resilience of DoD systems. Future work should focus on applying these principles to real-world systems, thereby continuously refining the models and improving our understanding of cyber threats and their impact on mission success.

About the Author

Dr. William “Data” Bryant is a cyberspace defense leader for Modern Technology Solutions, Inc. His background in operations, planning, and strategy includes more than 25 years of service in the Air Force, where he was a fighter pilot, planner, and strategist. Dr. Bryant helped create Task Force Cyber Secure and served as the Air Force Deputy Chief Information Security Officer, developing and implementing many proposals and policies to improve the cyber defense of weapon systems. He holds multiple degrees in aeronautical engineering, space systems, military strategy, and organizational management.

References

  1. National Defense Authorization Act for Fiscal Year 2022, 117th Congress, §4172, p. 2566.
  2. Bryant, W., C. Fisher, D. Boseman, and J. Ivancik. “Digital Technology—a Universal Integrator—Enabling Full-Spectrum Survivability Evaluations.” Naval Engineers Journal, vol. 136, pp. 189–198, spring 2024.
  3. Bryant, W. D., and R. Ball. “Developing the Fundamentals of Aircraft Cyber Combat Survivability.” Parts 1–4, Aircraft Survivability, spring 2020 (part 1), summer 2020 (part 2), fall 2020 (part 3), and spring 2021 (part 4).
  4. Bryant, W. D. “Measuring the Wind: Determining a System’s Cyber Combat Survivability Level.” Aircraft Survivability, summer 2023.
  5. Joint Staff. “Department of Defense Dictionary of Military and Associated Terms.” Joint Publication 1-02, November 2010 (as Amended Through February 2016).
  6. Brown, A., W. Bryant, E. Moro, and M. Standard. “The Unified Risk Assessment and Measurement System (URAMS) Guidebook: Version 3.0.” Edited by W. Bryant, October 2023.
  7. Kahneman, D. Thinking, Fast and Slow. Farrar, Straus and Giroux, April 2013.