Illustration by *Security Management;* iStock

How to Stress-Test Your Security Program with Red Teams

By Luke Bencie, Gary DeMercurio, Eric Kready, CPP, and Dennis Harrison, CPP

16 December 2024

Focus on Red Teaming and Penetration Testing

Red teaming is a vital part of a comprehensive security strategy. It goes beyond traditional security assessments by incorporating threat assessment, vulnerability analysis, and penetration testing across the physical, digital, and human realms. This holistic approach is necessary because it mirrors real-world threats that do not restrict themselves to a single domain.

Red teams operate covertly and mimic adversaries, using methodical techniques to avoid detection and gradually escalating their tactics to accomplish the established goal without being caught. Understanding and mimicking real threats is paramount.

The practice of red teaming, especially when it includes aspects such as social engineering and physical intrusion, helps organizations understand the full spectrum of threats they face. By simulating real-world attackers, who often combine cyber, physical, and social tactics, red teaming reveals how these different vectors can be exploited in tandem.

Red teaming is a critical learning tool. Red teaming exercises, particularly those involving physical and social engineering, provide invaluable insights into how security measures perform under realistic conditions. By challenging assumptions and testing countermeasures, organizations can learn where their defenses are strong and where they need improvement. Red teaming does not exist only to find holes. It also exists to reinforce strong points, illustrating the work teams have done to improve their security posture.

The Integral Foundation: Target Analysis and Vulnerability Assessment

A red team is an independent group tasked with challenging an organization in order to enhance its efficacy.

For decades, military special operations have employed red teams to unveil latent physical vulnerabilities and assess the responsiveness of security units, such as force protection and first responders. This approach extends to predicting the actions of various entities, be they foreign government leaders, terrorist factions, business rivals, or political adversaries. Through simulated attacks, these red team "aggressors" push existing protocols to their limits—evaluating the system's capacity to detect, mitigate, and counter threats within a safe and controlled environment.

As the global security landscape has changed, private corporations (and even nonprofits) have included red teaming as part of their security best practices. Whether through coordinated penetration tests, non-destructive simulated attacks, or theoretical tabletop exercises, red teaming aims to stress-test systems under real-world conditions.

The process is akin to an airline pilot being subjected to recurring training in a flight simulator. The pilot is thrown into an escalating series of non-normal events, such as the plane’s engine sucking in a flock of birds followed by a lightning strike that causes the instruments to malfunction. These scenarios are known in the security world as design basis threats (DBTs). They are plausible threats to the system, which are documented and discussed as “what if” possibilities.

Airlines use this training process to test a pilot’s decision-making abilities under extreme duress so when an actual emergency occurs, the pilot is thoroughly prepared to respond accordingly. Similarly, organizations must be primed to address escalating security threats. This requires conducting honest and effective threat and vulnerability assessments as part of the security process.

Subjective assessments, which are based more upon the assessor’s personal experience in one field rather than a balanced process or methodology, can lead to biased or unbalanced reporting. Instead, security practitioners can prioritize their internal assets and employ a proven threat or vulnerability assessment methodology to identify vulnerabilities in soft or hard target environments, minimize human error, and allow for the proper allocation of security resources to reduce risks to critical assets.

One quantifiable assessment to consider is the CARVER Target Analysis and Vulnerability Assessment Methodology, a methodology popularized by the U.S. Central Intelligence Agency (CIA) and U.S. military special forces.

CARVER stands for:

Criticality: How vital an asset or critical system is to your company
Accessibility: The level of difficulty for an adversary to access or attack the asset
Recoverability: The speed at which you could recover in case of an incident
Vulnerability: How well the asset can withstand an adversary's attack
Effect: The extent of impact across your business if something happens to the asset
Recognizability: The likelihood that an adversary identifies the asset as a valuable target

This system, developed during World War II, served as a tool for analysts to determine optimal bombing targets. It’s a versatile approach, suitable for both offensive and defensive purposes, making it invaluable for identifying weaknesses and conducting internal audits. Because CARVER relies on the process of target analysis (a fancy term for an offensive, clandestine assessment prior to an attack), it is also perfect for outlining red teaming exercise scenarios by prioritizing critical infrastructure and key assets. Many security experts regard it as the gold standard for safeguarding critical assets.
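As a rough illustration of how the six CARVER criteria can be turned into a ranked list of assets, the sketch below sums one score per criterion for each asset. The 1-to-5 scale, the equal weighting, the convention that a higher number means a more attractive target, and the asset names and scores are all assumptions made for this example; they are not part of any official CARVER worksheet, and a real assessment would use the scale and definitions agreed on by the assessment team.

```python
from dataclasses import dataclass

# The six CARVER criteria, in acronym order.
CRITERIA = (
    "criticality",
    "accessibility",
    "recoverability",
    "vulnerability",
    "effect",
    "recognizability",
)


@dataclass(frozen=True)
class CarverScore:
    """One asset scored on each criterion (assumed scale: 1 low to 5 high)."""
    asset: str
    criticality: int
    accessibility: int
    recoverability: int
    vulnerability: int
    effect: int
    recognizability: int

    def __post_init__(self):
        # Reject scores outside the assumed 1-5 scale.
        for name in CRITERIA:
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be between 1 and 5, got {value}")

    @property
    def total(self) -> int:
        # Equal weighting is an assumption; some teams weight criteria differently.
        return sum(getattr(self, name) for name in CRITERIA)


def rank_assets(scores):
    """Return the scored assets sorted from highest to lowest total."""
    return sorted(scores, key=lambda s: s.total, reverse=True)


# Hypothetical assets and scores, for illustration only.
assets = [
    CarverScore("data center", 5, 2, 4, 3, 5, 4),
    CarverScore("loading dock", 2, 5, 2, 4, 2, 3),
    CarverScore("executive offices", 3, 3, 2, 3, 3, 5),
]

for score in rank_assets(assets):
    print(f"{score.asset}: {score.total}")
```

Assets at the top of the ranking are natural candidates for red team scenarios and for priority hardening, while low totals suggest where resources can be allocated more sparingly.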

Intelligence, security, and counterterrorism professionals worldwide undertake vulnerability assessments to identify infrastructure susceptible to various threats. These assessments aim to prevent, deter, and mitigate risks while enabling a swift response and recovery post-event. Incorporating red teaming into such assessments exposes weaknesses within infrastructure security systems, which in turn contributes to community stability, economic resilience, and infrastructure that is robust against non-kinetic or non-lethal threats.

By empowering leaders to make informed decisions and continually monitor security measures, red teaming ensures the safety, prosperity, and well-being of organizations and their essential services.

Building a Penetration Testing Program

Think of an organization’s security like the defense of a fortress in medieval times. Having a one-time penetration test or red team is like hiring a group of mercenaries to try breaking in once, then leaving. They might find weak spots and reinforce them, but once they leave, the fortress is back on its own, exposed to the changing tactics of enemy forces. Back then, you could see the enemy coming, or you knew who was most likely to attack you. Now, it could be anyone at any time, and you never see them coming—you just have to hope your fortress is secure.

On the other hand, having a full-time offensive security team is like having an elite group of spies and scouts permanently stationed around the fortress. Not only are they constantly probing for new vulnerabilities, testing defenses from the inside and out, but they also study the newest techniques of war, adapting to new threats before they reach the walls. They don’t just test once and disappear; they are on guard, evolving their strategies as potential attackers change theirs.

This full-time team means the fortress is not left vulnerable to new threats that didn’t exist the last time you hired that mercenary group. Instead, you have a proactive force keeping your fortress one step ahead of the next attack.

But where do you start? It comes back to your design basis threats. DBTs dictate the organizational level of protection that must be provided to protect critical components and personnel. Organizations will need to consider the size of the force, its level of training, proficiency, equipment availability, and necessary weight restrictions for equipment. Additionally, circumstance-specific attributes should be identified, such as threats from vehicle-borne or waterborne improvised explosive devices.

Finally, organizations must define which methods the adversary force could use to obtain information, such as open sources only or the assistance of an insider. Once the DBT is established, the security strategy and requisite security measures must be designed and implemented to defend against attacks from the stated threat.

While there are a significant number of items an adversary can purchase off the shelf and an untold number of how-to videos available on the Internet, the threat must be defined for a penetration testing program (PTP) to progress. If not, the potential exists to “what if” and wargame items ad nauseam, and the program will stagnate.

Building appropriate support. Penetration testing ranges in complexity from simple and rudimentary tests to comprehensive attack operations. But no matter the scope or mission, all successful penetration testing programs have these elements in common: top-driven, executive support, and stakeholder buy-in.

The first two define themselves, but stakeholder buy-in is often overlooked or misunderstood. Stakeholders are generally considered to be people, a department, or line of business (LOB) that are impacted by a penetration test (before, during, and after). Stakeholders are most responsible for vulnerabilities identified during a penetration test, or they can influence, support, or fund the implementation of training or a countermeasure project. It is important to identify these key figures to ensure they are not underserved or excluded during the planning, implementation, and review phases of a penetration test. A structured and thorough stakeholder management approach identifies the relevant stakeholders for the specific penetration test and defines processes that establish a positive and transparent relationship with them.

Failing to establish stakeholder relationships can have both short- and long-term consequences. Immediate effects could include delay of test execution, exercise interference, or unintended reporting to law enforcement or legal authorities. Long-term effects could include partnership erosion, as well as degradation in the trust and confidence of the enterprise’s security apparatus.

Selecting a test type. There are two types of penetration testing programs: internally managed or outsourced.

Internally administered programs can be resource-intensive, but for large organizations with a substantial physical footprint, executing testing requirements with internal sources can be a cost-effective approach. Outsourcing penetration testing to experts significantly reduces the risk of bias or partiality during testing and ensures every scenario (in the scope of work) has been expertly tested. Both types of programs merit thoughtful consideration.

Budget and organizational culture are the biggest factors that determine pen test methodology. Active penetration testing and red teaming produce priceless insights, but these activities can be costly. There are inherent dangers involved whenever testing physical structures and human response, which can result in serious injuries. Additionally, because the results of active penetration testing (both immediate and during after-action reviews) are overt, they can embarrass individuals or departments and degrade morale instead of building the team. As red team professionals often note, covert penetration yields overt results. For these reasons, organizations often deploy a passive penetration test strategy.

Passive penetration testing—often referred to as purple team activity—utilizes the subject matter expertise of red team personnel collaborating with facility and business stakeholders to identify vulnerabilities and strategize solutions. There are two common forms of purple team activity. The first is a tabletop exercise where red team members partner with blue team members (security staff, facility personnel, etc.) to conduct a scenario-based threat exercise focusing on a predetermined threat’s most likely course of action and most dangerous course of action. The second is a Left Seat/Right Seat, also known as an admin walk, where a red team leader and a blue team leader physically walk a space and discuss a threat actor attack focusing on the attack plan, the countermeasures or defensive posture, and the response measures.

Beyond impacts to morale, potential injury, and other ramifications of penetration testing, cost is a notable factor. Organizations that face greater threats must properly harden assets and sufficiently equip security forces. The cost transfer for this is generally to the customer, so careful consideration must be given in the risk management process regarding the probability of an event occurring and the impact it would have on the organization and its customers.

Obtaining authorization. No matter the type of program or the testing methodology employed, penetration test programs must include a few key fundamentals, beginning with the authorization for testing. Pen test authorization defines the origin, stimulus, and requirements for the test, including stakeholder coordination, policy requirements, countermeasure confirmation, and approval for the execution of the test.

The requirement and authorization of a penetration test can come from multiple avenues. Some might originate from an organizational need to test current control measures against a newly identified threat. Others might be requested by the primary LOB or senior leader of a facility to determine how attacks could affect key business processes or assets like supply chains.

Setting the scope. The scope of a penetration test is the sum of all the boundaries of an engagement, comprising all items to be tested or to be specifically excluded from that engagement. The scope can either whitelist items (things in scope, such as barriers, specific rooms, or buildings, etc.) or blacklist items (things that may not be tested, such as confidential information storage areas, personal storage units, etc.).

The scope should be defined early in the process and agreed upon by the stakeholder and the penetration test team, at a minimum. Once the scope is defined and agreed upon, subsequent preparation phase steps may commence.
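One way to make an agreed scope unambiguous is to record it as data that the test team can check a target against before acting. The following is a minimal sketch under stated assumptions: the `EngagementScope` class, its field names, and the example locations are hypothetical, and the rule that an excluded item always overrides an allowed one is a design choice made for this example rather than an established standard.

```python
from dataclasses import dataclass, field


@dataclass
class EngagementScope:
    """Hypothetical scope record agreed on by the stakeholder and test team."""
    allowed: set = field(default_factory=set)   # whitelist; empty means "no whitelist"
    excluded: set = field(default_factory=set)  # blacklist; always takes precedence

    def is_in_scope(self, target: str) -> bool:
        # Anything explicitly excluded is off limits, even if also listed as allowed.
        if target in self.excluded:
            return False
        # If a whitelist exists, only listed targets may be tested.
        if self.allowed:
            return target in self.allowed
        # With no whitelist, everything not excluded is in scope.
        return True


# Example locations are illustrative only.
scope = EngagementScope(
    allowed={"perimeter fence", "lobby", "server room"},
    excluded={"records vault", "personal lockers"},
)

for target in ("lobby", "records vault", "parking garage"):
    status = "in scope" if scope.is_in_scope(target) else "out of scope"
    print(f"{target}: {status}")
```

Leaving `allowed` empty gives a blacklist-only scope, in which any target not explicitly excluded may be tested.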

To limit the chances of negative collateral impact, rules of engagement (ROE) are paramount. ROE are directives that define the extent to which penetration testers may execute their craft to accomplish their mission. Most established programs have their ROE defined for general use under the scope for red team exercises. ROEs may be customized to meet specific requirements of a penetration test depending upon the facility, location, country, customs, laws, or specified target or threat.

Typical ROEs include 10 or more rules, but the most common is that if challenged by law enforcement, the red team member should immediately stop the exercise, identify themselves, present the authorization letter, and then immediately contact the exercise manager and client.

Remember, it’s always better to blacklist things that are off limits instead of whitelisting things; this helps to avoid open interpretation. However, limiting the scope can lead to a false sense of security because real attackers won’t adhere to such boundaries.

This is seen especially when companies browse the MITRE ATT&CK framework and ask testers to check a few known vulnerabilities in their assessment. But when a test is limited to a few attack vectors, it morphs from a true red team event into a simplified penetration test—reducing the number of potential security gaps the testers can uncover.

Effective penetration tests are also converged. Modern day security is predicated upon the connection between physical, cyber, and human aspects. It’s nearly impossible to test an organization’s true security posture if this interconnected relationship is limited.

Picking your partners. Collateral partner coordination—including law enforcement, landlords, and neighbors—must always be considered when planning a penetration test. It is important to note that this list is not exhaustive; each test should be different because each facility and partner is different. Taking the time to consider everyone who may be affected by a penetration test—before the test is executed—is exponentially better than dealing with the multitude of unintended and unexpected effects that may occur when this planning step is not conducted.

Consider a hypothetical—but frequently realized—example: a highly motivated security professional decides to conduct a penetration test on his own organization, but he fails to conduct the proper collateral partner coordination. During the exercise, the property manager contacts the managing director of the facility to inform her of suspicious activity. The managing director immediately contacts law enforcement, resulting in several hours of awkward interactions.

Adding specialized expertise. Training is often the biggest variable in developing and administering a penetration testing program, and it’s usually the reason for outsourcing this integral component in the process. Every profession has a plethora of specialties, and it’s rare to find someone who excels in every facet of their field. Having the skills, tools (and knowledge of how to use them), and experience can be the deciding characteristics of how to build a program and what strategy to use.

Most attacks aren’t blunt-force operations but highly specialized, multifaceted campaigns. Offensive security teams often focus on their core strengths, but with the explosion of new, intricate attack surfaces, it’s impossible for every member to master every skill required to defend against these sophisticated threats.

Red teams are composed of specialists, with each bringing unique expertise. Organizations might need to pull in outside experts to handle extremely complex or emerging threats that require deep, niche knowledge. This is especially true in areas like artificial intelligence (AI)-powered social engineering, where attackers leverage voice cloning or deepfake video to impersonate trusted contacts over platforms like Slack. Other examples include cognitive and information warfare, where understanding the psychological intricacies behind manipulation and polarization could demand months of focused study—time most internal teams don’t have.

Ransomware is another critical area. While most offensive security teams understand ransomware at a high level, they are often generalists. Ransomware threat actors are specialists who know every nuance of how ransomware operates, from initial infection to data exfiltration and ransom negotiation tactics. Outsourced experts bring an unmatched depth in these areas, allowing your organization to stay ahead of complex threats without diverting internal resources from their primary roles. These external red teams are essential when your in-house team encounters attack surfaces that demand cutting-edge knowledge and a level of dedication that can’t realistically be covered in-house. By bringing in outside expertise, you’re also providing a new level of understanding to your current team—expanding their skill set.

These are some of the critical components of a penetration testing program and are not meant to represent the entirety of a program. Building a program requires research, experience, and partnership.

Real-World Applications and Considerations

Red team exercises test not just technology but also the people and processes that are part of an organization’s security posture. This is crucial because human factors are often the weakest links in security. Real-world training and testing can uncover vulnerabilities in standard operating procedures and employee behaviors that computer-based and compliance-focused training seldom cover.

Traditional definitions of social engineering like pretexting, phishing, and smishing are no longer sufficient. Social engineering now includes advanced tactics like AI-generated content and voice cloning. Security testing must adapt to these new methods to stay ahead of attackers who are constantly innovating.

In scenarios involving critical infrastructure or government entities, the stakes are even higher. Red teaming that includes physical, cyber, and social engineering aspects can uncover vulnerabilities that might be exploited by sophisticated adversaries, including nation-states or advanced criminal groups.

The value of rules of engagement cannot be overstated, especially regarding the adversarial force’s planned activities. Attackers must clearly understand what actions will be performed and what actions will be simulated, as well as agreed-upon times to complete a simulated activity. Additionally, the anticipated actions of employees and security forces must be considered.

To be effective, the red team event should only be known to a select number of key personnel within the organization so participants have a sense of the stakes and pressure of a real-life incident. Depending on the nature of such events, it is advisable to have an individual controller or, at a minimum, an area controller who is aware of the intended activity where interaction between an adversary and organizational personnel may occur. All organizations should strive for their employees to maintain a questioning attitude. Security professionals are trained to look for what is correct and to promptly notice when something seems out of the ordinary. Anything or anyone that seems out of place should make security professionals question why and pursue additional information.

However, one never knows when or where individuals may take matters into their own hands during an exercise to confront a perceived invader. This is where the organizational controller would step in and clarify the situation.

The risk is compounded exponentially when conducting force-on-force red team activities against fixed installations with an armed security force, and whenever local law enforcement agencies participate. These activities introduce many uncertain variables, including interactions that could escalate, so it is imperative that live weapons and ammunition are not available to participants involved in the test. Personnel on shift who are not involved in the test should also have barriers on their weapons or be sequestered until the test is completed.

Demonstrating the Value of Testing

A properly developed and administered penetration testing program provides several forms of value by uncovering unknown logical, physical, and human vulnerabilities in the organization’s security. A solid program or event tests people, places, and things to ensure policy and procedure adherence, as well as individual diligence and compliance. Red teams should provide training, empowering those involved.

Red teams exist in an organization to first and foremost ensure that security’s defensive posture is doing what it is supposed to be doing. That defensive posture is also composed largely of humans, which means that a red team is testing that the organization’s people are doing what is required of them. However, what makes red teams so vital is human nature. Most people naturally want to help others when they see someone in distress. They will often choose kindness over apathy. This is how humans survived for thousands of years. The unfortunate reality is that bad actors use this instinct to their advantage and prey on this empathetic side of human nature. It is the responsibility of the red team to train people about why bad actors do it, how they manipulate people, and what to do when you want to help someone but policy and procedure state otherwise.

Red teams also teach people in the organization what to look for, such as tape on a magnetic lock or something shoved into a hole for a latch. If an organization has 4,000 employees, they should also have 4,000 security personnel keeping watch—the red team helps ensure that happens.

The true strength of the red team isn’t its toolset or methodology; it is the team’s experience in how to adapt to changing circumstances, probe weaknesses, and wait for the defense to make an error. Once that weakness is found, a great red team will exploit it to determine what that vulnerability is tied to and how it connects with the human, logical, and physical aspects of security. Without that room to operate and adapt, the test has little value beyond that of a physical assessment or walkthrough.

For organizations undergoing a penetration test, the process can feel risky. Security directors and CSOs may be understandably hesitant to invite outsiders to scrutinize their defenses, potentially exposing vulnerabilities that could reflect poorly on the security team. However, forward-thinking security teams see penetration testing as an opportunity to strengthen their defenses and raise the organization’s standards, particularly against ever-evolving threats. By accepting this challenge, security departments can turn the tests into a valuable benchmark, using them to measure progress and adapt proactively. Rather than viewing penetration tests as a potential embarrassment, organizations should see them as a powerful tool to assess and enhance their resilience. Security teams that embrace regular exercises often develop greater pride in their capabilities, stronger confidence in their systems, and a competitive spirit that makes the organization more resilient.

The true purpose of red teaming is not to win or lose. It is to test layers of security by finding and exploiting vulnerabilities in an organization’s security posture. We are all on the same team with the ultimate objective of improving the security posture. Red teams are only adversaries for the sake of identifying vulnerabilities and seeking to exploit them so corrective actions can be implemented.

Luke Bencie is the managing director at Security Management International, LLC. For more than 20 years and in more than 100 countries, Bencie has been a consultant to the U.S. Department of State, U.S. Department of Defense, Fortune 500 companies, and foreign governments. He specializes in conducting strategic and security management assessments, performing counterintelligence and due-diligence investigations, and providing specialized intelligence advisory services. He is the author of six books on security.

Gary DeMercurio is the founder and CEO of Kaiju Security, which redefines red team operations, simulating real-world threats for global clients. He has more than 14 years of expertise in network, physical, and social engineering security. His journey, rising from an associate to global vice president at major consultancies, culminated in his current leadership role. Prior to his security career, he was an international process and quality engineer for the U.S. Navy's CH53K Project, and he served as a U.S. Marine Corps officer and Naval Aviator.

Eric Kready, CPP, manages a global physical penetration testing program for a global Fortune 500 company. He served more than 20 years in the U.S. Army before transitioning to a civilian career in the private sector where he engaged with ASIS International and earned the designation of Certified Protection Professional (CPP®). Kready has previously served in security operations and intelligence positions in the private sector.

Dennis Harrison, CPP, serves in senior leadership positions in two organizations with a focus on international nuclear security and full spectrum security operations. A 30-year U.S. Army veteran with more than 18 years serving in Special Forces assignments, Harrison was director of special operations (Allied Universal) for the Nuclear Energy Institute’s (NEI) Composite Adversary Force (CAF) for more than nine years. Harrison currently serves as the vice president of operations for Global Security Solutions with a continued focus on international nuclear security, and as the president of Five Star Global Security, a full spectrum security organization that is owned and operated by former U.S. special operators.