U.S. National Institute of Standards and Technology (NIST)

The NIST 800-61r2 standard provides guidelines for incident handling, particularly for analyzing incident-related data and determining the appropriate response to each incident. The guidelines can be followed independently of particular hardware platforms, operating systems, protocols, or applications.

The first step for an organization is to establish a computer security incident response capability (CSIRC). NIST recommends creating policies, plans, and procedures for establishing and maintaining a CSIRC.

Policy Elements

An incident response policy details how incidents should be handled based on the organization’s mission, size, and function. The policy should be reviewed regularly to adjust it to meet the goals of the roadmap that has been laid out. Policy elements include the following:

  • Statement of management commitment
  • Purpose and objectives of the policy
  • Scope of the policy
  • Definition of computer security incidents and related terms
  • Organizational structure and definition of roles, responsibilities, and levels of authority
  • Prioritization or severity ratings of incidents
  • Performance measures
  • Reporting and contact forms

Plan Elements

A good incident response plan helps to minimize damage caused by an incident. It also helps to make the overall incident response program better by adjusting it according to lessons learned. It will ensure that each party involved in the incident response has a clear understanding of not only what they will be doing, but what others will be doing as well. Plan elements are as follows:

  • Mission
  • Strategies and goals
  • Senior management approval
  • Organizational approach to incident response
  • How the incident response team will communicate with the rest of the organization and with other organizations
  • Metrics for measuring the incident response capability
  • How the program fits into the overall organization

Procedure Elements

The procedures that are followed during an incident response should follow the incident response plan. Procedure elements are as follows:

  • Technical processes
  • Using techniques
  • Filling out forms
  • Following checklists

These are typical standard operating procedures (SOPs). These SOPs should be detailed so that the mission and goals of the organization are kept in mind when the procedures are followed. SOPs minimize errors that may be caused by personnel who are under stress while participating in incident handling. It is important to share and practice these procedures, making sure that they are useful, accurate, and appropriate.

Incident Response Stakeholders

Other groups and individuals within the organization may also be involved with incident handling. It is important to ensure that they will cooperate before an incident is underway. Their expertise and abilities can help the Computer Security Incident Response Team (CSIRT) to handle the incident quickly and correctly. These are some of the stakeholders that may be involved in handling a security incident:

  • Management – Managers create the policies that everyone must follow. They also design the budget and are in charge of staffing all of the departments. Management must coordinate the incident response with other stakeholders and minimize the damage of an incident.
  • Information Assurance – This group may need to be called in to change things such as firewall rules during some stages of incident management such as containment or recovery.
  • IT Support – This is the group that works with the technology in the organization and understands it best. Because IT support has a deeper understanding, it is more likely that they will perform the correct action to minimize the effectiveness of the attack or preserve evidence properly.
  • Legal Department – It is a best practice to have the legal department review the incident policies, plans, and procedures to make sure that they do not violate any local or federal guidelines. Also, if any incident has legal implications, a legal expert will need to become involved. This might include prosecution, evidence collection, or lawsuits.
  • Public Affairs and Media Relations – There are times when the media and the public might need to be informed of an incident, such as when their personal information has been compromised during an incident.
  • Human Resources – The human resources department might need to perform disciplinary measures if an incident caused by an employee occurs.
  • Business Continuity Planners – Security incidents may alter an organization’s business continuity. It is important that those in charge of business continuity planning are aware of security incidents and the impact they have had on the organization as a whole. This will allow them to make any changes in plans and risk assessments.
  • Physical Security and Facilities Management – When a security incident happens because of a physical attack, such as tailgating or shoulder surfing, these teams might need to be informed and involved. It is also their responsibility to secure facilities that contain evidence from an investigation.

NIST Incident Response Life Cycle

  • Preparation – The members of the CSIRT are trained in how to respond to an incident. CSIRT members should continually develop knowledge of emerging threats.
  • Detection and Analysis – Through continuous monitoring, the CSIRT quickly identifies, analyzes, and validates an incident.
  • Containment, Eradication, and Recovery – The CSIRT implements procedures to contain the threat, eradicate the impact on organizational assets, and use backups to restore data and software. This phase may cycle back to detection and analysis to gather more information, or to expand the scope of the investigation.
  • Post-Incident Activities – The CSIRT then documents how the incident was handled, recommends changes for future response, and specifies how to avoid a reoccurrence.

Preparation

The preparation phase is when the CSIRT is created and trained. This phase is also when the tools and assets that will be needed by the team to investigate incidents are acquired and deployed. The following list has examples of actions that also take place during the preparation phase:

  • Organizational processes are created to address communication between people on the response team. This includes such things as contact information for stakeholders, other CSIRTs, and law enforcement, an issue tracking system, smartphones, encryption software, etc.
  • Facilities to host the response team and the SOC are created.
  • Necessary hardware and software for incident analysis and mitigation are acquired. This may include forensic software, spare computers, servers and network devices, backup devices, packet sniffers, and protocol analyzers.
  • Risk assessments are used to implement controls that will limit the number of incidents.
  • Validation of security hardware and software deployment is performed on end-user devices, servers, and network devices.
  • User security awareness training materials are developed.

Additional incident analysis resources might be required. Examples of these resources are a list of critical assets, network diagrams, port lists, hashes of critical files, and baseline readings of system and network activity. Mitigation software is also an important item when preparing to handle a security incident. An image of a clean OS and application installation files may be needed to recover a computer from an incident.
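
As an illustration of the "hashes of critical files" resource, the following is a minimal sketch of recording a hash baseline during preparation and checking it later. The function names (`sha256_of`, `build_baseline`, `changed_files`) and the choice of SHA-256 are assumptions made for this example; they are not prescribed by NIST.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths: list[Path]) -> dict[str, str]:
    """Map each critical file to its hash while the system is known to be clean."""
    return {str(path): sha256_of(path) for path in paths}


def changed_files(baseline: dict[str, str]) -> list[str]:
    """Return files that are missing or whose current hash differs from the baseline."""
    changed = []
    for name, expected in baseline.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            changed.append(name)
    return changed
```

The baseline itself should be stored somewhere an attacker cannot modify, otherwise a later comparison proves little.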

Often, the CSIRT may have a jump kit prepared. This is a portable box with many of the items listed above to help in establishing a swift response. Some of these items may be a laptop with appropriate software installed, backup media, and any other hardware, software, or information to help in the investigation. It is important to inspect the jump kit on a regular basis to install updates and make sure that all the necessary elements are available and ready for use. It is helpful to practice deploying the jump kit with the CSIRT to ensure that the team members know how to use its contents properly.

Detection and Analysis

Different types of incidents will require different responses.

Attack vectors.

  • Web – Any attack that is initiated from a website or application hosted by a website.
  • Email – Any attack that is initiated from an email or email attachment.
  • Loss or Theft – Any equipment that is used by the organization such as a laptop, desktop, or smartphone can provide the required information for someone to initiate an attack.
  • Impersonation – When something or someone is replaced for the purpose of malicious intent.
  • Attrition – Any attack that uses brute force to attack devices, networks, or services.
  • Media – Any attack that is initiated from external storage or removable media.

Detection.

There are automated ways of detection such as antivirus software or an IDS. There are also manual detections through user reports.

It is important to accurately determine the type of incident and the extent of the effects. There are two categories for the signs of an incident:

  • Precursor – This is a sign that an incident might occur in the future. When precursors are detected, an attack might be avoided by altering security measures to specifically address the type of attack detected. Examples of precursors are log entries that show a response to a port scan, or a newly-discovered vulnerability to an organization’s web server.
  • Indicator – This is a sign that an incident might already have occurred or is currently occurring. Some examples of indicators are a host that has been infected with malware, multiple failed logins from an unknown source, or an IDS alert.
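
One of the indicators above, multiple failed logins from an unknown source, can be sketched as a simple count. This is only an illustration: the event format (pairs of source address and success flag), the `known_sources` set, and the threshold of 5 are all assumptions made for the example.

```python
from collections import Counter


def suspicious_sources(events, known_sources, threshold=5):
    """Return unknown sources with at least `threshold` failed logins.

    `events` is an iterable of (source, succeeded) pairs; this format is
    hypothetical and would normally come from parsed authentication logs.
    """
    failures = Counter(
        source
        for source, succeeded in events
        if not succeeded and source not in known_sources
    )
    return sorted(source for source, count in failures.items() if count >= threshold)
```

A result from a check like this is a lead to analyze, not proof of an incident.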

Analysis.

When an indicator is found to be accurate, it does not necessarily mean that a security incident has occurred. Some indicators happen for other reasons besides security. A server that continually crashes, for example, may have bad RAM instead of a buffer overflow attack occurring. To be safe, even ambiguous or contradictory symptoms must be analyzed to determine if a legitimate security incident has taken place. The CSIRT must react quickly to validate and analyze incidents. This is performed by following a predefined process and documenting each step.

Scoping.

When the CSIRT believes that an incident has occurred, it should immediately perform an initial analysis to determine the incident’s scope, such as which networks, systems, or applications are affected, who or what originated the incident, and how the incident is occurring. This scoping activity should provide enough information for the team to prioritize subsequent activities, such as containment of the incident and deeper analysis of the effects of the incident.

Incident Notification.

When an incident is analyzed and prioritized, the incident response team needs to notify the appropriate stakeholders and outside parties so that all who need to be involved will play their roles. Examples of parties that are typically notified include:

  • Chief Information Officer (CIO)
  • Head of information security
  • Local information security officer
  • Other incident response teams within the organization
  • External incident response teams (if appropriate)
  • System owner
  • Human resources (for cases involving employees, such as harassment through email)
  • Public affairs (for incidents that may generate publicity)
  • Legal department (for incidents with potential legal ramifications)
  • US-CERT (required for Federal agencies and systems operated on behalf of the Federal government)
  • Law enforcement (if appropriate)

Containment, Eradication, and Recovery

Containment strategy

These are some conditions to determine the type of strategy to create for each incident type:

  • How long will it take to implement and complete a solution?
  • How much time and how many resources will be needed to implement the strategy?
  • What is the process to preserve evidence?
  • Can an attacker be redirected to a sandbox so that the CSIRT can safely document the attacker’s methodology?
  • What will be the impact to the availability of services?
  • What is the extent of damage to resources or assets?
  • How effective is the strategy?

Evidence.

These are some of the most important items to log when documenting evidence used in the chain of custody:

  • Location of the recovery and storage of all evidence
  • Any identifying criteria for all evidence such as serial number, MAC address, hostname, or IP address
  • Identification information for all of the people that participated in collecting or handling the evidence
  • Time and date that the evidence was collected and each instance it was handled
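
The items above map naturally onto a small record structure. The following is a minimal sketch; the class and field names (`EvidenceItem`, `CustodyEvent`, and so on) are hypothetical and chosen only to mirror the list.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CustodyEvent:
    handler: str         # who collected or handled the evidence
    action: str          # e.g. "collected", "transferred", "analyzed"
    timestamp: datetime  # when this handling took place


@dataclass
class EvidenceItem:
    identifier: str        # serial number, MAC address, hostname, or IP address
    recovered_at: str      # where the evidence was recovered
    storage_location: str  # where the evidence is stored
    history: list[CustodyEvent] = field(default_factory=list)

    def record(self, handler: str, action: str, when: Optional[datetime] = None) -> None:
        """Append one handling event; defaults to the current UTC time."""
        self.history.append(
            CustodyEvent(handler, action, when or datetime.now(timezone.utc))
        )
```

Keeping the history append-only, with one entry per handling, is what lets the record show an unbroken chain from collection onward.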

Attacker Identification.

These are some of the most important actions to perform to attempt to identify an attacking host during a security incident:

  • Use incident databases to research related activity. This database may be in-house or located at organizations that collect data from other organizations and consolidate it into incident databases such as the VERIS community database.
  • Validate the attacker’s IP address to determine if it is a viable one. The host may or may not respond to a request for connectivity. This may be because it has been configured to ignore the requests, or the address has already been reassigned to another host.
  • Use an internet search engine to gain additional information about the attack. There may have been another organization or individual that has released information about an attack from the identified source IP address.
  • Monitor the communication channels that some attackers use, such as IRC. Because users can be disguised or anonymized in IRC channels, they may talk about their exploits in these channels. Often, the information gathered from this type of monitoring is misleading and should be treated as leads and not facts.
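
A first, offline step in validating an attacker's IP address is checking that the value is well formed and publicly routable before spending time on it. The sketch below uses Python's standard `ipaddress` module; the function name and the category labels are assumptions for this example, and it does not test whether the host actually responds.

```python
import ipaddress


def classify_address(text: str) -> str:
    """Classify a reported address as invalid, loopback, public, or non-public."""
    try:
        address = ipaddress.ip_address(text)
    except ValueError:
        return "invalid"      # not a well-formed IPv4 or IPv6 address
    if address.is_loopback:
        return "loopback"     # traffic originated on the local host
    if address.is_global:
        return "public"       # routable on the internet, worth researching
    return "non-public"       # private, link-local, or otherwise reserved
```

A "non-public" result often means the logged address belongs to an internal host or an intermediate device rather than the original attacker.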

Eradication, recovery and remediation.

All of the effects of the security incident must be eliminated. This includes malware infections and user accounts that have been compromised. All of the vulnerabilities that were exploited by the attacker must also be corrected or patched so that the incident does not occur again.

To recover hosts, use clean and recent backups, or rebuild them with installation media if no backups are available or they have been compromised. Also, fully update and patch the operating systems and installed software of all hosts. Change all host passwords and passwords for critical systems in accordance with the password security policy. This may be a good time to validate and upgrade network security, backup strategies, and security policies. Attackers often attack the systems again, or use a similar attack to target additional resources, so be sure to prevent this as best as possible. Focus on what can be fixed quickly while prioritizing critical systems and operations.

Post-Incident Activities

After a major incident has been handled, the organization should hold a “lessons learned” meeting to review the effectiveness of the incident handling process and identify necessary hardening needed for existing security controls and practices. Examples of good questions to answer during the meeting include the following:

  • Exactly what happened, and when?
  • How well did the staff and management perform while dealing with the incident?
  • Were the documented procedures followed? Were they adequate?
  • What information was needed sooner?
  • Were any steps or actions taken that might have inhibited the recovery?
  • What would the staff and management do differently the next time a similar incident occurs?
  • How could information sharing with other organizations be improved?
  • What corrective actions can prevent similar incidents in the future?
  • What precursors or indicators should be watched for in the future to detect similar incidents?
  • What additional tools or resources are needed to detect, analyze, and mitigate future incidents?

NIST Special Publication 800-61 provides the following examples of activities that are performed during an objective assessment of an incident:

  • Reviewing logs, forms, reports, and other incident documentation for adherence to established incident response policies and procedures.
  • Identifying which precursors and indicators of the incident were recorded to determine how effectively the incident was logged and identified.
  • Determining if the incident caused damage before it was detected.
  • Determining if the actual cause of the incident was identified, and identifying the vector of attack, the vulnerabilities exploited, and the characteristics of the targeted or victimized systems, networks, and applications.
  • Determining if the incident is a recurrence of a previous incident.
  • Calculating the estimated monetary damage from the incident (e.g., information and critical business processes negatively affected by the incident).
  • Measuring the difference between the initial impact assessment and the final impact assessment.
  • Identifying which measures, if any, could have prevented the incident.

Subjective assessment of each incident requires that incident response team members assess their own performance, as well as that of other team members and of the entire team. Another valuable source of input is the owner of a resource that was attacked, in order to determine if the owner thinks the incident was handled efficiently and if the outcome was satisfactory.

These are some of the determining factors for evidence retention:

  • Prosecution – When an attacker will be prosecuted because of a security incident, the evidence should be retained until after all legal actions have been completed. This may be several months or many years. In legal actions, no evidence should be overlooked or considered insignificant. An organization’s policy may state that any evidence surrounding an incident that has been involved with legal actions must never be deleted or destroyed.
  • Data Type – An organization may state that particular types of data should be kept for a specific period of time. Items such as email or text may only need to be kept for 90 days. More important data, such as that used in an incident response that has not led to legal action, may need to be kept for three years or more.
  • Cost – If there is a lot of hardware and storage media that needs to be stored for a long time, it can become costly. Remember also that as technology changes, functional devices that can use outdated hardware and storage media must be stored as well.
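
The prosecution and data type factors can be expressed as a simple lookup. This is a sketch only: the retention periods reuse the example figures above (90 days, three years), the data type names are hypothetical, and a real policy would come from the organization and its legal department.

```python
from datetime import date, timedelta
from typing import Optional

# Example periods taken from the text above; real values are set by policy.
RETENTION_DAYS = {"email": 90, "incident_data": 3 * 365}


def retention_deadline(
    data_type: str, collected: date, legal_hold: bool = False
) -> Optional[date]:
    """Return the earliest disposal date, or None if the evidence must be kept.

    Evidence tied to legal action is never given a disposal date here.
    """
    if legal_hold:
        return None
    return collected + timedelta(days=RETENTION_DAYS[data_type])
```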

The critical recommendations from NIST for sharing information are as follows:

  • Plan incident coordination with external parties before incidents occur.
  • Consult with the legal department before initiating any coordination efforts.
  • Perform incident information sharing throughout the incident response life cycle.
  • Attempt to automate as much of the information sharing process as possible.
  • Balance the benefits of information sharing with the drawbacks of sharing sensitive information.

example 28.4.13-lab—incident-handling

Security Onion Architecture

CapME : This is a web application that allows viewing of pcap transcripts rendered with the tcpflow or Zeek tools. CapME can be accessed from the Enterprise Log Search and Archive (ELSA) tool. CapME provides the cybersecurity analyst with an easy-to-read means of viewing an entire Layer 4 session. CapME acts as a plugin to ELSA and provides access to relevant pcap files that can be opened in Wireshark.

Snort : This is a Network Intrusion Detection System (NIDS). It is an important source of alert data that is indexed in the Sguil analysis tool. Snort uses rules and signatures to generate alerts. Snort can automatically download new rules using the PulledPork component of Security Onion. Snort and PulledPork are open source tools that are sponsored by Cisco.

Zeek : Formerly known as Bro. This is a NIDS that uses more of a behavior-based approach to intrusion detection. Rather than using signatures or rules, Zeek uses policies, in the form of scripts that determine what data to log and when to issue alert notifications. Zeek can also submit file attachments for malware analysis, block access to malicious locations, and shut down a computer that appears to be violating security policies.

OSSEC : This is a host-based intrusion detection system (HIDS) that is integrated into Security Onion. It actively monitors host system operations, including conducting file integrity monitoring, local log monitoring, system process monitoring, and rootkit detection. OSSEC alerts and log data are available to Sguil and Kibana. OSSEC requires an agent to be running on the Windows computers in the enterprise.

Wazuh : This is a HIDS that will replace OSSEC in Security Onion. It is a full-featured solution that provides a broad spectrum of endpoint protection mechanisms including host logfile analysis, file integrity monitoring, vulnerability detection, configuration assessment, and incident response. Like OSSEC, it requires agents to be running on network hosts.

Suricata : This is a NIDS that uses a signature-based approach. It can also be used for inline intrusion prevention. It is similar to Zeek; however, Suricata uses native multithreading, which allows the distribution of packet stream processing across multiple processor cores. It also includes some additional features such as reputation-based blocking and support for Graphics Processing Unit (GPU) multithreading for performance improvement.

  • Sguil – This provides a high-level console for investigating security alerts from a wide variety of sources. Sguil serves as a starting point in the investigation of security alerts. A wide variety of data sources are available to the cybersecurity analyst by pivoting directly from Sguil to other tools.
  • Kibana – Kibana is an interactive dashboard interface to Elasticsearch data. It allows querying of NSM data and provides flexible visualizations of that data. It provides data exploration and machine learning data analysis features. It is possible to pivot from Sguil directly into Kibana to see contextualized displays based on the source and destination IP addresses that are associated with an alert. Search the internet and visit the elastic.co website to learn more about the many features of Kibana.
  • Wireshark – This is a packet capture application that is integrated into the Security Onion suite. It can be opened directly from other tools and will display full packet captures relevant to an analysis.
  • Zeek – This is a network traffic analyzer that serves as a security monitor. Zeek inspects all traffic on a network segment and enables in-depth analysis of that data. Pivoting from Sguil into Zeek provides access to very accurate transaction logs, file content, and customized output.
  • NIDS – Snort, Zeek, and Suricata
  • HIDS – OSSEC, Wazuh
  • Asset management and monitoring – Passive Asset Detection System (PADS)
  • HTTP, DNS, and TCP transactions – Recorded by Zeek and pcaps
  • Syslog messages – Multiple sources

Alerts will generally include five-tuple information:

  • SrcIP – the source IP address for the event.
  • SPort – the source (local) Layer 4 port for the event.
  • DstIP – the destination IP for the event.
  • DPort – the destination Layer 4 port for the event.
  • Pr – the IP protocol number for the event.
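
The five fields can be held in a small immutable record. The following is an illustrative sketch; the class name, field names, and the short protocol table are assumptions for the example (1, 6, and 17 are the standard IP protocol numbers for ICMP, TCP, and UDP).

```python
from typing import NamedTuple

# A few well-known IP protocol numbers; the full registry is much larger.
PROTOCOL_NAMES = {1: "ICMP", 6: "TCP", 17: "UDP"}


class FiveTuple(NamedTuple):
    src_ip: str     # SrcIP
    src_port: int   # SPort
    dst_ip: str     # DstIP
    dst_port: int   # DPort
    protocol: int   # Pr (IP protocol number)

    def describe(self) -> str:
        """Render the tuple in a compact, readable form."""
        name = PROTOCOL_NAMES.get(self.protocol, str(self.protocol))
        return f"{name} {self.src_ip}:{self.src_port} -> {self.dst_ip}:{self.dst_port}"
```

Because the record is hashable, it can be used directly as a dictionary key when grouping alerts that belong to the same flow.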

SIEM Inputs and Outputs

SIEM combines the essential functions of security event management (SEM) and security information management (SIM) tools to provide a comprehensive view of the enterprise network using the following functions:

  • Log collection – Event records from sources throughout the organization provide important forensic information and help to address compliance reporting requirements.
  • Normalization – This maps log messages from different systems into a common data model, enabling the organization to connect and analyze related events, even if they are initially logged in different source formats.
  • Correlation – This links logs and events from disparate systems or applications, speeding detection of and reaction to security threats.
  • Aggregation – This reduces the volume of event data by consolidating duplicate event records.
  • Reporting – This presents the correlated, aggregated event data in real-time monitoring and long-term summaries, including graphical interactive dashboards.
  • Compliance – This is reporting to satisfy the requirements of various compliance regulations.
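
Normalization and aggregation can be illustrated with a toy example. The two input formats below (a firewall-style record with `src` and `msg`, and a host-style record with `source_address` and `event`) are hypothetical; a real SIEM handles many more formats and fields.

```python
from collections import Counter


def normalize(record: dict) -> tuple[str, str]:
    """Map a record from either hypothetical format to (source_ip, event)."""
    if "src" in record:  # hypothetical firewall log format
        return record["src"], record["msg"].lower()
    # hypothetical host log format
    return record["source_address"], record["event"].lower()


def aggregate(records) -> Counter:
    """Consolidate duplicate events, counting how often each one occurred."""
    return Counter(normalize(record) for record in records)
```

Once records share a common model, events reported by different systems about the same source collapse into one entry with a count, which is also the starting point for correlation.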

A popular SIEM is Splunk, which is made by a Cisco partner. The figure shows a Splunk Threat Dashboard. Splunk is widely used in SOCs. Another popular SIEM solution is Security Onion with ELK, which consists of the integrated Elasticsearch, Logstash, and Kibana applications. Security Onion includes other open-source network security monitoring tools.

Common NGFW events include:

  • Connection Event – Connection logs contain data about sessions that are detected directly by the NGIPS. Connection events include basic connection properties such as timestamps, source and destination IP addresses, and metadata about why the connection was logged, such as which access control rule logged the event.
  • Intrusion Event – The system examines the packets that traverse the network for malicious activity that could affect the availability, integrity, and confidentiality of a host and its data. When the system identifies a possible intrusion, it generates an intrusion event, which is a record of the date, time, type of exploit, and contextual information about the source of the attack and its target.
  • Host or Endpoint Event – When a host appears on the network it can be detected by the system and details of the device hardware, IP addressing, and the last known presence on the network can be logged.
  • Network Discovery Event – Network discovery events represent changes that have been detected in the monitored network. These changes are logged in response to network discovery policies that specify the kinds of data to be collected, the network segments to be monitored, and the hardware interfaces of the device that should be used for event collection.
  • NetFlow Event – Network discovery can use a number of mechanisms, one of which is to use exported NetFlow flow records to generate new events for hosts and servers.