The NIST 800-61r2 standard provides guidelines for incident handling, particularly for analyzing incident-related data and determining the appropriate response to each incident. The guidelines can be followed independently of particular hardware platforms, operating systems, protocols, or applications.
The first step for an organization is to establish a computer security incident response capability (CSIRC). NIST recommends creating policies, plans, and procedures for establishing and maintaining a CSIRC.
Policy Elements
An incident response policy details how incidents should be handled based on the organization’s mission, size, and function. The policy should be reviewed regularly to adjust it to meet the goals of the roadmap that has been laid out. Policy elements include the following:
- Statement of management commitment
- Purpose and objectives of the policy
- Scope of the policy
- Definition of computer security incidents and related terms
- Organizational structure and definition of roles, responsibilities, and levels of authority
- Prioritization or severity ratings of incidents
- Performance measures
- Reporting and contact forms
Plan Elements
A good incident response plan helps to minimize damage caused by an incident. It also helps to make the overall incident response program better by adjusting it according to lessons learned. It will ensure that each party involved in the incident response has a clear understanding of not only what they will be doing, but what others will be doing as well. Plan elements are as follows:
- Mission
- Strategies and goals
- Senior management approval
- Organizational approach to incident response
- How the incident response team will communicate with the rest of the organization and with other organizations
- Metrics for measuring the incident response capability
- How the program fits into the overall organization
Procedure Elements
The procedures that are followed during an incident response should follow the incident response plan. Procedure elements are as follows:
- Technical processes
- Using techniques
- Filling out forms
- Following checklists
These are typical standard operating procedures (SOPs). These SOPs should be detailed so that the mission and goals of the organization are in mind when these procedures are followed. SOPs minimize errors that may be caused by personnel that are under stress while participating in incident handling. It is important to share and practice these procedures, making sure that they are useful, accurate, and appropriate.
Incident Response Stakeholders
Other groups and individuals within the organization may also be involved with incident handling. It is important to ensure that they will cooperate before an incident is underway. Their expertise and abilities can help the Computer Security Incident Response Team (CSIRT) to handle the incident quickly and correctly. These are some of the stakeholders that may be involved in handling a security incident:
- Management – Managers create the policies that everyone must follow. They also design the budget and are in charge of staffing all of the departments. Management must coordinate the incident response with other stakeholders and minimize the damage of an incident.
- Information Assurance – This group may need to be called in to change things such as firewall rules during some stages of incident management such as containment or recovery.
- IT Support – This is the group that works with the technology in the organization and understands it the most. Because IT support has a deeper understanding, it is more likely that they will perform the correct action to minimize the effectiveness of the attack or preserve evidence properly.
- Legal Department – It is a best practice to have the legal department review the incident policies, plans, and procedures to make sure that they do not violate any local or federal guidelines. Also, if any incident has legal implications, a legal expert will need to become involved. This might include prosecution, evidence collection, or lawsuits.
- Public Affairs and Media Relations – There are times when the media and the public might need to be informed of an incident, such as when their personal information has been compromised during an incident.
- Human Resources – The human resources department might need to perform disciplinary measures if an incident caused by an employee occurs.
- Business Continuity Planners – Security incidents may alter an organization’s business continuity. It is important that those in charge of business continuity planning are aware of security incidents and the impact they have had on the organization as a whole. This will allow them to make any changes in plans and risk assessments.
- Physical Security and Facilities Management – When a security incident happens because of a physical attack, such as tailgating or shoulder surfing, these teams might need to be informed and involved. It is also their responsibility to secure facilities that contain evidence from an investigation.
NIST Incident Response Life Cycle
- Preparation – The members of the CSIRT are trained in how to respond to an incident. CSIRT members should continually develop knowledge of emerging threats.
- Detection and Analysis – Through continuous monitoring, the CSIRT quickly identifies, analyzes, and validates an incident.
- Containment, Eradication, and Recovery – The CSIRT implements procedures to contain the threat, eradicate the impact on organizational assets, and use backups to restore data and software. This phase may cycle back to detection and analysis to gather more information, or to expand the scope of the investigation.
- Post-Incident Activities – The CSIRT then documents how the incident was handled, recommends changes for future response, and specifies how to avoid a reoccurrence.
Preparation
The preparation phase is when the CSIRT is created and trained. This phase is also when the tools and assets that will be needed by the team to investigate incidents are acquired and deployed. The following list has examples of actions that also take place during the preparation phase:
- Organizational processes are created to address communication between people on the response team. This includes such things as contact information for stakeholders, other CSIRTs, and law enforcement, an issue tracking system, smartphones, encryption software, etc.
- Facilities to host the response team and the SOC are created.
- Necessary hardware and software for incident analysis and mitigation are acquired. This may include forensic software, spare computers, servers and network devices, backup devices, packet sniffers, and protocol analyzers.
- Risk assessments are used to implement controls that will limit the number of incidents.
- Validation of security hardware and software deployment is performed on end-user devices, servers, and network devices.
- User security awareness training materials are developed.
Additional incident analysis resources might be required. Examples of these resources are a list of critical assets, network diagrams, port lists, hashes of critical files, and baseline readings of system and network activity. Mitigation software is also an important item when preparing to handle a security incident. An image of a clean OS and application installation files may be needed to recover a computer from an incident.
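As an illustration of the "hashes of critical files" resource mentioned above, the following is a minimal sketch of recording a known-good baseline and later checking files against it. The function names (`sha256_of`, `build_baseline`, `changed_files`) and the choice of SHA-256 are assumptions made for this example; the source does not prescribe a hash algorithm or tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths: list[Path]) -> dict[str, str]:
    """Record a known-good hash for each critical file (hypothetical helper)."""
    return {str(p): sha256_of(p) for p in paths}


def changed_files(baseline: dict[str, str]) -> list[str]:
    """List baseline entries whose file is now missing or hashes differently."""
    changed = []
    for name, expected in baseline.items():
        path = Path(name)
        if not path.is_file() or sha256_of(path) != expected:
            changed.append(name)
    return changed
```

In practice the baseline would be built during the preparation phase and stored somewhere an attacker cannot modify it, so that a later mismatch can be treated as a lead worth analyzing.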
Often, the CSIRT may have a jump kit prepared. This is a portable box with many of the items listed above to help in establishing a swift response. Some of these items may be a laptop with appropriate software installed, backup media, and any other hardware, software, or information to help in the investigation. It is important to inspect the jump kit on a regular basis to install updates and make sure that all the necessary elements are available and ready for use. It is helpful to practice deploying the jump kit with the CSIRT to ensure that the team members know how to use its contents properly.
Detection and Analysis
Different types of incidents will require different responses.
Attack vectors.
- Web – Any attack that is initiated from a website or application hosted by a website.
- Email – Any attack that is initiated from an email or email attachment.
- Loss or Theft – Any equipment that is used by the organization such as a laptop, desktop, or smartphone can provide the required information for someone to initiate an attack.
- Impersonation – Any attack in which something benign is replaced with something malicious, such as spoofing or a rogue wireless access point.
- Attrition – Any attack that uses brute force to attack devices, networks, or services.
- Media – Any attack that is initiated from external storage or removable media.
Detection.
There are automated ways of detection such as antivirus software or an IDS. There are also manual detections through user reports.
It is important to accurately determine the type of incident and the extent of the effects. There are two categories for the signs of an incident:
- Precursor – This is a sign that an incident might occur in the future. When precursors are detected, an attack might be avoided by altering security measures to specifically address the type of attack detected. Examples of precursors are log entries that show a response to a port scan, or a newly discovered vulnerability in an organization’s web server.
- Indicator – This is a sign that an incident might already have occurred or is currently occurring. Some examples of indicators are a host that has been infected with malware, multiple failed logins from an unknown source, or an IDS alert.
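One of the indicators named above, multiple failed logins from an unknown source, can be sketched as a simple count over parsed log events. The event layout (dictionaries with `source`, `action`, and `success` keys), the function name, and the threshold of 5 are all assumptions for this example, not values taken from NIST.

```python
from collections import Counter

# Hypothetical threshold; a real deployment would tune this to its own baseline.
FAILED_LOGIN_THRESHOLD = 5


def suspicious_sources(events: list[dict], threshold: int = FAILED_LOGIN_THRESHOLD) -> list[str]:
    """Return source addresses with at least `threshold` failed logins, sorted."""
    failures = Counter(
        event["source"]
        for event in events
        if event["action"] == "login" and not event["success"]
    )
    return sorted(src for src, count in failures.items() if count >= threshold)
```

A source returned by this function is only an indicator. It still has to be analyzed before it is treated as a confirmed incident.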
Analysis.
When an indicator is found to be accurate, it does not necessarily mean that a security incident has occurred. Some indicators happen for reasons unrelated to security. A server that continually crashes, for example, may have bad RAM rather than be the target of a buffer overflow attack. To be safe, even ambiguous or contradictory symptoms must be analyzed to determine if a legitimate security incident has taken place. The CSIRT must react quickly to validate and analyze incidents. This is performed by following a predefined process and documenting each step.
Scoping.
When the CSIRT believes that an incident has occurred, it should immediately perform an initial analysis to determine the incident’s scope, such as which networks, systems, or applications are affected, who or what originated the incident, and how the incident is occurring. This scoping activity should provide enough information for the team to prioritize subsequent activities, such as containment of the incident and deeper analysis of the effects of the incident.
Incident Notification.
When an incident is analyzed and prioritized, the incident response team needs to notify the appropriate stakeholders and outside parties so that all who need to be involved will play their roles. Examples of parties that are typically notified include:
- Chief Information Officer (CIO)
- Head of information security
- Local information security officer
- Other incident response teams within the organization
- External incident response teams (if appropriate)
- System owner
- Human resources (for cases involving employees, such as harassment through email)
- Public affairs (for incidents that may generate publicity)
- Legal department (for incidents with potential legal ramifications)
- US-CERT (required for Federal agencies and systems operated on behalf of the Federal government)
- Law enforcement (if appropriate)
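The conditional entries in the list above (for example, human resources only for cases involving employees) can be sketched as a small routing function. The `Incident` fields, the function name, and the decision to treat the CIO, head of information security, and system owner as always notified are assumptions for illustration; an organization's own policy defines the real rules.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    """Hypothetical flags describing an analyzed and prioritized incident."""
    involves_employee: bool = False
    may_generate_publicity: bool = False
    has_legal_ramifications: bool = False
    federal_system: bool = False


def parties_to_notify(incident: Incident) -> list[str]:
    """Return the parties to notify, always-notified parties first."""
    # Assumption: these three are notified for every incident.
    parties = [
        "Chief Information Officer",
        "Head of information security",
        "System owner",
    ]
    if incident.involves_employee:
        parties.append("Human resources")
    if incident.may_generate_publicity:
        parties.append("Public affairs")
    if incident.has_legal_ramifications:
        parties.append("Legal department")
    if incident.federal_system:
        parties.append("US-CERT")
    return parties
```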
Containment, Eradication, and Recovery
Containment strategy.
These are some conditions to determine the type of strategy to create for each incident type:
- How long will it take to implement and complete a solution?
- How much time and how many resources will be needed to implement the strategy?
- What is the process to preserve evidence?
- Can an attacker be redirected to a sandbox so that the CSIRT can safely document the attacker’s methodology?
- What will be the impact to the availability of services?
- What is the extent of damage to resources or assets?
- How effective is the strategy?
Evidence.
These are some of the most important items to log when documenting evidence used in the chain of custody:
- Location of the recovery and storage of all evidence
- Any identifying criteria for all evidence such as serial number, MAC address, hostname, or IP address
- Identification information for all of the people that participated in collecting or handling the evidence
- Time and date that the evidence was collected and each instance it was handled
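The four items above map naturally onto a small record structure. The following is a minimal sketch; the class and field names are hypothetical, and a real chain-of-custody log would also need tamper protection, which this example does not attempt.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class CustodyEvent:
    handler: str          # who collected or handled the evidence
    action: str           # what they did, e.g. "collected" or "imaged"
    timestamp: datetime   # when it happened


@dataclass
class EvidenceItem:
    identifier: str         # e.g. serial number, MAC address, hostname, or IP address
    recovery_location: str  # where the evidence was recovered
    storage_location: str   # where the evidence is stored
    events: list[CustodyEvent] = field(default_factory=list)

    def record(self, handler: str, action: str, when: Optional[datetime] = None) -> None:
        """Append a handling event, defaulting to the current UTC time."""
        self.events.append(CustodyEvent(handler, action, when or datetime.now(timezone.utc)))

    def handlers(self) -> list[str]:
        """Everyone who has handled the item, in order of first contact."""
        seen: list[str] = []
        for event in self.events:
            if event.handler not in seen:
                seen.append(event.handler)
        return seen
```

Appending an event for every hand-off, rather than overwriting a "current holder" field, keeps each instance of handling visible, which is the point of the log.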
Attacker Identification.
These are some of the most important actions to perform to attempt to identify an attacking host during a security incident:
- Use incident databases to research related activity. These databases may be in-house or maintained by organizations that collect data from other organizations and consolidate it, such as the VERIS community database.
- Validate the attacker’s IP address to determine if it is a viable one. The host may or may not respond to a request for connectivity. This may be because it has been configured to ignore the requests, or the address has already been reassigned to another host.
- Use an internet search engine to gain additional information about the attack. There may have been another organization or individual that has released information about an attack from the identified source IP address.
- Monitor the communication channels that some attackers use, such as IRC. Because users can be disguised or anonymized in IRC channels, they may talk about their exploits in these channels. Often, the information gathered from this type of monitoring is misleading and should be treated as leads and not facts.
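Before trying to validate the attacker's IP address with connectivity requests, it can help to check whether the address is even routable on the public internet. The sketch below uses Python's standard `ipaddress` module for that pre-check; the function name and the category labels are assumptions for this example, and the check says nothing about whether the host is currently reachable or still assigned to the attacker.

```python
import ipaddress


def classify_address(text: str) -> str:
    """Classify an address string before spending time probing it."""
    try:
        address = ipaddress.ip_address(text.strip())
    except ValueError:
        return "invalid"
    if address.is_loopback:
        return "loopback"
    if address.is_multicast:
        return "multicast"
    if address.is_private:
        return "private"
    if address.is_global:
        return "global"
    return "other"
```

An address classified as private or loopback in a log usually points to an internal host, a NAT boundary, or a spoofed source, so an external lookup would not be meaningful for it.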
Eradication, recovery and remediation.
After containment, all of the effects of the security incident must be eliminated. This includes malware infections and user accounts that have been compromised. All of the vulnerabilities that were exploited by the attacker must also be corrected or patched so that the incident does not occur again.
To recover hosts, use clean and recent backups, or rebuild them with installation media if no backups are available or they have been compromised. Also, fully update and patch the operating systems and installed software of all hosts. Change all host passwords and passwords for critical systems in accordance with the password security policy. This may be a good time to validate and upgrade network security, backup strategies, and security policies. Attackers often attack the systems again, or use a similar attack to target additional resources, so be sure to prevent this as best as possible. Focus on what can be fixed quickly while prioritizing critical systems and operations.
Post-Incident Activities
After a major incident has been handled, the organization should hold a “lessons learned” meeting to review the effectiveness of the incident handling process and identify any hardening needed for existing security controls and practices. Examples of good questions to answer during the meeting include the following:
- Exactly what happened, and when?
- How well did the staff and management perform while dealing with the incident?
- Were the documented procedures followed? Were they adequate?
- What information was needed sooner?
- Were any steps or actions taken that might have inhibited the recovery?
- What would the staff and management do differently the next time a similar incident occurs?
- How could information sharing with other organizations be improved?
- What corrective actions can prevent similar incidents in the future?
- What precursors or indicators should be watched for in the future to detect similar incidents?
- What additional tools or resources are needed to detect, analyze, and mitigate future incidents?
NIST Special Publication 800-61 provides the following examples of activities that are performed during an objective assessment of an incident:
- Reviewing logs, forms, reports, and other incident documentation for adherence to established incident response policies and procedures.
- Identifying which precursors and indicators of the incident were recorded to determine how effectively the incident was logged and identified.
- Determining if the incident caused damage before it was detected.
- Determining if the actual cause of the incident was identified, and identifying the vector of attack, the vulnerabilities exploited, and the characteristics of the targeted or victimized systems, networks, and applications.
- Determining if the incident is a recurrence of a previous incident.
- Calculating the estimated monetary damage from the incident (e.g., information and critical business processes negatively affected by the incident).
- Measuring the difference between the initial impact assessment and the final impact assessment.
- Identifying which measures, if any, could have prevented the incident.
Subjective assessment of each incident requires that incident response team members assess their own performance, as well as that of other team members and of the entire team. Another valuable source of input is the owner of a resource that was attacked, in order to determine if the owner thinks the incident was handled efficiently and if the outcome was satisfactory.
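Two of the objective checks listed above, whether the incident caused damage before it was detected and how far the final impact assessment moved from the initial one, reduce to simple comparisons once the data is recorded. The sketch below assumes a hypothetical record layout and a hypothetical 1 (low) to 5 (high) impact scale; neither comes from NIST.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class IncidentRecord:
    detected: datetime
    first_damage: Optional[datetime]  # None if no damage was identified
    initial_impact: int               # hypothetical scale: 1 (low) to 5 (high)
    final_impact: int                 # same scale, assessed after the incident


def damage_before_detection(record: IncidentRecord) -> bool:
    """True if damage is known to have started before the incident was detected."""
    return record.first_damage is not None and record.first_damage < record.detected


def impact_drift(record: IncidentRecord) -> int:
    """Positive when the final assessment was worse than the initial one."""
    return record.final_impact - record.initial_impact
```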
These are some of the determining factors for evidence retention:
- Prosecution – When an attacker will be prosecuted because of a security incident, the evidence should be retained until after all legal actions have been completed. This may be several months or many years. In legal actions, no evidence should be overlooked or considered insignificant. An organization’s policy may state that any evidence surrounding an incident that has been involved with legal actions must never be deleted or destroyed.
- Data Type – An organization may state that particular types of data should be kept for a specific period of time. Items such as email or text may only need to be kept for 90 days. More important data, such as that used in an incident response that has not had legal action, may need to be kept for three years or more.
- Cost – If there is a lot of hardware and storage media that needs to be stored for a long time, it can become costly. Remember also that as technology changes, functional devices that can use outdated hardware and storage media must be stored as well.
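The prosecution and data type factors above can be combined into a simple retention check. This is a sketch under stated assumptions: the retention table reuses the example periods from the text (90 days, three years), the data type names and function name are hypothetical, and an organization's actual policy and legal counsel take precedence.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical retention periods in days, based on the examples in the text.
RETENTION_DAYS = {
    "email": 90,
    "incident_data": 3 * 365,
}


def earliest_disposal(data_type: str, collected: date, legal_action_open: bool) -> Optional[date]:
    """Return the earliest disposal date, or None while legal action is open."""
    if legal_action_open:
        # Evidence tied to ongoing legal action is retained until it completes.
        return None
    return collected + timedelta(days=RETENTION_DAYS[data_type])
```

Returning `None` rather than a far-future date makes the "do not dispose" case explicit, so a caller cannot mistake it for an ordinary deadline.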
The critical recommendations from NIST for sharing information are as follows:
- Plan incident coordination with external parties before incidents occur.
- Consult with the legal department before initiating any coordination efforts.
- Perform incident information sharing throughout the incident response life cycle.
- Attempt to automate as much of the information sharing process as possible.
- Balance the benefits of information sharing with the drawbacks of sharing sensitive information.