With the increasing frequency of security incidents and a greater-than-ever focus on the security posture of organizations, you, as a manager or an engineer, are being called on to measure the impact of your security controls. In other words, there is high demand for reporting, measuring and providing situational awareness about the people, processes and technologies related to the security business. Essentially, the aim is to measure something that produces actionable insights, which in turn result in more effective and efficient services.
That said, below are some random thoughts about this topic. My background is engineering, not economics or business management, but it is common sense that in any operational business metrics are key to improving the management and delivery of services. Security should not be an exception. However, if you consider your organization’s end-to-end security stack (e.g. from the endpoint AV and operating system patches to the DDoS detection and mitigation controls), there is a huge variety of data sources that don’t have a common interface. This means that whenever you want to measure something, the work will be manual and labor intensive, i.e., the metrics are not cheap to gather. Now, if you add the people and process aspects on top of that, it becomes even harder to measure.
Anyway, you also don’t want to measure something just because you can. Metrics should support your mission and help you get a clear picture of how well you are performing. In addition, as stated by Andrew Jaquith in his book “Security Metrics: Replacing Fear, Uncertainty, and Doubt”, good metrics should be consistently measured, gathered in an automated way, expressed as a number or percentage, and expressed using a unit of measure like hours or dollars (I really recommend his book).
As the industry matures, we are getting better and better at measuring the different processes and security controls. There are many well-defined metrics, and the book mentioned previously is a great resource. But let’s consider a practical example: a security monitoring function, perhaps within a Security Operations Center (SOC). What do we want to measure and report? It depends on the size, scope and maturity level of the organization, but below are some reporting goals that one might choose in order to support the overall mission:
- Provide end-to-end effectiveness metrics about operational readiness to detect and block threats.
- Provide a periodic benchmark of operational readiness to detect and block threats.
- Minimize the risk of attacks that could result in lost revenue, public embarrassment, and/or regulatory penalties.
So, once you have stated your reporting goals, what would you achieve with them? What might be the outcomes?
- Enable informed business decisions by producing actionable intelligence and situational awareness.
- Plan for, manage and respond to all categories of threats, even as they become more frequent and more severe.
- Minimize false positives and harmless security incidents so the team can focus on valuable and meaningful incidents.
- Facilitate strategic decisions to improve the security monitoring service by looking at the big picture.
- Guide the resource allocation.
- Help diagnose problems.
Now that we have the goals and the benefits of reaching them, let’s consider how to do it. The first step might be a manual approach: gather the different data sources, extract the data, consolidate the information, produce the dashboards/graphs, analyze the data and provide actionable recommendations. From there you improve incrementally, building a mature and consistent process that generates the metrics in an automated fashion. The content of your report or dashboard might contain:
- Total number of devices being monitored.
- Volume of events, incidents and tickets that were handled (both the number and the type).
- Resolution times (e.g., the length of time from when the incident/ticket was received, the length of time from when it was dispatched, etc.).
- Number of employees (i.e., cost).
- Headcount-to-ticket ratio (e.g., track the per capita security incident ratio to closely monitor the workload, which might contribute to a better or worse work environment).
- Number of employee certifications (how well-trained and well-equipped is your team?).
- Events/incidents generated per region, device and signature (top talkers, outliers, trends, etc.).
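To make the report contents listed above a bit more concrete, here is a minimal Python sketch of how a few of them could be computed once the ticket data has been consolidated. Everything in it is an assumption for illustration: the `Ticket` record layout, the `soc_report` function, the field names and the sample values are hypothetical, and a real ticketing system export will look different.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass
class Ticket:
    # Hypothetical record layout; real ticketing exports will differ.
    kind: str             # e.g. "phishing", "malware"
    region: str           # e.g. "EMEA", "APAC"
    received: datetime
    dispatched: datetime
    resolved: datetime


def hours_between(start, end):
    """Elapsed time between two timestamps, in hours."""
    return (end - start).total_seconds() / 3600


def soc_report(tickets, headcount, monitored_devices):
    """Summarize one reporting period into a handful of metrics."""
    to_dispatch = [hours_between(t.received, t.dispatched) for t in tickets]
    to_resolve = [hours_between(t.received, t.resolved) for t in tickets]
    return {
        "monitored_devices": monitored_devices,
        "ticket_volume": len(tickets),
        "volume_by_type": dict(Counter(t.kind for t in tickets)),
        "volume_by_region": dict(Counter(t.region for t in tickets)),
        # None when there is nothing to average for the period.
        "mean_hours_to_dispatch": mean(to_dispatch) if tickets else None,
        "mean_hours_to_resolve": mean(to_resolve) if tickets else None,
        "tickets_per_analyst": len(tickets) / headcount,
    }


# Made-up sample data, only to show the shape of the output.
tickets = [
    Ticket("phishing", "EMEA", datetime(2024, 1, 1, 9, 0),
           datetime(2024, 1, 1, 9, 30), datetime(2024, 1, 1, 13, 0)),
    Ticket("malware", "EMEA", datetime(2024, 1, 2, 10, 0),
           datetime(2024, 1, 2, 11, 30), datetime(2024, 1, 2, 16, 0)),
    Ticket("phishing", "APAC", datetime(2024, 1, 3, 8, 0),
           datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 10, 0)),
]
report = soc_report(tickets, headcount=2, monitored_devices=150)
print(report)
```

Each value comes out as a number with a unit (tickets, hours, tickets per analyst), which is in line with the criteria for good metrics mentioned earlier. The hard part in practice is not this arithmetic but getting the different data sources into one consistent record format.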
With these metrics we start to have answers to basic questions that business leaders might have such as:
- Are security incidents going up or down?
- Does the security operations team respond quickly to the different incidents?
- Does the regional SOC rank favorably when compared with other regional SOCs (both in volume and in per capita security incident rate)?
- How effective is the SOC?
- Is the SOC running better than it was last month/quarter/year?
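Two of the questions above, the trend over time and the comparison between regional SOCs, come down to simple arithmetic once the counts are available. The following Python sketch shows one possible way to do it. The function names `percent_change` and `rank_by_per_capita`, the region names and all the numbers are hypothetical and only there for illustration.

```python
def percent_change(previous, current):
    """Relative change between two periods, in percent (None without a baseline)."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100


def rank_by_per_capita(regions):
    """Order regional SOCs by incidents per analyst, lowest rate first.

    `regions` maps a region name to an (incidents, headcount) pair.
    """
    rates = {
        name: incidents / headcount
        for name, (incidents, headcount) in regions.items()
    }
    return sorted(rates.items(), key=lambda item: item[1])


# Made-up numbers: incidents went from 120 last month to 90 this month.
trend = percent_change(120, 90)

# Made-up regional figures as (incidents, headcount).
ranking = rank_by_per_capita({
    "AMER": (200, 10),
    "EMEA": (120, 8),
    "APAC": (90, 5),
})

print(trend)
print(ranking)
```

A lower per capita rate is not automatically better: it could mean fewer threats, better prevention or simply less visibility, so these numbers are a starting point for diagnosing problems rather than a verdict.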