Earlier today I stumbled across the NIST patch management pub, released in November 2005.

There’s lots of goodness in this document. What I like best are the recommended program metrics, which I reprint here, lightly edited:

| Metric Name | Units | Maturity Level |
| --- | --- | --- |
| Vulnerability ratio | Vulnerabilities/Host | 3 |
| Unapplied patch ratio | Patches/Host | 3 |
| Network services ratio | Network Services/Host | 3 |
| Response time for vulnerability and patch identification (triage processes) | Time | 4 |
| Patch response time (critical)* | Time | 4 |
| Patch response time (noncritical)* | Time | 4 |
| Emergency configuration response time (for zero-days) | Time | 4 |
| Cost of patch vulnerability group | Money (labor) | 5 |
| Cost of system administration support | Money (labor) | 5 |
| Cost of software | Money | 5 |
| Cost of program failures | Money | 5 |

*In my own work, I’ve referred to these metrics as “patch latency”.
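
To make the ratio and latency metrics concrete, here is a minimal sketch of how you might compute them from a host inventory and a patch deployment log. The `Host` and `PatchEvent` records, their field names, and the sample data are all hypothetical illustrations of mine; the NIST document does not prescribe any particular data model.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean

# Hypothetical per-host inventory record; field names are illustrative,
# not taken from the NIST document.
@dataclass
class Host:
    name: str
    open_vulnerabilities: int
    unapplied_patches: int
    network_services: int

# Hypothetical record of a single patch landing on a host.
@dataclass
class PatchEvent:
    patch_id: str
    released: datetime   # when the vendor shipped the patch
    applied: datetime    # when we actually applied it

def ratio_metrics(hosts: list[Host]) -> dict[str, float]:
    """The three Level-3 ratio metrics: fleet totals divided by host count."""
    n = len(hosts)
    return {
        "vulnerability_ratio": sum(h.open_vulnerabilities for h in hosts) / n,
        "unapplied_patch_ratio": sum(h.unapplied_patches for h in hosts) / n,
        "network_services_ratio": sum(h.network_services for h in hosts) / n,
    }

def patch_latency_days(events: list[PatchEvent]) -> float:
    """Level-4 'patch latency': mean days from patch release to application."""
    return mean((e.applied - e.released).days for e in events)

# Made-up sample data, just to show the calls.
hosts = [
    Host("web01", open_vulnerabilities=4, unapplied_patches=2, network_services=6),
    Host("db01", open_vulnerabilities=1, unapplied_patches=0, network_services=3),
]
events = [
    PatchEvent("patch-001", datetime(2005, 11, 1), datetime(2005, 11, 8)),
    PatchEvent("patch-002", datetime(2005, 11, 1), datetime(2005, 11, 15)),
]

print(ratio_metrics(hosts))        # vulnerability ratio 2.5, patch ratio 1.0, services ratio 4.5
print(patch_latency_days(events))  # 10.5
```

Nothing fancy: the Level 3 metrics are just per-host averages, and patch latency is an average of release-to-apply intervals. The hard part in practice is the inventory and deployment data feeding them, not the arithmetic.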

Now, there are certainly folks who will state that patching doesn’t matter much. They may be right. But if you view patching as a discipline you need to get right regardless, I would suggest that these are pretty good metrics.

Update 1: Interestingly, our friends at Microsoft, who (ahem) have some experience patching systems in large, complex environments, add these metrics to the mix:

One of the key business metrics that we use to measure our success is the number of servers that are patched outside of their maintenance windows. For example, if we apply a patch outside of a server’s pre-defined maintenance window, that’s going to be a ding against our success. Another metric is going to be the number of servers patched by deadline.
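
For the sake of illustration, here is a similar sketch of those two Microsoft-style counts: patches applied outside a server’s maintenance window, and patches applied by their deadline. The `Deployment` record and the sample log are hypothetical, and the window check assumes a maintenance window that does not span midnight.

```python
from dataclasses import dataclass
from datetime import datetime, time

# Hypothetical deployment-log entry; field names are illustrative only.
@dataclass
class Deployment:
    server: str
    patched_at: datetime
    window_start: time    # start of this server's maintenance window
    window_end: time      # end of the window (assumed not to span midnight)
    deadline: datetime    # date by which the patch had to be applied

def outside_window(d: Deployment) -> bool:
    """True if the patch landed outside the server's maintenance window."""
    return not (d.window_start <= d.patched_at.time() <= d.window_end)

def deployment_metrics(log: list[Deployment]) -> dict[str, int]:
    """Counts of out-of-window patches and on-time patches."""
    return {
        "patched_outside_window": sum(outside_window(d) for d in log),
        "patched_by_deadline": sum(d.patched_at <= d.deadline for d in log),
    }

# Made-up sample log: app01 was patched inside its 01:00-04:00 window,
# app02 outside it; both beat the deadline.
log = [
    Deployment("app01", datetime(2005, 11, 12, 2, 30),
               time(1, 0), time(4, 0), datetime(2005, 11, 15)),
    Deployment("app02", datetime(2005, 11, 14, 9, 15),
               time(1, 0), time(4, 0), datetime(2005, 11, 15)),
]
print(deployment_metrics(log))  # {'patched_outside_window': 1, 'patched_by_deadline': 2}
```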

Update 2: We’ve had quite a bit of chatter about this on the securitymetrics.org mailing list. The inestimable Fred Cohen stated that some of these metrics are too hard to gather, or may not tell an enterprise much when taken in aggregate. He’s right, in part. But I would suggest that the reason, say, the “cost of program failures” is rated a 5 on the process maturity scale is precisely because it is difficult to measure. Indeed, the numerical process maturity rating could be taken as a proxy for measurement difficulty. As for his question about the relative value of average measurements taken in aggregate (across multiple contexts, business units, and threat environments): that is why cross-sectional analyses exist.

Good stuff.