The National Institute of Standards and Technology (NIST) published its Artificial Intelligence Risk Management Framework (NIST AI 100-1) in January 2023.
The NIST AI Framework consists of 19 categories and 72 subcategories within the following four core functions:
- Govern
- Map
- Measure
- Manage
In prior articles, we focused on considerations when assessing and implementing the Govern and Map functions of the NIST AI Risk Management Framework. In this article, we focus on implementing the Measure function.
The Measure function includes four categories and 22 subcategory controls, as listed in Table 1 below.
Table 1

| Category | Subcategory |
|---|---|
| MEASURE 1: Appropriate methods and metrics are identified and applied. | MEASURE 1.1: Approaches and metrics for measurement of AI risks enumerated during the MAP function are selected for implementation starting with the most significant AI risks. The risks or trustworthiness characteristics that will not – or cannot – be measured are properly documented. |
| | MEASURE 1.2: Appropriateness of AI metrics and effectiveness of existing controls are regularly assessed and updated, including reports of errors and potential impacts on affected communities. |
| | MEASURE 1.3: Internal experts who did not serve as front-line developers for the system and/or independent assessors are involved in regular assessments and updates. Domain experts, users, AI actors external to the team that developed or deployed the AI system, and affected communities are consulted in support of assessments as necessary per organizational risk tolerance. |
| MEASURE 2: AI systems are evaluated for trustworthy characteristics. | MEASURE 2.1: Test sets, metrics, and details about the tools used during Test & Evaluation, Validation & Verification (TEVV) are documented. |
| | MEASURE 2.2: Evaluations involving human subjects meet applicable requirements (including human subject protection) and are representative of the relevant population. |
| | MEASURE 2.3: AI system performance or assurance criteria are measured qualitatively or quantitatively and demonstrated for conditions similar to deployment setting(s). Measures are documented. |
| | MEASURE 2.4: The functionality and behavior of the AI system and its components – as identified in the MAP function – are monitored when in production. |
| | MEASURE 2.5: The AI system to be deployed is demonstrated to be valid and reliable. Limitations of the generalizability beyond the conditions under which the technology was developed are documented. |
| | MEASURE 2.6: The AI system is evaluated regularly for safety risks – as identified in the MAP function. The AI system to be deployed is demonstrated to be safe, its residual negative risk does not exceed the risk tolerance, and it can fail safely, particularly if made to operate beyond its knowledge limits. Safety metrics reflect system reliability and robustness, real-time monitoring, and response times for AI system failures. |
| | MEASURE 2.7: AI system security and resilience – as identified in the MAP function – are evaluated and documented. |
| | MEASURE 2.8: Risks associated with transparency and accountability – as identified in the MAP function – are examined and documented. |
| | MEASURE 2.9: The AI model is explained, validated, and documented, and AI system output is interpreted within its context – as identified in the MAP function – to inform responsible use and governance. |
| | MEASURE 2.10: Privacy risk of the AI system – as identified in the MAP function – is examined and documented. |
| | MEASURE 2.11: Fairness and bias – as identified in the MAP function – are evaluated and results are documented. |
| | MEASURE 2.12: Environmental impact and sustainability of AI model training and management activities – as identified in the MAP function – are assessed and documented. |
| | MEASURE 2.13: Effectiveness of the employed TEVV metrics and processes in the MEASURE function are evaluated and documented. |
| MEASURE 3: Mechanisms for tracking identified AI risks over time are in place. | MEASURE 3.1: Approaches, personnel, and documentation are in place to regularly identify and track existing, unanticipated, and emergent AI risks based on factors such as intended and actual performance in deployed contexts. |
| | MEASURE 3.2: Risk tracking approaches are considered for settings where AI risks are difficult to assess using currently available measurement techniques or where metrics are not yet available. |
| | MEASURE 3.3: Feedback processes for end users and impacted communities to report problems and appeal system outcomes are established and integrated into AI system evaluation metrics. |
| MEASURE 4: Feedback about efficacy of measurement is gathered and assessed. | MEASURE 4.1: Measurement approaches for identifying AI risks are connected to deployment context(s) and informed through consultation with domain experts and other end users. Approaches are documented. |
| | MEASURE 4.2: Measurement results regarding AI system trustworthiness in deployment context(s) and across the AI lifecycle are informed by input from domain experts and relevant AI actors to validate whether the system is performing consistently as intended. Results are documented. |
| | MEASURE 4.3: Measurable performance improvements or declines based on consultations with relevant AI actors, including affected communities, and field data about context-relevant risks and trustworthiness characteristics are identified and documented. |
How can organizations use the NIST AI Risk Management Framework Controls to assess activities that involve AI systems for the Measure function?
Alongside the framework, NIST also published the AI Risk Management Playbook, which contains supporting actions and considerations for each subcategory control.
Below are example questions to focus on when assessing an organization’s current AI compliance posture relative to the Measure function within the NIST AI Risk Management Framework:
- How will the appropriate performance metrics, such as the accuracy of the AI, be monitored after the AI is deployed? 1
- What corrective actions has the entity taken to enhance the quality, accuracy, reliability, and representativeness of the data? 2
- What are the roles, responsibilities, and delegation of authorities of personnel involved in the design, development, deployment, assessment, and monitoring of the AI system? 3
- How has the entity identified and mitigated potential impacts of bias in the data, including inequitable or discriminatory outcomes? 4
- As time passes and conditions change, is the training data still representative of the operational environment? 5
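One way to approach the last question – whether training data remains representative of the operational environment – is a distribution-stability check such as the population stability index (PSI). The sketch below is illustrative only; PSI is a common industry technique, not something the NIST AI RMF or Playbook prescribes, and the bin count and thresholds are assumptions.

```python
import math

def population_stability_index(expected, actual, bins=10):
    """Compare a training-time sample (`expected`) of a numeric feature
    against a production sample (`actual`) using the population
    stability index. Higher values indicate greater distribution shift."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0  # guard against a zero-width range

    def proportions(values):
        counts = [0] * bins
        for v in values:
            i = min(int((v - lo) / width), bins - 1)  # clamp the max value
            counts[i] += 1
        n = len(values)
        # A small floor avoids log(0) for empty bins.
        return [max(c / n, 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))
```

Common rules of thumb (again, conventions rather than NIST guidance) treat PSI below 0.1 as stable and above 0.25 as a significant shift warranting review of whether the training data still reflects the deployed context.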
What should companies consider implementing to support alignment with the NIST AI Risk Management Framework Measure function?
After assessing and documenting AI system activities against the Measure function, organizations can draw on the following example AI compliance management activities to remediate gaps or to demonstrate AI compliance readiness and maturity:
- Establish approaches for detecting, tracking, and measuring known risks, errors, incidents, or negative impacts. 6
- Document reports of errors, incidents, and negative impacts, and assess the sufficiency and efficacy of existing metrics for repairs and upgrades. 7
- Utilize separate testing teams established in the Govern function to enable independent decisions and course correction for AI systems. Track processes and measure and document changes in performance. 8
- Measure and document performance criteria such as validity (false positive rate, false negative rate, etc.) and efficiency (training times, prediction latency, etc.). 9
- Monitor and document how metrics and performance indicators observed in production differ from the same metrics collected during pre-deployment testing. 10
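To make the last two activities concrete, the sketch below shows one possible way to compute validity metrics (false positive and false negative rates) and flag production values that drift beyond a tolerance from the pre-deployment baseline. All function names and the 0.05 tolerance are illustrative assumptions, not part of the NIST AI RMF.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn

def validity_metrics(y_true, y_pred):
    """Validity criteria of the kind the bullets above describe."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return {
        "false_positive_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "false_negative_rate": fn / (fn + tp) if (fn + tp) else 0.0,
    }

def drift_report(baseline, production, tolerance=0.05):
    """Flag any metric whose production value deviates from the
    pre-deployment baseline by more than `tolerance` (absolute)."""
    return {
        name: {
            "baseline": baseline[name],
            "production": production[name],
            "exceeds_tolerance": abs(production[name] - baseline[name]) > tolerance,
        }
        for name in baseline
    }
```

In practice, the output of `drift_report` would feed the documentation trail these subcategories call for: baseline values, production values, and which deviations triggered review.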
The Measure function is focused on developing business processes to measure, and then remediate or improve, areas such as false positives, bias, and the intended uses of AI systems. It also aligns with Article 17 of the EU AI Act, under which providers of high-risk AI systems must establish a "quality management system" covering the techniques, procedures, and systematic actions used in the design, design control, and design verification of those systems, including testing and validation processes.
In our next article, we will focus on implementing the Manage function of the NIST AI Risk Management Framework.
Notes:
1. NIST AI 100-1. NIST AI RMF Playbook. January 2023. Page 106.
2. NIST AI 100-1. NIST AI RMF Playbook. January 2023. Page 109.
3. NIST AI 100-1. NIST AI RMF Playbook. January 2023. Page 110.
4. NIST AI 100-1. NIST AI RMF Playbook. January 2023. Page 116.
5. NIST AI 100-1. NIST AI RMF Playbook. January 2023. Page 120.
6. NIST AI 100-1. NIST AI RMF Playbook. January 2023. Page 105.
7. NIST AI 100-1. NIST AI RMF Playbook. January 2023. Page 108.
8. NIST AI 100-1. NIST AI RMF Playbook. January 2023. Page 110.
9. NIST AI 100-1. NIST AI RMF Playbook. January 2023. Page 119.
10. NIST AI 100-1. NIST AI RMF Playbook. January 2023. Page 122.
© Copyright 2024. The views expressed herein are those of the author(s) and not necessarily the views of Ankura Consulting Group, LLC., its management, its subsidiaries, its affiliates, or its other professionals. Ankura is not a law firm and cannot provide legal advice.