More research into human and societal sources of bias in artificial intelligence systems is needed before government can hope to establish a standard for assuring their trustworthiness, technology experts say.
AI developers and users tend to focus on the representativeness of datasets and fairness of machine learning algorithms. But according to National Institute of Standards and Technology guidance published March 15, a socio-technical approach to building and deploying systems is needed.
Trustworthy and responsible AI experts have argued societal values should factor into development and use, but Special Publication (SP) 1270 is the first document in which NIST consistently recognizes such systems as socio-technical ones.
“I’m not sure if anything is so lacking that it requires revisiting this particular document,” Cynthia Khoo, associate at the Georgetown Law Center on Privacy & Technology, told FedScoop. “Especially given the role that it’s meant to play, which is just providing an overarching framework from which further, more specific guidance will emerge.”
NIST’s IT Laboratory intends, in the coming weeks, to host a workshop and release for public comment a draft version of a practical guide on managing a variety of risks, including bias, across the AI Risk Management Framework.
Bias is context-dependent, so ITL has also adopted the National Cybersecurity Center of Excellence’s model to engage with AI companies and establish best practices for identifying, managing and mitigating bias using commercial tools. Those best practices will be compiled within additional guidance working toward a standard.
“For bias guidance, we’re just going to continue to work internally and with the broader community to identify socio-technical governance practices that can eventually become a standard,” said Reva Schwartz, a research scientist with NIST’s Trustworthy and Responsible AI program and SP 1270 coauthor. “So that’s kind of our marching path right now.”
In addition to acknowledging non-technical factors that contribute to bias, such as systemic discrimination, which AI developers can’t simply code against, SP 1270 clarifies the limitations of measures meant to mitigate bias.
Bias stems from a variety of sources, such as errors in test datasets (even a 3% to 4% error rate can diminish a model’s performance on real-world data) or mathematical models that can only approximate reality, and placing humans in the loop introduces their own cognitive limitations.
“Do their biases attenuate each other or amplify each other?” asked Apostol Vassilev, ITL research team lead and SP 1270 coauthor. “That’s an open question; no one seems to know that, and yet people assume that if you throw an innocent person in there all the problems go away.”
Unresolved, unresearched problems like that are a “call to action” for NIST, Vassilev added.
SP 1270 warns of the dangers of techno-solutionism — relying on AI as a quick, foolproof fix for complex, real-world problems — an aspect of the guidance Khoo and other experts pushed for in their comments on the draft version.
An AI tool without an underlying process or business purpose will only amplify biases and, in particular, problematic bias, Schwartz said.
“We know there’s great potential for AI,” she said. “We know it’s a net positive, but to unlock its full potential we can’t just place it in high-risk locations, especially in the federal government.”
Khoo praised SP 1270 for encouraging the consultation of subject matter experts (SMEs) beyond computational and software engineers, such as doctors when developing automated technologies for medical diagnoses, as well as members of historically marginalized groups. Human-centered design requires SME involvement from an AI project’s outset and a common language across disciplines, Schwartz said.
A final, unique feature of SP 1270 is its analysis of how historical context and systems of power, including the influence held by developers, data scientists and policymakers, shape algorithmic bias, Khoo said.
“Governance is a key factor, especially within a socio-technical frame,” Schwartz said. “So what goes into the organizational decisions that are made about AI systems, specifically in government agencies, when they reach out to vendors or internally to have AI developed.”
That makes organizations’ adherence to federal guidance on managing bias in AI all the more important, she added.
Agencies are increasingly launching AI systems in sensitive contexts like deploying drones and managing and monitoring weapons, yet those systems are often validated under optimized conditions rather than real-world ones. SP 1270’s success will be gauged by agencies’ ability to use it to avoid launching biased AI systems.
“To evaluate the effectiveness of this document, it’s almost a matter of the proof in the technological pudding,” Khoo said. “By the time further guidance has come out, will this framework actually prevent the development or deployment of harmful or discriminatory algorithmic technologies, and I think that’s ultimately what it has to be judged on.”