Evolving Government: The promised land of AI is in…redaction?


Part 2 of the Evolving Government series with Dcode42

When people think of artificial intelligence, they think of self-driving cars and intelligent, talking robots. Throw in the word government, and the imagination shifts to clandestine programs that deal with cyberespionage, counterterrorism, nano-drones and mind-control.

This is only partly true.

As mentioned in part one of this series, the predominant value that AI adds to the world lies not just in these science fiction scenarios but in the most mundane, mucky and unglamorous tasks: contract management, spreadsheet manipulation, invoice processing and analysis of millions of documents. For these processes, the promised land of artificial intelligence has never been closer.

The advent of cognitive computing

AI has come a long way since a group of scientists at Dartmouth coined the term in 1956 — especially so for structured data, or data that can be put into tables. With the advent of cognitive computing, even problems with unstructured data and natural language are no longer uniquely human. AI can now analyze email, text, social media, blog posts, video transcriptions and more.

Cognitive computing is very good at two things: first, enabling tasks that are impossible for humans to complete in a reasonable amount of time without error; and second, accelerating tasks to such an extent that it revolutionizes the way business is done around them.

Think of something repetitive and tedious, yet necessary. For the sake of this argument, let’s choose redaction of sensitive information. When I began my journey into the government market, the extent of the redaction issue quickly became apparent. Of all data breaches, 40 percent are inadvertent internal leaks by people within the organization. These preventable mistakes still end up splashed across the press all too frequently. Recently, the Justice Department inadvertently released information on immigration judges that compromised the privacy and safety of these individuals.

Improper redaction extends far beyond the walls of government. Access to sensitive information by inside actors, in combination with an improperly redacted PDF, led the New York Times to accidentally release the name of an NSA agent. While tedious, redaction can have serious implications.

Methodologies for redaction today range from “hiding” fields on PDFs to more sophisticated redaction technologies, such as Titus and Adobe. These processes are still resource-intensive and time-consuming. However, there is now opportunity for agencies to move towards “intelligent redaction.”

Using cognitive computing, intelligent redaction takes the process from hours down to a matter of milliseconds. Self-learning software, such as Coseer, can constantly learn and identify sensitive information and automatically remove it with high levels of accuracy. To train, the software needs only to read through a few documents along with their redacted versions. Nothing glamorous, yet such software can save agencies a lot of time, free human resources for more valuable tasks and prevent mission-critical mistakes. Harnessing these emerging technologies can improve protection of classified and sensitive information, speed up Freedom of Information Act responses and help manage acquisition processes more effectively.
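To make the idea of training on documents paired with their redacted versions concrete, here is a deliberately simple sketch in Python. It is not how Coseer or any other product works; it is a toy stand-in that compares each original with its human-redacted version, records the words that were removed, and masks those same words in new text. The function names and the "[REDACTED]" marker are assumptions made for illustration.

```python
import difflib
import re

REDACTION_MARK = "[REDACTED]"  # hypothetical marker, chosen for this sketch


def learn_sensitive_terms(pairs):
    """Collect the words a human removed when producing each redacted version.

    `pairs` is a list of (original_text, redacted_text) tuples.
    """
    terms = set()
    for original, redacted in pairs:
        orig_tokens = original.split()
        red_tokens = redacted.split()
        matcher = difflib.SequenceMatcher(a=orig_tokens, b=red_tokens, autojunk=False)
        for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
            # Words present in the original but changed or missing in the
            # redacted version are treated as sensitive.
            if tag in ("replace", "delete"):
                for token in orig_tokens[i1:i2]:
                    terms.add(token.strip(".,;:!?").lower())
    terms.discard("")
    return terms


def redact(text, terms):
    """Mask every word in `text` that was learned as sensitive."""
    def mask(match):
        word = match.group(0)
        return REDACTION_MARK if word.lower() in terms else word

    return re.sub(r"[A-Za-z0-9'-]+", mask, text)


training_pairs = [
    ("Judge Alice Smith presided over the hearing.",
     "Judge [REDACTED] presided over the hearing."),
]
learned = learn_sensitive_terms(training_pairs)
print(redact("Alice Smith signed the order.", learned))
```

A real system would have to generalize to names and phrases it has never seen, which simple word matching cannot do; that gap is where the learning described above earns its keep.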

AI on all mediums of communication

Documents are just the beginning. Now that redaction happens in milliseconds, why not put it in email servers, chat communications and cloud-based document sharing? Recall the 40 percent of data breaches in government linked to internal actors: most frequently, they occur when critical information is sent over email, printers, chats or similar channels.

Now that redaction can happen in milliseconds, government agencies have the power to redact every piece of information that leaves their secure firewalls through any channel. Imagine if NSA leaker Reality Winner hit print only to find that the classified NSA document on the Russia investigation was redacted as it came out of the actual printer. This is a great example of how AI, designed to improve the accuracy and speed of a learnable process, can become a key building block of the government’s cybersecurity strategy.

Will the government adopt AI?

Artificial intelligence is ushering in the fourth industrial revolution: one that will let humans focus on tasks that involve empathy, critical thinking, wisdom and creativity, and empower government at all levels to better serve its citizens. As I am quickly discovering through the Dcode42 program, the government remains hesitant to embrace emerging technologies when the stakes may be high, and understandably so.

To move past this, agencies should leverage small samples of data to run valid proofs of concept. For example, government agencies can take previously redacted documents that are now available for public consumption and run sample tests through tools like Coseer. A small proof of concept on non-sensitive data can demonstrate the huge potential value to agencies, value that represents millions of dollars saved for taxpayers if the technology is more accurate and faster than the current process.
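One hedged way to score such a proof of concept is to compare what a tool redacted against what human reviewers redacted in the already-released documents. The Python sketch below assumes both sets of redacted terms are available as plain lists; the function name and the sample terms are hypothetical, and the choice of precision and recall as the measures is an assumption rather than a prescribed evaluation method.

```python
def score_redaction(reference_terms, predicted_terms):
    """Return (precision, recall) for a tool's redactions.

    reference_terms: terms the human reviewers redacted.
    predicted_terms: terms the tool redacted.
    """
    reference = set(reference_terms)
    predicted = set(predicted_terms)
    true_positives = len(reference & predicted)
    # Precision: share of the tool's redactions that were actually needed.
    # By convention here, a tool that redacts nothing has no false positives.
    precision = true_positives / len(predicted) if predicted else 1.0
    # Recall: share of the needed redactions that the tool caught.
    recall = true_positives / len(reference) if reference else 1.0
    return precision, recall


human = ["alice", "smith", "555-0100"]  # made-up sample terms
tool = ["alice", "smith", "hearing"]
precision, recall = score_redaction(human, tool)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

For redaction, recall is usually the number to watch: a missed name is a leak, while an extra blacked-out word is only an inconvenience.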

Praful Krishna is CEO of Coseer, a member of Dcode42’s latest cohort focused on AI and machine learning.
