Written by Carten Cordell
The General Services Administration wants to help data scientists capitalize on the opportunities within the government’s massive trove of information by offering them a common gateway to access it.
GSA chief data officer Kris Rowley said Tuesday that the agency had developed a central data repository that offered role-based access for its datasets.
Called the Data Science Virtual Desktop, the platform will offer data scientists the ability to utilize the data and pursue innovations and insights while still providing a centralized environment to securely manage the information, Rowley said.
No one would officially “own” the data, he said; instead, it would be managed by agency personnel called data stewards.
“The data stewards manage the system of records, they manage the functional areas and know where the data is and they allow people to have access to that data through an automated capability,” Rowley said at the Advanced Technology Academic Research Center’s Federal Big Data Summit.
The platform will be available through GSA’s Data To Decisions website, which will also be the portal for agencies to post their data. Upgrades to that site, including single portal sign-on, are expected to be complete by Dec. 11.
The overall benefit, Rowley said, is that as demand for access to agency data grows, the platform balances transparency with a strong cybersecurity posture.
“I don’t want people to feel like I’m overseeing or watching everything everyone is doing,” Rowley said. “But I also want there to be some controls in place when the data steward grants access to a dataset or IT grants access to a tool, we have some automation behind it and we have some documentation behind it.”
The platform, together with GSA’s decentralized set of data modeling tools and training courses, could also streamline the study of popular datasets that draw similar interest, like agency financial information, by ensuring that the information is assessed through common business models instead of duplicative analyses.
“It’s all carrot and no stick,” Rowley said. “I’m not shutting anything off. I’m really just trying to encourage people to bring their work to a common place, bring their data to a common place — having data stewards there to share that data when appropriate — and then present their analysis on a common platform.”