The Second Decade of DataSF
The team set the standard for high-quality public data. See where it is headed in the next ten years.
How It All Started
The year was 2014. Frozen and Interstellar were in theaters, the World Cup played out in Brazil, and San Francisco codified its Open Data Policy into the Administrative Code. That spring, the City appointed its first Chief Data Officer, Joy Bonaguro.
Back then, the mission was clear: build trust in open data, publish what we could, and create a strong foundation of standards, systems, and practices. And we did. Over the past decade, DataSF has become a go-to source of high-quality public data—powering apps, research, journalism, and transparency.
But the story doesn't stop there.
As we enter our second decade, DataSF is expanding its mission. We're launching new tools, platforms, and policies that serve not only the public but also the City staff who deliver services every day. Our focus is on modern infrastructure, cross-departmental collaboration, and enabling data science at scale. This next chapter is about making data more usable and more impactful across San Francisco.
Building the Foundation
In its early years, DataSF focused on creating the infrastructure to guide data use across the City, with a strong commitment to public access. These foundational policies and processes shaped the team's growth and set citywide standards for data practices:
Data & System Inventory: Centralized catalog of systems and datasets
Data Publishing Process: Clear steps for releasing data to the public
Data Management Policy: Framework for treating data as a strategic asset
Data Classification Standards: Guidelines for handling sensitive or private data
Metadata Standards: Requirements for dataset descriptions and documentation
Dataset Standards: Specifications for core datasets like addresses, parcels, and streets
Together, these standards turned scattered data assets into a trusted, reliable public resource.
Opening the Floodgates
Policy sets the stage—but data use drives change. In the early years, the DataSF team took on the herculean task of getting departments to share their data on the City's open data portal. They onboarded dozens of departments and identified hundreds of datasets for publication.
One dataset in particular marked a turning point: 311 Cases. Still one of the most-used datasets on the portal, it demonstrated the power of open data to support civic engagement, service delivery, and transparency. Countless apps and reports have been built on top of it, making it a kind of civic weathervane for the City.
Since then, DataSF has published 554 datasets from 67 departments and divisions. Today, the portal handles over 7 million API calls and 76,000 page views each month, and those numbers keep growing.
From Access to Impact
Publishing data is only the beginning. Helping people use it effectively is what unlocks its full potential. Two flagship programs helped expand the impact of open data:
DataScienceSF: A cohort-based program where departments partner with data scientists to generate actionable insights and deliverables that improve City services.
DataAcademy: A series of evolving classes for City staff, covering Power BI, ArcGIS, R, Python, SQL, open data, service design, and more.
These initiatives transformed DataSF from a data provider into a data enabler.
The Biggest Surprise
As DataSF evolved to meet the City’s most pressing needs—like tracking assets and publishing critical information during the COVID-19 response—one insight stood out: the biggest users of open data were City staff.
What began as a public-facing platform became one of the most valuable internal tools in City government—a high-quality, always-up-to-date, and well-documented data resource that supports operations every day.
Looking Forward
So, what’s next for DataSF? We’re not guessing. The future is already underway. Led by our new CDO, Soumya Kalra, and building on our strong public foundation, we’re doubling down on supporting City staff while continuing to serve members of the public. Here’s where we’re going:
Launching the Unified Data Platform: We're rolling out a modern data stack—including a warehouse, orchestration engine, semantic layer, and flexible integrations—to extend the tools we've used for years to more City departments. This platform will democratize access to data and enable more effective, cross-departmental work.
Modernizing the Data Management Policy: Our current policy has guided public data publishing, but gaps remain in internal data sharing. Just as we created standards for open data, we now aim to define clear, consistent policies for managing and sharing internal data.
Doubling Down on Analytics: With the rise of AI and large language models, there’s never been a better time to apply cutting-edge tools to complex challenges. We're growing our Data Science team to help departments take advantage of these advances and elevate their impact.
The second decade of DataSF is about going deeper, not just wider—making data more useful, usable, and used across the City.
The Road Ahead
In this second decade, DataSF is focused on deepening its impact. The first ten years were about building a foundation rooted in transparency, quality, and accessibility. The next ten are about making that foundation work even harder for San Franciscans.
We know data alone doesn’t solve problems. But good data, in the hands of people who know how to use it, absolutely can. That’s why our future efforts are focused on empowering City staff, supporting smarter decisions, and enabling collaboration across departments.
We’re excited about what’s next. If the first decade proved that open data is valuable, the next will prove just how transformative it can be.
Let’s build it together.