- Chapter 4 of the 2018 National Climate Assessment looks at the potential climate impacts on the US energy system.
- Flow of Flows — Orchestrating ELT with Prefect and dbt. More exploration of how to build data processing pipelines using open source tooling.
- Orchestrating Airbyte data connection tasks with Prefect. Official integrations for Airbyte connectors as Prefect tasks.
- Data cleaning IS analysis, not grunt work. A longish post exploring what we really get out of doing data cleaning, and why it’s more valuable and complex than it often gets credit for.
- Peer learnings about what it means to become an open data steward, from the 2021 ODI Open Data Summit. Videos and responses from participants on many facets of stewarding open data, especially as a business / organization.
A couple of weeks ago I attended TWEEDS 2020 virtually (like everything this year) and talked about Catalyst’s ongoing Public Utility Data Liberation (PUDL) project, especially the challenges of getting a big pile of data into the hands of different kinds of users, who work with different tools for different purposes. The talk ended up sketching out a bit of a PUDL infrastructure roadmap for the next year, so we thought it would be a good idea to write it up here too.
We’ll have a separate post looking at our 2021 data roadmap.
The US Energy Information Asymmetry
PUDL is all about addressing a big information asymmetry in the regulatory and legislative processes that affect the US energy system. Utilities have much more information about their own systems than policymakers and advocates typically do. As a result, regulators often defer to the utilities on technical and analytical points. Commercial data exists, but it’s expensive. We want to get enough data into the hands of other stakeholders that they can make credible quantitative arguments to regulators and challenge unfounded assertions put forward by utilities.