Data Liberation Engineer

We are recruiting a new member who will help expand and maintain our Public Utility Data Liberation project.

You’ll focus on providing analysis-ready data to climate advocates, policymakers, researchers, and journalists. Catalyst is a small organization, so we all play a variety of roles as needed. Our biggest needs are currently in data wrangling, client-facing data analysis, and software engineering.

See our GitHub profile for more information about the tools we use.

About Catalyst Cooperative

Catalyst Cooperative is a democratic, worker-owned data engineering and analysis consultancy. We develop open source software and publish analysis-ready open data related to the US energy system. We provide customized data, analyses, and software development services to people working in the public interest to help them understand the changing energy landscape and inform public policy. Our focus is primarily on mitigating climate change and improving utility regulation in the United States.

We focus on the data so that our clients and stakeholders can focus on their own specialties, whether that means political advocacy, academic research, storytelling, or building out the clean energy system of the future.

Work Expectations

Catalyst is an entirely remote organization, with members scattered across North America from Alaska to Mexico.

Members have a high degree of autonomy and flexibility to determine their own work-life balance. We require that members work at least 1000 hours per year to maintain their membership.

Member candidates are first hired as contractors, and must work at least 500 hours in their first 6 months. Barring unforeseen problems, they are then invited to join the co-op as a member, becoming both an employee and co-owner of the co-op.

Benefits & Compensation

We exist to help our members earn a decent living while working for a more just, livable, and sustainable world. Our income comes from a mix of foundation-funded and client-funded work.

As a 100% employee-owned company, we are able to compensate members through a mix of wages and profit sharing.

  • An hourly wage (currently $36.75/hr)
  • Tax-advantaged annual patronage dividends (proportional to hours worked)
  • Tax-deferred employer contributions to our retirement plan (proportional to annual wages)

We also reimburse members for expenses related to maintaining their home office, and provide a monthly health insurance stipend.

Domains of Responsibility & Expertise

As a small organization we all wear several hats. Our main activities are listed below, along with how much of each we do in general (Volume) and how short we are in that area, either in terms of capacity or level of expertise (Need).

Data Wrangling

Volume: High; Need: High

  • Writing data cleaning and tidying functions that take messy raw data from a wide variety of sources, and turn it into well normalized database tables.
  • Exploratory data analysis and ad-hoc visualizations with pandas dataframes in Jupyter notebooks to understand the underlying structure of the data, what’s wrong with it, and the consequences of trying to fix it.
  • Answering questions such as: What are the primary keys of a table? What do all of the codes mean? Have they changed over time? Should they be consolidated / standardized? Are the reporting units consistent? Which identifiers correspond to each other between tables? What values are likely to be invalid?
  • Having some energy system domain knowledge can be very helpful in interpreting the data and identifying issues.
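
To give a flavor of the questions above, here is a minimal sketch of checking whether a set of columns can serve as a table's primary key. It uses only the Python standard library so that it runs anywhere; day to day this kind of check would more likely be done on a pandas dataframe. The function name, column names, and records are all hypothetical and are not taken from any Catalyst dataset.

```python
from collections import Counter


def find_duplicate_keys(rows, key_columns):
    """Return each key tuple that appears more than once, with its count.

    An empty result means the columns uniquely identify every row,
    so they are a viable primary key for this table.
    """
    counts = Counter(tuple(row[col] for col in key_columns) for row in rows)
    return {key: n for key, n in counts.items() if n > 1}


# Hypothetical annual plant records, made up for illustration.
records = [
    {"plant_id": 1, "report_year": 2020, "capacity_mw": 50.0},
    {"plant_id": 1, "report_year": 2021, "capacity_mw": 50.0},
    {"plant_id": 2, "report_year": 2020, "capacity_mw": 120.0},
    {"plant_id": 2, "report_year": 2020, "capacity_mw": 125.0},
]

# plant_id alone repeats across years, so it is not a key by itself.
duplicates_by_plant = find_duplicate_keys(records, ["plant_id"])

# Adding report_year should make the key unique; any leftover duplicates
# point at rows that need a closer look before the table can be normalized.
duplicates_by_plant_year = find_duplicate_keys(records, ["plant_id", "report_year"])
```

In this made-up example the composite key still leaves one duplicated pair with conflicting capacity values, which is exactly the kind of issue that then needs domain knowledge to resolve.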

Data Analysis

Volume: High; Need: High

  • Develop new data analyses from scratch to serve specific client needs.
  • Adapt existing spreadsheet-based analysis provided by clients and collaborators into Python modules and/or Jupyter Notebooks that can handle larger volumes of data, and that may eventually run as part of our automated builds.
  • Adapt or re-implement relevant analyses from the academic literature so that they can be replicated continuously as new data is released over time.
  • Impute / estimate missing data values, and validate that the methods reproduce existing data accurately.
  • Identify invalid / outlier values that should be removed or replaced with imputed / estimated values.
  • Having some energy system domain knowledge is very helpful.
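
As an illustration of the outlier and imputation tasks above, the following is a minimal sketch under stated assumptions, not a description of the methods Catalyst actually uses. It flags outliers with a median-based modified z-score (the 0.6745 scaling and 3.5 cutoff are a common convention, assumed here) and fills missing or flagged values with the median of the remaining ones. The function names and the sample values are hypothetical.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Flag values whose modified z-score (median / MAD based) exceeds threshold.

    Missing values (None) are never flagged here; they are handled separately.
    """
    observed = [v for v in values if v is not None]
    med = median(observed)
    mad = median(abs(v - med) for v in observed)
    if mad == 0:
        # No spread around the median, so the score is undefined: flag nothing.
        return [False] * len(values)
    return [
        v is not None and abs(0.6745 * (v - med) / mad) > threshold
        for v in values
    ]


def impute_with_median(values, invalid):
    """Replace missing or flagged values with the median of the valid ones."""
    good = [v for v, bad in zip(values, invalid) if v is not None and not bad]
    fill = median(good)
    return [fill if (v is None or bad) else v for v, bad in zip(values, invalid)]


# Hypothetical reported values with one gap and one implausible entry.
heat_rates = [10.1, 9.8, 10.4, None, 10.0, 98.0, 10.2]

flags = flag_outliers(heat_rates)
cleaned = impute_with_median(heat_rates, flags)
```

A real analysis would also validate the imputation, for example by hiding some known-good values, imputing them, and comparing the estimates against what was actually reported.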

Software Engineering

Volume: High; Need: High

  • We work almost entirely in Python, with a bit of SQL here and there.
  • You would primarily be working on very data-oriented software: creating new data extraction tooling, data structures for managing metadata, glue to connect the various portions of our data pipeline together, conversions between data formats, and structures that allow us to apply data cleaning or transformation routines uniformly in different contexts.
  • Good software testing and documentation are integral to our work.
  • We sometimes provide software engineering services to other organizations working in similar domains or with similar data, but most of our software engineering is internally focused at the moment.

Data Engineering

Volume: Medium; Need: Medium

  • Integrate freshly wrangled data and analyses into our production environment where they can be run automatically and be continuously deployed as new data is released over time.
  • Ensure replicable programmatic access to raw data sources that are poorly curated by the public agencies that publish them.
  • Compile, curate, and publish metadata associated with the data we process and distribute: define table structures, fields, data types, relationships between tables, etc.
  • Encapsulate and refactor existing locally run data pipelines into an easily extensible and scalable automated build using Dagster to coordinate the various operations.
  • Integrate existing analytical functions into the automated data builds so that the results are easily available within the data products we’re publishing.
  • Design and implement data validation / quality control systems so that we know when something has gone wrong, and have some sense of how clean the data we’re publishing is.
  • Design and implement databases and data warehouses.

Data Visualization

Volume: Low; Need: High

  • Create legible and informative static visualizations in notebooks (Matplotlib, seaborn, etc.) for use on the web and in publications.
  • Build dashboards which display data from our data warehouse and allow some level of interactivity / exploration. A lot of this work is client facing, but we want to start creating both public and internal dashboards too. E.g. to help monitor data quality in our nightly builds, to track our own business metrics, or to feed visualizations into public-facing websites and social media.

Community Management

Volume: Low; Need: High

  • Manage our social media presence by engaging with our relatively technical users and the climate & energy policy informed public. Communicate about our work and its relevance to current events and other related efforts and applications. This includes Twitter, blog posts, newsletters, and domain-specific email lists.
  • Field messages sent to our public-facing email addresses, and respond to GitHub issues & discussions.
  • Prepare tutorials and examples to help get new users oriented to the data and software we are providing.
  • Attend meetups and conferences to network and give presentations.
  • Respond to queries from allied NGOs, researchers, policymakers and other users and allies who want to build a relationship with us. This could be in writing, or through phone or video calls.
  • Periodically survey our users and potential users to understand whether we are meeting their needs, and how we can improve.
  • Help define, collect, analyze, and interpret our usage metrics.
  • Communicate user feedback and needs to the rest of the cooperative.

Project Management

Volume: Low; Need: High

  • Track technical dependencies within and between projects to avoid bottlenecks.
  • Track team capacities in terms of both hours and skill sets to avoid bottlenecks.
  • Develop work plans / priorities / schedules.
  • Track grant and client budget constraints and progress toward milestones.
  • Facilitate communication between internal teams.

Policy Analysis

Volume: Low; Need: Low

  • Stay up to date with the rapidly evolving energy policy landscape, and areas where our data can help accelerate the transition away from fossil fuels.
  • Track national and relevant state-level legislation and regulatory processes, especially related to fossil fuel asset retirements, financing mechanisms, and incentives related to new electricity infrastructure development.

Business & Governance

Volume: Medium; Need: Medium

  • Actively participate in the democratic management and governance of the cooperative.
  • Help develop new client relationships and seek out new users and lines of business.
  • Identify and apply for foundation and public agency grant funding.
  • Ensure that the co-op’s day-to-day business operations run smoothly.
  • Recruit new co-op members with the right skills and mission alignment.

How to Apply

Materials to Submit

Send the following materials to hello@catalyst.coop:

Your resume or CV

A 1-2 page cover letter addressing the following questions:

  • Which position you are applying for.
  • Why you are interested in working on climate and energy policy issues.
  • Why you are interested in working on open source software and open data.
  • Why you are interested in becoming a member of a worker cooperative.

Projects / Portfolio (optional): We recognize that not everyone has had the opportunity to do substantial open source work. However, if you do have publicly accessible software or data projects that you’d like to highlight for us, and potentially discuss during the interview process, please feel free to include links. This could be a repo on GitHub, a notebook-based data story with visualizations, some documentation you’re particularly proud of, etc.

Interview & Membership Process

  • Based on your submitted materials, a member of our team may contact you to schedule a quick (20-30 minute) phone check-in. We may then ask you to spend 2-4 hours on a small take-home project, which we would review and discuss in a longer subsequent interview.
  • Co-op Membership Process: This is a member-track contractor position. After a 6-month candidacy period in which you spend at least 500 hours working for Catalyst, you will become eligible for membership in the cooperative. Existing members will then vote on whether to invite you to join. To accept the invitation you will need to sign the co-op’s membership agreement and purchase a member equity share ($1000) which is redeemable when you leave the cooperative.
  • Catalyst is committed to creating an inclusive workplace for folks of all backgrounds. Discrimination on the basis of race, color, religion, sex (including pregnancy, gender identity, and sexual orientation), national origin, age, disability or genetic information is both illegal and against our cooperative’s values.