We are seeking a new member who will focus on automating our data pipeline and integrating new public datasets into our Public Utility Data Liberation (PUDL) project.
About Catalyst Cooperative
Catalyst Cooperative is a democratic, worker-owned data engineering and analysis consultancy. We develop open source software and publish analysis-ready open data related to the US energy system. We provide customized data, analyses, and software development services to climate advocates, researchers, policymakers, and journalists to help them understand the changing energy landscape and inform public policy. Our focus is primarily on mitigating climate change and improving electric utility regulation in the United States.
We do the data engineering so that our clients and stakeholders can focus on their own specialties, whether that means political advocacy, academic research, storytelling, or building out the clean energy system of the future.
For this position, you will:
- Help migrate our data pipelines onto cloud infrastructure and implement a continuous integration and deployment system that automatically builds, validates, and publishes data products on a regular basis.
- Integrate data processing methods and analytical outputs developed by others both inside and outside of Catalyst into our data processing pipeline for continuous deployment and wider distribution.
- Occasionally provide software design and development services for outside client projects that are adjacent to PUDL.
- Design and build open source software that liberates messy, poorly curated public data from legacy file formats, and turns it into clean, structured, analysis-ready data products used by a wide variety of public interest groups.
- Write software tests and documentation.
- Design and implement automated data quality control and validation processes.
- Actively participate in the democratic management and governance of the cooperative.
- Learn about energy systems and the public policies that influence their design and operation.
You should definitely have:
- Software development experience, including using version control systems, participating in code reviews, testing software, using continuous integration, etc.
- Experience with data-oriented software design and development in Python.
- Proficiency in data wrangling using Python, including cleaning and restructuring messy data, creating linkages between disconnected datasets, and performing data quality assurance and quality control.
- Experience designing automated and reproducible data processing pipelines.
- Familiarity with the principles of “tidy data” and database normalization. Experience designing well-normalized tables and interacting with databases like Postgres and SQLite using a variety of interfaces such as SQLAlchemy, the Python standard libraries, or raw SQL.
- A passion for climate action, open source software, and open data.
- An interest in writing good documentation and ensuring that our data and software are usable by people with a wide variety of technical backgrounds.
- Experience writing software in Python using pandas and Jupyter Notebooks, version control with Git, and testing and CI tools such as code linters, pytest, tox, and GitHub Actions.
- Permission to work in the US (though you don’t need to be physically located in the US) or a legal entity in your home country for Catalyst to contract with.
It would be great if you have:
- Experience using databases, cloud object stores, Docker, Dask, Prefect, and Kubernetes to orchestrate a distributed data processing pipeline.
- Experience publishing analysis-ready, cloud-optimized data catalogs using object stores.
- Experience using NumPy and SciPy to identify bad data, impute missing values, and evaluate the statistical validity of those imputations.
- Experience using machine learning frameworks like scikit-learn to link disparate datasets together.
- Experience working with geospatial data and related open source libraries.
- Experience using Docker and Jupyter to archive and distribute data, computational environments, and reproducible analyses.
- Familiarity with using RST and Sphinx to document Python projects.
- Experience scraping data from the web.
- Experience with project management for data analysis and software development.
- Experience with or interest in doing business development — seeking out new clients and grant funding opportunities, exploring open source business models, and working on grant applications.
- Familiarity with cooperatives and small-scale democratic organizations.
- Experience cultivating a community of users and contributors around an open source project.
- Knowledge of the US electric utility sector and power markets, especially in the context of state or federal level climate policy and ratepayer advocacy.
- Experience cleaning and analyzing public data related to the US energy system, including any of:
  - FERC Forms 1, 2, 714, or the FERC EQR
  - EIA Forms 176, 860, 861, 923, or 930
  - EPA’s Continuous Emissions Monitoring System (CEMS)
Compensation & Co-op Membership
- This is a member-track contractor position. It’s our hope and general expectation that you will become an employee-owner after an initial 6 months of contract-based work.
- Base Compensation: We currently pay ourselves and contract workers an hourly wage of $36.75.
- Co-op Member Benefits:
  - Members have the privilege and responsibility to participate in running the cooperative. Currently all members of the cooperative sit on the co-op’s board, which makes strategic and governance decisions for the cooperative.
  - Profit Sharing: Each year we collectively decide how to allocate the cooperative’s surplus income, dividing it between a patronage dividend to members and retained earnings, which are invested in internal infrastructure projects and provide financial stability to the organization. For more details on how this works, see this explanation from the Democracy at Work Institute.
  - Retirement Plan: Employees of the co-op may contribute up to $13,500 per year to a tax-deferred SIMPLE IRA, with matching funds from the co-op up to 3% of your wages.
  - Flexible Time and Time Off: Catalyst members are generally expected to work 30 hours per week. Members accrue 5 days of paid time off per year and are able to maintain membership working half time over a calendar year. Consecutive time off over 2 weeks is encouraged but needs board approval.
  - (Pending) Health benefits: Catalyst is exploring options for providing healthcare benefits to members within the next 6 months to a year. These may include a group insurance plan, health savings accounts (HSA), or a health reimbursement arrangement (HRA).
- Catalyst is a small, all-remote worker cooperative. Currently we have four members distributed across North America.
- Our goal is to earn a decent living doing work that improves the world while leaving time for other parts of our lives that are important to us. We aim to have 30-hour work weeks under normal circumstances with the understanding that there will be occasional periods when we need to work 40 hours a week to meet a deadline or complete a project.
- Our current members mostly come from energy policy and climate advocacy backgrounds and are primarily self-taught when it comes to software development and data analysis. We are hoping that our new members can help us balance the team out with more depth of technical experience.
- With such a small team, we all end up taking on a variety of responsibilities, some of which end up being outside of our existing areas of expertise. Nobody is expected to know everything, and we encourage everyone to spend time learning new skills on the job. We recognize that failures are a normal part of learning, and want to provide an encouraging environment.
- We are committed to being a diverse and inclusive workplace. See our Code of Conduct for more details on what that means.
- Our work is funded in roughly equal parts by client contracts and grants. Our clients are a mix of non-profit organizations focused on accelerating the decarbonization of the US energy system and academic researchers modeling the energy system to inform public policy.
Materials to Submit
Send the following materials to email@example.com:
Your resume or CV
A 1-2 page cover letter addressing the following questions:
- Which position you are applying for.
- Why you are interested in working on climate and energy policy issues.
- Why you are interested in working on open source software and open data.
- Why you are interested in becoming a member of a worker cooperative.
Projects / Portfolio (optional): We recognize that not everyone has had the opportunity to do substantial open source work. However, if you do have publicly accessible software or data projects that you’d like to highlight for us, and potentially discuss during the interview process, please feel free to include links. This could be a repo on GitHub, a notebook-based data story with visualizations, some documentation you’re particularly proud of, etc.
Interview & Membership Process
- Based on your submitted materials, a member of our team may contact you to schedule a quick (20-30 minute) phone check-in. We could then ask you to spend 2-4 hours on a small take-home project, which we would discuss in the subsequent interview.
- Co-op Membership Process: This is a member-track contractor position. After a 6-month candidacy period in which you spend at least 500 hours working for Catalyst, you will become eligible for membership in the cooperative. Existing members will then vote on whether to invite you to join. To accept the invitation you will need to sign the co-op’s membership agreement and purchase a member equity share ($1,000), which is redeemable when you leave the cooperative.
- Catalyst is committed to creating an inclusive workplace for folks of all backgrounds. Discrimination on the basis of race, color, religion, sex (including pregnancy, gender identity, and sexual orientation), national origin, age, disability, or genetic information is both illegal and against our cooperative’s values.