The information provided by grid operators (ISOs and RTOs) is some of the richest and most voluminous electricity system data that’s publicly available. The locational marginal pricing (LMP) information is particularly valuable for assessing the economic viability of new and existing system investments.

Benefits & Opportunities

In combination with the EPA CEMS hourly operations data, EIA Form 923 fuel cost data, and estimates of non-fuel variable operating expenses from FERC Form 1, the ISO LMP data should allow us to model the profitability of individual generation units at hourly resolution. This information could be extremely valuable in a regulatory context, for comparing the economic viability of existing fossil plants to new demand side or renewable energy resources.
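As a rough illustration of the hourly profitability calculation described above, the following sketch computes an operating margin for one unit in one hour. It is a minimal example under stated assumptions, not a settled methodology: the `HourlyRecord` class, its field names, and the `hourly_margin` function are hypothetical, the numbers are made up, and the calculation ignores startup costs, fixed costs, and any non-energy revenue.

```python
from dataclasses import dataclass


@dataclass
class HourlyRecord:
    """One hour of operation for a single generating unit.

    All field names are hypothetical; the comments note which dataset
    each value would plausibly come from.
    """
    generation_mwh: float           # hourly output (e.g. derived from EPA CEMS)
    heat_input_mmbtu: float         # fuel burned in the hour (e.g. EPA CEMS)
    lmp_usd_per_mwh: float          # LMP at the unit's node (ISO/RTO data)
    fuel_cost_usd_per_mmbtu: float  # delivered fuel cost (e.g. EIA Form 923)
    vom_usd_per_mwh: float          # non-fuel variable O&M (e.g. FERC Form 1 estimate)


def hourly_margin(rec: HourlyRecord) -> float:
    """Return energy revenue minus fuel and non-fuel variable costs, in USD."""
    revenue = rec.generation_mwh * rec.lmp_usd_per_mwh
    fuel_cost = rec.heat_input_mmbtu * rec.fuel_cost_usd_per_mmbtu
    vom_cost = rec.generation_mwh * rec.vom_usd_per_mwh
    return revenue - fuel_cost - vom_cost


# Illustrative (made-up) values: the same unit in a high-price and a low-price hour.
high_price_hour = HourlyRecord(100.0, 1000.0, 30.0, 2.0, 5.0)
low_price_hour = HourlyRecord(100.0, 1000.0, 20.0, 2.0, 5.0)

print(hourly_margin(high_price_hour))  # 3000 - 2000 - 500 = 500.0
print(hourly_margin(low_price_hour))   # 2000 - 2000 - 500 = -500.0
```

Summing this margin over the hours in which a unit actually ran would show whether it covered its variable costs at prevailing market prices, which is the kind of comparison the regulatory use case calls for.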

Fortunately, there is prior work with ISO/RTO data that we can learn from:

  • WattTime created an open source, Python-based API for accessing ISO data called PyISO. However, it has not been actively maintained since the fall of 2017 and is likely no longer entirely functional. In addition, it only offers access to a subset of the potentially available ISO data, and that subset does not include locational marginal pricing (LMP) data. LMP data is important for many economic analyses of the viability of existing fossil generation or of new resources in competitive markets.
  • Patrick Brown from MIT has also compiled a database of historical LMP values which may be available after his work (currently in review) has been published. This includes both the raw LMP values and the code which he developed to pull data from the various grid operators. In addition he has compiled a geographic mapping of a large portion of the unlabeled grid delivery nodes which we may be able to reuse.

What is contained in this dataset?

  • Grid operators (ISOs and RTOs) provide a huge quantity of data, at up to 5 minute frequency, often in close to real time.
  • Data available from some or all of the ISOs includes:
    • Total load in MW by balancing area.
    • Net trade (import or export) in MW by balancing area.
    • Total generation in MW by balancing area, broken down by power source (coal, gas, nuclear, wind, hydro, etc.)
    • Locational marginal price (LMP) in $/MWh by node throughout the grid. This may or may not include the node’s location, depending on the ISO.
    • Load forecasts at a variety of timescales — 15 minutes ahead, day ahead, etc.
  • Much of the ISO/RTO data is similar to what is provided by the FERC Form 714, but at higher time resolution, and it is made available in close to real time instead of with a ~1 year delay.
  • The LMP data appears to be uniquely available from the ISOs.

Who would use it?

  • Users exploring detailed dispatch models and market dynamics, e.g. how the integration of additional renewable energy and demand side programs has affected the operations and economics of fossil fuel plants.
  • Users evaluating the economic viability of existing power plants in competitive markets, e.g. whether monopoly owned coal plants that participate in competitive markets use economic dispatch, or operate out-of-market, increasing costs to ratepayers, as in this research by Joe Daniels at Union of Concerned Scientists.
  • Users exploring the relationship between weather conditions and grid operations or electricity markets. For example, how extreme weather events affect the grid, when wind and solar resources are most productive, etc.

Risks & Challenges

Each of the ~6 ISOs/RTOs reports different data, in different formats, through different platforms, and these platforms sometimes change. This makes the ISO data challenging to access and more work to maintain over time. It also means that our estimate of the effort required to integrate this data is probably the least certain of any of the datasets we’re considering integrating.

For a significant historical record, 5-minute frequency data can quickly become very large: potentially billions of records and terabytes (thousands of gigabytes) of data. Integrating the ISO/RTO data would benefit dramatically from access to a cloud-based storage and distributed computation platform. Otherwise, it will be hard for users without specialized computing resources to work with the data effectively.
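To make the scale concrete, here is a back-of-the-envelope estimate of how many LMP records a 5-minute archive might contain. This is only a sketch: the node count, the number of years, and the bytes-per-record figure are illustrative assumptions, not measurements of any ISO's actual data, and the function names are hypothetical.

```python
INTERVALS_PER_DAY = 24 * 12  # twelve 5-minute intervals per hour -> 288 per day


def estimate_lmp_records(num_nodes: int, years: int, days_per_year: int = 365) -> int:
    """Estimate the number of 5-minute LMP records for a set of pricing nodes."""
    return num_nodes * INTERVALS_PER_DAY * days_per_year * years


def estimate_size_gb(num_records: int, bytes_per_record: int) -> float:
    """Convert a record count into an approximate uncompressed size in gigabytes."""
    return num_records * bytes_per_record / 1e9


# Assumed inputs (hypothetical): 10,000 pricing nodes, 10 years of history,
# and roughly 100 bytes per stored record.
records = estimate_lmp_records(num_nodes=10_000, years=10)
size_gb = estimate_size_gb(records, bytes_per_record=100)

print(f"{records:,} records")  # 10,512,000,000 records
print(f"{size_gb:,.1f} GB")    # 1,051.2 GB
```

Under these assumptions a single LMP series already exceeds ten billion rows and about a terabyte before compression, which is why cloud storage and distributed computation become attractive.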

There will be many grid delivery nodes whose locations need to be mapped. This could be time consuming and tedious or technically challenging. FERC, EIA, and the ISOs/RTOs probably do not use the same IDs or nomenclature to identify grid delivery nodes, meaning integration of the ISO LMP, FERC EQR, EIA Form 860, EIA Form 923, and EPA CEMS data will probably be both challenging and valuable.