DataHub ESG Tables

📘

General Availability Release

DataHub, Arcadia’s new utility data product for aggregated meter-level usage details, is officially in General Availability. Through this offering, Arcadia provides aggregated meter-level usage details in Snowflake to support ESG Scope 2 reports.

Introduction

DataHub ESG Tables Overview

The DataHub ESG Tables release provides calendarized, aggregated meter-level utility data at monthly, quarterly, and annual timeframes to support ESG Scope 2 reports. The ESG tables provide the underlying meter data that you need to calculate your emissions. Four tables are available: Calendarized Monthly Usages, Calendarized Quarterly Usages, Calendarized Annual Usages, and a Statement Availability Report.

For example, the Calendarized Monthly Usages table aggregates your meters’ daily usage values into monthly totals. The table includes the original usage and unit of measure as printed on the utility statement, along with standardized usage and unit of measure columns. With this table you can see how much energy a meter used in a specific month and calculate the greenhouse gas emissions attributable to that meter. The Calendarized Quarterly Usages and Calendarized Annual Usages tables support the same use case and aggregate the meter-level data at the quarterly and annual levels, respectively.
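As a rough illustration of that use case, the sketch below multiplies monthly usage totals by an emission factor. The function name, the month keys, the usage values, and the factor are all hypothetical placeholders, not values or fields from DataHub; you would supply the emission factor that your reporting methodology requires.

```python
def scope2_emissions_kg(usage_kwh: float, factor_kg_co2e_per_kwh: float) -> float:
    """Return emissions in kg CO2e for a usage total, given a caller-supplied factor."""
    if usage_kwh < 0 or factor_kg_co2e_per_kwh < 0:
        raise ValueError("usage and emission factor must be non-negative")
    return usage_kwh * factor_kg_co2e_per_kwh


# Illustrative inputs only: these are made-up numbers, not DataHub output.
monthly_kwh = {"2024-01": 1200.0, "2024-02": 1100.0}
ILLUSTRATIVE_FACTOR = 0.5  # placeholder kg CO2e per kWh, not a published factor

emissions = {
    month: scope2_emissions_kg(kwh, ILLUSTRATIVE_FACTOR)
    for month, kwh in monthly_kwh.items()
}
```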

This release also provides the Statement Availability Report, which summarizes how many statements are included in your calendarized ESG report. The table provides columns for your total statements, the statements included in your calendarized ESG report, the statements excluded from it, and the percentage of statements included. DataHub strives for 95% or more of your statements to be available in your ESG report, and audits prevent statements with data quality issues from polluting your meters’ total usage values. The remaining statements are those where the utility provider reports data that is inconsistent, unintelligible, or filled with gaps. The Proration & Inference features address these inconsistencies in two ways: by prorating account-level charges to the meter level, and by inferring a meter’s total usage when its previous and current readings are printed on the statement but its total usage is not. You can learn more in the Proration & Inference feature deep dive. Arcadia will continue to enhance this logic to solve additional edge cases over time.
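The relationship between the report's columns can be sketched as follows. The function and parameter names are hypothetical and do not reflect the actual column names in the Statement Availability Report; the 95% default simply mirrors the target stated above.

```python
def statements_included_pct(total: int, included: int) -> float:
    """Percentage of statements that made it into the calendarized report."""
    if total < 0 or not 0 <= included <= total:
        raise ValueError("expected 0 <= included <= total")
    if total == 0:
        return 0.0  # assumption: report 0% when there are no statements at all
    return 100.0 * included / total


def meets_target(total: int, included: int, target_pct: float = 95.0) -> bool:
    """True when the included percentage reaches the stated 95% goal."""
    return statements_included_pct(total, included) >= target_pct


# Excluded statements are simply the remainder.
total, included = 200, 192
excluded = total - included
```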

📘

Note

Because utility providers post statements on inconsistent schedules, Arcadia recommends that you wait at least 45 days after the last day of the month, quarter, or year before running the report for that period. This helps ensure that all meter data for the period is available.
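A minimal sketch of that 45-day rule, using only the Python standard library. The helper names are hypothetical; the only value taken from this guide is the 45-day wait.

```python
import calendar
from datetime import date, timedelta

WAIT_DAYS = 45  # recommended wait after the period's last day


def month_end(year: int, month: int) -> date:
    """Last calendar day of the given month."""
    return date(year, month, calendar.monthrange(year, month)[1])


def quarter_end(year: int, quarter: int) -> date:
    """Last calendar day of the given quarter (1-4)."""
    return month_end(year, quarter * 3)


def earliest_report_date(period_end: date, wait_days: int = WAIT_DAYS) -> date:
    """First date on which the period is at least `wait_days` in the past."""
    return period_end + timedelta(days=wait_days)


def is_ready(period_end: date, today: date) -> bool:
    """True once the recommended wait has elapsed."""
    return today >= earliest_report_date(period_end)
```

For an annual report, pass `month_end(year, 12)` as the period end.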


Data Access Options

DataHub provides access to your utility data through two new ingestion options: zipped CSV files delivered to your SFTP server or access to a Snowflake Direct Share to query the data in Snowflake directly.

SFTP Delivery

For the SFTP option, you must set up your own SFTP server. Upon initial SFTP setup, you receive your selected calendarized table (monthly, quarterly, or annual) with aggregated meter-level utility data for all statements, along with the Statement Availability Report. You then receive both files again on the first of each month. For example, with the quarterly report you receive two files on your SFTP server: datahub_quarterly_usages_02_21_2024-10_13_15.csv.gz and datahub_statement_availability_report_02_21_2024-10_13_15.csv.gz.
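Because a new pair of files arrives each month, it can help to pick the newest file per table programmatically. The sketch below assumes the file name suffix is a timestamp in MM_DD_YYYY-HH_MM_SS form, which is inferred from the two example names above and is not a documented format; the function names are hypothetical.

```python
import re
from datetime import datetime

# Assumed pattern: <table_name>_<MM_DD_YYYY-HH_MM_SS>.csv.gz
FILENAME_RE = re.compile(
    r"^(datahub_[a-z_]+?)_(\d{2}_\d{2}_\d{4}-\d{2}_\d{2}_\d{2})\.csv\.gz$"
)


def parse_delivery(filename: str):
    """Return (table_name, timestamp), or None if the name does not match."""
    match = FILENAME_RE.match(filename)
    if match is None:
        return None
    table, stamp = match.groups()
    return table, datetime.strptime(stamp, "%m_%d_%Y-%H_%M_%S")


def latest_per_table(filenames):
    """Map each table name to its most recent delivery file name."""
    latest = {}
    for name in filenames:
        parsed = parse_delivery(name)
        if parsed is None:
            continue  # skip files that are not DataHub deliveries
        table, stamp = parsed
        if table not in latest or stamp > latest[table][0]:
            latest[table] = (stamp, name)
    return {table: name for table, (_, name) in latest.items()}
```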

Arcadia recommends that you use the most recent file to ensure that your statement and usage data is up to date. You may set up all three calendarized tables as SFTP deliveries if your use cases require monthly, quarterly, and annual calendarized meter usage data. In SFTP delivery files, null values are represented as ,"\n", for VARCHAR-like data types and as ,\\n, for NUMBER-like data types. Empty string values are represented as ,"",.
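One way to load a delivery file and map those null markers to Python's None is sketched below. It assumes the markers appear in the file as literal backslash sequences exactly as written above, and that the file has a header row; both are assumptions that you should verify against an actual delivery. The function name is hypothetical.

```python
import csv
import gzip

# Assumed literal markers: backslash + n (VARCHAR-like) and
# two backslashes + n (NUMBER-like). Adjust after inspecting a real file.
ASSUMED_NULL_TOKENS = frozenset({"\\n", "\\\\n"})


def read_delivery(path_or_file, null_tokens=ASSUMED_NULL_TOKENS):
    """Read a gzipped CSV delivery into a list of dicts, mapping null markers to None.

    Empty strings are left as empty strings, since the files distinguish
    them from nulls.
    """
    with gzip.open(path_or_file, "rt", encoding="utf-8", newline="") as handle:
        reader = csv.DictReader(handle)
        return [
            {key: (None if value in null_tokens else value) for key, value in row.items()}
            for row in reader
        ]
```

All non-null values come back as strings; convert numeric columns after loading.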

Snowflake Data Share

For the Snowflake Data Share option, your Snowflake account must be hosted on AWS in the us-east-1 region. For additional information on Snowflake Data Shares, review Snowflake’s overview guide. After you receive access to the private Snowflake Data Share, you can query the full dataset in all four tables using SQL or Python. The Snowflake Data Share table names are datahub_monthly_usages, datahub_quarterly_usages, datahub_annual_usages, and datahub_statement_availability_report.
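A minimal sketch of querying one of these tables from Python is shown below. It works with any DB-API 2.0 connection, for example one returned by `snowflake.connector.connect(...)` from the snowflake-connector-python package. The helper name is hypothetical, and the sketch assumes the connection's current database and schema already point at the share, since the database and schema names are not stated in this guide.

```python
ESG_TABLES = (
    "datahub_monthly_usages",
    "datahub_quarterly_usages",
    "datahub_annual_usages",
    "datahub_statement_availability_report",
)


def fetch_rows(conn, table: str, limit: int = 10):
    """Return up to `limit` rows from one of the four ESG tables."""
    # Table names cannot be bound as query parameters, so restrict them
    # to the known list instead of interpolating arbitrary input.
    if table not in ESG_TABLES:
        raise ValueError(f"unknown ESG table: {table!r}")
    if not isinstance(limit, int) or limit < 1:
        raise ValueError("limit must be a positive integer")
    cursor = conn.cursor()
    try:
        cursor.execute(f"SELECT * FROM {table} LIMIT {limit}")
        return cursor.fetchall()
    finally:
        cursor.close()
```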



DataHub - Standardize Unit of Measure Features

Standardizing units of measure on meter usages enables DataHub to better serve the ESG use case by providing more accurate usage totals in a single unit of measure. Arcadia derives its conversion factors from a combination of Energy Star’s US conversions table and NIST conversion factors, and assumes that gallons are measured in US liquid gallons. For conversions from volume to energy (e.g. natural gas) or from mass to energy (e.g. steam), Arcadia uses Energy Star’s conversion factors. Arcadia now supports standardizing units of measure where possible for the following service types:

  • Electric/Lighting: kWh for consumption and kW for demand
  • Natural Gas: therms
  • Water/Sewer/Irrigation: Gallons

DataHub does not support unit of measure conversions for reactive max measured demand and reactive total consumption values. This includes the following units of measure: kvar, kva, kvarh, kvah, mvarh, undefined, unit, and days.

DataHub also does not support standardizing the following units of measure: horsepower, residential cooling hecta liters, nm3/h, sm3, m3/h, kgh, and kg.
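The standardization described in this section can be pictured as a lookup from a source unit to a target unit and a multiplier. The sketch below is an illustration only, not Arcadia's conversion table: the unit spellings are hypothetical, and it includes only factors that follow from unit definitions (metric prefixes, 1 therm = 100,000 Btu, 1 US liquid gallon = 231 cubic inches = 3.785411784 liters). Volume-to-energy and mass-to-energy conversions are omitted because they depend on Energy Star factors that are not listed in this guide.

```python
LITERS_PER_US_GALLON = 3.785411784      # exact by definition
GALLONS_PER_CUBIC_FOOT = 1728 / 231     # cubic inches per ft^3 / per US gallon

# source unit (lowercase) -> (standard unit, multiplier)
TO_STANDARD = {
    "wh": ("kWh", 0.001),
    "kwh": ("kWh", 1.0),
    "mwh": ("kWh", 1000.0),
    "kw": ("kW", 1.0),
    "mw": ("kW", 1000.0),
    "btu": ("therms", 1e-5),
    "therms": ("therms", 1.0),
    "mmbtu": ("therms", 10.0),
    "gallons": ("Gallons", 1.0),
    "liters": ("Gallons", 1 / LITERS_PER_US_GALLON),
    "cubic_feet": ("Gallons", GALLONS_PER_CUBIC_FOOT),
}


def standardize(value: float, unit: str):
    """Return (standard_value, standard_unit), or None when no conversion applies.

    Units that this guide lists as unsupported (for example kvarh or
    horsepower) are simply absent from the table and therefore return None.
    """
    entry = TO_STANDARD.get(unit.strip().lower())
    if entry is None:
        return None
    standard_unit, factor = entry
    return value * factor, standard_unit
```

Returning None rather than raising keeps the original value usable when no standardized value exists.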


DataHub - Calendarization Features

To calendarize meter usages, Arcadia’s system compares a meter’s statements over time to identify whether the utility provider prints full or partial measurement period start and end dates. This determines the number of calendarized days in the period and allows the daily usage averages to be calculated accurately. ‘Full’ indicates that the start or end day is a full 24-hour day within a single statement. ‘Partial’ means that the utility provider reads the meter during business hours, so the period start or end date is not a full 24-hour day. Partial start and end dates are shared between two or more statements, and Arcadia splits partial period dates evenly when calculating the daily usage values.
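To make the idea concrete, the sketch below spreads one statement's usage across its days and sums the result by calendar month. It assumes that a partial boundary day is shared by exactly two statements and therefore counts as half a day for each; the function name and signature are hypothetical, and this is an illustration rather than Arcadia's implementation.

```python
from collections import defaultdict
from datetime import date, timedelta


def calendarize(start: date, end: date, usage: float,
                start_partial: bool = False, end_partial: bool = False):
    """Allocate a statement's usage to (year, month) buckets via daily averages."""
    if end < start:
        raise ValueError("end date precedes start date")

    # Every date in the period counts as one day, except partial boundary
    # dates, which are assumed to be split evenly with the adjacent statement.
    weights = {
        start + timedelta(days=offset): 1.0
        for offset in range((end - start).days + 1)
    }
    if start_partial:
        weights[start] -= 0.5
    if end_partial:
        weights[end] -= 0.5

    total_days = sum(weights.values())
    if total_days <= 0:
        raise ValueError("period has no measurable days")

    daily_usage = usage / total_days
    monthly = defaultdict(float)
    for day, weight in weights.items():
        monthly[(day.year, day.month)] += daily_usage * weight
    return dict(monthly)
```

For a statement running from January 16 to February 15, 2024 with partial dates at both ends and 300 units of usage, the period has 30 effective days, so January receives 15.5 days of usage and February receives 14.5.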

The system then rolls these daily usage values up into monthly, quarterly, or annual aggregated meter usage data. For the first statement available for a meter, Arcadia’s system assumes that the meter’s period start date is full. For the meter’s most recent statement, Arcadia’s system determines the mode from the three most recent statements and assigns that value, either full or partial, to the period end date. For full period start and end dates, the period’s days are the date difference between the two dates plus one, so that the last day in the period is counted. For partial start and end dates, the period’s days are that inclusive count minus one.
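The two day-count rules and the mode-based inference can be sketched as follows. The function names and the "full"/"partial" string labels are hypothetical. The guide does not say how periods with one full and one partial boundary are counted, so this sketch covers only the two documented cases.

```python
from datetime import date
from statistics import mode


def period_days(start: date, end: date, boundary_type: str) -> int:
    """Days in a period whose start and end dates are both full or both partial."""
    inclusive = (end - start).days + 1  # date difference including the last day
    if boundary_type == "full":
        return inclusive
    if boundary_type == "partial":
        return inclusive - 1
    raise ValueError("boundary_type must be 'full' or 'partial'")


def infer_latest_end_type(recent_types):
    """Pick the most common boundary type among the three most recent statements."""
    last_three = list(recent_types)[-3:]
    if not last_three:
        raise ValueError("at least one statement is required")
    return mode(last_three)


# The first statement's start date is assumed to be full.
FIRST_STATEMENT_START_TYPE = "full"
```

With three values and only two possible labels, the mode is always unique.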


Data Dictionary

The data dictionary covers all four tables and their fields: Calendarized Monthly Usages, Calendarized Quarterly Usages, Calendarized Annual Usages, and Statement Availability Report. For each field, it lists the field name, an example value, whether the field can be null, and a description.

Measured Usages

Measured usage is the metered and measured amount of energy on a utility statement that is used for billing purposes. Utility statements take a meter’s measured usage and apply the meter’s readings, multiplier, constant factor, and conversion factor to calculate charge items. Measured usage implies actively metered events. Meter volumes that are estimated because of utility provider logistical problems are still considered measured usage events.

Measured usage and cited usage are mutually exclusive. For Arcadia, measured usage is the foundation for ESG and carbon accounting (Scope 2). Designating a usage item as measured declares that observed consumption or demand events occurred in the relevant period. Usage metrics that are solely functions of the tariff or customer history, and are not dynamically measured in the current statement, are not considered measured usage.

Cited Usages

Cited usage implies that the usage amount is used for billing purposes but does not represent a quantity of energy (e.g. real consumption or demand) that was actually used or measured by a meter. If a meter contains only cited usages, this triggers an audit, and the meter and its cited usages are not stored in the ESG tables. Utility providers tend to cite usage elements as justification for each of the charges assessed in the monthly invoices. Any usage metrics that are functions of the tariff or your usage history and are not dynamically measured in the current statement are considered cited usage. Ratchet demands and annual usage totals are examples of cited usages. Cited usages do affect fees and billing, but they do not represent the current or actual consumption of the meters.
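The cited-only rule can be illustrated with a small classification sketch. The record shape (a meter ID paired with a "measured" or "cited" label) and the function name are hypothetical and do not correspond to DataHub fields.

```python
def split_meters(usages):
    """Separate meters with measured usage from meters with only cited usage.

    `usages` is an iterable of (meter_id, usage_kind) pairs, where usage_kind
    is "measured" or "cited". Returns (stored, audited): meters that have at
    least one measured usage, and meters whose usages are all cited.
    """
    kinds_by_meter = {}
    for meter_id, usage_kind in usages:
        if usage_kind not in ("measured", "cited"):
            raise ValueError(f"unexpected usage kind: {usage_kind!r}")
        kinds_by_meter.setdefault(meter_id, set()).add(usage_kind)

    stored = sorted(m for m, kinds in kinds_by_meter.items() if "measured" in kinds)
    audited = sorted(m for m, kinds in kinds_by_meter.items() if kinds == {"cited"})
    return stored, audited
```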

