DataHub Foundational Tables
Learn about DataHub's Preview Tables & Capabilities
Preview Feature
Arcadia’s new meter-centric utility data product, DataHub, is officially in Preview release! Arcadia is excited to announce support for Plug meter-level details in Snowflake through this new offering.
Introduction
What is DataHub?
DataHub’s Foundational Tables preview release provides utility data across three tables: Meters by Statement, Meter Usages, and Meter Charges. Each record in the Meters by Statement table shows a specific utility meter’s total measured usage, total cited usage, total demand usage, bidirectional in, bidirectional out, and several additional fields that tell the full story about the meter and the energy consumed at its service location. To validate the meter-level totals in the Meters by Statement table, you can search the Meter Usages and Meter Charges tables on the relevant Meter ID, Site ID, Meter Data ID, or another field to dig into the meter’s line item usage or charge details. DataHub also provides a link to the statement source file (e.g. PDF) on all records in the three tables.
DataHub enables you to access your utility data through two new ingestion options: incremental daily zipped CSV files for all three tables delivered to your SFTP server, or direct table access through a Snowflake Data Share. With the daily incremental files option, you receive historical records for all tables in the initial compressed file delivery, and then new and updated records for each table every day moving forward as compressed files. For the Snowflake Data Share option, your Snowflake account must be hosted on AWS in the us-east-1 region. You receive access to a private Snowflake Data Share where you can query in SQL or Python against the full dataset in all three tables. The Snowflake Data Share table paths are: datahub.shares.meters_by_statement_current, datahub.shares.usages_current, and datahub.shares.charges_current.
Prerequisites
Enable Proration and Inference Features
If you are a new customer who signed on or after March 25th, 2024, proration and inference features are enabled by default. If you are a legacy customer, your first required step is to ensure that your organization is enabled for proration and inference features.
Onboard Utility Credentials and Utility Files onto Plug
As a second required step, you will onboard your utility data onto Plug. You will submit utility credentials through Plug’s Connect experience or Create Credential API endpoint. If you do not have utility credentials and instead have the source statement PDFs, you will onboard these statements through Plug’s Bill Uploader module or Add File API endpoint.
Create Sites and Assign Meters to Site
In order to group your Meters by Site ID in DataHub, Arcadia recommends that you create Sites for your relevant geographical locations and map utility meters to the site containers to further organize your utility data in Plug.
DataHub - Standardize Unit of Measure Features
Standardizing units of measure on meter usages enables DataHub to better serve the ESG use case by providing more accurate usage totals in the same unit of measure. Arcadia derives energy conversion factors from a combination of Energy Star’s US conversions table and NIST (nist.gov) conversion factors, and assumes that gallons are measured in US liquid gallons. For volume-to-energy conversions (e.g. natural gas) and mass-to-energy conversions (e.g. steam), Arcadia uses Energy Star’s conversion factors. Arcadia now supports standardizing units of measure where possible for the following service types:
- Electric/Lighting: kWh for consumption and kW for demand
- Natural Gas: therms
- Water/Sewer/Irrigation: Gallons
DataHub does not support unit of measure conversions for reactive max measured demand and reactive total consumption values. This includes the following units of measure: kvar, kva, kvarh, kvah, mvarh, undefined, unit, and days.
DataHub also does not support standardizing the following units of measure: horsepower, residential cooling hecta liters, nm3/h, sm3, m3/h, kgh, and kg.
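The standardization rules above can be sketched roughly as follows. This is only an illustration of the idea, not Arcadia's implementation: the service type keys, the function name, and the factor table are assumptions, and the factors shown are plain unit definitions (for example, 1 US liquid gallon = 3.785411784 liters) rather than the Energy Star or NIST tables that DataHub actually uses. Only consumption units are covered; demand (kW) is left out for brevity.

```python
# Hypothetical service type keys mapped to the target consumption units
# named in the list above.
TARGET_UNITS = {"electric": "kWh", "natural_gas": "therms", "water": "gallons"}

# Units the text says DataHub does not convert (lowercased for lookup).
UNSUPPORTED = {
    "kvar", "kva", "kvarh", "kvah", "mvarh", "undefined", "unit", "days",
    "horsepower", "nm3/h", "sm3", "m3/h", "kgh", "kg",
}

# Illustrative factors based on standard unit definitions only.
# These are NOT Arcadia's conversion tables.
FACTORS = {
    ("wh", "kWh"): 0.001,
    ("mwh", "kWh"): 1000.0,
    ("dekatherms", "therms"): 10.0,
    ("liters", "gallons"): 1 / 3.785411784,   # US liquid gallon
    ("cubic feet", "gallons"): 1728 / 231,    # 231 cubic inches per gallon
}


def standardize(value, unit, service_type):
    """Return (value, unit) in the target unit when a factor is known.

    Unsupported or unknown units are returned unchanged, mirroring the
    "where possible" wording above.
    """
    target = TARGET_UNITS[service_type]
    key = unit.lower()
    if key == target.lower():
        return value, target
    if key in UNSUPPORTED:
        return value, unit
    factor = FACTORS.get((key, target))
    if factor is None:
        return value, unit
    return value * factor, target


print(standardize(2.5, "MWh", "electric"))
print(standardize(100, "kvarh", "electric"))
```

Returning the original value and unit for anything without a known factor keeps unconverted rows visible instead of silently dropping them.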
Data Access Options
DataHub provides access to your utility data through two new ingestion options: zipped CSV files delivered to your SFTP server or access to a Snowflake Direct Share to query the data in Snowflake directly.
SFTP Delivery
In the SFTP option, you must set up your own SFTP server. You receive all three tables with all historical records upon initial SFTP setup. You also receive incremental files with new and updated statements every day moving forward. SFTP delivery files represent null and empty values as follows:
- Null values for VARCHAR-like data types: ,"\n",
- Empty string values: ,"",
- Null values for NUMBER-like data types: ,\\n,
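When loading the delivered CSV files, these markers need to be mapped back to real nulls. The sketch below shows one way to do that with Python's standard csv module. It assumes the markers appear in the file as literal characters (a backslash followed by the letter n, not an actual line break), and the column names in the sample are hypothetical; check a delivered file before relying on either assumption.

```python
import csv
import io

# Assumption: null markers are literal text, i.e. backslash + "n" for
# VARCHAR-like columns and two backslashes + "n" for NUMBER-like columns.
NULL_MARKERS = {r"\n", r"\\n"}


def read_rows(text):
    """Parse CSV text into dicts, turning null markers into None.

    Empty strings are kept as "" so they stay distinct from nulls.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = []
    for record in reader:
        rows.append({
            name: (None if value in NULL_MARKERS else value)
            for name, value in zip(header, record)
        })
    return rows


# Hypothetical sample; real files use the fields in the data dictionary.
sample = "\n".join([
    "meter_id,nickname,total_usage",
    r'm-1,"\n",\\n',
    r'm-2,"",12.5',
]) + "\n"

for row in read_rows(sample):
    print(row)
```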
SFTP Setup
To set up your SFTP connection with Arcadia, share the information below with your Arcadia contact. Arcadia will provide you with its public key (2048-bit RSA). Work with your IT or infrastructure department to configure and provide these details, and with your Arcadia representative to resolve any issues during initial setup.
- SFTP server hostname/IP
  - e.g. 192.158.1.38 or sftp.example.com
- SFTP server public key. This key must be in OpenSSH’s authorized_keys format.
  - e.g. ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDXdVM9rD7wE1vBcDpfJszq3Hs7F...
  - If you do not have an existing key pair, you can generate a new private / public key pair in this format with the following command:
    ssh-keygen -t rsa -b 2048
  - The .pub file that is generated from the above command contains the OpenSSH format public key that you should share in the SFTP setup modal. You can optionally copy it to your clipboard by running the following command:
    pbcopy < ~/.ssh/<FILENAME_OF_RSA_KEY>.pub
  - The private key file (private_key.pem or the file you specified when running the command) is the file you should keep secret and NOT share with Arcadia. This private key file should be used to configure your SFTP server.
- The username of the user you have set up for Arcadia to access your SFTP server.
  - e.g. arcadia
  - Note that you will need to associate Arcadia’s OpenSSH format public key (2048-bit RSA) with this user, so Arcadia may authenticate to your server. Most SFTP servers have an authorized_keys file that you should add this key to. This key can be found under the “Arcadia public key value” field in the SFTP setup modal on your Arc dashboard.
- Full path in the above user’s directory that you would like your Arcadia data to be stored.
  - e.g. / or /datahub/reports
  - You should create this folder and provide the created user write access to it before submitting this information to Arcadia.
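Before sharing a public key, it can help to confirm that it really is an ssh-rsa key in authorized_keys format with a 2048-bit modulus. The following is a minimal sketch using only the Python standard library. It relies on the standard SSH public key encoding (length-prefixed fields: key type, exponent, modulus); the function names are made up for this example, and the check is a convenience, not part of Arcadia's setup process.

```python
import base64
import struct


def read_fields(blob):
    """Split an SSH wire-format blob into its length-prefixed fields."""
    fields = []
    offset = 0
    while offset < len(blob):
        (length,) = struct.unpack(">I", blob[offset:offset + 4])
        offset += 4
        fields.append(blob[offset:offset + length])
        offset += length
    return fields


def describe_rsa_public_key(line):
    """Return the type, modulus size, and comment of an ssh-rsa key line.

    Raises ValueError when the line is not in authorized_keys format.
    """
    parts = line.strip().split(None, 2)
    if len(parts) < 2 or parts[0] != "ssh-rsa":
        raise ValueError("expected a line of the form 'ssh-rsa <base64> [comment]'")
    blob = base64.b64decode(parts[1], validate=True)
    fields = read_fields(blob)
    if len(fields) != 3 or fields[0] != b"ssh-rsa":
        raise ValueError("key data is not an ssh-rsa public key")
    modulus = int.from_bytes(fields[2], "big")
    return {
        "type": "ssh-rsa",
        "bits": modulus.bit_length(),
        "comment": parts[2] if len(parts) > 2 else "",
    }
```

A typical use is to read the contents of the generated .pub file, pass it to describe_rsa_public_key, and check that the reported "bits" value is 2048 before pasting the key into the setup modal.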
Snowflake Data Share
For additional information on Snowflake Data Shares, review Snowflake’s overview guide. For the Snowflake Data Share option, your Snowflake account must be hosted on AWS in the us-east-1 region. After receiving access to the private Snowflake Data Share, you can query in SQL or Python against the full dataset in all three tables. The Snowflake Data Share table names are meters_by_statement_current, usages_current, and charges_current.
Snowflake Data Share Setup
To access DataHub’s private Snowflake Data Share, your Snowflake account must be hosted on AWS in the us-east-1 region. Once you confirm that your Snowflake account is hosted on this cloud provider and in this region, please provide the details below to your Arcadia contact to set up the Data Share.
In your Snowflake environment, run the command below and provide your current organization name, your current account name, and account locator to your Arcadia contact.
select current_organization_name(), current_account_name(), current_account();
Once Arcadia sets up your Snowflake account for the Data Share, the Arcadia database is available in the ‘Ready to Get’ section. In Snowflake’s Snowsight UI:
- Sign in to Snowsight
- Select Data » Private Sharing
- Select the Shared with You tab
- In the Ready to Get section, select the share that you want to create a database for
- Set a database name and the roles that are permitted to access the database
- Select Get Data
Once the Data Share is available, you can query all the data in one of the three tables with:
SELECT * FROM datahub.shares.<table_name>;
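For Python access, the same queries can be issued through any DB-API style cursor, for example one obtained from the snowflake-connector-python package. The sketch below only builds the table paths and runs a row count per table. The helper names are hypothetical, and the database name defaults to datahub as in the paths above; if you chose a different database name in the Get Data step, pass that name instead.

```python
# Table names as listed in this guide.
TABLES = ("meters_by_statement_current", "usages_current", "charges_current")


def table_path(table_name, database="datahub", schema="shares"):
    """Build the fully qualified path for one of the three shared tables."""
    if table_name not in TABLES:
        raise ValueError(f"unknown DataHub table: {table_name}")
    return f"{database}.{schema}.{table_name}"


def count_rows(cursor, database="datahub", schema="shares"):
    """Run SELECT COUNT(*) against each shared table using the given cursor.

    `cursor` is assumed to expose execute() and fetchone(), as DB-API
    cursors do.
    """
    counts = {}
    for name in TABLES:
        cursor.execute(f"SELECT COUNT(*) FROM {table_path(name, database, schema)}")
        counts[name] = cursor.fetchone()[0]
    return counts


print(table_path("usages_current"))
```

Restricting table_path to the known table names avoids interpolating arbitrary text into the SQL string.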
Data Dictionary
The data dictionary includes all three tables and their fields. The tables available are Meters by Statement, Meter Usages, and Meter Charges. For each table, there are field details for field name, field example value, if the field can be null, and field description.
Recipes
- Grouping Annual Usages by Site ID & Service Type
- Identify Highest On Peak Demand Charge by Site
- Find Greatest Expenses by Site in 2023
- Highest Actual Peak by Provider & Year
- Largest Delta Between Max Peak Actual Demand & Cited Demand by Meter & Year
- Energy Returned to the Grid by Site, Meter, & Year