id | title |
---|---|
google-bigquery |
Google BigQuery |
This page succinctly describes how to use Google BigQuery to analyze data from the MultiversX blockchain.
BigQuery is Google's fully managed, serverless data warehouse that enables analysis of extremely large datasets using SQL queries and / or visual tools (such as Google Looker Studio); it also has built-in machine learning capabilities.
MultiversX Blockchain data is published to Google BigQuery, and available (for free) through the Google Cloud Marketplace. The dataset, namely bigquery-public-data.crypto_multiversx_mainnet_eu
, is one of many crypto datasets that are available within Google Cloud Public Datasets. One can query these datasets for free: up to 1TB / month of free processing, every month.
The MultiversX BigQuery dataset closely resembles the set of indices of the MultiversX Elasticsearch instance. Their schema and data are approximately equivalent, the data being mirrored from the Elasticsearch instance to BigQuery at regular intervals (most tables are updated hourly, and a few are updated every 4 hours).
:::note As of February 2024, the MultiversX BigQuery dataset is not updated in real-time (see above). For real-time data, use the public APIs. :::
:::note If you experience any issue with the published dataset, please let us know. :::
Google BigQuery Studio is a unified workspace for Google Cloud's data analytics suite which incorporates, among others, an SQL editor (optionally assisted by AI) and Python notebooks. It is a great way to explore the MultiversX dataset, and to run queries. Below, we'll explore a few example queries.
:::tip Make sure to explore the dataset, the tables and their schema before running queries. Both the schema and a data preview are available in BigQuery Studio. :::
SELECT
DATE(`timestamp`) `day`,
COUNT(*) `transactions`
FROM `bigquery-public-data.crypto_multiversx_mainnet_eu.transactions`
WHERE DATE(`timestamp`) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
GROUP BY `day`
ORDER BY `day` DESC
SELECT
DATE(`timestamp`) `day`,
`receiver` `contract`,
COUNT(DISTINCT `sender`) `num_users`,
FROM `bigquery-public-data.crypto_multiversx_mainnet_eu.transactions`
WHERE `isScCall` = true
GROUP BY `day`, `contract`
HAVING `day` >= DATE_SUB(CURRENT_DATE(), INTERVAL 3 DAY) AND `num_users` > 1000
ORDER BY `day` DESC, `num_users` DESC
SELECT
`token`,
`type`,
COUNT(_id) `num_holders`
FROM `bigquery-public-data.crypto_multiversx_mainnet_eu.accountsesdt`
WHERE `type` = 'FungibleESDT' OR `type` = 'MetaESDT'
GROUP BY `token`, `type`
HAVING `num_holders` > 5000
ORDER BY `num_holders` DESC
SELECT
`day`,
`hash`,
`sender`,
`receiver`,
`amount`
FROM (
SELECT
DATE(`timestamp`) `day`,
`_id` `hash`,
`sender`,
`receiver`,
PARSE_BIGNUMERIC(`value`) `amount`,
ROW_NUMBER() OVER (PARTITION BY DATE(`timestamp`)
ORDER BY PARSE_BIGNUMERIC(`value`) DESC) AS `row_num`
FROM
`bigquery-public-data.crypto_multiversx_mainnet_eu.transactions`
WHERE
`status` = 'success'
AND DATE(`timestamp`) >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY) )
WHERE `row_num` = 1
ORDER BY `day` DESC
LIMIT 7;
SELECT
DATE(`timestamp`) `day`,
-- 14400 is the number of rounds per day, and 3 + 1 = 4 is the number of shards
ROUND(COUNT(*) / (14400 * 4), 4) `hit_rate`
FROM `bigquery-public-data.crypto_multiversx_mainnet_eu.blocks`
WHERE
DATE(`timestamp`) >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
AND DATE(`timestamp`) < CURRENT_DATE()
GROUP BY `day`
ORDER BY `day` DESC
:::note Even if BigQuery includes a generous free tier, it is important to be mindful of the costs associated with running queries. For more information, see BigQuery pricing.
If you believe that specific optimizations can be applied to the dataset (to improve query performance), please let us know. :::
Google Looker Studio is a powerful tool for analyzing data and creating (shareable) reports. Out of the box, it connects to BigQuery (and many other data sources), thus it's a great way to explore the MultiversX dataset.
Example of report created in Looker Studio (leveraging the MultiversX dataset in BigQuery):
:::tip In the BigQuery Studio, you can save the results of a given query as your own BigQuery tables, then immediately import them in Looker Studio, to create visualizations and reports. :::
One can also query datasets programmatically, using the BigQuery client libraries.
See how to query a public dataset with the BigQuery client libraries.