Please describe your proposed solution.
Executive Summary
- We'll build a custom PostgreSQL extension that allows you to interact with cardano CBOR data directly.
- Developers will be able to store raw CBOR as table columns and use our extensions to evaluate predicates and projections directly.
- This simplifies many data indexing scenarios and provides improved performance over alternative queries requiring complex table joins.
- We'll also provide a generic dataset, a simple PostgreSQL schema + data that represents much of the data found on-chain leveraging the Mumak extension.
- We'll integrate Mumak with Mithril and Oura, providing a way for fast processing of the whole history of the chain.
- We'll integrate Mumak into TxPipe's Scrolls indexer providing a way to create richer dataset in a simpler way.
- We'll integrate the Mumak service on Demeter platform so that it can be accessed directly without the need of provisioning your own instance.
Introduction
The de facto wire format used to exchange Cardano data is CBOR. By Cardano data we refer to Blocks, Transactions, UTxOs, Certificates and all of its inner structures.
When indexing this data onto a relational database, such as PostgresQL, we usually deserialize the CBOR structures and map the values to different columns & tables in our schema.
Some times having a normalized data model is exactly what your use-case needs, but there are multiple scenarios where this mapping just adds complexity and performance penalties.
Mumak is a PostgreSQL extension that provides several custom functions to interact with Cardano CBOR data directly on the database.
For example, this means that you could store a whole Cardano Tx as CBOR in a bytes PostgreSQL column and then use SQL to ask things such as:
- Does this transaction involve address X?
- What's the total ADA output of this transaction?
- Is this transaction minting a token for policy X?
The evaluation of these values will happen in-process, as part of the Db query execution.
Usage Example
Let say that we have a table where we store transactions in CBOR format. By doing this, we could connect to the PostgreSQL instance and run queries like these:
> SELECT
> tx_hash(tx_cbor, era),
> tx_lovelace_output(tx_cbor, era)
> FROM txs
> WHERE
> tx_has_address(tx_cbor, era, 'addr1xxx');
> SELECT
> tx_asset_mint(tx_cbor, era, '0x000000', 'myasset')
> FROM txs
> WHERE
> tx_has_asset(tx_cbor, era, '0x000000', 'myasset')
> SELECT
> tx_metadata_label(tx_cbor, era, 721)
> FROM txs
> WHERE
> tx_has_metadata_label(tx_cbor, era, 721)
Let's say that we now have a table where we store Cardano address as raw bytes. We could run the following queries:
> SELECT
> delegation_part(address)
> WHERE
> is_script_address(address)
> SELECT
> payment_part(address)
> WHERE
> delegation_part(address) = "0x000"
>
More information
If you're interested in learning more about the project, even outside the scope of the Catalyst proposal, please bookmark our git repository. We'll make sure to maintain the README updated with relevant information and news.