Rust SDK fix critical CBOR encoding
Current Project Status: Complete
Amount Received: $50,000
Amount Requested: $50,000
Percentage Received: 100.00%
Solution

We will add support in cddl-codegen for generating Rust code that handles multiple CBOR encoding possibilities, to ensure the reliability of the Rust-based Cardano tool stack.

Problem

Cardano uses an encoding scheme called CBOR that supports multiple ways of encoding the same data. No Rust library supports all possible encodings, which has caused multiple critical issues.

dcSpark

3 members

Please describe your proposed solution.

Cardano uses a format called CBOR which supports multiple ways to encode the same thing. For example, the number 0xff can be encoded as a single byte, or as 0x00ff, which is the same value but takes 2 bytes instead. Cardano (since the Shelley era) does not enforce canonical CBOR (which would force you, in the example above, to use 0xff). This means that programs which map Cardano CBOR data types to other data types, such as Rust types, need to either keep track of which encoding format was used or keep track of the raw bytes.
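To make the 0xff example concrete, here is a minimal sketch of a CBOR unsigned-integer encoder and decoder that remembers which argument width was used. The names `Sz`, `encode_uint`, `decode_uint` and `roundtrip` are hypothetical and chosen for illustration; this is not the code that cddl-codegen generates. On the wire each form also carries a one-byte CBOR header, so 0xff becomes `18 ff` in the shortest form and `19 00 ff` in the wider one.

```rust
/// Width of the argument that follows the CBOR initial byte.
/// (Hypothetical type for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Sz {
    Inline, // value 0..=23 stored inside the initial byte
    One,    // 1-byte argument
    Two,    // 2-byte argument
    Four,   // 4-byte argument
    Eight,  // 8-byte argument
}

/// Encode `value` as a CBOR unsigned integer using exactly the requested width.
/// Returns `None` if the value does not fit in that width.
fn encode_uint(value: u64, sz: Sz) -> Option<Vec<u8>> {
    let (additional, width) = match sz {
        Sz::Inline if value < 24 => return Some(vec![value as u8]),
        Sz::One if value <= 0xff => (24u8, 1usize),
        Sz::Two if value <= 0xffff => (25, 2),
        Sz::Four if value <= 0xffff_ffff => (26, 4),
        Sz::Eight => (27, 8),
        _ => return None,
    };
    // Major type 0 (unsigned integer), so the top 3 bits of the header stay zero.
    let mut out = vec![additional];
    out.extend_from_slice(&value.to_be_bytes()[8 - width..]);
    Some(out)
}

/// Decode a CBOR unsigned integer, returning the value and the width it used.
fn decode_uint(bytes: &[u8]) -> Option<(u64, Sz)> {
    let (&first, rest) = bytes.split_first()?;
    if first >> 5 != 0 {
        return None; // not major type 0
    }
    let (sz, width) = match first & 0x1f {
        n if n < 24 => return Some((u64::from(n), Sz::Inline)),
        24 => (Sz::One, 1),
        25 => (Sz::Two, 2),
        26 => (Sz::Four, 4),
        27 => (Sz::Eight, 8),
        _ => return None,
    };
    let arg = rest.get(..width)?; // fails cleanly on truncated input
    let value = arg.iter().fold(0u64, |acc, &b| (acc << 8) | u64::from(b));
    Some((value, sz))
}

/// Decode and re-encode, reusing the recorded width so the bytes are preserved.
fn roundtrip(bytes: &[u8]) -> Option<Vec<u8>> {
    let (value, sz) = decode_uint(bytes)?;
    encode_uint(value, sz)
}
```

Both `[0x18, 0xff]` and `[0x19, 0x00, 0xff]` decode to the value 255, but only a decoder that records the width can reproduce the original bytes, which matters whenever a hash or signature was computed over them.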

Handling all the possible ways to encode all the possible types is, as you can imagine, a huge amount of manual work and extremely error-prone. In fact, this issue has caused multiple critical issues for projects that use the Cardano Rust infrastructure such as Carp, Pallas, Oura, Scrolls, etc. where they either stopped working entirely or returned the wrong result when certain unhandled variations appeared on the Cardano blockchain. Here is an example of such an issue: https://github.com/txpipe/oura/issues/307

To solve this, we have been working for the past few months on a change to the cddl-codegen library to automatically generate the code that handles all variations. Since generation is automatic, it is less prone to human error and, if an error is found, fixing it once fixes it permanently for all future uses.

Additionally, Cardano does not just use CBOR for overall block encoding. Metadata standards such as CIP25 also use CBOR and Plutus datum & redeemer encodings are CBOR as well. This change will allow easier creation of Rust libraries to parse various metadata or Plutus-based CIPs on Cardano which will unlock a lot of use-cases

One benefit of our approach versus keeping track of the raw bytes is that our implementation will have a smaller memory overhead, which is particularly important for WASM-based libraries such as the Cardano Rust SDK used by almost all Cardano dApps. Additionally, our solution makes it easier to enforce specific deserialization constraints via arguments to the code generation, which would be extremely hard or impossible to express in solutions that generically just keep track of raw bytes. The fact that our solution is more robust also makes it well suited as a meaningful contribution to the CBOR ecosystem beyond its usage in Cardano-related projects.
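The trade-off can be illustrated with a rough sketch. The types `WithEncoding` and `WithRawBytes` and the helpers below are hypothetical, written only to show the idea; they are not the structures cddl-codegen emits. A recorded width fits in a one-byte tag stored inline, while a raw-bytes copy needs a `Vec<u8>` handle plus a heap allocation per value. Having the width as data also makes a constraint such as "accept only the shortest form" a simple comparison.

```rust
use std::mem::size_of;

/// Width of the argument used on the wire (hypothetical, for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Sz {
    Inline,
    One,
    Two,
    Four,
    Eight,
}

/// Approach A (assumed shape): the value plus a compact record of its encoding.
struct WithEncoding {
    value: u64,
    encoding: Sz,
}

/// Approach B (assumed shape): the value plus a copy of the original bytes.
struct WithRawBytes {
    value: u64,
    raw: Vec<u8>, // pointer, length and capacity inline, plus a heap allocation
}

/// Shortest width that CBOR allows for `value`.
fn minimal_width(value: u64) -> Sz {
    match value {
        0..=23 => Sz::Inline,
        24..=0xff => Sz::One,
        0x100..=0xffff => Sz::Two,
        0x1_0000..=0xffff_ffff => Sz::Four,
        _ => Sz::Eight,
    }
}

/// Example constraint: was the value encoded in its shortest form?
fn is_canonical(item: &WithEncoding) -> bool {
    item.encoding == minimal_width(item.value)
}

/// Inline sizes of the two representations, excluding any heap storage.
fn inline_sizes() -> (usize, usize) {
    (size_of::<WithEncoding>(), size_of::<WithRawBytes>())
}
```

The exact sizes depend on the target, and real generated types would track more than integer widths, so treat this as an indication of the direction of the saving rather than a measurement.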

Please describe how your proposed solution will address the Challenge that you have submitted it in.

Currently almost all dApps and projects on Cardano depend on the Rust SDK in one way or another (it is very rare for a project to rely entirely on the Haskell codebase). This upgrade will help make sure all these applications continue to work properly and will also provide a stable foundation for other Rust tools in the ecosystem, so that they don't have to re-implement CBOR encoding from scratch every time.

What are the main risks that could prevent you from delivering the project successfully and please explain how you will mitigate each risk?

The main risk is that we won't be able to codegen 100% of possible CDDL formats. The cddl-codegen repo already lists some limitations, and we will document further limitations regarding variations of CBOR encoding that we won't support in the v1 release. We have already been working on this feature for some time, so we're confident we can codegen many of the common cases of CBOR in general, and in particular the data types required for Cardano.

Please provide a detailed plan, including timeline and key milestones for delivering your proposal.

We plan to finish the v1 of the codegen library in early Q3 and then spend time in Q3 using the new codegen to update existing tools such as CSL/CML/Carp and fixing any other issues we find along the way. This tool will also critically be required for other existing & future Catalyst proposals such as the Cardano pricefeed and the CIP25 Rust library proposal.

Please provide a detailed budget breakdown.

All funds will be used for 1 Rust dev who has been working on this feature for the past few months along with part-time commitments from multiple other Rust developers to upgrade specific libraries and review code & design

Please provide details of the people who will work on the project.

Our team consists of the core authors of critical Cardano Rust infrastructure such as Carp, cddl-codegen, Cardano-Serialization-Lib (CSL) and CML. Additionally, we use the tools built by txpipe (authors of Pallas, Oura and other Rust tools in the ecosystem) and talk with them regularly. Notably, GitHub users SebastienGllmt and rooooooooob, whom you may recognize as creators of or core contributors to many Rust tools in the ecosystem, will be working on this project.

If you are funded, will you return to Catalyst in a later round for further funding? Please explain why / why not.

Most likely. The Rust ecosystem for Cardano is used by many projects, so we are constantly working to improve it, be it through better codegen tooling, better performance, more features, and notably by adding support every time there is a Cardano hardfork.

Please describe what you will measure to track your project's progress, and how will you measure these?

Implementation of the v1 of the new codegen logic, followed by its successful integration and release in the Cardano Rust SDK.

What does success for this project look like?

Usage of the Rust code generated by this tool in multiple Rust projects such as Carp, CIP25-rs, CSL, CML along with downstream usage in many dApps across Cardano

Please provide information on whether this proposal is a continuation of a previously funded project in Catalyst or an entirely new one.

Yes. This codegen tool & related Rust SDKs have been maintained and upgraded as part of past Catalyst proposals
