Please describe your proposed solution.
Problem
Cardano smart contract development is far more painful than it needs to be. Our team has the utmost respect for the existing Cardano contract languages (PlutusTx/Plutarch/Aiken), which are impressive technical achievements. Nevertheless, our practical experience using these languages has shown that technical sophistication alone does not amount to an adequate solution to problems faced by developers working in the trenches of real-world development.
Some of the problems we have encountered with existing solutions are:
Why not PlutusTx?
PlutusTx employs the TemplateHaskell language extension to enable users to write vanilla Haskell code that compiles down to UPLC. On paper, this appears to be an excellent solution. The Haskell development ecosystem contains excellent tooling (mature build systems, a reliable language server that integrates with every mainstream editor/IDE, etc). Moreover, GHC is capable of performing remarkable optimizations when compiling Haskell programs, which presumably could be leveraged to generate efficient and compact UPLC.
Unfortunately, PlutusTx falls woefully short in practice. Most importantly, the UPLC emitted by PlutusTx is, in almost every case we are familiar with, significantly inefficient in terms of both script size and execution speed. While PlutusTx may be suitable for extremely simple scripts, we do not consider it to be a viable option for complex projects. Choosing PlutusTx for a complex project is just too risky - it may turn out that the design for a complex smart contract cannot possibly be implemented in PlutusTx without exceeding script size limits. Even if the script size can be minimized, execution performance is often so poor with PlutusTx that the resulting fees render a contract's protocol economically unviable. These are not merely theoretical objections; several members of our team have been involved in PlutusTx projects which ultimately required complete rewrites in Plutarch - at great expense to clients. We note that some of these projects ultimately failed due to the increased expense of a ground-up rewrite.
It may be possible to squeeze performance gains out of PlutusTx by performing certain optimizations in the Haskell code that serves as input to its compile functions. However, this is extremely unintuitive - a developer attempting to optimize PlutusTx is put in the incredibly awkward position where they must reason about how changes to Haskell code affect UPLC output in spite of the fact that Haskell (a lazy language) and UPLC (a strict language) have different semantics. Even if one could account for the different semantics, the machinery that performs the Haskell -> UPLC translation is written in incredibly obscure TemplateHaskell. It is unreasonable to expect that even the most talented developers would be able to perform these tasks.
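The semantic gap described above can be illustrated with a minimal sketch. Python is used here only because, like UPLC, it evaluates function arguments eagerly; the helper names (costly, if_strict, if_delayed) are hypothetical and do not belong to PlutusTx or any Cardano library. In a lazy language such as Haskell, a user-defined conditional evaluates only the branch it selects, whereas under strict evaluation both branches are computed unless they are explicitly delayed:

```python
# Records which branch expressions were actually evaluated.
trace = []

def costly(label, value):
    """Stand-in for an expensive computation; logs that it ran."""
    trace.append(label)
    return value

def if_strict(cond, then_value, else_value):
    # Both arguments have already been evaluated by the time we get here.
    return then_value if cond else else_value

def if_delayed(cond, then_thunk, else_thunk):
    # Branches are passed as zero-argument functions and forced on demand.
    return then_thunk() if cond else else_thunk()

strict_result = if_strict(True, costly("then", 1), costly("else", 2))
strict_trace = list(trace)   # both branches ran

trace.clear()
delayed_result = if_delayed(
    True, lambda: costly("then", 1), lambda: costly("else", 2)
)
delayed_trace = list(trace)  # only the selected branch ran
```

Under strict evaluation the unused branch is still computed, which is the kind of effect a developer has to anticipate when tuning Haskell source for UPLC output.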
On a more mundane level, PlutusTx simply does not integrate well with the Haskell ecosystem's tooling. We have lost dozens - if not hundreds - of hours tracking down incredibly obscure errors emitted by the Haskell language server when using PlutusTx. Moreover, PlutusTx requires the use of a special standard library, which makes integration with existing Haskell libraries impossible and (again) gives rise to extremely confusing errors if one accidentally mixes functions from the Plutus standard library with functions from Haskell's prelude. These errors are not intractable, but solving them requires a large amount of folk-wisdom that makes it very difficult to onboard new developers and greatly increases development costs.
Why not Plutarch?
Plutarch eschews compilation-via-metaprogramming in favor of a sophisticated embedding strategy that exposes UPLC as a DSL embedded in Haskell. In many respects, Plutarch is a substantial improvement over PlutusTx: Plutarch, when used by an experienced developer, results in scripts with vastly smaller sizes and greatly improved performance compared to an equivalent PlutusTx implementation.
Although these improvements over PlutusTx have enabled more sophisticated smart contracts, the embedding strategy that powers these performance increases entails a severe cost in terms of ergonomics. At a glance, writing Plutarch requires:
- Heeding the distinction between Plutarch-level and Haskell-level functions, which can be very confusing to developers who are not familiar with complex embedded DSLs
- The capability to write performant "raw" lambda calculus, which is a skill that not many developers (even experienced functional developers) possess. (To use a simple example, it is frequently necessary to use fixed-point combinators directly in Plutarch)
- A slew of conversions between different data encodings and representations, such that it is frequently unclear which representation should be used in a particular context
- Familiarity with dozens of complicated types and type classes, including (but not limited to) PLift, PLifted, PConstant, PConstanted, PIsData, PIsDataRepr, PIsDataField, PListLike, PTryFrom, PlutusType, PCon, PMatch, PAsData, PData, PDataSum, PDataRecord. Many of these types and type classes are implemented using cutting-edge Haskell type system features and type-level programming techniques, rendering the Plutarch source code inaccessible even to experienced Haskellers
- The ability to reason about complex custom deriving strategies
- Coping with a large amount of syntactical noise necessitated by the embedding strategy
As type system enthusiasts, we find Plutarch to be a fascinating example of what is possible when the Haskell type system is pushed to its limits. As smart contract developers, however, our familiarity with Plutarch has made clear that it is not a viable general solution to the problem of Cardano smart contract development. The degree of niche expertise required to make full use of Plutarch is just too high.
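To give a sense of what "using fixed-point combinators directly" means, here is a minimal sketch in Python, chosen only because it is strict and untyped like UPLC. It is not Plutarch code, and the names fix and factorial are hypothetical. In a strict setting the classic Y combinator loops forever, so the eta-expanded variant (often called the Z combinator) is needed:

```python
def fix(f):
    """Z combinator: a fixed-point combinator that works under strict evaluation.

    The inner `lambda v: x(x)(v)` delays the self-application x(x) until the
    recursive call is actually made, so evaluation terminates.
    """
    return (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Factorial written without named recursion: `self` is supplied by `fix`.
factorial = fix(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))
```

Calling factorial(5) evaluates the recursion entirely through the combinator. A language with ordinary named recursion hides this machinery from the developer.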
Why not Aiken?
Aiken, a bespoke language for developing Cardano smart-contracts with a Rust-like syntax, is an excellent solution for developers who do not specialize in functional programming. However, Aiken's main strength - a much simpler type system than Haskell's combined with conventional imperative syntax - is, at the same time, a limitation: while Aiken's simplicity enables imperative developers to build on Cardano without struggling to adopt a new paradigm, it also prevents functional programmers from leveraging a strong type system to create abstractions and express sophisticated invariants at the type level. Aiken does not support type classes, data-generic programming, effects systems (monads, monad transformers, etc), or optics (functional-style data manipulation). Admittedly, these language features can be difficult to master, but when mastered they provide developers with the power to significantly simplify codebases and write intrinsically secure code. In the context of smart-contract development, the absence of these features is likely to lead to verbose code (which increases auditing costs and increases the potential for bugs) that is less secure (because certain important invariants cannot be expressed without these features).
Nevertheless, we acknowledge that Aiken is the right choice for many projects, especially those with relatively simple on-chain logic. In our experience, many projects have relatively simple on-chain logic, and would do better to choose Aiken than PureUPLC. As functional programmers, however, we believe that the advanced features found in strongly-typed pure functional languages are sometimes the best tool for the job. Aiken, in our view, is one part of the solution to the problem of Cardano smart-contract development - but it is only one part of the solution.
Solution
We propose to build a UPLC backend for PureScript - a production-ready language that has a mature ecosystem and integrates well with existing Cardano development tools (as has been proven by cardano-transaction-library, also funded by Catalyst in the past). PureScript, in our view, is uniquely well-suited for this task: it is a strict functional language with a strong (but not overcomplicated) type system, was designed from the outset to support multiple backends, and yields an abstract syntax tree which is particularly suitable for conversion to UPLC.
We believe that a UPLC compiler backend for PureScript is the best way to address the problems with existing solutions described above because:
- PureScript is a mature language with an extensive ecosystem and excellent tooling. A PureScript UPLC backend can make use of existing language servers, linters & formatters, and a rich standard library. Importantly, utilizing an existing mature language greatly reduces the maintenance burden - a large number of existing tools can be immediately employed by developers with little-to-no modification required on our part.
- PureScript, like UPLC, employs an eager (i.e. strict) evaluation strategy, which greatly reduces the mental overhead required to reason about performance and simplifies the compilation process.
- The PureScript compiler's abstract syntax tree is significantly less noisy than Haskell's AST, which, again, greatly simplifies the compilation process (particularly when compared to PlutusTx).
- PureScript (unlike Aiken) has a rich type system that can be leveraged to express sophisticated invariants and enables the application of formal reasoning techniques…
- … yet PureScript's type system is much simpler than Haskell's. PureScript, unlike Haskell, does not have a bevy of arcane language extensions, making it a much more suitable introductory functional language that imperative programmers can be onboarded to quickly.
- PureScript offers the potential for optimizations which would be difficult or impossible to perform on UPLC itself. As a general rule, the number of optimizations a compiler can perform is proportional to the amount of information the compiler has access to. By operating on the PureScript CoreFn IR, we anticipate that we will be able to implement a number of sophisticated optimizations that are not possible with limited program information.
- There is a backend optimizer for PureScript's CoreFn IR which we could adapt to our purposes, substantially reducing the amount of research required: <https://github.com/aristanetworks/purescript-backend-optimizer>
- PureScript's built-in support for row types intrinsically reduces the amount of syntactical noise (especially when contrasted with Plutarch) required to implement and operate on different data encodings and representations.
Our solution is conceptually simple: we will implement a PureScript backend that transforms the PureScript compiler's IR (intermediate representation) into UPLC, and spend the remainder of our budget implementing as many optimizations as possible. (See the sections below for a more detailed discussion of particular challenges and our strategy for overcoming them.)
Risks involved
- Row polymorphism: consider forall r. Record (foo :: Bar | r) -> Bar
  - We will not be able to use row type parameters easily, because we need to know the position of a field in a record in order to translate field access into access by index
  - Possible solution: abandon per-file compilation. Compile everything together until all row parameters are fully instantiated, and then translate record field accessors to index accessors
  - Another solution: make functions like this (implicitly) parameterizable with a numeric index. We could then apply this parameter where the position is known, and leave it unapplied where it is unknown. An optimizer pass could then reduce the lambda abstractions
- Underestimation of the amount of work
- Inability to optimize enough to be competitive with Plutus or Aiken
- Failure to figure out monad desugaring
- ADA price volatility that could put funding at risk
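The index-passing idea listed among the row-polymorphism risks above can be sketched as follows. This is only an illustration under stated assumptions: Python stands in for the strict, untyped target; the record layout (a tuple of values ordered by sorted field label) and the names make_record, get_foo and index_of are hypothetical and do not describe a settled design.

```python
def make_record(**fields):
    """Hypothetical runtime layout: field values stored in sorted-label order.

    Returns (labels, values); only `values` would exist at runtime, while
    `labels` models what the compiler knows statically.
    """
    labels = tuple(sorted(fields))
    return labels, tuple(fields[label] for label in labels)

def index_of(labels, label):
    """What the compiler computes once a record type is fully known."""
    return labels.index(label)

# Source-level function: forall r. Record (foo :: Bar | r) -> Bar
# Compiled form: the position of `foo` becomes an extra leading parameter.
def get_foo(foo_index):
    return lambda values: values[foo_index]

# Two records with different rows; `foo` sits at a different index in each.
labels_a, record_a = make_record(foo=1, zed=2)
labels_b, record_b = make_record(bar=0, foo=7)

foo_a = get_foo(index_of(labels_a, "foo"))(record_a)
foo_b = get_foo(index_of(labels_b, "foo"))(record_b)
```

Where the row is known at the call site, the index argument is a constant and an optimizer pass could inline it. Where it is not, the parameter is simply passed along.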
Market
The proposed solution can be useful for Cardano dApp developers who already have experience with Plutus or just functional programming in general.
How does your proposed solution address the challenge and what benefits will this bring to the Cardano ecosystem?
Intended Challenge: OSDE: Open Source Dev Ecosystem
**Challenge statement:** “Can we build a community-owned Open-Source Ecosystem that’s commercially viable to drive growth, increase opportunities, and increase project visibility?”
What does this proposal entail?
At a bare minimum, this proposal entails designing and implementing a UPLC backend for the PureScript compiler that developers can immediately use to quickly build efficient and secure smart-contracts. In order to achieve this goal, we may also have to design and implement PureScript utility libraries for contract development.
Concretely, the problems that we must solve to achieve our stated goals are:
- Translation of PureScript's CoreFn IR into UPLC: Our initial research into this task indicates that it is eminently achievable (although it is certainly not trivial!), and we do not anticipate substantial difficulties.
- Solving the data representation/encoding problem: One significant difference between PureScript and UPLC is that PureScript allows functions to be stored in native data structures, whereas UPLC does not. Attempting to prohibit functions in PureScript data types would cripple the language. Consequently, our tentative plan is to follow Plutarch and develop machinery for translating PureScript data types into Scott-encoded equivalents. Of course, we still must provide users of our language with the capability to work with data-encoded (PlutusData) types, and will need to design and implement a solution that allows them to do so. We suspect that it may be possible to implement a solution to this problem as a PureScript library that uses row-parameterized types to represent data-encoded objects, but this requires further research and experimentation.
- Optimization: While the PureScript compiler is, in many respects, exceptionally suitable for a UPLC backend, there is one respect in which it is not: The PureScript compiler currently performs only minimal optimizations on the CoreFn IR (though it performs a larger number of optimizations in the JavaScript backend). In particular, the PureScript compiler performs very minimal CSE (Common Subexpression Elimination), which leads to a large amount of duplicated computation of pure terms. This is particularly pernicious when targeting UPLC, as duplicate computation can cause an explosion in execution costs and, consequently, transaction fees. Although other optimizations may be useful, it is absolutely essential that we develop a CSE framework that is tuned for UPLC. Fortunately, CSE is a very well-studied optimization - although this task will certainly require research, we believe it can be achieved.
- FFI: Developers using our backend ought to (ideally) be able to make use of existing PureScript libraries. Unfortunately, nearly every useful PureScript library employs PureScript's JavaScript FFI (or depends on another library that does). In many cases, it should be possible to make existing libraries compatible with our backend simply by providing foreign UPLC imports where foreign JS imports now exist. Though not essential for the success of our project, we would like to explore the possibility of designing and implementing a mechanism that would allow "swappable" foreign imports to reduce the amount of work necessary to integrate with the existing PureScript library ecosystem.
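For readers unfamiliar with the Scott encoding mentioned in the data-representation item above, the following is a minimal sketch of the idea. Python is used only as a strict, untyped stand-in for UPLC; the names nothing, just, from_maybe and apply_or_zero are hypothetical, and a real backend would emit UPLC terms rather than Python functions.

```python
# Scott encoding of a Maybe-like type: a value is a function that accepts one
# handler per constructor and uses the handler for its own constructor.
def nothing(on_nothing):
    return lambda on_just: on_nothing

def just(x):
    return lambda on_nothing: lambda on_just: on_just(x)

def from_maybe(default, m):
    """Pattern matching is simply application of the encoded value."""
    return m(default)(lambda x: x)

# Because the encoding is built from functions, a function can be a payload,
# which a purely data-encoded representation cannot express.
wrapped_fn = just(lambda n: n + 1)

def apply_or_zero(m, arg):
    return m(0)(lambda f: f(arg))
```

Here from_maybe(0, just(5)) selects the payload and from_maybe(0, nothing) falls back to the default. The same shape generalizes to any sum type by adding one handler per constructor.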
How do you intend to measure the success of your project?
The criterion for success of this project is delivery of a UPLC backend for the PureScript compiler that integrates with existing PureScript tooling, and that developers can immediately use to quickly and ergonomically develop Cardano smart-contracts.
Success can be measured by the number of projects using the tool.
Please describe your plans to share the outputs and results of your project?
MLabs maintains a social presence on Twitter and in the Plutonomicon Discord, where updates can be posted. Additionally, the release can be announced on the IOG technical Discord. All source code will be available on GitHub.