Please describe your proposed solution.
<u>Making learning from Catalyst projects usable</u>
Our community’s 500 completed Catalyst projects represent a large corpus of knowledge. However, much of this material remains locked up in close-out reports in the form of videos and PDFs, which are unsearchable and undiscoverable; so once a project has presented its close-out, the learning from it is often forgotten. Even where material is held on searchable platforms, it often contains no clear link between a specific piece of knowledge and the evidence that supports it, and no clear attribution to the person who discovered it. There is also no easy way to see connections (or even interesting contradictions) between what different projects have learnt; and no way for a new proposer to look at what has been discovered already and build on it. So we often end up either losing knowledge that is useful to the ecosystem, or reinventing it time and again.
The solution we propose is to build a platform where project teams can add individual, atomic learning points from their completed Catalyst projects, expressed in a CNL (controlled natural language), and supported by links to the evidence and attribution for each learning point. We’ll also offer training in exactly how to shape project material into these individual learning points; and we’ll populate the database with learning from completed Catalyst projects from Fund 3 to Fund 8, so that we can a) test the database and the process, and b) provide some material that can be searched, to test the database’s ability to surface the hitherto-hidden connections between projects.
<u>The theory and research behind our approach</u>
The idea is based on nanopublications: an approach that is popular in life-sciences research, but has yet to reach very far beyond that field. Essentially, a nanopublication is “the smallest possible unit of publishable information” - a small, discrete, machine-readable assertion, supported by provenance information (i.e. what the assertion is derived from, and the research evidence that supports it). It’s an excellent way to share research, but a nanopublication is expressed in RDF notation (a W3C standard originally designed as a data model for metadata). This presents a barrier to adoption, because many people find RDF difficult, and it might not even be appropriate for expressing some kinds of knowledge.
So in this proposal, we draw on research by Tobias Kuhn et al. in 2013, which looked at how to broaden the scope of the “nanopublication” concept by using a CNL (controlled natural language) rather than RDF triples to express a research conclusion, so that people can essentially write their learning points in normal English. Kuhn’s research developed a concept called the “AIDA statement” (an acronym for “Atomic, Independent, Declarative, Absolute”, and nothing to do with the “AIDA” acronym used in the field of marketing!) - a simple framework for what a nanopublication statement in natural language should look like. This is the approach we intend to use.
- Atomic: a sentence describing one thought that cannot be further broken down in a practical way
- Independent: a sentence that can stand on its own, without external references like “this effect” or “we”
- Declarative: a complete sentence ending with a full stop that could in theory be either true or false
- Absolute: a sentence describing the core of a claim, ignoring the (un)certainty about its truth and ignoring how it was discovered (no “probably” or “evaluation showed that”); typically in present tense
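To give a feel for how these criteria could be applied in practice, here is a minimal Python sketch of a checker that flags some obvious departures from the AIDA format. It is an illustration only, not part of Kuhn's work or of our planned build: the function name `check_aida` and the pattern lists are our own assumptions, built from the examples in the list above, and the sample sentences are invented. "Atomic" is left out because it needs human judgement.

```python
import re

# Hypothetical pattern lists, taken from the examples in the criteria above.
# A real checker would need far richer lists and human review.
EXTERNAL_REFERENCES = [r"\bwe\b", r"\bthis effect\b"]         # breaks "Independent"
HEDGES_AND_METHODS = [r"\bprobably\b", r"\bevaluation showed that\b"]  # breaks "Absolute"


def check_aida(statement: str) -> list[str]:
    """Return a list of likely problems; an empty list means none were spotted."""
    problems = []
    text = statement.strip()
    lowered = text.lower()

    # Declarative: a complete sentence ending with a full stop.
    if not text.endswith("."):
        problems.append("declarative: does not end with a full stop")

    # Independent: no references that only make sense in the original context.
    for pattern in EXTERNAL_REFERENCES:
        if re.search(pattern, lowered):
            problems.append(f"independent: matches {pattern}")

    # Absolute: no hedging and no description of how the claim was discovered.
    for pattern in HEDGES_AND_METHODS:
        if re.search(pattern, lowered):
            problems.append(f"absolute: matches {pattern}")

    return problems


# Invented sample sentences, for illustration only.
good = "Bounties increase participation in community documentation."
bad = "We probably saw more participation"

print(check_aida(good))
print(check_aida(bad))
```

A check like this can only catch surface problems; whether a statement is genuinely atomic, or accurately reflects its evidence, still has to be judged by a person.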
<u>How we’ll put this research into practice</u>
Kuhn found that scientists were fairly easily able to create AIDA statements from the abstracts of published research papers. Based on this, we feel confident that with a little training (which we will provide), Catalyst proposers will be able to do the same with material from their monthly or closing reports.
Note that while this process might most obviously be a fit for research-based projects, our initial explorations have shown that it is also very effective for developer projects, especially when (as they commonly do) they have documented their progress, and noted bug-fixes, results of user-testing, etc. It also works very well for education and community engagement projects, which usually involve some degree of reflective practice.
Once the material is expressed in this atomic, declarative way, it can then be connected to the provenance that supports it - this could be any link, from a heading in a document or a timestamp in a video, to a GitHub commit, a Tweet, a Miro board, a cell in a spreadsheet, or anywhere else a project recorded its discoveries.
Entering this material into our database via a dashboard-style frontend means it will be searchable by project, by keyword, by person, etc; so connections and similarities between research conclusions from different proposals will become visible. We will also be able to see attribution (i.e. which project or person came up with this learning?), which will help us become more aware of where insights are coming from. Also, note that additions to the platform would not necessarily have to be restricted to material about Catalyst project reporting. Potentially, the community could also add the knowledge that surfaces in a meeting, a collaborative document, or a Twitter space, by translating it into a series of AIDA statements and adding it.
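As a rough illustration of the kind of record and search described above, here is a minimal Python sketch. It is not the planned implementation: the class `LearningPoint`, its field names, the `search` function and all the sample data (projects, contributors and `example.org` links) are hypothetical and exist only to show the idea of one statement linked to its provenance and attribution.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LearningPoint:
    """One AIDA statement with its provenance and attribution (hypothetical schema)."""
    statement: str      # the AIDA statement itself
    project: str        # the project the learning came from
    contributor: str    # the person credited with the learning
    evidence_url: str   # link to the supporting evidence


def search(points, *, project: Optional[str] = None,
           contributor: Optional[str] = None,
           keyword: Optional[str] = None) -> list:
    """Return the points matching every filter that was supplied."""
    results = []
    for point in points:
        if project is not None and point.project != project:
            continue
        if contributor is not None and point.contributor != contributor:
            continue
        # Keyword search is a simple case-insensitive substring match.
        if keyword is not None and keyword.lower() not in point.statement.lower():
            continue
        results.append(point)
    return results


# Invented sample data, for illustration only.
POINTS = [
    LearningPoint("Bounties increase participation in community documentation.",
                  "Project A", "alice", "https://example.org/project-a/report#findings"),
    LearningPoint("Short training videos reduce onboarding time for new contributors.",
                  "Project B", "bob", "https://example.org/project-b/closeout?t=120"),
    LearningPoint("Bounties attract contributors who leave after a single task.",
                  "Project C", "alice", "https://example.org/project-c/sheet#B7"),
]

for point in search(POINTS, keyword="bounties"):
    print(point.project, "-", point.statement, "-", point.evidence_url)
```

In this invented data, a keyword search for "bounties" returns statements from two different projects that point in different directions, which is the kind of connection (or contradiction) the dashboard is meant to make visible.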
To make the dashboard usable and valuable from the start, this proposal includes a process for our team to populate it with data from finished Catalyst proposals from F3 to F8 and run some test searches, thereby testing that the database is working as intended, and refining the methodology before trying to teach it to others. This work will cover 200 proposals, creating 3 to 5 AIDA statements from each one (600 to 1,000 statements in total).
We will then open the dashboard to the community. We'll offer training and some ongoing support; and then proposers of finished projects will be incentivised via Dework bounties to add the learning from their F8 and F9 proposals. We’ll also offer small bounties for people to send us a record of any useful and interesting connections they have discovered from searching the database, which we’ll collate on the project GitBook as a way of demonstrating what kind of insights the dashboard is helping the community to uncover.
So our process will be:
- Build the platform; meanwhile, the data-population team prepares AIDA statements from completed Catalyst projects.
- Add prepared data to the dashboard and run test searches.
- Train the community in how to use it: create a short training video and a text-based learning resource, and run some training sessions.
- Offer bounties for community members to add their projects; and smaller bounties for people to record useful connections and insights that they have discovered from searching the database.
Essentially, this approach frames the things we do in Catalyst (potentially, everything we do, from proposals, to After TownHalls, to discussions on Telegram or Twitter) as the “experiments” we have always said they are. It will help clarify and evidence what the community has learnt from projects and conversations, and make that learning searchable and discoverable; it will also surface new insights and previously-unseen connections. Our approach is adaptable to both qualitative and quantitative insights, and it turns all our discussions and ideas into a collaborative research pool that we can all draw on.
How does your proposed solution address the challenge and what benefits will this bring to the Cardano ecosystem?
Our proposal addresses the core question of the challenge by offering both a tool, and community-led research and documentation to support it, to enhance the developer ecosystem. In the words of the challenge, we are bringing “standards, resources [and] documentation that bring … novel innovation to the ecosystem”. This proposal’s adaptation of the nanopublications standard will make it easier to develop on Cardano, by making it easier to research and build on existing knowledge.
Developers on Cardano, particularly newcomers, will be able to use the Nanopublications Dashboard to find out what has already been created in past projects, and iterate on it, very much in the way that traditional nanopublications help academics to discover and build on existing research. This helps Cardano developers to amplify existing discoveries, rather than reinventing the wheel. The Dashboard also enables developers to log insights from their own work, facilitating proper attribution, and helping them find and collaborate with others who are working on similar ideas. Insights added from completed projects might include pitfalls or problems, thus helping future developers avoid or address them. Overall, the proposal offers an approach that can help the developer ecosystem become more iterative and more collaborative.
The benefits to Cardano as a whole include helping ensure that we don’t lose or forget what we learn (whether from Catalyst funded proposals or anything else), and that we can continue to access and draw on it in the long term. It will help us see the connections between different projects’ discoveries; it will also help us see any points of disagreement between proposals on similar topics, which could provide fruitful avenues for further exploration. In short, it forms part of our community's memory.
The Dashboard also has the potential to help Cardano with auditability and assessment of impact. It will enable us to audit core learning from a proposal more easily, and track exactly how the team derived that learning. Also, since the process of framing one’s work in the way required by the Dashboard will tend to emphasise conclusions and insights, this encourages us to look at the effects of what we do, and will help us as a community to see the impact that is being made across Catalyst on particular topics.
In the long term, the team hopes to integrate AI tooling into this concept, using LLMs both to create AIDA statements and to compare them and discover similarities. In order to enable this kind of work (which could have far-reaching beneficial effects for Catalyst and Cardano) we need to build this initial proof-of-concept and engage the community with how to use it.
How do you intend to measure the success of your project?
- Number of GitHub commits during the build process
- Number of AIDA statements created during the data-population process
- Qualitative feedback from data-population team on ease/difficulty of creating AIDA statements
- Number of training sessions held with the community (we aim for 10)
- Number of pageviews of our training material on GitBook
- Qualitative feedback from training sessions on usefulness of the approach and how easy/difficult it is to use
- Number of people claiming bounties to add material to the database
- Amount and quality of material added
- Number of people claiming bounties to report insights from searches
Please describe your plans to share the outputs and results of your project?
The dashboard build process will be fully open-source, and trackable on GitHub.
Our initial “mini-whitepaper” on our proposed methodology, plus our documentation of the data population team’s working process, and the training materials we create, will all be publicly available on the project's GitBook. We will share them widely in the Catalyst community via Discord, Telegram, Twitter, and the Cardano forum.
Our 10 training sessions, and the process of publicising our Dework bounties to the community, will enable us to share the dashboard and its underlying ideas widely.