PRISM analytics platform

Fund 9, challenge: Dapps, Products & Integrations

Current Project Status: Complete

Amount Received: $75,300
Amount Requested: $75,300
Percentage Received: 100.00%

Solution

We will offer a PRISM analytics tool: not only usage statistics and reporting, but also deep insight for developers and companies through graphs that display the trust relationships between DIDs and VCs.

Problem

There is no insight into the usage statistics of Atala PRISM: How many DIDs and Verifiable Credentials (VCs) are currently deployed on testnet / mainnet? How are VCs used in Ethiopia, by DISH or others?

Impact

Please describe your proposed solution.

The problem

The Self-Sovereign Identity (SSI) movement is gaining momentum. In light of the recent events in the crypto space, trust is more important than ever. But outside the crypto community, too, we are in dire need of new solutions to problems of trust, digital identity and the pressure of centralization. Initially the idealistic idea of a few, Decentralized Identifiers (DIDs) and Verifiable Credentials (VCs) are getting ready for prime time. Even big organizations like Microsoft are rolling out their first SSI solutions for general use. With Atala PRISM, IOG presented its own technical implementation of that concept.

New projects, integrations, and businesses are starting to emerge (not only thanks to Catalyst) to leverage this promising new technology. Most prominently, IOG itself is collaborating with the Ministry of Education in Ethiopia to get students all over the country onboarded onto Atala PRISM. The US network provider DISH is in the process of rolling out its customer loyalty program based on PRISM, and World Mobile is using DIDs in its network infrastructure. Many other projects (in and outside of Catalyst) are on the verge of using PRISM in production, or so the statements read.

But what’s happening behind the curtain? How many DIDs are there on the blockchain? Are they really being used? What are the trust relationships between entities? In this proposal, we present open, web-based analytics software to answer these questions.

We’ve clustered the questions into three categories:

Questions from the community and the public:

How many DIDs have been created on the Cardano Blockchain? How many credentials have been issued?
What is the development over time? How is developer adoption progressing on the testnet? What is happening on the mainnet? How is the growth evolving? Are there any trends that can be identified?
Is what we hear from the larger players and projects just hot air, or can the claims be backed up by raw data on the chain?

Detail questions about the properties and history of individual DIDs or a set of DIDs:

This is primarily of interest to organizations or developers who want to examine a set of DIDs/VCs.

What are the DIDs with the highest number of issued credentials?
How does a particular subset of DIDs compare to another? (engagement, issuing, key-rotation, …)
Which Cardano payment addresses paid for which DIDs/transactions? (mapping of market participants and verification of billing data)
How does the behavior of DIDs and linked VCs look on a timeline?
What is the history of Operation-Hashes for specific DIDs? (Important missing feature of the SDK)
What does the SSI-crosschain activity look like? (e.g., VC issued on Cardano to holder-DIDs on KILT)

Questions about the relationship between DIDs

The most interesting questions, however, arise from the representation of DIDs and Verifiable Credentials as a graph. Graphs are used to represent the connections of different entities to each other, e.g., as a social graph: Alice knows Bob. Bob is an employee of Charlie, etc.

In the domain of SSI, the DIDs can be mapped as nodes in a graph. The relationships between the DIDs are then represented by the Verifiable Credentials (the edges in the graph). But DIDs do not always have to represent people (they can also represent a technical device, e.g., a network node, a mobile phone, a location, a vehicle, etc.), and the relationship could also be very abstract and technical, not just personal.
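The mapping described above can be sketched as a small directed graph. This is only an illustration of the idea, not the platform's data model: the `Credential` and `DidGraph` names, the `kind` label and the `did:example:` identifiers are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Credential:
    """One edge of the graph: a VC issued by one DID to another."""
    issuer: str  # DID of the issuer (source node)
    holder: str  # DID of the holder (target node)
    kind: str    # hypothetical free-form label for the relationship


class DidGraph:
    """DIDs are nodes, Verifiable Credentials are directed edges."""

    def __init__(self):
        self.nodes = set()
        self.edges = defaultdict(list)  # issuer DID -> credentials it issued

    def add_credential(self, vc):
        self.nodes.update((vc.issuer, vc.holder))
        self.edges[vc.issuer].append(vc)

    def issued_by(self, did):
        """Return all known credentials issued by the given DID."""
        return list(self.edges.get(did, []))


graph = DidGraph()
# A DID does not have to be a person: here one holder is a network node.
graph.add_credential(Credential("did:example:university", "did:example:alice", "diploma"))
graph.add_credential(Credential("did:example:operator", "did:example:tower-17", "node-certification"))
graph.add_credential(Credential("did:example:university", "did:example:bob", "diploma"))

print(len(graph.nodes))                                  # 5
print(len(graph.issued_by("did:example:university")))   # 2
```

Storing edges by issuer keeps questions such as "which credentials did this DID issue?" cheap to answer; a real graph database would index both directions.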

On the blockchain, the DIDs are stored in their entirety (i.e. all DID metadata is stored on the blockchain apart from the private keys), while the VCs are stored on the blockchain only in hashed form. Therefore, the contents of a credential are not stored on the blockchain itself, only the hash and the information that a certain DID issued some kind of credential at a specific point in time. Only with the full, unmodified credential can we prove that it was indeed that exact credential that was issued by that DID. This is a deliberate design decision to guarantee privacy. So even after reading all PRISM transactions from the blockchain, there would still be no graph at this point: the nodes are there, but not the edges that define the relationship to another DID (node).

However, VCs were not designed to only be private. They are supposed to be verifiable by a third party (as the name implies). To accomplish this, the original credential must be passed to that third party. This is the only way the third party can verify that the VC's hash is on the blockchain. In some cases, this involves sharing the VC with only a specific person who can verify the VC. In other cases, there is no reason not to make the VC completely public (e.g. on a website). Good examples of this are endorsements ("Company X has a great product"), reviews ("The service was good, happy to come back") or many types of achievements ("Person X attended the course with best grades"). These types of credentials are only effective when they are made public and available for all to see and verify for themselves.
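The verification step can be illustrated with a minimal sketch. It assumes, purely for illustration, that the on-chain value is a plain SHA-256 digest of the raw credential bytes; the encoding and hashing PRISM actually uses are not described in this proposal, and the function names and sample credential are hypothetical.

```python
import hashlib


def credential_hash(credential_bytes: bytes) -> str:
    """Hash the full credential. SHA-256 over raw bytes is an assumption."""
    return hashlib.sha256(credential_bytes).hexdigest()


def matches_on_chain_hash(credential_bytes: bytes, on_chain_hash: str) -> bool:
    """A third party can only check this if it holds the unmodified credential."""
    return credential_hash(credential_bytes) == on_chain_hash.lower()


# Hypothetical credential content; only its hash would be anchored on chain.
original = b'{"issuer":"did:example:school","claim":"attended the course"}'
anchored = credential_hash(original)

# Any modification, however small, produces a different hash.
tampered = original.replace(b"attended", b"taught")

print(matches_on_chain_hash(original, anchored))  # True
print(matches_on_chain_hash(tampered, anchored))  # False
```

The hash alone reveals nothing about the credential's content, which is why the verifier needs the original document.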

In addition, it is not possible to regain control of VCs once they have been shared for verification purposes. Nor can anyone enforce that VCs, once shared, are deleted or not passed on to third parties (the use of ZK-proofs is left aside here for the time being, as these will not be available for PRISM in the foreseeable future).

The fact that these VCs are often public means that, in theory, anyone can collect and inspect them. By importing them into a graph, we can now supply the edges of the graph and see the relationships between the DIDs visually. But why, you might ask? A developer might be working on a project and want to verify a chain of credentials with DIDs that rapidly change keys. Or the tool might be used by a company that has issued thousands of VCs which it has to monitor regularly. Imagine the VCs representing security guarantees which third parties rely on, with SLAs regarding how current they are. But more importantly, it might be used by individuals who want to trace a chain of trust. In the world of SSI we frequently talk about emerging trust networks which are built bottom-up: in these networks trust is not derived from a single central authority which, some time ago, issued a certificate to e.g. a dentist, but from hundreds of people who write reviews, share their (sometimes professional) opinions and delegate some of their trust by endorsement. After a while, a complex network emerges which offers more guarantees than just a single certificate on a wall, or untraceable fake reviews on Google. With an ever-growing ecosystem, we also need tools to better understand what we are building.

The final building blocks of this graph are trust registries. These represent trust anchor points, which may be the endpoints of a trust chain: I trust Alice because Alice got a VC from Bob, and Bob is listed in a trust registry. DIDs which are part of trust registries are an essential part of any graph representation.
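Tracing such a chain back to an anchor can be sketched as a breadth-first search over the known credentials. This is one possible approach under simplifying assumptions, not the platform's actual query logic: credentials are reduced to hypothetical `(issuer, holder)` pairs, the registry is a plain set of DIDs, and validity or revocation checks are ignored.

```python
from collections import deque


def find_trust_chain(subject, credentials, registry):
    """Walk from a DID back through its issuers until a registry DID is found.

    Returns the shortest chain [subject, ..., anchor], or None if no
    known credential links the subject to a trust registry entry.
    """
    issuers_of = {}
    for issuer, holder in credentials:
        issuers_of.setdefault(holder, []).append(issuer)

    queue = deque([[subject]])
    seen = {subject}
    while queue:
        path = queue.popleft()
        current = path[-1]
        if current in registry:
            return path
        for issuer in issuers_of.get(current, []):
            if issuer not in seen:  # guards against cycles of mutual endorsement
                seen.add(issuer)
                queue.append(path + [issuer])
    return None


# Hypothetical imported credentials as (issuer, holder) pairs.
credentials = [
    ("did:example:bob", "did:example:alice"),
    ("did:example:carol", "did:example:bob"),
    ("did:example:mallory", "did:example:eve"),
]
registry = {"did:example:bob"}

print(find_trust_chain("did:example:alice", credentials, registry))
# ['did:example:alice', 'did:example:bob']
print(find_trust_chain("did:example:eve", credentials, registry))
# None
```

Alice is trusted because her issuer Bob is a registry entry; Eve's only issuer is not anchored anywhere, so no chain is found.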

So, ultimately, it all comes down to these questions:

What are the trust relationships between DIDs? What types of credentials are used? Are they valid and trustworthy?

A step towards a solution

This proposal presents a web-based analysis tool that can be used to answer all of the above questions. The tool is divided into three sections according to the questions:

The statistical overview, with already prepared live generated reports, to get an overview of the actual usage of PRISM.
An analysis area to get to the bottom of certain specific questions that are relevant for individual users, companies, developers. E.g.: How many credentials were issued by this DID? How many DIDs associated with this DID show engagement in the last 30 days?
A graph view section to perform dedicated analysis and investigate trust chains in complex networks.
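The two example questions from the analysis area can be sketched over a list of already-decoded operations. The tuple layout, the operation names (`create_did`, `issue_credential`) and the sample data are hypothetical placeholders, and "engagement" is interpreted here simply as any operation within the time window.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Hypothetical decoded operations: (did, operation, timestamp).
OPERATIONS = [
    ("did:example:school", "create_did", datetime(2022, 5, 1, tzinfo=timezone.utc)),
    ("did:example:school", "issue_credential", datetime(2022, 8, 20, tzinfo=timezone.utc)),
    ("did:example:school", "issue_credential", datetime(2022, 8, 25, tzinfo=timezone.utc)),
    ("did:example:shop", "create_did", datetime(2022, 6, 10, tzinfo=timezone.utc)),
    ("did:example:shop", "issue_credential", datetime(2022, 6, 11, tzinfo=timezone.utc)),
]


def credentials_issued(operations):
    """How many credentials were issued by each DID?"""
    return Counter(did for did, op, _ in operations if op == "issue_credential")


def engaged_dids(operations, watched, now, days=30):
    """Which of the watched DIDs had any operation in the last `days` days?"""
    cutoff = now - timedelta(days=days)
    return {did for did, _, ts in operations if did in watched and ts >= cutoff}


now = datetime(2022, 9, 1, tzinfo=timezone.utc)
issued = credentials_issued(OPERATIONS)
active = engaged_dids(OPERATIONS, {"did:example:school", "did:example:shop"}, now)

print(issued["did:example:school"])  # 2
print(active)                        # {'did:example:school'}
```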

At present, there is no analysis tool that can sufficiently answer the above questions - not even internal tools of IOG. In many respects, we are all operating in the dark: everyone with a small candle in their hand which lights their DID and VCs, but little else.

It should be emphasized that this is not at all about acting as a data crawler that collects the imported VCs in order to build a complete graph of all DIDs and VCs (which is not possible anyway). Nor is it about developing a DID forensics tool to uncover statistical correlations by analyzing payment flows and temporal sequences. Rather, it is about building a toolkit that helps us to further develop the SSI ecosystem around Atala PRISM and to shed some more light into the darkness. We believe that professional tooling will be a major reason for the adoption of PRISM by large enterprises.

What will the platform look like?

First of all, it will be a public website that displays the most important statistical information on all PRISM-related operations from the testnet/mainnet in real time. It will also provide the ability to quickly search for DIDs to get an overview of publicly available information.

Additionally, there will be a secure private area (login with DID: already implemented by us as a proof of concept -> see demo of the blocktrust identity wallet). Here the user will be able to specify multiple sets of DIDs, import VCs, and manage and execute more complex queries. This private area is subdivided into:

An area for statistical information on the given sets of DIDs and a timeline of the DIDs/VCs being analyzed, as well as engagement metrics, usage, etc. For professional users, this data might even be made available for direct use in Power BI or Tableau.
The described graph view to examine individual trust chains.

Both the public and private sections are intended to be completely free with the features described here (subject to rate limiting and the implementation of API keys for excessive use, or advanced export functionality for enterprise users).

The application consists of the following parts:

Blocktrust Connector: a .NET application running on Linux as a service that extracts all PRISM-related information from the PostgreSQL DB (-> db-sync), decodes and verifies it, and sends it to a graph-based Cosmos DB instance on Azure.
Different Azure services, including the aforementioned Cosmos DB, as well as a scheduled application running Gremlin queries against the graph database to extract statistical information about the state of PRISM (usage, number of DIDs/VCs), and a Redis cache to store frequent historical queries.
A Web API-based application backend behind an API Management instance (for rate limiting) to serve the different kinds of personalized queries from the frontend. It allows VCs to be imported into the graph and maintains the separation of user accounts with their respective DIDs and imported VCs.
Blazor / Typescript-based frontend

A visual representation of the architecture, the setup of PRISM and a mock-up of the graph view can be found here: <https://blocktrust.dev/analytics>

Please describe how your proposed solution will address the Challenge that you have submitted it in.

The proposed solution is an essential building block for everyone involved in SSI. It helps the community to monitor and verify the growth and usage of Atala PRISM. It also offers developers and businesses a tool to better understand trust relationships.

Since this is a tool not only for developers but for everyone involved with SSI, we believe this challenge is the best fit for this proposal.

What are the main risks that could prevent you from delivering the project successfully and please explain how you will mitigate each risk?

Technical risks

From a technical perspective, the risks can be considered to be relatively low. The proposed technologies have all been used before, and there is a good understanding of what can and cannot be achieved using the existing data. For testing purposes, PRISM data has already been extracted from the testnet and made analyzable. The aim of this proposal is therefore to build the infrastructure to make the data available to the public. All necessary libraries for handling PRISM data have already been developed by us.

There is already experience with large datasets of relational, non-relational and graph data. The analysis and presentation of these should not pose any major problems.

Planning risks

Software development is difficult to plan. Most of the time, delays are bigger than expected. With our experience in large software projects, we are fully aware of this and plan with appropriate margins. But as the project progresses, goals and requirements naturally shift, and features that were initially considered easy or great later turn out to be difficult or impossible to implement. In their place, other, better ideas emerge and are developed. We wouldn't call this a risk, but something to be aware of if you're not familiar with agile software development.

Budget risks

Due to growing data volumes over time and complex queries, there is generally a risk of increasing costs of the cloud infrastructure. In this case, some kind of rate limiting may have to be introduced to make the service usable for all. However, possible budget overruns are covered by us, as outlined below, and therefore do not pose a risk to the completion or operation of the website.

Feasibility

Please provide a detailed plan, including timeline and key milestones for delivering your proposal.

This is a relative timeline. The project is scheduled to start in September.

Development:

Phase 1 (4 weeks)

The technical implementation can begin immediately. An explorative or research phase is not required since the preliminary work has already been carried out.

Setup of a Cardano-Node (starting on testnet), DB-Sync & PostgreSQL, Cosmos DB (as a graph)
Development of the blocktrust connector to extract data from the SQL DB and to transform them into usable PRISM objects to upload them into the Cosmos DB. Implementation of the connector as a long-running service.
Deployment of the connector and integration of a feasible monitoring solution.
Adding a scheduler for common queries and setup of a Redis instance for caching.

Phase 2 (4 weeks)

Defining the queries and reports that will later be displayed as diagrams in the frontend.
Implement queries, measure, optimize using existing data in testnet.
Develop templates for queries that can be used for the analysis of trust chains. If necessary, implement a reduced Gremlin syntax for DID-related graph queries.

Phase 3 (8 weeks)

Development of a web app (Blazor + native JS) to display the defined and implemented queries and reports from the CosmosDB.
Development of an API, and implementation of a private area behind a login, where users can save queries and upload credentials.
Websocket (SignalR) based connection of the web app for real-time updates of new data found on the blockchain.

Phase 4 (8 weeks)

Development of an import option for trust registries and VCs.
Optional support of Webhooks for certain events.
Beta testing, improvements and gradual rollout, onboarding the first users.
Collecting feedback
Refinement of helpful queries to better understand the graph.
Setup of an identical infrastructure for the Cardano mainnet.
Communication of the availability of the project through the established channels

Operation:

Phase 5 (12 months)

Operating the platform for the Cardano (and therefore PRISM) testnet and mainnet (bug fixes, updates, monitoring)

Most of the development work will be done by Björn Sandmann. On the frontend side, he will be supported by John Grabenmeier with the design and integration of diagrams. Further support may be expected from an additional team member (see next section) to allow the development of multiple projects in parallel.

Other projects and timelines

If any additional proposals from blocktrust are funded, there will inevitably be an overlap. In that case, priority will be given first to the proposal with the most votes (taking into account the technically most efficient order in which the projects could be completed).

To be as transparent as possible: here are the other proposals from us in this fund:

Björn Sandmann is also working on a previous proposal from Fund 7. It allowed the proposer to work full-time on PRISM and therefore laid the basis for much of what has been accomplished since then (e.g., .NET SDK for Atala PRISM, PoC of a browser-extension wallet). The proposal from F7 is still ongoing and will likely finish as planned (September) and therefore will not collide with the new proposal.

Please provide a detailed budget breakdown.

Development costs (Phase 1 – 4: 6 months)

Backend-Engineer (Björn Sandmann): 120 h/month over 6 months = 720 hours total. At a rate of 80 USD/h, this would amount to 57,600 USD, which is not feasible for this kind of community-funded project. Consequently, I (the main proposer) will invest my own time/money into the project and reduce my rate to 60 USD/h, resulting in 43,200 USD.

Frontend-Developer (John Grabenmeier): 2 months at 60 h/month * 60 USD/h (same arrangement) = 7,200 USD

Cloud / Server Infrastructure within the development phase (running only on testnet): 250 USD/month for VM, Cosmos DB, APIM, WebApp all for 6 months = 1,500 USD.

Sum development costs = 43,200 USD + 7,200 USD + 1,500 USD = 51,900 USD

Operational costs (Phase 5: 12 months)

Bug fixing, updating software/ node, writing reports, basic support: 20h/month with 60 USD/h = 1,200 USD * 12 months = 14,400 USD

Cloud / Server Infrastructure (running both on testnet & mainnet): 750 USD/month * 12 months = 9,000 USD

Sum operational phase = 14,400 USD + 9,000 USD = 23,400 USD

Overall: 51,900 USD (development for 6 months) + 23,400 USD (operation and support for 12 months) = 75,300 USD
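The totals above can be re-derived with a few lines of arithmetic. All figures are taken from the breakdown in this section; only the variable names are introduced here for illustration.

```python
RATE = 60  # reduced hourly rate in USD

# Development (phases 1 to 4, 6 months)
backend = 120 * 6 * RATE   # 120 h/month over 6 months
frontend = 60 * 2 * RATE   # 60 h/month over 2 months
dev_infra = 250 * 6        # testnet-only cloud infrastructure
development = backend + frontend + dev_infra

# Operation (phase 5, 12 months)
support = 20 * RATE * 12   # 20 h/month of maintenance and support
ops_infra = 750 * 12       # testnet and mainnet cloud infrastructure
operation = support + ops_infra

total = development + operation
print(development, operation, total)  # 51900 23400 75300
```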

A contingency for budget overruns is not necessary in our opinion, since we are already below our normal hourly rates and are willing to take financial cuts to be able to implement this project. Delays or increased costs will be covered by us personally.

Please provide details of the people who will work on the project.

blocktrust is a startup focusing on developing technologies around Atala PRISM. For more information about our other proposals and current technology demos (like the browser-extension wallet) visit: <https://blocktrust.dev>

Björn Sandmann

9+ years of full-stack development with .NET. Focused on identity and privacy solutions. PRISM Pioneer, Atala ASTRO, Plutus Pioneer, already funded proposals.

Björn Sandmann will primarily work on the technical infrastructure, the integration with PRISM and the development of the queries. He will implement the web app and all aspects of login, credential import and user management.

LinkedIn: <https://www.linkedin.com/in/codedata/>

Project history and technical skills: <https://www.gulp.de/gulp2/g/spezialisten/profil/bsandmann>

John Grabenmeier

20+ years of frontend development. Proven track record ranging from small online shops to high-profile enterprise systems.

John Grabenmeier will primarily be working on the display of the diagrams, charts, and reports. He will also be involved with the visualization of the graph.

LinkedIn: <https://www.linkedin.com/in/johngrabenmeier/>

Project history and technical skills: <https://www.johngrabenmeier.com/>

Should the workload increase due to the realization of more than one submitted proposal, there is currently the opportunity to onboard another developer at relatively short notice, who has been considering working full-time in the blockchain space for some time, but needs secure financing. With over 15 years of professional experience with backend code and working for large companies in complex system landscapes, this person would be the ideal addition to the team. However, this decision can only be finalized once the funding has been secured. The person cannot be named yet because of a current employment contract.

If you are funded, will you return to Catalyst in a later round for further funding? Please explain why / why not.

Further funding for the proposal will depend on its adoption. Should the interest that became apparent in discussions actually materialize and the platform be used, a second funding round is conceivable after the conclusion of the first to ensure the further development and permanent operation of the platform (if no other monetization model can be found). In principle, though, the premise is to keep the platform free for the normal user.

In any case, the planned budget is sufficient to build the platform to the extent outlined here and then operate it free of charge for a period of at least 12 months.

Auditability

Please describe what you will measure to track your project's progress, and how will you measure these?

During the development phase (phase 1 - 4), we will write a blog entry every two weeks at www.blocktrust.dev/blog, which will provide information about the progression of the work. This allows the community to follow the development progress during this busy time. In the blog entry we will report on the technical details of the work and at the same time state whether we are within the predicted time frame.

As we enter the operational phase (5), we will post monthly usage reports of the platform. In the usage reports we will post the number of page views, created accounts and executed queries and track them over the course of the project.

What does success for this project look like?

For us, success is not measured by mere user numbers, but by the progress and success of the entire PRISM ecosystem. An analytics platform like the one presented here not only provides transparency and thus trust in the entire ecosystem, but it also attracts companies for which such tools are necessary building blocks. At the same time, such a tool also opens up whole new application scenarios and sparks thought-provoking ideas about trust chains and their various use cases, many of which haven't even been thought of yet.

Specifically, we hope:

That we will be able to shed light and give the community clear metrics on how Atala PRISM is evolving.
That developers will use the platform to better understand their SSI applications and dApps.
That this analysis platform will give bigger companies the confidence that they have the tools to implement their idea based on Atala PRISM.

Please provide information on whether this proposal is a continuation of a previously funded project in Catalyst or an entirely new one.

This proposal is entirely new and has no relationship to previous ones.
