completed
AI lawyer on Cardano - Profila/HSLU
Current Project Status: Complete
Amount Received: $78,000
Amount Requested: $78,000
Percentage Received: 100.00%
Solution

Profila and Lucerne University of Applied Sciences and Arts (HSLU) are building an “AI lawyer” that gives automated expert advice by matching people’s legal questions against a smart knowledge base of more than 1 million laws, contracts and policies.

Problem

Your offline/online personal & business life is governed by laws & contracts you have to accept but don’t understand, which have real consequences. Today, you lack access to legal knowledge/lawyers.


Profila

3 members


Please describe your proposed solution.

As a first use case of the AI lawyer, we have been funded by the Swiss government to create a smart knowledge base of privacy laws, where we can match people's questions about privacy with privacy laws and privacy policies, in order to provide them an answer. This knowledge base is to be used inside our existing Profila app which helps you control your data and access to your personal information.

However, once the model works with one set of documents, it is easy to replicate. Under this proposal, we want to use our learnings from our existing privacy research project for the benefit of the Cardano community, who have to navigate an often difficult legal landscape while working on groundbreaking technology.

As a second use case of the AI lawyer, which we start building under this proposal, we will collect additional laws relevant to people and entrepreneurs on Cardano (e.g. taxation, financial laws, corporate laws, etc.) and feed them into our smart knowledge base. The community will have a single source of legal documentation relevant to their activities in the Cardano space, which they can query for brief, concise and usable answers. It will be a repository of useful information for the Cardano community about financial and legal documents.

------------------------------------------------------------------------------------------------------

<u>Introduction - the pioneering Innosuisse project</u>

Before we go into detail on the proposal "AI lawyer on Cardano - Profila & HSLU", we need to explain the preparation work that we are already doing with HSLU for the first use case. Please find below the first use case for privacy knowledge and privacy laws, explaining how people's personal data can (or cannot) be processed:

“Are They Allowed To Do That?” — Profila and HSLU Developing An “AI Lawyer” For Legal Questions so that people can finally understand the privacy and cookie policies they are forced to accept when browsing the internet

What is the problem that Profila and HSLU are solving?

People interact with digital services and purchase products from companies (Brands) globally. Each interaction you have online is regulated by a privacy policy that you are forced to accept. Thereby, you agree to share your personal data without knowing how the Brands use it.

But what can you do if Brands use your data against your expectations? Do you actually know what terms you agreed to when you accepted to share your personal data?

A real-life example

Meet Maria (46) and Daniel (48), with their kids, Mia (16) and Noah (21). Daniel subscribed online to his local newspaper (let's call the company “The Newspaper”) and accepted the privacy policy, which specifies that different The Newspaper-affiliated companies could send offers by email and SMS (“I agree that The Newspaper and companies of the Newspaper Group may inform me in the future about other interesting offers by e-mail and SMS”).

Maria registers an account for an online pick-up service from her local supermarket (let's call the company “The Supermarket”) and is asked to confirm the privacy policy (“I confirm that I have read and understood the privacy policy and the processing of my personal data described therein”). According to the 10-page privacy policy, The Supermarket can share Maria’s data with The Supermarket-linked companies.

The family now starts receiving messages more frequently than expected and via different channels (email, SMS, mail), but also from brands they have never shared data with. Maria receives 15 emails/week with newsletters from The Supermarket. The family letter box is full of flyers from “The Spa”, “The Nightshop” and “The Movies” (also part of The Supermarket Group).

The problem here is: Almost all of the info on how companies use personal data is (albeit not in plain language) included in their privacy policies the family “accepted”. However, people generally don’t understand what they agree to, and do not know how to act when they want to alter the terms of their relationship with brands.

Today - How Profila helps consumers understand how a brand uses their data

Profila is a platform (a mobile iOS/Android application for individuals and a web-based dashboard for companies, organizations, governments and other legal entities, which we call “Brands”) that enables individuals to communicate with the various organizations in their lives privately, one-to-one, and without supervision or surveillance.

Through it, Brands can subscribe to the personal data that consumers keep in their Profila (a single source of truth holding a consumer’s contact details and communication and product preferences), after the consumer accepts the (privacy and legal) terms of such a contract.

Using our example above:

The Supermarket pays Maria 10 CHF/year to access her (i) personal data (email, phone), (ii) preferences about food (vegetarian, bio-products, lactose intolerant) and (iii) communication preferences (Maria would like to receive discounts/product info via WhatsApp). If The Supermarket listens to Maria, she will be a happy consumer.

AI Lawyer tomorrow - How the AI lawyer innovation project on Cardano will improve your understanding of privacy policies you are forced to accept outside of the Profila platform

Supporting customers’ comprehension of the terms of a privacy agreement they accept when interacting with a website or concluding a contract for an online service is the core of this innovation project. Our privacy solution of tomorrow will help you better understand the terms that Brands propose when they ask for your personal data while you are browsing the internet (outside the protection of the Profila platform).

A typical user perceives this legal document as complicated and therefore simply accepts the privacy statement without reading it through. By helping the average user understand it, we expect better awareness of the impact that accepting such an agreement can have on how companies use personal data.

Under the Innosuisse Project, Profila and HSLU are developing a “smart knowledge base” (KB) composed of questions and answers (Q&A) that can be matched with the specific doubt or question a user has. This will work as the intelligence behind a conversational agent, proposed as a self-help tool.

Image File

A chatbot answers the user’s query by scanning the knowledge base for the best matches to propose. This should be able to solve the most common and standard aspects but will fail for more specific needs. In order to provide an advanced service, a match between the specific request and experts will be provided, in order to give a rapid and low-cost option to clarify the subject with a legal expert.
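The lookup-then-escalate flow described above can be illustrated with a minimal sketch. It is not the project's actual matching algorithm: the toy knowledge base, the bag-of-words cosine similarity, the `answer` function and the `threshold` value are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge base of (question, answer) pairs.
KNOWLEDGE_BASE = [
    ("Can a company share my email address with affiliated companies?",
     "Only if the privacy policy you accepted or the applicable law allows it."),
    ("How do I stop receiving marketing newsletters?",
     "You can withdraw consent or object to direct marketing at any time."),
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def answer(query, threshold=0.3):
    """Return the best KB match, or refer to an expert if the match is weak."""
    query_vector = Counter(tokenize(query))
    best_score, best_answer = 0.0, None
    for question, stored_answer in KNOWLEDGE_BASE:
        score = cosine(query_vector, Counter(tokenize(question)))
        if score > best_score:
            best_score, best_answer = score, stored_answer
    if best_score >= threshold:
        return {"source": "knowledge_base", "answer": best_answer, "score": best_score}
    # Assumed fallback: no sufficiently close match, so escalate to a legal expert.
    return {"source": "expert", "answer": None, "score": best_score}
```

A query such as `answer("How can I stop receiving newsletters?")` is served from the knowledge base, while an unrelated question about corporate tax falls below the threshold and is routed to an expert.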

Additionally, by collecting user feedback about the perceived quality of the information and the experts suggested, the system will improve its own performance, taking into account these metrics.

The vision is to provide a one-stop solution for every privacy-related question a user has about online services. The creation of the KB (on the top left in the illustration) and the matching algorithm will be based on Natural Language Processing (NLP), so that users can interact in natural language, which lowers the entry barrier for customers.

------------------------------------------------------------------------------------------------------

<u>Next step - commercial real life use case of our AI lawyer project for the Cardano community</u>

As the challenge shows, there is a need to build effective solutions to explore and provide legal & financial services and support to funded Proposers.

If you receive funding to build out your DEX, your token offering, your financial service, your NFT offering, or even your meme token (love to Hosky!), you are faced with a variety of legal questions around regulatory compliance, taxation, AML/KYC, payment services, etc., potentially in many jurisdictions, considering that you are offering your product or service on the internet. The only way to navigate this web of legal obligations is to consult lawyers, who are expensive, often specialized in only one aspect of your business, and only able to provide an answer for one jurisdiction.

Via this project, we are looking to collect a wide variety of legal documents, laws, regulations, recommendations, court cases, company policies, industry whitepapers, etc., that contain the answer to most of the legal questions you may have as an innovative entrepreneur on Cardano. By making the knowledge base "smart" (=following our research with HSLU as explained above) and easy to query (=ask questions of), you will be able to get an automated answer in seconds and at a price of 2-10 USD per relevant answer. If the answer to your question is not found in the knowledge base, or if it's not entirely clear, you can connect to a group of legal experts in our network who can provide a more detailed response.

This solution will let you gain time, confidence and legal certainty to operate and expand your blockchain activities at a lower cost and higher speed.

Please describe how your proposed solution will address the Challenge that you have submitted it in.

As the challenge shows, there is a need to build effective solutions to explore and provide legal & financial services and support to funded Proposers.

First, our solution will improve access to legal knowledge for proposers (and others in the Cardano ecosystem), at the mere cost of 2-10 USD per question you ask the AI lawyer.

As part of this innovation project’s first work package, we are building the first steps of our AI lawyer, which can automatically answer certain questions from the industry. Instead of contacting a real-life lawyer each time you have a specific legal question, you will be able to query the smart knowledge base and almost instantaneously get the most relevant answer, matched against more than 1.000.000 legal documents. Only when the answer is not sufficient, or cannot be found in the knowledge base, will the AI lawyer refer you to a legal expert who can answer more complex questions.

Second, our solution will improve access to actual lawyers for people, without any retainers or large legal fees. We are building up a network of blockchain lawyers that will be able to give you understandable answers to your legal questions (not solved by the knowledge base automatically), at a cost of approx. 25 USD per question (up to 150 USD per question).

What are the main risks that could prevent you from delivering the project successfully and please explain how you will mitigate each risk?

We have attached a "risk matrix", which gives a very detailed overview of 20 individual risks we identified, including the description, the responsible person, the indicators of occurrence, the probability, the impact and the preventive measures.

As an illustration, we included 3 challenges below (without the detailed information from the matrix, which you can find in the matrix in attachment, titled "Profila and HSLU - Innovation Project - Risk Matrix v. 2.1"):

Document collection - Behind the “AI lawyer” is an intelligent database developed at the HSLU. Alexander Denzler’s research team from the HSLU’s School of Information Technology is “feeding” the database’s algorithm with almost one million documents, ranging from legal cases, court rulings and data usage guidelines issued by authorities and associations to companies’ general terms and conditions of business. The more data the system has at its disposal, the better it learns to respond to specific cases, explains Denzler. We are currently well on our way in the document collection effort, with, among others, 600.000 privacy policies collected.

Speech recognition - Speech recognition represents a particular challenge for IT researchers. This is because most users are not familiar with the legal terminology. “Nevertheless, our artificial intelligence must be able to understand the legal issues behind a question,” says Denzler. “When it comes to legal questions, the answers have to be precise.” However, the HSLU team has the required extensive experience in NLP and speech recognition projects.

Database testing & accuracy - To ensure that the “AI lawyer” works as intended by giving appropriate answers, Profila’s lawyers are putting the database through its paces using test questions. An initial version capable of understanding and answering questions should launch in summer 2022. The more the program is used, the more new cases, and thus new data, it will be able to leverage. This is an exercise that is taking a lot of time and effort from the legal team. Profila has already hired an intern who is working the entire summer on testing the database accuracy.

Please provide a detailed plan, including timeline and key milestones for delivering your proposal.

Title of work package - Domain Specific “Legal” Concept Extraction

Timeline of the work package: The work package has already kicked off and will continue for the coming 3-6 months (July - December). We are working on this regardless of the funding decision, and if we get funding, we will be able to show the results within as little as two or three months after receiving funding.

Description of work package: Relevant data from a set of predefined data sources (in English) will be preprocessed (via a web form similar to this one - https://bupte.enterpriselab.ch/Profila/index.php) and merged into a centralized storage (similar to this repository - https://bupte.enterpriselab.ch/Profila/read.php). Additional law-specific data sources are used for this task, such as StackExchange Legal and publicly available Court Cases & Ruling repositories.

Description of the activities: This work package includes three sub-activities: (1) Data engineering - structuring of the data sources and definition of its divisions. (2) Concept identification for the legal domain - how new, meaningful concepts can be identified and extracted from within a given legal document. (3) Quality evaluation of the extracted concepts.

Description of the quantifiable deliverables: The deliverables for the three sub-activities under this WP are the following: D2.1: Data engineering - at least 1000 entries in the repository, in the following form - https://bupte.enterpriselab.ch/Profila/read.php. D2.2: Concept and subdomain identification of the legal domain - (a) design of a suitable concept selection approach; and (b) collection of extracted concepts composed of at least 100 specialized legal concepts. D2.3: Quality evaluation - extracted concepts. The main metric used will be meaningfulness of the proposed terminology, with respect to the specificity of the legal domain.

Please provide a detailed budget breakdown.

No. of hours worked per project partner under this work package:

HSLU - Hochschule Luzern - 868 hours at 100 USD per hour (this is funded by the Swiss government for 50% only) - 43.400 USD

Profila GmbH - 276 hours at 100 USD per hour - 27.600 USD

100 USD is the reduced price per hour of the legal experts that are working on this work package no. 2, namely Clara-Ann Gordon, Michiel Van Roey, Elena Meier.

Additional budget required - 10 days of project management and reporting work @ 700 USD per day (7.000 USD) (coordinator calls, monthly reporting, newsletter updates, AMA sessions, townhall prep presentations etc).
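As a cross-check, the three budget lines above can be added up in a few lines of code. This is only an arithmetic sketch: it assumes that the HSLU line means half of the 868 hours is covered by the Swiss government and the other half is requested here, which is the reading that reproduces the stated 43.400 USD. The function name `budget_breakdown` is made up for illustration.

```python
HOURLY_RATE_USD = 100  # reduced hourly rate stated in the budget


def budget_breakdown():
    """Recompute the requested amounts from the hours and rates listed above."""
    hslu_full_cost = 868 * HOURLY_RATE_USD      # 86,800 USD of HSLU work in total
    lines = {
        # Assumption: only the 50% not funded by the Swiss government is requested.
        "HSLU": hslu_full_cost // 2,
        "Profila GmbH": 276 * HOURLY_RATE_USD,
        "Project management and reporting": 10 * 700,
    }
    lines["Total"] = sum(lines.values())
    return lines


if __name__ == "__main__":
    for name, amount in budget_breakdown().items():
        print(f"{name}: {amount:,} USD")
```

Under that assumption the lines sum to 78,000 USD, which matches the amount requested for this proposal.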

See the project proposal attached which includes all the details per work package, as well as a timeline of all the work packages.

Please provide details of the people who will work on the project.

Main contact persons:

Profila GmbH

Michiel Van Roey, Co-Founder of Profila and Project Head

See LinkedIn https://www.linkedin.com/in/michielvanroey/

Lucerne School of Information Technology

Prof Dr. Alexander Denzler, lecturer and Project Head

Extended Profila team working on the project:

Ipek Sahiner (head engineer) - Swiss-based engineer with long experience as a project manager. Her main skills include project planning, architectural design and mathematics, with experience as a database administrator and in structured knowledge management. See LinkedIn, https://www.linkedin.com/in/ipeksahinerschlecht/.

Mikko Kotila (CTO) - IT project manager and data analyst with 20+ years’ experience in multi-disciplinary technology and software development projects (throughout all phases of implementation, with global scope and resource teams). See LinkedIn, https://www.linkedin.com/in/mikkokotila/.

Privacy/legal team - Michiel Van Roey is the company’s general counsel and privacy expert with 8 years PQE as IT/privacy lawyer in law firms and companies in EU/US/CH. See LinkedIn https://www.linkedin.com/in/michielvanroey/. Clara-Ann Gordon is a partner and head of privacy at Niederer Kraft Frey (Swiss law firm, Zurich). See LinkedIn, https://www.linkedin.com/in/clara-ann-gordon/.

Shawn Jensen is the founder of Profila GmbH. Shawn has delivered in numerous senior management roles in Vodafone Global Enterprise (VGE), most recently as the Head of Product - Connectivity and the Head of Customer Presales and Service MEA. See LinkedIn, https://www.linkedin.com/in/shawnj/.

The team will be helped by Mitchell Goodie (product analyst), https://www.linkedin.com/in/mitchellgoudie/, and Elena Meier (legal intern), https://www.linkedin.com/in/elena-meier3/.

Extended HSLU research team working on the project:

See the full team on the HSLU website, consisting of Prof. Dr. Denzler, Luca Mazzola, Diamantis Argyris, Christian Renold, Atreya Shankar, Michael Stauffer, Andreas Waldis - https://www.hslu.ch/en/lucerne-university-of-applied-sciences-and-arts/research/projects/detail/?pid=5823

If you are funded, will you return to Catalyst in a later round for further funding? Please explain why / why not.

Under this first work package, we will focus on the legal concept extraction as a first crucial step to create the Cardano AI lawyer. We will develop a web form to manually upload documents into the repository (similar to this one https://bupte.enterpriselab.ch/Profila/index.php) and set up a repository with different categories of information (similar to this one https://bupte.enterpriselab.ch/Profila/read.php). We will try to include as many keywords as possible per legal document, so they are easy to query for answers.

However, many more work packages need to be finalized before our actual AI lawyer solution is ready to be used to the full extent (namely, automatically selecting the correct piece of information from one source as "the" correct answer based on the quality of the information) and later on integrated with the Cardano blockchain and payment services.

We are currently analyzing whether Cardano tech (blockchain, prism, singularity etc) can be utilized in our innovation project as part of the innovations under the other work packages set out below. Any advice or comments from the Cardano community on where this would be appropriate are welcome. If so, we will likely file a follow-up proposal.

The innovativeness of this project lies in the following proposed solutions per work package:

Domain Specific “Legal” Concept Extraction (WP2)

the automatic identification of the right concepts from the existing data sources (a heterogeneous corpus of generic and legal-domain-specific documents). The challenge is to identify a stable and representative set from a limited corpus, using a generic model with domain-specific filtering. The set of concepts works as the users' vocabulary for question tagging (E)
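One simple way to picture "domain-specific filtering" is to keep only the terms that are markedly more frequent in legal text than in generic text. The sketch below is an illustration under that assumption, not the extraction approach designed in WP2; `extract_concepts`, `min_ratio` and `smoothing` are hypothetical names and values.

```python
import re
from collections import Counter


def term_frequencies(documents):
    """Relative frequency of each alphabetic token across a list of documents."""
    counts = Counter()
    for doc in documents:
        counts.update(re.findall(r"[a-z]+", doc.lower()))
    total = sum(counts.values())
    return {term: n / total for term, n in counts.items()}


def extract_concepts(legal_docs, generic_docs, min_ratio=2.0, smoothing=1e-3):
    """Keep terms whose legal-corpus frequency clearly exceeds their generic one."""
    legal = term_frequencies(legal_docs)
    generic = term_frequencies(generic_docs)
    # Smoothing avoids division by zero for terms absent from the generic corpus.
    scores = {t: f / (generic.get(t, 0.0) + smoothing) for t, f in legal.items()}
    kept = [t for t, s in scores.items() if s >= min_ratio]
    return sorted(kept, key=lambda t: -scores[t])


if __name__ == "__main__":
    legal = ["The controller must obtain consent",
             "Consent may be withdrawn by the data subject"]
    generic = ["The weather is nice and the data is old",
               "We may go by the river"]
    print(extract_concepts(legal, generic))
```

On this toy input, common words such as "the", "may" and "data" are filtered out because they are about as frequent in the generic text, while "consent" and "controller" survive as candidate concepts.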

Creation of a Concept Map (WP3)

automatic extraction of semantically meaningful relationships between the identified concepts, based on similarity metrics. Building meaningful relationships between concepts that resemble the context within the legal domain is a complex task, which will lead us to define a Legal Knowledge Map. This is used as the base for the estimation of knowledge depth and breadth of a given question (B) and as the available vocabulary for supporting question classification (D)
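To make "relationships based on similarity metrics" concrete, the sketch below links two concepts when they tend to occur in the same documents, using Jaccard similarity over document sets. The choice of metric, the substring matching and the `min_similarity` threshold are assumptions for illustration only; the proposal does not state which metrics WP3 will use.

```python
from itertools import combinations


def build_concept_map(documents, concepts, min_similarity=0.5):
    """Return {(concept_a, concept_b): similarity} for sufficiently related pairs."""
    # For each concept, record the indices of the documents that mention it.
    # Simplification: plain substring matching on lowercased text.
    occurrences = {
        c: {i for i, doc in enumerate(documents) if c in doc.lower()}
        for c in concepts
    }
    edges = {}
    for a, b in combinations(sorted(concepts), 2):
        union = occurrences[a] | occurrences[b]
        if not union:
            continue  # neither concept occurs anywhere
        similarity = len(occurrences[a] & occurrences[b]) / len(union)
        if similarity >= min_similarity:
            edges[(a, b)] = similarity
    return edges


if __name__ == "__main__":
    docs = [
        "Consent must be given by the data subject.",
        "The data subject may withdraw consent.",
        "The controller keeps records of processing.",
    ]
    print(build_concept_map(docs, ["consent", "data subject", "controller"]))
```

Here "consent" and "data subject" always appear together and are linked, while "controller" shares no document with either and stays unconnected.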

Image File

Knowledge Structure Generation (WP4)

being able to identify meaningful knowledge areas (clusters/granules) and imposing hierarchical knowledge structures that take into account the typical organization of knowledge, with generic concepts on the top and more specific ones at a lower level. Requirements for achieving these results are to define the right clustering approach, to identify the best-suited level of hierarchical layers and the matrices for affiliating the concept to a specific layer. The resulting Granular Legal Knowledge Graph allows a fine-grained characterization of questions (B) and assists users in determining the specific legal domain (G)

Image File

User Knowledge Modelling (WP5)

creating a detailed and precise user model requires a reasoned combination of explicit profiling and implicit modeling, and the cooperation between these two processes in the different phases of the system operation has to be researched to find the most balanced interplay (A). This is also fundamental to reduce the cold-start problem, where at the beginning no user activity exists and the model is purely based on explicit profiling with artificial thresholds (C). This is a core part, as it directly supports the consideration of the knowledge evolution over time (F) and the best-suited expert selection.
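The interplay between explicit profiling and implicit modeling can be pictured as a weighted blend whose weight shifts toward observed behaviour as activity accumulates. The formula below is an assumed illustration of that idea, not the model WP5 will produce; `blended_knowledge` and `half_point` are hypothetical.

```python
def blended_knowledge(explicit_level, implicit_level, n_interactions, half_point=10):
    """Blend a self-declared and an observed knowledge level, both in [0, 1].

    With no activity (cold start) the result equals the explicit profile;
    after `half_point` interactions both sources weigh equally, and the
    implicit model dominates as activity keeps growing.
    """
    if n_interactions < 0:
        raise ValueError("n_interactions must be non-negative")
    weight_implicit = n_interactions / (n_interactions + half_point)
    return (1 - weight_implicit) * explicit_level + weight_implicit * implicit_level


if __name__ == "__main__":
    for n in (0, 10, 90):
        print(n, round(blended_knowledge(0.8, 0.2, n), 2))
```

A user who declares a high knowledge level but behaves like a beginner starts at the declared value and drifts toward the observed one as interactions accumulate.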

Image File

Legal Document Knowledge Mapping (WP6)

the counterpart of the user model is the document knowledge profile (also considering the complexity, on different granules and different layers) that removes the need for an explicit user-based classification of documents (D)

Inference Engine (WP7)

an adaptive neuro-fuzzy inference system (ANFIS) will be considered to find the most suited concept's profile (Fig 02) by considering the meaningful historical usage of concepts by users (C) using, on top of the concepts, also metrics for encoding different weights, such as the effects of time on knowledge retention (F)

Automated pricing model for micropayments (WP8)

considering the question knowledge profile, specialized payment models will be ideated and tested. This will guarantee a more fine-grained and tailored pricing model for accessing existing answers, based on a balanced mix of recency, usages, rating, and of depth and breadth of the posed question (I), as shown in the figure
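As a rough illustration of such a pricing model, the sketch below maps the five factors named above onto the 2-10 USD range quoted earlier in this proposal. Everything else is an assumption: the factors are taken as already normalized to [0, 1], higher values are assumed to raise the price, and the equal default weights and the name `answer_price` are made up.

```python
MIN_PRICE_USD = 2.0   # lower bound of the per-answer price range in the proposal
MAX_PRICE_USD = 10.0  # upper bound of the per-answer price range in the proposal


def answer_price(recency, usage, rating, depth, breadth, weights=None):
    """Map five normalized factors in [0, 1] to a price between 2 and 10 USD."""
    factors = {"recency": recency, "usage": usage, "rating": rating,
               "depth": depth, "breadth": breadth}
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    # Assumption: equal weights unless the caller supplies its own.
    weights = weights or {name: 1.0 for name in factors}
    score = sum(weights[n] * factors[n] for n in factors) / sum(weights.values())
    return round(MIN_PRICE_USD + score * (MAX_PRICE_USD - MIN_PRICE_USD), 2)


if __name__ == "__main__":
    print(answer_price(recency=0.9, usage=0.4, rating=0.8, depth=0.7, breadth=0.2))
```

Changing the weights is the natural tuning knob here, for example to let depth and breadth of the question count more than usage.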

Image File

Please describe what you will measure to track your project's progress, and how will you measure these?

Profila will provide the community with detailed periodical progress for this proposal (once funded) in the following ways:

  1. GitHub repository updated once per month, after the initial scrum sessions for creation of the PRD (product requirement document).
  2. We will provide a link to the actual document repository so people can see the progress of the documents being uploaded there.
  3. Updates every two weeks to other Cardano proposers via the Catalyst coordinator call
  4. Updates every two weeks in our "Cardano projects" newsletter (register via our website https://ico.profila.com)
  5. Monthly project progress and KPI reports submitted to Catalyst teams and available to the public for verification.
  6. Monthly Swarm session office hour (at end of townhall) for a Question and Answer session about our funded projects.
  7. Periodical AMAs by the Profila founders to talk about our progress.

What does success for this project look like?

Publication of our first paper was a first milestone in this AI lawyer project - titled "Privacy and Customer’s Education: NLP for Information Resources Suggestions and Expert Finder Systems" - HCI for Cybersecurity, Privacy and Trust - Springer · Jun 1, 2022 - http://dx.doi.org/10.1007/978-3-031-05563-8_5

Quantifiable deliverables under this WP as part of fund 9: The deliverables for the three sub-activities under this WP2 are the following: D2.1: Data engineering - at least 1000 entries. D2.2: Concept identification of the legal domain - (a) design of a suitable concept selection approach; and (b) collection of extracted concepts composed of at least 100 specialized legal concepts. D2.3: Quality evaluation - extracted concepts. The main metric used will be meaningfulness of the proposed terminology, with respect to the specificity of the legal domain.

Please provide information on whether this proposal is a continuation of a previously funded project in Catalyst or an entirely new one.

First time project - entirely new.
