Wolfram: AI - LLM Distributed Inference Services
Current Project Status: Complete
Amount Received: ₳100,000
Amount Requested: ₳100,000
Percentage Received: 100.00%
Solution

Development and prototyping of a distributed LLM inference service. The modular system will support a range of models.

Problem

The construction and utilization of LLM inference infrastructure is costly and centralized, placing control in the hands of major corporations and making it inaccessible to global populations.

Wolfram Blockchain Labs

3 members

Please describe your proposed solution.

<https://youtu.be/ICKp73-4tp8>

Artificial Intelligence plays a growing role in assisting users engaged in knowledge-based work.

Wolfram Blockchain Labs (WBL) is actively engaged in Cardano Catalyst-sponsored research, funded in Fund10, to develop chatbot assistance services for Cardano Catalyst (note: we have since renamed that project the Cardano “Catalyst Navigator”).

In the course of this work, it has become clear that lower-cost inference resources are needed in order to appropriately deploy LLM-based applications for Catalyst, for blockchain communities, and for general communities worldwide.

To build the distributed infrastructure needed to power such low-cost inference, WBL is collaborating with GridRepublic, an organization with nearly two decades of experience in distributed computing applications at global scale.

For this project, we propose to build a modular inference service, with a simple API (and WebUI), but running on a global network of participating servers. Towards this end, the project will provide a simple-to-deploy LLM-Server application which can be run on appropriately provisioned servers, and will then automatically plug into and integrate with the global inference service: "The Network is the computer", as they used to say at Sun Microsystems.
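The "plug in and integrate" idea can be pictured with a minimal in-process sketch. Everything in it is an assumption made for illustration: the `InferenceNetwork` class, the `register` and `infer` methods, and the round-robin routing are hypothetical and are not taken from the project's actual design. Real nodes would be remote servers reached over the internet, not local functions.

```python
from dataclasses import dataclass, field
from itertools import cycle
from typing import Callable, Dict, Iterator, Optional

# Hypothetical type: a node is modeled as a function from prompt to completion.
InferFn = Callable[[str], str]


@dataclass
class InferenceNetwork:
    """Toy stand-in for the global inference service (illustrative only)."""

    servers: Dict[str, InferFn] = field(default_factory=dict)
    _order: Optional[Iterator[str]] = None

    def register(self, server_id: str, infer: InferFn) -> None:
        """A newly started LLM-Server instance announces itself to the network."""
        self.servers[server_id] = infer
        # Rebuild the rotation so the new node starts receiving requests.
        self._order = cycle(list(self.servers))

    def infer(self, prompt: str) -> str:
        """Route one request to the next registered node (round-robin)."""
        if not self.servers or self._order is None:
            raise RuntimeError("no LLM-Server instances registered")
        server_id = next(self._order)
        return self.servers[server_id](prompt)


# Usage: two fake nodes that tag their answers with their own id.
network = InferenceNetwork()
network.register("node-a", lambda prompt: f"node-a answers: {prompt}")
network.register("node-b", lambda prompt: f"node-b answers: {prompt}")
print(network.infer("What is Project Catalyst?"))
print(network.infer("What is Project Catalyst?"))
```

Round-robin is used here only because it is the simplest routing policy to read; a production service would more plausibly weigh node capacity, model availability, and latency.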

Through this distributed inference infrastructure, Project Catalyst will then be able to launch scalable and low-cost LLM applications trained on Cardano datasets. This will help enhance sharing of knowledge about, and boost participation in, the Cardano ecosystem.

Please define the positive impact your project will have on the wider Cardano community.

A key component of this project is the creation of a distributed infrastructure for running LLM-based applications, which aims to lower costs and enhance democratic control of critical systems.

The current project proposes development of a prototype. Future work, however, could enable integration of ADA-based payments to support the ecosystem, e.g. LLM users could in principle pay in ADA, and resource-providers (i.e. participants running the LLM-Server app) could be paid in ADA.

It's also worth noting that we intend to build our prototype around the MDEL LLM, which has unique and extensive multilingual capabilities: it is, we believe, the only LLM supporting languages like Hindi, Vietnamese, and others. This opens exciting avenues of outreach to communities worldwide where both tools and infrastructure for advanced AI are presently unavailable. (*MDEL has experience and tooling for extending the range of languages, which provides another key avenue for future growth.)

What is your capability to deliver your project with high levels of trust and accountability? How do you intend to validate if your approach is feasible?

Wolfram is working with GridRepublic (which manages Charity Engine), an entity we have worked with before and which has over a decade of experience building distributed computational systems. (Wolfram Research has also been a Charity Engine customer and user for many years; see also [Wolfram Language Batch Compute](<https://reference.wolfram.com/language/ref/batchcomputationprovider/CharityEngine.html>).)

The GridRepublic team (through the Charity Engine service) has operated distributed applications running on as many as a million simultaneous CPU cores, in domains including molecular simulation, advanced mathematics, and genomics.

For example:

  • For Math Fans: A Hitchhiker’s Guide to the Number 42 (In partnership with MIT + University of Bristol)
  • [Finding Candida auris in public metagenomic repositories](<https://doi.org/10.1101/2023.08.30.555569>) (*in partnership with the US Centers for Disease Control and Prevention (CDC))

What are the key milestones you need to achieve in order to complete your project successfully?

Development of easy-to-deploy LLM Server (e.g. in container form)

  • Containerized LLM application (usable via command line)
  • Deployment documentation

Development of LLM API (*Concept)

  • APIs to enable interaction with the LLM
  • Basic API documentation
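To make the API milestone concrete, here is a minimal sketch of what a request handler for such an API could look like. The JSON field names (`prompt`, `max_tokens`, `completion`, `error`), the default of 128 tokens, and the function `handle_inference_request` are all assumptions made for illustration; the proposal does not specify a request or response format.

```python
import json
from typing import Callable


def _error(message: str) -> str:
    """Build a JSON error response (hypothetical response shape)."""
    return json.dumps({"error": message})


def handle_inference_request(body: str, generate: Callable[[str, int], str]) -> str:
    """Validate a JSON request body and return a JSON response string.

    `generate` stands in for whatever model backend the server wraps.
    """
    try:
        request = json.loads(body)
    except json.JSONDecodeError:
        return _error("body is not valid JSON")
    if not isinstance(request, dict):
        return _error("body must be a JSON object")

    prompt = request.get("prompt")
    if not isinstance(prompt, str) or not prompt:
        return _error("'prompt' must be a non-empty string")

    max_tokens = request.get("max_tokens", 128)  # assumed default
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens <= 0:
        return _error("'max_tokens' must be a positive integer")

    return json.dumps({"completion": generate(prompt, max_tokens)})


# Usage with a fake backend that simply truncates the prompt.
def fake_generate(prompt: str, max_tokens: int) -> str:
    return prompt[:max_tokens]


print(handle_inference_request('{"prompt": "hello", "max_tokens": 3}', fake_generate))
```

Keeping validation separate from the model call means the same handler could sit behind an HTTP endpoint, a command-line wrapper, or a notebook interface without change.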

Development of LLM WebUI (*Concept)

  • WebUI to a locally running version of the LLM
  • Lab Notebook interface to a locally running version of the LLM

Iterate: Deploy/Test/Debug

  • A prototype distributed LLM service, e.g. WebUI and Notebook inference requests served by remote resources, over the internet
  • Basic user documentation

N/A

Share working demo: an LLM inference service powered by distributed compute resources, with tolerable latency

Who is in the project team and what are their roles?

Matthew Blumberg, GridRepublic Co-founder and CEO

Matthew Blumberg has been working in the fields of network computing and large scale collaboration for 15 years. He is Executive Director of GridRepublic and Co-Founder of Charity Engine, two large-scale distributed computing services. Past projects include work as Fellow at Harvard's MetaLAB; Visiting Fellow at the Laboratory for Innovation Science at Harvard (LISH); Section Editor of "The Handbook of Human Computation"; Consultant to DARPA’s “Social Computing Seedling”; and Partner in TGT Energy, an industrial-scale energy storage venture.

Jon Woodard, WBL CEO

Jon Woodard is the CEO at Wolfram Blockchain Labs, where he coordinates the decentralized projects that connect the Wolfram Technology ecosystem to different DLT ecosystems. Previously, at Wolfram Research, Jon worked on projects at the direction of Wolfram Research CEO Stephen Wolfram, and prior to that he was a member of the team that worked on the monetization strategies and execution for Wolfram|Alpha. Jon has a background in economics and computational neuroscience. He enjoys cycling in his spare time.

Johan Veerman, WBL CTO

Johan Veerman is General Manager at Wolfram Research South America and CTO at Wolfram Blockchain Labs. Previously he was Science Advisor at the Ministry of Foreign Affairs in Peru and Chief Scientist on two Antarctic expeditions. Johan's background is in physics and business management. He enjoys playing soccer and is a certified cave diver.

Steph Macurdy, WBL Head of Research and Education

Steph Macurdy has a background in economics, with a focus on complex systems. He attended the Real World Risk Institute in 2019, led by Nassim Taleb, and has been investing in the crypto asset space since 2015. He previously worked for Tesla as an energy advisor and Cambridge Associates as an investment analyst. Steph is a youth soccer coach in the Philadelphia area and is interested in permaculture.

Gabriela Guerra Galan, WBL Business Operations Specialist

Gabriela Guerra Galan has 15+ years of experience leading projects. She is a certified PMP and Product Owner with a bachelor's degree in Mechatronical Engineering, complemented by a master's degree in Automotive Engineering. As the co-founder of Bloinx, a startup that secured funding from the UNICEF Innovation Fund, she has demonstrated a passion for driving innovation and social impact.

Please provide a cost breakdown of the proposed work and resources.

Milestone 1: ₳10,000

  • Development of easy-to-deploy LLM Server (e.g. in container form)
  • Containerized LLM application (usable via command line)
  • Deployment documentation

Milestone 2: ₳25,000

  • Development of LLM API (*Concept)
  • APIs to enable interaction with the LLM
  • Basic API documentation

Milestone 3: ₳20,000

  • Development of LLM WebUI (*Concept)
  • WebUI to a locally running version of the LLM
  • Lab Notebook interface to a locally running version of the LLM

Milestone 4: ₳35,000

  • Iterate: Deploy/Test/Debug
  • A prototype distributed LLM service, e.g. WebUI and Notebook inference requests served by remote resources, over the internet
  • Basic user documentation
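As a quick arithmetic check, the short sketch below totals the four itemized milestones and compares the sum with the amount requested in the project status at the top of this page. The figures are copied from this page; the variable names are only illustrative.

```python
# Amounts in ADA, copied from the cost breakdown above.
milestones = {1: 10_000, 2: 25_000, 3: 20_000, 4: 35_000}

# "Amount Requested" as stated in the project status header.
requested = 100_000

itemized = sum(milestones.values())
not_itemized = requested - itemized

print(f"Itemized across milestones 1-4: {itemized:,} ADA")
print(f"Requested: {requested:,} ADA")
print(f"Not itemized in this breakdown: {not_itemized:,} ADA")
```

The four listed milestones total ₳90,000, so ₳10,000 of the requested ₳100,000 is not itemized in the breakdown as it appears here.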

How does the cost of the project represent value for money for the Cardano ecosystem?

This initiative involves developing a distributed infrastructure to run LLM-based applications. It not only complements other applications that require LLM usage but also offers substantial benefits to the Catalyst community.

GridRepublic's tools and expertise pave the way for such a functional, reliable, and scalable distributed inference service at a relatively low cost.

Furthermore, as noted above, this concept system will be well-suited for future projects. These include potential integration of Cardano-based payment and incentive systems to develop a Minimum Viable Product (MVP) in the future, scaling into a sustainable 'intelligence-as-a-service' ecosystem on Cardano.
