Please describe your proposed solution.
Problem:
The recent rise of AI and ML has created a shortage of available GPU computing power. The market is dominated by large cloud service providers whose offerings are often expensive and out of reach.
There is currently no real-world solution for distributed scaling that works on a decentralized network. A globally distributed computing infrastructure is needed that works seamlessly on consumer devices and machines. This can be achieved through seamless scaling: combining decentralized GPUs into large-scale clusters.
Unique solution:
NuNet currently enables GPU computing for a single task on a single GPU on one machine, implemented as part of the funded Fund8 proposal NuNet: Decentralized GPU ML Cloud. Scaling up to multi-machine decentralized clusters running multiple parallel computing tasks is the next step.
Seamless scaling can be achieved by integrating containerization, distributed computing frameworks, and peer-to-peer networking, as described in the detailed approach below.
This depends directly on a decentralized hardware infrastructure that meets the requirements of such an integrated software workflow. In addition to single-GPU consumer machines, the main focus of Fund8, multi-GPU consumer machines (primarily mining farms) will play a crucial role.
On such an established computing network, containers will be able to distribute a single computational job efficiently across devices, regardless of whether each device has one GPU or many.
Detailed approach:
This proposal is pioneering research aimed at developing a decentralized, hardware-independent, accelerated processing environment for distributed computational tasks. Our recent explorations have demonstrated the feasibility of using containerization technologies, such as Docker, to establish a hardware-independent setup capable of accelerated processing tasks. This methodology has the potential to democratize access to high-performance processing units and transform the landscape of complex data analysis. To fully harness this potential, it is crucial to integrate this approach with distributed computation techniques and decentralized networking protocols.
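As a minimal sketch of the containerization building block, the snippet below uses the Docker SDK for Python to run a GPU-enabled container. The image and command are placeholders, and it assumes a GPU container runtime (e.g. the NVIDIA Container Toolkit) is installed on the host; it is illustrative only, not NuNet's implementation.

```python
# Minimal sketch: launching a GPU-enabled container via the Docker SDK for
# Python. Assumes Docker plus a GPU container runtime on the host; the image
# and command below are illustrative placeholders.
import docker

client = docker.from_env()

# Request all available GPUs for the container.
gpu_request = docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])

output = client.containers.run(
    image="nvidia/cuda:12.2.0-base-ubuntu22.04",  # placeholder base image
    command="nvidia-smi",                          # prints the visible GPUs
    device_requests=[gpu_request],
    remove=True,
)
print(output.decode())
```

Because the container image pins the toolchain, the same image can run on any participating machine that exposes a GPU, which is what makes the setup hardware-independent from the workload's point of view.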
1. Broad Access to GPU Computational Resources: The proposed research aims to design a vendor-agnostic solution, broadening access to high-performance processing resources for researchers and developers. This broad access can accelerate advances in complex data analysis and intelligent systems by eliminating the restrictions associated with vendor-specific solutions.
2. Optimal GPU Resource Utilization: By implementing resource allocation through frameworks such as Horovod, the proposed solution will ensure that processing resources across the network are utilized optimally (a minimal Horovod sketch follows this list). This would provide a cost-effective alternative to traditional cloud-based solutions, particularly for organizations with underutilized processing resources.
3. Decentralization: Machines can already connect to each other via NuNet. Integrating containerization, distributed resource allocation frameworks and libraries, and peer-to-peer networking will enable a decentralized network of containers, each functioning as a node, in which a container on any machine can see a container on any other machine on our network. This decentralization enhances the system's resilience and robustness, reducing its susceptibility to single points of failure.
4. Scalability: Combining the aforementioned frameworks will enable highly scalable computational workloads. As the network expands with more resources, the system can scale to accommodate larger tasks, supporting the growth of complex data analysis applications. Using the virtual GPU cluster through NTX on the Cardano testnet/preprod/mainnet will require an updated version of our service provider dashboard that allows a single GPU job to run across containers on the same or different machines on our network.
Multi-GPU machines will contribute substantially to such scaling. Dedicated AMD and Intel GPUs will also help scale the network from a cross-vendor perspective, opening the door for distinct technologies to work with each other, particularly through open-source software.
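The following is the Horovod sketch referenced in point 2 above: a minimal PyTorch training loop in which each process (one per GPU or GPU container) averages gradients with its peers. The model and data are placeholders, not NuNet's actual workload.

```python
# Minimal sketch of Horovod-based distributed training (PyTorch backend).
# The model and data are placeholders; NuNet's actual job format may differ.
import torch
import horovod.torch as hvd

hvd.init()                                   # one process per GPU/container
torch.cuda.set_device(hvd.local_rank())      # pin each process to its GPU

model = torch.nn.Linear(128, 10).cuda()      # placeholder model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01 * hvd.size())

# Average gradients across all workers on every optimizer step.
optimizer = hvd.DistributedOptimizer(
    optimizer, named_parameters=model.named_parameters()
)
# Start all workers from identical parameters.
hvd.broadcast_parameters(model.state_dict(), root_rank=0)

for _ in range(100):
    x = torch.randn(32, 128).cuda()          # placeholder batch
    y = torch.randint(0, 10, (32,)).cuda()
    optimizer.zero_grad()
    loss = torch.nn.functional.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()
```

A script like this is conventionally launched with Horovod's `horovodrun` tool, which maps processes onto the participating machines, as sketched under Proposed Structure below.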
Proposed Use of Funds:
The financial support from Cardano will be utilized to further the exploration and development of this innovative solution. Specifically, the funds will be allocated towards:
1. Research and Development: To enhance the proposed solution, overcome technical barriers, and ensure its effectiveness and efficiency.
2. Testing and Verification: To conduct extensive testing of the solution in diverse scenarios to ensure its reliability and performance.
3. Dissemination and Training: To distribute the research outcomes, provide training resources for other developers and researchers, and promote the adoption of the solution in the wider computational community.
In summary, this proposal presents a groundbreaking approach to democratizing access to high-performance processing resources and enabling efficient, scalable, and decentralized computational tasks. The funding from Cardano will be instrumental in bringing this pioneering solution to life, making a significant contribution to the advancement of computational infrastructure.
Proposed Structure:
Our solution will incorporate three key elements: containerization (e.g. Docker) for hardware standardization, resource allocation frameworks for managing distributed computation (e.g. Horovod), and native peer-to-peer libraries for decentralized network communication (e.g. libp2p). Each Docker container will host computational tasks, functioning as a network node. Horovod will control the distribution of tasks across nodes, while libp2p will facilitate communication among nodes.
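To make the intended division of labor concrete, the sketch below shows how a coordinating node might launch a job across discovered peers using Horovod's standard `horovodrun` launcher. The host names and slot counts are hypothetical; in the proposed design, peers would be discovered via libp2p rather than hard-coded.

```python
# Hypothetical coordination step: once peer containers are known (discovered
# via libp2p in the proposed design), a coordinator launches a single job
# across them with Horovod's standard launcher. Hosts/slots are illustrative.
import subprocess

hosts = "node-a:2,node-b:2"  # hypothetical peers with 2 GPU slots each

subprocess.run(
    ["horovodrun", "-np", "4", "-H", hosts, "python", "train.py"],
    check=True,  # raise if the distributed job fails
)
```

Under this assumption, the service provider dashboard described above would play the coordinator's role, so a user submitting one job would transparently fan it out over containers on one or several machines.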
Benefits for the Cardano ecosystem:
The research is a continuation and expansion of the already completed Fund8 proposal. It will enable all dapps and use cases in the Web2 and Web3 space that need GPU computing power to source it via NuNet. The value for the compute provided will be exchanged via the NTX token, which is a Cardano Native Token (CNT).
Each transaction will be executed via a smart contract on the Cardano blockchain, directly increasing transaction volume and CNT volume, as well as providing unique use cases to be built on top of it within the Cardano ecosystem.
How does your proposed solution address the challenge and what benefits will this bring to the Cardano ecosystem?
The proposal addresses the following directions of the challenge:
- Deployment, testing, and monitoring frameworks
- Knowledge base & Documentation
The research in this proposal will lead to development of the NuNet framework, available as open source to all users in the Cardano ecosystem and beyond as development continues. For the open-source community to use NuNet, an extensive knowledge base, documentation, and step-by-step procedures will be prepared.
The current surge of interest in AI and large-scale machine learning shows no signs of slowing down. GPU computing is the core requirement of AI and ML, and it is precisely what this proposal's research and PoC address.
NuNet is building technology that will allow people to provision hardware for AI/ML jobs monetized via the Cardano ecosystem. In the short term, success may boost Cardano usage; in the long term, it would connect real-world assets (computing power) with the crypto payment space through Cardano integration.
NuNet is building potentially disruptive technology that could tap into the global cloud computing market, valued at 548 billion USD and projected to grow to 1,240 billion USD. Capturing even a fraction of it would move substantial value via Cardano smart contracts. Based on this proposal and the consequent research, implementation will proceed, at which point a more precise estimate of user numbers can be made. Anyone in the Cardano ecosystem could deploy and use cheaper GPU cluster resources for AI, ML, rendering, and many other applications. It is a fundamental enabling technology.
Source:
<https://www.marketsandmarkets.com/Market-Reports/cloud-computing-market-234.html#:~:text=The%20global%20Cloud%20Computing%20Market,at%20a%20CAGR%20of%2017.9%25>.
How do you intend to measure the success of your project?
The proposal is research and PoC development that will lead to a development and deployment solution. The research itself is difficult to quantify, but it is an essential step for selecting the best development path and testing it with a proof-of-concept implementation, which will become the basis of future solutions and consequent development in the community.
In this regard, we anticipate that a number of users will build on the solutions from this research and the consequent development.
Success can be defined as this project leading to the selection of the best path to develop and implement decentralized GPU scaling toward large-scale GPU clusters.
Some of the direct benefits to the Cardano ecosystem are:
- Number of projects using cheaper GPU resources for AI/ML tasks
- Computing resources used in the processes are to be compensated in NTX, which is a Cardano Native Token
- Each exchange of value will be done via a Smart Contract on Cardano
- Currently, over 2,000 people are in the NuNet Discord testing the various builds of the NuNet platform
Some of the indirect benefits to the Cardano ecosystem are:
- Cardano becomes the settlement layer for decentralized Open Source computing frameworks used in training AI/ML models
- Other solutions can be built on top of the NuNet framework, greatly expanding the potential business models
- With the right onramp/offramp solutions, Web2 users can utilize compute power without even realizing there is a Web3 layer underneath. NuNet is interested in joint work with experts in this field.
Please describe your plans to share the outputs and results of your project?
Spreading Outputs Over a Timescale
Our project plan includes clear milestones and deliverables, which will be shared publicly as they are completed. This incremental release of outputs will ensure a continuous stream of updates for the community.
This approach lets us provide updates on a regular basis, and offers users the chance to provide feedback that we can use to guide subsequent development.
Sharing Outputs, Impacts, and Opportunities
We intend to leverage various communication channels to share our project's outputs, impacts, and opportunities:
- GitLab: The primary hub for our technical work, hosting our codebase, documentation, and issue tracking. This will be the main point of reference for the details of our project.
- Social Platforms: We plan to regularly post updates on our progress on platforms like Twitter, LinkedIn, and Reddit. This will include major milestones, bug fixes, and insights from our development work.
- Technical Discussions: We will continue to hold weekly technical discussions where we discuss the technical aspects of our work. This provides a forum for live Q&A and discussion with our community.
- Blogs: Regular blog posts will summarize the progress we have made, highlighting key achievements and outlining the next steps in our project.
Testing and further research
As an open-source project, our outputs will be freely accessible for further research and development. We encourage the community's involvement in testing our solutions to enhance their real-world performance.
Community Testing: We'll invite our users to participate in alpha and beta testing phases, where they can help identify bugs and suggest improvements. We'll use GitLab's issue tracking for managing feedback and provide guidelines for issue reporting and feature suggestions.
Internally, we'll use project insights and community feedback to guide our future work, optimize performance, and prioritize new features. Our aim is to foster a collaborative development ecosystem that is robust, relevant, and of high quality.