In this blog we provide technical details on our Danksharding CUDA implementation
Danksharding is the new sharding design proposed as part of Ethereum’s rollup-centric scaling roadmap. From an architectural perspective it mainly defines a new data type called blob-data. The blob-data is stored by the consensus layer (beacon chain) for a limited amount of time. It is not accessible by the execution layer (EVM), though its contents and availability can be verified by it. Danksharding utilizes a new actor type called a builder. Builders submit block proposals to proposers whose job it is to select the block with the highest bid. Only the selected block builder is then required to build the entire block. Once published, the selected block can be efficiently verified via data availability sampling.
ICICLE is an open-source software project aimed at improving the performance and efficiency of zero-knowledge cryptographic operations using CUDA-enabled GPUs. ICICLE focus is on implementation of widely used ZK primitive functions using CUDA and providing API with Rust lang out-of-the-box. Currently, the library supports MSM and NTT, Elliptic Curve NTT (ECNTT) and some other helper utilities.
The raw blob-data block contains 256 blobs and serves as the builder’s input. Each blob is a sequence of 4096, 32-byte field-elements from the BLS12–381 scalar field. The raw data is typically organized into a matrix of 256 rows by 4096 columns.
The following list outlines the three distinct outputs required to be constructed by the builder:
The figure above depicts the builder’s block construction process. The right, middle, and left branches of the drawing describe the commitments, interpolation, and proofs tasks. Green boxes describe NTT, ECNTT operations, yellow boxes describe vector multiplication and blue boxes are for Multi-Scalar Multiplication (MSM). It can be immediately seen that the scheme is heavily dominated by Number Theoretic Transform (NTT) and Inverse NTT (INTT) operations. A detailed description of the process is provided in our danksharding math paper.
Using ICICLE, we have implemented a flow that can run entirely in the GPU, with only the relevant outputs to be sent to a host machine and populated on the Ethereum network. Our work focuses on the smallest functions and building blocks that are relevant to the Danksharding builder. The CUDA functions that we designed, along with bindings to our Rust wrappers are capable of executing this process.
We designed the flow to be optimized in response time to the Ethereum network, by first splitting the input matrix D_in into two parallel execution branches:
Accelerating the KZG commitment calculation and share it to the network takes highest priority. Therefore, we start with calculating the matrix C_rows = INTT_rows(D_in) which is a baseline for the rest of the process.
From this point, we call ICICLE to perform a series of computations of MSMs, ECNTTs and vector multiplication.
We finish the KZG commitment calculation by concatenating the MSM output vector with the result of the last ECNTT. This 512 element vector is pushed to the host CPU and sent to the Ethereum network.
The interpolation matrix D is calculated using a highly parallel sub-processes of INTT, vector multiplication and NTT to result 3 sub-matrices C_rows, C_cols and C_both. These sub-matrices, together with the input matrix C, construct the interpolation matrix D.
Since matrix D is used by both the CPU and the remaining part of KZG proofs, it is necessary to copy the matrix data to the host machine and store it.
We have designed and implemented Danksharding scheme in Rust, with a simple API to call all individual building-block functions. The implementation follows the scheme and, at least for the first release, provides both an application that integrates ICICLE, and a ping-pong design between the host machine and the GPU device. This design leaves a lot of space for planned optimizations towards a vision of a full application that will run on accelerated hardware.
The design was tested for correctness using Ethereum research danksharding python reference code. Our test vectors and code to generate them can be found in the repo.
We are happy to officially open our Discord server. This is the place to meet and chat with the dev team, talk about features, bugs, redesign and basically everything around acceleration of ZKPs.
Join us: https://www.ingonyama.com/careers