Provable Watermark Extraction

zkDL++ is a novel framework designed for provable AI. Leveraging zkDL++, we address a key challenge in generative AI watermarking: Maintaining privacy while ensuring provability. By enhancing the watermarking system developed by Meta, zkDL++ solves the problem of needing to keep watermark extractors private to avoid attacks, offering a more secure solution. Beyond watermarking, zkDL++ proves the integrity of any deep neural network (DNN) with high efficiency. In this post, we outline our approach, evaluate its performance, and propose avenues for further optimization.

Introduction

Stable Signature is a watermarking (WM) scheme introduced by Meta AI to allow for identification of images generated by their latent diffusion model (LDM). It uses a Convolutional Neural Network(CNN) to extract the WM, and has the advantage of being very robust to image tampering, allowing for detection even if the image was filtered, cropped, compressed, etc. However, Stable Signature suffers from the drawback of having to keep the extractor model private, as exposing it makes the WM susceptible to attacks. This drawback can be significant if at any point there is a controversy about ownership of a certain image. While Meta, or any other company using such a WM method, can run the extractor and see that a certain image was generated by their model, they cannot prove it to a third party such as social media users, unless they expose the weights of the extractor model. Such exposure may lead to obsolescence of the WM system.

We propose using the emerging technology of Zero-Knowledge Proof (ZKP) as a solution to this problem. In a ZKP scheme, a prover wishes to convince a verifier that they have knowledge of a witness 𝑊, which obeys a certain functional relation 𝐹(𝑋,𝑊) = 0, without revealing 𝑊. The relation 𝐹
is generally represented by an arithmetic circuit and is public. The variable 𝑋 contains public input and output to the circuit. In the case of Stable Signature, the relation 𝐹 is the architecture of the WM extractor, with its output compared to the key, and it is satisfied if-and-only-if the extracted WM from the input image matches the key. The public variable

includes the image and the Boolean output, and the private witness 𝑊 = (𝑤,𝑘) includes the weights of the extractor and the key as is depicted in the diagram above. A verifier can then run a verification algorithm that tests the proof and returns True if the relation is indeed satisfied, and False otherwise.

An additional challenge is ensuring that the weights weren’t deliberately chosen to produce a specific result for a given image. Fortunately, this issue can be easily resolved by the following:

Real-world application of provable watermarking extraction

According to Meta, the end goal is to be able to label AI generated content on social platforms. Technologies like Stable Signature can significantly enhance detection accuracy by embedding robust watermarks into AI-generated media. Meta can add these watermarks to content generated by their models, but users cannot verify them independently because the extractor model’s weights must remain private. Moreover, this challenge is compounded by the fact that each company, like Google, has its own proprietary watermarking system that is incompatible with others.

ZKPs complement perfectly the shortcomings of such a system. For AI-generated images (the same approach applies to audio as well), Meta can attach a proof of watermark extraction, akin to a digital signature, without revealing sensitive model details. Clients can then run ZK verification for each image in their feed, which enables automatic labeling of AI-generated content. Additionally, other companies like Google can integrate their own watermark extraction proofs, allowing the system to distinguish between different AI models.

Meta will handle the resource-intensive task of running the prover, while clients — such as mobile devices — can handle the lightweight verification. Since the watermark is robust against transformations (like cropping or compression), Meta can rerun the prover to account for any transformations and produce new proofs, as demonstrated with approaches like VIMz.

Integrity Proofs for DNNs

In Meta’s Stable Signature, the watermark extractor is a CNN specifically designed to analyze watermarked images. Let’s focus on images sized
128 × 128 pixels. The diagram below shows the architecture of Meta’s extractor DNN, which consists of 9 convolutional layers, an average pooling layer, and a fully connected layer. This model contains approximately
38 million parameters.

Current industry solutions for proving DNN inference often rely on zkVMs. However, zkVMs are not application-specific and introduce unnecessary complexity, making them less suitable for this task. In contrast, two recent academic works, zkCNN and zkDL, propose SNARK frameworks tailored to proving DNN inference, both based on the Sumcheck protocol.

As shown in the diagram below, zkCNN leverages the GKR protocol, while zkDL independently proves each layer of the DNN. The trade-offs between the two are notable:

zkCNN commits only to the inputs and outputs of the network, whereas zkDL requires commitment to all intermediate inter-layer values.
However, zkDL allows for concurrent processing of layer proofs, while zkCNN must process them sequentially.
Another advantage of zkDL is its flexibility — each layer-proof can use a different proving system, as long as shared commitments are maintained.

In the next section, we outline how we integrated concepts from both zkDL and zkCNN to develop a proof-of-concept for a DNN-SNARK framework, which we call zkDL++.

Before diving into our implementation, let’s first compare the proof construction run-times for a network of two fully-connected layers. This analysis evaluates zkDL/zkDL++, Jolt zkVM, and SP1 zkVM. As illustrated in the diagram below, the input to the network of size 𝐿 is reduced to size 𝐿/4 in the first fully-connected layer, which then passes through a second fully-connected layer that produces an output of size 𝐿/8. Both inputs and network parameters are randomized uniformly.

We start by implementing the model in all three frameworks. We sample the input vector as a random byte-vector and arithmetize the fully-connected layers computation (i.e. matrix-vector products) in the three frameworks. This is followed by proof generation and benchmarking. The code for our evaluation is available at zkDL++ (available soon), Jolt, and SP1.

The comparison shows that zkDL/zkDL++ outperforms both Jolt and SP1 in prover run-time performance, especially as input sizes increase. While Jolt and SP1 suffer significant performance degradation with larger inputs, zkDL/zkDL++ maintains relatively low run-times, making it ideal for systems that demand high performance and scalability. For example, when the input size is 2¹⁰, zkDL generates a proof around 200 times faster than SP1, as is presented in the prover run-time graph below.

In this scenario, the number of model parameters is 0.3 million — still much lower than a typical DNN model. In such cases, zkDL++ would likely be the only framework capable of practically generating proofs of integrity.

It is worth noting all the experiments were performed on the same hardware configuration, which consists of a powerful CPU (13th Gen Intel Core i9–13900K) and a high-end GPU (NVIDIA GeForce RTX 4080). Although the machine is equipped with a capable GPU, the experiments were not specifically optimized to fully utilize the GPU’s potential. GPU optimization could potentially yield faster results, but that wasn’t the focus of our experiments.

zkDL++

zkDL++ can be viewed as a hybrid of zkDL and zkCNN, designed to optimize both approaches. While zkDL and zkCNN represent two extremes — proving each layer independently versus proving the entire network as a whole — zkDL++ strikes a balance between them. This flexibility allows for high computational parallelism while maintaining a manageable commitment cost.

Our proof-of-concept is built on a fork of zkDL. Initially, zkDL only supported primitives for fully connected and ReLU layers. By adding support for convolutional and average pooling layers, we were able to construct a SNARK for the watermark extractor DNN. In the following sections, we present the run-time results of our proof-of-concept, delve into the mathematical foundations of the proving primitives, and conclude with a discussion on limitations and future directions.

Results and Methodology

We run an end-to-end SNARK prover to prove the correctness of the inference of the extractor DNN. The computation is split into the following steps.

Preprocessing: We commit to the parameters of the DNN, i.e. the weights and biases of the trained DNN model. Once the model is trained, this is a one-time computation.
Inference: We run the inference on randomized input (of size 3 × 128 × 128) with the extractor DNN. In the context of SNARKs, this step is referred to as witness generation.
Witness commitments: We start generating the proof by first committing to the witnesses. Specifically, we commit to the input and output of each layer individually.
Sumcheck proofs: We then compute the sumcheck proofs for each layer that attests to the correctness of each layer individually. This is followed by the computation of opening proofs for the witness polynomials in each layer.

We benchmark the zkDL++ framework on the same hardware as described in the previous section. The following table shows the run-times of each of these steps.

The total time for generating the proof is approximately 5.4 minutes. Next, we discuss the overview of our techniques. The objective is to arithmetize the primitives used in DNNs.

Convolution

To keep things concise, we’ll focus on the arithmetization of 1D convolution. Let 𝑢 and 𝑣 be vectors of length 𝑛, representing the input and output of the convolution, respectively, and let 𝑓 be a filter of length 2𝑡+1.

We aim to represent this convolution between the input 𝑢 and the filter 𝑓 as a polynomial equation. Let 𝑝ᵤ(𝑥) and 𝑝𝒻(𝑥) be the polynomials corresponding to the coefficients of 𝑢 and 𝑓, respectively as

Multiplying 𝑝ᵤ(𝑥) and 𝑝𝒻(𝑥) yields the convolution of 𝑢 and 𝑓. To visualize this, consider the following example from Wikipedia.

When multiplying two polynomials 𝑃 and 𝑄, summing terms of the same color gives the convolution. Note, however, that tail terms appear as remnants from the multiplication, which are not part of the convolution. Using this approach, we obtain

Average Pooling

Matrix multiplication can be efficiently proven using sumcheck as demonstrated here by Justin Thaler. Therefore, the correctness of the average pooling operation can be verified using sumcheck.However, a minor challenge arises as the verifier would need to evaluate a bi-variate polynomial, corresponding to the constant matrix in the above equation, on a challenge

which implies linear work for the verifier. To avoid this and ensure efficient verification, we can exploit the structure of the matrix as

The prover can leverage this tensor product structure to construct a more efficient sumcheck since it is essentially a multiplication of two multi-linear polynomials in different variables.

Fully-Connected

A fully-connected layer applies a weight matrix on an input vector to generate an output vector. Let 𝑢 be the input vector and 𝑣 be the output vector, as shown in the diagram below.

This operation is represented as a matrix-vector product as

As noted in the average pooling case, we can use the sumcheck technique for matrix-multiplication to prove the correctness of the fully-connected layer. Note that the weight matrix is committed to as a part of pre-processing.

ReLU

The zkDL paper gives a technique to arithmetise the ReLU function as a sumcheck instance. Since ReLU is a non-linear operation, we need a slightly different approach in its arithmetisation. For the purpose of exposition, we will demonstrate arithmetisation for a single ReLU function. Let 𝑢 be the input to the ReLU function and 𝑣 be its output. The ReLU function is applied on a scaled version of the input as follows

Limitations and Future work

The current proof generation process, taking approximately 5.4 minutes, can appear resource-intensive when scaled for content production. A significant portion of this time is spent on two key tasks: computing witness commitments and generating sumcheck proofs. At present, these operations are performed sequentially, which introduces inefficiencies that can be addressed.

Sequential Witness Commitments: Proof generation involves computing commitments for 13 witness vectors, corresponding to 9 convolution layers, 1 average pooling layer, 1 fully connected layer, and the input and output. These commitments are currently calculated layer by layer in a linear fashion, starting from the input and progressing to the output. This sequential process creates bottlenecks, particularly when dealing with large models. Parallelizing the computation of these witness commitments is a straightforward and promising area for improvement.
Batching Sumcheck Proofs: Each layer also requires the generation of sumcheck proofs, which independently verify different properties of that layer. Presently, sumcheck proofs are computed one after the other, further contributing to the extended proof generation time. There is scope to parallelize the sumcheck proofs across layers, as they are independent. Additionally, within each layer, multiple sumcheck proofs are needed to prove various statements. Batching these sumcheck computations — both within individual layers and across different layers — can help reduce redundancy and improve overall performance.
Choice of commitment scheme: The current implementation uses the Hyrax multi-linear commitment scheme for witness commitments and sumcheck opening proofs. While Hyrax is effective, exploring alternative schemes such as Zeromorph could be beneficial, particularly for applications where evaluation proofs need to be zero-knowledge. Zeromorph offers potential advantages in terms of performance and security in zero-knowledge settings.
Choice of the field: The current setup uses the BLS12–381 elliptic curve and its corresponding scalar field for computations. While this curve is widely adopted for cryptographic purposes, exploring alternative fields, such as Mersenne primes or Baby Bear (and their extensions), could provide speedups in the proof generation process.

There are several promising avenues for improving the efficiency of proof generation. Parallelizing witness commitments and sumcheck proofs, exploring alternative commitment schemes, and experimenting with smaller fields represent significant opportunities for future optimization.

What’s Next?

In this work, we demonstrated zkDL++ with a real-world example involving a CNN. However, the zkDL++ framework is flexible and can be extended to support additional layers, making it applicable to any deep neural network (DNN). In our proof of concept for private watermark extraction, we achieved prover runtimes in the range of minutes for a standard CNN. With further optimizations, we believe this can be reduced to seconds, meeting the performance requirements for large-scale deployment by companies like Meta.

Our code is production-ready and available for testing. We encourage collaborators, researchers, and developers to explore zkDL++ with us and help advance this powerful framework.

Acknowledgements

We would like to thank the authors of the zkDL and Stable Signature papers. Special thanks goes to Pierre Fernandez and Haochen Sun ❤️.

‍