Applied Sciences (Sep 2024)
<span style="font-variant: small-caps">DPTracer</span>: Integrating Log-Driven Accountability into Data Provision Networks
Abstract
Emerging applications such as blockchain, autonomous vehicles, healthcare, federated learning, self-consistent large language models (LLMs), and multi-agent LLMs increasingly rely on the reliable acquisition and provision of data from external sources. We define the multi-component networks that supply data to these applications as data provision networks (DPNs); such networks prioritize accuracy and reliability over delivery efficiency. However, the security mechanisms of DPNs, such as self-correction, are of limited effectiveness without a fine-grained log of node activities. This paper presents DPTracer, a novel logging system designed for DPNs that uses tamper-evident logging to address the challenge of maintaining a reliable log in an untrusted environment. By integrating logging and validation into the data provisioning process, DPTracer ensures comprehensive logs and continuous auditing. Our system uses a Process Tree as the data structure for storing log records and generating proofs. This structure permits validating node activities and reconstructing historical data provision processes, both of which are crucial for self-correction and for verifying data sufficiency before results are finalized. We evaluate the overheads that DPTracer introduces in computation, memory, storage, and communication. The results demonstrate that DPTracer incurs reasonable overheads, making it practical for real-world applications. In exchange for these overheads, DPTracer enhances security by protecting DPNs from both post-process and in-process tampering.
Keywords