Dianxin kexue (Aug 2023)
Watermark embedding and detection based on generative causal language model
Abstract
Text produced as artificial intelligence generated content (AIGC) carries ethical and legal compliance risks, and the circulation of such text needs to be regulated. There is therefore an urgent need for copyright protection of AIGC text. Watermarking is currently the most widely used method of digital copyright protection. A watermark embedding technique is proposed for text generated by generative causal language models. The technique adopts an in-process approach, implicitly embedding the text watermark during text generation. Compared with traditional post-process watermark embedding, it has less impact on the quality of the generated text and offers advantages such as low perceptibility, transparency, and robustness. The proposed method is loosely coupled with existing models: it requires no changes to the original model structure, training strategy, or deployment method, and it does not increase the computational cost of the original generation process. Experimental results show that the proposed embedding strategy is robust and that the watermark can still be detected effectively after the text has undergone a certain degree of editing by users.
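The abstract does not specify the embedding algorithm, so the following is only a minimal sketch of one well-known style of in-process watermarking (a keyed "green list" bias on next-token logits), not the authors' method. The toy model, the vocabulary size, the parameters `GAMMA` and `DELTA`, and every function name here are assumptions made for illustration. It shows the two properties the abstract emphasizes: the generator only has its output logits adjusted, and detection needs only the token sequence and the key.

```python
import hashlib
import math
import random

VOCAB_SIZE = 50   # toy vocabulary (assumption)
GAMMA = 0.5       # fraction of the vocabulary marked "green" at each step (assumption)
DELTA = 4.0       # logit bonus given to green tokens (assumption)


def green_list(prev_token, key):
    """Derive a keyed pseudo-random subset of the vocabulary from the previous token."""
    digest = hashlib.sha256(f"{key}:{prev_token}".encode()).digest()
    rng = random.Random(int.from_bytes(digest[:8], "big"))
    ids = list(range(VOCAB_SIZE))
    rng.shuffle(ids)
    return set(ids[: int(GAMMA * VOCAB_SIZE)])


def toy_logits(rng):
    """Hypothetical stand-in for a causal language model's next-token logits."""
    return [rng.gauss(0.0, 1.0) for _ in range(VOCAB_SIZE)]


def sample(logits, rng):
    """Sample one token id from softmax(logits)."""
    top = max(logits)
    weights = [math.exp(x - top) for x in logits]
    return rng.choices(range(VOCAB_SIZE), weights=weights, k=1)[0]


def generate(length, key=None, seed=0):
    """Generate token ids; if a key is given, bias sampling toward green tokens."""
    rng = random.Random(seed)
    tokens = [0]  # fixed start token
    for _ in range(length):
        logits = toy_logits(rng)
        if key is not None:
            for tok in green_list(tokens[-1], key):
                logits[tok] += DELTA  # the model itself is left untouched
        tokens.append(sample(logits, rng))
    return tokens


def detect_z(tokens, key):
    """z-score of the green-token count against the no-watermark expectation."""
    n = len(tokens) - 1
    hits = sum(cur in green_list(prev, key) for prev, cur in zip(tokens, tokens[1:]))
    return (hits - GAMMA * n) / math.sqrt(n * GAMMA * (1 - GAMMA))
```

In this sketch, `detect_z(generate(200, key="secret"), "secret")` should come out far above the score for text generated without the key. Replacing a minority of tokens lowers the score without erasing it, which is the kind of robustness to editing that the abstract describes.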