A Survey of Text Watermarking in the Era of Large Language Models

Feb 1, 2024 · Aiwei Liu, Leyi Pan, Yijian Lu, Jingjing Li, Xuming Hu, Xi Zhang, Lijie Wen, Irwin King, Hui Xiong, Philip S Yu
Overview of text watermarking in the era of LLMs
Abstract
Text watermarking algorithms are crucial for protecting the copyright of textual content. Historically, their capabilities and application scenarios were limited. However, recent advancements in large language models (LLMs) have revolutionized these techniques. LLMs not only enhance text watermarking algorithms with their advanced abilities but also create a need for employing these algorithms to protect their own copyrights or prevent potential misuse. This work conducts a comprehensive survey of the current state of text watermarking technology, covering four main aspects: (1) an overview and comparison of different text watermarking techniques; (2) evaluation methods for text watermarking algorithms, including their detectability, impact on text or LLM quality, and robustness under targeted or untargeted attacks; (3) potential application scenarios for text watermarking technology; and (4) current challenges and future directions for text watermarking. This survey aims to provide researchers with a thorough understanding of text watermarking technology in the era of LLMs, thereby promoting its further advancement.
Publication: ACM Computing Surveys

This survey paper provides a comprehensive overview of text watermarking technology in the era of Large Language Models (LLMs). The work covers four main aspects:

  1. Text Watermarking Techniques: A detailed overview and comparison of different text watermarking methods, including their strengths and limitations.

  2. Evaluation Methods: Analysis of how to evaluate text watermarking algorithms, focusing on:

    • Detectability
    • Impact on text quality
    • Impact on LLM performance
    • Robustness against targeted and untargeted attacks

  3. Application Scenarios: Exploration of potential use cases for text watermarking technology, particularly in the context of LLM-generated content.

  4. Challenges and Future Directions: Discussion of current limitations and promising research directions in the field.

The survey aims to provide researchers with a thorough understanding of text watermarking technology in the LLM era, helping to advance the field and protect intellectual property in the age of AI-generated content.