glow

Table of Contents

1. DETECTING HUMANS

2. SPOOFING

  • Transparency Attacks: How Imperceptible Image Layers Can Fool AI Perception
    • dataset poisoning: hide the attack in a grayscale background (hidden) layer to mislabel a collection (see the sketch after this list)
    • causes mislabeling
    • use cases: evading facial recognition and surveillance, digital watermarking, content filtering, dataset curation, automotive and drone autonomy, forensic evidence tampering, and retail product misclassification
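
  A minimal sketch of the mechanism, not the paper's exact construction: an RGBA file whose per-pixel alpha is solved so that a viewer compositing over a white page sees one grayscale image, while any pipeline that drops the alpha channel reads the hidden layer out of the raw RGB values instead.

    import numpy as np
    from PIL import Image

    def transparency_attack(visible, hidden):
        # visible, hidden: uint8 grayscale arrays of the same shape.
        # Squeeze the layers into disjoint ranges so the per-pixel
        # alpha solved below stays inside [0, 1].
        A = visible.astype(np.float64) / 2 + 128  # seen over white: [128, 255]
        B = hidden.astype(np.float64) / 2         # raw channel values: [0, 127]

        # Viewers composite over white: B*alpha + 255*(1 - alpha) = A.
        alpha = (255 - A) / (255 - B)

        rgba = np.zeros(A.shape + (4,), dtype=np.uint8)
        rgba[..., :3] = B[..., None].astype(np.uint8)   # hidden layer in RGB
        rgba[..., 3] = np.round(alpha * 255).astype(np.uint8)
        return Image.fromarray(rgba, "RGBA")

  A browser renders the visible image, while img.convert("L") or np.array(img)[..., :3] in a labeling pipeline recovers the hidden one, which is what lets a poisoned collection get mislabeled.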

2.1. VIDEO GLOW

  • PRIME: Protect Your Videos From Malicious Editing

3. DIFFUSION CENSOR

  • parent: stablediffusion
  • Ambient Diffusion: train diffusion models given only corrupted images as input (the model never sees the clean, copyrighted originals)
  • Seeing the World through Your Eyes (recovering the scene from reflections in a person's eyes)
  • LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B
    • successfully undoes the safety training using LoRA (see the sketch after this list)
  • Cheating Suffix: Targeted Attack to Text-To-Image Diffusion Models with Multi-Modal Priors
    • MMP-Attack: confuses the model into adding a target object to the image while simultaneously removing the original object
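
  The LoRA paper's point is that the completely standard parameter-efficient fine-tuning recipe is enough to strip the refusals. A minimal setup sketch with Hugging Face peft, assuming the 7B chat model stands in for the paper's 70B target; the fine-tuning data is the attacker's ingredient and is omitted here.

    from transformers import AutoModelForCausalLM
    from peft import LoraConfig, get_peft_model

    # Assumption: 7B stands in for the paper's 70B target.
    model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-chat-hf")

    lora = LoraConfig(
        r=8,                                  # rank of the low-rank update
        lora_alpha=16,
        target_modules=["q_proj", "v_proj"],  # attention projections only
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora)
    model.print_trainable_parameters()  # well under 1% of the weights train

    # A short supervised fine-tune on non-refusing completions (omitted)
    # then suffices to undo the safety behavior, per the paper.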

3.1. DETECTING AI GENERATED

  • AEROBLADE: Training-Free Detection of Latent Diffusion Images Using Autoencoder Reconstruction Error
    • does not require any training (see the sketch after this list)
  • Organic or Diffused: Can We Distinguish Human Art from AI-generated Images?
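
  AEROBLADE in a minimal sketch: an image generated by a latent diffusion model survives a round trip through that model's own autoencoder with much lower error than a real photo. Assumptions here: the sd-vae-ft-mse autoencoder as the candidate, and plain MSE where the paper measures the error with LPIPS.

    import torch
    from PIL import Image
    from torchvision import transforms
    from diffusers import AutoencoderKL

    vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse").eval()
    to_tensor = transforms.Compose(
        [transforms.Resize((512, 512)), transforms.ToTensor()]
    )

    @torch.no_grad()
    def reconstruction_error(path):
        # Scale to [-1, 1], encode to latents, decode back, compare.
        x = to_tensor(Image.open(path).convert("RGB")).unsqueeze(0) * 2 - 1
        latents = vae.encode(x).latent_dist.mode()
        recon = vae.decode(latents).sample
        return torch.mean((x - recon) ** 2).item()

    # Low error -> likely generated by a model sharing this autoencoder.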

3.2. ERASING CONCEPTS

  • erasing concepts https://note.com/gcem156/n/n9f74d7d1417c
    • using a Stable Diffusion eraser to replace a concept in one model with the same concept from another
    • Forget-Me-Not: Learning to Forget in Text-to-Image Diffusion Models
    • All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models
      • a new approach that erases the target concept while preserving the rest of the model
  • ORES: Open-vocabulary Responsible Visual Synthesis
    • synthesizes images that avoid the forbidden concepts while following the query as closely as possible
    • uses an LLM
  • One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications
    • erases or edits concepts in diffusion models (DMs) with only 0.5% extra parameters
  • EraseDiff: Erasing Data Influence in Diffusion Models
  • SepME: Separable Multi-Concept Erasure from Diffusion Models
    • avoids unlearning substantial unrelated information
  • MACE: Mass Concept Erasure in Diffusion Models
    • successfully scales the erasure scope up to 100 concepts while balancing generality and specificity
  • Robust Concept Erasure Using Task Vectors
    • concept erasure that uses Task Vectors (TVs) is more robust to unexpected user inputs
    • Diverse Inversion: used to estimate the required strength of the TV edit
      • applies the TV edit to only a subset of the model weights (see the sketch after this list)
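
  Task-vector erasure in a minimal state_dict-level sketch: fine-tune a copy of the model on the concept, take tau = theta_concept - theta_base, and subtract a scaled tau from the base weights to negate the concept (task arithmetic). Restricting the edit to a subset of parameter names is the knob that Diverse Inversion helps tune; all names here are illustrative.

    import torch

    def erase_concept(base_sd, concept_sd, alpha=1.0, keys=None):
        # base_sd: base model state_dict; concept_sd: the same model
        # fine-tuned on the concept to erase. keys: optional subset of
        # parameter names to edit (TV edit on a subset of the weights).
        edited = {}
        for name, w_base in base_sd.items():
            tau = concept_sd[name] - w_base          # task vector
            if keys is None or name in keys:
                edited[name] = w_base - alpha * tau  # negate the concept
            else:
                edited[name] = w_base.clone()
        return edited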

3.2.1. LLM

  • TOFU: A Task of Fictitious Unlearning for LLMs
    • so that it truly behaves as if it had never been trained on the forgotten data
  • Scissorhands: Scrub Data Influence via Connection Sensitivity in Networks
    • retrains the trimmed model through an optimization process
    • seeks parameters that preserve information on the remaining data while discarding information related to the forget data (see the sketch after this list)
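
  A sketch of the connection-sensitivity step, assuming a SNIP-style score |w * dL/dw| measured on the forget set; Scissorhands' exact procedure (reinitialization and its repair objective) differs in detail.

    import torch

    def connection_sensitivity(model, loss_fn, forget_batch):
        # High |w * grad| on the forget set marks connections that
        # carry the data to be scrubbed.
        model.zero_grad()
        x, y = forget_batch
        loss_fn(model(x), y).backward()
        return {
            name: (p.detach() * p.grad).abs()
            for name, p in model.named_parameters()
            if p.grad is not None
        }

    @torch.no_grad()
    def trim(model, sensitivity, fraction=0.05):
        # Zero the most forget-sensitive weights; retraining on the
        # retained data afterwards repairs utility.
        for name, p in model.named_parameters():
            if name not in sensitivity:
                continue
            s = sensitivity[name]
            k = max(1, int(fraction * s.numel()))
            thresh = s.flatten().topk(k).values.min()
            p.masked_fill_(s >= thresh, 0.0)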

3.3. FINGERPRINTING

  • and watermarking
  • CopyRNeRF: Protecting the CopyRight of Neural Radiance Fields
    • replacing the original color representation in NeRF with a watermarked color representation
  • Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust
    • patterns hidden in Fourier space (see the sketch after this list)
  • WOUAF: Weight Modulation for User Attribution and Fingerprinting in Text-to-Image Diffusion Models
    • model fingerprinting that assigns responsibility for the generated images
  • FLIRT: Feedback Loop In-context Red Teaming
    • automatic framework that exposes unsafe and inappropriate content generation and vulnerabilities
  • ZoDiac: Robust Image Watermarking using Stable Diffusion
    • injects a watermark into the trainable latent space, where it can be reliably detected in the latent vector
  • RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable Guarantees
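
  A minimal numpy sketch of the Tree-Ring idea: plant a concentric-ring key in the Fourier spectrum of the initial diffusion noise; detection inverts the sampler back to that noise (DDIM inversion, omitted here) and tests for the rings. Sizes, radii, and the key value are illustrative.

    import numpy as np

    def ring_mask(size, radii):
        yy, xx = np.indices((size, size))
        r = np.hypot(yy - size / 2, xx - size / 2)
        return np.any([np.abs(r - rad) < 1.0 for rad in radii], axis=0)

    def watermark_noise(size=64, radii=(10, 14, 18), key=50.0, seed=0):
        noise = np.random.default_rng(seed).standard_normal((size, size))
        spec = np.fft.fftshift(np.fft.fft2(noise))
        mask = ring_mask(size, radii)
        spec[mask] = key  # plant the ring key in Fourier space
        return np.real(np.fft.ifft2(np.fft.ifftshift(spec))), mask

    def detect(noise, mask, key=50.0, tol=25.0):
        spec = np.fft.fftshift(np.fft.fft2(noise))
        return np.abs(spec[mask] - key).mean() < tol

    noise, mask = watermark_noise()
    assert detect(noise, mask)  # the rings survive in Fourier space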

3.4. ANTI-GLOW

  • MimicDiffusion: Purifying Adversarial Perturbation via Mimicking Clean Diffusion Model
    • purification technique that approximates the clean image to use as input (see the sketch after this list)
  • DisDet: Exploring Detectability of Backdoor Attack on Diffusion Models
    • detects poisoned input noise; reports a 100% detection rate for Trojan triggers
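
  MimicDiffusion's purification is in the DiffPure family: drown the adversarial perturbation in forward-diffusion noise, then let the reverse process reconstruct a clean image to feed downstream. Below is a DiffPure-style sketch with diffusers (MimicDiffusion's own guidance trick is more involved); the checkpoint is an assumption.

    import torch
    from diffusers import DDPMScheduler, UNet2DModel

    # Assumption: a small unconditional DDPM stands in for the model.
    model = UNet2DModel.from_pretrained("google/ddpm-cifar10-32").eval()
    scheduler = DDPMScheduler.from_pretrained("google/ddpm-cifar10-32")

    @torch.no_grad()
    def purify(x, t_star=100):
        # x: images in [-1, 1], shape (B, 3, 32, 32), possibly adversarial.
        # Forward-diffuse to t_star so small perturbations drown in noise...
        t = torch.tensor([t_star])
        x_t = scheduler.add_noise(x, torch.randn_like(x), t)
        # ...then run the reverse process back to t = 0.
        for step in reversed(range(t_star)):
            eps = model(x_t, step).sample
            x_t = scheduler.step(eps, step, x_t).prev_sample
        return x_t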

Author: Tekakutli

Created: 2024-04-07 Sun 13:56