Very Simple Membership Inference and Synthetic Identification in Denoising Diffusion Models

dc.contributor.advisor: Moyer, Daniel
dc.creator: Qu, Bowen
dc.date.accessioned: 2024-05-15T16:33:08Z
dc.date.available: 2024-05-15T16:33:08Z
dc.date.created: 2024-05
dc.date.issued: 2024-03-25
dc.date.submitted: May 2024
dc.identifier.uri: http://hdl.handle.net/1803/18823
dc.description.abstract: Generative models are a cornerstone of machine learning, tasked with synthesizing data that mimics the distribution of real-world examples. Among them, diffusion models such as Denoising Diffusion Probabilistic Models (DDPM) have drawn considerable attention for their ability to generate high-fidelity samples across diverse domains. However, generative models also raise security and privacy concerns. They learn complex patterns and structures present in the training data, potentially including sensitive or private information. An adversary may therefore be able to exploit a generative model to reconstruct or generate synthetic data resembling the original, posing threats such as data leakage, identity inference, and unintended disclosure of confidential information. We find that the Euclidean norm of the estimated noise in DDPM differs markedly between training, testing, and synthetic (model-generated) datasets when the noise-process timestep is manipulated. Given control over the model inputs x and t and access to the model output ε, we can infer from which set (train, test, or synthetic) a trial datapoint x was drawn with surprisingly high accuracy, using only this summary statistic and a simple threshold classifier. We present empirical results both on our own trained instances of the DDPM model and on publicly available architectures and weights. We also evaluate the approach on two widely used datasets, CelebA-HQ and ImageNet, and across varying data resolutions and training epochs, achieving consistently high accuracy.
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.subject: Membership Inference
dc.subject: Denoising Diffusion Probabilistic Models
dc.title: Very Simple Membership Inference and Synthetic Identification in Denoising Diffusion Models
dc.type: Thesis
dc.date.updated: 2024-05-15T16:33:08Z
dc.type.material: text
thesis.degree.name: MS
thesis.degree.level: Masters
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Vanderbilt University Graduate School
dc.creator.orcid: 0009-0003-5600-7295
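
The abstract describes a summary statistic (the Euclidean norm of the noise the model estimates for an input x at timestep t) followed by a simple threshold classifier. The Python sketch below is one possible reading of that idea, not the thesis implementation. The names `noise_norm`, `fit_threshold`, `predict_is_a`, and the `eps_model(x, t)` callable are hypothetical, and the toy model is only a stand-in for a trained DDPM noise predictor. The abstract does not say whether x is noised before being passed to the model, nor which set has the larger norm, so the sketch passes x directly and fits both the cut and its direction from labelled scores.

```python
import numpy as np

def noise_norm(eps_model, x, t):
    """Euclidean norm of the model's noise estimate, one value per sample."""
    eps = np.asarray(eps_model(x, t))        # assumed shape: (batch, ...)
    return np.linalg.norm(eps.reshape(len(eps), -1), axis=1)

def fit_threshold(scores_a, scores_b):
    """Choose the cut and direction that best separate two labelled score sets.

    Returns (threshold, sign); a score s is assigned to set A
    when sign * s <= sign * threshold.
    """
    scores = np.concatenate([scores_a, scores_b])
    is_a = np.concatenate([np.ones(len(scores_a), dtype=bool),
                           np.zeros(len(scores_b), dtype=bool)])
    best_acc, best_thr, best_sign = -1.0, None, 1
    for thr in np.unique(scores):            # candidate cuts: observed scores
        for sign in (1, -1):                 # try both directions
            acc = np.mean((sign * scores <= sign * thr) == is_a)
            if acc > best_acc:
                best_acc, best_thr, best_sign = acc, thr, sign
    return best_thr, best_sign

def predict_is_a(scores, threshold, sign):
    """Apply the fitted threshold rule to new scores."""
    return sign * np.asarray(scores) <= sign * threshold

# Toy stand-in for a noise predictor (assumption, not a real DDPM).
toy_eps_model = lambda x, t: 0.5 * x

set_a = np.ones((4, 2, 2))                   # e.g. calibration "train" samples
set_b = 3.0 * np.ones((4, 2, 2))             # e.g. calibration "test" samples
thr, sign = fit_threshold(noise_norm(toy_eps_model, set_a, t=10),
                          noise_norm(toy_eps_model, set_b, t=10))
```

With a real model, `eps_model` would wrap the network's forward pass, and the threshold would be fitted on held-out samples whose membership is known before being applied to trial datapoints.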

