Publications

You can also find my articles on my Google Scholar profile.

Get my drift? Catching LLM Task Drift with Activation Deltas

Sahar Abdelnabi*, Aideen Fay*, Giovanni Cherubin, Ahmed Salem, Mario Fritz, Andrew Paverd. SaTML'25

[Paper] [Code]

Hypothesizing Missing Causal Variables with LLMs

Ivaxi Sheth, Sahar Abdelnabi, Mario Fritz. Causality and Large Models NeurIPS'24 Workshop

[Paper]

Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?

Egor Zverev, Sahar Abdelnabi, Soroush Tabesh, Mario Fritz, Christoph H. Lampert. Secure and Trustworthy Large Language Models ICLR'24 Workshop

[Paper]

Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition

Edoardo Debenedetti, Javier Rando, Daniel Paleka, Silaghi Fineas Florin, Dragos Albastroiu, Niv Cohen, Yuval Lemberg, Reshmi Ghosh, Rui Wen, Ahmed Salem, Giovanni Cherubin, Santiago Zanella-Beguelin, Robin Schmid, Victor Klemm, Takahiro Miki, Chenhao Li, Stefan Kraft, Mario Fritz, Florian Tramèr, Sahar Abdelnabi, Lea Schönherr. NeurIPS'24 (Datasets and Benchmarks track, Spotlight)

[Paper]

Cooperation, Competition, and Maliciousness: LLM-Stakeholders Interactive Negotiation

Sahar Abdelnabi, Amr Gomaa, Sarath Sivaprasad, Lea Schönherr, Mario Fritz. NeurIPS'24 (Datasets and Benchmarks track)

[Paper] [Code]

Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection

Kai Greshake*, Sahar Abdelnabi*, Shailesh Mishra, Christoph Endres, Thorsten Holz, Mario Fritz. AISec'23 workshop (co-located with CCS; oral presentation; Best Paper Award)

[Paper] [Code]

Fact-Saboteurs: A Taxonomy of Evidence Manipulation Attacks against Fact-Verification Systems

Sahar Abdelnabi and Mario Fritz. USENIX Security'23

[Paper] [Code]

Open-Domain, Content-based, Multi-modal Fact-checking of Out-of-Context Images via Online Resources

Sahar Abdelnabi, Rakibul Hasan, and Mario Fritz. CVPR'22

[Paper] [Video] [Code] [Page]

Artificial Fingerprinting for Generative Models: Rooting Deepfake Attribution in Training Data

Ning Yu*, Vladislav Skripniuk*, Sahar Abdelnabi, and Mario Fritz. ICCV'21 (Oral)

[Paper] [Video] [Code]

What’s in the box?!: Deflecting Adversarial Attacks by Randomly Deploying Adversarially-Disjoint Models

Sahar Abdelnabi and Mario Fritz. Moving Target Defense Workshop, in conjunction with CCS'21

[Paper] [Code]

VisualPhishNet: Zero-Day Phishing Website Detection by Visual Similarity

Sahar Abdelnabi, Katharina Krombholz, and Mario Fritz. CCS'20

[Paper] [Video] [Code] [Page]