Get my drift? Catching LLM Task Drift with Activation Deltas
Sahar Abdelnabi*, Aideen Fay*, Giovanni Cherubin, Ahmed Salem, Mario Fritz, Andrew Paverd. SaTML'25
[Paper] [Code]
You can also find my articles on my Google Scholar profile.
Sahar Abdelnabi*, Aideen Fay*, Giovanni Cherubin, Ahmed Salem, Mario Fritz, Andrew Paverd. SaTML'25
[Paper] [Code]
Ivaxi Sheth, Sahar Abdelnabi, Mario Fritz. Causality and Large Models NeurIPS'24 Workshop
[Paper]
Egor Zverev, Sahar Abdelnabi, Soroush Tabesh, Mario Fritz, Christoph H. Lampert. Secure and Trustworthy Large Language Models ICLR'24 Workshop
[Paper]
Edoardo Debenedetti, Javier Rando, Daniel Paleka, Silaghi Fineas Florin, Dragos Albastroiu, Niv Cohen, Yuval Lemberg, Reshmi Ghosh, Rui Wen, Ahmed Salem, Giovanni Cherubin, Santiago Zanella-Beguelin, Robin Schmid, Victor Klemm, Takahiro Miki, Chenhao Li, Stefan Kraft, Mario Fritz, Florian Tramèr, Sahar Abdelnabi, Lea Schönherr. NeurIPS'24 (datasets and benchmarks - Spotlight)
[Paper]
Sahar Abdelnabi, Amr Gomaa, Sarath Sivaprasad, Lea Schönherr, Mario Fritz. NeurIPS'24 (datasets and benchmarks)
[Paper] [Code]
Kai Greshake*, Sahar Abdelnabi*, Shailesh Mishra, Christoph Endres, Thorsten Holz, Mario Fritz. AISec'23 workshop (co-located with CCS; oral presentation; Best Paper Award)
[Paper] [Code]
Sahar Abdelnabi and Mario Fritz. USENIX Security'23
[Paper] [Code]
Sahar Abdelnabi, Rakibul Hasan, and Mario Fritz. CVPR'22
[Paper] [Video] [Code] [Page]
Ning Yu*, Vladislav Skripniuk*, Sahar Abdelnabi, and Mario Fritz. ICCV'21 (Oral)
[Paper] [Video] [Code]
Sahar Abdelnabi and Mario Fritz. Moving Target Defense Workshop, in conjunction with CCS'21
[Paper] [Code]
Sahar Abdelnabi and Mario Fritz. S&P'21
[Paper] [Video] [Short Video] [Code]