Background/Motivation: Authorship verification (AV) is used in areas such as forensics, plagiarism detection, and fake-news detection to identify the true author of a text. The goal is to classify whether two or more texts were written by the same author (Y) or not (N). A major problem is that authors can intentionally disguise their writing style (adversarial obfuscation). Such attacks include synonym replacement, paraphrasing, machine translation, and automatic rewriting with language models. They often cause AV systems to make incorrect decisions because the superficial stylistic markers they rely on disappear. While current systems achieve high accuracy in controlled scenarios, systematic investigations of their robustness against targeted obfuscation are still lacking.
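To make the attack surface concrete, below is a minimal sketch of the simplest attack mentioned above, naive synonym replacement via WordNet. The function name, replacement probability, and use of NLTK are illustrative assumptions, not part of the project specification.

```python
# Illustrative sketch (not a prescribed implementation): a naive
# synonym-replacement attack using NLTK's WordNet.
# Requires: pip install nltk; then nltk.download('wordnet')
import random
from nltk.corpus import wordnet as wn

def synonym_obfuscate(text: str, p: float = 0.3, seed: int = 0) -> str:
    """Replace each word with a random WordNet synonym with probability p."""
    rng = random.Random(seed)
    out = []
    for word in text.split():
        # Collect candidate synonyms across all senses of the word.
        lemmas = {l.name().replace("_", " ")
                  for s in wn.synsets(word) for l in s.lemmas()}
        lemmas.discard(word)
        if lemmas and rng.random() < p:
            out.append(rng.choice(sorted(lemmas)))
        else:
            out.append(word)
    return " ".join(out)

print(synonym_obfuscate("The author concealed his style deliberately"))
```

Even such a crude attack can erase word-level style markers, which is part of why stronger paraphrasing and LLM-based rewrites are so effective against AV systems.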
Objective: The objective of this work is to investigate various style-obfuscation attacks and to develop an AV system that is as robust as possible against them. To this end, a systematic framework is to be established that (1) transforms texts with various obfuscation methods, (2) measures the impact of these attacks on common AV models, and (3) designs a robust procedure (e.g., via adversarial training or contrastive learning) that better withstands these attacks.
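As a rough illustration of step (2), the sketch below scores a text pair before and after an attack. The model 'all-MiniLM-L6-v2' is a generic sentence encoder used here as a stand-in for a real AV model, and the 0.5 decision threshold is an arbitrary assumption.

```python
# Illustrative sketch: measuring how a paraphrase attack shifts the
# same-author score of an embedding-based verifier.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in for an AV model

def av_score(text_a: str, text_b: str) -> float:
    """Cosine similarity between the embeddings of the two texts."""
    emb = model.encode([text_a, text_b], convert_to_tensor=True)
    return util.cos_sim(emb[0], emb[1]).item()

original   = "I rather suspect the committee will postpone its decision."
obfuscated = "In my view, the panel is likely to delay its ruling."  # paraphrase attack

score = av_score(original, obfuscated)
print("same author" if score > 0.5 else "different authors", f"(score={score:.2f})")
```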
Results: The work aims to show how vulnerable existing AV approaches are to different obfuscation strategies and which approaches remain particularly robust. In addition, an adversarially trained model is presented that significantly improves robustness. The results contribute to the development of secure, practically usable AV systems and provide a foundation for future research on adversarial robustness in stylometry.
Be part of change
* Implementing and evaluating style-obfuscation attacks (paraphrasing, synonym replacement, machine translation, LLM rewriting).
* Researching and implementing robust AV methods, e.g., adversarial training or contrastive learning (see the sketch after this list).
* Self-critical evaluation and comparison with baselines on benchmark datasets (e.g., PAN).
* Presenting the results and discussing the weaknesses of current methods.
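To hint at what the contrastive-learning direction from the list above could look like, here is a minimal PyTorch sketch of a triplet-style loss over author embeddings. The margin, embedding dimension, and random dummy batch are illustrative assumptions only, not a prescribed design.

```python
# Minimal sketch of the contrastive idea: pull same-author embeddings
# (including obfuscated versions) together, push different-author
# embeddings apart.
import torch
import torch.nn.functional as F

def contrastive_av_loss(anchor, positive, negative, margin: float = 0.5):
    """Triplet-style loss: anchor/positive share an author, negative does not."""
    d_pos = 1.0 - F.cosine_similarity(anchor, positive)  # same-author distance
    d_neg = 1.0 - F.cosine_similarity(anchor, negative)  # different-author distance
    return F.relu(d_pos - d_neg + margin).mean()

# Dummy batch of 16 embeddings (dim 256); in practice these would come
# from a Transformer encoder over original and obfuscated texts.
a, p, n = (torch.randn(16, 256, requires_grad=True) for _ in range(3))
loss = contrastive_av_loss(a, p, n)
loss.backward()
print(f"loss = {loss.item():.3f}")
```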
What you contribute
* Knowledge in the field of Machine Learning, ideally in the area of NLP and Transformer models.
* Good Python skills, preferably experience with PyTorch or HuggingFace.
* Scientific interest in robustness, security, and evaluation metrics in AI systems.
* Motivation to engage with adversarial attacks and modern AV approaches.
What we offer
* Flexible, self-managed working hours
* Insights into the intersection of academic research and industrial application
We value and promote the diversity of our employees' skills and therefore welcome all applications – regardless of age, gender, nationality, ethnic and social origin, religion, ideology, disability, sexual orientation and identity. Severely disabled persons are given preference in the event of equal suitability. Our tasks are diverse and adaptable – for applicants with disabilities, we work together to find solutions that best promote their abilities.
With its focus on developing key technologies that are vital for the future and enabling the commercial utilization of this work by business and industry, Fraunhofer plays a central role in the innovation process. As a pioneer and catalyst for groundbreaking developments and scientific excellence, Fraunhofer helps shape society now and in the future.
Ready for a change? Then apply now and make a difference! Once we have received your online application, you will receive an automatic confirmation of receipt. We will then get back to you as soon as possible and let you know what happens next.
Fraunhofer Institute for Secure Information Technology SIT
Requisition Number: 82694
Application Deadline: