Probing Cross-Lingual Bias Encoding in Transformer Models

This repository provides a reproducible research framework for quantifying and analyzing gender-science bias in transformer-based language models using the Word Embedding Association Test (WEAT). The work systematically compares multiple transformer architectures, evaluates cross-lingual bias in English and Urdu, and explores how bias evolves across hidden layers of neural models.

📌 Motivation

Pretrained transformer models such as BERT, RoBERTa, and XLM-RoBERTa have significantly advanced natural language processing, but they also inherit and can amplify societal biases present in their training data. Measuring these biases is crucial to the fair and ethical deployment of language technologies. The WEAT metric, inspired by social science association tests, has become a foundational method for detecting bias in word embeddings by comparing the associations between groups of target words and groups of attribute words.

🎯 Project Goals

This repository aims to:

Implement a rigorous and reproducible pipeline for computing WEAT effect sizes from contextualized transformer hidden representations.
Compare gender-science bias across multiple transformer models (BERT, RoBERTa, DistilBERT, XLM-RoBERTa).
Extend bias evaluation to multilingual scenarios by incorporating Urdu text corpora.
Conduct layer-wise bias analysis to reveal how associations progress through neural representations.
Provide publication-ready visualizations and benchmarks for cross-lingual fairness research.

🛠 Setup & Installation

Clone the repository and set up the environment:

git clone https://github.com/adilrasheed139/Probing-Cross-Lingual-Bias-Encoding-in-Transformers

🧠 Methodology

This project uses the Word Embedding Association Test (WEAT) to quantify bias by comparing the cosine similarities between groups of target words (e.g., math vs. arts) and groups of attribute words (e.g., male vs. female). The effect size is a normalized score indicating the relative association strength. For robust evaluation, permutation testing is used to estimate statistical significance. Effect sizes and distributions are calculated for both model-level embeddings and intermediate hidden states.
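
The computation described above can be sketched as follows. This is a minimal, dependency-free illustration rather than the notebook's actual implementation: the function names (`cosine`, `association`, `weat_effect_size`, `weat_p_value`), the use of the sample standard deviation, the one-sided test, and the toy 2-D vectors are all assumptions made for the example. In practice the vectors would be hidden-state representations taken from a transformer layer.

```python
import math
import random
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, A, B):
    """Mean similarity of w to attribute set A minus its mean similarity to B."""
    return mean(cosine(w, a) for a in A) - mean(cosine(w, b) for b in B)


def weat_effect_size(X, Y, A, B):
    """Difference between the mean associations of target sets X and Y,
    normalized by the standard deviation over all target words.
    Assumption: sample standard deviation (n - 1 in the denominator)."""
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


def weat_p_value(X, Y, A, B, n_perm=1000, seed=0):
    """One-sided permutation p-value: the fraction of random equal-size
    re-partitions of the pooled target words whose test statistic is at
    least as large as the observed one."""
    rng = random.Random(seed)
    scores = [association(w, A, B) for w in X + Y]
    n_x = len(X)
    observed = sum(scores[:n_x]) - sum(scores[n_x:])
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(scores)  # random reassignment of words to the two target sets
        if sum(scores[:n_x]) - sum(scores[n_x:]) >= observed:
            hits += 1
    return hits / n_perm


# Hypothetical toy vectors, not real model embeddings.
A = [[1.0, 0.0]]                 # attribute set 1 (e.g., male terms)
B = [[0.0, 1.0]]                 # attribute set 2 (e.g., female terms)
X = [[0.9, 0.1], [0.8, 0.2]]     # target set 1 (e.g., science terms)
Y = [[0.1, 0.9], [0.2, 0.8]]     # target set 2 (e.g., arts terms)

print("effect size:", weat_effect_size(X, Y, A, B))
print("p-value:", weat_p_value(X, Y, A, B))
```

For a layer-wise analysis, the same two functions could be called once per hidden layer, passing in the word vectors extracted from that layer, which yields one effect size and one p-value per layer.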

📈 Key Results

The repository includes visualizations referenced in the research manuscript:

| Figure Name | Description |
| --- | --- |
| Bias Metric (WEAT) Effect Size In English | WEAT effect size for English transformer models. |
| Bias Metric (WEAT) Effect Size in Urdu | WEAT effect size for Urdu analysis. |
| Cross-Lingual Comparison of Gender-Science Bias | Overlay of bias progression across languages. |
| Cross-Lingual Divergence in Bias Encoding | Comparative plot showing divergence in bias dynamics. |
| English (BERT) Layer 2 Word Associations Heatmap | Cosine similarity heatmap at peak bias layer (English). |
| Urdu (XLM-R) Layer 11 Word Associations Heatmap | Heatmap for peak bias layer in Urdu. |
| Linguistic Bias Evolution through Transformer Layers | Layer-wise effect size plot. |
| WEAT Effect Size Comparison Across Models In English | Comparative bar chart of bias scores for English models. |
| WEAT Effect Size Comparison Across Models In Urdu | Comparative bar chart for Urdu models. |
| WEAT Permutation Test - Gender-Science Bias | Permutation distribution and observed score. |
| Word-Level Bias Profile English (BERT) Layer 2 | Radar plot of individual English word biases. |
| Word-Level Bias Profile Urdu (XLM-R) Layer 11 | Radar plot for individual Urdu word biases. |

All figures are included in the accompanying PDF, Cross‑Lingual Bias Dynamics in Transformer Hidden State.pdf, and are suitable for academic presentations and publications.

🧪 Reproducibility & Best Practices

This README follows established guidelines for scientific documentation by providing a clear project description, installation instructions, methodology, and usage examples. Including detailed metadata improves reproducibility and helps external users understand the structure and purpose of the project (UC Davis Data Lab).

✍️ Authorship

Probing Cross-Lingual Bias Encoding in Transformer Models
Author: Adil Rasheed
Affiliation: Government College University Faisalabad
Email: adilrasheed139@gmail.com
Status: In progress (manuscript and code development)

🎓 Citation

Please cite this work if you use it:

@unpublished{crosslingualbias,
  title={Probing Cross-Lingual Bias Encoding in Transformer Models},
  author={Rasheed, Adil},
  year={2025},
  note={Manuscript in preparation}
}

📄 License

This project is released under the MIT License – see LICENSE for details.

🧠 Acknowledgements

This research builds upon foundational work in bias detection in word embeddings and contextualized models. Core inspiration and methods derive from the original WEAT methodology used to evaluate societal bias in embeddings.
