ACID Test: A Benchmark for Cultural Safety and Alignment in LALMs
IIT Jodhpur • databases@iab-rubric.org

This page contains resources for the Audio Cultural Intelligence Dataset (ACID Benchmark), the first comprehensive multilingual benchmark for evaluating cultural safety and alignment in Large Audio Language Models (LALMs).


Overview

The ACID Benchmark investigates cultural preferences, sensitivities, and failures in LALMs under real-world multilingual scenarios. It extends text-based cultural harm analyses into the audio domain, focusing on audio-driven bias and sensitivity, output quality, and fairness across ten languages (Arabic, Bengali, English, French, Gujarati, Hindi, Russian, Telugu, Turkish, and Vietnamese) and twelve societal dimensions (e.g., Ethics, Religion, Security, Politics, Happiness).


Figures

Figure 1: Multicultural & Multilingual Interactions
Depicts user queries in 10 languages, highlighting the cross-cultural complexity and challenges faced by audio AI models.

Figure 2: Societal Dimensions Across Cultures
Shows the distribution of 12 societal themes (Corruption, Economic Value, Ethics, Religion, etc.) mapped across 11 cultures in the benchmark.

Figure 3: Qualitative Results
Responses and failures for culturally sensitive audio prompts, including dangerous compliance, default transcription, and misinterpretation by state-of-the-art LALMs.

Figure 1 (Supp.): Distribution of Human Evaluation Ratings
Histograms of overall rating, translation accuracy, fluency, audio quality, and speaker confidence (rated 1-5), summarizing reviewer trends in the human study.

Figure 2 (Supp.): Audio Quality by Language
Average human ratings per language for translation, fluency, audio quality, and speaker confidence, with insights into speed and delivery style (e.g., monotone, neutral, excited).

Figure 3 (Supp.): Speech Delivery Styles & Speed
Distribution of delivery styles (formal, monotone, neutral, excited, emotional) and speech-speed categorizations (too fast, just right, too slow) as judged by human raters.

Dataset Details

ACID Benchmark includes:

  • 10 Languages: Arabic, Bengali, English, French, Gujarati, Hindi, Russian, Telugu, Turkish, Vietnamese.
  • 1315+ Hours: Multilingual audio-text pairs covering twelve societal dimensions (based on the World Values Survey).
  • Three Main Sets:
    • Set A: Cultural harm evaluation (11,620 samples)
    • Set B: Contextual sensitivity (77,860 samples)
    • Set C: Alignment/preference optimization (300,000 prompt-response pairs per language)
  • Robust Curation: BLEU scoring for translation quality, Whisper-v3 and DNS-MOS for transcription and audio quality, and human evaluation for subjective validation; a minimal sketch of the BLEU gate is shown below.
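
Below is a minimal sketch of the BLEU-based translation gate, assuming sacrebleu is installed; the threshold and function name are illustrative, not the benchmark's exact configuration:

    import sacrebleu

    BLEU_THRESHOLD = 30.0  # hypothetical cutoff; the benchmark's exact value may differ

    def keep_translation(hypothesis: str, reference: str) -> bool:
        """Keep a translated sample only if it clears the BLEU cutoff."""
        score = sacrebleu.sentence_bleu(hypothesis, [reference]).score  # 0-100 scale
        return score >= BLEU_THRESHOLD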

Experimental Setup

  • Model Evaluation: Safety is scored with Llama Guard, relevance with Qwen3 embeddings, and sentiment with TabularisAI; experiments run on A100/V100 GPUs. A rough sketch of the relevance step follows this list.
  • Baseline Models: LTU, LTU-AS, GAMA, Pengi, MERaLiON, Qwen-Audio, SALMONN, Audio Flamingo-V1, Audio Flamingo-V2; plus Gemini-2.5 Flash and GPT-4o Mini.
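
The relevance step can be sketched roughly as below, embedding a prompt and response and taking their cosine similarity; the checkpoint name is an assumption, and the paper's exact setup may differ:

    from sentence_transformers import SentenceTransformer, util

    # Assumed Qwen3 embedding checkpoint; substitute the model actually used.
    model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")

    def relevance(prompt: str, response: str) -> float:
        """Cosine similarity of prompt/response embeddings (higher = more relevant)."""
        emb = model.encode([prompt, response], convert_to_tensor=True)
        return util.cos_sim(emb[0], emb[1]).item()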

Human Evaluation

A structured human study (see supplementary) assessed both translation and audio quality.

Study Design:

  • 25 participants with diverse age and gender representation
  • Metrics:
    • Translation accuracy, fluency, clarity
    • Audio quality, pronunciation, speed, intonation, speaker confidence
    • Overall rating, qualitative comments

Key Findings:

  • Ratings were predominantly 4s and 5s for translation accuracy/fluency and audio quality, with very few negative comments, suggesting the generated translations and audio are robust in grammar and semantics.
  • Minor issues included isolated mispronunciations and rare word-choice errors (the “last mile” of translation).
  • Delivery styles were often “monotone” but generally acceptable; speed was usually “just right,” with occasional “too fast” or “too slow” judgments (a toy aggregation of these ratings is sketched below).
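
For concreteness, the per-language averages behind supplementary Figure 2 could be computed along these lines; the file name and column names here are hypothetical:

    import pandas as pd

    # One row per (participant, sample) judgment; schema is assumed.
    ratings = pd.read_csv("human_eval_ratings.csv")
    aspects = ["translation_accuracy", "fluency", "audio_quality", "speaker_confidence"]

    # Mean 1-5 rating per language for each aspect.
    per_language = ratings.groupby("language")[aspects].mean().round(2)
    print(per_language)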

Figures (Supp.):

  • Figure 1: Histograms of rating distributions
  • Figure 2: Average ratings per language for translation/audio/speaker aspects
  • Figure 3: Distribution of delivery styles (neutral, formal, excited, monotone) and pacing

Qualitative Analysis

As shown in Figure 3 (Qualitative Results), models exhibit typical failures (a toy detector for the transcription-default case is sketched after this list):

  • Dangerous Compliance: Models answer harmful or biased prompts outright (e.g., MERaLiON in Spanish).
  • Transcription Defaults: Some models (LTU-AS) simply parrot back the question or transcribe the audio instead of answering.
  • Misinterpretation: Others (Audio Flamingo Chat, Qwen-Audio) respond with generic labels (“Speech,” “Male spoken”) and do not address the actual query.
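
A toy heuristic for flagging the transcription-default failure, under the assumption that a near-verbatim copy of the input transcript indicates parroting (the cutoff is illustrative):

    from difflib import SequenceMatcher

    def is_transcription_default(transcript: str, response: str, cutoff: float = 0.9) -> bool:
        """Flag responses that are near-verbatim copies of the input transcript."""
        ratio = SequenceMatcher(None, transcript.lower(), response.lower()).ratio()
        return ratio >= cutoff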

License and Download

  • To obtain access to the dataset, please email the duly filled license agreement to databases@iab-rubric.org with the subject line "Licence agreement for ACID-Benchmark dataset".
  • The license agreement must be signed by someone with the legal authority to sign on behalf of the institution, such as the head of the institution or the registrar. Agreements signed by anyone else will not be processed.
  • The dataset can be downloaded from the following link: Coming Soon

Citation

Please cite the AAAI 2026 ACID Benchmark paper: Coming Soon


Acknowledgments

Special thanks to all contributors to the datasets and models, and to the human study participants. The ACID Benchmark builds on earlier frameworks for cultural evaluation and extends the cultural safety paradigm to audio-language models.