Schedule
December 14, 2024 - West Meeting Room 111-112
Pre-registration form: https://forms.gle/YBCwn7L8N5AxExMG7
9:00 a.m. - 9:05 a.m. PST
Opening remarks
9:05 a.m. - 9:40 a.m.
Hoda Heidari
Title: Reflections on Fairness Measurement: From Predictive to Generative AI
Abstract: In this talk, I will provide an overview of the algorithmic fairness literature, which has historically focused on predictive AI models designed to automate or assist high-stakes decisions. I will contrast that line of work with the recently growing body of benchmarks, metrics, and measures of bias and unfairness in Generative AI. I will conclude with some reflections on conceptualizing and measuring GenAI unfairness, drawing on my past work.
Invited Talk
9:40 a.m. - 9:45 a.m. Short break
9:45 a.m. - 10:20 a.m.
Kush Varshney
Title: Harm Detectors and Guardian Models for LLMs: Implementations, Uses, and Limitations
Abstract: Large language models (LLMs) are susceptible to a variety of harms, from non-faithful output to biased and toxic generations. Due to several limiting factors surrounding LLMs (training cost, API access, data availability, etc.), it may not always be feasible to impose direct safety constraints on a deployed model, so an efficient and reliable alternative is required. To this end, we present our ongoing efforts to create and deploy harm detectors and guardian models: compact classification models that provide labels for various harms. In addition to the models themselves, we discuss a wide range of uses for these detectors and guardian models, from acting as guardrails to enabling effective AI governance. We also take a deep dive into the inherent sociotechnical challenges in their development.
Invited Talk
10:20 a.m. - 10:25 a.m. Short break
10:25 a.m. - 10:30 a.m.
Alex Tamkin
Title: Evaluating and Mitigating Discrimination in Language Model Decisions
Contributed Talk
10:30 a.m. - 10:35 a.m.
To Eun Kim
Title: Towards Fair RAG: On the Impact of Fair Ranking in Retrieval-Augmented Generation
Contributed Talk
10:35 a.m. - 11:20 a.m.
Poster Session 1
11:25 a.m. - 12:00 p.m.
Seth Lazar
Title: Evaluating the Ethical Competence of LLMs
Abstract: Existing approaches to evaluating LLM ethical competence place too much emphasis on the verdicts—of permissibility and impermissibility—that they render. But ethical competence doesn’t consist in one’s judgments conforming to those of a cohort of crowdworkers. It consists in being able to identify morally relevant features, prioritise among them, associate them with reasons and weave them into a justified conclusion. We identify the limitations of existing evals for ethical competence, provide an account of moral reasoning that can ground better alternatives, and discuss the practical—and philosophical—implications if LLMs ultimately do prove to be adept moral reasoners.
Invited Talk
12:00 p.m. - 12:10 p.m. Short break
12:10 p.m. - 1:00 p.m.
Roundtables
Discussions
- Evaluation
Leads: Candace Ross (Meta AI), Tom Hartvigsen (University of Virginia)
- Metrics
Lead: Angelina Wang (Stanford)
1:00 p.m. - 2:00 p.m. Lunch break
2:00 p.m. - 2:35 p.m.
Sanmi Koyejo and Angelina Wang
Title: Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs
Abstract: Algorithmic fairness has conventionally adopted a perspective of racial color-blindness (i.e., difference-unaware treatment). We contend that in a range of important settings, group difference awareness matters. First, we present a taxonomy of such settings, including legal contexts in which it can be permissible to discriminate (e.g., Native Americans sometimes hold privileged legal status, and men are subject to the compulsory draft in America while women are not). Second, we present a benchmark suite spanning eight different settings, with a total of 16k questions, that enables us to assess difference awareness. Third, we show that difference awareness is a distinct dimension of fairness and that existing bias-mitigation strategies may backfire on this dimension.
Invited Talk
2:35 p.m. - 2:40 p.m. Short break
2:40 p.m. - 2:45 p.m.
Prakhar Ganesh
Title: Different Horses for Different Courses: Comparing Bias Mitigation Algorithms in ML
Contributed Talk
2:45 p.m. - 2:50 p.m.
Natalie Mackraz
Title: Evaluating Gender Bias Transfer between Pre-trained and Prompt Adapted Language Models
Contributed Talk
2:50 p.m. - 2:55 p.m.
Benjamin Laufer · Manish Raghavan · Solon Barocas
Title: The Search for Less Discriminatory Algorithms: Limits and Opportunities
Contributed Talk
2:55 p.m. - 3:55 p.m.
Poster Session 2
3:55 p.m. - 4:15 p.m. Break
4:15 p.m. - 4:55 p.m.
Panel: Rethinking Fairness in the Era of Large Language Models
Hoda Heidari (CMU) · Sanmi Koyejo (Stanford University) · Jessica Schrouff (Google DeepMind) · Seth Lazar (Australian National University)
Discussions
4:55 p.m. - 5:00 p.m. Short break
5:00 p.m. - 5:10 p.m.
Closing remarks
5:10 p.m. - 5:30 p.m.