OpenAI DevDay 2024 | Community Spotlight | LaunchDarkly
8:55

OpenAI DevDay 2024 | Community Spotlight | LaunchDarkly

OpenAI 17.12.2024 2 206 просмотров 40 лайков

Machine-readable: Markdown · JSON API · Site index

Поделиться Telegram VK Бот
Транскрипт Скачать .md
Анализ с AI
Описание видео
Social justice and prompt engineering: Large language models are only as good as the data we feed into them. Unfortunately, we haven't quite dismantled racism, sexism, and all the other -isms just yet. Given the imperfect tools that we have, how can we write LLM prompts that are less likely to reflect our own biases? In this session, Tilde will review current research about LLM prompt engineering and bias, including practical examples. You'll leave with some ideas that you can apply as both users and builders of LLM applications, to iterate towards a more equitable world.

Оглавление (2 сегментов)

Segment 1 (00:00 - 05:00)

hello thanks Kevin my name is Tilda I use they them pronouns I am a senior developer educator at launch darkley and today we're going to talk about social justice and prompt engineering because large language models have so much promise so much potential and yet they have inherited all of our human flaws because they're trained on human data so what do we do well the 's actually a lot of researchers actively digging into this right now and I'm excited to share their findings with you so I'll cover one paper from industry one from academic and kind of go into how do we apply these and then summarize the takeaways for you so let's go the first paper I'm going to talk about is from anthropic in December 2023 now there's no real scientific consensus on how to audit an algorithm for bias so these researchers have taken a page from social scientist book a popular way for social scientists to study bias and hiring is correspondence experiments where they would take the same exact resume put two different names on it names where you could infer a race and a gender from the name and send it out into the field and see who is more likely to get a call back now if you were going to do a correspondent experiment on a large language model instead of sending it a resume you would send it a prompt and similarly put in names now these researchers were investigating ating whether Claude 2. 0 showed bias when asked to make yes or no high stakes decisions about hypothetical human beings and their first point which I cannot underscore enough is that you should not use large language models to make high six decisions about human beings not yet they are not ready for that so they used the model to come up with a list of topics and then each topic got a prompt in this example it is about should we hire a person with this qualifications and then they put the demographic data which in this case is a 30-year-old white female directly in the prompt now all these prompts were designed so that a yes response was a positive outcome for the hypothetical human being in question the researchers tested both putting in the demographic data directly as you saw there or using a name that could be associated with the race or a gender now the results of this study actually surpris me which is that Claude showed positive discrimination it was more likely to give a yes response to women or non-white people however Claude showed negative age discrimination against folks over 60 years old so the researchers tried modifying their prompts by adding statements really don't discriminate really really really don't discriminate they also tried some more variations like saying affirmative action should not affect your decision instructing the model to ignore any demographic information reminding the model that discrimin mination is illegal and a combination of those last two factors which turned out to be the winner um if you look at this chart the bright yellow is a combination of reminding the model that discrimination is illegal and telling it to ignore demographic characteristics and that dropped bias the most out of everything that they tried some limitations of this study is that it does not consider every possible Vector of discrimination like size gender identity religion all that good stuff and it does not account for intersectionality which is the idea that if you're a member of multiply marginalized groups bias is multiplicative not additive and I cannot mention intersectionality without giving credit to Kimberly kensaw the brilliant theorist who invented the term so the next paper comes from Princeton University now raise your hand if you've heard of an implicit association test ah some of you probably have if you've ever taken unconscious bias training at your company so these tests were developed for humans to understand the biases that we are ourselves not even aware of and these researchers came up with a way to give implicit bias test to large language models which is pretty cool because I don't work at open AI I don't have access to you know proprietary Clos Source data but I do have access to the output so these researchers ask the models to associate words into categories and this is very similar to how you do this for a human in this prompt the models asked to choose between white or black for the following list of words and write their choice after each words so if you Associated I don't know Axe and grenade with black you might be showing implicit racism which unfortunately all of the models that were tested did they showed high levels of stereotypical bias so then the next question becomes does this bias actually impact how these models make decisions and for that the researchers wrote another round of prompts that had the potential to be discriminatory but weren't blatantly so in this prompt they were asked to

Segment 2 (05:00 - 08:00)

generate two short profiles about black and white preschoolers who lived in different neighborhoods and they were each supposed to draw one concept painful or joyful who should pick what now all these models showed bias in their decision making but the good news is it was an order of magnitude less than the implicit bias shown in the previous round of testing and what these researchers found was that asking the models to make explicit absolute decisions yes or no led to less bias than relative decisions like choose one candidate over the other which might explain anthropic results because those were extremely absolute so these researchers also tried adding a clause saying treat people from different socioeconomic statuses etc equally and that dropped the bias in GPD 4 almost in half and this is a finding that was seems to be a pattern it was replicated in one other paper that I can find so the combination of telling models to ignore demographic information um seems to be the winner now I'm going to show you how to apply this to a real life prompt one other paper that I found that I really liked was about writing reference letters and I think this is a really cool and interesting use case because people are using models to write letters of reference today and I don't think that's a bad thing so if you ask a large language model to write a letter of reference it can show gender bias emphasizing women's personal traits more and men's achievements but these researchers The Prompt that they used is shown here and it's not very detailed or specific so and they didn't try to modify it at all so I think we can do better now I ran this prompt through GPT 40 mini with some names Brad and leesa and the results that it gave me out of the gate were not too bad but it still emphasize Brad's outstanding academic performance and lisha's Leadership skills so in order to make this better I to add more relevant contextual information like the students GPA and their extracurricular activities and I'm going to instruct the model to ignore any demographic information and remind it that hey it's important to treat everybody equally so when we do this the results that we get are much more equivalent now if you want to run head-to-head tests of different models and different prompts I work at a company called launch Darkly which is a developer first feature management and experimentation platform and our AI flag make it super easy to test models and prompts against one another and you can find out more information at that at the QR code there but just to summarize what we have learned today if you take away one thing do not use large language models to make high stakes decisions about human beings and if your company is telling you to do that you need to push back for unbiased prompt engineering remind the model that discrimination is illegal and tell it not to tell it to ignore demographic characteristics tell it to make absolute rather than relative decisions if you can and anchor your prompts with relevant external data like when we passed in the GPA to the letter of reference architectural patterns like rag can be really helpful with that and blinding also isn't that effective because just like human beings large language models can infer demographics from things like your ZIP code or where you went to college so it's hard to truly mask that data now prompts are very sensitive to small changes in wording and new models are coming out at such a rapid Pace it's really hard to keep up so you're going to want to build flexibility into your architectural systems and the ability to keep testing and iterating now thank you so much here's where you can find some slides with more detailed references here where you can find me on social media or just find me during the break I would love to chat with you thank you

Другие видео автора — OpenAI

Ctrl+V

Экстракт Знаний в Telegram

Экстракты и дистилляты из лучших YouTube-каналов — сразу после публикации.

Подписаться

Дайджест Экстрактов

Лучшие методички за неделю — каждый понедельник