For 15 years, Stack Overflow has been the main hub for discussions of computer programming and development. It’s where users who are facing a tricky conundrum or are hitting a wall in their code can come to ask questions of fellow users.
And historically, it has been a male-dominated space. In the organization’s annual survey of its users conducted in 2022, 92 percent of respondents identified as male, and three-quarters as white or European. The platform acknowledged then that it has “considerable work to do.”
But in 2023, Stack Overflow’s survey, published on June 13, stripped out questions about gender and race.
“I kind of would understand if they decided not to ask about people, but they still ask geography, age, developer type, years of coding, and a bunch of things about salary and education,” says Sasha Luccioni, a member of the board of Women in Machine Learning, an organization lobbying to increase awareness of, and appreciation for, women in the tech sector. “But not gender. That’s really screwed up.”
Luccioni says the decision to not collect data on gender balance—particularly after previous years showed it to be so highly skewed—is avoiding, rather than confronting, the problem. “This is very symptomatic of the tech industry,” she says. “It's not just about AI, it’s also in general. Like, who, who codes our code? Young white male people.”
In 2022, just one in four researchers who published academic papers on AI were female. An AI publication is twice as likely to have at least one man among its authors as it is to have at least one woman.
“We did not exclude demographic questions from this year's survey to skirt our responsibility there,” says Joy Liuzzo, Stack Overflow’s vice president of marketing. “We removed the demographic questions due to concerns about personally identifiable information, given the increasingly complex regulatory environment and the highly international nature of the survey.”
Liuzzo acknowledged “there's a lot of work to be done to make the field of software development more diverse and inclusive, and Stack Overflow has a big role to play in that work.” She says the organization has published a new, more inclusive code of conduct in recent weeks and has overhauled the process of asking questions on the platform. She hopes this will reduce barriers to entry, which may historically have caused underrepresented groups to shy away from the site. “We recognize there is much more to be done, and we are committed to doing the work to make change happen,” she says.
However, that’s small comfort to Kate Devlin, a reader in artificial intelligence and society at King’s College London. “It's common knowledge that tech has a gender problem,” she says. “If we are serious about increasing diversity in tech, then we need to know what the landscape looks like.” Devlin points out that it’s difficult to measure progress, or regression, without a baseline of data.
Whatever the reasons for removing key questions about who’s using the platform, the survey results—or lack of them—highlight a problem with Stack Overflow’s user demographics, and a broader issue across tech: Non-male participants are woefully underrepresented.
“Removing gender from the annual survey is an egregious erasure of the problems of the gender gap that pervade the tech industry. And worse, it removes important context for the data that is scraped and fed into large language models,” says Catherine Flick, a computing and social responsibility scholar at De Montfort University. “Without that context, the bias of the data set is unknown, and it’s well documented that gender bias is frequently built into technology, from variable names to form fields to assumptions about occupations, roles, and capabilities.”
More women than ever are taking, and gaining, degree-level qualifications in science, technology, engineering, and mathematics, according to the National Science Foundation—though the proportion of women getting undergraduate computer science degrees has dropped by nearly 20 percentage points in the past 40 years. (The share of master’s degrees in computer science going to women has increased slightly.) But even if the pipeline is being fixed, retaining women in the tech sector is tricky. Half of women who enter the industry drop out by age 35, according to data from Accenture.
The problem becomes more pressing because of tech’s ubiquity in our lives, and the way in which artificial intelligence in particular is set to be integrated into everything we do and interact with. The humans behind tech platforms make countless decisions—big and small—about their products and tools that can act to the detriment of people who are not like them.
“With non-AI code, you can debug it, get a second pair of eyes from a different demographic, and check it quite straightforwardly,” says Luccioni. “But if you have AI code, all these decisions that drove the data or the model architecture, they’re baked in.”
Take early versions of ChatGPT: The tool provided responses that suggested its belief system was hard-coded with the idea that good scientists are white men, and everyone else is not. That issue was fixed, and OpenAI CEO Sam Altman asked users to help train the model by flagging such responses in the future (marking them with a thumbs-down button), but the broader issue persists.
“Part of the legacy of those who have developed and implemented AI in the last two decades is to be partially responsible for worrisome backward steps in gender equality,” says Carissa Véliz, associate professor at the Institute for Ethics in AI at the University of Oxford.
Véliz worries that the gender imbalances in designing and coding major platforms—from social media to the new generative AI tools we’re using now—are negatively affecting how women are treated by those platforms. “From the way social media hurts women to hiring algorithms offering more opportunities to men and discriminating against women, tech bros have brought back a toxic culture that is not only bad for women, but for society at large,” she says.
Flick worries that without clear data about who is coding the tools we’re likely to use every day, the bias that will probably be encoded into them is “doomed to be replicated within the results that the LLM [large language model] produces, further entrenching it.”
It’s imperative that this changes fast, particularly when it comes to AI. “Until that happens,” Véliz says, “there is little hope that we will have ethical AI.”
Credit belongs to: www.wired.com