Social media, news, music, shopping, and other sites all rely on recommender systems: algorithms that personalize what each individual user sees. These systems are largely driven by predictions of what each person will click, like, share, buy, and so on, usually shorthanded as “engagement.” These reactions can contain useful information about what’s important to us, but—as the existence of clickbait proves—just because we click on it doesn’t mean it’s good.
Many critics argue that platforms should not try to maximize engagement, but instead optimize for some measure of long-term value for users. Some of the people who work for these platforms agree: Meta and other social media platforms, for example, have for some time been working on incorporating more direct feedback into recommender systems.
For the past two years, we have been collaborating with Meta employees—as well as researchers from the University of Toronto, UC Berkeley, MIT, Harvard, Stanford, and KAIST, plus representatives from nonprofits and advocacy organizations—to do research that advances these efforts. This involves an experimental change to Facebook’s feed ranking—for users who choose to participate in our study—in order to make it respond to their feedback over a period of several months.
Here’s how our study, which launches later this year, will work: Over three months, we will repeatedly ask participants about their experiences on the Facebook feed using a survey that aims to measure positive experiences, including spending time online with friends and getting good advice. (Our survey is a modified version of the previously validated Online Social Support Scale.) Then we’ll try to model the relationship between what was in a participant’s feed—for example, which sources and topics they saw—and their answers over time. Using this predictive model, we’ll then run the experiment again, this time trying to select the content that we think will lead to the best outcomes over time, as measured by the recurring surveys.
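To make the mechanism concrete, here is a rough sketch (in Python) of the two-stage idea: fit a model that maps feed-content features to survey responses, then blend that model's predictions with ordinary engagement scores when ranking candidate posts. The data, variable names, and the simple linear model below are placeholders we made up for illustration; they are not the study's actual implementation or Meta's ranking code.

```python
# Illustrative sketch only: placeholder data and a simple linear model,
# not the study's real pipeline or Meta's ranking system.
import numpy as np
from sklearn.linear_model import Ridge

# Stage 1: model the relationship between what was in participants' feeds
# (e.g., exposure to particular sources and topics) and their survey answers.
feed_features = np.random.rand(500, 20)   # 500 participant-periods, 20 content features (placeholder)
survey_scores = np.random.rand(500)       # corresponding survey responses (placeholder)

outcome_model = Ridge(alpha=1.0).fit(feed_features, survey_scores)

# Stage 2: at ranking time, score candidate posts by their predicted effect on
# the survey-measured outcome, blended with the usual engagement prediction.
def rank_candidates(candidate_features, engagement_scores, weight=0.5):
    """Return candidate indices ordered by a blend of predicted
    long-term outcome and predicted engagement."""
    outcome_pred = outcome_model.predict(candidate_features)
    combined = weight * outcome_pred + (1 - weight) * engagement_scores
    return np.argsort(-combined)          # highest combined score first

candidates = np.random.rand(100, 20)      # features for 100 candidate posts (placeholder)
engagement = np.random.rand(100)          # existing engagement predictions (placeholder)
ordering = rank_candidates(candidates, engagement)
```

In the real study, the outcome model would be trained on participants' actual feed exposure and recurring survey answers rather than random placeholders, and how much weight to give the outcome prediction relative to engagement is itself a design choice.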
Our goal is to show that it’s technically possible to drive content selection algorithms by asking users about their experiences over a sustained period of time, rather than relying primarily on their immediate online reactions.
We’re not suggesting that Meta, or any other company, should prioritize the specific survey questions we’re using. There are many ways to assess the long-term impact and value of recommendations, and there isn’t yet any consensus on which metrics to use or how to balance competing goals. Rather, the goal of this collaboration is to show how, potentially, any survey measure could be used to drive content recommendations toward chosen long-term outcomes. This might be applied to any recommender system on any platform. While engagement will always be a key signal, this work will establish both the principle and the technique for incorporating other information, including longer-term consequences. If this works, it might help the entire industry build products that lead to better user experiences.
A study like ours has never been done before, in part due to serious distrust between the researchers studying how to improve recommender systems and the platforms that operate them. Our experience shows just how difficult it is to arrange such an experiment, and how important it is to do so.
The project came out of informal conversations between an independent researcher and a Meta product manager more than two years ago. We then assembled the academic team, as well as researchers from nonprofits and advocacy groups, to help keep the focus on public benefit. Perhaps we were naive, but we were taken aback by rejections from people who nevertheless agreed that we were asking valuable questions. Some organizations passed because of the communications risk, or because some of their staff argued that collaborations with Big Tech are PR efforts at best, if not outright unethical.