The design process for a sustainable and human-centered AI system starts with respecting each person as the author of their own story.
Think back to your childhood and recall the kind of kid you were. There were likely many days, especially in your teenage years, when you would ‘try on’ entirely new personas to see what they felt like. You’d experiment with clothing, music, language, even friends.
Now ask yourself: are you the same person you were when you were little? Are you the same person you were 10 years ago? How about a year ago? How about a week ago?
The older we get, the harder it becomes for us to summarize ourselves through our current tastes or hobbies alone. We’re sensitive to our environments, to the needs of others, and to the social languages of belongingness and recognition. We’re doing our best to communicate our needs while respecting the boundaries of a constantly evolving and expanding culture.
In short, personal identity is complicated.
Now think of a time when you felt misunderstood; in particular, a time when someone misrepresented your intentions. The experience of having one’s purpose, competence, ability, or values called into question can have long-lasting and formative effects on the way we learn to build and maintain trusting and productive partnerships.
Quite often, what we need is a little more benefit of the doubt to feel like we can be ourselves.
The early internet opened up interaction modalities that addressed identity in novel and important ways. It offered the ability to take on new personas with each chat room encounter, message board discussion, or gaming session; to explore our most private thoughts and questions, often through a simple text box and a list of ten blue text links.
Ten little nonjudgmental blue links. Search engines like Yahoo! and Google became something like a virtual confessional; a transaction that returned reliably useful results for even the most taboo queries, supported by a nearly limitless supply of benefit of the doubt.
The design metaphors of search engines have evolved considerably since their inception. They’ve shifted from research tools, to “answer” engines, to something vaguely resembling the voice interface from Star Trek. The consumer tech market has evolved as well, with advances in ubiquitous connectivity making for an odd coupling with a growing awareness of issues like data privacy, ethics, and ad targeting. And as the big tech firms have largely succeeded in marketing themselves as mediators of world knowledge, many users have come to hold them to standards closer to journalistic integrity; a stark contrast to the flawed conventional wisdom that big data somehow reflects a ‘mirror of the world’ back to the public (in reality, it exacerbates the disparities between the mean and the margins).
A false premise
Machine learning is the science of making predictions based on patterns and associations that have been automatically discovered in data. Much the same way one works to develop a hunch or an intuition about something, the job of a machine learning model is to figure out just how wrong it can be in order to be as right as possible as often as possible.
Machine learning models don’t emerge from thin air, though. Every facet of their development is fueled and mediated by human judgment; from the idea to develop the model in the first place; to the sources of data chosen to train from; to the data itself and the methods and labels used to describe it; to the success criteria for the aforementioned wrongness and rightness.
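To make that “wrongness and rightness” a little more concrete, here’s a minimal sketch of the idea (the data, learning rate, and step count below are invented purely for illustration): a model “learns” by repeatedly measuring how wrong its current guess is and nudging it toward being a little less wrong.

```python
# A toy model of "figuring out how wrong it can be in order to be as right
# as possible": fit y ≈ w * x by gradient descent on squared error.
# (All data and hyperparameters here are invented for illustration.)

def train(xs, ys, steps=200, lr=0.01):
    w = 0.0  # start with no hunch at all
    for _ in range(steps):
        # Measure how wrong the current hunch is, on average...
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
        # ...and nudge the hunch so the next guess is a little less wrong.
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]  # noisy observations of roughly y = 2x
w = train(xs, ys)
```

Notice that every choice in this sketch — what counts as “wrong,” which data to learn from, when to stop — is a human judgment, exactly as described above.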
But establishing clear success criteria for personalization models is no simple task, because as we’ve established, identity isn’t easily defined.
A prevailing assumption that accompanies the productization of AI is that it’s possible to know people better than they know themselves. That if algorithms can be exposed to a sufficient amount of human data, inferences can be made about people based solely on their behavior, and those inferences can be used to achieve ideal personalization.
This leaves two troubling paths forward:
1. Presumed accuracy. Classifiers trained on inferences derived from online behaviors will almost always have a higher false positive rate for minority subgroups, in large part because of the sheer statistical challenge of attempting to build predictions from data with unequal representation. This disparate error rate results in a more robust experience for people from majority subgroups because they benefit from a higher-resolution graph of their personal traits, thereby exacerbating opportunity gaps — and the prevalence of stereotypes — for people from underrepresented subgroups.
2. “Perfect” accuracy. Sometimes referred to as the “filter bubble” effect, precision-targeted classifiers enable an unprecedented form of psychographic redlining. Even if someone can’t afford to buy that fancy handbag today; even if someone doesn’t agree with the policy position of that politician; even if someone doesn’t understand the beliefs or lifestyle of another; should they not get exposed to their existence because the targeting criteria weren’t aligned? In this model, technologists are emboldening a system where the world of some can become invisible to others; the world of the privileged to the underprivileged; the world of liberals to conservatives; women to men.
Both paths share the same faulty premise: that what people want — and need — is to only experience things they’ll “like” or that have been deemed “right” for them by someone else.
What’s at stake for tech firms
The current AI boom says as much about the tech industry’s fascination with deep learning as it does about its overconfidence in evergreen access to plentiful and genuine user data. Technologists and investors have grown entitled, treating user data as if it were a renewable resource. But consumer trust isn’t going up-and-to-the-right anymore. People are freaked out by how much can be known about them. And the rate of publicly visible missteps resulting from automation is ramping up at breakneck speed.
While we may laugh at the frailty of recommender systems that result in shoe ads that follow us around the internet from Amazon to Facebook, the same root causes pose serious risks to human well-being in higher stakes domains such as healthcare, criminal justice, welfare, and education. The trolley problem and the over-heralded coming of the singularity are ultimately distractions from the challenges already in plain sight, as biased data and flawed algorithms are impacting elections, policing, criminal sentencing, finance, jobs, and even dating.
Personalization systems in their current state are forcing users to wrestle with the ramifications of their actions without offering any clear model for how to reconcile lurking questions like “who does this system believe me to be?” or “what am I committing to becoming?”
When we’re overly conscious of what a system will think of us, we alter our behavior, and in serious cases, we regress our uniqueness back toward the cultural mean. Or we hide in incognito mode… or we stop using the system altogether (assuming we’re privileged enough to be able to do so). More generally, if we’re distrustful of how a system will operate, we’ll take fewer risks, thereby reducing the dimensionality of the potential outcome space. In either case, the resulting damage to training data quality (in particular its diversity and veracity), and thereby to useful personalization, would be irreparable.
Advertiser trust has already been significantly impacted by inflammatory associations drawn between their brand and the content it’s appeared next to. Skewed, stereotypical, or outright fake results are being discovered by users on an almost-daily basis, and tech journalists are eager to fan the flames. Meanwhile, AI-powered products, from voice recognition to robot assistants, simply can’t keep up with the grandiose claims being made about their capabilities. Tech firms are starting to look pretty stupid, and that’s only going to get worse if they continue to hinge their brand differentiation on an ability to be all-knowing and all-seeing. It’s simply too precipitous a ledge to be constantly balancing on.
A People’s Bill of Rights for Personalization
I think we’ve fundamentally misplaced the burden of proof as technologists — and honestly, as scientists. I believe a humane approach to personalization would mean treating the following as our default positions; in need of being disproved — rather than proven anew — every single time a novel technological opportunity presents itself:
- People prefer to understand the goals of a system even if it reduces their expectations of its power.
- People behave more genuinely when they control access to their personal and behavioral data.
- People prefer a certain amount of noise over “perfect accuracy” (which I’ll define as only seeing things they agree with and/or have previously shown interest in).
- People prefer to set their own goals rather than having them inferred by others, even if it means more upfront “work” is necessary to get the desired results.
Furthermore, I believe our job as technologists is to try to improve the lives of as many people as possible by augmenting their capabilities, and that in order to do so, we’ll need to pivot from a product mindset that treats users purely as consumers to one that respects them as authors of their own story.
As individuals and as an industry, we demonstrate our values through our products, not our words. To that end, I propose the following as inalienable rights for people interacting with personalized experiences, in the hope that they’ll be directly applicable to the day-to-day messy reality of building and shipping products.
The right to what’s yours
The fear of a looming permanent record is a frequently employed extrinsic motivator for social conformity, and it disproportionately affects women and young people. Fear inhibits freedom, and without a dynamic spectrum of human behaviors to learn from, learned systems will only be capable of reinforcing the conventional and normative behaviors of the past. Therefore, personal and behavioral data must remain on-device unless willfully authorized by the user, and learned systems must support the ability for users to leave no personally identifiable trace at the datacenter. (If you’re wondering how this can be done at scale for machine learning, check out the approaches introduced by Apple and Google.)
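As one illustration of how a system can learn from a population without collecting trustworthy individual records, here’s a toy randomized-response sketch, in the spirit of the local differential privacy techniques Apple has described (the survey question, probabilities, and rates below are invented, and real deployments are considerably more sophisticated):

```python
import random

random.seed(1)

def randomized_response(truth, p_truth=0.75):
    # Report honestly with probability p_truth; otherwise flip a coin.
    # The server cannot trust any individual answer, giving each person
    # plausible deniability.
    if random.random() < p_truth:
        return truth
    return random.random() < 0.5

def estimate_rate(reports, p_truth=0.75):
    # E[report] = p_truth * true_rate + (1 - p_truth) * 0.5, so invert it
    # to recover an unbiased estimate of the population rate.
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_truth) * 0.5) / p_truth

# 1,000 hypothetical users; 30% truly have some sensitive trait.
true_answers = [i < 300 for i in range(1000)]
reports = [randomized_response(a) for a in true_answers]
estimate = estimate_rate(reports)  # close to 0.30, from untrustworthy data
```

The aggregate statistic remains useful while no single report reveals anything certain about the person who sent it — the benefit of the doubt, built into the protocol.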
From consumers: How might we bring personal and behavioral data closer to datacenters?
⇢ To authors: How might we bring code closer to devices, where users control access to their personal and behavioral data?
From consumers: How might we make more accurate inferences about users?
⇢ To authors: How might we offer inferences that let the user decide what’s accurate?
The right to be who you want to be
When unasked for, a poorly tailored experience is frustrating, and a perfectly tailored experience is creepy. Furthermore, who we want to be may be different from who we have been. Therefore, learned systems must support user goals that are self-determined and impermanent.
From consumers: How might we build trust in automated systems?
⇢ To authors: How might we give users control over their representation?
From consumers: How might we use learned systems to predict behavior?
⇢ To authors: How might people use learned systems to achieve their own goals?
The right to personal safety
A system’s awareness of personal characteristics should never pose physical risks to people, either directly or through second-order effects. Therefore, learned systems must support transparency and user oversight with respect to personal characteristics that would be considered sensitive within expected contexts of use. The level of system transparency and user oversight must scale based on the severity of potential negative outcomes.
From consumers: How might we measure average performance?
⇢ To authors: How might we measure performance by subgroup, context of use, and use case?
From consumers: How might we develop short-term engagement metrics that can measure success?
⇢ To authors: How might we measure the longitudinal environmental conditions that predict success?
The right to serendipity
Personal growth is a perpetual process of observing, assessing, and calibrating one’s behaviors with respect to one’s goals, but the natural human tendency towards confirmation bias and routine can send a confusing signal to personalization models. Without structured goals around diversity of content, style, format, cadence, and performance metrics, systems will inevitably constrain, confine, and stereotype users. Therefore, learned systems must support frequent resampling of data, and treat homogeneity — in all its forms — as a bug.
From consumers: How might we develop a taxonomy of semantics?
⇢ To authors: How might we develop a taxonomy of pragmatics?
From consumers: How might AI answer our questions?
⇢ To authors: How might AI illuminate patterns in our world?
The right to multiplicity
The notion of a singular all-encompassing artificial intelligence is both deeply shortsighted and inaccurate. Human knowledge makes use of a vast and distributed array of sensory inputs — both internal and extended through environmental, social, and technological means. Therefore, learned systems must support multiple unique and collaborative intelligences.
From consumers: How might we make smarter machines?
⇢ To authors: How might we make people feel more capable?
From consumers: How might we create more engaging AI?
⇢ To authors: How might AI enable more human engagement?
The right to experiment
The quality of a human-centered system hinges on users’ sense of self-efficacy; their belief in their ability to act successfully in service of their goals. This is distinct from the oft-touted — and product-centric — goal of “earning user trust”. Affordances that let users play with the mechanics and boundaries of a system’s behavior, including the ability to identify personally meaningful reference points and ‘try before they buy’, help users build highly functional mental models. Therefore, learned systems must support the ability for users to explore multiple potential paths, either in practice or through simulation, especially when outputs of the system could be interpreted as representations of the user’s competence, values, or creativity.
From consumers: How might we build AI that’s bug-free?
⇢ To authors: How might we build AI that’s flexible to change?
From consumers: How might we train models that improve on benchmarks?
⇢ To authors: How might we train models that augment human capability?
Onward and accountable
“One of the painful things about our time is that those who feel certainty are stupid, and those with any imagination and understanding are filled with doubt and indecision.” — Bertrand Russell
It’s my hope that as human-centered practitioners, we can learn to make friends with our doubt and indecision instead of being paralyzed by it — to be unceasingly curious and optimistic. Because the universal design material that must be available to every discipline, technical and non-technical alike, is the power of asking “why?”
Josh is a Principal Design Manager at Microsoft, where he works at the intersection of product design, ethics, and artificial intelligence. He believes that human-centered design thinking can change the world for the better; that by seeking to address the needs of people — especially those at the margins — in ways that respect, restore, and augment their capabilities, we can invent forms of technological innovation that would have otherwise been invisible.
Illustrations by Karen M. Chappell. More of her beautiful work can be found at karenmariechappell.com .