BRYAN PON: 41 shades of Uber

While there's probably plenty of real Uber drama that would be worthy of that scintillating title, this story is about something decidedly less sexy: experiments.

Specifically, data-driven experimentation and testing that can only occur in a digital environment. At its most basic, this kind of experimentation is often framed as A/B testing, or split testing, where a content owner or publisher tests two different versions of content--version A and version B--and then measures the response to each. Online marketers are famous for using this kind of test to evaluate different subject lines of marketing emails. They send out a few thousand emails using each version, measure the open rate or click-through rate of each, and then select the best-performing subject line to use for the rest of the emails.
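To make the email example concrete, here is a minimal sketch of how such a test might be scored. The send and open counts are invented for illustration, and the function names are hypothetical. The significance check is a standard two-proportion z-test, which is one common choice and not necessarily what any particular marketer uses.

```python
import math

def open_rate(sent: int, opened: int) -> float:
    """Fraction of delivered emails that were opened."""
    return opened / sent

def two_proportion_p_value(sent_a: int, opened_a: int,
                           sent_b: int, opened_b: int) -> float:
    """Two-sided p-value for the difference between two open rates."""
    # Pooled open rate under the assumption that A and B perform the same.
    pooled = (opened_a + opened_b) / (sent_a + sent_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_b))
    z = (opened_a / sent_a - opened_b / sent_b) / std_err
    # Standard normal CDF, expressed with the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - cdf)

# Hypothetical results: subject line -> (emails sent, emails opened).
results = {"A": (2000, 400), "B": (2000, 460)}

# Pick the subject line with the highest open rate.
winner = max(results, key=lambda name: open_rate(*results[name]))
p_value = two_proportion_p_value(*results["A"], *results["B"])

print(f"Winner: {winner}, p-value: {p_value:.4f}")
```

A small p-value suggests the gap between the two subject lines is unlikely to be noise, which is the point at which a marketer would send the winning version to the rest of the list.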

In perhaps the most famous example of this kind of experiment, Google infuriated designers everywhere when it disclosed that it had actually tested 41 different shades of blue on users in order to determine the clinically optimal hue for encouraging users to click on links. This fealty to hard data over intuition caused a lead designer to quit ("I had a recent debate over whether a border should be 3, 4 or 5 pixels wide, and was asked to prove my case. I can’t operate in an environment like that. I’ve grown tired of debating such minuscule design decisions. There are more exciting design problems in this world to tackle."), but it isn't that surprising for Google. Its core competencies have always been collecting and sorting information, and using algorithms to determine optimal outcomes.

And while this degree of quantification--41 different shades?--seems extreme, there are obvious benefits to user testing. As information and content have become digitized and served through progressively more sophisticated displays, we have had to figure out (or listen to Jakob Nielsen declare) how to apply design principles to new problems and new use cases. Things that seem obvious now--pinch to zoom, colored text is a link, menus and navigation expand to show more choices--weren't so in the early days of software and web user interfaces. Firms continue to spend a lot of resources on user experience research to determine how well potential users are able to complete specific tasks, learning in the process about the barriers, bugs, and wrong assumptions about how users will behave.

This kind of optimization is viable with digital products because the testing is typically very cheap, the feedback is close to immediate, and implementing the changes is often relatively easy. Change a few digits of the color value in Google's style sheets, and voila, the change is everywhere. Update your mobile application and push out a new version that updates all existing copies. The near-zero marginal cost of digital reproduction enables fast, cheap iteration, leading to incremental development approaches that continuously optimize.

These same characteristics of digital products are what allow Uber--essentially a software company--to conduct experiments with driver commissions. The company rolled out new tiered commissions for its drivers in San Francisco in April as part of a test to evaluate drivers' willingness to work for less. According to Forbes: "a small percentage of new UberX drivers will pay a 30% commission on their first 20 rides in a week, 25% on their next 20 rides, and then 20% on any rides beyond that. Uber is also testing the same commission in San Diego, except that the tiers are for the first 15 and next 15 rides in a week."
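The arithmetic of the tiers is worth working through. The sketch below uses the rates and tier sizes from the Forbes description; everything else (the function name, the flat $10 fare, the ride counts) is an assumption made for illustration, and Uber's actual calculation may differ.

```python
from typing import Sequence

# Commission rates from the Forbes description, in percent:
# first tier, second tier, then every ride beyond that.
RATES_PERCENT = (30, 25, 20)

def weekly_commission_cents(fares_cents: Sequence[int], tier_size: int) -> int:
    """Total commission for one week of rides, in cents.

    fares_cents lists each ride's fare in the order the rides were driven.
    tier_size is 20 for the San Francisco test and 15 for San Diego.
    """
    total_hundredths = 0
    for ride_index, fare in enumerate(fares_cents):
        # Rides 0..tier_size-1 fall in tier 0, the next tier_size in tier 1,
        # and everything after that in the last tier.
        tier = min(ride_index // tier_size, len(RATES_PERCENT) - 1)
        total_hundredths += fare * RATES_PERCENT[tier]
    # Convert from hundredths of a cent to cents, rounding to the nearest cent.
    return (total_hundredths + 50) // 100

# Hypothetical drivers, every ride costing $10.00 (1000 cents).
full_time = [1000] * 50   # 50 rides in the week
part_time = [1000] * 10   # 10 rides in the week

for label, fares in (("full-time", full_time), ("part-time", part_time)):
    commission = weekly_commission_cents(fares, tier_size=20)
    share = 100 * commission / sum(fares)
    print(f"{label}: ${commission / 100:.2f} commission ({share:.0f}% of fares)")
```

Under these assumptions the driver with 10 rides never leaves the 30% tier, while the driver with 50 rides pays a blended rate of 26%. The fewer rides a driver completes in a week, the larger the share of fares that goes to Uber.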

Because Uber tracks all aspects of driver performance and manages driver accounts completely digitally, it is relatively easy for the company to select a subset of drivers (San Francisco-based), create rules (based on rides per week), and apply the tiered commission structure directly to their paychecks. Of course, recording lots of data and using it to manage your internal operations more effectively makes business sense.

But there's a fundamental difference in what Uber is doing with this commission experimentation. It's not trying to optimize the performance or satisfaction of its users. Uber is experimenting to see how low it can set wages before too many of its drivers quit. And it can play around with wages in very granular ways to optimize to the nth degree--that is, reduce as much as possible--how much it pays different drivers. In this example, new drivers and part-time drivers are penalized with higher commissions, but we can also imagine that Uber could pay less to, say, drivers who refuse to work on Sundays, or who live in certain neighborhoods.

Some business owners may look at this situation with envy. Being able to know exactly how low you can set wages before employees quit could be a valuable cost-reduction tool. But most businesses aren't completely digital, and the cost of doing this kind of testing is prohibitive. That's a good thing. Unlike testing for end-user experience, changing people's wages (especially when payments are confusing, and many drivers don't even know how much they're getting paid) results in a more unpredictable income, which has very real negative economic consequences. From the Economic Policy Institute:

"Much of tipped employment is the epitome of “just-in-time” employment—adjusting staffing levels on an immediate basis in response to customer flows. While this may be good for the employer, it is far less beneficial for workers because it can produce highly unpredictable work hours, and thus highly unpredictable pay. Wage volatility is further exacerbated by workers’ reliance on tips from customers, which also vary considerably. A tipped worker’s paycheck can vary wildly depending on the fluctuations of customer tips and assigned shifts, making it difficult for tipped workers to budget, or make investments that require more stable and predictable income levels—such as buying a home or a car, or seeking further education."

Where does the wage experimentation end? How "optimal" can Uber become? Consider the type of experimentation Uber is most (in)famous for: its "surge" pricing model, where the cost of rides to the end-user goes up during peak demand, creating a highly dynamic market for rides. Although Uber agreed to cap surge pricing during emergencies, it still vehemently defends its practice of balancing supply and demand. What, then, would stop Uber from more aggressively experimenting with the supply side via changes in wages? Given that the majority of Uber drivers are only working part-time for Uber, it could tie wages to the national unemployment index, knowing that as unemployment ticks higher, under-employed workers are more willing to take low-paying jobs. What if a large factory closes in a city, leaving hundreds of people without employment? Uber could quickly lower wages for drivers in that city, knowing that demand for work just skyrocketed. If Uber integrates actuarial risk models, credit scores, crime statistics, and so on, it could implement dynamic, real-time wages for its drivers based on their profiles. Anti-discrimination laws protect workers from bias based on race, age, and religion. But Uber might experiment with paying less to a male driver who lives in a poor neighborhood and drives a Buick compared to a female Prius driver who lives uptown. Discrimination could be very hard to prove.

Uber executives defended surge pricing with the logic of free market choice ("Nobody is required to take an Uber"), but the reality of some situations--natural disasters, emergencies--means that choice is conditional and relative. And as the nature of the workforce shifts from full-time employment with the concomitant legal benefits to part-time contractors with fewer rights--a debate that Uber is squarely in the middle of--we need to recognize the leverage and power that digitally based businesses have in controlling wages in a more dynamic fashion than ever before. Companies like Google, Facebook, and Amazon have become extremely sophisticated in their ability to process our digital lives and tailor their offerings accordingly. When companies like Uber use this same apparatus on their workers, we need to think about whether this kind of granular experimentation and control should be regulated.


Wednesday, August 5, 2015
