Episode 62: Is Machine Learning the Secret to Uber’s Success?
If you think Uber is a ride-hailing business, you’re wrong. It’s actually a machine learning business. Machine learning is what makes Uber’s service possible by, among other things, helping to ensure that drivers get to you as quickly as possible. In this episode, Jon Prial talks with Michael Del Balso, Product Manager, Machine Learning at Uber. Together, they discuss how Uber is using machine learning, how machine learning is changing the company’s product management function, and what it takes to bring a product to market with machine learning everywhere.
You’ll hear about
- The use of predictions in machine learning
- Uber’s greatest challenges with machine learning
- The role of machine learning in product management
- How to stay on top of advances in machine learning
Who is Michael Del Balso?
Michael Del Balso is the Product Manager, Machine Learning at Uber in San Francisco. Prior to this role, he was at Google, where he managed the core machine learning systems that power Google’s ads auction. Before that, he worked on Maps at Google. In addition, Michael was a participant in the Canadian startup accelerator Next 36, where he co-founded Playfit, a health startup. He holds a BASc in electrical and computer engineering from the University of Toronto.
Full transcript of the episode
Jon Prial: Hi, everyone. It’s Jon Prial, and today, I’m coming to you from the corner of 2nd Avenue and 22nd Street in New York City. Yeah, there’s a little bit of street noise, but I’m doing this because today on this podcast, we’re going to be talking with Uber.
Now, I ordered an Uber, and while I’m waiting, let me tell you about our podcast. It looks like I have like two minutes before Allison picks me up in a Toyota Camry. Look, I don’t need to tell you too much about Uber. It’s become part of our vernacular.
How many companies today say they’re going to become “the Uber of,” fill in the blank. You can’t knock success. A quick data point: Since its founding in 2009, Uber has provided five billion rides in over 600 cities worldwide.
This is classic market disruption. Here in New York City, a taxi medallion — that’s the license required to deploy and run a yellow cab — well, it used to cost well north of a million dollars. Now, it sells for a fraction of that price.
Uber has paved the way for the sharing economy, which means the taxis, the taxi drivers, and all these new service providers really have become commoditized. When most people think of Uber, they think of a ride‑hailing business.
While on the surface level, that’s true, if you look a bit deeper, you’re going to see that in reality, Uber is actually a machine learning business. After all, it’s machine learning that helps Uber get that car to you as fast as possible, and ensures that that estimate of how long it will take is as accurate as can be. We’ll talk a lot more about that.
Oh, look, I think I see my car coming now. I’m going to be talking later with Michael Del Balso, a product manager of machine learning at Uber. We’ll be talking about how the company is using machine learning, how machine learning is changing the product management function, and why machine learning is such an important part of Uber’s strategy.
I’ve got to go. Catch you later. How are you doing?
Driver: Hey, there. How’s your afternoon going?
Michael Del Balso: We have information on drivers, riders, vehicles, trip‑level information. For Uber Eats, there’s meals, preparation time, trip time. Everything that you can think of, we try to measure that so we can optimize it in some way or another.
Jon: That was Michael Del Balso. He’s the product manager of machine learning at Uber. As you just heard, he’s got a lot of data that the teams at Uber are going to work with. Michael, we’re obviously talking about machine learning today.
It’s become an all‑encompassing term over the past couple of years. I think to start things off, can you just share your definition of machine learning?
Michael: Sure. Machine learning has many different definitions people use. It’s a very broad term that I don’t think has a very clear, specific definition. It’s a mix between computational statistics and mathematical optimization.
I think of it more as using inferential statistics, but focusing more on applications with a lot of data. It’s something that people are using for pattern recognition across the industry right now.
Jon: You talked about inferential. Is it fair, I often hear the term that machine learning gets into predictions? Does predictions work for you as a term?
Michael: Predictions are a really big part of it. You can break down machine learning into two different areas, two most common areas. One is supervised learning, where you’re actually trying to predict a specific value.
That would actually break down into regression and classification. Regression where you’re trying to predict, imagine at Uber we’re trying to predict, how many trips are going to happen next week? Then we’re trying to predict that exact actual number.
Then you can imagine a bank might also be trying to predict a different kind of thing, which is just a class of something. Not a specific number, but like this payment that we saw, is this fraud or not? That’s predicting a class. That’s regression and classification.
That’s when you’re predicting a specific thing, which is called supervised learning. There is also, people do what’s called unsupervised learning, where you don’t have a specific thing you’re trying to predict, but you’re trying to extract some patterns out of your data.
You have a lot of data, and we might find that wow, a lot of riders like to take trips in this kind of way. Like, they only take trips on the weekends, and other riders use Uber for commuting. It’s helpful to think of it as segmentation at times, but it’s also helpful for a lot of other use cases. There’s predictions, but then also like pattern extractions.
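The three kinds of task Michael describes here (regression, classification, and pattern extraction) can be sketched in a few lines of Python using only the standard library. Everything in this sketch is an assumption made for illustration: the numbers are made up, the function names are hypothetical, and the methods (a least-squares line, a nearest-centroid classifier, and a two-cluster k-means on one feature) are simple stand-ins, not a description of how Uber builds its models.

```python
from statistics import mean

# Regression: predict a number, such as next week's trip count.
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    return my - b * mx, b

weeks = [1, 2, 3, 4]
trips = [100, 120, 140, 160]  # made-up weekly trip counts
intercept, slope = fit_line(weeks, trips)
predicted_trips_week_5 = intercept + slope * 5

# Classification: predict a class, such as "fraud" or "ok" for a payment.
def fit_centroids(values, labels):
    """Nearest-centroid classifier: store one mean value per class."""
    return {c: mean(v for v, l in zip(values, labels) if l == c) for c in set(labels)}

def classify(centroids, value):
    """Return the class whose mean is closest to the value."""
    return min(centroids, key=lambda c: abs(value - centroids[c]))

amounts = [12, 15, 9, 950, 1200]  # made-up payment amounts
labels = ["ok", "ok", "ok", "fraud", "fraud"]
centroids = fit_centroids(amounts, labels)

# Unsupervised learning: no labels, just look for groups in the data.
def two_means(values, iterations=10):
    """1-D k-means with k=2, started at the smallest and largest value.

    Assumes the values are not all identical, so neither group is empty.
    """
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        low_group = [v for v in values if abs(v - lo) <= abs(v - hi)]
        high_group = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = mean(low_group), mean(high_group)
    return low_group, high_group

# Share of each rider's trips taken on weekends (made-up values).
weekend_share = [0.9, 0.95, 0.1, 0.05, 0.85, 0.15]
commuters, weekend_riders = two_means(weekend_share)
```

The first two functions need labeled examples (past trip counts, payments already marked as fraud or not), which is what makes them supervised. The last one only sees the riders' behavior and splits them into segments on its own.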
Jon: Then as a product manager, when you’ve got a pattern extraction, then you would be able to take what you gleaned from the data, and help evolve your product offerings?
Michael: Yeah, it’s huge. Segmentation, and any kind of analysis you can do to your data to better understand your customers is valuable for any product manager.
In this situation, we’re fortunate to have quite a bit of data. We can do finer, more granular understanding of our customers, more robust segmentation, to the point where different teams might even be interested in different types of segmentation.
We might have not just one type of partitioning of all of our riders, drivers, trips, or whatever, but many different types of partitioning based on who the customer for that data is internally, who’s really interested in this.
Jon: What do you see as some of the biggest challenges that you’re facing in doing all this work?
Michael: Our biggest challenges are trying to actually get the most out of machine learning within the company. That means identifying the places where machine learning can be helpful, and then providing the support, which can be many different things, to those teams that may be able to use machine learning, and to build these machine learning systems.
Then thirdly, to actually implement it and use it in production. There’s a lot of different kinds of challenges there, which are not only machine learning, algorithm kind of challenges, like can we build the most accurate learning model, but are things like does this team have the right people to know how to build a system like this?
Does this team have that right support from internal data infrastructure to be confident in the data that they’re basing their system on? Or does this team have the right systems to allow them to deploy this machine learning model after they’ve built it?
People like data scientists, or if you’re in graduate school learning machine learning, typically you’re going to learn how to build… You’re going to do a lot of your modeling in R, or in Python with scikit‑learn. For deep learning stuff, there’s a whole bunch of newer packages in the past couple years.
What you get out of those packages after you train those models is a model. It’s a specific model, but it doesn’t mean you have a system that’s production scaled. We find that a lot of data scientists are able to get to, “OK, I have a model that I think is accurate. Now, what do I do with it?”
We also have to give them a lot of help to say, “OK, let’s take that model, and let’s put it in this system that will deploy it to a scaled production environment that could handle making 100,000 requests or predictions per second,” for example.
That’s a very challenging engineering task. There’s a lot of engineering tasks that go beyond a machine learning algorithm component.
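To make the gap between "a model" and "a system" a little more concrete, here is a minimal Python sketch. The `EtaModel` class, its JSON format, and the `PredictionService` wrapper are hypothetical names invented for this example, and the model parameters are made up. It only shows the first small steps beyond training (saving the model as an artifact, loading it elsewhere, validating requests, and counting them); the hard parts Michael mentions, such as serving 100,000 predictions per second, are not attempted here.

```python
import json
import time


class EtaModel:
    """A toy linear model: eta_minutes = intercept + slope * distance_km."""

    def __init__(self, intercept, slope):
        self.intercept = intercept
        self.slope = slope

    def predict(self, distance_km):
        return self.intercept + self.slope * distance_km

    def to_json(self):
        """Serialize the trained parameters so another process can load them."""
        return json.dumps({"intercept": self.intercept, "slope": self.slope})

    @classmethod
    def from_json(cls, text):
        params = json.loads(text)
        return cls(params["intercept"], params["slope"])


class PredictionService:
    """Wraps a loaded model with the checks a caller-facing system needs."""

    def __init__(self, model_json):
        self.model = EtaModel.from_json(model_json)
        self.request_count = 0
        self.total_seconds = 0.0

    def handle(self, request):
        """Validate one request, predict, and record basic usage metrics."""
        start = time.perf_counter()
        distance = request.get("distance_km")
        is_number = isinstance(distance, (int, float)) and not isinstance(distance, bool)
        if not is_number or distance < 0:
            response = {"ok": False, "error": "distance_km must be a non-negative number"}
        else:
            response = {"ok": True, "eta_minutes": self.model.predict(distance)}
        self.request_count += 1
        self.total_seconds += time.perf_counter() - start
        return response


# What the data scientist hands over: just the trained parameters.
artifact = EtaModel(intercept=2.0, slope=1.5).to_json()

# What production needs: something that loads the artifact and answers requests.
service = PredictionService(artifact)
good = service.handle({"distance_km": 4})
bad = service.handle({"distance_km": "far"})
```

Even in this toy version, most of the code is about everything around the model rather than the model itself, which is the point being made in the conversation.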
Jon: We talked a little bit about aligning and the business results. If we could take a step back, then, and bring up that piece of it, you’ve held machine learning product management roles. First, you were at Google, and now at Uber. Can you tell me how you evolved and came to be a product manager?
Michael: I think I actually have a, I don’t know if you want to say conventional or really unconventional way of becoming a PM, which is that most product managers become product managers after they’ve worked as an engineer in industry for a couple years.
Maybe they do an MBA or something, and then they take a role as a product manager. It’s not super rare, but it’s rather uncommon for product managers to start as PMs right out of college. I went to the University of Toronto, and I was finishing up my electrical engineering degree there.
Google called me up, and they wanted to interview me for their associate product management program, which is a rotational program where they hire approximately 30 people per year into this program.
It’s like a PM training thing, where they give you some mentorship and special resources, like talks from entrepreneurs and very senior product managers. Then they place you on a 12‑month rotation on one team, and then you rotate onto another team after that. Actually, in between, they take you on a, it’s called the APM trip.
It’s like a famous trip within Google, and it’s been replicated in other Silicon Valley companies as well, where we visited various Google offices around the world to understand the challenges of that market, what’s unique about that market, what is specific about Google’s products in that market.
That’s a two‑week trip. That was a really solid introduction to product management for me. I stayed at Google for three and a half years. My first rotation was on Maps and Maps data. I moved my second rotation onto Google ads, the AdWords, the auction machine learning that goes into the auction.
That’s where I really got a solid introduction to best practices of machine learning, state‑of‑the‑art machine learning in production, how to build super reliable, scalable systems. We push the boundaries on what are best practices in machine learning.
These teams have been doing similar types of things for years before this new machine learning wave. They had some maturity in their processes that it’s hard for teams who are newer at this to gain. These teams, like Google’s ads auction teams, have a lot financially on the line. Having a reliable and stable system is a very high priority for a lot of these things.
Jon: Talking a little bit about the older folks and the newer folks, it’s very clear, and I’m an old programmer. Machine learning, AI, it’s really changing how software is built, and what we focus on. You talked a lot about it scaling in different ways.
In the old days, there were objectives. Somebody had to write software algorithms against objectives. It was a very deterministic model. Of course, now machine learning is probabilistic, and the data is driving where we’re going to go with it.
Yet we still need to make sure it’s quite relevant from both a performance perspective, as well as from a business objectives perspective. How do you see machine learning changing what you think about as a product manager?
Michael: It’s interesting, I have a slightly different product management role than most product managers, because I’m focusing on tools to help people build machine learning systems easily. I’m not focusing as much on a specific end customer product.
However, there still is the core objective. We will choose something that we’re optimizing for. We’re trying to estimate how long a trip will take, because we want to tell you your car’s going to be there in five minutes.
We can measure how accurate we are, and try to improve our models to be more and more accurate with that error over time.
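One common way to put a number on "how accurate we are" is mean absolute error: the average gap, in minutes, between the predicted and the actual arrival time. The interview does not say which metric Uber uses, so the choice of metric is an assumption, and the values below are made up.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predictions and observed values."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty lists of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(predicted)


actual_eta = [6, 7, 5, 9]       # minutes the car really took (made up)
predicted_eta = [5, 7, 3, 10]   # what an earlier model said
improved_eta = [6, 7, 4, 9]     # what a later model said

old_error = mean_absolute_error(predicted_eta, actual_eta)
new_error = mean_absolute_error(improved_eta, actual_eta)
```

Tracking a number like this over time is what lets a team say whether a new model is actually better than the one it replaces.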
Jon: As I listen to this, it sounds that you’re a little bit of a center of excellence to help the teams really get things right. Is that what’s relevant? You publicly announced this Michelangelo platform, your internal platform to deliver machine learning as a service. What can you tell us about how that works for you?
Michael: Michelangelo is our internal machine learning platform which is, I guess you could say, our primary tool for building and deploying machine learning systems at scale within Uber. We’ve been working on that for about two years now.
I can go into some detail on that platform, but I think the concept of that platform is intended to empower people, data scientists and engineers at Uber, to be able to make use of machine learning. You need a lot of help to actually run a machine learning system in production.
This is intended to be that missing infrastructure piece that will get your system from, “OK, I have this model that I trained in R,” to, “OK, now, I have a machine learning system that’s running in production.”
Jon: As this evolved, and obviously, you’re evolving to be a machine learning‑first world, how do you personally stay on top of the field? You’ve got smart people to talk to. You play with the technology. Obviously, you’re doing all of the above.
What does it take for you then to not just have Michelangelo evolve, because obviously there’ll be things you want to put in the platform, but they have to be ready, mature? How do you stay on top of it, and how do you decide what to pass down to the teams?
Michael: It’s really not something that one person can do. For me to stay on top of what’s going on in the industry, I’m going to conferences. I know a lot of people in different parts of industry and academia, and keep maintaining a network, following other announcements that are happening from other machine learning‑focused companies.
Also, literally just trying to re‑implement papers on my own. I write machine learning code for fun just to try to re‑implement things, and make sure I actually understand it. I’m someone who has to do things to actually understand them properly.
However, at Uber, I cannot just dictate a road map by myself. It’s possible for me to do that, but that’s not the right way to do things. It’s not going to lead to the right things being built. I have a set of internal customers who are the teams who are trying to build machine learning systems.
I am really trying to forge strong relationships with them, and open lines of communication so they can communicate to me what their priorities are, what they’re interested in building, and what they need, so then I can prioritize that in our platform, so the platform can be as valuable as possible to them.
Jon: You definitely have to keep the developers happy.
Michael: When you’re building something that people actually want, it’s going to be much easier for you than trying to tell people, “No, trust me. This is going to be good. You just have to try it.” When you don’t have to sell things, then your whole effort is going to go much smoother.
That’s a way that I can figure out what I should be building internally, is I understand what people are looking for. It’s stuff that I don’t really have to sell to them. It’s stuff that I feel like will go over better within the company.
Jon: It could be top‑down. You figure out the cool stuff. In the meantime, you’ve got to keep your pulse on the bottoms‑up, and see what the lab guys are doing as well to support them. Do you feel there was an evolution at Uber? Was Uber always machine learning first, or did that change over time?
Michael: I think we are relying increasingly on machine learning. That’s not to say that there wasn’t machine learning two‑plus years ago, but the scale to which machine learning is starting to pervade every single corner of Uber is really impressive.
I think the folks at Uber, all the individual teams have really bought into the value of machine learning. That’s gone really well.
Jon: That’s great. I think the hard question is, what role do you see humans playing over the long term? From my perspective, it’s obvious that we can get the algorithms to get it right, or get it right for the most part, but there’s elements of the court of public opinion.
Every once in a while, you’ve bumped into a little bit of court of public opinion in terms of surge pricing. I don’t even want to go down the Sydney example. What’s your view in terms of the human and business strategy thought as it ties to machine learning, and what you’re asking of the teams?
Michael: Ultimately, these are just computer programs. In some sense, a machine learning model’s learning to behave on its own, but it’s up to the engineer, data scientist who’s building these systems to set constraints within which that model can operate.
Beyond your specific examples, there’s any number of ways that, if you build a very complex machine learning model, it could behave in ways that you’re not expecting. That could be a situation in which your prediction is way off from…
Maybe you see some problem in your upstream data, like there was some outage. Something happened that your system had never seen before, and then you don’t really have any guarantees on how your model responds to that.
Your model is essentially operating in an undefined way. In that sense, it’s similar to other software systems, where you have to maintain really tight monitoring, alerting, and all of the typical software engineering best practices, hygiene kind of stuff to ensure that your system is operating the way you want it to be.
There’s always going to be a human in the loop in that sense to provide that level of guarantees that business objectives are being met with these systems. Machine learning is definitely not just a train a model, deploy it, set it, and forget it, and it’s going to do what I want forever.
There’s times when the world changes, and your model gets old. You need to retrain it, for example. It’s a system that requires maintenance over time, certainly.
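The idea of setting constraints within which a model can operate, and alerting when it sees something it was never trained on, can be sketched roughly as follows. The `GuardedModel` wrapper, its parameter names, and all the limits are hypothetical and chosen only for illustration; the interview does not describe how Uber implements such checks.

```python
class GuardedModel:
    """Wraps a prediction function with input and output limits."""

    def __init__(self, model, feature_min, feature_max, output_min, output_max, fallback):
        self.model = model
        self.feature_min = feature_min
        self.feature_max = feature_max
        self.output_min = output_min
        self.output_max = output_max
        self.fallback = fallback
        self.alerts = []  # stand-in for a real monitoring and alerting system

    def predict(self, x):
        # Inputs outside the range seen in training: do not trust the model.
        if not (self.feature_min <= x <= self.feature_max):
            self.alerts.append(f"input {x} outside training range, used fallback")
            return self.fallback
        y = self.model(x)
        # Keep the output inside limits chosen by the people who own the system.
        clamped = min(max(y, self.output_min), self.output_max)
        if clamped != y:
            self.alerts.append(f"prediction {y} clamped to {clamped}")
        return clamped


def toy_eta_model(distance_km):
    """Made-up model: ETA in minutes from trip distance."""
    return 2.0 + 1.5 * distance_km


guarded = GuardedModel(
    toy_eta_model,
    feature_min=0, feature_max=50,   # distances seen during training (assumed)
    output_min=1.0, output_max=60.0,  # ETAs considered reasonable to show (assumed)
    fallback=15.0,
)

normal = guarded.predict(4)      # in range, passes through unchanged
capped = guarded.predict(40)     # model says 62 minutes, limited to 60
unseen = guarded.predict(500)    # far outside training data, fallback used
```

The alerts list is where the human in the loop comes in: someone still has to look at why the model hit its limits and decide what to do about it.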
Jon: It’s training and continuous learning. They’re separate. You get the training right. You build as many edge cases as you can to challenge the model. Once you’re done, you’re not done.
Michael: Right. You know that there’s certain systems where you train a model once, and then you just put it into production. You begin using that model. There’s a risk there, which is imagine we’re trying to predict how long it’s going to take for a car to get to you.
Right after we launch that model, Yonge Street, for example, has construction on it for the next six months. All of that model’s estimates of how fast a car can go up Yonge Street are going to be incorrect now, because it’s not going to have a proper understanding of how long it takes to go up Yonge Street.
Another thing that we can do is regularly retrain the model, where we would say, let’s every day look at the data from yesterday, and every hour, look at the data from the past hour to understand if there are new traffic jams, or new construction in different places, and update the model accordingly.
We have faster‑to‑adapt predictions, but the more you go in that direction, the tighter your needs are for automated quality checks. There’s even more maintenance that you have to do, because you’re not going to have a human look over the model at that time every single hour that you retrain that model.
Jon: There’s two degrees. I absolutely get the fact that I got to automate that. When Yonge Street goes red, and you can’t get cars up and down Yonge Street, you got to divert the drivers. What about the human, higher level of things?
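A rough sketch of regular retraining with an automated quality check might look like this. The "model" is deliberately trivial (an average minutes-per-kilometre rate), and the function names, data, and error threshold are all assumptions made for illustration. The point is only the shape of the loop: train a candidate on recent data, test it on held-out trips, and promote it only if it passes.

```python
from statistics import mean


def train(trips):
    """'Training' here is just the mean minutes-per-km over (km, minutes) pairs."""
    return mean(minutes / km for km, minutes in trips)


def error(rate, trips):
    """Mean absolute error, in minutes, of a rate on a set of trips."""
    return mean(abs(rate * km - minutes) for km, minutes in trips)


def retrain_if_better(current_rate, recent_trips, holdout_trips, max_error):
    """Promote a freshly trained candidate only if it passes the quality gate."""
    candidate = train(recent_trips)
    candidate_error = error(candidate, holdout_trips)
    current_error = error(current_rate, holdout_trips)
    if candidate_error <= max_error and candidate_error < current_error:
        return candidate, "promoted"
    return current_rate, "kept current model"


current_rate = 2.0                          # learned before construction started
holdout = [(3, 12), (4, 16)]                # recent trips held back for checking

# Normal case: recent trips show that travel has slowed down.
recent = [(2, 8), (5, 20), (1, 4)]
rate_after_update, status = retrain_if_better(current_rate, recent, holdout, max_error=2.0)

# Bad upstream data (for example, a logging bug): the gate rejects the candidate.
corrupted = [(2, 800), (5, 2000)]
rate_after_bad_data, bad_status = retrain_if_better(current_rate, corrupted, holdout, max_error=2.0)
```

Because no person reviews each hourly retrain, the gate in `retrain_if_better` is doing the job a human reviewer would otherwise do, which is why the checks need to get stricter as retraining gets more frequent.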
Is that something different, where humans need to make sure at a more holistic level, the system’s operating the way you want it to look, and how do you capture that?
Michael: It’s different for every single use case, but it is very important to have someone who understands how these models operate, deeply understands the business problem, and can look at things holistically.
Often, that role falls to me, a product manager, because data scientists, they’re the ones who build these models, for example. They maybe have spent the last couple of weeks deeply with their heads in the problem, thinking about the problem in terms of, “OK, this is this prediction I have to make. How accurate can I be in this prediction?”
Then there’s the other side of the whole problem, which is the team that may not know much about machine learning, but just wants to have some system that tells them how long a trip’s going to take, for example.
They may not understand the nuances of how a machine learning model could go wrong in all different kinds of weird situations, edge cases. Having someone that can bridge the gap there can bring to light a lot of potential problems.
It’s very critical to have that kind of end‑to‑end thinking with a machine learning project, otherwise you can find yourself in trouble pretty quickly.
Jon: Excellent. That’s a great way to wrap it. Michael Del Balso, thank you so much for taking the time with us today. It’s been a pleasure chatting.
Michael: Thanks very much. Happy to be here.