How WorkFusion Used Differential Privacy
to Solve Its Cold-Start Problem
Robotic process automation company WorkFusion had an amazing product. The problem was that, because of privacy concerns, it could take up to six months to onboard new clients and start driving revenue. The company needed a way to solve its cold-start problem so that it could begin delivering immediate results for its customers.
WorkFusion helps companies digitize their operations by automating what have traditionally been highly manual tasks. Invoice processing is a great example. To process invoices, you typically need to have someone read them, identify the relevant data they contain and enter that data into a spreadsheet. It’s a labor-intensive job and one that costs time and money. All the more so for big companies that can receive tens or even hundreds of thousands of invoices every month.
Certain there was a better way, WorkFusion set out to automate processes like this. Using optical character recognition (OCR) technology, the company built a software product capable of reading invoices, among other documents, and extracting key data points from them. While the technology it created was perfect for invoice processing, it’s equally helpful in a variety of other applications in industries ranging from financial services to healthcare and commerce.
Although the task is trivial for a human, developing software to do it was a significant machine learning challenge. That’s in part because every company uses a different template for its invoices, which makes analyzing them far more difficult. To meet the challenge, the machine learning model WorkFusion built needed to be able to look at any invoice and accurately tag the relevant data it contained while filtering out all of the noise.
Although WorkFusion was able to develop a machine learning model to accomplish the task, the team was left facing a significant problem. For it to work, the model needed to be trained with a large number of invoices. And, since WorkFusion’s customers viewed sharing their invoices with others as a potential risk, the model had to be individually trained for each customer — a process that could take up to six months. Not only was that delay less than ideal for customers, it also meant that WorkFusion had to wait up to half of a year to start generating revenue from each new customer that it brought on board.
Overcoming the Cold-Start Problem
After leading a $35 million Series D investment in WorkFusion in early 2017, Georgian Partners learned about the challenge the robotic process automation company was facing. It was a classic cold-start problem, where the team simply didn’t have enough data at the outset of a customer relationship to be able to offer their solution. It’s also one that Georgian Partners had already helped some of its other portfolio companies solve using a cutting-edge technology called differential privacy.
Differential privacy provides a mathematical definition of the privacy loss that individuals incur when their private information is used to create a data product. This makes it possible to quantify, and prove, how private a machine learning model or query function is.
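To make the idea concrete, here is a minimal sketch of the Laplace mechanism, a standard textbook way to answer a counting query with a differential privacy guarantee. It is an illustration only and not the method WorkFusion or Georgian Partners used, which this case study does not describe. The function names, the invoice records and the choice of epsilon are all hypothetical.

```python
import random


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) noise.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon: float, rng: random.Random) -> float:
    """Return a noisy count of the records that match `predicate`.

    Adding or removing one record changes a count by at most 1
    (sensitivity 1), so Laplace noise with scale 1/epsilon gives
    epsilon-differential privacy for this single query.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical invoice records, for illustration only.
    invoices = [{"amount": a} for a in (50, 1500, 2500, 30, 9000)]
    rng = random.Random()
    noisy = private_count(invoices, lambda r: r["amount"] > 1000, epsilon=0.5, rng=rng)
    print(f"Noisy count of invoices over 1000: {noisy:.2f}")
```

The parameter epsilon is the privacy loss the definition measures: a smaller epsilon adds more noise and gives a stronger guarantee, while a larger epsilon gives more accurate answers with a weaker one.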
“We knew that we could use differential privacy to aggregate all of the invoice data that WorkFusion was collecting while guaranteeing its customers’ privacy,” says Ji Chao Zhang, Director of Software Engineering on the Georgian Partners Impact team. “That way we could eliminate WorkFusion’s cold-start problem and help the company start to deliver immediate results to its customers.”
Delivering Meaningful Results
Over the course of three months, Zhang and other members of the Impact team partnered with WorkFusion, using a variety of deep learning, generative and differential privacy techniques. First, the team developed a deep learning model that helped improve the performance of WorkFusion’s existing machine learning model, increased the efficiency of WorkFusion’s data science team, and ultimately allowed the company to automate new processes within its customer base.
Next, the team developed a generative data solution that automatically generates labeled input data for training certain machine learning models. Instead of needing thousands of invoices to train those models, the company could achieve the same level of model accuracy with just a handful of invoices. As a result, training could now be completed in a matter of weeks rather than several months, thereby accelerating client onboarding.
Finally, by adopting Georgian Partners’ differentially private products, WorkFusion has been able to incorporate differential privacy into its AI solutions and transfer key learnings across customers for a large number of processes, while providing provable privacy guarantees. As a result, the company is now able to onboard new customers in just a matter of days.
“The increase in speed and efficiency that our machine learning model has gained thanks to the Georgian Impact team has been phenomenal,” says Andrew Volkov, WorkFusion’s CTO. “We’ve been able to go from an onboarding process that took up to six months to one that can be completed almost instantaneously, and with reasonable performance that’s under our typical tolerance rate.”
The Impact team delivered its code to WorkFusion in the summer of 2017, and the company has since put it into production. The early results are incredibly positive. One major financial services customer is so enamored with the new capability that it has decided to expand its relationship with WorkFusion after just one year.
“We’re confident that’s just the start,” says Volkov. “The upgrades that we have been able to make to our solution with Georgian Partners’ help have been an absolute game changer.”