
Inside the data science behind Workday Insight Applications

In a Workday Rising Q&A, the data scientists behind Workday's new predictive analytics apps talk shop about machine learning and the challenges of categorization.

SAN FRANCISCO -- Workday Inc. next year plans to offer Insight Applications, which will be aimed at helping managers predict business trends and make better HR and finance decisions.

The applications make predictions partly by examining data such as employee skills that are highly sought by competitors. An "intelligent" engine called SYMAN maps and classifies Workday data and uses predictive analytics to make recommendations. The software, for example, would help pinpoint employees who are at risk of leaving and recommend ways to retain them.

Workday, based in Pleasanton, Calif., obtained the technical expertise for Insight Applications in February when it purchased San Francisco-based Identified. Two executives from Identified, Mohammad Sabah and Adeyemi "Ade" Ajao, are now Workday's director of data science and vice president of technology product strategy, respectively.

SearchFinancialApplications editors sat down with them at the Workday Rising conference.

To what extent will Workday's data scientists provide services to users, helping them design or even perform some of the analysis? To what degree is that automated?

Adeyemi Ajao: The product is completely automated. It's really software and it should work independently of the data [and] independently of the client. However, it is in the design of the product -- before we actually build any software -- where we want to make sure we really have a lot of client impact. So the way we have been coming out with the products is by actually sitting Mohammad and the data science team together with clients doing what are called design partner agreements, in which we will do our consulting [and] build a version of the product. Once we do this for two or three clients, now we are productizing and you can literally turn it on.

You are productizing things that you have a pretty good idea other customers can use?

Ajao: Yes. For example, with the retention-risk model, we actually do have data with numbers … [for] a particular client of employees [who] we have predicted were going to leave and actually have left the company. We want to make sure before we say, 'OK, something is ready,' [that] we actually have a pretty good idea that it is working.

Mohammad Sabah: Just to expand a bit on that, we don't expect data scientists on this site, or on the site of the customer, to be involved at all. The software is going to be seamless. It expects some data. To keep it very simple, we expect to have a spreadsheet of 100 columns. You fill it out for us, or it can point us to where the data is. We separate that, either through Workday or through some other external system. We have a tool … that does that for us.

Here's the benefit of it: Amy [Gannaway, senior director of worldwide HRIS at VMware, in an analytics session at the conference] mentioned that one of the things that really impressed her ... was the accuracy with which we were able to predict, with the data that we have, the retention risk of employees. And we actually validated that, using data from the past, seeing what [would happen] over the next three years. We were able to predict pretty accurately what these employees are going to do.

Ajao: This is not a change that is going to happen overnight. We think of this as the beginning of a conversation; the beginning of helping data be more present in decisions. We have a long, long way to go.

Can you talk about some other directions in applying data science to the financial side of the Workday data set?

Sabah: Financials is really a very rich space, even richer than HR, not just in the volume of transactions but also in the use cases that you can derive from it and the problems you can solve. In the case of expenses, imagine a big company. You have hundreds of expense reports that are getting assigned to managers for approval. How about if we automatically figure out if there is an anomalous transaction here or not? Based on that, you can really streamline the process. And also, that can lead you to proactively point out, what is the expense that I am going to incur over the next year, over the next quarter and so on? You can predict that with a high degree of accuracy.
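The kind of automatic screening Sabah describes for expense reports can be illustrated with a small sketch. The interview does not say which method Workday uses; the example below assumes a common textbook technique, the modified z-score based on the median and the median absolute deviation (MAD), and the function name, threshold and sample amounts are all hypothetical.

```python
from statistics import median

def flag_anomalies(amounts, threshold=3.5):
    """Return indexes of amounts whose modified z-score exceeds the threshold.

    Assumption: a median/MAD score stands in for whatever model a real
    product would use. The 3.5 cutoff is a conventional rule of thumb.
    """
    med = median(amounts)
    mad = median([abs(a - med) for a in amounts])
    if mad == 0:
        # No spread in the data, so nothing can be scored as unusual.
        return []
    return [
        i for i, a in enumerate(amounts)
        if 0.6745 * abs(a - med) / mad > threshold
    ]

# Hypothetical expense amounts submitted to one manager.
expenses = [42.0, 38.5, 51.0, 47.25, 40.0, 44.9, 980.0, 39.75]
flagged = flag_anomalies(expenses)
print(flagged)
```

In this made-up sample only the 980.0 entry is flagged, so an approver could concentrate on that one report rather than reviewing all eight.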


For collection, the use case is … if you are a B2B business, is this customer going to pay on time? Instead of taking steps after the fact, you actually can predict ... looking at the social factors ... if he is not going to be on time, you can start the process six months earlier than the deadline. It's like looking into the future and really giving recommendations.

These are two examples, or two prototypes. I would say this is like the tip of the iceberg, just like retention risk …  is the tip of the iceberg. You will find out there are more layers and there are more interesting challenges, problems that you can solve in finance. Customer churn is another example.

Ajao: Another thing to point out on finance is that you are going to see, in general, the Insight Applications … answering more and more strategic questions. One of the key interesting overlaps between HR and financials is revenue-generating people and what happens when you start analyzing the impact of retention of revenue-generating people in your financials, and how that can help you with strategic decisions, product decisions, customer decisions or even cash-flow management.

Ade, in the keynote you talked about the classification problem being the source of the most difficult work. Why is it so difficult?

Ajao: Natural language. We are really facing the same problem that the people working on [Apple's] Siri are facing when they are trying to get [it] to understand my accent and why my accent in English is actually so difficult to get, because for our computers my slight mispronunciation of "R" is actually difficult. Let's say that I meet someone at Workday [who] tells me [he's] an analyst. I know that he is an engineer -- he is probably a system analyst. I meet someone at Goldman Sachs [who] tells me he is an analyst and then I watch [The Bourne Identity] movie, where he is a CIA analyst. I have context and as a human, I know roughly what I am looking at. Computers do not have that awareness. For them to have that awareness, we would be talking about artificial intelligence.

Sabah: This generally comes under a category of algorithms and machine-learning reduction. The reason that is so important is [because] there is a term called "curse of dimensionality" in machine learning. What happens is, if you have so many factors to learn from and not enough data, you actually will be learning noise. [To] give you an example: In the first use case, Ade and I engaged with this customer, a big medical-device manufacturer, the very first week of our acquisition, where we took in 25 years of history. That particular company had more than 10,000 job titles. If you look at the transitions that are happening between one job title and another job title, it's like one or two or five or 10. The moment you apply SYMAN, you reduce those 10,000 or 5,000 to, like, 10% of that, 400 or 500. Now you have much more density. It leads to better quality … [and] all the learning you do on top of it … it's much better and much more robust.
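Sabah's point about density can be shown with a toy example. The sketch below is not SYMAN, whose internals the interview does not describe. It assumes a deliberately crude rule, stripping seniority words from job titles, and uses invented titles, simply to show how collapsing many raw titles into fewer categories concentrates the same transition data into fewer, better-populated buckets.

```python
import re
from collections import Counter

# Hypothetical list of seniority and level words to ignore. A real
# classifier would learn the mapping instead of relying on a fixed list.
MODIFIERS = {"senior", "sr", "junior", "jr", "lead", "principal", "i", "ii", "iii"}

def canonical_title(raw):
    """Lowercase a title and drop seniority words and punctuation."""
    tokens = re.findall(r"[a-z]+", raw.lower())
    return " ".join(t for t in tokens if t not in MODIFIERS)

def transition_counts(moves, normalize=False):
    """Count (old title, new title) pairs, optionally after normalizing."""
    counts = Counter()
    for old, new in moves:
        if normalize:
            old, new = canonical_title(old), canonical_title(new)
        counts[(old, new)] += 1
    return counts

# Invented job changes; each raw pair occurs exactly once.
moves = [
    ("Sr. Software Engineer", "Engineering Manager"),
    ("Software Engineer II", "Engineering Manager"),
    ("Senior Software Engineer", "Lead Engineering Manager"),
    ("Financial Analyst I", "Finance Manager"),
    ("Sr Financial Analyst", "Finance Manager"),
]

raw = transition_counts(moves)
reduced = transition_counts(moves, normalize=True)
print(len(raw), "raw transition types;", len(reduced), "after normalizing")
```

Before normalizing, every transition type in this sample has been observed only once, which is the sparse situation Sabah describes. Afterward, the same five moves fall into two transition types with several observations each, giving any model built on top more evidence per category.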

Are there any other ways SYMAN employs machine learning to improve over time?

Sabah: Let's say you are using Workday and you are a recruiter. You ask for this Java developer, using SYMAN. As you are taking action on those results, if you like a candidate or if you do not like a candidate, the system is learning all the time … Just like on Netflix, if you play a movie five minutes as opposed to 60 minutes, there is a huge difference. It learns that and gives you better recommendations. Same here.
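The feedback loop Sabah describes, in which a recruiter's likes and dislikes reshape later results, can be sketched in a few lines. This is an illustration under stated assumptions, not Workday's algorithm: the class name, the additive per-skill weighting and the sample candidates are all hypothetical.

```python
from collections import defaultdict

class FeedbackRanker:
    """Toy ranker that nudges per-skill weights after each recruiter action."""

    def __init__(self):
        # Skill -> learned preference weight; unseen skills count as 0.
        self.weights = defaultdict(float)

    def record(self, skills, liked, strength=1.0):
        """Raise weights for skills of liked candidates, lower them otherwise.

        `strength` could reflect how strong the signal is, in the spirit of
        Sabah's comparison between five and 60 minutes of viewing.
        """
        sign = 1.0 if liked else -1.0
        for skill in skills:
            self.weights[skill] += sign * strength

    def score(self, skills):
        return sum(self.weights.get(skill, 0.0) for skill in skills)

    def rank(self, candidates):
        """Return candidate names ordered from best to worst score."""
        return sorted(candidates, key=lambda name: self.score(candidates[name]), reverse=True)

# Hypothetical candidate pool: name -> set of skills.
candidates = {
    "A": {"java", "spring"},
    "B": {"java", "cobol"},
    "C": {"python"},
}

ranker = FeedbackRanker()
ranker.record(candidates["A"], liked=True)   # recruiter liked candidate A
ranker.record(candidates["B"], liked=False)  # recruiter passed on candidate B
ranking = ranker.rank(candidates)
print(ranking)
```

After the two actions, the shared "java" skill nets out to zero, "spring" is favored and "cobol" is penalized, so the ordering shifts toward profiles that resemble the liked candidate.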
