Queues are for the Brits, Part 2

del.icio.us Tags: ,,

In part 1 of this article, I raised some issues with traditional ACD algorithms.  The issues raised are best summarized by generalizing* ACD algorithms as FIFO queues with the only variances being skill levels and the actual agent allocation algorithm.  Agent allocation algorithms generally break down into something fairly simple such as which agent has been off the phone the longest, taken the fewest calls, or spoken to the person before.  There is a lack of intelligence when it comes to considering a many-to-many call/agent match.

* I really do understand that there are likely algorithms out there that do achieve 90% of what it is that we need to achieve.  The question is not, “How much does solution x achieve for company y out of the box?”, the question is, “To what level does solution x allow all companies to customize the algorithm to achieve 100% of their needs?”

A Proposed Solution

Before I propose a solution, I should make two things clear.  First, I am by no means an expert in contact center theory.  There is a world of things I don’t know – contact center theory is one of them.  Second, I am not fully educated on the existing solutions out there.  Due to the proprietary nature of algorithms and the obvious interests of the companies in protecting them, the best I can do is find descriptions of how algorithms currently work.  I can also see the flaws in our current UCCX system.

So how do we determine which is the best agent to assign to a call?  Skills-based routing excels at limiting the pool of agents to available agents who are capable of taking the call.  We want to break past that barrier to achieve the following:

A desirable algorithm for contact centers will create an optimal call-agent match, adjusted for time considerations.

Note that the statement above does not de facto consider availability.  Availability certainly should be part of the equation, but only part.  For contact centers who have frequent repeat callers, agent consistency may be a desirable trait.  If agent consistency is desirable, the algorithm may state that if the caller’s assigned agent will be available in less than a minute, the caller will be placed on hold until the agent is available.

Base Match Score

Skills-based routing has achieved the first part of the equation: making sure that someone capable of handling the call is assigned to the call.  Because there are only the few dimensions involved (call disposition, available agents, and agent skills), call match may not be entirely optimal.  The first step to creating an optimal call-agent match is to increase the number of dimensions that factor into the final assignment.  As was stated in the introduction to this post, we have designed an algorithm that takes a number of factors into consideration.  A sample list of considerations may include the following factors:

clip_image001[9]

** Note that unless an alternative receptor has been configured for calls with a zero match score for all agents, all multipliers should be greater than zero.

The above chart over-simplifies in one significant respect: it treats ranges as a flat match or non-match score.  In our actual algorithm, we support ranged multipliers, but this example needs to be easy to understand.  I also recommend this type of a chart for gathering requirements from the business; it’s easier to understand and easier to supply rows rapidly.  That said, this may be the match score matrix for a fictional company.  The first row indicates that, as with many companies, skill match is critical to agent assignment.  The non-match multiplier minimizes the chance of a agent without a skill match being assigned to this call.  Unless voice mail or an alternate “queue” is configured to handle calls with a zero match score, I highly recommend having all multipliers be greater than zero.  The second and third rows assign a significant priority to agents who have spoken with this person before.  The third to fifth rows flatten a ranged multiplier.  In true implementation, I recommend attempting to use an actual mini-algorithm to deal with these types of issues as the result is a more accurate, less “bucketed” match score.  In this case, we are attempting to de-prioritize any agent that is not currently available.  The longer it will take the agent to become available, the less likely it is that they will be assigned to the call.

It is critical to note at this point that we feel this is one of the key differentiators of this algorithm.  Most algorithms only consider the available agent pool and rescan every few seconds if a match cannot be found.  In this case, a match can be made even before an agent finishes a call if the reason (previous contact, for instance) for assigning the call to that agent is compelling enough.  The IVR could even theoretically ask the person if they wanted to speak to their assigned agent and alter the match/non-match multipliers for assigned agent based upon their answer.

The final row of the table assigns a value to variable cost.  If our company handles many of the calls in-house, those calls probably have a very low variable cost (most of the costs would be considered fixed or sunk costs).  If additional calls can be routed to an outsourcer for $20/call, it is important to know the value of routing that call.  In general, the algorithm should prefer routing to agents with the lowest variable cost.

Once the match score criteria have been determined, an example matrix can be set up.  In this case, we are attempting to assign four incoming calls to three agents.  We consider each criterion for the match score and evaluate a base match score for all agents for all calls.  [We will continue to build on this same matrix when we introduce the timeline modifier.]

image

We calculated the match score by multiplying one times each of the values for the match/non-match values as appropriate for each match factor.

Timeline Modifier

Now that we have a concept of a base match score, we need to introduce a timeline modifier.  The timeline modifier ensures that calls with poor match scores across the board eventually get picked up.  How compressed the timeline modifier is depends upon your business model.  If you wish to have a good match and have a high call volume, a longer timeline may make sense.  If you don’t care about match and just want to ensure that calls are picked up, a more compressed timeline may make sense.  You could even replicate a FIFO queue by using an extremely compressed timeline modifier.  We currently use this equation to calculate the timeline modifier: if the call is not yet ready to be transferred, we divide 1.25 by the number of minutes until transfer.  If the call is ready to be transferred, we add three to the number of minutes it has been ready for transfer.  If we treat negative numbers as calls that are ready to be transferred and positive numbers as calls that are not ready to be transferred, extending our example above yields the following timeline modifiers for our calls.

image

One of the most interesting things about the timeline modifier is that it really boils down to just another match factor, but we treat it differently for two reasons: first, it’s more critical that we have a smooth range rather than buckets when referring to the amount of time a call has been on hold.  Second, business people understand timelines to be distinct from other match criteria.

Modified Match Score

We then use the timeline modifier to calculate the modified match score.  Multiplying the base match score by the timeline modifier yields the modified match score, as shown:

clip_image001[7]

Call Assignment

Finally, we assign calls by maximizing the sum of agent-call matches.  In this case, our sum is maximized at 55.8125 by assigning Call 4 to Agent 1, Call 2 to Agent 2, and Call 3 to Agent 3.

Implications and Conclusion

There are a number of implications to how we have assigned calls.  Note that calls 2 and 3, which are not ready to be transferred, are already assigned to an agent by the algorithm.  We distinguish between optimal match and actual assignment.  For actual assignment, we prevent calls more than two minutes from transfer from being assigned.  We could achieve the same thing by tweaking our timeline modifier equation to yield a different timeline modifier.  This brings up perhaps the most important differentiator of this algorithm, however.  Because we consider calls that aren’t yet ready to be transferred, we have some level of predictive ability that allows a better call-agent match.  If we go back to part 1 of this post, it is easy to see why predictive ability is important.  Rather than just looking at the here-and-now, we look at the soon-to-be and are able to reserve agents whose skills match with calls that will soon be ready to transfer.  The key to getting this information is to have the IVR send the ACD periodic notifications of calls en route, their current disposition and probable end state.  Finally, it is important that we consider not only the best agent for the call, but also the best call for the agent.  Looking at the call assignment grid above, you’ll note that we assigned Call 3 to Agent 3.  However, Call 3 had a higher match score with Agent 1.  The reason we assign Call 3 to Agent 3 is because Agent 1 has a better match with a different call, meaning that Agent 3 gets Call 3 by process of elimination.

The result of our experimentation with OCS has been this: last year, we struggled for months to hook into UCCX’s ACD in order to direct calls to the right destination based solely on one piece of information.  In our pilot with OCS, we were able to achieve multi-factor routing in a matter of days for a small fraction of the cost we incurred last year.  It is entirely accurate to say that Office Communications Server does not ship with an ACD.  The sleeper, however, is that they do ship a platform that is simple to hook into and allows development of a very complex and highly customized ACD that fits your business model.  For us, unless we find a blocking problem with OCS the choice is simple.

In future posts, we will start to lay out a high-level architectural diagram of how the various pieces work, where the messaging links are, and any gotchas that we find.

Queues are for the Brits, Part 1

del.icio.us Tags: ,,

First off, I need to say that I sincerely hope that no one takes offense to the title of this post.  The title is meant to be a play on the phrase, “gone to the birds,” not an ethnocentric slur.  The alert reader will also pick up on the double entendre that in Britain, queue is a much more common word.

I have stated in earlier posts that I lead a team of top-notch developers.  My primary responsibility is early research and architecture; my team are the ones to transform the vision into reality.  We recently completed our pilot project, which I plan to blog about at some length.  One of the requirements that we met was the ability to design an advanced ACD that takes into account any number of variables and implements a “best-match” algorithm.  We believe this to be a significant advance over traditional algorithms of the skills-based routing.  In fact, a multiple-hour search yielded an awfully slim amount of relevant data.

The Problem with Skills-Based Routing

The primary problem with queues is implicit in the name: most queues operate as first-in-first-out mechanisms.  Before we can address why FIFO is a bad idea for contact centers, we need to have some context.

There is an even bigger problem that is a fundamental part of the queue problem.  This problem is in the existing call distributor algorithms.  A simple example will illustrate:

Assume we have three available agents.  Agent 1 is licensed to sell insurance in Michigan and Indiana.  Agent 2 is licensed to sell insurance in Indiana and Ohio.  Agent 3 is licensed to sell insurance in Indiana only.  Assume also that 80% of our inbound calls are from Michigan residents, 10% from Indiana residents, and 10% from Ohio residents.  If a call comes in from a resident of Indiana, any of the three may be able to sell an insurance policy to the caller.  However, only one is best suited in this context to sell the insurance policy to the caller: Agent 3.  There is a 90% chance that the next call will be from either Michigan or Ohio, meaning that Agent 3 would not be able to handle the call.  That means that we want to reserve those agents whose skills are most in demand for the calls that demand those skills.

Many contact and call center software providers trumpet their algorithms for call distribution, but most of the algorithms are some variant of, “How do I spread the load of calls evenly over my agent pool?”  The question that should be asked is, “How do I find the best possible agent to handle this call?”  Some software providers have started to deal with this question and have introduced a second factor into skills-based routing.  By assigning a rating to an agent’s skill, there is indeed additional intelligence available to the call distributor.  Consider the following example:

Assume we have three available agents, with skills rated as shown in the matrix.

Agent Windows skill level Office skill level Internet skill level
Agent 1 70 40 10
Agent 2 10 70 40
Agent 3 40 10 70

 

Upon receipt of an inbound call, the skill matrix is analyzed and the agent with the highest rating in the skill necessary is assigned to the call.  Therefore, if we receive a Windows call, the call is assigned to Agent 1.  If we receive another Windows call while Agent 1 is still occupied, Agent 3 is assigned.  If all agents become available again and an Internet call is received, the call is assigned to Agent 3.

In many senses, the second example is still vulnerable to the problem noted in the first example.  Assume our call load is 90% Windows, 5% Office and 5% Internet.  If a call comes in for Windows, it should be rightly assigned to Agent 1.  However, if a subsequent call comes in for Internet support, it might be more appropriate to assign Agent 2 to the Internet call and reserve Agent 3 for the next Windows call.

The fundamental problem with traditional skills-based routing is that it takes so few factors into account.  In part 2, we will consider a potential solution to this problem.