Office Communications Server Deployment, Day 9

Note: Sorry this wasn’t posted sooner.  There was a bit of a shake-up internally as we tried to decide what was appropriate to post.  I’ve had this post ready for a few days now and have just been waiting for definitive answers from my management.  This post represents a nearly complete OCS deployment.  By the time it ends, we have Enterprise Voice complete.  The remaining things we will deploy are the archiving server, the QoE monitoring role, and edge servers.

1:07 PM : Creating UM Dial Plan

image

Note: there are three important things here.  The first is the dial plan name.  When I create the location profile in OCS, you’ll see that its name is slcutloc.extendhealth.com.  The two names must match.  The second is the URI type – it must be SipName for OCS integration.  The last thing is VoIP security, which should be Secured for OCS.  (Secured > SipSecured)
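The three requirements above can be written down as a quick pre-flight check.  This is only a sketch: the dictionary keys and the function name are made up for illustration and are not part of any Exchange or OCS API, and I’m assuming the name comparison should be exact.

```python
# Hypothetical pre-flight check for a UM dial plan that will be integrated with OCS.
# The dictionary keys below are illustrative only; they are not an Exchange API.
REQUIRED_URI_TYPE = "SipName"
REQUIRED_VOIP_SECURITY = "Secured"


def check_dial_plan(dial_plan, location_profile_name):
    """Return a list of problems that would break the integration (empty if none)."""
    problems = []
    # Assumption: the two names are compared exactly, character for character.
    if dial_plan["name"] != location_profile_name:
        problems.append("dial plan name does not match the OCS location profile")
    if dial_plan["uri_type"] != REQUIRED_URI_TYPE:
        problems.append("URI type must be SipName")
    if dial_plan["voip_security"] != REQUIRED_VOIP_SECURITY:
        problems.append("VoIP security should be Secured")
    return problems


plan = {
    "name": "slcutloc.extendhealth.com",
    "uri_type": "SipName",
    "voip_security": "Secured",
}
print(check_dial_plan(plan, "slcutloc.extendhealth.com"))
```

An empty list means all three settings line up; anything else names the setting to revisit.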

 image

 image

Have to add the dial plans to the UM servers – both mail1 and mail2.

image

image

image

image

1:20 PM : Running ExchUCUtil.ps1

image

image 

Verified IP gateways.  If there were more, I’d have to disable them.

image

1:31 PM : Creating Location Profiles

I’m not going to comment on this much because there is too much to say.  Screen caps should be sufficient to let you know what I’m doing.

image

image

image

2:07 PM : Running OcsUMUtil.exe

The last step is to integrate from the OCS side by running OcsUMUtil, which creates OCS objects for the auto attendant and subscriber access numbers in Exchange UM.  This facilitates access to these numbers from Communicator.

image

image

image

image

image

2:10 PM : Assigning a Default Location to the Pool

image

image

image

2:15 PM : Configuring Mediation Servers

image

image

image

2:22 PM : Configuring Policies and Phone Usages

image

Office Communications Server Deployment, Day 8

8:08 AM : Loopback Fix

I’ve been here for a while, catching up on some of my non-blog communication, MBA coursework, etc.  About ten minutes ago, I started testing a probable fix for the validation error I had last night.  Just as a reminder, that validation error looked like this:

clip_image00161_thumb1

The fix is recorded in Appendix D of the Office Communications Server 2007 Enterprise Edition and Communicator 2007 Deployment Guide.  In a nutshell, you need to add a multi-string value to HKLM\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0.  The value should be named BackConnectionHostNames and should contain your pool’s FQDN.  What this does is allow IIS to treat the listed FQDNs as valid for loopback.  You’ll want to remove this value when you’re not validating, and more detail is available by reading the referenced guide.
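If you’d rather script the registry change than click through regedit, something like the following could build the reg.exe command lines for adding and later removing the value.  This is a sketch under assumptions: the function names are mine, it only builds the argument lists (pass them to subprocess.run on the server yourself), and the deployment guide remains the authority on the exact procedure.

```python
# Sketch: build reg.exe command lines for the loopback fix described above.
# Function names are hypothetical; nothing is executed here.
LSA_KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0"
VALUE_NAME = "BackConnectionHostNames"


def build_reg_add(host_names):
    """Command that creates the multi-string value with the given FQDNs."""
    # reg.exe separates REG_MULTI_SZ entries with a literal \0 by default.
    data = r"\0".join(host_names)
    return ["reg", "add", LSA_KEY, "/v", VALUE_NAME,
            "/t", "REG_MULTI_SZ", "/d", data, "/f"]


def build_reg_delete():
    """Command that removes the value again once validation is done."""
    return ["reg", "delete", LSA_KEY, "/v", VALUE_NAME, "/f"]


print(" ".join(build_reg_add(["ocspool.extendhealth.com"])))
print(" ".join(build_reg_delete()))
```

Keeping the delete command next to the add command is deliberate, since the value is only meant to be there while validating.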

When I followed the instructions for the fix, the validation wizard for the remaining steps executed properly.

8:16 AM : Validation Wizards

image

image

image

image

(Yes, that’s a different validation wizard.)

image

image

image

(Yes again.)

image

image

8:23 AM : Validation Results

So the current state of our deployment is that there are two validation warnings, neither of which I care about because I haven’t deployed Enterprise Voice or edge access yet.

From the Validate Front End Server Configuration wizard, we have:

image

From the Validate Web Components Server Functionality wizard, we have:

image

8:27 AM : Internal Deployment Complete

Aside from the above validation warnings, it seems that internal deployment is complete.  I do have one more warning in my Communicator client regarding Exchange Web Services, but the Exchange deployment on this domain isn’t complete yet, so it’s also expected.  The ramification at this point is that Communicator can’t automatically set my status to “In a Meeting” if I have a meeting scheduled in Outlook.

Next step is external user access, meaning I’ll be bringing up a scaled single-site edge topology.  I’ll try to explain that in more detail, but there will probably be some downtime here as I test Communicator internally and prep another couple of servers to be edge servers.  (I have to install Server 2003 at least.)

1:53 PM : Enterprise Voice

image

image 

image

image

image

image

image

image

1:56 PM : Activating Mediation Server

image

image

image

image

image

2:00 PM : Assigning Certificates

image

clip_image001

image

image

image

image

image

image

image

image

3:16 PM : Enterprise Voice Prep

I’ve been reading (and will continue to read through) the Microsoft Office Communications Server 2007 Enterprise Voice Planning and Deployment Guide.  This will probably take the rest of the day and will ensure that I make minimal mistakes when deploying Enterprise Voice.  I have a good idea of what it is that I need to do, but I want to be certain.

Office Communications Server Deployment, Day 7.5

All of these steps and screenshots were performed late last night.  I’ll fill in commentary now (morning of Day 8).

Back Story

I was crushingly disappointed when Microsoft told me that I’d have to reinstall my entire PKI because the hashing algorithms I used were for a Cryptography Next Generation (CNG) CSP, not a CryptoAPI Version 1 CSP.  Knowing what I know now, I can see some allusions to that on pp. 158-159 of Brian Komar’s book.  Before I left work yesterday, I e-mailed Brian and explained my situation and that I was on a support call with Microsoft.  I then e-mailed him their response (“it’s not supported”) and the fact that they were closing the support case.

He sent this response:

Mark,

There is a security update that will allow XP and 2003 clients to validate certificates that implement SHA-2 signatures.
The update is included in Windows XP service pack 3.
Per the release notes for service pack 3:

Microsoft Cryptographic Module

Implements and supports the SHA2 hashing algorithms (SHA256, SHA384, and SHA512) in X.509 certificate validation. This has been added to the crypto module rsaenh.dll.

XP SP2 crypto modules Rsaenh.dll/Dssenh.dll/Fips.sys had been certified according to FIPS 140-1 specifications. The Federal Information Processing Standard (FIPS) 140-1 standard has been replaced by FIPS 140-2, and these modules have been validated and certified according to this standard. For more information, see the Microsoft Kernel Mode Cryptographic Module.

You cannot create these certs in 2k3, but you would be able to validate them.

Brian

Based upon that hope, I went out and did some strategic searching and came across this KB: http://support.microsoft.com/kb/938397.  After an hour of waiting on hold while some (nice enough) tech researched the history on my support case, I was finally given a link to download the hotfix.  Note that there is a link there to register for the hotfix also, which I did, but was told that it would take up to 24 hours.  It actually took about two hours. 

Hotfix in hand, I patched the server and all the certificates looked great!  There were still a couple of strange artifacts with how I had to request certificates, but I was able to do it without incident.

Now that the back story is complete, I’ll try to recreate the timeline as best I can based upon the timestamps in my screencaps.  Thanks, OneNote!

8:50 PM : Assigning the Certificate to IIS

This is where things went awry yesterday.  If you want to know what to do to get to this point, read that post.

clip_image001

clip_image001[4]

clip_image001[6]

clip_image001[8]

clip_image001[10]

8:52 PM : Starting Services

I’m deviating here from the norm of not including the wizard starts in the screen captures.  The final screen of a wizard generally has useful information (like success, hopefully), but the start of a wizard usually just says what it is you’re doing.  Since I generally label what it is that I’m doing already, I had been skipping the first screen for the wizards.  At this point, however, the wizards start to blur together, especially in the validation phases.  Therefore, I’m going to include some wizard start screens if I can to differentiate the wizards.  (That said, I think I noticed last night that all the validation wizards start with the same screen anyway.)

clip_image001[12]

clip_image001[14]

clip_image001[16]

clip_image001[18]

clip_image001[20]

clip_image001[22]

9:29 PM : Server/Pool Validation

[Delay reason: had to put my son to bed.]

clip_image001[27]

Oops… in order to validate the server and pool functionality, I need a couple of user accounts to be enabled for Office Communications Server.  The trick to this is that you have to use Active Directory Users and Computers to enable the users, but you also have to have the OCS Administrative Tools installed on that computer.  Because my domain controller is Server 2008, I can’t install the OCS Administrative Tools there (and be supported).  In this case, I just opened an MMC on ocsfe1, added the Active Directory Users and Computers snapin, and connected to the extendhealth.com domain.  Right-clicking on users now exposes the following option:

image

clip_image001[31]

clip_image001[33]

clip_image001[35]

image

Now that the users are enabled, I can see them if I open the Office Communications Server snapin (Start > All Programs > Administrative Tools > Office Communications Server 2007).

image

9:36 PM : Back to Validation

clip_image001[41]

clip_image001[43]

clip_image001[45]

image

clip_image001[49]

Note that I didn’t check test connectivity of federated users because I don’t have external access yet.

clip_image001[51]

clip_image001[53]

clip_image001[55]

clip_image001[57]

clip_image001[59]

This was the only warning I had.  Since I haven’t deployed Enterprise Voice yet, I’m not concerned about this warning.

11:15 PM : More Validation

I think I took some time before this screenshot to correct some previous validation errors, but I can’t recall very clearly.  I do want to note that I ran into some validation errors last night, as the following screenshot shows:

clip_image001[61]

I believe this particular screenshot is an artifact of a known issue with IIS loopback, so I’ll try to fix it this morning.  I didn’t think it was important last night since I recalled how to deal with it (although not the specific steps) and since the server and pool validated okay.

11:23 PM : The Payoff

clip_image001[65]

Enough said.

Office Communications Server Deployment, Day 7

8:33 AM : Picking Up Where We Left Off

As you may recall, I ran into an issue last night just before I left because I didn’t have the SQL client tools necessary (specifically the SQL 2005 Backwards Compatibility Pack and the SQL Native Client) installed on my front end server ocsfe1.  I did try installing the tools this morning to no avail – unfortunately I wasn’t even getting a good quality error message, just “Pool backend discovery failed” – the same message I posted yesterday.

I’m pursuing a workaround at this point for two reasons:

  1. I need to keep the ball rolling.  I have to get the internal deployment completed today.
  2. I’m planning to move the database to an official cluster anyway, per the directions in the Admin guide for moving the backend database for an Enterprise pool.

Primarily because of reason two, I don’t feel bad about installing SQL locally for a short time period (<1 month) until our cluster is ready to support the Enterprise pool.  As with other cautions I’ve offered, this isn’t recommended.  For me, it’s just real life.  To achieve the goal I want, I’ve created a CNAME (alias) in DNS to tell my computer that dbcluster1 is currently the same as ocsfe1.  I’ve also installed SQL Server 2005 Standard Edition SP2 32-bit locally.
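A quick way to confirm that the alias behaves the way I describe is to check that both names resolve to the same address.  The helper below is a sketch; same_host is a name I made up, and the stand-in lookup table (including the address in it) is purely illustrative so the example runs without touching DNS.

```python
import socket


def same_host(alias, target, resolve=socket.gethostbyname):
    """Return True when both names resolve to the same IPv4 address."""
    return resolve(alias) == resolve(target)


# Stand-in resolver so the example does not need real DNS.
# The address is illustrative, not taken from a live lookup.
fake_dns = {"dbcluster1": "10.10.3.1", "ocsfe1": "10.10.3.1"}
print(same_host("dbcluster1", "ocsfe1", resolve=fake_dns.__getitem__))
```

On the real network you would call same_host("dbcluster1", "ocsfe1") and let it use the default resolver.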

8:39 AM : Creating the Enterprise Pool

image

image

image

Two notes here:

  1. We specified a different internal web farm FQDN because we may eventually move to an expanded configuration, and having a different FQDN may facilitate that transition.
  2. The planning documentation states that if you don’t specify an external web farm FQDN at this point, you’ll need to use the command line utility later.  Usefulness of command line utilities notwithstanding, I’d rather specify it now since I know what it is.

image

image 

Another note: our database files will be going onto a SAN with the transition to the database cluster.  If you aren’t storing your database files on a SAN, you’ll want to make sure the database and log files are on different spindles (different physical volumes).  This is basic database optimization, not an OCS thing.

image

I didn’t enable meeting archiving yet as it probably requires the Archiving and CDR role, which doesn’t exist yet in my infrastructure.  I’m quite certain you can enable this later, so I’ll skip it for now.  I have put the path in, however, so that you can see what I would be using if I were to enable it right now.

image 

image

Archiving is not enabled for the same reason listed above.

image

image

image

image

image

Ugh.  I made a mistake early on in the wizard – my pool is named ocspool.extendhealth.com, not pool.extendhealth.com.  I think I can probably fix this later, so I’ll keep going for now.  There were no other warnings in the log.

8:59 AM : Configuring Enterprise Pool

image

image

image

There’s the wrong pool name I mentioned above.

image

        Pros                  Cons
DNAT    > 65,000 users        Increased difficulty of configuration
SNAT    Easy configuration    < 65,000 users

image

image

Note: Only one pool or server can authenticate automatic logon requests.

image

image

I’ll definitely be configuring external user access, but two things are stopping me from doing it right now:

  1. I want the edge deployment to be distinct from the pool deployment for my own sanity and anyone’s sanity following along with this thread.
  2. I think the only way you can configure your edge topology right now is if you’re migrating from LCS 2005 R2? and already have an edge topology deployed.  I’m not certain on that, I just think that’s what I recall.

image

image

image

9:10 AM : Adding Ocsfe1 to Pool

So far so good this morning – everything seems to be turning out okay aside from my dumb mistake with the pool name and the issues with the pool backend.  I’m now ready to add ocsfe1 to the pool as the first front-end server.

image

image

image

image

image

(Takes a while.  Lots of time for screen captures.)

image

Apparently Microsoft thinks it’s funny to continually remind me of my mistakes.

image

Yes, the password really is that long.  As a reminder (I think for the third time), I use WinGuides Password Generator to generate passwords for service accounts.

image

image

image

image

image

Same warnings as before:

image

Aside from that error being in the logs about 20 times, there were no other errors.  I think I’m still okay.

9:30 AM : Fixing the Pool FQDN

Before I proceed any further, I want to correct the pool FQDN.  I’ve been warned sufficiently.  As part of installing the Front End role, the administrative tools for OCS were installed.  I’m opening them from Start > All Programs > Administrative Tools > Office Communications Server 2007.

image

9:36 AM : ???

Wow … http://forums.microsoft.com/unifiedcommunications/ShowPost.aspx?PostID=2931495&SiteID=57

Apparently I’ll be removing the pool and creating it all over again.  Hope that goes okay.

image

image

image

Lesson learned: get the pool name right in the first place.

9:44 AM : Configuring Certificates

Well, at least it didn’t take too long to get back on track.  For this next step, please note that there are two distinct steps.  The Web Components role requires its certificate to be manually configured in IIS.  The rest of the Front End roles have a wizard.  I’ll deal with the wizard first, then IIS.

image

image

Because I have a PKI deployed, I can opt to send the request to an online certification authority (Active Directory will help me locate one).

image

In this case, we don’t care if the cert is exportable, but I left the box checked anyway.  We also don’t care about client EKU – the only place that matters is for the certificate assigned to the external interface for the Access Edge role.

image

image

I chose to include the local machine name in the SAN here.  If you’re configuring automatic client logon, the SAN must also contain sip.<domain>.  In my case, it was automatically populated because of the choices I made in earlier wizards to enable automatic client logon.
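The SAN requirement is easy to get wrong, so here is a small sketch of how you might check a certificate’s SAN list against it.  The function name is hypothetical, the machine FQDN in the example is my assumption of what ocsfe1 would be called in this domain, and the SAN list is fed in by hand rather than read from a real certificate.

```python
def missing_san_entries(san_entries, sip_domains, machine_fqdn=None):
    """Return the names that should be in the SAN but are not.

    san_entries:  DNS names already present in the certificate's SAN
    sip_domains:  SIP domains that use automatic client logon
    machine_fqdn: optional local machine name to require as well
    """
    # DNS names are case-insensitive, so compare in lower case.
    required = {f"sip.{domain}".lower() for domain in sip_domains}
    if machine_fqdn:
        required.add(machine_fqdn.lower())
    present = {entry.lower() for entry in san_entries}
    return sorted(required - present)


san = ["ocspool.extendhealth.com", "ocsfe1.extendhealth.com"]
print(missing_san_entries(san, ["extendhealth.com"], "ocsfe1.extendhealth.com"))
```

With the sample list above, the only name reported as missing is sip.extendhealth.com.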

image

image

image

… and … I accidentally clicked through the next screen, so I think it succeeded but I’m not 100% certain.

image

image

image

image

Well, I got that far before realizing that the prior wizard had actually failed.  It has something to do with Server 2003 not recognizing the authenticity of the certificate chain.  My PKI is completely implemented with Server 2008, so I guess it’s time to go research what to do.

3:22 PM : Square 1

As if there weren’t enough blocks already…

I just got off the phone with Microsoft support.  The certificate issue is “by design”.  In this case, I interpret “by design” to mean, “We knew about the problem but haven’t taken the initiative to fix it.”  The specific issue is that Server 2003 and Windows XP don’t support certificate chains signed with hash algorithms stronger than SHA1.  Since my root CA certificate had a SHA512 signature, and my other CA certificates had SHA256 signatures (per NIST guidelines), Server 2003 barfed.
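In hindsight, this is something I could have checked before requesting a single certificate.  A rough sketch of that check follows; the function name and the hand-typed chain are illustrative, and the set of algorithms treated as supported reflects only what is described in this post for an unpatched Server 2003 or XP machine.

```python
# Hash algorithms an unpatched Server 2003 / XP client can validate,
# per the support call described above (assumption: MD5 and SHA1 only).
LEGACY_SUPPORTED = {"md5", "sha1"}


def unsupported_by_legacy(chain):
    """chain is a list of (certificate name, signature hash) pairs.

    Returns the pairs a legacy client would fail to validate.
    """
    return [(name, algo) for name, algo in chain
            if algo.lower() not in LEGACY_SUPPORTED]


chain = [("root CA", "SHA512"), ("issuing CA", "SHA256")]
for name, algo in unsupported_by_legacy(chain):
    print(f"{name}: {algo} will not validate on an unpatched 2003/XP machine")
```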

Generally speaking I’m very happy with Microsoft.  Today, I’m not.  Off to rebuild the PKI from scratch…

Office Communications Server Deployment, Day 6

I spent the entire day yesterday dealing with administrative and management issues.  As such, there was nothing to report.

5:35 AM : Amber Alert (Ex post facto)

This morning, I arrived at our data center to finish up some final issues remaining from the previous day.  Installing all of this new equipment has caused heartburn, to say the least.  The IP KVM we have (by Avocent) is not particularly incredible and has been on the fritz since Sunday, meaning that I couldn’t remote control any computers to install them from the office.  That said, the plan this morning was to bypass the IP KVM, install a couple of servers with Windows Server 2003, and head back to the office to actually start on the OCS deployment steps beyond planning.  Upon arrival, however, I immediately noticed that I didn’t get an IP address from our DHCP server there.  The second thing I noticed was that all of our slave switches in the enclosures appeared dead.  The third thing I noticed was that the consoles on the front of the blade enclosures were amber.  In case you’re not a network admin (which I’m not any more, but experience has taught me), amber = bad.

It turned out that overnight, our data center had a significant A/C failure and had caused lots of problems.  This isn’t a small data center, it’s enterprise class.  A failure like this hasn’t happened in the entire history of the facility.  Of course it would have to happen while I’m trying to deploy OCS: administrator’s law.

12:00 PM : Amber Remediated (Ex post facto)

By noon, we had the issues straightened out at the data center.  I should note here that Dell wasn’t particularly well trained on our equipment, which is brand new (in the sense of recently released to manufacturing).  It turned out that our Cisco switches had overheated and shut themselves down as a protective measure.  Reseating the switches finally resolved most of our problems there.  On the plus side, the work with fixing the amber alerts also somehow fixed the IP KVM.

Back at the office, I was finally able to deploy Windows Server 2008 (for an Exchange deployment) and Windows Server 2003 to servers.  The current deployment toolset is using Microsoft Deployment as I was never able to get Configuration Manager 2007 running properly.

2:28 PM : Windows Server 2003 R2 with SP2 Deployment Complete

After working through several minor driver issues, I was just able to finish deploying Windows Server 2003 R2 (with SP2) via Microsoft Deployment.  There were actually two different Broadcom drivers necessary, and I had to be sneaky about where I put one of them.  If you happen to run into issues with a similar situation and need help, you can submit a comment here, but I don’t feel the need to detail what I did – it’s time to get into OCS, finally!

2:40 PM : Planning Recap

Since there were some final adjustments to several IPs internally, I’ll repost the planning table I posted last week with the updated IPs.  If you can’t see it all, just copy and paste it into Excel.

Edit: Removed planning table

2:50 PM : Created A Records

I just created the A records for ocspool, ocsmeetings, and ocsmeetingsext.  Note that certain parts of the planning documentation are pretty picky about whether these are A or CNAME records.  I was also under the impression that I needed to create a sip.extendhealth.com A record, but can’t find mention of it in the planning docs for now, so I’ll skip it until it becomes a problem.

2:54 PM : Crashed MMC 3.0

It might be just me, but the MMC 3.0 seems particularly unstable.  I just tried to add the SRV record for automatic configuration (_sipinternaltls._tcp.extendhealth.com) and the MMC crashed.

2:57 PM : Created SRV Record for Client Automatic Configuration

Note: this record gets created in the Forward Lookup Zones/<domain>/_tcp node.

clip_image001
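For anyone who prefers to see the record spelled out, here is a sketch that builds the record name and a zone-file style line for it.  The port (5061), the target (the pool FQDN), and the priority, weight and TTL values are my assumptions for illustration; use whatever your planning documentation calls for.

```python
def srv_record_name(domain, service="_sipinternaltls", proto="_tcp"):
    """Name of the SRV record Communicator looks up for automatic configuration."""
    return f"{service}.{proto}.{domain}"


def srv_zone_line(domain, target, port=5061, priority=0, weight=0, ttl=3600):
    """Zone-file style line: name, TTL, class, type, priority, weight, port, target."""
    name = srv_record_name(domain)
    return f"{name}. {ttl} IN SRV {priority} {weight} {port} {target}."


print(srv_record_name("extendhealth.com"))
print(srv_zone_line("extendhealth.com", "ocspool.extendhealth.com"))
```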

2:59 PM : Finishing Updates

The ocsfe1 server will be the first server to come up (be added to the pool).  It’s currently finishing some updates, which is why I’ve been picking away at DNS requirements.  I should also note (if you didn’t read the posts from last week) that I have a PKI infrastructure in place to deal with the certificate requirements.

The one other critical thing I should highlight while I wait is that we expect some load balancers within two weeks.  The VIPs referenced above would normally be assigned to the load balancer.  For now, since we’re still missing this hardware, I plan to proceed with deployment as if they already existed.  In order to (hopefully) fool OCS, I plan to assign the IP address that will be assigned to the VIP to ocsfe1 (temporarily).  That means that ocsfe1 will currently have the following three IPs: 10.10.3.1, 10.10.3.51, 10.10.3.53.  Please note that this is almost certainly not the recommended course of action, and I’m only ignoring my own advice out of necessity.  When the load balancer comes in, I’ll assign the VIP IP to it, remove it from the server, and rerun the validation wizard and the best practices analyzer.

3:08 PM : Creating File Shares

Another thing you need to do before deploying OCS is set up some file shares that will store (mostly) Live Meeting related files.  I have set up four shared folders on my file server: OCS\AddressBook, OCS\MeetingArchive*, OCS\MeetingContent, and OCS\MeetingMetadata.

* Optional; you will only need this if the Archiving and CDR role archives meetings.
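Creating the folder tree itself is simple enough to script.  The sketch below only creates the directories under a root you choose; it does not share them or set permissions, and the function name is my own.

```python
from pathlib import Path

# The four folders listed above; MeetingArchive is the optional one.
SHARE_FOLDERS = ["AddressBook", "MeetingArchive", "MeetingContent", "MeetingMetadata"]


def create_ocs_folders(root, include_archive=True):
    """Create OCS\\<folder> under root and return the paths that now exist."""
    created = []
    for name in SHARE_FOLDERS:
        if name == "MeetingArchive" and not include_archive:
            continue  # skip the optional archive folder
        folder = Path(root) / "OCS" / name
        folder.mkdir(parents=True, exist_ok=True)
        created.append(folder)
    return created
```

Calling create_ocs_folders with the path of your share root (and include_archive=False if you aren’t archiving meetings) leaves you with the folders ready to be shared by hand.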

3:20 PM : Installed IIS

Since I will be deploying an OCS Enterprise Pool, Consolidated Configuration, I installed IIS from the Add Role wizard.  I didn’t enable ASP.NET as I don’t think OCS uses ASP.NET.  (The planning documentation says you need ASP, however.)

3:30 PM : Opening the Setup Wizard

I think I’ve completed all the prerequisite steps for OCS installation and am opening the setup wizard for the first time.  I’ll try to take as many screenshots as are relevant through the installation process.

3:32 PM : Preparing Active Directory

clip_image001[5]

clip_image001

clip_image001[11]

clip_image001[13]

clip_image001[15]

image

image

clip_image001[17]

image

image

(Snipped for some semblance of brevity.)

image

(This wizard happened too fast to even grab a screen cap of the process.)

image

3:45 PM : Active Directory Prepared

Everything went flawlessly (or at least apparently so) in the Active Directory preparation phase.  I’m now ready to create the Enterprise Pool.  The one thing I think I might need here is user accounts that I haven’t created yet.  I create my passwords from the WinGuides Password Generator for security’s sake.

3:47 PM : Creating Enterprise Pool

As with above, relevant screenshots.

image

image

image

Curses!  The first error.  I just forgot to install the SQL client tools.

4:14 PM : SQL Client Install

image

4:30 PM : EOD

Unfortunately, that’s where it’s going to have to sit for tonight.  Hopefully I’ll be able to finish off the pool by mid-morning tomorrow, barring the type of disasters that happened today.

Interact 2008 Summary, Day 1

I am now wrapping up the third and final day of the Interact 2008 (not to be confused with Interact 2008) conference.  I don’t know what I can say about it other than to say that it has been time extraordinarily well spent.  There are many things that you do in life that are worthwhile days; as far as careers go, this ranks among the most useful days I’ve ever had.  I don’t intend that statement to be either hyperbole or summarily discardable.  There has been absolutely fantastic face time with Microsoft employees and a wonderful opportunity to interact (no pun intended) with key vendors and other attendees.  Although I’m completely saturated and exhausted, I’ll try to give a rundown of the sessions and events.  First up, the sessions that I attended.

Tuesday

Keynote (Gurdeep Singh Pall)

Overall a great keynote, but very similar in content to the keynote delivered at VoiceCon 2008.  I can’t hold that too much against him since I’m sure that writing new keynote speeches for every event he speaks at probably isn’t and shouldn’t be his priority.  That said, I recommend you watch the actual keynote from VoiceCon rather than reading my poor summary of what Gurdeep said.

Panel on Planning Voice Architecture and Deployment in Microsoft Office Communications Server 2007 (Mahendra Sekaran; Sean Olson; John Kenerson; Francois Doremieux; Russell Bennett; Jens Trier Rasmussen; Ken Ewert)

I tremendously enjoyed this session; it was an open forum for anyone with a question to address some of the brightest minds on the OCS team.  The people that stood out to me in particular were Mahendra Sekaran, who answered my question about topologies with 100-person outsource shops (and whether we needed to deploy a pool/enterprise voice equipment at that location); Sean Olson, who answered most of the questions about general vision; Francois Doremieux, who handled many questions about actual deployments; and Russell Bennett, who contributed intelligent comments to several questions.  I particularly enjoyed the range of knowledge available in the room; it seemed that there were answers for every question asked.

Microsoft Office Communications Server 2007 Edge Drill Down (Wajih Yahyaoui)

This was a very good session on details for the various edge servers.  Wajih has a very noticeable accent, but was obviously passionate about the subject matter, so it was a real pleasure to listen to him.  He was able to handle most of the questions in the room, but it was helpful that several other knowledgeable Microsoft personnel were there also.  The conversation had one major interruption from a guy who I considered to be acting very belligerently.  The contention was based around whether OCS’s requirement to open port ranges in external firewalls unnecessarily creates security vulnerabilities.  Neil Deason responded that a firewall is only ultimately secure if all ports are closed: it makes no difference whether there are 10,000 ports or one port open, your network is vulnerable if you have ports open in your firewall.  While I don’t think that response was a good response overall, the point of the response should be considered sufficient.  The point of what Neil was saying is that firewalls are only one element of a properly hardened network’s defenses.  Hardening a network involves hardening multiple elements, not just the firewall.  The firewall-only approach is typically described like an M&M: crunchy on the outside, chewy on the inside.  If you have a network defended only by a firewall, your network is vulnerable to internal attacks.  Although research shows that most attacks originate from outside the network, the same research shows that an alarming percentage of breaches originate from inside the network.  See http://answers.google.com/answers/threadview?id=15439 for links to credible sources.

Advanced Validation and Troubleshooting for OCS 2007 (Byron Spurlock, Tom Laciano)

Byron and Tom handled this session on how to ascertain the source of an OCS 2007 problem.  Both presenters were enjoyably humorous, considering the amount of time we’d been sitting that day.  We got a chance to see Byron use a number of tools such as the Snooper tool, validation wizards and more.  As I sat there, I realized how much I would have benefited from knowing that those tools existed a few weeks ago, when I spent a significant amount of time using Wireshark to diagnose what was wrong with OCS.  Afterwards, I went to one of the Coffee Chats with Tom and sat for a while as he explained in detail what subject names and subject alternative names are necessary for certificates in various scenarios.

Evening Event: Surfing @ Wave House

In the evening, we relaxed on the beach at the Wave House.  It was a great time breathing the (very) cool salt air, throwing back a few drinks, and doing some surfing!  I stunk (figuratively), but here’s a shot, courtesy of my colleague David DeWinter who went with me:

Mark Surfing

OCS Roles Primer, Part 2

In part 1 of this post, we examined the core pool roles for Microsoft Office Communications Server 2007.  Specifically, we covered front-end servers, directors, the three variants of conferencing servers, and the archiving and CDR server.  There are still several key roles to be covered to understand the full breadth of the OCS offering.  These roles fit into three key areas: edge servers, telephony servers, and “other”.  Before we get into the specifics of the roles, please take a brief moment to review the vocabulary from part 1:

  • Office Communications Server: A Microsoft product designed to facilitate communications both inside and outside the office.
  • Presence: A metric that takes into account both your availability (available, idle, away) and your willingness (available, busy, on a call) to communicate.
  • Endpoint: Any device (SIP phone) or software package that registers itself with Office Communications Server as belonging to a user, meaning that the user can be contacted through the device or software package.
  • Enterprise Voice: Probably the most noteworthy addition to the product since 2005; allows calls to enter and exit Office Communications Server.  This means that from any endpoint, users can make or receive calls to traditional phone numbers.
  • Public Switched Telephone Network (PSTN): The traditional telephone network that delivers telephone service over dedicated copper cables.

Edge Servers

Access Edge Server

The access edge server provides three key services: authenticating and enabling connectivity for remote users, negotiating federated communications, and connecting to public IM services such as MSN, AOL, and Yahoo.  Authentication and communications with remote users is unequivocally the most common usage of access edge server.  This server is critical whenever an employee needs to use Communicator but is outside of the corporate LAN.  Traveling sales representatives with Communicator Mobile, home-based employees and other situations are supported when using Access Edge Server.  Federation is the term used to refer to two Active Directory domains that have set up a federated relationship.  Note that a federated relationship is not the same thing as a domain trust, but is similar.  Generally federation happens along corporate boundaries.  Two companies in a strategic alliance or other partnership will federate to allow key contacts greater visibility and easier access to communications.  Microsoft OCS can also allow connectivity with public IM services, enabling communications from Communicator to MSN Messenger, AIM, or Yahoo! Messenger.

Personal Side-note: Access edge server is one of the most amazing roles in my opinion.  I have been witness to quite literally taking an OCS endpoint, moving it outside of the network, and having it seamlessly connect back up to OCS without any additional configuration.  Imagine being able to grab your desk phone and go home for the day!  We currently have a Cisco UCCX system in place.  In order to take my phone home, I have to take a hardware VPN home, hook it up directly to my cable modem (in the basement) and then hook my phone straight to that.  With OCS, I was able to take my laptop home, turn on my wireless and connect immediately.  If I can say one thing that would be the most important thing for a Cisco customer to hear, it’s this:

Our Cisco system is technically capable of achieving everything we need it to achieve, but our experience with OCS has blown us away.  Actually getting your hands on a sample OCS setup is the best thing that you can do for yourself.

To summarize, the access edge server:

  • Authenticates and enables connectivity for remote users
  • Allows two entities to federate, which in turn allows greater visibility for communications
  • Allows connectivity to public IM networks

A/V Edge Server

The A/V edge server enables audio and/or video conferences to happen with users outside of the corporate LAN.  It is important to note that telephony conferences are considered distinct from this scenario and are covered by the telephony conferencing server (see part 1).  The A/V edge server allows remote users authenticated by access edge server to establish internal audio or video calls, or VoIP calls for enterprise telephony scenarios.

Web Conferencing Edge Server

Similar to the A/V edge server, the web conferencing edge server enables Live Meeting 2007 sessions to include users outside of the corporate LAN.  Many companies will use this role slightly differently than they will the other edge server roles.  Where access edge server and A/V edge server are deployed to allow external known users to connect and conference, Web conferencing edge server may arguably be used to conference in more anonymous users (who are still actually authenticated by digest authentication) than known users.  This allows companies an internally controlled, paid-for mechanism similar to WebEx that allows public sharing of desktops and other information.

Requirements** (for all edge servers):

  • Dual processor, dual core 3.0GHz+ processor
  • 2 x 18GB HDD
  • 4GB+ RAM
  • 2 x Gigabit NIC
  • Windows Server 2003 SP1+*

* I was not able to get the OCS primary installer to run successfully on Windows Server 2008 RTM.  It may be that the individual installers would run successfully, but I have not confirmed this.  The only role I have successfully installed on Windows Server 2008 is Speech Server 2007.

** The work of mixing audio channels is intense; A/V servers will benefit from more robust hardware.

Communicator Web Access

Communicator Web Access (CWA) is to Office Communications Server what Outlook Web Access is to Exchange Server 2007.  It provides an attractive, AJAX (slick update without refresh) based interface for internal or external users to use.  CWA functions much like the director role in that it proxies connections, but differs in that it also proxies internal connections.  Also, CWA is restricted to communicating via instant messaging.  There is no support for audio/video conferences, Live Meeting, or enterprise voice.

Requirements:

  • Dual processor, 3.2GHz+ processor
  • 1 x 36GB HDD
  • 4GB+ RAM
  • Gigabit NIC
  • Windows Server 2003 SP1+*

* I have not yet attempted to install this role on Windows Server 2008.

Web Components Server

This role has probably the least visible functionality of all server roles: its primary responsibilities are to allow users to join Web conferences by clicking a URL, allow download of Address Book data, and expand membership in distribution groups (in some ways, simply an expansion of the Address Book functionality).

Requirements:

  • Dual processor, dual core 2.6GHz+
  • 2 x 18GB HDD
  • 2GB+ RAM
  • Gigabit NIC
  • Windows Server 2003 SP1+*

* I have not yet attempted to install this role on Windows Server 2008.

Mediation Server

Contrary to the Web components server, the mediation server role has profound visibility and is arguably as important as a front-end, back-end, or edge server.  The mediation server is what makes enterprise voice possible.  When Microsoft implemented enterprise voice, they elected to use proprietary codecs (RTAudio and RTVideo) in order to overcome some significant hurdles such as inconsistent bandwidth.  However, their choice to use these proprietary codecs meant that right from the beginning, Microsoft wasn’t able to play nicely with many pieces of PSTN hardware.  In their defense, the enterprise voice market is very confused right now.  There are many competing standards such as ICE, SIP, and others that still aren’t fully or consistently supported.  Microsoft saw this and decided that it would be easier to simply draw a strong line between external and internal voice traffic.  That line is drawn right through mediation server.

Microsoft states that there are three ways to connect Office Communications Server to the PSTN.  The first is through a basic media gateway.  A basic media gateway is simply a piece of hardware that terminates PSTN lines (whether in FXS/FXO or T/E/DS form).  The media gateway’s responsibility is to accept incoming calls on the PSTN lines and hold the line open until the call is complete.  To know when the call is completed, the basic media gateway talks to the mediation server, generally via G.711.  The mediation server does the job of decoding G.711 voice traffic and encoding into RTAudio (and vice versa, for outbound voice traffic).

A basic hybrid gateway does essentially the same thing except that it merges the mediation server role directly onto the media gateway.  The benefit of a basic hybrid gateway over a basic media gateway fundamentally boils down to TCO: it’s cheaper and easier to manage one box than it is to manage two.

The final means of connecting Office Communications Server to the PSTN is for the media gateway itself to directly support the native OCS protocols (like RTAudio and ICE).  Microsoft calls this an advanced media gateway.  Please note that the difference between the advanced media gateway and the basic hybrid gateway is that in the basic hybrid scenario there are two functions coexisting on one box; they are still distinguishable functions.  With advanced media gateways, the functions are no longer distinguishable.  The media gateway natively speaks OCS’ language.

In the next post in this series, we’ll consider a final smattering of server roles that don’t always require a full server, consider coexistence scenarios and some final “gotchas” that I wish I’d known about when I started deploying OCS.

Queues are for the Brits, Part 2

In part 1 of this article, I raised some issues with traditional ACD algorithms.  The issues raised are best summarized by generalizing* ACD algorithms as FIFO queues with the only variances being skill levels and the actual agent allocation algorithm.  Agent allocation algorithms generally break down into something fairly simple such as which agent has been off the phone the longest, taken the fewest calls, or spoken to the person before.  There is a lack of intelligence when it comes to considering a many-to-many call/agent match.

* I really do understand that there are likely algorithms out there that do achieve 90% of what it is that we need to achieve.  The question is not, “How much does solution x achieve for company y out of the box?”, the question is, “To what level does solution x allow all companies to customize the algorithm to achieve 100% of their needs?”

A Proposed Solution

Before I propose a solution, I should make two things clear.  First, I am by no means an expert in contact center theory.  There is a world of things I don’t know – contact center theory is one of them.  Second, I am not fully educated on the existing solutions out there.  Due to the proprietary nature of algorithms and the obvious interests of the companies in protecting them, the best I can do is find descriptions of how algorithms currently work.  I can also see the flaws in our current UCCX system.

So how do we determine which is the best agent to assign to a call?  Skills-based routing excels at limiting the pool of agents to available agents who are capable of taking the call.  We want to break past that barrier to achieve the following:

A desirable algorithm for contact centers will create an optimal call-agent match, adjusted for time considerations.

Note that the statement above does not de facto consider availability.  Availability certainly should be part of the equation, but only part.  For contact centers that have frequent repeat callers, agent consistency may be a desirable trait.  If agent consistency is desirable, the algorithm may state that if the caller’s assigned agent will be available in less than a minute, the caller will be placed on hold until the agent is available.

Base Match Score

Skills-based routing has achieved the first part of the equation: making sure that someone capable of handling the call is assigned to the call.  Because there are only a few dimensions involved (call disposition, available agents, and agent skills), the call match may not be entirely optimal.  The first step to creating an optimal call-agent match is to increase the number of dimensions that factor into the final assignment.  As was stated in the introduction to this post, we have designed an algorithm that takes a number of factors into consideration.  A sample list of considerations may include the following factors:

clip_image001[9]

** Note that unless an alternative receptor has been configured for calls with a zero match score for all agents, all multipliers should be greater than zero.

The above chart over-simplifies in one significant respect: it treats ranges as a flat match or non-match score.  In our actual algorithm, we support ranged multipliers, but this example needs to be easy to understand.  I also recommend this type of chart for gathering requirements from the business; it’s easier to understand and easier to supply rows rapidly.  That said, this may be the match score matrix for a fictional company.  The first row indicates that, as with many companies, skill match is critical to agent assignment.  The non-match multiplier minimizes the chance of an agent without a skill match being assigned to this call.  Unless voice mail or an alternate “queue” is configured to handle calls with a zero match score, I highly recommend having all multipliers be greater than zero.  The second and third rows assign a significant priority to agents who have spoken with this person before.  The third to fifth rows flatten a ranged multiplier.  In a real implementation, I recommend attempting to use an actual mini-algorithm to deal with these types of issues as the result is a more accurate, less “bucketed” match score.  In this case, we are attempting to de-prioritize any agent that is not currently available.  The longer it will take the agent to become available, the less likely it is that they will be assigned to the call.

It is critical to note at this point that we feel this is one of the key differentiators of this algorithm.  Most algorithms only consider the available agent pool and rescan every few seconds if a match cannot be found.  In this case, a match can be made even before an agent finishes a call if the reason (previous contact, for instance) for assigning the call to that agent is compelling enough.  The IVR could even theoretically ask the person if they wanted to speak to their assigned agent and alter the match/non-match multipliers for assigned agent based upon their answer.

The final row of the table assigns a value to variable cost.  If our company handles many of the calls in-house, those calls probably have a very low variable cost (most of the costs would be considered fixed or sunk costs).  If additional calls can be routed to an outsourcer for $20/call, it is important to know the value of routing that call.  In general, the algorithm should prefer routing to agents with the lowest variable cost.

Once the match score criteria have been determined, an example matrix can be set up.  In this case, we are attempting to assign four incoming calls to three agents.  We consider each criterion for the match score and evaluate a base match score for all agents for all calls.  [We will continue to build on this same matrix when we introduce the timeline modifier.]

image

We calculate the base match score by starting at one and multiplying by the match or non-match value, as appropriate, for each match factor.
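
As a rough sketch of that multiplication: the factor names and multipliers below are made up for illustration (they are not the values from our chart), and flat match/non-match values are assumed rather than ranged multipliers.

```python
from math import prod

# Hypothetical match factors: (match multiplier, non-match multiplier).
# Illustrative numbers only; every multiplier is kept greater than zero.
FACTORS = {
    "skill_match": (4.0, 0.25),
    "spoke_before": (2.0, 1.0),
    "available_now": (1.5, 0.5),
}


def base_match_score(matches):
    """Start at one and multiply by the match or non-match value per factor."""
    return prod(
        match if matches[name] else non_match
        for name, (match, non_match) in FACTORS.items()
    )


# An agent with the right skill who is available but has not spoken
# with this caller before: 4.0 * 1.0 * 1.5
print(base_match_score(
    {"skill_match": True, "spoke_before": False, "available_now": True}
))
```

Because everything is multiplied together, a single zero multiplier would wipe out the score for that agent, which is why the note under the chart asks for multipliers greater than zero.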

Timeline Modifier

Now that we have a concept of a base match score, we need to introduce a timeline modifier.  The timeline modifier ensures that calls with poor match scores across the board eventually get picked up.  How compressed the timeline modifier is depends upon your business model.  If you wish to have a good match and have a high call volume, a longer timeline may make sense.  If you don’t care about match and just want to ensure that calls are picked up, a more compressed timeline may make sense.  You could even replicate a FIFO queue by using an extremely compressed timeline modifier.  We currently use this equation to calculate the timeline modifier: if the call is not yet ready to be transferred, we divide 1.25 by the number of minutes until transfer.  If the call is ready to be transferred, we add three to the number of minutes it has been ready for transfer.  If we treat negative numbers as calls that are ready to be transferred and positive numbers as calls that are not ready to be transferred, extending our example above yields the following timeline modifiers for our calls.

image
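
The equation described above can be sketched in a few lines.  The function name is my own for this example, and I am assuming the sign convention from the paragraph above (positive means minutes until the call is ready to transfer, zero or negative means minutes it has already been ready).

```python
def timeline_modifier(minutes_until_transfer):
    """Timeline modifier as described in the post.

    Positive input: the call is not yet ready, so divide 1.25 by the
    number of minutes until transfer.
    Zero or negative input: the call is ready, so add three to the
    number of minutes it has been waiting.
    """
    if minutes_until_transfer > 0:
        return 1.25 / minutes_until_transfer
    return 3 + abs(minutes_until_transfer)


# A call two minutes away from transfer, one that just became ready,
# and one that has been waiting a minute and a half.
for minutes in (2, 0, -1.5):
    print(minutes, timeline_modifier(minutes))
```

Swapping the constants 1.25 and 3 for other values is one way to stretch or compress the timeline.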

One of the most interesting things about the timeline modifier is that it really boils down to just another match factor, but we treat it differently for two reasons: first, it’s more critical that we have a smooth range rather than buckets when referring to the amount of time a call has been on hold.  Second, business people understand timelines to be distinct from other match criteria.

Modified Match Score

We then use the timeline modifier to calculate the modified match score.  Multiplying the base match score by the timeline modifier yields the modified match score, as shown:

clip_image001[7]

Call Assignment

Finally, we assign calls by maximizing the sum of agent-call matches.  In this case, our sum is maximized at 55.8125 by assigning Call 4 to Agent 1, Call 2 to Agent 2, and Call 3 to Agent 3.
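
To make the maximization concrete, here is a minimal sketch that multiplies base scores by timeline modifiers and then simply tries every possible assignment.  The numbers are hypothetical (they are not the values from the grids above), and exhaustive search is just the easiest approach to show for a handful of calls and agents; it is not necessarily how a production ACD would do it.

```python
from itertools import permutations

# Hypothetical base match scores: one row per call, one value per agent.
BASE = {
    "Call 1": [1.0, 2.0, 0.5],
    "Call 2": [2.0, 8.0, 4.0],
    "Call 3": [8.0, 2.0, 4.0],
    "Call 4": [4.0, 1.0, 1.0],
}
# Hypothetical timeline modifier for each call.
MODIFIER = {"Call 1": 3.0, "Call 2": 1.25, "Call 3": 0.625, "Call 4": 5.0}
AGENTS = ["Agent 1", "Agent 2", "Agent 3"]


def modified_scores():
    """Modified match score = base match score * timeline modifier."""
    return {call: [b * MODIFIER[call] for b in row] for call, row in BASE.items()}


def best_assignment(scores, agents):
    """Give each agent a distinct call; keep the combination with the highest sum."""
    best_total, best = float("-inf"), {}
    for calls in permutations(scores, len(agents)):
        total = sum(scores[call][i] for i, call in enumerate(calls))
        if total > best_total:
            best_total, best = total, dict(zip(agents, calls))
    return best_total, best


total, assignment = best_assignment(modified_scores(), AGENTS)
print(total, assignment)
```

With these made-up numbers, Call 3 scores highest with Agent 1 but still ends up with Agent 3, because Agent 1 is worth more on Call 4.  Trying every permutation gets expensive quickly as the number of calls and agents grows; this is the classic assignment problem, and a solver such as SciPy's `linear_sum_assignment` would be a more realistic choice at scale.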

Implications and Conclusion

There are a number of implications to how we have assigned calls.  Note that calls 2 and 3, which are not ready to be transferred, are already assigned to an agent by the algorithm.  We distinguish between optimal match and actual assignment.  For actual assignment, we prevent calls more than two minutes from transfer from being assigned.  We could achieve the same thing by tweaking our timeline modifier equation to yield a different timeline modifier.  This brings up perhaps the most important differentiator of this algorithm, however.  Because we consider calls that aren’t yet ready to be transferred, we have some level of predictive ability that allows a better call-agent match.  If we go back to part 1 of this post, it is easy to see why predictive ability is important.  Rather than just looking at the here-and-now, we look at the soon-to-be and are able to reserve agents whose skills match with calls that will soon be ready to transfer.  The key to getting this information is to have the IVR send the ACD periodic notifications of calls en route, their current disposition and probable end state.  Finally, it is important that we consider not only the best agent for the call, but also the best call for the agent.  Looking at the call assignment grid above, you’ll note that we assigned Call 3 to Agent 3.  However, Call 3 had a higher match score with Agent 1.  The reason we assign Call 3 to Agent 3 is because Agent 1 has a better match with a different call, meaning that Agent 3 gets Call 3 by process of elimination.

The result of our experimentation with OCS has been this: last year, we struggled for months to hook into UCCX’s ACD in order to direct calls to the right destination based solely on one piece of information.  In our pilot with OCS, we were able to achieve multi-factor routing in a matter of days for a small fraction of the cost we incurred last year.  It is entirely accurate to say that Office Communications Server does not ship with an ACD.  The sleeper, however, is that they do ship a platform that is simple to hook into and allows development of a very complex and highly customized ACD that fits your business model.  For us, unless we find a blocking problem with OCS the choice is simple.

In future posts, we will start to lay out a high-level architectural diagram of how the various pieces work, where the messaging links are, and any gotchas that we find.

Interact 2008

INTERACT2008BlogArt2

This morning, I received an invitation to Microsoft’s Interact 2008 conference that is taking place from April 8-10 in San Diego, California.  Those of you who have been monitoring this blog for the past month know that I’ve been on the very-fast track to learning about Microsoft Unified Communications.  Although we primarily are developing against Speech Server and Office Communications Server at this point, we envision writing some code against Exchange Web Services to bring e-mails into our CRM also, truly unifying communications.  What gets me fired up is the fact that we truly have the ability to do whatever it is that we want to do.  I’ve said it before and I’ll say it again: what I like most about Microsoft is that they give you a solid foundation, a blank slate, and then they step back and  get out of the way.  Our biggest problem is convincing the business people in our company that even though the sky is the limit, we need to contain things at a more manageable level.

If you’re headed to Interact 2008 and you work in the contact center space or are interested in call distributors, I’d love to take some time to meet up with you and chat one-on-one.  If there is enough interest, we may even be able to form a small community of our own to address contact center concerns and OCS.  It seems that we could all benefit by coming together and building a solid community.  I will continue to blog about our experiences in developing a call distributor, blended predictive dialer, and other software, but we need to get more voices out there.  Feel free to ping me on my blog through comments or by e-mailing me at definitejunkmail (yes, definitejunkmail) on GMail.  I’ll reply back with my work e-mail so that we can stay in touch.

Looking forward to meeting some of you!

OCS Roles Primer, Part 1

One of the first things I really struggled with when ramping up on Microsoft Office Communications Server 2007 was understanding roles, their responsibilities, and how roles overlap.  What roles do I need?  How many of them do I need?  Can an edge role be installed on a mediation server?  What’s more, Microsoft tosses out diagrams like the following with very little explanation:

The information that is available out there about OCS server roles is scattered across many different documents, Web sites, and blog posts.  Hopefully, this post will distill information on the roles available in OCS, their coexistence constraints and hardware/software requirements.  That said, let’s get a little bit of vocabulary out of the way first.

  • Office Communications Server: A Microsoft product designed to facilitate communications both inside and outside the office.
  • Presence: A metric that takes into account both your availability (available, idle, away) and your willingness (available, busy, on a call) to communicate.
  • Endpoint: Any device (SIP phone) or software package that registers itself with Office Communications Server as belonging to a user, meaning that the user can be contacted through the device or software package.
  • Enterprise Voice: Probably the most noteworthy addition to the product since 2005; allows calls to enter and exit Office Communications Server.  This means that from any endpoint, users can make or receive calls to traditional phone numbers.
  • Public Switched Telephone Network (PSTN): The traditional telephone network that delivers telephone service over dedicated copper cables.

 

Pool Server Roles

In enterprise-class deployments, multiple servers are typically assigned the traditional front-end server roles.  Although this is sometimes referred to as a cluster, the term that has been assigned in this case is “pool”.  As an aside, I would recommend that any business that is seriously considering OCS for Enterprise Voice use an enterprise-class deployment even if scalability is not a concern.  Resiliency is also something you gain when you deploy multiple servers playing the same role.

Front-End Servers

The front-end server handles a lot of the plumbing for OCS.  It authenticates users against Active Directory, routes invitations to appropriate endpoints, and generally coordinates all other features of OCS.  It is important to note that once a front-end server has assisted in routing the invitation to appropriate endpoints, it gracefully steps back and allows the endpoints to communicate directly.  This means that a front-end server will not be overwhelmed by handling thousands of simultaneous video chats.  It merely directs the invitations and coordinates the set-up of communications, then allows the endpoints to communicate however they may choose.  (The exception to this rule is instant messaging sessions, which always travel through the front-end server for archiving purposes.)  It is also important to note that a front-end server may handle merely these responsibilities or more if the deployment team decides to consolidate multiple roles onto the front-end server.

Specifically, the front-end server:

  • Authenticates users against Active Directory
  • Aggregates and disseminates presence, contact, and other similar information
  • Routes invitations to appropriate endpoints (or gateways) and cancels other invitations once an endpoint has accepted
  • Tracks status of sessions (even though the session communicates from endpoint to endpoint) and delivers status update messages like typing, ringing, accepted, etc.

Requirements:

  • Dual processor, dual core 2.6GHz+ processor
  • 2 x 18GB HDD
  • 2GB RAM
  • Gigabit NIC
  • Windows Server 2003 SP1+*

Directors

Directors have two primary responsibilities: user authentication and direction.  User authentication comes in especially handy when dealing with remote users.  Because remote users cannot be redirected to their home server/pool, the director role proxies their connection through to the correct server/pool.  However, user authentication by a director is also important when an enterprise hosts multiple standard servers or enterprise pools.  Users are “homed” to a server/pool, meaning that that server/pool is the one that stores the user’s account information.  A director will get the user to the right pool on the first try rather than repeatedly polling every server/pool to see if the user’s account is stored there.  In summary, directors are always recommended when you have remote users and/or multiple standard servers or enterprise pools.

Conferencing Servers

Conferencing servers, in contrast to front-end servers, are designed to maintain a central hub for communications when more than two parties are involved.  This minimizes bandwidth requirements for the conference by maintaining one open channel per endpoint.  Conferencing servers also register and update “focus” on front-end servers.  “Focus” means security and state management for conferences.  Conferencing servers come in two major flavors: A/V conferencing servers, which facilitate audio and/or video conferences, and Web conferencing servers, which facilitate Live Meeting 2007 conferences.

The IM conferencing server:

  • Controls focus for three or more IM participants
  • Focus includes:
    • Participant list tracking
    • Leader determination
    • Security and management controls
  • Is installed by default on all front-end servers and cannot be installed separately

The A/V conferencing server:

  • Enables three or more parties to have audio and/or video chats (two parties use a point-to-point connection)
  • Reduces bandwidth requirements by mixing audio from other participants into one channel instead of n channels
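
To put rough numbers on the bandwidth point, here is a back-of-the-envelope sketch.  It is a simplified model of my own that only counts one-way audio streams; it ignores codecs, video, and anything else OCS does on the wire.

```python
def mesh_streams(n):
    """No mixing: every endpoint sends its audio directly to each of the other n - 1."""
    return n * (n - 1)


def hub_streams(n):
    """With mixing: every endpoint sends one stream to the server and receives one mix back."""
    return 2 * n


# Print participants, streams without mixing, streams with a central mixer.
for n in (3, 5, 10):
    print(n, mesh_streams(n), hub_streams(n))
```

In this model each endpoint always handles one stream in each direction no matter how large the conference gets, while the server takes on the mixing work.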

The Web conferencing server:

  • Allows application and file sharing
  • May leverage an A/V conferencing server to distribute audio/video in a Live Meeting

The Telephony conferencing server:

  • Provides functionality for exposing an audio conference to the PSTN
  • Allows the organizer to
    • Mute everyone except the presenter
    • Mute themselves
    • Remove parties
  • Is installed by default on all front-end servers and cannot be installed separately

Requirements**:

  • Dual processor, dual core 3.0GHz+ processor
  • 2 x 18GB HDD
  • 4GB+ RAM
  • Gigabit NIC
  • Windows Server 2003 SP1+*

Archiving and CDR Server

The archiving and CDR server is designed to provide a first-shot at archiving instant messaging sessions and call detail records for compliance purposes.  Although this is Microsoft’s stated intent for the role, it’s hard for a layman to imagine how an out-of-the-box solution could really meet the compliance needs of any reasonable organization.  I have not been able to locate any customizable fields, which means that we have a great track record of who called whom at what time, but that’s pretty much it.

Requirements:

  • Dual processor, dual core 2.6GHz+ processor
  • 2 x 18GB HDD
  • 4GB+ (16GB+ if archiving is enabled) RAM
  • Gigabit NIC
  • Windows Server 2003 SP1+*

* I was not able to get the OCS primary installer to run successfully on Windows Server 2008 RTM.  It may be that the individual installers would run successfully, but I have not confirmed this.  The only role I have successfully installed on Windows Server 2008 is Speech Server 2007.

** The work of mixing audio channels is intense; A/V conferencing servers will benefit from more robust hardware.