Office Communications Server Deployment, Day 9

Note: Sorry this wasn’t posted sooner.  There was a bit of a shake-up internally as we tried to decide what was appropriate to post.  I’ve had this post ready for a few days now and have just been waiting for definitive answers from my management.  This post represents a nearly complete OCS deployment.  By the time it ends, we have Enterprise Voice complete.  The remaining things we will deploy are the archiving server, the QoE monitoring role, and edge servers.

1:07 PM : Creating UM Dial Plan

image

Note: there are three important things here.  The first is the dial plan name.  You’ll see when I create the location profile in OCS that its name is slcutloc.extendhealth.com.  The two names must match.  The second is the URI type, which must be SipName for OCS integration.  The last is VoIP security, which should be Secured for OCS.  (Secured is stronger than SIPSecured: it encrypts the media as well as the SIP signaling.)
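
The three requirements in the note above can be written down as a quick sanity check. This is only a sketch: the dictionary keys (`name`, `uri_type`, `voip_security`) and the function name are made up for illustration and are not an Exchange or OCS API.

```python
def check_um_dial_plan(dial_plan, location_profile_name):
    """Return a list of problems that would get in the way of OCS integration."""
    problems = []
    # 1. The dial plan name must match the OCS location profile name.
    if dial_plan["name"] != location_profile_name:
        problems.append("dial plan name does not match the location profile")
    # 2. The URI type must be SipName.
    if dial_plan["uri_type"] != "SipName":
        problems.append("URI type must be SipName")
    # 3. VoIP security should be Secured.
    if dial_plan["voip_security"] != "Secured":
        problems.append("VoIP security should be Secured")
    return problems


# Hypothetical description of the dial plan created in this post.
plan = {
    "name": "slcutloc.extendhealth.com",
    "uri_type": "SipName",
    "voip_security": "Secured",
}
print(check_um_dial_plan(plan, "slcutloc.extendhealth.com"))  # prints []
```

An empty list means all three settings line up; anything else names the setting to revisit before moving on.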

 image

 image

Have to add the dial plans to the UM servers – both mail1 and mail2.

image

image

image

image

1:20 PM : Running ExchUCUtil.ps1

image

image 

Verified the IP gateways.  If there were more than one, I’d have to disable the extras.

image

1:31 PM : Creating Location Profiles

I’m not going to comment on this much, as there is too much to say.  The screen caps should be sufficient to show what I’m doing.

image

image

image

2:07 PM : Running OcsUMUtil.exe

The last step is to integrate from the OCS side by running OcsUMUtil, which creates contact objects for the auto attendant and subscriber access numbers in Exchange UM.  This makes those numbers reachable from Communicator.

image

image

image

image

image

2:10 PM : Assigning a Default Location to the Pool

image

image

image

2:15 PM : Configuring Mediation Servers

image

image

image

2:22 PM : Configuring Policies and Phone Usages

image

Office Communications Server Deployment, Day 8

8:08 AM : Loopback Fix

I’ve been here for a while, catching up on some of my non-blog communication, MBA coursework, etc.  About ten minutes ago, I started testing a probable fix for the validation error I had last night.  Just as a reminder, that validation error looked like this:

image

The fix is recorded in Appendix D of the Office Communications Server 2007 Enterprise Edition and Communicator 2007 Deployment Guide.  In a nutshell, you need to add a multi-string value under HKLM\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0.  The value should be named BackConnectionHostNames and should contain your pool’s FQDN.  This tells Windows that the listed FQDNs are valid for loopback connections, so IIS can authenticate requests the server sends to its own pool name.  You’ll want to remove this value when you’re not validating; the referenced guide has more detail.

When I followed the instructions for the fix, the validation wizard for the remaining steps executed properly.
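
For anyone who would rather script the registry change than click through regedit, here is a minimal sketch that builds the matching `reg add` and `reg delete` command lines. The function names are made up for illustration, the pool FQDN is the one used in this deployment, and nothing is executed; the output is meant to be reviewed against Appendix D before it is run on the front end server.

```python
KEY = r"HKLM\SYSTEM\CurrentControlSet\Control\Lsa\MSV1_0"
VALUE_NAME = "BackConnectionHostNames"


def loopback_fix_command(host_names):
    """Build the `reg add` argument list that writes the multi-string value."""
    # reg.exe separates REG_MULTI_SZ entries with a literal backslash-zero.
    data = r"\0".join(host_names)
    # /f overwrites an existing value without prompting.
    return ["reg", "add", KEY, "/v", VALUE_NAME,
            "/t", "REG_MULTI_SZ", "/d", data, "/f"]


def loopback_undo_command():
    """Build the `reg delete` argument list for when validation is finished."""
    return ["reg", "delete", KEY, "/v", VALUE_NAME, "/f"]


print(" ".join(loopback_fix_command(["ocspool.extendhealth.com"])))
print(" ".join(loopback_undo_command()))
```

Keeping the undo command next to the fix is deliberate, since the value is only supposed to exist while validating.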

8:16 AM : Validation Wizards

image

image

image

image

(Yes, that’s a different validation wizard.)

image

image

image

(Yes again.)

image

image

8:23 AM : Validation Results

So the current state of our deployment is that there are two validation warnings, neither of which I care about because I haven’t deployed Enterprise Voice or edge access yet.

From the Validate Front End Server Configuration wizard, we have:

image

From the Validate Web Components Server Functionality wizard, we have:

image

8:27 AM : Internal Deployment Complete

Aside from the above validation warnings, it seems that internal deployment is complete.  I do have one more warning in my Communicator client regarding Exchange Web Services, but the Exchange deployment on this domain isn’t complete yet, so it’s also expected.  The ramification at this point is that Communicator can’t automatically set my status to “In a Meeting” if I have a meeting scheduled in Outlook.

Next step is external user access, meaning I’ll be bringing up a scaled single-site edge topology.  I’ll try to explain that in more detail, but there will probably be some downtime here as I test Communicator internally and prep another couple of servers to be edge servers.  (I have to install Server 2003 at least.)

1:53 PM : Enterprise Voice

image

image 

image

image

image

image

image

image

1:56 PM : Activating Mediation Server

image

image

image

image

image

2:00 PM : Assigning Certificates

image

image

image

image

image

image

image

image

image

image

3:16 PM : Enterprise Voice Prep

I’ve been reading (and will continue to read through) the Microsoft Office Communications Server 2007 Enterprise Voice Planning and Deployment Guide.  This will probably take the rest of the day and should help me make as few mistakes as possible when deploying Enterprise Voice.  I have a good idea of what I need to do, but I want to be certain.

Office Communications Server Deployment, Day 7.5

All of these steps and screenshots were performed late last night.  I’ll fill in commentary now (morning of Day 8).

Back Story

I was crushingly disappointed when Microsoft told me that I’d have to reinstall my entire PKI because the hashing algorithms I used were for a Cryptography Next Generation (CNG) CSP, not a CryptoAPI Version 1 CSP.  Knowing what I know now, I can see some allusions to that on pp. 158-159 of Brian Komar’s book.  Before I left work yesterday, I e-mailed Brian and explained my situation and that I was on a support call with Microsoft.  I then e-mailed him an update with their response (“it’s not supported”) and the fact that they were closing the support case.

He sent this response:

Mark,

There is a security update that will allow XP and 2003 clients to validate certificates that implement SHA-2 signatures.
The update is included in Windows XP service pack 3.
Per the release notes for service pack 3:

Microsoft Cryptographic Module

Implements and supports the SHA2 hashing algorithms (SHA256, SHA384, and SHA512) in X.509 certificate validation. This has been added to the crypto module rsaenh.dll.

XP SP2 crypto modules Rsaenh.dll/Dssenh.dll/Fips.sys had been certified according to FIPS 140-1 specifications. The Federal Information Processing Standard (FIPS) 140-1 standard has been replaced by FIPS 140-2, and these modules have been validated and certified according to this standard. For more information, see the Microsoft Kernel Mode Cryptographic Module.

You cannot create these certs in 2k3, but you would be able to validate them.

Brian

Based upon that hope, I did some strategic searching and came across this KB: http://support.microsoft.com/kb/938397.  After an hour of waiting on hold while some (nice enough) tech researched the history on my support case, I was finally given a link to download the hotfix.  Note that there is also a link there to register for the hotfix, which I did, but I was told that it would take up to 24 hours.  It actually took about two hours.

Hotfix in hand, I patched the server and all the certificates looked great!  There were still a couple of strange artifacts with how I had to request certificates, but I was able to do it without incident.

Now that the back story is complete, I’ll try to recreate the timeline as best I can based upon the timestamps in my screencaps.  Thanks, OneNote!

8:50 PM : Assigning the Certificate to IIS

This is where things went awry yesterday.  If you want to know what to do to get to this point, read that post.

image

image

image

image

image

8:52 PM : Starting Services

I’m deviating here from the norm of not including the wizard starts in the screen captures.  The final screen of a wizard generally has useful information (like success, hopefully), but the start of a wizard usually just says what it is you’re doing.  Since I generally label what it is that I’m doing already, I had been skipping the first screen for the wizards.  At this point, however, the wizards start to blur together, especially in the validation phases.  Therefore, I’m going to include some wizard start screens if I can to differentiate the wizards.  (That said, I think I noticed last night that all the validation wizards start with the same screen anyway.)

image

image

image

image

image

image

9:29 PM : Server/Pool Validation

[Delay reason: had to put my son to bed.]

image

Oops… in order to validate the server and pool functionality, I need a couple of user accounts to be enabled for Office Communications Server.  The trick to this is that you have to use Active Directory Users and Computers to enable the users, but you also have to have the OCS Administrative Tools installed on that computer.  Because my domain controller is Server 2008, I can’t install the OCS Administrative Tools there (and be supported).  In this case, I just opened an MMC on ocsfe1, added the Active Directory Users and Computers snapin, and connected to the extendhealth.com domain.  Right-clicking on users now exposes the following option:

image

image

image

image

image

Now that the users are enabled, I can see them if I open the Office Communications Server snapin (Start > All Programs > Administrative Tools > Office Communications Server 2007).

image

9:36 PM : Back to Validation

image

image

image

image

image

Note that I didn’t check test connectivity of federated users because I don’t have external access yet.

image

image

image

image

image

This was the only warning I had.  Since I haven’t deployed Enterprise Voice yet, I’m not concerned about this warning.

11:15 PM : More Validation

I think I took some time before this screenshot to correct some previous validation errors, but I can’t recall very clearly.  I do want to note that I ran into some validation errors last night, as the following screenshot shows:

image

I believe this particular screenshot is an artifact of a known issue with IIS loopback, so I’ll try to fix it this morning.  I didn’t think it was important last night since I recalled how to deal with it (although not the specific steps) and since the server and pool validated okay.

11:23 PM : The Payoff

image

Enough said.

Office Communications Server Deployment, Day 7

8:33 AM : Picking Up Where We Left Off

As you may recall, I ran into an issue last night just before I left because I didn’t have the SQL client tools necessary (specifically the SQL 2005 Backwards Compatibility Pack and the SQL Native Client) installed on my front end server ocsfe1.  I did try installing the tools this morning to no avail – unfortunately I wasn’t even getting a good quality error message, just “Pool backend discovery failed” – the same message I posted yesterday.

I’m pursuing a workaround at this point for two reasons:

  1. I need to keep the ball rolling.  I have to get the internal deployment completed today.
  2. I’m planning to move the database to an official cluster anyway, per the directions in the Admin guide for moving the backend database for an Enterprise pool.

Primarily because of reason two, I don’t feel bad about installing SQL locally for a short time period (<1 month) until our cluster is ready to support the Enterprise pool.  As with other cautions I’ve offered, this isn’t recommended.  For me, it’s just real life.  To achieve the goal I want, I’ve created a CNAME (alias) in DNS to tell my computer that dbcluster1 is currently the same as ocsfe1.  I’ve also installed SQL Server 2005 Standard Edition SP2 32-bit locally.

8:39 AM : Creating the Enterprise Pool

image

image

image

Two notes here:

  1. We specified a different internal web farm FQDN because we may eventually move to an expanded configuration, and having a different FQDN may facilitate that transition.
  2. The planning documentation states that if you don’t specify an external web farm FQDN at this point, you’ll need to use the command line utility later.  Usefulness of command line utilities notwithstanding, I’d rather specify it now since I know what it is.

image

image 

Another note: our database files will be going onto a SAN with the transition to the database cluster.  If you aren’t storing your database files on a SAN, you’ll want to make sure the database and log files are on different spindles (different physical volumes).  This is basic database optimization, not an OCS thing.

image

I didn’t enable meeting archiving yet as it probably requires the Archiving and CDR role, which doesn’t exist yet in my infrastructure.  I’m quite certain you can enable this later, so I’ll skip it for now.  I have put the path in, however, so that you can see what I would be using if I were to enable it right now.

image 

image

Archiving is not enabled for the same reason listed above.

image

image

image

image

image

Ugh.  I made a mistake early on in the wizard – my pool is named ocspool.extendhealth.com, not pool.extendhealth.com.  I think I can probably fix this later, so I’ll keep going for now.  There were no other warnings in the log.

8:59 AM : Configuring Enterprise Pool

image

image

image

There’s the wrong pool name I mentioned above.

image

Mode  Pros                Cons
DNAT  > 65,000 users      Increased difficulty of configuration
SNAT  Easy configuration  < 65,000 users

image

image

Note: Only one pool or server can authenticate automatic logon requests.

image

image

I’ll definitely be configuring external user access, but two things are stopping me from doing it right now:

  1. I want the edge deployment to be distinct from the pool deployment for my own sanity and anyone’s sanity following along with this thread.
  2. I think the only way you can configure your edge topology right now is if you’re migrating from LCS 2005 (R2?) and already have an edge topology deployed.  I’m not certain about that; it’s just what I recall.

image

image

image

9:10 AM : Adding Ocsfe1 to Pool

So far so good this morning – everything seems to be turning out okay aside from my dumb mistake with the pool name and the issues with the pool backend.  I’m now ready to add ocsfe1 to the pool as the first front-end server.

image

image

image

image

image

(Takes a while.  Lots of time for screen captures.)

image

Apparently Microsoft thinks it’s funny to continually remind me of my mistakes.

image

Yes, the password really is that long.  As a reminder (I think for the third time), I use WinGuides Password Generator to generate passwords for service accounts.

image

image

image

image

image

Same warnings as before:

image

Aside from that error being in the logs about 20 times, there were no other errors.  I think I’m still okay.

9:30 AM : Fixing the Pool FQDN

Before I proceed any further, I want to correct the pool FQDN.  I’ve been warned sufficiently.  As part of installing the Front End role, the administrative tools for OCS were installed.  I’m opening them from Start > All Programs > Administrative Tools > Office Communications Server 2007.

image

9:36 AM : ???

Wow … http://forums.microsoft.com/unifiedcommunications/ShowPost.aspx?PostID=2931495&SiteID=57

Apparently I’ll be removing the pool and creating it all over again.  Hope that goes okay.

image

image

image

Lesson learned: get the pool name right in the first place.

9:44 AM : Configuring Certificates

Well, at least it didn’t take too long to get back on track.  For this next step, please note that there are two distinct steps.  The Web Components role requires its certificate to be manually configured in IIS.  The rest of the Front End roles have a wizard.  I’ll deal with the wizard first, then IIS.

image

image

Because I have a PKI deployed, I can opt to send the request to an online certification authority (Active Directory will help me locate one).

image

In this case, we don’t care if the cert is exportable, but I left the box checked anyway.  We also don’t care about client EKU – the only place that matters is for the certificate assigned to the external interface for the Access Edge role.

image

image

I chose to include the local machine name in the SAN here.  If you’re configuring automatic client logon, the SAN must also contain sip.<domain>.  In my case, it was automatically populated because of the choices I made in earlier wizards to enable automatic client logon.
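
To make the SAN requirement concrete, here is a small sketch that lists the names a front end certificate would be expected to carry and reports any that are missing. The helper names are made up, and including the pool FQDN in the SAN is an assumption on my part; only the machine name and the sip.<domain> entry come from the paragraph above.

```python
def required_san_entries(pool_fqdn, machine_fqdn, sip_domains, automatic_logon=True):
    """List the SAN entries this deployment would expect on the certificate."""
    entries = [pool_fqdn, machine_fqdn]
    if automatic_logon:
        # Automatic client logon needs sip.<domain> for every SIP domain.
        entries += ["sip." + domain for domain in sip_domains]
    return entries


def missing_san_entries(actual, required):
    """Return required names that are absent from the certificate's SAN."""
    present = {name.lower() for name in actual}  # DNS names are case-insensitive
    return [name for name in required if name.lower() not in present]


required = required_san_entries(
    "ocspool.extendhealth.com", "ocsfe1.extendhealth.com", ["extendhealth.com"]
)
print(required)
print(missing_san_entries(["ocspool.extendhealth.com"], required))
```

Running the check against the names shown in the certificate request wizard is a cheap way to catch a missing sip.<domain> entry before the request goes to the CA.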

image

image

image

… and … I accidentally clicked through the next screen, so I think it succeeded but I’m not 100% certain.

image

image

image

image

Well, I got that far before realizing that the prior wizard had actually failed.  It has something to do with Server 2003 not recognizing the authenticity of the certificate chain.  My PKI is completely implemented with Server 2008, so I guess it’s time to go research what to do.

3:22 PM : Square 1

As if there weren’t enough blocks already…

I just got off the phone with Microsoft support.  The certificate issue is “by design”.  In this case, I interpret “by design” to mean, “We knew about the problem but haven’t taken the initiative to fix it.”  The specific issue is that Server 2003 and Windows XP don’t support certificate chains signed with hash algorithms stronger than SHA1.  Since my root CA certificate was signed with SHA512, and my other CA certificates with SHA256 (per NIST guidelines), Server 2003 barfed.

Generally speaking I’m very happy with Microsoft.  Today, I’m not.  Off to rebuild the PKI from scratch…
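
The problem support described comes down to which hash algorithm signed each certificate in the chain. As a rough illustration, the sketch below flags the links in a chain that an unpatched Server 2003 or Windows XP machine would choke on. The function name and the hard-coded chain are made up for illustration (the CA names follow the hierarchy built on Day 2, and the issuing CA’s algorithm is assumed from the “other CAs” remark above); real certificates would need to be inspected with a proper certificate tool.

```python
SHA2_FAMILY = {"sha256", "sha384", "sha512"}


def links_needing_sha2_support(chain):
    """chain is a list of (ca_name, signature_hash) pairs, root first."""
    flagged = []
    for name, algorithm in chain:
        normalized = algorithm.lower().replace("-", "")
        if normalized in SHA2_FAMILY:
            flagged.append(name)
    return flagged


# Hypothetical description of the chain from this post.
chain = [
    ("root CA", "SHA512"),
    ("policy CA", "SHA256"),
    ("issuing CA", "SHA256"),
]
print(links_needing_sha2_support(chain))
```

In this deployment every link is flagged, which is why nothing issued by the PKI validated on the Server 2003 front end.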

Office Communications Server Deployment, Day 6

I spent the entire day yesterday dealing with administrative and management issues.  As such, there was nothing to report.

5:35 AM : Amber Alert (Ex post facto)

This morning, I arrived at our data center to finish up some final issues remaining from the previous day.  Installing all of this new equipment has caused heartburn, to say the least.  The IP KVM we have (by Avocent) is not particularly incredible and has been on the fritz since Sunday, meaning that I couldn’t remote control any computers to install them from the office.  That said, the plan this morning was to bypass the IP KVM, install a couple of servers with Windows Server 2003, and head back to the office to actually start on the OCS deployment steps beyond planning.  Upon arrival, however, I immediately noticed that I didn’t get an IP address from our DHCP server there.  The second thing I noticed was that all of our slave switches in the enclosures appeared dead.  The third thing I noticed was that the consoles on the front of the blade enclosures were amber.  In case you’re not a network admin (which I’m not any more, but experience has taught me), amber = bad.

It turned out that overnight, our data center had had a significant A/C failure, which caused lots of problems.  This isn’t a small data center; it’s enterprise class.  A failure like this hasn’t happened in the entire history of the facility.  Of course it would have to happen while I’m trying to deploy OCS: administrator’s law.

12:00 PM : Amber Remediated (Ex post facto)

By noon, we had the issues straightened out at the data center.  I should note here that Dell wasn’t particularly well trained on our equipment, which is brand new (in the sense of recently released to manufacturing).  It turned out that our Cisco switches had overheated and shut themselves down as a protective measure.  Reseating the switches finally resolved most of our problems there.  On the plus side, the work with fixing the amber alerts also somehow fixed the IP KVM.

Back at the office, I was finally able to deploy Windows Server 2008 (for an Exchange deployment) and Windows Server 2003 to servers.  The current deployment toolset is using Microsoft Deployment as I was never able to get Configuration Manager 2007 running properly.

2:28 PM : Windows Server 2003 R2 with SP2 Deployment Complete

After working through several minor driver issues, I was just able to finish deploying Windows Server 2003 R2 (with SP2) via Microsoft Deployment.  There were actually two different Broadcom drivers necessary, and I had to be sneaky about where I put one of them.  If you happen to run into issues with a similar situation and need help, you can submit a comment here, but I don’t feel the need to detail what I did – it’s time to get into OCS, finally!

2:40 PM : Planning Recap

Since there were some final adjustments to several IPs internally, I’ll repost the planning table I posted last week with the updated IPs.  If you can’t see it all, just copy and paste it into Excel.

Edit: Removed planning table

2:50 PM : Created A Records

I just created the A records for ocspool, ocsmeetings, and ocsmeetingsext.  Note that certain parts of the planning documentation are pretty picky about whether these are A or CNAME records.  I was also under the impression that I needed to create a sip.extendhealth.com A record, but can’t find mention of it in the planning docs for now, so I’ll skip it until it becomes a problem.

2:54 PM : Crashed MMC 3.0

It might be just me, but the MMC 3.0 seems particularly unstable.  I just tried to add the SRV record for automatic configuration (_sipinternaltls._tcp.extendhealth.com) and the MMC crashed.

2:57 PM : Created SRV Record for Client Automatic Configuration

Note: this record gets created in the Forward Lookup Zones/<domain>/_tcp node.

image
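
For readers who think in zone files rather than the DNS console, the record created above can be sketched in standard zone-file notation. The helper below is illustrative only. The port (5061), the zero priority and weight, and the pool FQDN as the target are assumptions here; the screenshot has the values actually entered.

```python
def srv_record(service, proto, domain, target, port, priority=0, weight=0):
    """Format an SRV record in standard zone-file notation."""
    owner = "_{0}._{1}.{2}.".format(service, proto, domain)
    return "{0} IN SRV {1} {2} {3} {4}.".format(owner, priority, weight, port, target)


# Assumed values: port 5061 and the pool FQDN as the target host.
print(srv_record("sipinternaltls", "tcp", "extendhealth.com",
                 "ocspool.extendhealth.com", 5061))
# prints: _sipinternaltls._tcp.extendhealth.com. IN SRV 0 0 5061 ocspool.extendhealth.com.
```

The owner name shows why the record ends up under the _tcp node of the zone: the service and protocol labels sit in front of the domain name.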

2:59 PM : Finishing Updates

The ocsfe1 server will be the first server to come up (be added to the pool).  It’s currently finishing some updates, which is why I’ve been picking away at DNS requirements.  I should also note (if you didn’t read the posts from last week) that I have a PKI in place to deal with the certificate requirements.

The one other critical thing I should highlight while I wait is that we expect some load balancers within two weeks.  The VIPs referenced above would normally be assigned to the load balancer.  For now, since we’re still missing this hardware, I plan to proceed with deployment as if they already existed.  In order to (hopefully) fool OCS, I plan to assign the IP addresses that will belong to the VIPs to ocsfe1 (temporarily).  That means that ocsfe1 will currently have the following three IPs: 10.10.3.1, 10.10.3.51, 10.10.3.53.  Please note that this is almost certainly not the recommended course of action, and I’m only ignoring my own advice out of necessity.  When the load balancers come in, I’ll assign the VIP addresses to them, remove them from the server, and rerun the validation wizard and the best practices analyzer.

3:08 PM : Creating File Shares

Another thing you need to do before deploying OCS is set up some file shares that will store (mostly) Live Meeting related files.  I have set up four shared folders on my file server: OCS\AddressBook, OCS\MeetingArchive*, OCS\MeetingContent, and OCS\MeetingMetadata.

* Optional; I will only need this if the Archiving and CDR role archives meetings.
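
If you have to repeat this on more than one file server, the folder layout is easy to script. The sketch below only creates the four directories under a root of your choosing; it does not share them or set permissions, which still has to be done according to the planning documentation. The function name and the temporary root used in the demonstration are illustrative.

```python
import os
import tempfile

OCS_FOLDERS = ["AddressBook", "MeetingArchive", "MeetingContent", "MeetingMetadata"]


def create_ocs_folders(root, include_archive=True):
    """Create the OCS\\<name> folders under root and return their paths."""
    created = []
    for name in OCS_FOLDERS:
        if name == "MeetingArchive" and not include_archive:
            continue  # only needed when meetings are archived
        path = os.path.join(root, "OCS", name)
        os.makedirs(path, exist_ok=True)  # harmless if the folder already exists
        created.append(path)
    return created


# Demonstration against a throwaway directory rather than a real file server.
demo_root = tempfile.mkdtemp()
for folder in create_ocs_folders(demo_root):
    print(folder)
```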

3:20 PM : Installed IIS

Since I will be deploying an OCS Enterprise Pool, Consolidated Configuration, I installed IIS from the Add Role wizard.  I didn’t enable ASP.NET as I don’t think OCS uses ASP.NET.  (The planning documentation says you need ASP, however.)

3:30 PM : Opening the Setup Wizard

I think I’ve completed all the prerequisite steps for OCS installation and am opening the setup wizard for the first time.  I’ll try to take as many screenshots as are relevant through the installation process.

3:32 PM : Preparing Active Directory

image

image

image

image

image

image

image

image

image

image

(Snipped for some semblance of brevity.)

image

(This wizard happened too fast to even grab a screen cap of the process.)

image

3:45 PM : Active Directory Prepared

Everything went flawlessly (or at least apparently so) in the Active Directory preparation phase.  I’m now ready to create the Enterprise Pool.  The one thing I think I might need here is user accounts that I haven’t created yet.  I create my passwords from the WinGuides Password Generator for security’s sake.

3:47 PM : Creating Enterprise Pool

As with above, relevant screenshots.

image

image

image

Curses!  The first error.  I just forgot to install the SQL client tools.

4:14 PM : SQL Client Install

image

4:30 PM : EOD

Unfortunately, that’s where it’s going to have to sit for tonight.  Hopefully I’ll be able to finish off the pool by mid-morning tomorrow, barring the type of disasters that happened today.

Office Communications Server Deployment, Day 2

Edit (21 May 2008):

Apparently hash algorithms stronger than SHA1 will prevent any machine running Server 2003 or earlier, or Windows XP or earlier, from obtaining a certificate.  I implemented my root CA with a SHA512 hash algorithm and my subordinate CAs with a SHA256 hash algorithm.  I now get to redeploy the entire PKI.

6:41 AM : Back to Work

I actually got here over an hour ago, but have been catching up on my e-mail and such.  It’s almost 7am, and I’m ready to tackle Configuration Manager again.  There will probably be fewer updates today as I think I need to spend some time watching a few Webcasts this morning.  I checked briefly into the error with the Management Point and it seems like I recall something I needed to do in Active Directory with permissions.  I’ll try to find that and fix that problem first.

7:31 AM : Coffee Break

Taking a breather from the Webcast I’m watching (on System Center Configuration Manager 2007 SP1 and R2 upcoming releases) to grab a small cup of coffee.

7:37 AM : PKI

Another tangent, but I received the number I was waiting for to deploy our PKI.  I’m going to deviate from Configuration Manager long enough to get the PKI going, and then when I get around to it I can switch Configuration Manager over to Native Mode instead of Mixed Mode.  Again, I’m using Brian Komar’s Windows Server 2008 PKI and Certificate Security to make sure I follow updated best practices for deploying the PKI.

8:02 AM : Root CA capolicy.inf

I’m using the following configuration to initialize my enterprise root CA:

[Version]
Signature = "$Windows NT$"

[BasicConstraintsExtension]
PathLength = 3
Critical = true

[Certsrv_Server]
RenewalKeyLength = 4096
RenewalValidityPeriodUnits = 20
RenewalValidityPeriod = years
CRLPeriod = days
CRLPeriodUnits = 7
CRLDeltaPeriod = hours
CRLDeltaPeriodUnits = 4
DiscreteSignatureAlgorithm = 1
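
Before feeding a file like this to the CA installer, it can help to read it back and confirm the CRL schedule says what you meant. The sketch below uses Python’s `configparser` as a stand-in for a real INF parser, which is an approximation that happens to work for simple files like this one. The helper names are made up, and the sample text repeats only part of the configuration above.

```python
import configparser

SAMPLE = """\
[Version]
Signature = "$Windows NT$"

[Certsrv_Server]
RenewalKeyLength = 4096
RenewalValidityPeriodUnits = 20
RenewalValidityPeriod = years
CRLPeriod = days
CRLPeriodUnits = 7
CRLDeltaPeriod = hours
CRLDeltaPeriodUnits = 4
"""


def read_certsrv_settings(text):
    """Return the [Certsrv_Server] section as a plain dictionary."""
    # Straighten curly quotes, which creep in when text is pasted from a web page.
    for curly in "\u201c\u201d\u2033":
        text = text.replace(curly, '"')
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(text)
    return dict(parser["Certsrv_Server"])


def describe_period(settings, prefix):
    """Combine the <prefix>PeriodUnits and <prefix>Period keys, e.g. '7 days'."""
    return "{0} {1}".format(settings[prefix + "PeriodUnits"], settings[prefix + "Period"])


settings = read_certsrv_settings(SAMPLE)
print("Base CRL every", describe_period(settings, "CRL"))
print("Delta CRL every", describe_period(settings, "CRLDelta"))
```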

8:20 AM : Root CA Installed

I’m using the following script after installation to guarantee settings:

::Declare Configuration NC
certutil -setreg CA\DSConfigDN CN=Configuration,DC=extendhealth,DC=com

::Define CRL Publication Intervals
certutil -setreg CA\CRLPeriodUnits 52
certutil -setreg CA\CRLPeriod "Weeks"
certutil -setreg CA\CRLDeltaPeriodUnits 0
certutil -setreg CA\CRLDeltaPeriod "Days"
certutil -setreg CA\CRLOverlapPeriod "Weeks"
certutil -setreg CA\CRLOverlapUnits 2

::Apply the required CDP Extension URLs
certutil -setreg CA\CRLPublicationURLs "1:%windir%\system32\CertSrv\CertEnroll\%%3%%8%%9.crl\n10:ldap:///CN=%%7%%8,CN=%%2,CN=CDP,CN=Public Key Services,CN=Services,%%6%%10"

::Apply the required AIA Extension URLs
certutil -setreg CA\CACertPublicationURLs "1:%windir%\system32\CertSrv\CertEnroll\%%1_%%3%%4.crt\n2:ldap:///CN=%%7,CN=AIA,CN=Public Key Services,CN=Services,%%6%%11"

::Enable all auditing events for the Extend Health Root CA
certutil -setreg CA\AuditFilter 127

::Set Validity Period for Issued Certificates
certutil -setreg CA\ValidityPeriodUnits 10
certutil -setreg CA\ValidityPeriod "Years"

::Enable discrete signatures in subordinate CA certificates
certutil -setreg CA\csp\DiscreteSignatureAlgorithm 1

::Restart Certificate Services
net stop certsvc & net start certsvc

::Publish a new CRL
certutil -crl
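
Two of the numbers in the script above are bit masks: the AuditFilter value of 127 and the prefixes (1 and 10) on the CRL publication URLs. The sketch below decodes them. The flag meanings are as I understand them from Microsoft’s certutil and Certificate Services documentation, so treat the tables as assumptions to verify rather than an authoritative list.

```python
# Assumed meanings of the AuditFilter bits.
AUDIT_FILTER = {
    1: "start and stop Certificate Services",
    2: "back up and restore the CA database",
    4: "issue and manage certificate requests",
    8: "revoke certificates and publish CRLs",
    16: "change CA security settings",
    32: "store and retrieve archived keys",
    64: "change CA configuration",
}

# Assumed meanings of the numeric prefix on each CRLPublicationURLs entry.
CDP_FLAGS = {
    1: "publish CRLs to this location",
    2: "include in the CDP extension of issued certificates",
    4: "include in CRLs so clients can find delta CRLs",
    8: "include in all CRLs",
    64: "publish delta CRLs to this location",
    128: "include in the IDP extension of issued CRLs",
}


def decode_flags(value, table):
    """Return the meaning of every bit in table that is set in value."""
    return [meaning for bit, meaning in sorted(table.items()) if value & bit]


print(len(decode_flags(127, AUDIT_FILTER)), "audit categories enabled")
for meaning in decode_flags(10, CDP_FLAGS):
    print("LDAP CDP entry:", meaning)
```

Read this way, 127 is simply every audit category switched on (1 + 2 + 4 + 8 + 16 + 32 + 64), and the 10 on the LDAP URL is 2 + 8.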

8:26 AM : Root CA Configuration Complete

Everything seems good on the root CA, moving on to the policy CA.

10:21 AM : Back on Task

I was distracted for a couple of hours talking to Microsoft and taking care of some tasks around the office, but am back on task.  I just imported the certificate revocation lists onto the policy CA.  I wasn’t able to make Brian’s command line (page 125) work, so I just right-clicked each file and let it import the way it wanted to.  I’m a bit concerned since that adds them to the user account’s stores, but we’ll see if it causes a problem.

11:02 AM : Policy CA capolicy.inf

I’m using the following configuration to initialize the policy CA:

[Version]
Signature = "$Windows NT$"

[PolicyStatementExtension]
Policies = ExtendHealthCPS

[ExtendHealthCPS]
OID = 1.3.6.1.4.1.31088.1.1
Notice = "By enrolling a certificate from this certificate server, you agree to the posted legal notice."
URL = "http://capolicies.extendhealth.com/defaultCps.aspx"

[Certsrv_Server]
RenewalKeyLength = 2048
RenewalValidityPeriodUnits = 10
RenewalValidityPeriod = years
CRLPeriod = days
CRLPeriodUnits = 7
CRLDeltaPeriod = hours
CRLDeltaPeriodUnits = 4
DiscreteSignatureAlgorithm = 1

I also just realized that I was supposed to save capolicy.inf to the %WINDIR% (usually C:\Windows) folder, not the system32 folder.  Maybe that’s why it didn’t work last time.

11:12 AM : Policy CA Installed

I’m using the following script after installation to guarantee settings:

::Declare Configuration NC
certutil -setreg CA\DSConfigDN CN=Configuration,DC=extendhealth,DC=com

::Define CRL Publication Intervals
certutil -setreg CA\CRLPeriodUnits 52
certutil -setreg CA\CRLPeriod "Weeks"
certutil -setreg CA\CRLDeltaPeriodUnits 0
certutil -setreg CA\CRLDeltaPeriod "Days"
certutil -setreg CA\CRLOverlapPeriod "Weeks"
certutil -setreg CA\CRLOverlapUnits 2

::Apply the required CDP Extension URLs
certutil -setreg CA\CRLPublicationURLs "1:%windir%\system32\CertSrv\CertEnroll\%%3%%8%%9.crl\n10:ldap:///CN=%%7%%8,CN=%%2,CN=CDP,CN=Public Key Services,CN=Services,%%6%%10"

::Apply the required AIA Extension URLs
certutil -setreg CA\CACertPublicationURLs "1:%windir%\system32\CertSrv\CertEnroll\%%1_%%3%%4.crt\n2:ldap:///CN=%%7,CN=AIA,CN=Public Key Services,CN=Services,%%6%%11"

::Enable all auditing events for the Extend Health Policy CA
certutil -setreg CA\AuditFilter 127

::Set Validity Period for Issued Certificates
certutil -setreg CA\ValidityPeriodUnits 5
certutil -setreg CA\ValidityPeriod "Years"

::Enable discrete signatures in subordinate CA certificates
certutil -setreg CA\csp\DiscreteSignatureAlgorithm 1

::Restart Certificate Services
net stop certsvc & net start certsvc

::Publish a new CRL
certutil -crl

11:32 AM : Publish to Active Directory Complete

I just finished publishing all CRLs and relevant certificates to Active Directory so that they are still available when I take the root and policy CAs offline.  I’m taking the root CA down and beginning installation of an issuing CA.

11:50 AM : Issuing CA capolicy.inf

I’m using the following configuration to initialize the issuing CA:

[Version]
Signature = "$Windows NT$"

[Certsrv_Server]
RenewalKeyLength = 2048
RenewalValidityPeriodUnits = 5
RenewalValidityPeriod = years
CRLPeriod = days
CRLPeriodUnits = 3
CRLOverlapPeriod = hours
CRLOverlapPeriodUnits = 4
CRLDeltaPeriod = hours
CRLDeltaPeriodUnits = 12
DiscreteSignatureAlgorithm = 1

11:59 AM : Issuing CA Installed

I’m using the following script after installation to guarantee settings:

::Declare Configuration NC
certutil -setreg CA\DSConfigDN CN=Configuration,DC=extendhealth,DC=com

::Define CRL Publication Intervals
certutil -setreg CA\CRLPeriodUnits 3
certutil -setreg CA\CRLPeriod “Days”
certutil -setreg CA\CRLDeltaPeriodUnits 12
certutil -setreg CA\CRLDeltaPeriod “Hours”
certutil -setreg CA\CRLOverlapPeriod “Hours”
certutil -setreg CA\CRLOverlapUnits 4

::Apply the required CDP Extension URLs
certutil -setreg CA\CRLPublicationURLs “1:%windir%\system32\CertSrv\CertEnroll\%%3%%8%%9.crl\n10:ldap:///CN=%%7%%8,CN=%%2,CN=CDP,CN=Public Key Services,CN=Services,%%6%%10″

::Apply the required AIA Extension URLs
certutil -setreg CA\CACertPublicationURLs  “1:%windir%\system32\CertSrv\CertEnroll\%%1_%%3%%4.crt\n2:ldap:///CN=%%7,CN=AIA,CN=Public Key Services,CN=Services,%%6%%11″

::Enable all auditing events for the Extend Health issuing CA
certutil -setreg CA\AuditFilter 127

::Set Validity Period for Issued Certificates
certutil -setreg CA\ValidityPeriodUnits 2
certutil -setreg CA\ValidityPeriod "Years"

:: Enable discrete signatures in subordinate CA certificates
Certutil -setreg CA\csp\DiscreteSignatureAlgorithm 1

::Restart Certificate Services
net stop certsvc & net start certsvc

certutil -crl
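
The numeric prefixes in the CRLPublicationURLs value (1 and 10 in the script above) are bit flags.  The Python below is a small sketch of decoding them.  The flag meanings are my recollection of the check boxes on the CA's Extensions tab and the matching constants in the Windows headers, so treat them as an assumption to verify against Microsoft's documentation.  The sample value uses the expanded form of the path and variables rather than the batch-escaped form.

```python
# Bit flags for CRLPublicationURLs entries (recalled from memory; verify
# against Microsoft's documentation before relying on them).
CRL_URL_FLAGS = {
    0x01: "publish CRLs to this location",
    0x02: "include in the CDP extension of issued certificates",
    0x04: "include in CRLs so clients can find delta CRL locations",
    0x08: "include in all CRLs",
    0x40: "publish delta CRLs to this location",
}


def decode_entry(entry: str) -> tuple:
    """Split a 'flags:url' entry and list the meanings of the set bits."""
    flags_text, url = entry.split(":", 1)
    flags = int(flags_text)
    meanings = [text for bit, text in CRL_URL_FLAGS.items() if flags & bit]
    return url, meanings


value = (r"1:C:\Windows\system32\CertSrv\CertEnroll\%3%8%9.crl"
         r"\n10:ldap:///CN=%7%8,CN=%2,CN=CDP,CN=Public Key Services,CN=Services,%6%10")

# On the certutil command line, the two characters backslash-n separate entries.
for entry in value.split(r"\n"):
    url, meanings = decode_entry(entry)
    print(url)
    for meaning in meanings:
        print("   -", meaning)
```

Read this way, the file path is only a publication target, while the LDAP path (10 = 2 + 8) is what ends up referenced in issued certificates and CRLs.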

12:10 PM : PKI Complete

For all intents and purposes, I believe the PKI deployment to be complete.  Going to lunch and then back to Webcasts.

1:50 PM : Finished Webcasts

Just finished watching a Webcast on Configuration Manager 2007 SP1 and R2, and a two-part series on using Configuration Manager to deploy operating systems.

4:09 PM : End of Day

Watched several more Webcasts and tried to fix some errors in Configuration Manager’s status view.  No luck tonight.  Will start again tomorrow.

Office Communications Server Deployment, Day 1

And so it begins.  As promised, I plan to chronicle in detail my journey through deploying Office Communications Server 2007.  The first several days will be filled with deploying supporting infrastructure.  We have made the decision to cut over from a separate internal domain name to a domain name that aligns with our external e-mail address domain and what will be our SIP domain (extendhealth.com).  All times are in MST.

11:50 AM : Current State

I should probably start by detailing my starting environment.  As I said above, we are cutting over to a new domain.  As such, we have a completely clean domain to work with in a brand new forest.  The new domain is called extendhealth.com.  I haven’t done anything to the domain outside of creating a user account for installation.  The user account is a domain, enterprise, and schema admin in anticipation of the OCS deployment (where I’ll need rights from all three groups).  In general, that’s not the recommended action.  Various tasks should be delegated to different personnel, and the permissions should be locked down much more tightly than they currently are in this domain.

The domain controller is running Windows Server 2008 Standard x64 (RTM).  It is up-to-date with all patches.  The domain controller is also virtualizing four virtual machines (VM) at this point, three of which will go offline shortly.  Two of the machines are an enterprise root certification authority (CA) and a policy server for the public key infrastructure (PKI).  Both of these machines will be taken offline and only be brought online when the issuing CAs need to have their certificates renewed.  [Side note: I knew quite a bit about setting up a PKI before starting this process, but am referencing Brian Komar's Windows Server 2008 PKI and Certificate Security for any questions I have.]  The PKI is currently pending a private OID request from the IANA.  This will allow us to use certificates in a manner that caters to external publishing of those certificates.  The certificates will not be used for public certificate chains – we have a wildcard certificate for that – but the official OID makes it easier to publish certificate policies and have them accepted by other parties.

The other machine that will go offline shortly is a temporary database server running SQL Server 2008 x64 February CTP.  When the next CTP comes out, we will install a database cluster and move our databases to the cluster.  In case I need to reference it, the name of this server is currently dbcluster1 (even though it’s not clustered).  The SQL Server install is complete and has all features installed.  I used a service account with a 64-character password generated by an online password generator.  Because these passwords are so random, we actually store them in a database that is particularly locked down.  Only one or two people in our entire organization have access to this database.

12:07 PM : Configuration Manager Intro

I’m currently looking at the pre-validation screen for System Center Configuration Manager 2007.  Configuration Manager is the fourth virtual machine (the only one that won’t go offline) on the domain controller.  I’ve allocated four processors and eight gigs of RAM for this server, which is currently named mgr1.  The Configuration Manager installation isn’t related to OCS; it’s more of a general infrastructure setup that I want to get out of the way.  We will be using Configuration Manager to deploy operating systems and updates.  I picked up a book on Configuration Manager the other day at the Microsoft Company Store.  (I was up in Redmond for the mid-sized market CIO summit.)  The plan is to set up a very simple deployment of Configuration Manager before deploying OCS.  I have to deploy operating systems all along the way, so there’s no telling how much time it will actually take.  Configuration Manager should facilitate the deployment of those operating systems and their updates.  My naive estimate would be that it will take today and tomorrow to install and play with Configuration Manager, and then deployment of OCS will start Wednesday.  I feel very well prepared on my OCS deployment.  Configuration Manager scares me, however.  The deployment doesn’t seem especially streamlined, meaning I may make a significant mistake and have to redeploy.  We’ll see how things actually turn out.

12:13 PM : Configuration Manager Pre-validation Run 1

I haven’t done anything at all to this server (other than updates, joining to the domain, and enabling Remote Desktop).  It’s listing two warnings (I haven’t run the schema extensions and I don’t have the WSUS SDK) and several errors related to IIS, BITS, and WebDAV not being installed/running.  The only error that actually surprises me is the SQL Server sysadmin rights error.  Aside from that, I’m going to fix the other errors before looking into that one.

12:22 PM : Fixing Validation Errors

I installed a default installation of the IIS and Application Server Roles, and am downloading/installing Windows Server Update Services (WSUS) 3.0 SP1 from http://www.microsoft.com/downloads/details.aspx?FamilyId=F87B4C5E-4161-48AF-9FF8-A96993C688DF&displaylang=en.  I’m also downloading and installing the 64-bit version of WebDAV for IIS7.  Last but not least, I’m deciding whether or not I need to extend the Active Directory schema by reading this article.

12:34 PM : Schema Extensions and WSUS

I’ve decided to extend the schema per Microsoft’s recommendation and am adding some functionality to IIS per WSUS installation requirements in the ReadMe.  This is what I need to verify is enabled/installed in IIS:

  • Windows Authentication
  • Static Content
  • ASP.NET
  • 6.0 Management Compatibility
  • 6.0 IIS Metabase Compatibility

I also noticed the BITS Server Extensions weren’t enabled when I went into the Features part of Windows Server 2008, so I enabled them.  After installing those pieces, WSUS still alerts me that I don’t have the Microsoft Report Viewer 2005 Redistributable installed, but I don’t care about that until I need it.

    12:55 PM : Installing and Updating WSUS

    Still working on installing WSUS.  I had to provision a drive for updates on our SAN, which took a bit, and I’ve worked through all the other issues that I know of.  The installer is currently running.

    1:26 PM : Lunch

    Frustrated.  WSUS installed successfully and BITS and WebDAV certainly seem to be installed, but the Prerequisite Checker doesn’t seem to see them.  Rebooting and breaking for lunch.

    2:02 PM : Back to Work

    No change on reboot.

    2:19 PM : Success!

    Extended the Active Directory schema using ExtADSchema.exe in SMSSETUP/I386.  Installed a couple of additional IIS components (WMI compatibility, console) that cleared up the errors regarding BITS and WebDAV.  All systems are go at this point, but I’m a bit leery of what will be installed on dbcluster1 (my temporary SQL Server).  I had to turn off the firewall to get all checks to pass.  I’ll re-enable it after the install is complete, but having to turn it off to get the Prerequisite Checker to work doesn’t seem like a good sign.

    2:22 PM : Configuration Manager Installation

    Step by step:

    1. Selected “Install a Configuration Manager site server”
    2. Agreed to license terms
    3. Selected “Custom settings” (largely because the book recommends it)
    4. Selected “Primary site” since this is my first (and only) site
    5. Agreed to Customer Experience Improvement Program – I want Microsoft to improve installation environment awareness
    6. Product key was read-only
    7. Left default path (C:\Program Files (x86)\Microsoft Configuration Manager)
    8. Entered site code (DC1) and name
    9. Chose “Configuration Manager Mixed Mode”*
    10. Added NAP to selected client agents
    11. Specified SQL Server (dbcluster1) and database (sccm2007_dc1)
    12. Left default location (mgr1) for SMS provider – since the database will eventually be on a cluster, I can’t install the SMS provider there
    13. Left defaults for management point (install a management point on mgr1)
    14. Left defaults for port settings (HTTP/80 since I selected Mixed Mode)
    15. Allowed checking for updated prerequisite components
    16. Specified a download path for prerequisite components

    2:33 PM : Settings Complete

    After downloading a number of unnecessary prerequisites (multiple languages for Windows XP and Server 2003, neither of which is running), settings are complete and installation is ready to begin.  The installer, however, complains that the machine account for mgr1 does not have admin privileges on the SQL Server.

    2:36 PM : Settings Complete, Take 2

    Added computer account for mgr1 to the Administrators group on dbcluster1.  Prerequisite check has passed.  Install began at 2:37 PM.

    2:40 PM : Fatal Error

    Fatal errors during database initialization.  Not sure what that means since it created the database and tables.  Some tables are also populated.  (I looked at dbo.Agents.)  Great.  I have a message that says: Setup has detected an incomplete primary site installation on this computer.  You must uninstall the incomplete installation before continuing.  Here we go.

    2:48 PM : Fatal Error, Take 2

    Again with the fatal error.  Log (C:\ConfigMgrSetup.log) says: <05-12-2008 14:38:58> ***SqlError: [42000][650][Microsoft][ODBC SQL Server Driver][SQL Server]You can only specify the READPAST lock in the READ COMMITTED (if not based on row versioning) or REPEATABLE READ isolation levels. : sp_SetupSDMPackage

    Googling it.

    2:52 PM : Not Good

    https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=329707

    Starting over with a SQL Server 2005 SP2 database.  Back in a couple of hours.

    5:24 PM : A New Error

    After installing SQL Server 2005 as the SQL2005 instance and trying to bind it to the standard SQL port (1433), I couldn’t get the Configuration Manager installer to see the instance, so I uninstalled both SQL 2008 and SQL 2005, and then reinstalled SQL 2005 and SP2 for the second time today.  That means the bulk of my time today has been spent installing and uninstalling SQL Server.  I’m now on to a new error: the error message says “Setup failed to install SMS Provider.”  Logs give me the following errors:

    <05-12-2008 17:24:14> CompileMOFFile: Failed to compile MOF C:\Program Files (x86)\Microsoft Configuration Manager\bin\i386\smsRprt.mof, error -1
    <05-12-2008 17:24:14> Setup cannot compile MOF file C:\Program Files (x86)\Microsoft Configuration Manager\bin\i386\smsRprt.mof.  Do you want to continue?
    <05-12-2008 17:24:14> Setup failed to install SMS Provider.  For more information about this error, see Microsoft Knowledge Base at http://microsoft.com or contact Microsoft Technical Support for further assistance.

    Other .mof files apparently compiled successfully before this one.  Back to Google.

    6:28 PM : Finished?

    I’ve finally made it through the wizard (it only took most of the day).  I have some pretty serious complaints.  The first would be that things like extending the schema should be part of the wizard.  The second was the problem I just spent an hour on: Kerberos issues.  I did eventually find my answer at http://myitforum.com/cs2/blogs/rcrumbaker/archive/2007/10/12/system-center-configuration-management-with-remote-sql-installations.aspx.  That happens to be the clearest explanation of a couple of really complex issues – SPNs and delegation.  We’ve had a ticket open with Microsoft for over 18 months regarding a particular Kerberos issue and have had many, many people unsuccessfully try to fix the issue.  Anyway, I had to set up two SPNs, one each for the NetBIOS name and the FQDN of dbcluster1.  The commands I ran (from the domain controller) were:

    1. setspn -A MSSQLSvc/dbcluster1.extendhealth.com:1433 extendhealth\sqlservice
    2. setspn -A MSSQLSvc/dbcluster1:1433 extendhealth\sqlservice
    3. setspn -l extendhealth\sqlservice

    Two notes: first, the last command runs setspn in “list” mode, so that you don’t have to run adsiedit.msc.  Don’t get me wrong, I actually think adsiedit.msc is much better (and faster) at editing SPNs – but I thought I didn’t have it available, which brings me to my second note.  Setspn is available from the command line on Windows Server 2008 domain controllers (more accurately, computers with the AD DS role installed).  Adsiedit is also apparently available there, but doesn’t bind to your directory root by default.
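
The two registrations follow a fixed pattern: service class, host name, and port, once for the short name and once for the FQDN.  As a sketch only, the Python below builds the same setspn command lines from those parts.  The helper name is made up, and it prints the commands rather than running them.

```python
def sql_spn_commands(host: str, domain: str, account: str, port: int = 1433) -> list:
    """Build setspn commands for the FQDN and NetBIOS name of a SQL Server host."""
    names = [f"{host}.{domain}", host]  # FQDN first, then the short name
    return [f"setspn -A MSSQLSvc/{name}:{port} {account}" for name in names]


commands = sql_spn_commands("dbcluster1", "extendhealth.com", r"extendhealth\sqlservice")
for command in commands:
    print(command)
```

Generating both forms together makes it harder to register one name and forget the other, which is an easy way to end up with Kerberos working for some connection strings and not others.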

    It seems to me that the Prerequisite Checker should have caught the problem if the SPNs weren’t configured properly.  Whining aside, I did make it the rest of the way through the wizard and only one thing had a red X by it: the management point.  After reviewing the log, it seems that just the monitoring of the management point failed, and when I open the console everything seems to be functional.  I think I’ll leave it at this point (when I can be optimistic) and pick it up again tomorrow.

    Interact 2008 Summary, Day 1

    I am now wrapping up the third and final day of the Interact 2008 (not to be confused with Interact 2008) conference.  I don’t know what I can say about it other than to say that it has been time extraordinarily well spent.  Many things that you do in life make for worthwhile days; as far as careers go, this ranks among the most useful days I’ve ever had.  I don’t intend that statement to be either hyperbole or summarily discardable.  There has been absolutely fantastic face time with Microsoft employees and a wonderful opportunity to interact (no pun intended) with key vendors and other attendees.  Although I’m completely saturated and exhausted, I’ll try to give a rundown of the sessions and events.  First up, the sessions that I attended.

    Tuesday

    Keynote (Gurdeep Singh Pall)

    Overall a great keynote, but very similar in content to the keynote delivered at VoiceCon 2008.  I can’t hold that too much against him since I’m sure that writing new keynote speeches for every event he speaks at probably isn’t and shouldn’t be his priority.  That said, I recommend you watch the actual keynote from VoiceCon rather than reading my poor summary of what Gurdeep said.

    Panel on Planning Voice Architecture and Deployment in Microsoft Office Communications Server 2007 (Mahendra Sekaran; Sean Olson; John Kenerson; Francois Doremieux; Russell Bennett; Jens Trier Rasmussen; Ken Ewert)

    I tremendously enjoyed this session; it was an open forum for anyone with a question to address some of the brightest minds on the OCS team.  The people that stood out to me in particular were Mahendra Sekaran, who answered my question about topologies with 100-person outsource shops (and whether we needed to deploy a pool/enterprise voice equipment at that location); Sean Olson, who answered most of the questions about general vision; Francois Doremieux, who handled many questions about actual deployments; and Russell Bennett, who contributed intelligent comments to several questions.  I particularly enjoyed the range of knowledge available in the room; it seemed that there were answers for every question asked.

    Microsoft Office Communications Server 2007 Edge Drill Down (Wajih Yahyaoui)

    This was a very good session on details for the various edge servers.  Wajih has a very noticeable accent, but was obviously passionate about the subject matter, so it was a real pleasure to listen to him.  He was able to handle most of the questions in the room, but it was helpful that several other knowledgeable Microsoft personnel were there also.  The conversation had one major interruption from a guy who I considered to be acting very belligerently.  The contention was based around whether OCS’s requirement to open port ranges in external firewalls unnecessarily creates security vulnerabilities.  Neil Deason responded that a firewall is only ultimately secure if all ports are closed: it makes no difference whether there are 10,000 ports or one port open, your network is vulnerable if you have ports open in your firewall.  While I don’t think that response was a good response overall, the point of the response should be considered sufficient.  The point of what Neil was saying is that firewalls are only one element of a properly hardened network’s defenses.  Hardening a network involves hardening multiple elements, not just the firewall.  The firewall-only approach is typically described like an M&M: crunchy on the outside, chewy on the inside.  If you have a network defended only by a firewall, your network is vulnerable to internal attacks.  Although research shows that most attacks originate from outside the network, the same research shows that an alarming percentage of breaches originate from inside the network.  See http://answers.google.com/answers/threadview?id=15439 for links to credible sources.

    Advanced Validation and Troubleshooting for OCS 2007 (Byron Spurlock, Tom Laciano)

    Byron and Tom handled this session on how to ascertain the source of an OCS 2007 problem.  Both presenters were enjoyably humorous, considering the amount of time we’d been sitting that day.  We got a chance to see Byron use a number of tools such as the Snooper tool, validation wizards and more.  As I sat there, I realized how much I would have benefited from knowing that those tools existed a few weeks ago, when I spent a significant amount of time using Wireshark to diagnose what was wrong with OCS.  Afterwards, I went to one of the Coffee Chats with Tom and sat for a while as he explained in detail what subject names and subject alternative names are necessary for certificates in various scenarios.

    Evening Event: Surfing @ Wave House

    In the evening, we relaxed on the beach at the Wave House.  It was a great time breathing the (very) cool salt air, throwing back a few drinks, and doing some surfing!  I stunk (figuratively), but here’s a shot, courtesy of my colleague David DeWinter who went with me:

    Mark Surfing

    OCS Roles Primer, Part 2

    In part 1 of this post, we examined the core pool roles for Microsoft Office Communications Server 2007.  Specifically, we covered front-end servers, directors, the three variants of conferencing servers, and the archiving and CDR server.  There are still several key roles to be covered to understand the full breadth of the OCS offering.  These roles fit into three key areas: edge servers, telephony servers, and “other”.  Before we get into the specifics of the roles, please take a brief moment to review the vocabulary from part 1:

    • Office Communications Server: A Microsoft product designed to facilitate communications both inside and outside the office.
    • Presence: A metric that takes into account both your availability (available, idle, away) and your willingness (available, busy, on a call) to communicate.
    • Endpoint: Any device (SIP phone) or software package that registers itself with Office Communications Server as belonging to a user, meaning that the user can be contacted through the device or software package.
    • Enterprise Voice: Probably the most noteworthy addition to the product since 2005; allows calls to enter and exit Office Communications Server.  This means that from any endpoint, users can make or receive calls to traditional phone numbers.
    • Public Switched Telephone Network (PSTN): The traditional telephone network that delivers telephone service over dedicated copper cables.

    Edge Servers

    Access Edge Server

    The access edge server provides three very key services: authenticating and enabling connectivity for remote users, negotiating federated communications, and connecting to public IM services such as MSN, AOL, and Yahoo.  Authentication and communications with remote users is unequivocally the most common usage of access edge server.  This server is critical whenever an employee needs to use Communicator but is outside of the corporate LAN.  Traveling sales representatives with Communicator Mobile, home-based employees, and similar situations are all supported by access edge server.  Federation is the term used to refer to two Active Directory domains that have set up a federated relationship.  Note that a federated relationship is not the same thing as a domain trust, but is similar.  Generally federation happens along corporate boundaries.  Two companies in a strategic alliance or other partnership will federate to allow key contacts greater visibility and easier access to communications.  Microsoft OCS can also allow connectivity with public IM services, enabling communications from Communicator to MSN Messenger, AIM, or Yahoo! Messenger.

    Personal Side-note: Access edge server is one of the most amazing roles in my opinion.  I have been witness to quite literally taking an OCS endpoint, moving it outside of the network, and having it seamlessly connect back up to OCS without any additional configuration.  Imagine being able to grab your desk phone and go home for the day!  We currently have a Cisco UCCX system in place.  In order to take my phone home, I have to take a hardware VPN home, hook it up directly to my cable modem (in the basement) and then hook my phone straight to that.  With OCS, I was able to take my laptop home, turn on my wireless and connect immediately.  If I can say one thing that would be the most important thing for a Cisco customer to hear, it’s this:

    Our Cisco system is technically capable of achieving everything we need it to achieve, but our experience with OCS has blown us away.  Actually getting your hands on to a sample OCS setup is the best thing that you can do for yourself.

    To summarize, the access edge server:

    • Authenticates and enables connectivity for remote users
    • Allows two entities to federate, which in turn allows greater visibility for communications
    • Allows connectivity to public IM networks

    A/V Edge Server

    The A/V edge server enables audio and/or video conferences to happen with users outside of the corporate LAN.  It is important to note that telephony conferences are considered distinct from this scenario and are covered by the telephony conferencing server (see part 1).  The A/V edge server allows remote users authenticated by access edge server to establish internal audio or video calls, or VoIP calls for enterprise telephony scenarios.

    Web Conferencing Edge Server

    Similar to the A/V edge server, the web conferencing edge server enables Live Meeting 2007 sessions to include users outside of the corporate LAN.  Many companies will use this role slightly differently than they will the other edge server roles.  Where access edge server and A/V edge server are deployed to allow external known users to connect and conference, Web conferencing edge server may arguably be used to conference in more anonymous users (who are still actually authenticated by digest authentication) than known users.  This allows companies an internally controlled, paid-for mechanism similar to WebEx that allows public sharing of desktops and other information.

    Requirements** (for all edge servers):

    • Dual processor, dual core 3.0GHz+ processor
    • 2 x 18GB HDD
    • 4GB+ RAM
    • 2 x Gigabit NIC
    • Windows Server 2003 SP1+*

    * I was not able to get the OCS primary installer to run successfully on Windows Server 2008 RTM.  It may be that the individual installers would run successfully, but I have not confirmed this.  The only role I have successfully installed on Windows Server 2008 is Speech Server 2007.

    ** The work of mixing audio channels is intense; A/V servers will benefit from more robust hardware.

    Communicator Web Access

    Communicator Web Access (CWA) is to Office Communications Server what Outlook Web Access is to Exchange Server 2007.  It provides an attractive, AJAX (slick update without refresh) based interface for internal or external users to use.  CWA functions much like the director role in that it proxies connections, but differs in that it also proxies internal connections.  Also, CWA is restricted to communicating via instant messaging.  There is no support for audio/video conferences, Live Meeting, or enterprise voice.

    Requirements:

    • Dual processor, 3.2GHz+ processor
    • 1 x 36GB HDD
    • 4GB+ RAM
    • Gigabit NIC
    • Windows Server 2003 SP1+*

    * I have not yet attempted to install this role on Windows Server 2008.

    Web Components Server

    This role has probably the least visible functionality of all server roles: its primary responsibilities are to allow users to join Web conferences by clicking a URL, allow download of Address Book data, and expand membership in distribution groups (in some ways, simply an expansion of the Address Book functionality).

    Requirements:

    • Dual processor, dual core 2.6GHz+
    • 2 x 18GB HDD
    • 2GB+ RAM
    • Gigabit NIC
    • Windows Server 2003 SP1+*

    * I have not yet attempted to install this role on Windows Server 2008.

    Mediation Server

    Contrary to the Web components server, the mediation server role has profound visibility and is arguably as important as a front-end, back-end, or edge server.  The mediation server is what makes enterprise voice possible.  When Microsoft implemented enterprise voice, they elected to use proprietary codecs (RTAudio and RTVideo) in order to overcome some significant hurdles such as inconsistent bandwidth.  However, their choice to use these proprietary codecs meant that right from the beginning, Microsoft wasn’t able to play nicely with many pieces of PSTN hardware.  In their defense, the enterprise voice market is very confused right now.  There are many competing standards such as ICE, SIP, and others that still aren’t fully or consistently supported.  Microsoft saw this and decided that it would be easier to simply draw a strong line between external and internal voice traffic.  That line is drawn right through mediation server.

    Microsoft states that there are three ways to connect Office Communications Server to the PSTN.  The first is through a basic media gateway.  A basic media gateway is simply a piece of hardware that terminates PSTN lines (whether in FXS/FXO or T/E/DS form).  The media gateway’s responsibility is to accept incoming calls on the PSTN lines and hold the line open until the call is complete.  To know when the call is completed, the basic media gateway talks to the mediation server, generally via G.711.  The mediation server does the job of decoding G.711 voice traffic and encoding into RTAudio (and vice versa, for outbound voice traffic).

    A basic hybrid gateway does essentially the same thing except that it merges the mediation server role directly onto the media gateway.  The benefit of a basic hybrid gateway over a basic media gateway fundamentally boils down to TCO: it’s cheaper and easier to manage one box than it is to manage two.

    The final means of connecting Office Communications Server to the PSTN is for the media gateway itself to directly support the native OCS protocols (like RTAudio and ICE).  Microsoft calls this an advanced media gateway.  Please note that the difference between the advanced media gateway and the basic hybrid gateway is that in the basic hybrid scenario there are two functions coexisting on one box – they are still distinguishable functions.  With advanced media gateways, the functions are no longer distinguishable.  The media gateway natively speaks OCS’ language.

    In the next post in this series, we’ll consider a final smattering of server roles that don’t always require a full server, consider coexistence scenarios and some final “gotchas” that I wish I’d known about when I started deploying OCS.

    Queues are for the Brits, Part 2

    In part 1 of this article, I raised some issues with traditional ACD algorithms.  The issues raised are best summarized by generalizing* ACD algorithms as FIFO queues with the only variances being skill levels and the actual agent allocation algorithm.  Agent allocation algorithms generally break down into something fairly simple such as which agent has been off the phone the longest, taken the fewest calls, or spoken to the person before.  There is a lack of intelligence when it comes to considering a many-to-many call/agent match.

    * I really do understand that there are likely algorithms out there that do achieve 90% of what it is that we need to achieve.  The question is not, “How much does solution x achieve for company y out of the box?”, the question is, “To what level does solution x allow all companies to customize the algorithm to achieve 100% of their needs?”
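
For reference, the generalized "traditional" behavior described above can be sketched in a few lines of Python.  This is my own simplification (FIFO calls, longest-idle agent first, no skills), not any vendor's actual algorithm, and the names and data are invented for the example.

```python
from collections import deque


def assign_fifo(calls: deque, idle_since: dict) -> list:
    """Pair waiting calls with idle agents: oldest call, longest-idle agent first.

    `calls` holds caller IDs in arrival order; `idle_since` maps an agent name
    to the time (in seconds) at which that agent went idle.
    """
    assignments = []
    # Longest idle means the smallest idle_since timestamp.
    agents = sorted(idle_since, key=idle_since.get)
    for agent in agents:
        if not calls:
            break
        assignments.append((calls.popleft(), agent))
    return assignments


waiting = deque(["caller-1", "caller-2", "caller-3"])
idle = {"alice": 120, "bob": 45}
result = assign_fifo(waiting, idle)
print(result)
print("still waiting:", list(waiting))
```

Nothing in this loop asks whether bob is actually the best person for caller-1; it only asks who is next in each line.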

    A Proposed Solution

    Before I propose a solution, I should make two things clear.  First, I am by no means an expert in contact center theory.  There is a world of things I don’t know – contact center theory is one of them.  Second, I am not fully educated on the existing solutions out there.  Due to the proprietary nature of algorithms and the obvious interests of the companies in protecting them, the best I can do is find descriptions of how algorithms currently work.  I can also see the flaws in our current UCCX system.

    So how do we determine which is the best agent to assign to a call?  Skills-based routing excels at limiting the pool of agents to available agents who are capable of taking the call.  We want to break past that barrier to achieve the following:

    A desirable algorithm for contact centers will create an optimal call-agent match, adjusted for time considerations.

    Note that the statement above does not de facto consider availability.  Availability certainly should be part of the equation, but only part.  For contact centers who have frequent repeat callers, agent consistency may be a desirable trait.  If agent consistency is desirable, the algorithm may state that if the caller’s assigned agent will be available in less than a minute, the caller will be placed on hold until the agent is available.
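
The agent-consistency rule in that last sentence is simple enough to sketch.  The Python below is illustrative only; the function name, the return values, and the idea of passing in an estimated wait are assumptions made for the example.

```python
HOLD_THRESHOLD_SECONDS = 60  # "less than a minute", per the rule above


def route(assigned_agent_wait, best_other_agent):
    """Pick who takes the call under the hold-for-assigned-agent rule.

    `assigned_agent_wait` is the estimated seconds until the caller's assigned
    agent is free, or None if the caller has no assigned agent.
    """
    if assigned_agent_wait is not None and assigned_agent_wait < HOLD_THRESHOLD_SECONDS:
        return "hold for assigned agent"
    return best_other_agent


print(route(45, "next available agent"))
print(route(300, "next available agent"))
```

A hard threshold like this is the crudest form of the idea; the multipliers discussed below turn it into a sliding preference instead of a yes/no decision.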

    Base Match Score

    Skills-based routing has achieved the first part of the equation: making sure that someone capable of handling the call is assigned to the call.  Because there are only a few dimensions involved (call disposition, available agents, and agent skills), the call match may not be entirely optimal.  The first step to creating an optimal call-agent match is to increase the number of dimensions that factor into the final assignment.  As was stated in the introduction to this post, we have designed an algorithm that takes a number of factors into consideration.  A sample list of considerations may include the following factors:

    clip_image001[9]

    ** Note that unless an alternative receptor has been configured for calls with a zero match score for all agents, all multipliers should be greater than zero.

    The above chart over-simplifies in one significant respect: it treats ranges as a flat match or non-match score.  In our actual algorithm, we support ranged multipliers, but this example needs to be easy to understand.  I also recommend this type of chart for gathering requirements from the business; it’s easier to understand and easier to supply rows rapidly.  That said, this may be the match score matrix for a fictional company.  The first row indicates that, as with many companies, skill match is critical to agent assignment.  The non-match multiplier minimizes the chance of an agent without a skill match being assigned to this call.  Unless voice mail or an alternate “queue” is configured to handle calls with a zero match score, I highly recommend having all multipliers be greater than zero.  The second and third rows assign a significant priority to agents who have spoken with this person before.  The third to fifth rows flatten a ranged multiplier.  In a real implementation, I recommend attempting to use an actual mini-algorithm to deal with these types of issues, as the result is a more accurate, less “bucketed” match score.  In this case, we are attempting to de-prioritize any agent that is not currently available.  The longer it will take the agent to become available, the less likely it is that they will be assigned to the call.

    It is critical to note at this point that we feel this is one of the key differentiators of this algorithm.  Most algorithms only consider the available agent pool and rescan every few seconds if a match cannot be found.  In this case, a match can be made even before an agent finishes a call if the reason (previous contact, for instance) for assigning the call to that agent is compelling enough.  The IVR could even theoretically ask the person if they wanted to speak to their assigned agent and alter the match/non-match multipliers for assigned agent based upon their answer.

    The final row of the table assigns a value to variable cost.  If our company handles many of the calls in-house, those calls probably have a very low variable cost (most of the costs would be considered fixed or sunk costs).  If additional calls can be routed to an outsourcer for $20/call, it is important to know the value of routing that call.  In general, the algorithm should prefer routing to agents with the lowest variable cost.

    Once the match score criteria have been determined, an example matrix can be set up.  In this case, we are attempting to assign four incoming calls to three agents.  We consider each criterion for the match score and evaluate a base match score for all agents for all calls.  [We will continue to build on this same matrix when we introduce the timeline modifier.]

    image

    We calculated the match score by multiplying one times each of the values for the match/non-match values as appropriate for each match factor.
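
    The multiplication described above can be sketched in a few lines of Python.  This is only an illustration: the factor names and multiplier values are made up and are not the figures from the chart, and each factor is treated as a flat match/non-match pair, as in the simplified chart.

```python
from math import prod

# Hypothetical match factors: (multiplier if matched, multiplier if not matched).
# These values are illustrative only; they are not taken from the chart.
MATCH_FACTORS = {
    "skill_match": (5.0, 0.1),
    "previous_contact": (3.0, 1.0),
    "low_variable_cost": (1.5, 0.75),
}


def base_match_score(matches: dict[str, bool]) -> float:
    """Start at one and multiply by each factor's match or non-match value."""
    return prod(
        match if matches.get(name, False) else non_match
        for name, (match, non_match) in MATCH_FACTORS.items()
    )


# An agent with the right skill and a low variable cost, but no previous contact.
score = base_match_score(
    {"skill_match": True, "previous_contact": False, "low_variable_cost": True}
)
print(score)  # 5.0 * 1.0 * 1.5
```

    Because every multiplier is greater than zero, no agent ever ends up with a score of exactly zero, which is the reason for the recommendation about non-zero multipliers.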

    Timeline Modifier

    Now that we have a concept of a base match score, we need to introduce a timeline modifier.  The timeline modifier ensures that calls with poor match scores across the board eventually get picked up.  How compressed the timeline modifier is depends upon your business model.  If you wish to have a good match and have a high call volume, a longer timeline may make sense.  If you don’t care about match and just want to ensure that calls are picked up, a more compressed timeline may make sense.  You could even replicate a FIFO queue by using an extremely compressed timeline modifier.  We currently use this equation to calculate the timeline modifier: if the call is not yet ready to be transferred, we divide 1.25 by the number of minutes until transfer.  If the call is ready to be transferred, we add three to the number of minutes it has been ready for transfer.  If we treat negative numbers as calls that are ready to be transferred and positive numbers as calls that are not ready to be transferred, extending our example above yields the following timeline modifiers for our calls.

    image
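
    The timeline modifier equation described above is short enough to write out.  A minimal sketch, using the same sign convention as the text (positive means minutes until the call is ready, negative means minutes it has already been ready); the function name is hypothetical, and treating exactly zero as "ready" is my assumption.

```python
def timeline_modifier(minutes_until_transfer: float) -> float:
    """Timeline modifier using the sign convention from the text.

    Positive input: the call is still that many minutes away from transfer.
    Zero or negative input: the call has been ready for abs(input) minutes.
    """
    if minutes_until_transfer > 0:
        # Not yet ready: divide 1.25 by the number of minutes until transfer.
        return 1.25 / minutes_until_transfer
    # Ready: add three to the number of minutes the call has been waiting.
    return 3.0 + abs(minutes_until_transfer)


print(timeline_modifier(2))     # 0.625 (two minutes from transfer)
print(timeline_modifier(0))     # 3.0   (just became ready)
print(timeline_modifier(-1.5))  # 4.5   (ready for a minute and a half)
```

    Taken literally, the not-yet-ready branch exceeds 3 for any call closer than 25 seconds to transfer (1.25 / 3 of a minute), so it can briefly outrank a call that has just become ready.  Whether that should be clamped is a design choice the description above does not settle.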

    One of the most interesting things about the timeline modifier is that it really boils down to just another match factor, but we treat it differently for two reasons: first, it’s more critical that we have a smooth range rather than buckets when referring to the amount of time a call has been on hold.  Second, business people understand timelines to be distinct from other match criteria.

    Modified Match Score

    We then use the timeline modifier to calculate the modified match score.  Multiplying the base match score by the timeline modifier yields the modified match score, as shown:

    image
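
    In code, this step is a single multiplication per cell of the grid.  A rough sketch, assuming the base scores are held as a call-by-agent dictionary and each call has one timeline modifier; the numbers are hypothetical and are not the values from the grid.

```python
def modified_scores(base, modifiers):
    """Scale each call's row of base match scores by that call's timeline modifier."""
    return {
        call: {agent: score * modifiers[call] for agent, score in row.items()}
        for call, row in base.items()
    }


# Hypothetical base match scores (call -> agent -> score) and timeline modifiers.
base = {
    "Call 1": {"Agent 1": 2.0, "Agent 2": 0.5},
    "Call 2": {"Agent 1": 4.0, "Agent 2": 1.0},
}
modifiers = {"Call 1": 3.5, "Call 2": 0.625}

for call, row in modified_scores(base, modifiers).items():
    print(call, row)
```

    Here Call 2 has the better base score with Agent 1, but Call 1 has already been waiting, so its modified score with Agent 1 ends up higher (7.0 versus 2.5).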

    Call Assignment

    Finally, we assign calls by maximizing the sum of the modified match scores across all agent-call assignments.  In this case, our sum is maximized at 55.8125 by assigning Call 4 to Agent 1, Call 2 to Agent 2, and Call 3 to Agent 3.
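
    For a grid this small, the maximization can be done by simply trying every combination.  The sketch below is a brute-force illustration, not necessarily how a production ACD would do it; the scores are hypothetical (not the grid above), and it assumes there are at least as many calls as agents.

```python
from itertools import permutations


def best_assignment(scores):
    """Return (total, {agent: call}) maximizing the summed modified match score.

    scores[call][agent] is the modified match score.  Assumes there are at
    least as many calls as agents, so every agent receives exactly one call.
    """
    calls = list(scores)
    agents = list(next(iter(scores.values())))
    best_total, best_pairs = float("-inf"), {}
    # Try every ordered choice of len(agents) distinct calls, one per agent.
    for chosen in permutations(calls, len(agents)):
        total = sum(scores[call][agent] for agent, call in zip(agents, chosen))
        if total > best_total:
            best_total, best_pairs = total, dict(zip(agents, chosen))
    return best_total, best_pairs


# Hypothetical modified match scores for four calls and three agents.
scores = {
    "Call 1": {"Agent 1": 1.0, "Agent 2": 2.0, "Agent 3": 1.0},
    "Call 2": {"Agent 1": 3.0, "Agent 2": 9.0, "Agent 3": 2.0},
    "Call 3": {"Agent 1": 8.0, "Agent 2": 4.0, "Agent 3": 6.0},
    "Call 4": {"Agent 1": 20.0, "Agent 2": 5.0, "Agent 3": 4.0},
}
total, pairs = best_assignment(scores)
print(total, pairs)
```

    Brute force grows factorially with the number of calls and agents, so a larger contact center would more likely use a proper assignment solver (SciPy's `linear_sum_assignment` with `maximize=True` is one option), but the objective being maximized is the same.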

    Implications and Conclusion

    There are a number of implications to how we have assigned calls.  Note that Calls 2 and 3, which are not ready to be transferred, are already assigned to an agent by the algorithm.  We distinguish between optimal match and actual assignment.  For actual assignment, we prevent calls more than two minutes from transfer from being assigned.  We could achieve the same thing by tweaking our timeline modifier equation to yield a different timeline modifier.  This brings up perhaps the most important differentiator of this algorithm, however.  Because we consider calls that aren’t yet ready to be transferred, we have some level of predictive ability that allows a better call-agent match.  If we go back to part 1 of this post, it is easy to see why predictive ability is important.  Rather than just looking at the here-and-now, we look at the soon-to-be and are able to reserve agents whose skills match with calls that will soon be ready to transfer.  The key to getting this information is to have the IVR send the ACD periodic notifications of calls en route, their current disposition and probable end state.  Finally, it is important that we consider not only the best agent for the call, but also the best call for the agent.  Looking at the call assignment grid above, you’ll note that we assigned Call 3 to Agent 3.  However, Call 3 had a higher match score with Agent 1.  The reason we assign Call 3 to Agent 3 is that Agent 1 has a better match with a different call, meaning that Agent 3 gets Call 3 by process of elimination.
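
    The split between optimal match and actual assignment mentioned above amounts to a filter applied before calls are actually handed to agents.  A small sketch of the two-minute cutoff, with hypothetical names and sample data, using the same sign convention as the timeline modifier (positive means minutes until the call is ready):

```python
MAX_MINUTES_UNTIL_TRANSFER = 2.0  # the two-minute cutoff mentioned above


def assignable_calls(minutes_until_transfer: dict[str, float]) -> list[str]:
    """Keep calls that are ready (<= 0) or at most two minutes from transfer."""
    return [
        call
        for call, minutes in minutes_until_transfer.items()
        if minutes <= MAX_MINUTES_UNTIL_TRANSFER
    ]


# Hypothetical calls: one already waiting, one nearly ready, one still far off.
print(assignable_calls({"Call 1": -1.0, "Call 2": 1.5, "Call 3": 4.0}))
```

    Calls that fail the filter can still be scored, which is what gives the predictive view of which agents are worth reserving; they are simply not assigned yet.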

    The result of our experimentation with OCS has been this: last year, we struggled for months to hook into UCCX’s ACD in order to direct calls to the right destination based solely on one piece of information.  In our pilot with OCS, we were able to achieve multi-factor routing in a matter of days for a small fraction of the cost we incurred last year.  It is entirely accurate to say that Office Communications Server does not ship with an ACD.  The sleeper, however, is that Microsoft does ship a platform that is simple to hook into and allows development of a very complex and highly customized ACD that fits your business model.  For us, unless we find a blocking problem with OCS, the choice is simple.

    In future posts, we will start to lay out a high-level architectural diagram of how the various pieces work, where the messaging links are, and any gotchas that we find.