Came across an issue the other day that I thought I would share. When testing some Lync 2013 Enterprise Edition pool failovers, I got an error such as this when running Invoke-CsPoolFailover:
If "doing the needful" (checking the Front End services and, if applicable, the hardware load balancer) comes back clean, where do you go with this one?
There are some mentions of this error, or something similar, here and here, but nothing matched 100% and none of the solutions seemed applicable. The error did not show itself immediately after starting the failover, and several of the steps spawned by Invoke-CsPoolFailover, such as hydration, seemed to be going through fine. Once the error hits, the failover stops and the pool remains in a failed state until you bring it back with:
Set-CsRegistrarConfiguration -Identity "Service:Registrar:Pool1.contoso.com" -PoolState Active
Did you know you can use the Lync\SfB Logging Tool to log PowerShell? Let's take a look there.
If the Logging Tool is capturing while a cmdlet runs, it may record additional information that is not shown in the shell window, even with the -Verbose switch. This is especially useful when running something like Invoke-CsPoolFailover, which is one of those cmdlets that spawns many others as part of its run.
Looking at the log of the pool failover while logging with the Logging Tool did show an error like this:
15225 TL_INFO(TF_COMPONENT) 21530.211A8::05/21/2015-21:07:46.988.00003bc3 (PowerShell,ConfDirManagementClientFactory.Create:confdirmanagementclientfactory.cs(112))(0000000002026B62)Creating ConfDirManagement client for address [net.tcp://pool1eefe01.contoso.com:9001].
What's this? TCP port 9001? What the @#$@#$ is that used for? More on that in a minute.
A simple telnet or Test-NetConnection showed that this port was, in fact, blocked. Once it was opened up between the data centers that homed the paired pools in question, the Invoke-CsPoolFailover process worked correctly.
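For anyone who wants to repeat that check, here is a minimal sketch using Test-NetConnection. The Front End FQDNs are hypothetical placeholders (they are not from the environment above), and it assumes you run it from a Front End in one pool against the Front Ends of the paired pool:

```powershell
# Hypothetical FQDNs of the Front Ends in the paired pool; replace with your own.
$targetFrontEnds = 'pool2eefe01.contoso.com', 'pool2eefe02.contoso.com'

foreach ($fe in $targetFrontEnds) {
    # Test-NetConnection needs Windows Server 2012 R2 / PowerShell 4.0 or later.
    # On older servers, "telnet <fqdn> 9001" gives the same yes/no answer.
    $result = Test-NetConnection -ComputerName $fe -Port 9001 -WarningAction SilentlyContinue
    '{0}: TCP 9001 reachable = {1}' -f $fe, $result.TcpTestSucceeded
}
```

It is probably worth running the same test in the opposite direction as well, since failback runs from the other pool.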
So what is this port for? I looked around quite a bit and turned up zip, so I put a shout out on Twitter, and the most awesome SfB MCM\MVP Jonathan McKinney @ucomsgeek helped me by pointing out the Set-CsUserServer cmdlet. One of the parameters of Set-CsUserServer is -ConfDirManagementWcfTcpPort, whose default value is 9001. So it appears that a modification to the conference directories is done at pool failover, and if this port is blocked between the pools, the failover will, er, fail.
This can be verified while Invoke-CsPoolFailover is running by looking at the connections on the front end running the cmdlet with a netstat -ano:
You will see connections from the "Source" FE to each "Target" FE in the corresponding pool on TCP port 9001 as the steps are run. So the moral of this story is that this port should be considered in any planning for cross-site pool failovers.
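To cut the netstat output down to just these connections, something along these lines should work. This is a sketch, run on the Front End where Invoke-CsPoolFailover was started, and it assumes the port is still at its default of 9001:

```powershell
# Keep only the netstat lines where the local or remote port is 9001.
# The last column of each matching line is the owning process ID.
netstat -ano | Select-String -Pattern ':9001\s'

# Alternative on Windows Server 2012 or later: query outbound connections directly.
Get-NetTCPConnection -RemotePort 9001 -ErrorAction SilentlyContinue |
    Select-Object LocalAddress, RemoteAddress, RemotePort, State, OwningProcess
```

The connections only exist while the relevant failover steps are running, so you may need to run this a few times during the failover to catch them.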
Hope this helps and please let me know if you have any questions or comments.
Additional thanks to Jeremy Willey for tracking this down.