Skype for Business CU fails to install – Error 1603: Server.msp had errors installing

Microsoft has done a good job of making the patching process for Skype for Business as simple as possible, but sooner or later you may come across a server that simply will not install a CU.

When you look at the logs the error doesn’t give you a lot of information to work on:

Executing command: msiexec.exe  /update "Server.msp" /passive /norestart /l*vx "c:\patches\Server.msp-SRV01-[2018-11-28][19-27-42]_log.txt"

ERROR 1603: Server.msp had errors installing.

ERROR: SkypeServerUpdateInstaller failed to successfully install all patches

Right.

Luckily it does give you a log file in the first line. A REALLY BIG log file.

If you search for “error” you will likely find a few hits, but don’t get too worried. One entry in particular points you to yet another log. This is located in the AppData\Local\Temp folder of the user running the upgrade and is called LCSSetup_Commands.txt. Inside it you will find the following information:

Install-CsDatabase : Command execution failed: Install-CsDatabase was unable to find suitable drives for storing the database files. This is often due to insufficient disk space; typically you should have at least 32 GB of free space before attempting to create databases. However, there are other possible reasons why this command could have failed. For more information, see http://ift.tt/1Og9jlm
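The search described above can be sketched in PowerShell. This is a minimal example, assuming the installer log path shown in the first line of the output (the path below is only a placeholder) and that the update was run by the currently logged-on user. -LiteralPath is used because the generated file name contains square brackets, which -Path would treat as wildcards.

```powershell
# Placeholder: use the log path printed in the "Executing command" line.
$msiLog = 'C:\patches\Server.msp-SRV01-[2018-11-28][19-27-42]_log.txt'

# List lines containing "error" (Select-String is case-insensitive by default).
Select-String -LiteralPath $msiLog -Pattern 'error' -SimpleMatch |
    Select-Object LineNumber, Line

# The secondary log sits in the temp folder of the account that ran the update.
$setupLog = Join-Path $env:LOCALAPPDATA 'Temp\LCSSetup_Commands.txt'
if (Test-Path -LiteralPath $setupLog) {
    Select-String -LiteralPath $setupLog -Pattern 'failed' -SimpleMatch
}
```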

So it seems that Skype for Business won’t patch the database if free disk space on the server drops below a certain threshold. The log mentions 32 GB, but we’ve found that the real threshold is lower than this.

After a bit of housekeeping the patch will run through successfully.
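Before starting the housekeeping it helps to see how much free space each drive has. A minimal sketch using standard CIM queries; the 32 GB figure to compare against comes from the error message above and is not a hard limit.

```powershell
# Report size and free space for each fixed disk (DriveType 3 = local disk).
Get-CimInstance -ClassName Win32_LogicalDisk -Filter 'DriveType=3' |
    Select-Object DeviceID,
        @{ Name = 'SizeGB'; Expression = { [math]::Round($_.Size / 1GB, 1) } },
        @{ Name = 'FreeGB'; Expression = { [math]::Round($_.FreeSpace / 1GB, 1) } }
```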

Skype for Business Web Sites Fail to Work Using Microsoft Web Application Proxy

In a classic case of skim reading the documentation we had trouble publishing a Skype for Business environment externally using a Microsoft Web Application Proxy (WAP). This was mainly impacting the Skype for Business mobile client which was failing to log on.

You could still access all of the standard web sites, such as dialin and meet, through a browser though.

This happened due to a missed step when setting up the WAP. If the internal and external URLs are different, you need to disable the translation of URLs in the request headers. Use the following PowerShell commands on the WAP server.


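# Look up the ID of the published application by its name
$Rule = (Get-WebApplicationProxyApplication -Name "Insert Rule Name to Modify").ID
# Stop WAP from translating URLs in the request headers for that application
Set-WebApplicationProxyApplication -ID $Rule -DisableTranslateUrlInRequestHeaders:$True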

Once completed reload the mobile client and it should connect without issues.
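To confirm the setting took effect, the published application can be read back. A small sketch, using the same placeholder rule name as above; the listed properties are the ones returned by Get-WebApplicationProxyApplication.

```powershell
# Placeholder name: use the published application's name from your WAP server.
Get-WebApplicationProxyApplication -Name 'Insert Rule Name to Modify' |
    Format-List Name, ExternalUrl, BackendServerUrl, DisableTranslateUrlInRequestHeaders
```

DisableTranslateUrlInRequestHeaders should show True once the change has been applied.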

Misconfigured Skype for Business Edge Server Breaks Office 365 Hybrid Federation

We’ve been moving more customers to Office 365 recently. Not only are they seeing the business case stack up from a cost point of view, but they are also after the cloud-only features which are now appearing more frequently. A troubling development with these migrations is the number of broken Skype for Business Edge servers that we are seeing.

Now these aren’t totally broken but just broken enough that when we try to integrate their on-premises Skype for Business environment with Office 365 services things go wrong.

How will you detect this?

This will often show up when configuring hosted voicemail in Exchange Online, since this is often the first hybrid service being deployed. When the call is redirected to the Exchange Online server it fails. The event log on the Front End server reports that the dial plan wasn’t configured correctly.


Attempts to route to servers in an Exchange UM Dialplan failed

No server in the dialplan [Hosted__exap.um.outlook.com__tenant.onmicrosoft.com] accepted the call with id [XXXXXXXXXXXXXXXXXXXXXXXXX].

Cause: Dialplan is not configured properly.

Resolution:

Check the configuration of the dialplan on Exchange UM Servers.


All the configuration looked fine, so we needed to dig into the SIP traffic a little more. We did this using Snooper. We could see the message being handed off from the Front End server to the Edge server, but then the Edge server connection timed out.


Response Data
504  Server time-out
ms-diagnostics:  1008;reason="Unable to resolve DNS SRV"

This was a little strange as the edge server was working fine for other federation partners, and DNS lookups were working on the edge server.

What was happening?

One thing that didn’t look right though was that the internal interface was configured to use the internal DNS server. Referring to the Edge server deployment guide confirmed that this wasn’t correct.

https://docs.microsoft.com/en-us/skypeforbusiness/deploy/deploy-edge-server/deploy-edge-servers

Interface configuration without DNS servers in the perimeter network
1. Install two network adapters for each Edge Server, one for the internal-facing interface, and one for the external-facing interface.

Note
The internal and external subnets must not be routable to each other.


2. On your external interface, you’ll configure one of the following:


a. Three static IP addresses on the external perimeter network subnet. You’ll also need to configure the default gateway on the external interface, for example, defining the internet-facing router or the external firewall as the default gateway. Configure the adapter DNS settings to point to an external DNS server, ideally a pair of external DNS servers.


b. One static IP address on the external perimeter network subnet. You’ll also need to configure the default gateway on the external interface, for example, defining the internet-facing router or the external firewall as the default gateway. Configure the adapter DNS settings to point to an external DNS server, or ideally a pair of external DNS servers. This configuration is ONLY acceptable if you have previously configured your topology to have non-standard values in the port assignments, which is covered in the Create your Edge topology for Skype for Business Server article.


3. On your internal interface, configure one static IP on the internal perimeter network subnet, and don’t set a default gateway. Also leave the adapter DNS settings empty.

4. Create persistent static routes on the internal interface to all internal networks where clients, Skype for Business Server, and Exchange Unified Messaging (UM) servers reside.

5. Edit the HOST file on each Edge Server to contain a record for the next hop server or virtual IP (VIP). This record will be the Director, Standard Edition server or Front End pool you configured as the Edge Server next hop address in Topology Builder. If you’re using DNS load balancing, include a line for each member of the next hop pool.
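Steps 3 and 4 of the guidance can be checked and applied with the built-in networking cmdlets. This is only a sketch: the interface alias, destination prefix, and next hop below are hypothetical values and need to be replaced with the ones for your perimeter network.

```powershell
# Show which DNS servers each adapter is currently using.
Get-DnsClientServerAddress -AddressFamily IPv4 |
    Select-Object InterfaceAlias, ServerAddresses

# Hypothetical alias of the internal-facing adapter.
$internalAlias = 'Internal'

# Step 3: clear the DNS servers on the internal adapter.
Set-DnsClientServerAddress -InterfaceAlias $internalAlias -ResetServerAddresses

# Step 4: add a static route to an internal network through the internal adapter.
# New-NetRoute writes to the persistent store by default, so the route survives a reboot.
New-NetRoute -DestinationPrefix '10.0.0.0/8' -InterfaceAlias $internalAlias -NextHop '192.168.10.1'
```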

How to fix it?

The Edge servers were changed to meet this guidance. We created a hosts file containing all servers in the topology, using both short names and FQDNs, and made the external adapter the only adapter with DNS settings, pointing at DNS servers external to the organisation.
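Building the hosts entries by hand is error prone when there are many servers. A minimal sketch of generating them, assuming a hypothetical helper New-HostsEntry and made-up server names and addresses; each line carries both the FQDN and the short name.

```powershell
function New-HostsEntry {
    param(
        [Parameter(Mandatory)][string]$IPAddress,
        [Parameter(Mandatory)][string]$Fqdn
    )
    # The short name is the first label of the FQDN.
    $shortName = $Fqdn.Split('.')[0]
    '{0}    {1} {2}' -f $IPAddress, $Fqdn, $shortName
}

# Hypothetical servers: replace with the servers in your topology.
$servers = @(
    @{ IPAddress = '10.0.0.10'; Fqdn = 'fe01.contoso.local' },
    @{ IPAddress = '10.0.0.11'; Fqdn = 'fe02.contoso.local' }
)

$entries = $servers | ForEach-Object {
    New-HostsEntry -IPAddress $_.IPAddress -Fqdn $_.Fqdn
}
$entries
```

The resulting lines can then be appended to the hosts file under %SystemRoot%\System32\drivers\etc from an elevated prompt, for example with Add-Content.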

Voicemail started working once this change was made.

Why this happens

So why did this happen? Part of the setup for voicemail located in Office 365 is configuring a hosting provider in Skype for Business.


New-CsHostingProvider -Identity 'Exchange Online' -Enabled $True -EnabledSharedAddressSpace $True -HostsOCSUsers $False -ProxyFqdn "exap.um.outlook.com" -IsLocal $False -VerificationLevel UseSourceVerification

This provider has shared address space enabled. This means that endpoints with the same SIP domain name can be located either on-premises or in the cloud. In our case the endpoint is the Exchange Online UM service.

When a call is routed to Exchange Online UM, Skype for Business looks up the local directory and sees that the user isn’t located on-premises. The call is passed to the Edge server, which performs a lookup of the _sipfederationtls._tcp.domain.com DNS record. Why is it doing this? Basically it’s trying to make a federation request with its own domain, and this is the start of that process. But the _sipfederationtls._tcp.domain.com record only exists externally, so that lookup fails. Since it can’t federate with itself, it doesn’t go on to the next step, which is establishing a connection to Exchange Online.
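The failing lookup can be reproduced from the Edge server with Resolve-DnsName. A sketch only: domain.com stands in for your SIP domain, and both DNS server addresses are placeholders for an internal DNS server and a public resolver.

```powershell
# Placeholder SIP domain: substitute your own.
$record = '_sipfederationtls._tcp.domain.com'

# Against an internal DNS server: returns nothing if the record only exists externally.
Resolve-DnsName -Name $record -Type SRV -Server '10.0.0.53' -ErrorAction SilentlyContinue

# Against a public resolver: should return the federation SRV record.
Resolve-DnsName -Name $record -Type SRV -Server '8.8.8.8'
```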

This can also be fixed by adding the DNS record to your internal DNS, but the Edge server would still not be configured correctly. It’s possible that using the internal DNS server would result in something else not working later on. Far better to fix it properly.

As a matter of interest, if you tried to configure hybrid mode with Skype for Business Online you would also experience issues where your on-premises users couldn’t see presence for, or send messages to, cloud users. The reason is the same as for the Exchange UM issue: shared address space is also enabled on this hosting provider.


New-CSHostingProvider -Identity SkypeforBusinessOnline -ProxyFqdn "sipfed.online.lync.com" -Enabled $true -EnabledSharedAddressSpace $true -HostsOCSUsers $true -VerificationLevel UseSourceVerification -IsLocal $false -AutodiscoverUrl https://webdir.online.lync.com/Autodiscover/AutodiscoverService.svc/root

Both Exchange Online and Skype for Business Online have hybrid relationships with Skype for Business on-premises. The only difference, apart from the provider endpoint address, is that the Skype for Business Online provider is configured to host users while the Exchange Online provider hosts services.
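The two providers can be compared side by side from the Skype for Business Server Management Shell. A short sketch that lists the settings discussed above for every hosting provider in the deployment:

```powershell
# List the hosting providers with the settings that matter for hybrid routing.
Get-CsHostingProvider |
    Select-Object Identity, ProxyFqdn, Enabled, EnabledSharedAddressSpace, HostsOCSUsers, VerificationLevel |
    Format-Table -AutoSize
```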

Skype for Business Admin and PowerShell Unresponsive

I had an interesting issue where a Skype for Business admin site would sit at the spinning wheel at 100%. This environment had two Enterprise pools so I checked the other site to find the same thing. At this stage I was fairly convinced that it was bigger than just a bad server.

I then opened up PowerShell, which connected fine. Great!!

Next I ran a command, chosen less by careful thought than by typing get-cs<couple of tabs><enter>, which happened to land on Get-CsAdDomain.

So this returned LC_DOMAINSETTINGS_STATE_FAILED. Urgh!

That looks pretty average for what, at this point, is an operational environment.

So next I ran Get-CsUser, and we waited. Yes, there are a few users in the environment so that’s to be expected, but after a couple of minutes I knew that this wasn’t going to end.

I checked the event log and found the following error in the Lync Server log:


Source: LS Remote PowerShell

Level: Error

Event ID: 35009

Remote PowerShell cannot create InitialSessionState.

Remote PowerShell cannot create InitialSessionState for user: S-1-5-21-XXXXXXXXX-XXXXXXXXX-XXXXXXXXX-XXXXX. Cause of failure: Thread was being aborted.. Stacktrace: System.Threading.ThreadAbortException: Thread was being aborted.

at System.Threading.WaitHandle.WaitOneNative(SafeHandle waitableSafeHandle, UInt32 millisecondsTimeout, Boolean hasThreadAffinity, Boolean exitContext)

at System.Threading.WaitHandle.InternalWaitOne(SafeHandle waitableSafeHandle, Int64 millisecondsTimeout, Boolean hasThreadAffinity, Boolean exitContext)

at System.Threading.WaitHandle.WaitOne(Int32 millisecondsTimeout, Boolean exitContext)

at Microsoft.Rtc.Management.Store.Sql.ClientDBAccess.OnBeforeSprocExecution(SprocContext sprocContext)

at Microsoft.Rtc.Common.Data.DBCore.ExecuteSprocContext(SprocContext sprocContext)

at Microsoft.Rtc.Management.Store.Sql.XdsSqlConnection.ReadDocItems(ICollection`1 key)

at Microsoft.Rtc.Management.ScopeFramework.AnchoredXmlReader.Read(ICollection`1 key)

at Microsoft.Rtc.Management.ServiceConsumer.CachedAnchoredXmlReader.Read(ICollection`1 key)

at Microsoft.Rtc.Management.ServiceConsumer.TypedXmlReader.Read(SchemaId schemaId, IList`1 scopeContextList, Boolean useDefaultIfNoneExists)

at Microsoft.Rtc.Management.ServiceConsumer.ServiceConsumer.ReadT

at Microsoft.Rtc.Management.RBAC.ServiceConsumerRoleStoreAccessor.GetRolesFromStore()

at Microsoft.Rtc.Management.Authorization.OcsRunspaceConfiguration.ConstructCmdletsAndScopesMap(List`1 tokenSIDs)

at Microsoft.Rtc.Management.Authorization.OcsRunspaceConfiguration..ctor(IIdentity logonIdentity, IRoleStoreAccessor roleAccessor, List`1 tokenGroups)

at Microsoft.Rtc.Management.Authorization.OcsAuthorizationPlugin.CreateInitialSessionState(IIdentity identity, Boolean insertFormats, Boolean insertTypes, Boolean addServiceCmdlets)

Cause: Remote PowerShell can fail to create InitialSessionState for varied number of reasons. Please look for other events that can give some specific information.

Resolution:

Follow the resolution on the corresponding failure events.
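The same event, and the "other events" the cause text points to, can be pulled from PowerShell rather than Event Viewer. A sketch, assuming the log is named 'Lync Server' as it appears in Event Viewer on the Front End server; the one-hour window is an arbitrary choice.

```powershell
# Most recent occurrences of event 35009 in the Lync Server log.
Get-WinEvent -FilterHashtable @{ LogName = 'Lync Server'; Id = 35009 } -MaxEvents 5 |
    Select-Object TimeCreated, ProviderName, Message |
    Format-List

# All errors (Level 2) logged in the last hour, to look for related failures.
Get-WinEvent -FilterHashtable @{
    LogName   = 'Lync Server'
    Level     = 2
    StartTime = (Get-Date).AddHours(-1)
} | Select-Object TimeCreated, Id, ProviderName
```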


Well that doesn’t look so good. Reading this it looked like it might be a database issue. This would make sense since the CMS database is in a single location with all servers accessing it. Even if an object is in AD, Skype for Business will get information about it from a single place, the CMS.

If you have multiple pools including fail-over pools then there is still just one CMS service.
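To find out which SQL instance and pool that single CMS lives on, the topology can be queried from the Skype for Business Server Management Shell. A sketch only; when the CMS database itself is unresponsive, the Get-CsService calls may hang in the same way Get-CsUser did.

```powershell
# Location of the Central Management store as published in Active Directory.
Get-CsConfigurationStoreLocation

# Pool hosting the Central Management service and its database.
Get-CsService -CentralManagement | Select-Object Identity, PoolFqdn
Get-CsService -CentralManagementDatabase | Select-Object Identity, PoolFqdn
```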

The database server was busier than expected, but nothing stood out as really bad (60% average CPU for the SQL process and a few deadlocked processes reported in the SQL log), and it did seem responsive.

It was at this point that other services using the same SQL server were also reported as being down, and the SQL admin made the call to restart the SQL service.

Once restarted everything became responsive again.

Unfortunately I never got to the bottom of what was wrong in the SQL server, but I think it’s still good to remember the heavy reliance on the database service in Skype for Business. Yes there is a SQL service on each Skype for Business Server, but this isn’t used for all processes.