VMware SSO Upgrade Bug – VM_RenameSSODir 1603 solved!
Hello all – just solved a difficult problem where my vSphere SSO upgrade failed with mysterious error 1603.
My situation is this: today I am completing an upgrade of a vSphere infrastructure. This is a royal pain as patches to vCenter Server can’t just be applied easily. Instead, there is an elaborate process to upgrade vCloud Director, vShield Manager (now renamed vCloud Networking and Security), the vSphere components (SSO, Inventory Service, vCenter Server), and other solutions (VMware Update Manager, vCenter Operations Server, lots more). To add even more confusion, just to update the vSphere components one uses the 222-page “vSphere Upgrade” document. Phew!
As if that wasn’t bad enough, after following all directions carefully I ran into a known problem during the Single Sign-On (vSSO) upgrade. I made the mistake of mounting the downloaded ISO as a CD on the vCenter Server; there is a known problem (KB1006565) where this doesn’t work. Instead, you must extract the ISO contents to a *local folder* on the machine where you are doing the install. This applies not just to vSSO but to any of the vSphere components.
Unfortunately, I had not done that and had run the SSO update directly from the mounted ISO. So I extracted the ISO contents to a local folder and ran the update again. It failed again.
“Hmm,” I thought, “Perhaps it wants a reboot. After all, the upgrade did mention something about a reboot, right?” So I rebooted the vCenter Server where I had the SSO component installed.
Nothing started. No SSO Service. No vCenter Server service. No Update Manager service. Nada. Zip. Zilch.
“Hmm,” I thought, “Methinks perhaps another stab at the old install will solve the problem.” So I run the SSO update again from the local disk.
Failure very quickly.
I do another reboot and verify that the same problem shows up again (services do not start). Then I try the SSO update again with the same error. I finally think to look at the Event Viewer and see I got “Error 1603”. This leads to a Microsoft article (KB834484) that tells me permissions are probably wrong, or I’m using an encrypted drive, or a SUBST’ed drive letter. None of this is true, of course 🙂
Then it occurs to me to look at the actual MSI log file. This happens to be in the local user’s TEMP folder and it was named vim-sso-msi.log. I opened the file and – voila! – I found that the failing step was a custom action called VM_RenameSSODir. This was failing with error 1603. So…why???
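If you would rather not scroll through the log by hand, the search can be scripted. This is only a rough sketch under some assumptions of mine: that the log follows the usual Windows Installer verbose format, where each action closes with a line like “Action ended HH:MM:SS: Name. Return value N.” and a return value of 3 means the action failed, and that the file may be saved as UTF-16. The helper names find_failed_actions and read_msi_log are made up for this example.

```python
import re

# Verbose Windows Installer logs normally close each action with
# "Action ended HH:MM:SS: <name>. Return value <n>."; 3 means failure.
ACTION_ENDED = re.compile(
    r"Action ended \d{1,2}:\d{2}:\d{2}: (?P<name>\S+?)\. Return value (?P<code>\d+)\."
)


def find_failed_actions(log_text):
    """Return the names of actions that ended with return value 3, in log order."""
    return [
        m.group("name")
        for m in ACTION_ENDED.finditer(log_text)
        if m.group("code") == "3"
    ]


def read_msi_log(path):
    """Read a log file that may be UTF-16 (with a BOM) or an 8-bit encoding."""
    with open(path, "rb") as f:
        raw = f.read()
    if raw.startswith((b"\xff\xfe", b"\xfe\xff")):
        return raw.decode("utf-16")
    return raw.decode("utf-8", errors="replace")
```

Calling find_failed_actions(read_msi_log(path)) on the log in your TEMP folder should list the failing custom action first. Enclosing actions that failed as a consequence (such as the overall install) may follow it in the list.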
I looked deeper into the log file and saw an interesting set of information. First, I saw that the action started as below:
Action 10:38:36: VM_RenameSSODir.
Action start 10:38:36: VM_RenameSSODir.
MSI (c) (A8:D8) [10:38:36:189]: Invoking remote custom action. DLL: C:\Users\SABAEB~1.ADM\AppData\Local\Temp\MSI8603.tmp, Entrypoint: VMRenameSSODir
MSI (c) (A8!BC) [10:38:36:925]: PROPERTY CHANGE: Adding SSO_RENAME_DIRS_TIME property. Its value is '518bb4ec'.
Looking further I saw another property with a list of directories:
Property(S): SSO_RENAME_DIRS = lib;thirdparty-license;utils\jars;utils\lib;utils\bin\windows-x86;utils\bin\windows-x86_64;webapps\ims\WEB-INF\lib;webapps\sso-adminserver\WEB-INF\lib
So I took a wild guess that these were backups being made of the folders to be updated within the SSO program directory. Now…here’s the money shot! Take a look at the following:
C:\Program Files\VMware\Infrastructure\SSOServer>dir
 Volume in drive C is OSDisk
 Volume Serial Number is 7C5F-C550

 Directory of C:\Program Files\VMware\Infrastructure\SSOServer

05/09/2013  09:45 AM    <DIR>          .
05/09/2013  09:45 AM    <DIR>          ..
09/10/2012  01:43 PM               422 .ims.files.txt
12/04/2012  01:55 PM    <DIR>          bin
12/04/2012  01:56 PM    <DIR>          conf
10/18/2012  02:08 PM               126 config.properties
12/04/2012  01:55 PM    <DIR>          endorsed
03/12/2013  03:42 PM    <DIR>          lib.518ba878
05/09/2013  09:47 AM    <DIR>          logs
10/18/2012  02:08 PM           348,160 msvcr71.dll
10/18/2012  02:08 PM               584 rsaIMSLiteMSSQLDropUsers.sql
10/18/2012  02:08 PM             1,080 rsaIMSLiteMSSQLSetupUsers.sql
12/04/2012  01:56 PM    <DIR>          scripts
12/04/2012  01:56 PM    <DIR>          security
10/18/2012  02:08 PM               130 setsapassword.sql
12/04/2012  01:55 PM    <DIR>          sso-replication-cli
12/04/2012  01:55 PM    <DIR>          ssolscli
05/08/2013  03:10 PM    <DIR>          temp
03/12/2013  03:41 PM    <DIR>          thirdparty-license.518ba878
09/10/2012  01:43 PM             3,458 thirdparty.txt
10/18/2012  02:10 PM            57,846 TOMCAT_LICENSE
10/18/2012  02:10 PM             1,228 TOMCAT_NOTICE
10/18/2012  02:08 PM           114,688 unzip.exe
05/09/2013  09:45 AM    <DIR>          utils
03/12/2013  03:41 PM               510 vmtcsConfig.txt
10/18/2012  02:08 PM            25,214 vpx.ico
12/04/2012  01:55 PM    <DIR>          webapps
12/04/2012  01:55 PM    <DIR>          work
              12 File(s)        553,446 bytes
              16 Dir(s)  23,591,739,392 bytes free
Do you see what I see? Look at lib.518ba878 (and thirdparty-license.518ba878) in the listing above. That folder is *supposed* to be simply “lib”. As in “library.” As in…the stupid SSO update program that failed did not properly roll back its directory renames! (The logic is presumably: rename all folders that will be modified as a backup, then perform the update, verify correct operation, then delete the backup folders.)
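A side note on that odd suffix. It looks to me like a Unix timestamp written in hex. That is my own guess, not something I found documented, but it is easy to check with a few lines of Python. The function name decode_suffix is just for this sketch.

```python
from datetime import datetime, timezone


def decode_suffix(hex_suffix):
    """Interpret a hex string as seconds since the Unix epoch, in UTC."""
    return datetime.fromtimestamp(int(hex_suffix, 16), tz=timezone.utc)


# Values taken from the directory listing and the MSI log shown in this post.
for label, value in [
    ("leftover folder suffix", "518ba878"),
    ("SSO_RENAME_DIRS_TIME", "518bb4ec"),
]:
    print(label, value, decode_suffix(value).isoformat())
```

This prints 2013-05-09T13:45:28+00:00 for the folder suffix and 2013-05-09T14:38:36+00:00 for the property in the log. On a server four hours behind UTC, those line up with the 09:45 AM timestamp on the SSOServer folder and the 10:38:36 log entry. So the leftover folders appear to date from my first failed attempt, not from the run that produced this log.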
The solution? I could have restored from snapshot (I of course made one) but this irritated me (the snapshot is very time-consuming because I use thick-provision eager-zeroed disks). So I instead simply renamed the directories back! Here are my commands:
C:\Program Files\VMware\Infrastructure\SSOServer>move lib.518ba878 lib
        1 dir(s) moved.
C:\Program Files\VMware\Infrastructure\SSOServer>move thirdparty-license.518ba878 thirdparty-license
        1 dir(s) moved.
C:\Program Files\VMware\Infrastructure\SSOServer>move utils\jars.518ba878 utils\jars
        1 dir(s) moved.
C:\Program Files\VMware\Infrastructure\SSOServer>move utils\lib.518ba878 utils\lib
        1 dir(s) moved.
C:\Program Files\VMware\Infrastructure\SSOServer>move utils\bin\windows-x86.518ba878 utils\bin\windows-x86
        1 dir(s) moved.
C:\Program Files\VMware\Infrastructure\SSOServer>move utils\bin\windows-x86_64.518ba878 utils\bin\windows-x86_64
        1 dir(s) moved.
C:\Program Files\VMware\Infrastructure\SSOServer>move webapps\ims\WEB-INF\lib.518ba878 webapps\ims\WEB-INF\lib
        1 dir(s) moved.
C:\Program Files\VMware\Infrastructure\SSOServer>move webapps\sso-adminserver\WEB-INF\lib.518ba878 webapps\sso-adminserver\WEB-INF\lib
        1 dir(s) moved.
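If you hit the same thing, your suffix will be different from mine, and typing eight move commands is error-prone. Here is a minimal Python sketch of the same renames, driven by the SSO_RENAME_DIRS value from the MSI log. It is my own illustration, not a VMware tool: the function name restore_renamed_dirs and its dry_run flag are invented, and the suffix has to be read off your own directory listing. It only renames a folder when the suffixed copy exists and the original name is free. Take a snapshot or backup first.

```python
import os

# Copied from the SSO_RENAME_DIRS property in the MSI log shown earlier.
SSO_RENAME_DIRS = (
    r"lib;thirdparty-license;utils\jars;utils\lib;"
    r"utils\bin\windows-x86;utils\bin\windows-x86_64;"
    r"webapps\ims\WEB-INF\lib;webapps\sso-adminserver\WEB-INF\lib"
)


def restore_renamed_dirs(root, rename_dirs, suffix, dry_run=True):
    """Rename '<dir>.<suffix>' back to '<dir>' for each listed directory.

    Returns the relative paths that were (or, in a dry run, would be) restored.
    """
    restored = []
    for rel in rename_dirs.split(";"):
        if not rel:
            continue
        # Split on backslashes so the log's Windows-style paths work anywhere.
        target = os.path.join(root, *rel.split("\\"))
        backup = target + "." + suffix
        # Only act when the backup exists and the original name is free.
        if os.path.isdir(backup) and not os.path.exists(target):
            if not dry_run:
                os.rename(backup, target)
            restored.append(rel)
    return restored


if __name__ == "__main__":
    sso_root = r"C:\Program Files\VMware\Infrastructure\SSOServer"
    # Dry run first: prints what would be renamed without touching anything.
    print(restore_renamed_dirs(sso_root, SSO_RENAME_DIRS, "518ba878"))
```

Run it once as shown to see the list, then call it again with dry_run=False from an elevated prompt to actually rename the folders.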
Then I re-ran the SSO upgrade and – hey, presto! – everything worked.
What a disaster! But it’s another bug identified and solution offered.