After I installed the 5100 build of vCenter Server Appliance 6.5 and ran it for a month, I suddenly could no longer connect to the web interface. All I got was this:
“503 Service Unavailable (Failed to connect to endpoint: [N7Vmacore4Http20NamedPipeServiceSpecE:0x7f009c095810] _serverNamespace = / _isRedirect = false _pipeName =/var/run/vmware/vpxd-webserver-pipe)”
This is caused by a bug in the appliance that leaves duplicate entries in the Postgres database. To get the information you need, log in to the appliance with SSH.
Type “shell” for shell access and go to this folder:
/var/log/vmware/vpostgres
There are a lot of log files in here, all named postgresql-“dayofmonth”.log, so for the 2nd day of the month it would be “postgresql-02.log”.
Show this file with the command:
cat postgresql-02.log
Look for this:
2017-04-02 18:07:21.974 UTC 58e13db9.11d2 1008636 VCDB vc ERROR: duplicate key value violates unique constraint "pk_vpx_vm_virtual_device"
2017-04-02 18:07:21.974 UTC 58e13db9.11d2 1008636 VCDB vc DETAIL: Key (id, device_key)=(8941, 4000) already exists.
Note the id and device_key values; in this case they are 8941 and 4000.
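Instead of reading the whole log by eye, the two values can be pulled out with sed. This is only a minimal sketch: the function name extract_duplicate_keys is made up for this post, and it assumes the DETAIL line looks exactly like the one above, i.e. a key made of (id, device_key). Other constraints would log different columns and would not match.

```shell
#!/bin/sh
# Hypothetical helper (not part of the VCSA): reads Postgres log text on stdin
# and prints each distinct "id device_key" pair found in DETAIL lines such as
#   DETAIL: Key (id, device_key)=(8941, 4000) already exists.
extract_duplicate_keys() {
    # -n plus the trailing p prints only lines where the substitution matched;
    # sort -u collapses repeats of the same pair.
    sed -n 's/.*DETAIL: *Key (id, device_key)=(\([0-9][0-9]*\), *\([0-9][0-9]*\)) already exists.*/\1 \2/p' | sort -u
}
```

On the appliance this could be used as "cat /var/log/vmware/vpostgres/postgresql-*.log | extract_duplicate_keys", which for the log lines above prints "8941 4000".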
Now connect to the database:
/opt/vmware/vpostgres/current/bin/psql -d VCDB -U postgres
To remove the duplicate key, run the command below:
DELETE FROM vc.vpx_vm_virtual_device where id='8941' and device_key='4000';
A reply of “DELETE 1” means the row was removed 🙂
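Since this may have to be repeated for more than one id, here is a small sketch that builds the statements from the two numbers. The function name build_cleanup_sql is hypothetical, and the SELECT line is my own addition so the row can be checked before it is deleted. The function only prints text and does not touch the database.

```shell
#!/bin/sh
# Hypothetical helper: prints a SELECT (to inspect the row first) and the
# DELETE from this post for a given id and device_key. Nothing is executed.
build_cleanup_sql() {
    # Refuse anything that is not a plain number, so a typo or stray text
    # never ends up inside the SQL.
    for value in "$1" "$2"; do
        case "$value" in
            ''|*[!0-9]*)
                echo "id and device_key must be numeric" >&2
                return 1
                ;;
        esac
    done
    printf "SELECT * FROM vc.vpx_vm_virtual_device WHERE id='%s' AND device_key='%s';\n" "$1" "$2"
    printf "DELETE FROM vc.vpx_vm_virtual_device WHERE id='%s' AND device_key='%s';\n" "$1" "$2"
}
```

Running "build_cleanup_sql 8941 4000" prints the two statements, which can then be pasted into the psql prompt one at a time.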
Then quit the database CLI:
\q
And type:
reboot
When the server reboots, all should be okay.
In my case, I got the same error again after the reboot. When I looked into the logs, a new id was reported as a duplicate. I deleted that row as well and rebooted again, and after that the VCSA was fine 🙂
For more insight into this error, its causes, and troubleshooting steps, see the VMware 503 Service Unavailable KB article at https://kb.vmware.com/s/article/67818.