Concourse events being dropped


#1

We are using the BOSH Concourse release. Our admins recently changed from a single web VM to an HAProxy VM in front of two web VMs. This all works, except that around the same time (possibly related) users viewing a running job started losing the connection and now have to reload the page to see the latest updates.
Using Safari in developer mode and watching the network traffic, the following appears to be happening:
An event object is opened and receives events for new content to display. This can run for many seconds (30+). It is unclear what causes it to terminate, or whether it stays open longer when events keep arriving. A new event object is then attempted, but it fails after 30 seconds without receiving any response or data.
From this point on, the page never picks up the logs again; the only way to recover and see the current status is to reload the page in Safari. (Other users on Chrome and Firefox see the same behavior.)

How do I determine what is causing the connection to close? Assuming for a moment that it is HAProxy, are there settings my admin needs to change to avoid this scenario?
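
One rough way to narrow this down outside the browser is to hold a streaming request open from a script and record how long it had been quiet when it closed. The sketch below is only an illustration under assumptions: `watch_stream` and `summarize` are hypothetical helper names, the URL is a placeholder, and a real Concourse event endpoint would most likely need authentication that is not shown here. Running it once through the proxy and once directly against a web VM should show whether the cut-off tracks a proxy timeout.

```python
import http.client
import time
import urllib.request


def summarize(start, event_times, end):
    """Return (seconds the stream was open, seconds of silence before it closed)."""
    last_activity = event_times[-1] if event_times else start
    return end - start, end - last_activity


def watch_stream(url, timeout=3600):
    """Read a streaming HTTP response line by line until it ends or fails.

    `url` is a placeholder supplied by the caller; nothing here is specific
    to Concourse, and no authentication is handled.
    """
    start = time.monotonic()
    event_times = []
    reason = "stream ended"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            for line in resp:
                if line.strip():  # ignore blank separator lines
                    event_times.append(time.monotonic())
    except (OSError, http.client.HTTPException) as exc:
        # URLError and socket timeouts are OSError subclasses;
        # a truncated chunked response raises an HTTPException.
        reason = repr(exc)
    total, idle = summarize(start, event_times, time.monotonic())
    return total, idle, reason


# Example use (placeholder URL, run manually):
#   total, idle, reason = watch_stream("https://concourse.example.com/some/event/stream")
#   print(f"open {total:.1f}s, quiet for {idle:.1f}s before close: {reason}")
```

If the quiet period before the close keeps landing near the same value (for example, about 30 seconds) through the proxy but not when bypassing it, an idle timeout on the proxy is the likely cause.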


#2

Our HAProxy release has been updated, with the timeouts significantly increased via the following manifest settings:

properties:
  ha_proxy:
    client_timeout: 1800
    server_timeout: 1800

Initial observations are that this resolves the issue.
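
To illustrate why raising the timeouts would help, here is a simplified model, assuming both properties are inactivity timeouts in seconds (as HAProxy's `timeout client` and `timeout server` are) and that whichever is shorter is the limit that matters. The function names and the sample gaps are made up for illustration, and the "before" value of 30 is only a guess based on the 30-second failures described in #1; the previous setting is not stated anywhere in this thread.

```python
def effective_idle_limit(client_timeout, server_timeout):
    """Simplified assumption: the shorter inactivity timeout is the one that bites."""
    return min(client_timeout, server_timeout)


def first_cut(gaps, client_timeout, server_timeout):
    """Return the index of the first quiet gap that reaches the limit, or None.

    `gaps` is a list of seconds between consecutive events on one stream.
    """
    limit = effective_idle_limit(client_timeout, server_timeout)
    for index, gap in enumerate(gaps):
        if gap >= limit:
            return index
    return None


# Hypothetical build output: mostly chatty, with one 45-second quiet step.
sample_gaps = [5, 12, 45, 3]

print(first_cut(sample_gaps, 30, 30))      # assumed old value: cut at the 45s gap
print(first_cut(sample_gaps, 1800, 1800))  # new settings: no gap reaches the limit
```

Under this model, a build step that stays silent for longer than 1800 seconds would still be cut off, so the new values raise the threshold rather than remove it.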