Configuring haproxy to load balance multiple servers

Haproxy is a load balancer that forwards traffic to multiple servers. It works with HTTP and plays well with WebSockets. Engine.IO is the implementation of the transport-based cross-browser/cross-device bi-directional communication layer for Socket.IO. It's WebSockets with fallbacks to make it work on the web. Engine.IO is much more low-level than Socket.IO: it leaves things like reconnects and multiplexing up to us to implement. As such, it's similar to SockJS.

Because Engine.IO starts with long-polling and does not try to do too many things, we can optimise real-time connections to our needs, making it a better fit to run in production compared to Socket.IO.


We have separated our app into an httpServer and a socketServer (powered by Engine.IO), allowing us to spin up more instances of each depending on the load.

For this to work, we need two things:

  • a Load Balancer to forward all incoming traffic to httpServer instances or socketServer instances
  • a shared session store so each server instance can access the same session information

Haproxy does the load balancing. We use connect-redis and connect-sessions to share session information.
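To see why the shared store matters, here is a minimal sketch in plain Node. A `Map` stands in for Redis, and the instance and session names are made up for the example; the point is that two independent server instances read and write the same session data, so it does not matter which instance handles a request.

```javascript
// A Map stands in for Redis here; in production connect-redis
// persists sessions in an actual Redis instance.
const sharedStore = new Map();

// Each "instance" only knows the shared store, not its siblings.
function makeInstance(name) {
  return {
    handle(sessionId) {
      const session = sharedStore.get(sessionId) || { views: 0 };
      session.views += 1;
      session.lastServedBy = name;
      sharedStore.set(sessionId, session);
      return session;
    },
  };
}

const a = makeInstance('http-1');
const b = makeInstance('http-2');

// Consecutive requests for the same session can land on either instance:
a.handle('sess-42');
const session = b.handle('sess-42');
console.log(session.views); // 2 — instance b sees instance a's write
```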

The problem

HttpServers and socketServers (with fallbacks) are two different beasts.

HttpServers can/should be stateless. This means that any request can be handled by any httpServer. It does not matter where the request lands.

The socketServers, on the other hand, require all connections from one client to land on the same socketServer. This includes the regular HTTP requests used to establish the initial connection, as well as long-polling, FlashSocket and WebSocket connections.

If a WebSocket connection were to downgrade to long-polling and HTTP requests were routed to random servers, Engine.IO would just not work.


Our haproxy config needs to do two things:

  • find out which traffic (HTTP and sockets) needs to be routed to socketServers and which to the httpServers
  • ensure all related connections from one client always land on the same socketServer instance

Haproxy config file

Example haproxy config file:

defaults
    mode    http
    retries 3
    # option redispatch
    timeout connect  5000
    timeout client  10000
    timeout server  10000

# Application can be reached on port 3000
# Load balancer will split traffic to http
# and socket servers based on / prefix.
frontend all
    bind *:3000
    mode http
    timeout client 120s

    option forwardfor
    # Fake connection:close, required in this setup.
    option http-server-close
    option http-pretend-keepalive

    # Engine.IO's default path prefix is /engine.io/
    acl is_engineio path_beg /engine.io/

    use_backend socket-servers if is_engineio
    default_backend http-servers

backend http-servers  
    balance roundrobin
    option httpclose
    option forwardfor

    # Roundrobin switching; ports 4000/4001 per the text (host assumed local)
    server node-1 127.0.0.1:4000 check
    server node-2 127.0.0.1:4001 check

backend socket-servers  
    timeout server  120s
    balance leastconn

    # based on cookie set in header
    # haproxy will add the cookies for us
    cookie SRVNAME insert
    # ports 5000/5001 per the text (host assumed local)
    server node-1 127.0.0.1:5000 cookie S1 check
    server node-2 127.0.0.1:5001 cookie S2 check



acl is_engineio path_beg /engine.io/

use_backend socket-servers if is_engineio
default_backend http-servers

The above instructs haproxy to distinguish regular HTTP requests from socket-related requests based on the /engine.io/ path prefix of the request.
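The decision the ACL makes can be sketched in plain JavaScript. This is a simplified model of what haproxy does, not its actual implementation:

```javascript
// Mirror of the ACL: requests whose path starts with /engine.io/
// go to the socket backend, everything else to the http backend.
function chooseBackend(path) {
  const isEngineio = path.startsWith('/engine.io/');
  return isEngineio ? 'socket-servers' : 'http-servers';
}

console.log(chooseBackend('/engine.io/?transport=polling')); // socket-servers
console.log(chooseBackend('/index.html'));                   // http-servers
```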

balance roundrobin

For the httpServers, we use a roundrobin strategy to distribute requests across the servers. Basically, we alternate the server we send a request to.
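As a rough model of the strategy (haproxy's own implementation additionally accounts for server weights and health checks):

```javascript
// Minimal round-robin: cycle through the server list in order.
function makeRoundRobin(servers) {
  let next = 0;
  return () => servers[next++ % servers.length];
}

const pick = makeRoundRobin(['node-1', 'node-2']);
console.log(pick(), pick(), pick()); // node-1 node-2 node-1
```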

server node-1 127.0.0.1:4000 check
server node-2 127.0.0.1:4001 check

We use two httpServers, listening on ports 4000 and 4001.

balance leastconn

We use a least-connection strategy for Engine.IO traffic. This works better for long-lasting connections: haproxy routes a new connection to the server with the fewest open connections (if no earlier connection from that client was made).
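The selection itself can be modeled in a few lines (again a sketch of the strategy, not haproxy's code; the connection counts are invented):

```javascript
// Minimal least-connection selection: pick the server currently
// holding the fewest open connections.
function leastConn(connCounts) {
  return Object.keys(connCounts)
    .reduce((best, s) => (connCounts[s] < connCounts[best] ? s : best));
}

// node-2 holds fewer long-lived connections, so it gets the new one:
console.log(leastConn({ 'node-1': 12, 'node-2': 3 })); // node-2
```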

cookie SRVNAME insert

This directive is essential to make Engine.IO work. We use a technique called sticky sessions to route all requests from one client to the same server. It instructs haproxy to issue a cookie for each connection to identify the chosen server. Haproxy then takes this cookie into account to route incoming requests from a returning client to the same socketServer.
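The cookie mechanics can be modeled like this, as a simplified sketch of the `cookie SRVNAME insert` behavior (the server picker is a stand-in for haproxy's least-connection choice):

```javascript
// Map of cookie values to servers, as in the config:
const servers = { S1: 'node-1', S2: 'node-2' };

// First request: no cookie yet, pick a server and tell the client to
// remember it via Set-Cookie. Later requests: the SRVNAME cookie pins
// the client to that same server.
function route(cookies, pickFreshServer) {
  if (cookies.SRVNAME && servers[cookies.SRVNAME]) {
    return { server: servers[cookies.SRVNAME] };
  }
  const fresh = pickFreshServer(); // e.g. 'S2'
  return { server: servers[fresh], setCookie: `SRVNAME=${fresh}` };
}

const first = route({}, () => 'S2');
console.log(first.setCookie);   // SRVNAME=S2
const later = route({ SRVNAME: 'S2' }, () => 'S1');
console.log(later.server);      // node-2 — the same server again
```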

server node-1 127.0.0.1:5000 cookie S1 check
server node-2 127.0.0.1:5001 cookie S2 check

Our socketServers listen on ports 5000 and 5001.