HAProxy is a load balancer that forwards traffic to multiple servers. It speaks HTTP and plays well with WebSockets.
Engine.io is the implementation of transport-based cross-browser/cross-device bi-directional communication layer for Socket.IO. It's WebSockets with fallbacks to make it work on the web.
Engine.io is much lower level than socket.io: it leaves things like reconnects and multiplexing up to us to implement. In that respect, it's similar to SockJS.
Because it starts with long-polling and does not try to do too much, we can optimise real-time connections to our needs, making it a better fit for production than socket.io.
We have separated our app into an httpServer and socketServer (powered by engine.io) allowing us to spin up more instances of each depending on the load.
For this to work, we need two things:
- a Load Balancer to forward all incoming traffic to httpServer instances or socketServer instances
- a shared session store so each server instance can access the same session information
HAProxy does the load balancing. We use connect-redis and connect-sessions to share session information.
HttpServers and socketServers (with fallbacks) are two different beasts.
HttpServers can/should be stateless. This means that any request can be handled by any httpServer. It does not matter where the request lands.
The socketServers, on the other hand, require all connections from one client to land on the same socketServer instance. This includes the regular HTTP requests used to establish the initial connection, as well as subsequent long-polling, FlashSocket and WebSocket connections.
If a WebSocket connection downgraded to long-polling and its HTTP requests were routed to random servers, engine.io would simply not work.
Our HAProxy config needs to do two things:
- find out which traffic (HTTP and sockets) needs to be routed to socketServers and which to the httpServers
- ensure all engine.io related connections from one client always land on the same socketServer instance
HAProxy config file
Example HAProxy config file:
defaults
    mode http
    retries 3
    # option redispatch
    timeout connect 5000
    timeout client 10000
    timeout server 10000

# Application can be reached on port 3000.
# Load balancer will split traffic to http
# and socket servers based on /engine.io prefix.
frontend all 127.0.0.1:3000
    mode http
    timeout client 120s
    option forwardfor
    # Fake connection:close, required in this setup.
    option http-server-close
    option http-pretend-keepalive

    acl is_engineio path_beg /engine.io
    use_backend socket-servers if is_engineio
    default_backend http-servers

backend http-servers
    balance roundrobin
    option httpclose
    option forwardfor
    # Roundrobin switching
    server node-1 127.0.0.1:4000 check
    server node-2 127.0.0.1:4001 check

backend socket-servers
    timeout server 120s
    balance leastconn
    # based on cookie set in header
    # haproxy will add the cookies for us
    cookie SRVNAME insert
    server node-1 127.0.0.1:5000 cookie S1 check
    server node-2 127.0.0.1:5001 cookie S2 check
acl is_engineio path_beg /engine.io
use_backend socket-servers if is_engineio
default_backend http-servers
The above instructs HAProxy to distinguish regular HTTP requests from socket-related requests based on the /engine.io path prefix of the request.
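The same decision can be sketched as a tiny JavaScript helper. This mirrors what the ACL does; the function name and return values are hypothetical, not part of HAProxy or engine.io:

```javascript
// Hypothetical sketch of the haproxy ACL above: requests whose path
// begins with /engine.io go to the socket backend, everything else
// to the http backend.
function selectBackend(path) {
  return path.startsWith('/engine.io') ? 'socket-servers' : 'http-servers';
}
```

So a request for `/engine.io/?transport=polling` would be routed to the socket-servers backend, while `/index.html` would go to http-servers.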
For the httpServers, we use a roundrobin strategy to distribute requests across servers: we simply alternate which server each request is sent to.
server node-1 127.0.0.1:4000 check
server node-2 127.0.0.1:4001 check
We use two httpServers, listening on ports 4000 and 4001.
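Roundrobin can be sketched in a few lines of JavaScript. The addresses match the example config; the helper itself is a hypothetical illustration, not HAProxy internals:

```javascript
// Minimal sketch of roundrobin balancing: each new request goes to
// the next server in the list, wrapping back around at the end.
const httpServers = ['127.0.0.1:4000', '127.0.0.1:4001'];
let next = 0;

function roundRobin() {
  const server = httpServers[next];
  next = (next + 1) % httpServers.length;
  return server;
}
```

Three consecutive requests would land on port 4000, then 4001, then 4000 again.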
We use a least-connection (leastconn) strategy for engine.io traffic, which is better suited to long-lived connections. HAProxy routes each new connection to the server with the fewest active connections (unless an earlier connection from that client dictates otherwise).
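The idea behind leastconn can be sketched as follows. The server names and ports match the example config; the selection helper is a hypothetical illustration of the strategy, not HAProxy code:

```javascript
// Sketch of leastconn balancing: new connections go to the server
// currently holding the fewest open connections.
const socketServers = [
  { name: 'node-1', addr: '127.0.0.1:5000', connections: 0 },
  { name: 'node-2', addr: '127.0.0.1:5001', connections: 0 },
];

function leastConn(servers) {
  // Pick the server with the fewest active connections.
  return servers.reduce((a, b) => (b.connections < a.connections ? b : a));
}

// A chosen server's connection count goes up, so long-lived
// connections naturally spread across the pool.
const target = leastConn(socketServers);
target.connections += 1;
```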
cookie SRVNAME insert
This directive is essential to make engine.io work. We use a technique called sticky sessions to route all requests from one client to the same server. It instructs HAProxy to issue a cookie identifying which server handled a client's first connection. On subsequent requests, HAProxy reads that cookie and routes the client back to the same socketServer.
server node-1 127.0.0.1:5000 cookie S1 check
server node-2 127.0.0.1:5001 cookie S2 check
Our socketServers listen on ports 5000 and 5001.
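The sticky-session lookup can be sketched as follows. The cookie values S1/S2 and server addresses match the example config; the parsing helper is a hypothetical illustration of what HAProxy does internally:

```javascript
// Sketch of sticky-session routing: the SRVNAME cookie value maps
// back to a specific server, so a returning client always lands on
// the same socketServer.
const cookieToServer = {
  S1: '127.0.0.1:5000', // server node-1
  S2: '127.0.0.1:5001', // server node-2
};

function stickyRoute(cookieHeader) {
  const match = /(?:^|;\s*)SRVNAME=([^;]+)/.exec(cookieHeader || '');
  // No cookie yet: fall back to normal leastconn balancing (not shown),
  // after which the chosen server's cookie is set on the response.
  return match ? cookieToServer[match[1]] : null;
}
```

A client whose request carries `Cookie: SRVNAME=S2` is routed back to node-2 on port 5001, regardless of current connection counts.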