TCP has a theoretical limit of roughly 65K (2^16) connections per machine, and I had always wondered how closely one could approach that limit. A perfect use case is a ‘push notification’ service of the kind that exists for all the major smartphone platforms (iPhone, Android, Windows Phone). Correction – thanks to Joe Speed of IBM: the 65K limit applies only on the client side (because the client assigns a unique ephemeral port number to each outbound connection). The server is limited only by its resources.
A push notification service is needed for ‘roaming’ clients: devices that cannot be addressed directly over the network. The main idea is to have the client open a long-running connection to a service and keep that connection open essentially all the time. The server can then use this channel to tell the client to do something: fetch a message, call a service, and so on.
The amount of client CPU or data exchanged over this channel is minimal, and even the server-side processing requirements are modest. However, if you want to support millions of clients, then each server should support the maximum possible number of client connections to make the whole system economical.
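The long-running connection pattern described above can be sketched independently of WCF in a few lines of asynchronous socket code. The following is a minimal illustration in Python's asyncio, not the actual F#/WCF implementation from this experiment; the registration line protocol and all names here are invented for the example:

```python
import asyncio

class PushServer:
    """Keeps one long-lived connection per client and pushes messages down it."""

    def __init__(self):
        self.clients = {}  # client_id -> StreamWriter for the open channel

    async def handle(self, reader, writer):
        # The first line a client sends is its ID ("registration").
        client_id = (await reader.readline()).decode().strip()
        self.clients[client_id] = writer
        try:
            # Keep the connection open; returns only when the client disconnects.
            await reader.read()
        finally:
            del self.clients[client_id]

    async def push(self, client_id, message):
        # Server-initiated notification: write down the already-open channel.
        writer = self.clients[client_id]
        writer.write((message + "\n").encode())
        await writer.drain()

async def demo():
    server = PushServer()
    srv = await asyncio.start_server(server.handle, "127.0.0.1", 0)
    port = srv.sockets[0].getsockname()[1]
    # Client opens a connection, registers, then just waits for pushes.
    reader, writer = await asyncio.open_connection("127.0.0.1", port)
    writer.write(b"client-42\n")
    await writer.drain()
    await asyncio.sleep(0.1)               # let the server register the client
    await server.push("client-42", "tap")  # notify the client
    note = (await reader.readline()).decode().strip()
    srv.close()
    await srv.wait_closed()
    return note

if __name__ == "__main__":
    print(asyncio.run(demo()))  # prints "tap"
```

The key point is that the server holds a writable handle per connected client, so a notification is just a write on a channel the client opened earlier; the client never needs to be directly addressable.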
In this experiment, I created a push notification system using F# and WCF with Duplex NetTcpBinding. The following diagram explains the system.
The Push Notification Service exposes two service interfaces:
- IPushNotificationService – the interface clients use to ‘Register’, and to keep the connection alive via ‘Nop’ calls (if needed)
- IAlert – the interface used to make the service notify a client
The client implements the duplex callback interface IPushNotificationClient; the service invokes its ‘Tap’ operation to notify the client.
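The shape of these three contracts can be sketched as Python protocols (the real contracts are WCF service interfaces in F#; the operation name on IAlert is not given in the post, so `alert` below is an assumption):

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class IPushNotificationClient(Protocol):
    """Duplex callback contract implemented by the client."""
    async def tap(self) -> None:
        """Invoked by the service to notify the client ('Tap')."""

@runtime_checkable
class IPushNotificationService(Protocol):
    """Contract the clients call over the long-lived connection."""
    async def register(self, client_id: int,
                       callback: IPushNotificationClient) -> None:
        """'Register' a client and capture its callback channel."""
    async def nop(self) -> None:
        """Keep-alive 'Nop'; does nothing but keep the connection warm."""

@runtime_checkable
class IAlert(Protocol):
    """Contract used to ask the service to notify a given client.
    The operation name 'alert' is assumed for this sketch."""
    async def alert(self, client_id: int) -> None: ...
```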
As the goal was to maximize server-side capacity, all method signatures are asynchronous to leverage the asynchronous IO capability of the underlying TCP stack. This maps well to the F# async monad, which was used to the hilt: the core F# code for the service is only about 60 lines, and most of the remaining server-side code (about 140 lines) is WCF-related.
All code is compiled for 64-bit to maximize memory headroom. With 33,000 clients connected simultaneously and 10 alerts per second, the service consumes only about 630 MB of memory and the CPU usage is almost zero.
I used 3 machines to emulate the 33,000 clients (15K, 16K and 2K per machine). I found that a typical machine can support fewer than 17K outbound client connections before it runs out of Winsock resources; this squares with Windows' default dynamic (ephemeral) port range of 49152 to 65535, roughly 16K ports. A process with 15K client connections uses between 2.5 and 3.0 GB of memory (per Task Manager).
I believe the service should easily be able to handle 50K+ concurrent clients; I just did not have enough client machines on hand to finish the experiment.
You can access the code at http://pushnotify.codeplex.com.
If you want to conduct the experiment for yourself, use the app.config file of the TestClient to set:
- The service endpoint address
- The “StartID” for the test client process instance
- And the number of connections
The client will create the specified number of connections. Client IDs are generated by starting at the StartID and incrementing for each new connection. When running multiple client processes, set these values so that all client connections have distinct IDs with no gaps or overlaps.
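The StartID scheme amounts to giving each client process a contiguous, non-overlapping slice of the ID space. A quick sketch of how the three client machines from the experiment (15K, 16K and 2K clients) might be configured, with the specific StartID values being illustrative:

```python
def client_ids(start_id: int, count: int) -> range:
    """IDs used by one TestClient process: StartID, StartID+1, ..."""
    return range(start_id, start_id + count)

# Each process's StartID is the previous StartID plus its connection
# count, so the combined ID space is distinct with no gaps.
proc_a = client_ids(1, 15000)      # IDs 1..15000
proc_b = client_ids(15001, 16000)  # IDs 15001..31000
proc_c = client_ids(31001, 2000)   # IDs 31001..33000
```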
Also configure the TestAlertClient to set the alert service endpoint and the range of client IDs over which alerts will be generated. The TestAlertClient will randomly alert clients in the specified range at a rate of about 10 alerts per second.
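The TestAlertClient's behavior boils down to a loop that picks a random client ID in the configured range roughly every 100 ms and fires an alert at it. A sketch of that loop (the real tool is a WCF client; `send_alert` stands in for the call on the IAlert endpoint):

```python
import asyncio
import random

async def alert_loop(send_alert, first_id: int, last_id: int,
                     n: int = 10) -> None:
    """Fire n alerts at random client IDs in [first_id, last_id],
    at roughly 10 alerts per second (one every 100 ms)."""
    for _ in range(n):
        await send_alert(random.randint(first_id, last_id))
        await asyncio.sleep(0.1)
```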