This post is about maximizing the number of concurrent connections to a single server using F# Async and Async IO. A perfect use case is for ‘push notification’ type services which exist for all the major smartphone platforms (iPhone, Android, Windows Phone).
Note: This post describes a test with 33,000 concurrent push notification clients. In a separate load test I achieved 5,000 concurrent connections to a single server, where each client connection continually sent requests with random data and no “think time” between subsequent requests. There again I ran out of client resources before the server did; the server was mostly IO bound (AppFabric cache calls and asynchronous SQL Server queries) and could have handled more.
A push notification service is needed for ‘roaming’ clients: devices that cannot be addressed directly over the network. The main idea is to let the client open a long-running connection to a service and essentially keep that connection open all the time. The server can use this channel to notify the client to do something: fetch a message, call a service, and so on.
The amount of client CPU or data exchanged over this channel is minimal. Even server-side processing requirements may not be very high. However, if you want to support millions of clients, then each server should support the maximum possible number of client connections to make the whole system economical.
In this experiment, I created a push notification system using F# and WCF with Duplex NetTcpBinding. The following diagram explains the system.
The Push Notification Service exposes two service interfaces:
- IPushNotificationService – the interface clients use to ‘Register’ and, if needed, keep the connection alive via ‘Nop’ calls
- IAlert – interface to make the service notify a client
The client implements the duplex callback interface IPushNotificationClient. The service invokes the ‘Tap’ operation to notify the client.
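The three contracts might look roughly like the following sketch (the member parameters are assumptions, and for brevity the operations are shown as one-way rather than with the Begin/End asynchronous pattern the actual code uses; the real signatures are in the linked source):

```fsharp
open System.ServiceModel

// Callback contract implemented by the client; the service calls Tap to notify it.
[<ServiceContract>]
type IPushNotificationClient =
    [<OperationContract(IsOneWay = true)>]
    abstract Tap : unit -> unit

// Duplex contract for clients: Register to start receiving taps, Nop to keep the connection alive.
[<ServiceContract(CallbackContract = typeof<IPushNotificationClient>)>]
type IPushNotificationService =
    [<OperationContract(IsOneWay = true)>]
    abstract Register : clientId:int -> unit
    [<OperationContract(IsOneWay = true)>]
    abstract Nop : unit -> unit

// Contract used by the alerting side to make the service notify a given client.
[<ServiceContract>]
type IAlert =
    [<OperationContract(IsOneWay = true)>]
    abstract Alert : clientId:int -> unit
```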
As the goal was to maximize server-side capacity, all method signatures are asynchronous to leverage the asynchronous IO capability of the underlying TCP stack. This maps well to the F# async monad, which was used to the hilt. In fact, the core F# code for the service is only about 60 lines; most of the other server-side code (about 140 lines) is WCF-related.
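As a sketch of how the F# async monad maps onto WCF’s Begin/End asynchronous pattern: the service can wrap a callback’s asynchronous Tap invocation with Async.FromBeginEnd, so no thread blocks while the TCP write is in flight. The callback shape and the dictionary of registered clients here are assumptions about the server’s bookkeeping, not the actual code:

```fsharp
open System
open System.Collections.Concurrent

// Hypothetical asynchronous callback shape, mirroring WCF's Begin/End (APM) pattern.
type ITapCallback =
    abstract BeginTap : AsyncCallback * obj -> IAsyncResult
    abstract EndTap : IAsyncResult -> unit

// Registered clients, keyed by client id (an assumed representation).
let clients = ConcurrentDictionary<int, ITapCallback>()

// Wrap the Begin/End pair in an F# async workflow.
let tapClient (id: int) =
    async {
        match clients.TryGetValue id with
        | true, cb ->
            do! Async.FromBeginEnd(
                    (fun (acb, state) -> cb.BeginTap(acb, state)),
                    (fun ar -> cb.EndTap ar))
        | _ -> ()  // unknown client: nothing to do
    }
```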
All code is compiled for 64-bit to maximize memory headroom. With 33,000 simultaneous clients connected and about 10 alerts per second, the service consumes only about 630 MB of memory and CPU usage is almost zero.
I used 3 machines to emulate the 33,000 clients (15K, 16K and 2K per machine). I found that a typical machine supports fewer than 17K outbound client connections before it runs out of Winsock resources. A process with 15K client connections uses between 2.5 and 3.0 GB of memory (according to Task Manager).
I believe the service should easily be able to handle 50K+ concurrent clients; I simply did not have enough client machines on hand to finish the experiment.
You can access the code at http://pushnotify.codeplex.com.
If you want to conduct the experiment for yourself, use the app.config file of the TestClient to set:
- The service endpoint address
- The “StartID” for the test client process instance
- The number of connections
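A minimal appSettings sketch of such a configuration (the key names here are illustrative assumptions; check the TestClient’s actual app.config for the real ones):

```xml
<configuration>
  <appSettings>
    <!-- Key names are illustrative, not the actual ones. -->
    <add key="ServiceEndpoint" value="net.tcp://pushserver:8000/PushNotificationService" />
    <add key="StartID" value="1" />
    <add key="ConnectionCount" value="15000" />
  </appSettings>
</configuration>
```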
The client will create the specified number of connections. Client IDs are generated starting from the StartID and incrementing for each new connection. When running multiple client processes, set these values so that all client connections have distinct IDs with no gaps.
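For example, to split the 33,000 clients across three processes with contiguous, non-overlapping ID ranges, the (StartID, count) pair for each process can be derived from the per-machine connection counts (a sketch of the arithmetic; in practice each app.config is set by hand):

```fsharp
// Desired connection counts per client machine, in launch order.
let counts = [ 15000; 16000; 2000 ]

// Compute (startId, count) per process: each StartID follows the previous range.
let ranges =
    counts
    |> List.scan (fun (start, count) n -> (start + count, n)) (1, 0)
    |> List.tail
// ranges = [(1, 15000); (15001, 16000); (31001, 2000)]
```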
Also configure the TestAlertClient with the alert service endpoint and the range of client IDs over which alerts will be generated. The TestAlertClient randomly alerts clients in the specified range at a rate of about 10 alerts per second.
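The TestAlertClient’s alert loop might look something like this sketch (the `alert` function and the range bounds are assumptions; the real client reads them from configuration and calls the IAlert service):

```fsharp
open System

// Pick random client ids in [minId, maxId] and fire one alert roughly every
// 100 ms, i.e. about 10 alerts per second.
let alertLoop (alert: int -> Async<unit>) (minId: int) (maxId: int) =
    async {
        let rng = Random()
        while true do
            let id = rng.Next(minId, maxId + 1)
            do! alert id
            do! Async.Sleep 100
    }
```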