Stress testing TTalk

OLab v3 has proven to be remarkably resilient over the past decade and has tolerated quite heavy user loads. The efficient architecture and small data streams have helped OLab to support over 100 concurrent users on our demo server. Indeed, with some customization, one team at Karolinska was able to actively support over 6000 users in a MOOC.

The more recent move from high performance rack servers to virtual servers on the UCIT server farm has given us a lot of flexibility in terms of disk space, but as the server farms have become more heavily used, we have noticed some loss of performance when the OLab3 servers are taxed with higher loads of concurrent users.

Turk Talk (TTalk) in particular is sensitive to this phenomenon. While the data streams are still quite small and tight, TTalk generates a lot of round-trip activity between client and server, and we are seeing round-trip delays caused by the multiple security layers. This has become even more marked in recent months as more users are off-campus.

In a recent large-class session, with 200-250 students in the class, we saw some quite significant slowdowns in performance, which both students and teachers commented upon. The delays were unacceptable and led to an unsatisfactory session.

As we move OLab4 to a more efficient framework, including a single-page application (SPA) architecture, and offload some of the processing onto related microservices, we are exploring how different software frameworks can improve the scalability of OLab4.

In the past, the server side of OLab4 has been dependent on PHP, which can be notoriously slow in such situations. We are now exploring more modern frameworks, such as .NET, which scale up much more efficiently. We can also place such microservices into a cloud or cluster of servers, which makes the system far more scalable.

To get a sense of how far this can be pushed, our Senior Tech Analyst for OHMES set up the core messaging structures in a software stress-testing testbed. These tools allowed us to create hundreds of virtual concurrent users, which is a common practice when stress testing servers and processes.

We were pleased to find that the new TTalk micro-service core was able to easily scale up to over 10,000 concurrent learners, communicating with 100 Turkers or facilitators, which is a much higher number than we are likely to need in the next few years. And because the services are cloud-based, should the need arise for much greater numbers of users, they can be scaled up accordingly.
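
For readers unfamiliar with the virtual-user approach, here is a minimal sketch in Python of the general idea. It is not the testbed we used, and it does not touch the real TTalk service: an in-process queue stands in for the messaging core, and all names (learner, facilitator, run_stress_test, stress) are hypothetical. Each virtual learner sends a few messages, a small pool of virtual facilitators answers them, and the script records how long each round trip took.

```python
import asyncio
import time


async def facilitator(inbox: asyncio.Queue) -> None:
    """Hypothetical Turker: answers every learner message it pulls off the queue."""
    while True:
        sent_at, reply = await inbox.get()
        # Resolve the learner's future with the elapsed round-trip time.
        reply.set_result(time.perf_counter() - sent_at)
        inbox.task_done()


async def learner(inbox: asyncio.Queue, messages: int) -> list:
    """Hypothetical virtual learner: sends messages one at a time and waits for replies."""
    loop = asyncio.get_running_loop()
    delays = []
    for _ in range(messages):
        reply = loop.create_future()
        await inbox.put((time.perf_counter(), reply))
        delays.append(await reply)
    return delays


async def run_stress_test(learners: int, facilitators: int, messages: int) -> dict:
    """Run all virtual users concurrently and summarise the round-trip delays."""
    inbox = asyncio.Queue()  # stand-in for the real messaging core (assumption)
    workers = [asyncio.create_task(facilitator(inbox)) for _ in range(facilitators)]
    results = await asyncio.gather(
        *(learner(inbox, messages) for _ in range(learners))
    )
    # All learners are done, so shut the facilitator pool down.
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    delays = sorted(d for per_learner in results for d in per_learner)
    return {
        "messages": len(delays),
        "median_s": delays[len(delays) // 2],
        "max_s": delays[-1],
    }


def stress(learners: int = 1000, facilitators: int = 100, messages: int = 3) -> dict:
    """Convenience wrapper so the test can be started from ordinary code."""
    return asyncio.run(run_stress_test(learners, facilitators, messages))


if __name__ == "__main__":
    print(stress())
```

A real test would replace the queue with network calls to the service under test, so the numbers from this sketch say nothing about TTalk itself. It only illustrates how many simulated users can be driven from one machine while the round-trip delay is measured.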