Maintaining hyper-sonic releases at Dream11
Cricket, as a sport, is worshipped in our country. 2020 was a trying time for everyone, and sports had come to a halt for almost 4 to 5 months. The Indian Premier League (IPL) 2020 came as a much-needed welcome change for Indian sports fans. Fans celebrated the grandeur of the IPL and increased their engagement with the matches by participating in Dream11 fantasy sports contests, which amplified the thrill and excitement of the tournament. As much fun as the IPL is for the fans, it is even more exhilarating for Dream11 to provide a great sports engagement platform for them.
However, this is no cakewalk. The IPL is a challenging period for the Tech team at Dream11, where we roll out new and unique features every year. We run different experiments to analyse users’ adoption of those features, carve out new services based on the load and performance we observe on existing systems, and derive the roadmap for upcoming big sporting events like the IPL.
User satisfaction has always been our primary goal. We are extremely committed to providing varied user-centric experiences that would make our app stand out. Thus, it is of the utmost importance for us to undertake the mammoth task of rolling out multiple new features on a weekly basis during the IPL season. While it may sound like a simple task, in actuality this turns out to be one of the biggest challenges for the entire team as even a minor change without proper release planning would end up impacting the user experience.
That being said, this IPL 2020 season, our team rolled out weekly app releases on both the iOS and Android platforms. At the same time, there were more than 40 backend deployments a week. The final volume speaks for itself!
To meet this challenge at scale, a lot of processes and methodologies were tried, tested and implemented. The first step was to identify the problem areas.
The four roadblocks:
Delivering at such a fast pace while maintaining the quality and stability of the entire system is a worthy challenge for the SDET team. We started by identifying the major roadblocks that prevented us from delivering a quality product at a faster pace and from scaling our automation.
The four roadblocks that we faced were:
- Test Data Setup: With multiple systems involved, setting up test data was a time-consuming and tedious process.
- Non-Uniform Automation: Multiple teams were trying to solve similar problems using different automation techniques that usually resulted in re-inventing the wheel.
- Regression Testing: A small change in any service or app took a lot of time and effort, as it required testing everything on both platforms, i.e., Android and iOS. This resulted in longer regression cycles.
- Delayed Feedback Around Analytical Events: User-centric screen navigation data and feedback, captured through events, was not being validated for correctness. Any problem in this data would result in incorrect business decisions.
What we did:
To overcome these four barriers in a remote working situation, the team had to reassess its entire strategy and decide how to tackle these pain points. Despite the numerous setbacks, the team was determined to find solutions, relying on the trust in and expertise of its members.
1) Scaffolding Code for Service Automation: The Dream11 ecosystem has more than 100 services running in the background, and to top it all, these systems go through rapid changes as new features are continuously added and updated on the app. To provide streamlined test automation coverage for all existing and newly-crafted services, the team created boilerplate code for API automation. This framework had all the important components, including database integration, sample API test cases, a test runner, Jenkins integration and reporting.
These components made it easy for all the different teams involved to fork the boilerplate code, providing uniformity across service-level automation tests. As a result, different teams could contribute directly to each other’s services without the overhead of understanding different frameworks. It also simplified the whole procedure, which had earlier been time-consuming.
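To make this concrete, here is a minimal sketch of what a sample API test in such a boilerplate might look like, using pytest and requests. The base URL, endpoints, payloads and fixture names are illustrative assumptions, not Dream11's actual services or framework code.

```python
import pytest
import requests

BASE_URL = "https://staging.example.internal"  # hypothetical test environment URL


@pytest.fixture
def auth_headers():
    """Authenticate once per test and return reusable headers."""
    resp = requests.post(f"{BASE_URL}/auth/login",
                         json={"username": "qa_user", "password": "qa_pass"})
    resp.raise_for_status()
    return {"Authorization": f"Bearer {resp.json()['token']}"}


def test_contest_listing_returns_ok(auth_headers):
    """Sample API test case: the contest listing endpoint responds with valid data."""
    resp = requests.get(f"{BASE_URL}/contests",
                        headers=auth_headers, params={"matchId": 101})
    assert resp.status_code == 200
    assert isinstance(resp.json().get("contests"), list)
```

Because every team forks the same skeleton, a test like this looks the same regardless of which service it exercises, and the runner, reporting and Jenkins wiring come along with it.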
How It helped: Automation for services became a cakewalk, as the automation framework is now readily available. In a short span of time, more than 2,000 scenarios were automated, leading to a drastic improvement in test coverage. This proved to be a strong basis for rolling out backend services at a quicker pace.
2) Test Data Generation Engine: Another hurdle for any team is generating test data, be it for manual testing or for automation. Generating it means interacting with many different services in our ecosystem, and the resulting data persists across multiple databases. Thus, finding and pinpointing the right data at the right time is a big challenge.
This was, undeniably, the most tedious and error-prone part of manual and automation testing. More importantly, any error in the test data would automatically raise questions about the credibility and reliability of the test cases themselves. One might argue for keeping pre-created data, but that approach means we need to be sure no one taints the data at any point in time, which in a way limits the scope of testing itself. Hence, we needed to create data at runtime so that the power resides with the testers, who can decide what kind of data setup they need.
Our team came up with the idea of a centralised data creation library, which is responsible for interacting with the entire ecosystem of services, databases, caches, crons, etc., to generate test data in less than a second (excluding network latency).
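As a rough illustration, such a centralised data creation library could expose runtime data creation to testers through a single entry point, along these lines. The class, method and endpoint names below are hypothetical, not the actual Dream11 engine.

```python
import requests


class TestDataEngine:
    """Single entry point that talks to the underlying services to build test data."""

    def __init__(self, base_url: str):
        self.base_url = base_url
        self.session = requests.Session()

    def create_user(self, balance: float = 100.0) -> dict:
        """Provision a fresh user with a given wallet balance at runtime."""
        resp = self.session.post(f"{self.base_url}/testdata/users",
                                 json={"balance": balance})
        resp.raise_for_status()
        return resp.json()

    def create_contest(self, match_id: int, entry_fee: float, spots: int) -> dict:
        """Create a contest with exactly the shape the test needs."""
        resp = self.session.post(f"{self.base_url}/testdata/contests",
                                 json={"matchId": match_id,
                                       "entryFee": entry_fee,
                                       "spots": spots})
        resp.raise_for_status()
        return resp.json()


# Usage inside any automation suite: the tester decides the data shape at runtime.
engine = TestDataEngine("https://staging.example.internal")
user = engine.create_user(balance=500.0)
contest = engine.create_contest(match_id=101, entry_fee=49.0, spots=2)
```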
How It helped: Generating test data used to be a time-consuming activity that took a lot of effort. Our system has turned this process into a single-click operation.
Also, the test data library has become a source of information and is plugged into different backend and mobile automation frameworks, where test data setup can be easily achieved. This in turn helped the teams focus more on business logic rather than on test data creation. The dataset is also more reliable and flexible enough to cater to a tester's needs.
The decoupled approach also saved debugging and maintenance effort, as only a central fix is required when any of the base systems change. For instance, a change in the login logic needs a change only in the engine, and all the automation suites in the ecosystem continue to work as intended.
3) Mobile App Automation: With a user base of more than 100 million, the Dream11 mobile app has to support a wide variety of devices and hardware configurations. A typical release regression used to take 5–6 days of effort, which made weekly releases challenging, particularly during IPL 2020, when colossal traffic is generated and ensuring stability is paramount.
Using analytics events, the team identified the most widely used platforms and devices and started automating the critical user journeys. More than 85% of the critical user journeys are now automated and executed on a daily basis against release and feature builds.
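For illustration, a critical user journey can be automated with the Appium Python client roughly as follows. The capabilities, element locators and journey steps are placeholder assumptions, not Dream11's real identifiers or suite.

```python
from appium import webdriver
from appium.options.android import UiAutomator2Options
from appium.webdriver.common.appiumby import AppiumBy

# Hypothetical capabilities for an Android feature/release build.
caps = {
    "platformName": "Android",
    "appium:automationName": "UiAutomator2",
    "appium:deviceName": "Pixel_4",
    "appium:app": "/builds/app-release.apk",
}

driver = webdriver.Remote("http://localhost:4723",
                          options=UiAutomator2Options().load_capabilities(caps))
try:
    # Critical journey: log in and verify the user lands on the home screen.
    driver.find_element(AppiumBy.ACCESSIBILITY_ID, "login_mobile_input").send_keys("9999999999")
    driver.find_element(AppiumBy.ACCESSIBILITY_ID, "login_submit").click()
    assert driver.find_element(AppiumBy.ACCESSIBILITY_ID, "home_screen").is_displayed()
finally:
    driver.quit()
```

Running such journeys on every feature build, across the most popular device profiles, is what shortens the feedback loop described below.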
How It helped: Running the automation tests on feature builds has helped us get faster feedback during the development phase itself. This has boosted our confidence in regression testing and reduced the timelines: the overall effort of 5–6 man-days came down to 2 man-days. Apart from maintaining quality, we also increased the number of releases.
4) Analytics Event Automation: User events are a crucial piece of information that helps the business improve the user experience. They are critical data points for decision making by the product and design teams. User events primarily help keep a tab on user navigation across screens: they tell us when users land on a screen, where they drop off, and when they abruptly end their session.
With each new feature rolled out, new events are introduced or new attributes are added to existing events. We keep changing events and experimenting, and keeping track of every minor change can be tedious. To make sure we consume the right user events, which ultimately drive the right decisions, the team developed an event automation service that can be plugged into the mobile automation framework.
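The core idea of such event validation can be sketched as a comparison between the events captured during an automated journey and an expected contract, reporting any missing events or attributes. The event names and attributes below are hypothetical examples, not the actual Dream11 event schema.

```python
# Expected contract: event name -> attributes that must be present.
EXPECTED_EVENTS = {
    "ContestJoined": {"matchId", "contestId", "entryFee"},
    "TeamCreated": {"matchId", "teamId"},
}


def validate_events(captured_events: list[dict]) -> list[str]:
    """Return human-readable mismatches between captured and expected events."""
    mismatches = []
    seen = {e["name"]: set(e.get("attributes", {})) for e in captured_events}
    for name, required_attrs in EXPECTED_EVENTS.items():
        if name not in seen:
            mismatches.append(f"Missing event: {name}")
            continue
        missing_attrs = required_attrs - seen[name]
        if missing_attrs:
            mismatches.append(f"{name} missing attributes: {sorted(missing_attrs)}")
    return mismatches


# Example: 'TeamCreated' never fired and 'ContestJoined' lost an attribute.
captured = [{"name": "ContestJoined",
             "attributes": {"matchId": 101, "contestId": 555}}]
print(validate_events(captured))
# ["ContestJoined missing attributes: ['entryFee']", 'Missing event: TeamCreated']
```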
How it helped: Whenever existing events stop flowing across the user journeys, or attributes of those events go missing, the event automation reports clearly indicate the mismatch and enable the team to investigate and fix the missing events.
These were some of the key areas where, by building automation coverage over time, we were able to deliver at 3x speed with higher confidence and a more stable system. We overcame the obstacles and met our end goal of providing users with a seamless experience.
Authored By: Dream11 SDET Team