In a way, downtime is good news for an e-commerce company. It indicates tremendous success in engaging customers and other stakeholders, including vendors, back-end IT teams, and marketing units, which means the business is heading in the right direction. The performance of the IT systems, however, has to be constantly reinforced to match hard-to-predict user activity. Fail-safe performance testing has become a must for any business that thrives on high-volume transactions during peak seasons.

Demand for highly scalable and dependable systems is increasing rapidly in IT-driven verticals, especially e-tail, e-learning, and healthcare. In addition, customers are becoming less tolerant and more vocal, sharing poor buying experiences on social networks, often with screenshots.

Performance testing for e-commerce

In e-commerce, performance testing has multiple dimensions. Performance testing of such a complex system should follow a layered approach that is both manageable and delivers comprehensive coverage. Big distributed systems cannot be fully tested in a UAT environment. There are several levels of testing that stretch over a range of speeds, resources, and fidelity to a production system.

For example, a typical large system might consist of thousands of servers of various kinds: front-end web applications, REST API servers, internal services, caching systems, and various databases. Such a system might process several terabytes of data every day, with storage measured in petabytes. Countless clients and users hit it constantly.
It is difficult to replicate all of this in a UAT environment.

Testing large-scale distributed systems is complex, and there is much to test beyond traditional methods. Performance testing, load testing, and error testing should all be undertaken with realistic usage patterns and extreme loads.

The traditional performance testing approach usually involves identifying key scenarios, setting up the load environment, designing the scripts, generating load, monitoring, and finally analysis and reporting. It works for most systems, but performance testing a large-scale distributed system is a completely different ball game.

Business leaders and technology stakeholders need to look at performance from a fresh perspective. The following table describes some characteristics of the common test scenarios associated with large-scale distributed systems:

Test Scenario | Key Characteristics
High Volume | » Terabytes of transactional records in the database » Network throughput in gigabits per second
High Transactions | » Millions of transactions per second from end users » Millions of database transactions caused by a few triggers (e.g. large report generation during batch processing)
High Concurrency | » Huge user base accessing the system simultaneously (e.g. Facebook)
Geographically Distributed | » Traffic from all over the world
High Availability | » Huge revenue loss and customer complaints due to unavailability
Huge Data Analytics | » Big data, data warehousing

The following table describes performance testing solutions that address the business problems associated with a large-scale distributed system:

Key Challenges | Proposed Solution
High cost of test environment setup | » Production or staging environment » Scaled-down environment
High cost of load generation environment | » Cloud-based load generation tool
High license cost of tools and utilities | » Open-source load generation/monitoring tools » Use a pay-per-service model if the number of runs is small
Production-like environment configuration | » Use a CI tool such as Jenkins for automated build and deployment
Configuration consistency across a large number of nodes | » Automatically validate configurations before and after execution » Take a restore point and roll back to it after the performance run
Population of a high volume of test data | » Copy production data and mask it » Alter DB volumes » Use tools such as a database generator or dbMonster » Use historical data
Simulation of realistic load | » Identify key scenarios and usage patterns from log files, market research, BA, etc. » Generate load from different geographies » Baseline response times with and without a CDN
Penetrating system complexity, touching all system nodes and database tables | » Understand the system architecture » Manually walk through the scenarios and watch traffic on different nodes and database tables » Analyze application logs in detail » Understand the load balancer strategy
Identification and testing of failover scenarios | » Test the failover scenario separately under load
Third-party interactions | » Simulate using stubs
Monitoring a large number of disparate systems | » Use diagnostic tools such as AppDynamics, Dynatrace, HP Diagnostics, Glassbox
Result collation and analysis | » Automatic result collection and collation » Collection of built-in anti-patterns for quick analysis » Knowledge base of historical failures and bottlenecks

Conclusion

For an e-commerce company, the user can be any computer-literate individual with internet access. This makes user activity very difficult to predict. The scenarios that generate peak traffic are susceptible to changing combinations of product demand, pricing, product launches, availability, and UX.

To stay resilient, fail-safe performance testing consolidates the scenarios into predictable,