1. Introduction to Performance Testing


Today's fast-paced digital world demands performance testing in the IT industry. Every year a large number of new software products are developed, but not all of them meet user expectations. Competition is fierce, and performance issues can damage a company's reputation, cause customer dissatisfaction, and lead to revenue loss. A product's performance is a significant and critical aspect that must be verified before release. Let's learn about performance testing and how it impacts software development.

1.1 What is performance testing?

Performance testing is the practice of evaluating how a system performs in terms of stability, speed, scalability, and responsiveness, and how well an application/product holds up under different anticipated workloads.

Speed: Determines whether the application responds quickly.

Scalability: Determines the maximum load the application under test (AUT) can handle.

Stability: Determines if the application is stable under varying loads.

Performance tests are typically executed to examine speed, robustness, reliability, and correct sizing. The process measures “performance” indicators such as:

  • Browser, page, and network response times.
  • Server request processing times.
  • Acceptable concurrent user volumes.
  • Processor and memory consumption.
  • The number and type of errors that might be encountered in the application.
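As an illustration, the client-side indicators above can be computed from raw request samples. The sketch below is a minimal example, not tied to any specific load tool, and the sample timings are invented for demonstration.

```python
from statistics import mean

def summarize(results, window_seconds):
    """Aggregate raw (elapsed_seconds, ok) samples into headline indicators."""
    times = sorted(t for t, ok in results if ok)
    errors = sum(1 for _, ok in results if not ok)
    p90 = times[int(0.9 * (len(times) - 1))]  # simple nearest-rank percentile
    return {
        "avg_response_s": mean(times),
        "p90_response_s": p90,
        "throughput_rps": len(results) / window_seconds,  # requests per second
        "error_rate": errors / len(results),
    }

# Five hypothetical requests observed over a 10-second window.
samples = [(0.12, True), (0.30, True), (0.18, True), (0.95, False), (0.22, True)]
stats = summarize(samples, window_seconds=10)
```

In a real run these samples would come from the load tool's result log; the aggregation logic stays the same.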

1.2  Why is performance testing required for any application?

We should verify how the system behaves and responds under various loads, in order to avoid negative business impact.

Performance is an essential part of the user experience for the huge numbers of users who expect “good Performance” from their applications when using a variety of platforms. Performance testing plays a critical role in establishing acceptable quality levels for the end user and is often closely integrated with other disciplines such as usability engineering, performance engineering and DevOps.

Additionally, evaluation of functionality, usability and other quality characteristics under conditions of load, such as during a performance test, may reveal load-specific issues which impact those characteristics.

The ISO 25010 Product Quality Model classifies performance efficiency as a non-functional quality characteristic with three sub-characteristics, and performance testing usually concentrates on one or more of them:

Time Behavior: Generally, the evaluation of time behavior is the most common testing goal. This aspect of performance testing examines the ability of a component or system to respond to user or system inputs within a specified time and under specified conditions. Measurements of time behavior may vary from the “end-to-end” time taken by the system to respond to user input, down to the number of CPU cycles required by a code module to execute a task.

Resource Utilization: If the availability of system resources is identified as a risk, the utilization of those resources (e.g., the allocation of limited RAM) may be investigated by conducting specific performance tests.

Capacity: If the capacity of the system (e.g., numbers of users or volumes of data) is identified as a risk, performance tests may be conducted to evaluate the suitability of the system architecture.

Performance testing is not limited to the web-based domain where the end user is the focus. It is also relevant to other application domains with a variety of system architectures, such as classic client-server, distributed, and embedded systems.

1.3 Types of Performance Testing.

Although all performance test types simulate traffic, they have different objectives and send varying levels and shapes of traffic. Some tests are designed to assess baseline performance, while others are suitable for testing edge cases ad hoc.

The following types of performance tests are used to assess applications:

1.3.1 Smoke/Sanity/Shakeout/Shake-down Testing:

Shakeout tests are typically performed after major system changes or deployments. The performance suite is re-run for every new update to ensure that all test scripts and test scenarios work as intended.

The key considerations of sanity tests are:

  • How is the new version of the application performing when compared to previous ones?
  • Is any performance degradation observed in any area in the new version?
  • What should be the next area where developers should focus to address performance issues in the new version of application?

Example: A single-user run validates the scripts and surfaces correlation issues in your scenario; if the single-user test completes successfully, the scenario is sound. Before executing larger tests, it is good practice to start with one of these "verification" runs to confirm that the test is correct.
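A single-user verification run like the one described can be sketched as a simple harness. The `steps` list and its lambdas below are hypothetical stand-ins for recorded script transactions.

```python
def verify_scenario(transactions):
    """Run each scripted transaction once with a single user; any failure
    means the script or its correlations need fixing before a larger test."""
    failures = []
    for name, step in transactions:
        try:
            ok = step()
        except Exception:
            ok = False
        if not ok:
            failures.append(name)
    return failures

# Hypothetical steps standing in for recorded script transactions.
steps = [("login", lambda: True), ("search", lambda: True), ("logout", lambda: True)]
failures = verify_scenario(steps)  # empty list means the scenario is sound
```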

1.3.2 Load Testing:

Load testing evaluates the behavior and performance of a software application under normal and peak loads, as well as anticipated user loads. It ensures that an application can handle the expected user traffic and workload without compromising performance, responsiveness, or stability.

An application's load can vary depending on user traffic, real-world usage patterns, and performance metrics such as response time, throughput, resource utilization, and error rate. These metrics can be instrumental in identifying and addressing any performance bottlenecks.

A load test should typically be performed for two hours at peak load, using a production-like workload. This helps ensure that the application will work reliably after its release into production, without errors.

There are two types of Load Tests:

1.3.2.1 Peak Load Testing:

Peak load refers to the maximum number of concurrent users on a website within a certain time frame.

A retail website's peak load is most likely to occur during weekends; similarly, Thanksgiving and Christmas are often the busiest times of the year for retail businesses.

1.3.2.2 Average Load Testing:

The average load refers to the average number of concurrent users engaging in specific business activities on the website on a typical day. It's important to note that peak load is different from average load.

A good practice is to identify the peak load for your website during the performance planning phase and design your tests to simulate that load.

Example:

A flood of 10,000 users engaging in activities such as reading, sending, deleting, forwarding, and responding to emails can generate a significant number of transactions for the email functionality, since the server treats each of these activities as a separate transaction. If each user performs 10 transactions per hour, the server would need to process 100,000 transactions in an hour. Conducting this type of load test allows us to monitor the email server's efficiency under load.
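The workload arithmetic above can be captured in a couple of lines; the figures are the ones from the example.

```python
def hourly_transactions(users, tx_per_user_per_hour):
    # Each user action (read, send, delete, ...) counts as one transaction.
    return users * tx_per_user_per_hour

load = hourly_transactions(10_000, 10)  # total transactions per hour
per_second = load / 3600                # arrival rate the server must sustain
```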

1.3.2.3 The key considerations of Load testing are:

  • What is the maximum load that the application can withstand before it begins to behave abnormally?
  • How much data can the application's database handle before the system slows down or crashes?
  • Are there any network-related issues that need to be addressed during the load test?

Key Aspects: Workload Simulation, Scalability Assessment, Identifying Performance Bottlenecks, Performance Optimization and Tuning.

Client – Side Metrics: Number of users, Throughput, Hits per second, Transaction response time, and error rate.

Server – Side Metrics: CPU, Memory, Disk and Network Utilizations.

1.3.3 Stress Testing:

Stress testing examines the ability of a system or component to handle peak loads that are beyond its expected or specified capacity. It also examines how the system responds under intense load and how it recovers when the load is reduced.

The stress test does not include wait times between transactions and iterations, allowing us to determine the application's real breaking point in terms of user load.

We use an incremental approach, adding users to the application until a problem is identified: transaction response times increase sharply, or CPU utilization of the application reaches 80% of the total available capacity. The optimal execution time is until the system fails.
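The incremental ramp-up with stop conditions can be sketched as follows. The `fake_measure` function is a made-up stand-in; in practice the measurements would come from the load tool and server monitors.

```python
def stress_ramp(measure, step=50, max_users=10_000,
                cpu_limit=80.0, response_limit_s=5.0):
    """Add users in increments until CPU or response time breaches a limit.
    `measure(users)` must return (cpu_percent, response_seconds)."""
    users = 0
    while users < max_users:
        users += step
        cpu, resp = measure(users)
        if cpu >= cpu_limit or resp >= response_limit_s:
            return users  # breaking point found at this load level
    return max_users

# Toy model: CPU grows with users; response degrades past 500 users.
def fake_measure(users):
    return users / 10, 0.5 + max(0, users - 500) * 0.01

breaking_point = stress_ramp(fake_measure)
```

Here the CPU limit trips first; with different model parameters the response-time condition would end the ramp instead.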

1.3.3.1 The key considerations of stress tests are:

  • What is the maximum load that a system can withstand before failing?
  • How does the system break down?
  • Is it feasible to get the system to recover once it has crashed?
  • How many ways can a system fail, and which components are vulnerable when handling an unexpected load?

Key Aspects: Memory leaks, slowdowns, security issues, race conditions, synchronization issues, and data corruption are all investigated during a stress test.

Client – Side Metrics: Throughput, Transaction response time.

Server – Side Metrics: CPU, Memory, Disk, IO, and Network Utilizations.

1.3.4 Endurance Testing:

Soak testing, also known as endurance testing or longevity testing, is a type of performance testing that assesses an application's behavior and performance when subjected to a steady workload over an extended period. Its main objective is to identify performance deterioration, memory leaks, or other issues that may arise only after the application has been running continuously for a long time.

Typical durations for soak testing are 8, 16, or 24 hours, depending on the type of application:

Service Provider Application (8 or 16 hours): An application a company utilizes internally to provide a specific type of service to its employees.

End User Application (24 hours): An application customers utilize directly, such as e-commerce or net banking.

1.3.4.1 The key considerations of soak test are:

  • How does memory utilization trend in the application over time?
  • How does the application manage a constant load over a long duration?

Key Aspects: System Stability, Memory Leak Detection, Resource Utilization Analysis, Performance Degradation detection, and Data Integrity.

1.3.5  Failover Testing:

The failover test assesses a system or application's ability to smoothly transition from a primary or active state to a secondary or standby state in the event of failure. The primary purpose of failover testing is to ensure that key services or applications continue to operate normally in the event of hardware failure, software difficulties, or other unexpected disruptions.

Failover is a critical component of fault-tolerant, high-availability systems. It is frequently used in environments where downtime is unacceptable or has severe consequences, such as mission-critical applications, data centers, or cloud-based services.

The ideal duration for conducting failover testing is 2 hours.

1.3.5.1 The key considerations of failover tests are:

  • When one of the database servers fails due to a fault, how does the switching between database servers (clustering) occur?
  • Is the application capable of processing data with the remaining active servers even if a database server fails over?

Key Aspects: Failure Scenario Replication, Automatic and Manual Failover Evaluation, Recovery Time Assessment, Data Integrity and consistency, Redundancy Validation, Load Balancing, and Traffic Management.

Client - Side Metrics: Number of users, Transaction response time, Hits per second.

Server - Side Metrics: CPU Utilization, Memory consumption of Database server.
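The clustering switchover described above can be illustrated with a minimal routing sketch; the server names are hypothetical.

```python
def route_request(servers):
    """Pick the first healthy server; models failover to a standby node."""
    for name, healthy in servers:
        if healthy:
            return name
    raise RuntimeError("all servers down")

# Primary has failed; traffic should transparently move to the standby.
active = route_request([("db-primary", False), ("db-standby", True)])
```

A failover test would additionally verify recovery time and data consistency after the switch, which this sketch does not model.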

1.3.6 Scalability Testing:

Scalability testing measures the software application's ability to manage the increasing workload over time. The fundamental goal of scalability testing is to verify whether the application can scale efficiently and effectively to support an increasing number of users, transactions, or data quantities without substantial performance deterioration.

Vertical Scalability: Vertical Scalability, also known as scaling up or scaling vertically, refers to the process of increasing the resources (such as CPU, memory, storage, or processing power) of a single server or hardware component to handle an increased workload or accommodate higher demand. Vertical scalability focuses on enhancing individual server capabilities to improve overall system performance and capacity.

Horizontal Scalability: Horizontal scalability, or scaling out, is the process of adding more servers, nodes, or components to a system to handle increased workload or demand. Horizontal scalability emphasizes the use of multiple instances or nodes to distribute the workload, which improves the overall capacity and performance of the system.
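As a rough illustration of why horizontal scaling is rarely perfectly linear, the toy model below assumes each added node contributes slightly less than the previous one due to coordination overhead. The efficiency factor is an assumption for demonstration, not a measured value.

```python
def horizontal_capacity(per_node_rps, nodes, efficiency=0.9):
    """Illustrative model: each additional node contributes `efficiency`
    times the previous node's gain (overhead grows with cluster size)."""
    return per_node_rps * sum(efficiency ** i for i in range(nodes))

one = horizontal_capacity(200, 1)    # single node: 200 rps
four = horizontal_capacity(200, 4)   # four nodes: less than 4 x 200 rps
```

Scalability tests exist precisely to replace such assumptions with measured throughput at each node count.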

Observations: Load Handling Capacity, Resource Utilization, Vertical & Horizontal Scalability, Auto-scaling or Elasticity Validation in Cloud Environments, Load Balancing, and Capacity Planning.

1.3.7  Spike Testing (OPTIONAL):

Spike testing assesses a software application's ability to handle abrupt or significant surges in user load or traffic, referred to as spikes. The major goal is to see if the application can sustain performance, stability, and responsiveness under rapid fluctuations in demand, as well as how quickly it responds to them.

Key Aspects: Rapid Load Increase, Recovery Behavior, Performance Impact, Threshold Identification, and Resource Management.

Client - Side Metrics: Page Load & Render Time, Client-Side Errors, Network Latency, Network Bandwidth, Response Time, and Browser Memory Usage.

Server - Side Metrics: CPU Usage, Memory consumption, Database Metrics, Server Errors, Throughput, Concurrency, and Queue Length.

Example:

Consider Tatkal e-ticket booking on IRCTC: booking opens at 10 AM, the load is very high for the first 10 to 15 minutes, and then it drops back to the normal level.
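A spike profile like the IRCTC example can be generated as a per-minute user-count schedule; the baseline and peak figures below are invented for illustration.

```python
def spike_profile(baseline, peak, total_min, spike_start, spike_len):
    """Per-minute target user counts: steady baseline with a sudden spike."""
    return [peak if spike_start <= m < spike_start + spike_len else baseline
            for m in range(total_min)]

# Normal load, then 10x the users for 15 minutes when booking opens.
profile = spike_profile(baseline=500, peak=5000, total_min=60,
                        spike_start=10, spike_len=15)
```

Feeding such a schedule to a load generator reproduces the abrupt rise and fall that spike testing is meant to exercise.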

1.3.8 Volume Testing:

During a volume test, a software application is evaluated for its ability to handle large amounts of data. As data volume increases, performance, scalability, and resource utilization of the application are tested. By conducting this type of testing, it is possible to identify performance bottlenecks, resource limitations, or issues related to data management and storage.

The importance of volume testing is especially evident for applications containing substantial quantities of data, such as databases, data warehouses, content management systems, and file storage systems.

The most common recommendation from this test is tuning the DB queries that access the database. In some cases, query response times are high against a large database, so the queries need to be rewritten, or indexes, joins, etc. need to be added.

Ideal time duration of execution is 1 hour.

1.3.8.1 The key considerations of volume tests are:

  • How does the application's database server perform as the load on it is constantly increased?

Key Aspects: Resource Utilization, Data Volume Increase, Performance Metrics (response time, throughput, and database query times, under different data volume levels to assess how the application's performance scales), Data Integrity, and Database Performance. 

1.4  Baseline Testing:

Baseline testing, often known as benchmark testing, is a type of performance testing that creates a reference point or baseline for the performance of a software program under typical or expected conditions. The primary objective of baseline testing is to record the application's performance metrics in a controlled environment for future comparisons, optimizations, and reviews.

The initial load test begins once all issues discovered during baseline testing have been addressed. This reduces the number of errors reported during the initial load test, since the majority of issues will already have been resolved during the baseline tests.

Baseline testing should ideally be conducted with 1, 2, 5, 10, 20, and 50 users, independently for each script that will be part of the load test, so that any problems can be isolated immediately. To get useful averages, each baseline test should run for 20 to 30 minutes; make this a priority and include it in your scripting and preparation for the load test.
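The recommended baseline matrix (each user step, per script, for 20 to 30 minutes) can be generated mechanically; the script names below are hypothetical.

```python
def baseline_plan(script_names, user_steps=(1, 2, 5, 10, 20, 50),
                  duration_min=25):
    """One independent run per (script, user count), per the guidance above."""
    return [(script, users, duration_min)
            for script in script_names
            for users in user_steps]

plan = baseline_plan(["login", "checkout"])  # 2 scripts x 6 steps = 12 runs
```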
