Wednesday, May 24, 2006
Hardware Sizing - Oracle Application Server
This document presents a set of guidelines for sizing hardware for Oracle Application Server.
The main steps toward sizing the right hardware are the following:
1. Analyze the user load and requirements
2. Choose an architecture
3. Choose a database management system
4. Consider reliability
5. Consider scalability
6. Use a hardware sizing tool
Three types of information will be needed for sizing the hardware:
Processing
The first type of question has to do with processing (i.e., CPU) requirements. The business transaction volumes will be important for sizing CPU requirements. In some situations, this can be reduced to merely asking for “number of users.” In other cases, not all users put the same load on the server, so it may be necessary to ask for the “total number of business transactions processed.” Other questions include: How many orders are entered in a typical 8-hour day? What is the average number of items per order? How many loans are processed? How many hits-per-second are typical for a Web application?
Memory
The second type of question has to do with memory requirements. Often, applications are designed so that separate jobs and/or storage are used for each user. These questions will take the form of: How many order entry users are there? How many accounts payable users are there? How many concurrent users/transactions are there? ...etc.
Disk
The third type of question has to do with disk (DASD) requirements. How many documents and records are there? What is their average size? How many SKUs are maintained? How many customer accounts? How long is historical data kept on-line?
Each of these three categories for information gathering may benefit from additional questions to help determine the complexity or “largeness” of that item. For example:
“Are users accessing the application through the Web or a native interface?” might affect the amount of CPU processing.
“What proportion of application users are doing Web browsing versus Web purchasing?” may affect disk activity for updates as compared to the total memory demand for caching frequently referenced Web pages.
Some care must be taken to organize the questions, phrasing them in a logical and comprehensible manner so the end user does not become overwhelmed. Also, keep in mind that code can be provided in the Sizing Guide Solution to offer some reasonable defaults in the event that not all of the information is readily known by the user. In some cases, the customer will only have “impressions” of how the server will be used. This will lead to using a large number of “fuzzy” terms to describe the work. Terms such as “light,” “heavy,” “moderate,” “casual,” and “complex” would be used extensively as answers to the sizing questions. This can make the effort of quantifying the default values more critical to achieving reasonable sizing results.
Sizing Parameters:
The sizing depends heavily on two parameters, 'Total Connected Users' and 'Business Trx/hr for all Users'. For light users, 'Business Trx/hr for all Users' should be 1 to 4 times 'Total Connected Users'. For heavy users, 'Business Trx/hr for all Users' can be up to 12 times 'Total Connected Users'.
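The ratio check described above can be sketched in a few lines. This is only an illustration: the function names and the returned labels are hypothetical, and treating 1 to 4 as the "light" band and anything above 4 up to 12 as the "heavy" band is one reading of the guideline, not something the sizing tool is known to enforce.

```python
def trx_per_user_ratio(total_connected_users, business_trx_per_hour):
    """Ratio of 'Business Trx/hr for all Users' to 'Total Connected Users'."""
    if total_connected_users <= 0:
        raise ValueError("total_connected_users must be positive")
    return business_trx_per_hour / total_connected_users


def classify_load(ratio):
    """Map the ratio onto the light/heavy bands (assumed interpretation)."""
    if ratio < 1:
        return "below light range"
    if ratio <= 4:
        return "light"
    if ratio <= 12:
        return "heavy"
    return "above heavy range"


if __name__ == "__main__":
    # Hypothetical input: 1000 connected users, 3000 business transactions/hour.
    ratio = trx_per_user_ratio(1000, 3000)
    print(ratio, classify_load(ratio))
```

A ratio that falls outside both bands is a hint that one of the two inputs may have been estimated incorrectly.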
Number of Active Concurrent Users: the number of users who are actively using the system. An active user requests a page, thinks for some time, then requests another page. The user is considered active in the time between the first and last page access, including the think times.
Page View Rate per User: the rate at which the average user generates a page request (i.e. a "click") on the system. Isizer expects this rate to be expressed in page views per minute. You can enter a fraction if the rate is less than one.
Number of Hits per Page View: each page view by a user generates one or more hits on the OC4J server, depending on the page and the application behind it, because of images and other elements in the page. More complex web pages tend to generate more hits and are therefore heavier for OC4J to process.
Workload Type: ultimately, sizing OC4J depends heavily on the Java application. A poorly written or very complex application will require more resources than a well-written, simple application. We have chosen the Pet Store application to run our benchmarks and generate our sizing metrics. You must estimate how much heavier or lighter your application is compared to Pet Store. A "Browsing" scenario is more appropriate when your web site is mostly used for read-only access; more data
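The first three parameters above can be combined into a hits-per-second estimate. The sketch below assumes the straightforward product (active users × page views per minute × hits per page view, divided by 60); the text does not spell this formula out, and the function name is hypothetical.

```python
def hits_per_second(active_users, page_views_per_minute, hits_per_page_view):
    """Estimate hits/sec on the server from the three load parameters.

    Assumption: every active user generates page views at the average
    rate, and every page view produces the average number of hits.
    """
    hits_per_minute = active_users * page_views_per_minute * hits_per_page_view
    return hits_per_minute / 60.0


if __name__ == "__main__":
    # Hypothetical input: 600 active users, 2 page views per minute each,
    # 5 hits per page view.
    print(hits_per_second(600, 2, 5))
```

The workload type is not part of this calculation; it would scale the result up or down according to how the application compares with Pet Store.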
For the purposes of this document, a transaction is counted as the single HTTP hit that primarily constitutes it. Throughput is a measure of the data transferred between users and the application server. Because the test requires all work to be done within an hour, throughput was not an explicit goal of the testing.
Latency is the time it takes one user to perform a transaction and receive results from the application. (Transaction latency does not include browser rendering, although it does reflect network load.) Lastly, the term named users represents the total community of users who have a login; concurrent users are the authenticated and active subset of named users.
Transaction Concurrency
Transaction concurrency is a derivative of the testing scenario, which simulated users in a typical application environment with a production database. The testing confirmed that transaction concurrency and latency are directly related. When a system gets busy, more transactions end up executing at the same time and generally take longer to complete, so the level of transaction concurrency increases when latency increases.
Scalability
Scalability refers to the performance characteristics of an application as the workload increases, and of the application as it runs on larger and larger server configurations. For example, an application with a characteristic of linear scalability requires twice the server resources to perform twice the workload. Another vertical scalability characteristic involves whether an application can make full use of system resources (e.g., all processors as n-way increases).
The application itself should be “tuned” to make optimal use of CPU, memory, and disk resources; that is, to make the application scale in a predictable and efficient manner. Examples of tuning techniques include: analyzing the application’s efficiency, adjusting the application design and implementation to perform as much work as possible in parallel in multiple threads (or jobs), and minimizing locking contention between application users (or between various parts of the application). Given enough disk and memory resources, and with enough workload, a highly scalable application can push the processor to 90% utilization or higher.
An application may have been designed for a certain kind of hardware configuration, which effectively means that buying larger, faster hardware may not yield the expected, linear improvement in throughput. For instance, if the application is written to take advantage of 16 threads, then running that application on a 32-way processor will not show throughput that is anywhere near twice as fast.
CPU Sizing
This section includes two formulas for CPU sizing: one for determining a configuration that will provide optimal response times, and one for producing a configuration with good (but less than optimal) response times. In choosing which formula to use, consider the transaction types and user characteristics of the target environment. Section 2, “Test Results,” shows which transaction types exhibit better scaling in comparison to others, and this information can be used as a reference to tailor configurations for particular workloads.
The formulas presented below for CPU sizing are based on the number of transactions per CPU. This seems to be the best approach given trends in the test results. Deriving the number of transactions per CPU requires:
The number of concurrent users, given a named user population of a certain size
The number of concurrent users that have transactions executing at any one point in time. Of the total concurrent users, how many would have transactions that are simultaneously executing?
This value is directly related to the transaction latency measured in the tests. For optimal response times, this value is 3.5 percent, and for moderate response times, it is 4.5 percent.
Next, since we know the expected number of transactions in the system, how do these transactions map to CPUs? Again the testing provides some indication: approximately 2.5 transactions were executing on each CPU when response times were optimal, and about 4.5 transactions were executing on each CPU when response times were reasonable (but not optimal).
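As a rough illustration, the percentages and per-CPU figures quoted above can be chained into a CPU estimate. This is a sketch of one way to combine them, not a formula stated in the test results: the function and profile names are hypothetical, and rounding up to a whole CPU is an assumption. Exact fractions are used so that float rounding cannot push a result over a whole-CPU boundary.

```python
import math
from fractions import Fraction

# (share of concurrent users with an executing transaction,
#  transactions executing per CPU), taken from the figures quoted above.
PROFILES = {
    "optimal": (Fraction(35, 1000), Fraction(25, 10)),   # 3.5 %, 2.5 per CPU
    "moderate": (Fraction(45, 1000), Fraction(45, 10)),  # 4.5 %, 4.5 per CPU
}


def executing_transactions(concurrent_users, profile):
    """Expected number of simultaneously executing transactions."""
    share, _ = PROFILES[profile]
    return concurrent_users * share


def cpus_for_concurrent_users(concurrent_users, profile="optimal"):
    """CPUs needed, rounded up to a whole CPU (rounding is an assumption)."""
    _, per_cpu = PROFILES[profile]
    return math.ceil(executing_transactions(concurrent_users, profile) / per_cpu)


if __name__ == "__main__":
    # Hypothetical input: 1000 concurrent users.
    for name in PROFILES:
        print(name, cpus_for_concurrent_users(1000, name))
```

With these figures the optimal profile works out to 0.014 CPUs per concurrent user and the moderate profile to 0.010.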
Total CPU MHz = # Hits/Sec * 4.65
Number of CPUs = Total CPU MHz / MHz per CPU
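The two formulas above translate directly into code. In the sketch below the function names are hypothetical, and rounding the CPU count up to a whole number is an assumption the formulas themselves do not state. The 4.65 factor is held as an exact fraction to avoid float rounding at whole-CPU boundaries.

```python
import math
from fractions import Fraction

# MHz required per hit/sec, from "Total CPU MHz = # Hits/Sec * 4.65".
MHZ_PER_HIT_PER_SEC = Fraction(465, 100)


def total_cpu_mhz(hits_per_sec):
    """Total CPU MHz = # Hits/Sec * 4.65."""
    return Fraction(hits_per_sec) * MHZ_PER_HIT_PER_SEC


def number_of_cpus(hits_per_sec, mhz_per_cpu):
    """Number of CPUs = Total CPU MHz / MHz per CPU, rounded up (assumed)."""
    return math.ceil(total_cpu_mhz(hits_per_sec) / mhz_per_cpu)


if __name__ == "__main__":
    # Hypothetical input: 200 hits/sec on 750 MHz processors.
    print(float(total_cpu_mhz(200)), number_of_cpus(200, 750))
```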