Test Plan for Web Security Testing

Complete Tutorial on Test Plan

For reference purposes, the sections that the IEEE 829–1998 standard recommends have been listed below:

Test Plan Identifier

Each test plan and, more importantly, each version of a test plan should be assigned an identifier that is unique within the organization. Assuming the organization already has a documentation configuration management process (manual or automated) in place, the method for determining the ID should already have been determined. If such a process has yet to be implemented, then it may pay to spend a little time trying to improve this situation before generating additional documentation.

Introduction

Given that test-planning documentation is not normally considered exciting reading, this section may be the only part of the plan that many of the intended readers of the plan actually read. If this is likely to be the case, then this section may need to be written in an executive summary style, providing the casual reader with a clear and concise understanding of the exact goal of this project and how the testing team intends to meet that goal. Depending upon the anticipated audience, it may be necessary to explain basic concepts such as why security testing is needed or highlight significant items of information buried in later sections of the document, such as under whose authority this testing effort is being initiated.

Project Scope


Assuming that a high-level description of the project's testing objectives (or goals) was explicitly defined in the test plan's introduction, this section can be used to restate those objectives in much more detail. For example, the introduction may have stated that security testing will be performed on the wiley.com Web site, whereas in this section, the specific hardware and software items that make up the wiley.com Web site may be listed. For smaller Web sites, the difference may be trivial, but for larger sites that have been integrated into an organization's existing enterprise network or that share assets with other Web sites or organizations, the exact edge of the testing project's scope may not be obvious and should therefore be documented. Chapter 3 describes some of the techniques that can be used to build an inventory of the devices that need to be tested. These techniques can also precisely define the scope of the testing covered by this test plan.

It is often a good idea to list the items that will not be tested by the activities covered by this test plan. This could be because the items will be tested under the auspices of another test plan (either planned or previously executed), sufficient resources were unavailable to test every item, or other reasons. Whatever the rationale used to justify a particular item's exclusion from a test plan, the justification should be clearly documented as this section is likely to be heavily scrutinized in the event that a future security failure occurs with an item that was for some reason excluded from the testing effort.

Change Control Process


The scope of a testing effort is often defined very early in the testing project, often when comparatively little is known about the robustness and complexity of the system to be tested. Because changing the scope of a project often results in project delays and budget overruns, many teams attempt to freeze the scope of the project. However, if during the course of the testing effort, a situation arises that potentially warrants a change in the project's scope, then many organizations will decide whether or not to accommodate this change based on the recommendation of a change control board (CCB). For example, discovering halfway through the testing effort that a mirror Web site was planned to go into service next month (but had not yet been built) would raise the question "who is going to test the mirror site?" and consequently result in a change request being submitted to the CCB.

When applying a CCB-like process to changes in the scope of the security-testing effort in order to provide better project control, the members of a security-testing CCB should bear in mind that, unlike the typical end user, an attacker is not bound by a project's scope or the decisions of a CCB. This may require them to be a little more flexible than they would normally be when faced with a change request that is not security related. After all, the testing project will most likely be considered a failure if an intruder is able to compromise a system using a route that had not been tested simply because the CCB had deemed it out of scope.

A variation of the CCB change control process implementation is to break the projects up into small increments so that modifying the scope for the increment currently being tested becomes unnecessary because the change request can be included in the next scheduled increment. The role of the CCB is effectively performed by the group responsible for determining the content of future increments.

Features to Be Tested

A security-testing effort may be directed to test only some, rather than all, of the following features of a Web site:

  • Network security
  • System software security
  • Client-side application security
  • Client-side to server-side application communication security
  • Server-side application security
  • Social engineering
  • Dumpster diving
  • Inside accomplices
  • Physical security
  • Mother nature
  • Sabotage
  • Intruder confusion
  • Intrusion detection
  • Intrusion response

Features Not to Be Tested

If the testing effort is to be spread across multiple test plans, there is a significant risk that some tests may drop through the proverbial cracks in the floor, because the respective scopes of the test plans do not dovetail together perfectly. A potentially much more dangerous situation is the scenario of an entire feature of the system going completely untested because everyone in the organization thought someone else was responsible for testing this facet of the system.

Therefore, it is a good practice to not only document what items will be tested by a specific test plan, but also what features of these items will be tested and what features will fall outside the scope of this test plan, thereby making it explicitly clear what is and is not covered by the scope of an individual test plan.

Approach

This section of the test plan is normally used to describe the strategy that will be used by the testing team to meet the test objectives that have been previously defined. It's not necessary to get into the nitty-gritty of every test strategy decision, but the major decisions such as what levels of testing (described later in this section) will be performed and when (or how frequently) in the system's life cycle the testing will be performed should be determined.

Levels of Testing

Many security tests can be conducted without having to recreate an entire replica of the system under test. The extent to which a test depends (or does not depend) on other components being completed affects when and how that test can be run.

One strategy for grouping tests into multiple testing phases (or levels) is to divide up the tests based on how complete the system must be before the test can be run. Tests that can be executed on a single component of the system are typically referred to as unit- or module-level tests, tests that are designed to test the communication between two or more components of the system are often referred to as integration-, string- or link-level tests, and finally those that would benefit from being executed in a full replica of the system are often called system-level tests. For example, checking that a server has had the latest security patch applied to its operating system can be performed in isolation and can be considered a unit-level test. Testing for the potential existence of a buffer overflow occurring in any of the server-side components of a Web application (possibly as a result of a malicious user entering an abnormally large string via the Web application's front-end) would be considered an integration- or system-level test depending upon how much of the system needed to be in place for the test to be executed and for the testing team to have a high degree of confidence in the ensuing test results.
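As an illustration of the unit-level example above, the following is a minimal sketch of a patch-level check that can be run against a single server's details in isolation. The function names, the dotted version format, and the host inventory are all assumptions made for this example; a real check would obtain the installed patch level from the operating system or an asset database.

```python
from typing import Dict, List, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Convert a dotted version string such as '5.10.0' into (5, 10, 0)."""
    return tuple(int(part) for part in text.split("."))


def is_patched(installed: str, minimum: str) -> bool:
    """Return True if the installed patch level is at least the minimum.

    Versions are compared numerically, part by part, so that '5.10.0'
    is correctly treated as newer than '5.9.0'.
    """
    return parse_version(installed) >= parse_version(minimum)


def unpatched_hosts(inventory: Dict[str, str], minimum: str) -> List[str]:
    """List the hosts (hypothetical names) whose patch level is too low."""
    return [host for host, version in inventory.items()
            if not is_patched(version, minimum)]


if __name__ == "__main__":
    # Hypothetical inventory: host name -> installed patch level.
    inventory = {"web01": "5.2.1", "db01": "5.1.9"}
    print(unpatched_hosts(inventory, "5.2.0"))
```

Because the check needs nothing but the server's own version information, it can be run long before the rest of the system is in place.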

One of the advantages of unit-level testing is that it can be conducted much earlier in a system's development life cycle, since the testing is not dependent upon the completion or installation of any other component. Because the earlier a defect is detected, the more easily (and therefore more cheaply) it can be fixed, there is an obvious advantage to executing as many tests as possible at the unit level instead of postponing these tests until system-level testing is conducted, which because of its inherent dependencies typically must occur later in the development life cycle.

Unfortunately, many organizations do not conduct as many security tests at the unit level as they could. The reasons for this are many and vary from organization to organization. However, one recurring theme that is cited in nearly every organization where unit testing is underutilized is that the people who are best situated to conduct this level of testing are often unaware of what should be tested and how to best accomplish this task. Although the how is often resolved through education (instructor-led training, books, mentoring, and so on), the what can to a large part be addressed by documenting the security tests that need to be performed in a unit-level checklist or more formally in a unit-level test plan—a step that is particularly important if the people who will be conducting these unit-level tests are not members of the team responsible for identifying all of the security tests that need to be performed.

Dividing tests up into phases based upon component dependencies is just one way a testing team may strategize their testing effort. Alternative or complementary strategies include breaking the testing objectives up into small increments, basing the priority and type of tests in later increments on information gleaned from running earlier tests (a heuristic or exploratory approach), and grouping the tests based on who would actually do the testing, whether it be developers, outsourced testing firms, or end users. The large variety of possible testing strategies in part explains the proliferation of testing level names that are in practice today, such as unit, integration, build, alpha, beta, system, acceptance, staging, and post-implementation to name but a few. Black (2003), Craig et al. (2002), Kaner et al. (2001), Gerrard et al. (2002), and Perry (2000) provide additional information on the various alternate testing strategies that could be employed by a testing team.

For some projects, it may make more sense to combine two (or more) levels of testing into a single test plan. The situation that usually prompts this test plan cohabitation is when the testing levels have a great deal in common. For example, on one project, the set of unit-level tests might be grouped with the set of integration-level tests because the people who will be conducting the tests are the same, both sets of tests are scheduled to occur at approximately the same time, or the testing environments are almost identical.

Relying on only a single level of testing to capture all of a system's security defects is likely to be less efficient than segregating the tests into two (or more) levels; it may quite possibly increase the probability that security holes will be missed. This is one of the reasons why many organizations choose to utilize two or more levels of testing.


Pass/Fail Criteria
A standard testing practice is to document the expected or desired results of an individual test case prior to actually executing the test. As a result, a conscious (or subconscious) temptation to modify the pass criteria for a test based on its now known result is avoided.

Unfortunately, determining whether security is good enough is a very subjective measure, one that is best left to the project's sponsor (or the surrogate) rather than the testing team. Making a system more secure all too often means making the system perform more slowly, be less user-friendly, harder to maintain, or more costly to implement. Therefore, unlike traditional functional requirements, where the theoretical goal is absolute functional correctness, an organization may not want its system to be as secure as it could be because of the detrimental impact that such a secure implementation would have on another aspect of the system. For example, suppose a Web site requires prospective new clients to go through an elaborate client authentication process the first time they register with the Web site. (It might even involve mailing user IDs and first-time passwords separately through the postal service.) Such a requirement might reduce the number of fraudulent instances, but it also might have a far more drastic business impact on the number of new clients willing to go through this process, especially if a competitor Web site offers a far more user-friendly (but potentially less secure) process. The net result is that the right amount of security for each system is subjective and will vary from system to system and from organization to organization.

Instead of trying to make this subjective call, the testing team might be better advised to concentrate on how to present the findings of their testing effort to the individual(s) responsible for making this decision. For example, presenting the commissioner of a security assessment with the raw output of an automated security assessment tool that had performed several hundred checks and found a dozen irregularities is probably not as helpful as a handcrafted report that lists the security vulnerabilities detected (or suspected) and their potential consequences if the system goes into service (or remains in service) as is.

If an organization's testing methodology mandates that pass/fail criteria be specified for a security-testing effort, it may be more appropriate for the test plan to use a criterion such as the following: "The IS Director will retain the decision as to whether the total and/or criticality of any or all detected vulnerabilities warrant the rework and/or retesting of the Web site." This is more useful than using a dubious pass criterion such as the following: "95 percent of the test cases must pass before the system can be deemed to have passed testing."
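To make the contrast with a bare pass percentage concrete, the sketch below groups findings by criticality and lists each one with its potential consequence, which is closer to what a decision-maker needs. The `Finding` structure, the four-level severity scale, and the sample findings are assumptions made for this example, not part of any standard or tool.

```python
from collections import Counter
from typing import Dict, List, NamedTuple

# Hypothetical severity scale, ordered from most to least critical.
SEVERITY_ORDER = ["critical", "high", "medium", "low"]


class Finding(NamedTuple):
    title: str
    severity: str      # must be one of SEVERITY_ORDER
    consequence: str   # what could happen if the site goes live as is


def summarize(findings: List[Finding]) -> Dict[str, int]:
    """Count findings per severity level, including levels with none."""
    counts = Counter(f.severity for f in findings)
    return {level: counts.get(level, 0) for level in SEVERITY_ORDER}


def report_lines(findings: List[Finding]) -> List[str]:
    """Return one line per finding, most critical first.

    Raises ValueError if a finding uses a severity outside SEVERITY_ORDER.
    """
    ordered = sorted(findings, key=lambda f: SEVERITY_ORDER.index(f.severity))
    return [f"[{f.severity.upper()}] {f.title}: {f.consequence}"
            for f in ordered]


if __name__ == "__main__":
    findings = [
        Finding("Verbose error pages", "low", "leaks server version details"),
        Finding("Default admin password", "critical", "full site takeover"),
    ]
    print(summarize(findings))
    for line in report_lines(findings):
        print(line)
```

A single critical finding can outweigh dozens of passed checks, which is why the summary reports counts per level rather than one overall pass rate.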

Suspension Criteria and Resumption Requirements
This section of the test plan may be used to identify the circumstances under which it would be prudent to suspend the entire testing effort (or just portions of it) and what requirements must subsequently be met in order to reinitiate the suspended activities. For example, running a penetration test would not be advisable just before the operating systems on the majority of the Web site's servers are scheduled to be upgraded with the latest service pack. Instead, testing these items would be more effective if it was suspended until after the servers have been upgraded and reconfigured.

Test Deliverables
Each of the deliverables that the testing team generates as a result of the security-testing effort should be documented in the test plan. The variety and content of these deliverables will vary from project to project and to a large extent depend on whether the documents themselves are a by-product or an end product of the testing effort.

As part of its contractual obligations, a company specializing in security testing may need to provide a client with detailed accounts of all the penetration tests that were attempted (regardless of their success) against the client's Web site. For example, the specific layout of the test log may have been specified as part of the statement of work that the testing company proposed to the client while bidding for the job. In this case, the test log is an end product and will need to be diligently (and time-consumingly) populated by the penetration-testing team or they risk not being paid in full for their work.

In comparison, a team of in-house testers trying to find a vulnerability in a Web application's user login procedure may use a screen-capture utility to record their test execution. In the event that a suspected defect is found, the tool could be used to play back the sequence of events that led up to the point of failure, thereby assisting the tester with filling out an incident or defect report. Once the report has been completed, the test execution recording could be attached to the defect (providing further assistance to the employee assigned to fix this defect) or be simply discarded along with all the recordings of test executions that didn't find anything unusual. In this case, the test log was produced as a by-product of the testing effort and improved the project's productivity.

Before a testing team commits to producing any deliverable, it should consider which deliverables will assist them in managing and executing the testing effort and which ones are likely to increase their documentation burden. It's not unheard of for testing teams who need to comply with some contractual documentary obligation to write up test designs and creatively populate test logs well after test execution has been completed.

Test Log
The test log is intended to record the events that occur during test execution in a chronological order. The log can take the form of shorthand notes on the back of an envelope, a central database repository manually populated via a graphical user interface (GUI) front end, or a bitmap screen-capture utility unobtrusively running in the background taking screen snapshots every few seconds.
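For teams that choose a structured log rather than handwritten notes, a chronological test log could be sketched as follows. The class and field names are assumptions made for this example; the only property taken from the description above is that events are recorded and can be read back in time order.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class LogEntry:
    timestamp: datetime
    tester: str
    event: str


@dataclass
class ExecutionLog:
    """A hypothetical in-memory test log."""
    entries: List[LogEntry] = field(default_factory=list)

    def record(self, timestamp: datetime, tester: str, event: str) -> None:
        """Append an event; entries may arrive out of order."""
        self.entries.append(LogEntry(timestamp, tester, event))

    def chronological(self) -> List[LogEntry]:
        """Return the entries sorted by timestamp, oldest first."""
        return sorted(self.entries, key=lambda entry: entry.timestamp)


if __name__ == "__main__":
    log = ExecutionLog()
    log.record(datetime(2024, 1, 1, 10, 5), "tester_a", "Login form rejected long input")
    log.record(datetime(2024, 1, 1, 10, 0), "tester_a", "Started login tests")
    for entry in log.chronological():
        print(entry.timestamp.isoformat(), entry.tester, entry.event)
```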

Environmental Needs
A test environment is a prerequisite if the security-testing team wants to be proactive and attempt to catch security defects before they are deployed in a production environment. In addition, tests can be devised and executed without worrying about whether or not executing the tests might inadvertently have an adverse effect on the system being tested, such as crashing a critical program. Indeed, some tests may be specifically designed to try to bring down the target system (a technique sometimes referred to as destructive testing). For example, a test that tried to emulate a denial-of-service (DoS) attack would be much safer to evaluate in a controlled test environment than against a production system (even if in theory the production system had safeguards in place that should protect it against such an attack).

It would certainly be convenient if the testing team had a dedicated test lab that was an exact full-scale replica of the production environment, which they could use for testing. Unfortunately, usually as a result of budgetary constraints, the test environment is often not quite the same as the production environment it is meant to duplicate (in an extreme situation it could solely consist of an individual desktop PC). For example, instead of using four servers (as in the production environment) dedicated to running each of the following components—Web server, proxy server, application server, and database server—the test environment may consist of only one machine, which regrettably cannot be simultaneously configured four different ways. Even if a test environment can be created with an equivalent number of network devices, some of the devices used in the test lab may be cheap imitations of the products actually used in the production environment and therefore behave slightly differently. For example, a $100 firewall might be used for a test instead of the $50,000 one used in production.

If the test environment is not expected to be an exact replica of the production environment, consideration should be given to which tests will need to be rerun on the production system, as running them on the imperfect test environment without incident will not guarantee the same results for the production environment. A second consideration is that the test environment could be too perfect. For example, if the implementation process involves any steps that are prone to human error, then just because the proxy server in the test lab has been configured to implement every security policy correctly does not mean that the production version has also been implemented correctly.

In all probability, some critical site infrastructure security tests will need to be rerun on the production environment (such as checking the strength of system administrators' passwords, or the correct implementation of a set of firewall rules). If the Web site being tested is brand-new, this extra step should not pose a problem because these tests can be run on the production environment prior to the site going live. For a Web site that has already gone live, the security-testing team must develop some rules of engagement (terms of reference) that specify when and how the site may be prodded and probed, especially if the site was previously undertested or not tested at all. These rules serve as a means of eliminating false intruder alarms, avoiding accidental service outages during peak site usage, and ensuring that legitimate intruder alarms are not inadvertently ignored.
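The "when" part of such rules of engagement can be expressed as agreed time windows that a test script checks before touching the live site. The sketch below is only an illustration: the function name and the example window (22:00 to 04:00) are assumptions, and real rules of engagement would also cover who is notified and which techniques are permitted.

```python
from datetime import datetime, time
from typing import List, Tuple

# A window is a (start, end) pair of times of day; the end is exclusive.
Window = Tuple[time, time]


def probing_allowed(moment: datetime, windows: List[Window]) -> bool:
    """Return True if `moment` falls inside any agreed testing window.

    A window whose start is later than its end is treated as wrapping
    past midnight (for example 22:00 to 04:00).
    """
    current = moment.time()
    for start, end in windows:
        if start <= end:
            if start <= current < end:
                return True
        elif current >= start or current < end:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical off-peak window agreed with the site's operators.
    off_peak = [(time(22, 0), time(4, 0))]
    print(probing_allowed(datetime(2024, 1, 1, 23, 30), off_peak))
    print(probing_allowed(datetime(2024, 1, 1, 12, 0), off_peak))
```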

Responsibilities
Who will be responsible for making sure all the key testing activities take place on schedule? This list of activities may also include tasks that are not directly part of the testing effort, but that the testing team depends upon being completed in a timely manner. For instance, who is responsible for acquiring the office space that will be used to house the additional testers called for by the test plan? Or, if hardware procurement is handled centrally, who is responsible for purchasing and delivering the machines that will be needed to build the test lab?

Ideally, an escalation process should also be mapped out, so that in the event that someone doesn't fulfill their obligations to support the testing team for whatever reason, the situation gets escalated up the management chain until it is resolved.

Staffing and Training Needs
If outside experts are used to conduct penetration testing (covered in more detail in Chapter 9), is it cost effective for the internal staff to first conduct their own security tests? If the outside experts are a scarce commodity—and thus correspondingly expensive or hard to schedule—then it may make sense for the less experienced internal staff to first run the easy-to-execute security assessment tests; costly experts should only be brought in after all the obvious flaws have been fixed. In effect, the in-house staff would be used to run a set of comparatively cheap entry-criteria tests (also sometimes referred to as smoke tests) that must pass before more expensive, thorough testing is performed.
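The entry-criteria idea can be sketched as a simple gate: every inexpensive in-house check must pass before the expert engagement is scheduled. The check names and the trivial check functions below are placeholders invented for this example; real checks would wrap whatever assessment tools the in-house staff actually run.

```python
from typing import Callable, Dict, List

# Each entry-criteria check is a named function that returns True on a pass.
Checks = Dict[str, Callable[[], bool]]


def failed_entry_checks(checks: Checks) -> List[str]:
    """Run every check and return the names of those that failed."""
    return [name for name, check in checks.items() if not check()]


def ready_for_expert_testing(checks: Checks) -> bool:
    """The costly experts are brought in only if no entry check fails."""
    return not failed_entry_checks(checks)


if __name__ == "__main__":
    # Placeholder checks; the names are hypothetical.
    checks: Checks = {
        "default_accounts_removed": lambda: True,
        "latest_patches_applied": lambda: False,
    }
    print(failed_entry_checks(checks))
    print(ready_for_expert_testing(checks))
```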

Project Closure
Although it might be desirable from a security perspective to keep a security-testing project running indefinitely, financial reality may mean that such a project ultimately must be brought to closure (if only to be superseded by a replacement project).

When winding down a security-testing project, great care must be exercised to ensure that confidential information (such as security assessment reports or a defect-tracking database that contains a list of all the defects that were not fixed because of monetary pressures) generated by the testing effort does not fall into the wrong hands. This is especially relevant if going forward nobody is going to be directly accountable for protecting this information, or if some of this information was generated by (or shared with) third parties.

The test plan should therefore outline how the project should be decommissioned, itemizing important tasks such as deciding who will reset (or void) any user accounts that were set up specifically for the testing effort, making sure no assessment tools are left installed on a production machine, and making sure that any paper deliverables are safely destroyed.

Planning Risks and Contingencies

Planning risk: Midway through the testing effort, Microsoft releases a new service pack for the operating system installed on a large number of the servers used by the Web site.

Contingency plans:

  • Don't install the service pack. (Keep the scope the same.)
  • Install the service pack and reexecute any of the test cases whose results have now been invalidated. (More time or resources are needed.)
  • Install the service pack, but don't change the test plan. (The quality of the testing is reduced.)
  • Redo some of the highly critical tests that have been invalidated and drop some of the lower-priority, as-yet-unexecuted tests. (The quality of the testing is reduced.)

Planning risk: The production environment upgrades its firewall to a more expensive/higher-capacity version.

Contingency plans:

  • Do nothing, as the test environment becomes less like the production environment. (The quality of the testing is reduced.)
  • Buy a new firewall for the test environment. (Increase resources.)
  • Reduce firewall testing in the test environment and increase testing in the production environment. (Change the scope of the testing.)

Planning risk: The entire testing team wins the state lottery.

Contingency plans:

  • Make sure you are in the syndicate.

Approvals

A test plan should identify two groups of approvers. The first group will be made up of those individuals who will decide whether or not the proposed test plan is acceptable and meets the security-testing needs of the organization, whereas the second group (which may be composed of the same individuals as the first group) will decide whether or not the deliverables specified in the test plan and subsequently produced and delivered by the testing team (for example, the test summary report) are acceptable.

