
National Weather Service Training Center
Hydrometeorology & Management Division

Evaluation Methods

March 1997

1. Introduction

When someone says "training evaluation", the first thing you probably think of is the word "test". Tests use a quantitative measure to assess whether a trainee has learned the training material. Tests usually take a specific form, frequently a set of questions.

Evaluation, on the other hand, is much broader in scope than testing. It involves gathering information in order to provide detailed feedback to the trainee on his/her progress. It includes both subjective (opinion) and objective (factual) aspects.

At a WFO/RFC, you, as the SOO/DOH, may need to "test" station personnel occasionally, particularly if certification is involved. But the SOO/DOH is primarily an evaluator. The purpose of this chapter is to explain the importance of training evaluation, to describe what should be evaluated, and to list a variety of evaluation methods that have application at a typical WFO/RFC. Because evaluation applies to many aspects of office operations, some of the methods listed below have application beyond the training domain.

2. Why is training evaluation important?

Phillips (1983) states that training evaluation is undertaken for two primary purposes: (1) to improve the human resource development (HRD) process; and (2) to decide whether or not to continue it. He lists nine purposes and uses of evaluation.

1. "To determine whether a program is accomplishing its objectives.
2. "To identify the strengths and weaknesses in the HRD process.
3. "To determine the cost/benefit ratio of an HRD program.
4. "To decide who should participate in future programs.
5. "To identify which participants benefitted the most or the least from the program.
6. "To reinforce major points made to the participant.
7. "To gather data to assist in marketing future programs.
8. "To determine if the program was appropriate.
9. "To establish a data base which can assist management in making decisions."

Let's look at several of these points. Item 2, the strengths and weaknesses of the training, is the most common purpose for evaluation. Things like method of presentation, the learning environment, program content, training aids, facilities, schedules, and the instructor(s) are evaluated here. It is also important to know who benefitted the most or the least from a training program [Item 5]. This factor will influence whether others should participate in this training session. Evaluation is often a good place to reinforce important points made during a training session [Item 6]. It allows an instructor to highlight which parts of the lesson are significant compared to other parts.

Whatever your reason for evaluating a training program, evaluation is an important part of the overall training process and should not be ignored.

3. What should you evaluate?

The Basic Idea ... If you are going to evaluate training and its impact, you must ensure that you collect data that will tell you what you want to know about the training. The Kirkpatrick model discussed in the previous chapter may help you decide what kind of questions should be asked about the training.

An important point to remember is that once you have defined your training objectives, you have defined the results you expect from the training, and, essentially, identified what should be measured. The hardest part of this process is to decide what kind of data should be collected to achieve this measurement. This decision must be made prior to the training, because, in some cases, information needs to be collected during the training itself.

The example given by Mager (1984) shows how the evaluation should match the objectives. Assume you are attending a training session and your objective for the session is as follows:

Upon completion of this training, you will be able
to ride a unicycle one hundred yards along a level
paved street without falling off.

You faithfully attend the training sessions, you spend a lot of non-class time practicing on your unicycle, and you feel confident that you could ride that unicycle the length of two football fields. When test time comes around, you are given a sheet of paper and asked to answer the following questions:

"1. Define unicycle.
"2. Write a short essay on the history of the unicycle.
"3. Name at least six parts of the unicycle.
"4. Describe your methods of mounting a unicycle."

How would you react to this test? Were you tested on the objective of the training? What type of test should have been given? A good training program will ...

Evaluate its training objectives.


Design the evaluation scheme to check for accomplishment of the training objectives. The evaluation method should ask: Did the person attending your training learn what he or she needed to learn? In Mager's example the final examination should have been a performance test that required the trainee to ride the unicycle 100 yards.

Determining the intent of the objective ... In order to "evaluate your training objectives" you must determine the true intent of that objective. Mager (1984) calls this "decoding the objective". An objective may state its "main intent" or may just provide an "indication" of its main intent.

The main intent of a training objective shows the primary or principal purpose of the objective. Using a sample objective from Mager:

Be able to identify the verb in any sentence.

"Identifying the verb" is the main intent of this objective. When you complete a training session that has this objective, you should be able to examine a sentence and select the verb. Test for that intent.

Question: What is the main intent of this objective?

Be able to locate the axis of heaviest snowfall indicated by the track of the 850 mb low.

The main intent here is "to locate the axis of heaviest snowfall". This is fairly straightforward and fairly easy to identify.

An indicator, on the other hand, is "an activity through which the existence of the main intent will be inferred" (Mager, 1984). In other words, the main intent is not stated explicitly, but is hidden behind some process or activity. Using another of Mager's examples:

Be able to circle a verb in any sentence.

The performance called for by the objective is "to circle a verb". However, is this really what the student needs to learn? Probably not. The intent is more likely the identification of the verb, as cited in the "main intent" example above. Let's examine a more meteorology-related example:

Be able to shade areas of high moisture content on upper air charts.

The process of "shading areas" is what the objective calls for. But is this its main intent? Not likely. Identification of areas of significant moisture is a more likely intent. Ensure that your objectives state explicitly what you want your trainee to learn. Then design your tests around these objectives.

4. Training Evaluation Options

The options listed below can be used to evaluate a WFO/RFC training program. The method or methods chosen as an evaluation mechanism for a particular program will depend upon the training program and its objectives. There is no "one size fits all" approach to training evaluation.

Testing ... Formal or informal testing has been a traditional method of evaluation. A test can contain a set of questions that the trainee is required to answer from memory, or it can be of a more practical nature, e.g., actual performance of a task or procedure. Questions may be written or oral, open-book or closed-book. You can find an abundance of books on how to properly design tests and test questions.

One way to use testing is the pre-test and post-test concept. If the training has been effective, post-test results should show improvement over pre-test results. Many of the options described below are considered a type of test by some training experts. Specific tests and evaluation methods are defined in Section 5.
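As a rough illustration of the pre-test and post-test concept, the sketch below compares paired scores for the same trainees. The scores, the function name score_gains, and the choice of a simple difference as the measure of improvement are all assumptions made for this example; they do not come from any NWSTC procedure.

```python
from statistics import mean


def score_gains(pre_scores, post_scores):
    """Return the gain (post-test minus pre-test) for each trainee.

    Both lists must be in the same trainee order.
    """
    if len(pre_scores) != len(post_scores):
        raise ValueError("each trainee needs both a pre-test and a post-test score")
    return [post - pre for pre, post in zip(pre_scores, post_scores)]


# Hypothetical percent-correct scores for five trainees.
pre = [55, 60, 70, 45, 80]
post = [75, 85, 70, 65, 90]

gains = score_gains(pre, post)
print("Gain per trainee:", gains)
print("Average gain:", mean(gains))
print("Trainees showing no improvement:", sum(1 for g in gains if g <= 0))
```

A trainee whose gain is zero or negative is a candidate for follow-up, but a small gain can also mean the pre-test score was already high, so the numbers should be read alongside the training objectives.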

Socratic dialogue/questioning ... Asking questions to evaluate the depth of knowledge that an individual has acquired can be a very effective evaluation method. These questions are not asked in the sense of a formal, oral test, but become part of a more casual discussion of the training. When properly done, you can get a fairly good feeling for what a person has learned. The key here is developing a questioning style that extracts the necessary information without making the trainee uneasy.

Use open-ended questions with this method. Open-ended questions are questions that cannot be answered with a "yes" or a "no". Also, when this method is employed, it is good practice to record your observations for future reference.

Problem-solving projects ... An excellent way to discover what a trainee has learned, whether a trainee can apply what he/she has learned, or whether he/she may need additional training, is to assign the trainee a problem to solve that focuses on the material covered in a training session. In meteorology this often involves applying a concept to a specific weather situation. For example, in the Winter Storm lesson at the NWSTC, students are given a set of weather charts and asked to forecast where the heaviest snow is expected to occur during a 24-hour period. The methods needed to make this forecast are addressed in the lesson just prior to the exercise.

When you critique these problem-solving projects, you need to prepare a list of what the trainee should find during his/her analysis of the problem. This means that you must thoroughly examine the material yourself and note the important points. Please realize, however, that in many cases, there may be more than one solution to a problem. The trainee may have a solution you did not think of.

Case histories ... Case histories are just problem-solving projects on a larger scale. They challenge the trainee to use what they have learned to analyze and evaluate a particular situation. The conclusions drawn by the trainee reveal to what depth the trainee has learned the material covered by the training.

Practice sessions ... The manner in which a trainee works through a simulation or hands-on practice session can tell an evaluator how well a trainee knows how to do a task. The evaluator must know ahead of time what key points to look for. If a person is doing a task which involves a series of steps, each step must be evaluated separately.

In these situations, any corrective action should be approached in a positive, reinforcing manner. Never play "I gotcha". Try to get the trainee to recognize what was wrong without feeling guilty. Providing this positive reinforcement takes practice. For example, if a wrong answer is given to a question, instead of saying "you're wrong", ask a series of questions that lead the trainee through the steps necessary to arrive at the correct answer.

Observation/eye-contact ... Casual observation of a trainee can sometimes tell an evaluator how things are going. This can involve observation during a practice session, a test, or any other activity normally associated with a training session. In addition, casual conversation during non-training time may reveal useful information about the effectiveness of the training process.

Eye-contact between the trainer and trainee is an excellent way to communicate information, as is kinesics, or body language. During training sessions, ensure that you are in a position to look at the trainee. The following eye-contact interpretations might be useful:

glare or stare         -  challenge or disagreement
frown                  -  doubt or deep thought
glassy or blank        -  had enough
shining eyes           -  challenged and interested
droopy or sleepy       -  tuned out or bored
blinking or wandering  -  nervousness or hiding something

If you observe these indications, it tells you that a change of pace, change of subject, or a few questions may be needed to keep the trainee challenged or interested.

Assessment sessions ... A general assessment session of a training program or lesson usually takes place at the end of the training program or lesson. This may involve an end-of-course critique form, a questionnaire or survey, or an interview with the trainees. Follow-up assessment may involve statistics derived from performance records. Many of the items listed above can be used in these assessment sessions.

A good training evaluation involves considerable work on the part of the evaluator. In order to extract the proper information for the evaluation, the evaluator must know what should be evaluated, and then design the evaluation to obtain the proper information.

5. Training Evaluation Terms

There are a variety of terms that are used in association with testing and evaluation. Some of these are listed below. Definitions were extracted from Nilson (1989).

validity: "testing that fairly and accurately represents the content (skills and knowledge) covered by training"

reliability: "testing that is repeatable over time with similar types of trainees; testing designed to measure the same thing with different groups of trainees"

competency: "qualities of a person that make him or her fit for a job; competency can be acquired through talent, experience, or training"

achievement: "a measurement of what a person knows or can do after training"

norm-referenced test: "a test of an employee's rank in reference to a selected group of people, often expressed as a percentile"

criterion-referenced test: "a test of an employee's accomplishment in relation to a standard; often expressed as 'performing according to standard' or 'not performing according to standard', that is, in yes or no terms"

pre-test: "examination of knowledge or skills a trainee already has in the content area of the training course/program"

post-test: "examination of knowledge and skills a trainee can demonstrate directly after a training course/program; often a summative test, administered at the end of the course"

certification: "guarantee of competency in a specific job because entry criteria or continuation criteria have been met; assured through testing, often controlled by a professional association ... or legal body ... ; often sought by trainees at the end of a training program"
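The difference between the norm-referenced and criterion-referenced definitions above can be shown with a short sketch. The group scores, the trainee's score, and the passing standard of 80 are hypothetical, and the percentile rank used here (percent of the group scoring strictly below the trainee) is only one of several common conventions.

```python
def percentile_rank(score, group_scores):
    """Norm-referenced view: percent of the reference group scoring below `score`."""
    below = sum(1 for s in group_scores if s < score)
    return 100.0 * below / len(group_scores)


def meets_standard(score, standard):
    """Criterion-referenced view: yes/no result against a fixed standard."""
    return score >= standard


# Hypothetical reference group and trainee score (percent correct).
group = [62, 70, 75, 81, 88, 90, 93, 95]
trainee_score = 88
STANDARD = 80  # assumed passing standard, for illustration only

print("Percentile rank:", percentile_rank(trainee_score, group))
print("Performing according to standard:", meets_standard(trainee_score, STANDARD))
```

The same score yields two different kinds of answer: a rank that depends on who else was tested, and a yes/no result that depends only on the standard.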


References

Mager, Robert F., 1984: Measuring Instructional Results. 2nd edition. Lake Publishing, Belmont, CA, 166 pp.

Nilson, Carolyn, 1989: Training Program Workbook and Kit. Prentice Hall, Englewood Cliffs, NJ, 430 pp.

Phillips, Jack J., 1991: Handbook of Training Evaluation and Measurement Methods. 2nd edition. Gulf Publishing, Houston, 415 pp.


Review Questions and Exercises


(1) True or False

Training evaluation has a much broader scope or meaning than testing.

(2) The most common purpose for training evaluation is:

A. To reinforce major points within a lesson
B. To determine if a training program achieved its objectives
C. To identify the strengths and weaknesses of a training program
D. To gather data to market future programs

(3) When deciding what to measure as part of your evaluation, where is the best place to look?

A. The lesson's introduction
B. The lesson's objectives
C. The lesson's exercises
D. The lesson's conclusion

(4) Match each of the following evaluation methods with its description:
A. Testing
B. Socratic Dialogue
C. Problem-Solving
D. Practice Sessions
E. Observation
_____ A project that focuses on lesson material
_____ A casual evaluation technique
_____ An objective evaluation method, usually a set of questions
_____ Asking questions to evaluate the depth of knowledge
_____ Hands-on practice

Complete the Following Exercises

For the Mesoscale Convective System (MCS) lesson in Chapter 2, develop an appropriate evaluation method.


Appendix A

Answers to the Review Questions


(1) True

(2) C

(3) B

(4) C - A project that focuses on lesson material
    E - A casual evaluation technique
    A - An objective evaluation method, usually a set of questions
    B - Asking questions to evaluate the depth of knowledge
    D - Hands-on practice

