Assessment of the contributions of the laboratory to science instruction has received little attention in the literature of science education. Despite considerable emphasis on laboratory investigations in the curriculum projects supported by the NSF, assessment of curriculum effectiveness has remained at the paper-and-pencil level. Multiple-choice tests such as those developed by the BSCS have included accounts of laboratory experiments, but no use of actual experimental material has been suggested. The teachers participating in a BSCS Blue Version Test Center (1) indicated in their discussions that some direct assessment of laboratory learning should be included in student evaluation, since so much class time was devoted to this activity. The usual laboratory “practical” examination of naming the parts of dissected animals, etc., certainly did not fit the kind of laboratory exercises that the students had performed. A review of the laboratory work performed by students in this test center during the first semester provided a basis for developing a framework for construction of a practical. Since not all students had performed exactly the same set of experiments, the basic examination was developed as a “core” examination based upon the work performed by all students. Each instructor supplemented the “core” questions with the additional problems he thought necessary to assess the semester’s work in his particular classes. Four kinds of activities were found to be common to student laboratory work: performing various kinds of measurements; naming or categorizing organisms, models or apparatus; interpreting experiments; and seeing the appropriate interrelationships of phenomena and ideas. These four areas formed the framework for constructing a twenty-item laboratory practical examination. Past experience of the teachers with such examinations led to the selection of a timed movement of students from one station to another, with 90 seconds allowed at each station.
Protocols for setting up each station were prepared so that all students were confronted with the same phenomena and responded to identical questions (2). The test was administered in a two-hour examination period during the last week of the first semester of the year course. Post hoc consideration of categorizing the performance required for each test item has led the writer to replace the initial design of four performance areas with a behaviorally formulated set of categories into which the items may be grouped for analysis. The revised categories are: measuring, identifying, selecting, and computing. These categories denote performance operations that the student must successfully execute in order to succeed on a particular item. Measurement items are divided into two subgroups: those in which only reading and recording were required, and those in which other operations were added to reading and recording a measurement. Identification items are those requiring recognizing and naming, that is, establishing identity. One group of items required the respondents to name an object, process, or use of an object; the second required the placement of an object within a designated group. Selection items required the respondents to choose a designated object from a group of two or more objects.