8.2 Empirical Evaluations of the CodeBroker System
To understand the effectiveness of the CodeBroker system in supporting
reuse-within-development, formal evaluation experiments were
conducted. This section describes the structure of the experiments;
findings and conclusions are presented in the next four sections.
Subjects were recruited from undergraduate and graduate students in
the Computer Science Department. As mentioned in Chapter 2,
programming involves a wide range of knowledge. Because the design goal of
CodeBroker is to provide programmers with knowledge about reusable components,
only students who already had extensive programming knowledge
and experience were recruited as subjects, so as to minimize other factors
that contribute to the difficulty of programming in general.
Because CodeBroker is developed as
an add-on to an existing programming environment, Emacs in Unix,
a basic working knowledge of Emacs and Unix was also required so that
subjects could easily learn the operations of the system and the experiments could
focus on the support provided by the system.
Five subjects voluntarily participated in the evaluation
experiments. All but one had extensive knowledge of other
programming languages, such as C and C++. Two had worked as
professional programmers. Three
were regular contributors to several Open Source
projects. Their expertise in Java programming ranged from
medium to expert level. All of them knew the syntax of
Java very well; the differences in their expertise came from the range
of reusable components (classes and methods in API libraries) they
knew.
Table 8.2 summarizes their background knowledge about
programming in general and Java in particular. In that table, small
(abbreviated as S)
projects refer to projects similar to semester projects,
requiring 1 or 2 man-months; medium projects (abbreviated as M) refer to projects requiring
3 to 5 man-months; large projects (abbreviated as L) refer to projects requiring more than
6 man-months.
Table 8.2: Programming knowledge and expertise of subjects
Subject | S1 | S2 | S3 | S4 | S5
Years of general programming | 3 or 4 | 5 or 6 | 8 | 10+ | 10+
Programming experience in general (measured in number of projects) | 3S, 1L | 10S | 7M, 1L | 10+SM, 2L | 10+L
Current major programming language | C++ | Java | Java | Java | Java
Years of Java programming | 10 months | 4 | 4 | 7 | 5
Self-evaluation of Java expertise (1: Beginner - 10: Expert) | 4 | 7 | 7 or 8 | 10 | 7
Recent frequency of programming in Java | Not active for 3 months | Not active for 3 months | Every week | Every day | Not active for months
Subjects were asked to implement two or three programming tasks with
the CodeBroker system. Days before the experiments, CodeBroker created an initial user model
(see Section 7.4.3 for the method, and Figure 7.6 for an example) for each subject by
scanning programs the subject had
written recently. Because many of the programs the subjects had written
were for companies and thus were not available,
no user model was complete. Nonetheless, the number and range of components
included in the user models were consistent with the subjects' self-evaluations
of Java expertise.
After their user models had been analyzed, the subjects were assigned tasks whose implementation involved
components they probably did not know well enough.
At the beginning of the experiments, after the subjects had signed the
Informed Consent Form for participating in the experiments, the main functionality
of the CodeBroker system was briefly introduced with a running
example. This took about 5 minutes.
Before implementing each task, programmers were asked to
describe briefly how they would implement it, and after each
task had been finished, simple questions such as ``Did you know this component
before?'' and ``Why did you choose this component?'' were asked regarding
their programming activities. At the end of the experiments, a post-experiment
interview was conducted to
capture the subjects' background
knowledge of programming and their subjective evaluation of the CodeBroker system based on their use.
Programmers were told to program in their normal
way but to take advantage of the support provided by CodeBroker. They could use
books, the Java API Documentation Browser, and all other support as they
usually did. Two subjects actually brought and consulted their own copies of ``Java in
a Nutshell'' [Flanagan, 1997]. As an observer, I occasionally
answered their questions about the operation of the CodeBroker system.
The CodeBroker system used the following default settings in the experiments:
- It adopted the Okapi retrieval mechanism and the
signature-matching mechanism, with each assigned the weight of
0.5.
- A component was considered known to the programmer if the
user model indicated that the programmer had used it three times.
- In the first four experiments with the first two subjects, the system
delivered 14 components in the RCI-display because the experiments
were conducted on a laptop with a small monitor. In all other
experiments that were conducted on a desktop with a large monitor, the
system delivered 20 components.
- The component
repository contained 673 classes and 7,338 methods from both the Core
API library of Java 1.1.8 and JGL 3.0.
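The default settings listed above can be illustrated with a small sketch. It is not the CodeBroker source: the class and method names (`DeliverySketch`, `combinedScore`, `isKnown`, `deliver`) are hypothetical, the two input scores are assumed to be already normalized to a comparable range, ``used it three times'' is read as ``at least three times'', and the order of operations (filter known components, then rank, then cut to the display size) is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the default settings; not CodeBroker's code.
class DeliverySketch {
    static final double OKAPI_WEIGHT = 0.5;      // weight of the Okapi retrieval score
    static final double SIGNATURE_WEIGHT = 0.5;  // weight of the signature-matching score
    static final int KNOWN_THRESHOLD = 3;        // uses recorded in the user model

    // Weighted sum of the two (assumed normalized) similarity scores.
    static double combinedScore(double okapiScore, double signatureScore) {
        return OKAPI_WEIGHT * okapiScore + SIGNATURE_WEIGHT * signatureScore;
    }

    // A component counts as known once its recorded use count reaches the threshold.
    static boolean isKnown(Map<String, Integer> useCounts, String component) {
        return useCounts.getOrDefault(component, 0) >= KNOWN_THRESHOLD;
    }

    // candidates maps a component name to {okapiScore, signatureScore}.
    // Returns at most displaySize unknown components, best combined score first.
    static List<String> deliver(Map<String, double[]> candidates,
                                Map<String, Integer> useCounts,
                                int displaySize) {
        List<String> names = new ArrayList<>();
        for (String name : candidates.keySet()) {
            if (!isKnown(useCounts, name)) {
                names.add(name);
            }
        }
        Comparator<String> byScore = Comparator.comparingDouble(
                (String n) -> combinedScore(candidates.get(n)[0], candidates.get(n)[1]));
        names.sort(byScore.reversed());
        return new ArrayList<>(names.subList(0, Math.min(displaySize, names.size())));
    }

    public static void main(String[] args) {
        Map<String, double[]> candidates = new HashMap<>();
        candidates.put("java.util.Random.nextInt", new double[] {0.9, 0.1});
        candidates.put("java.util.Vector.addElement", new double[] {0.8, 0.6});
        Map<String, Integer> useCounts = new HashMap<>();
        useCounts.put("java.util.Vector.addElement", 3); // treated as known
        System.out.println(deliver(candidates, useCounts, 14));
    }
}
```

In this sketch the display size would be 14 for the laptop sessions and 20 for the desktop sessions; the component names and scores in `main` are made up.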
8.2.3 Programming Tasks
Because subjects were volunteers, large and time-consuming tasks were not
very suitable. The experiments used programming tasks similar to the typical
assignments of a programming language course, which could be
implemented with several methods in about 20 to 60 minutes. The following tasks were used in the experiments.
- Task 1
You are asked to implement a program that selectively
backs up files based on a list that holds all files
needed to be backed up. The list looks like:
/usr/java/private/important/letter1
/usr/joe/project/backup/getAllFiles.java
and the file name is passed as a parameter in the command line.
It requires that the back-up program retain the same hierarchical
structure when the files are backed up in another directory
$BACKUPDIR, which is passed as the second parameter in the
command line; for example, getAllFiles.java should also be
found under the directory $BACKUPDIR/usr/joe/project/backup/.
- Task 2
You are asked to write a program to simulate the process of
card dealing. Each card is represented by a number from 0 to
51. The program should produce a list of 52 cards, as
it results from a human card dealer. Let us assume that if
a person cuts a deck of cards and shuffles it 7 times, the
result is satisfactory.
- Task 3
Traditionally, Chinese write numbers with a comma inserted at
each fourth number from the right. For example, 1,000,000 is
written as 100,0000. Please implement a program that transforms
the Chinese writing format (100,0000) to the western format (1,000,000).
To simplify the programming task, you don't need to read the
input from the keyboard. You can assume you can get the input
anywhere you like, such as a static class variable, a parameter
of a method, or input from the command line.
- Task 4
Jack has a long list of MP3 songs he has compiled. However, many of
the songs are repeated in the list. He wants to create a new list
in which each song appears only once. Assume each list has the following
format TITLEa, TITLEb, TITLEc, ...
where TITLEi is a string including letters only. Implement
a method to create a new list with no repetitions.
Assume the list is stored somewhere; for example, you can put it
into a class variable.
- Task 5
Please write a program that can calculate the day of the week. We know
that today is Jan. 19, 2001, Friday. Your program should be able to
compute the day of the week M years from today, or N months from today.
Both M and N could be negative, which means M years or N months before.
Assume the convention to pass the data to your program is:
Y 10 means 10 years from today, and
M -5 means 5 months ago.
- Task 6
A processor needs to respond to a series of events. Each event is assigned
a distinct number. When the processor is busy, newly arrived events will be
put into a waiting list. When the processor finishes processing the
previous event, it picks an event in the waiting list. However, it picks
the event with the largest number in the waiting list. You are asked
to implement a pair of operations: one to put a new event into the
waiting list, and the other to help the processor pick up the next event
to be processed. (You don't need to be concerned with concurrency.)
All tasks could
be implemented with different combinations of reusable
components from the repository. If the subjects knew or found the right
components, the implementation would be fairly easy; if they did not,
they would have to use lower-level components or even basic
statements. These
tasks therefore allowed us to observe how the deliveries of the system changed
the programming process of the subjects.
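The contrast between the two routes can be illustrated with Task 3. The following is an illustrative sketch only, not a solution produced by any subject and not taken from the experiments; the class and method names are hypothetical, and it is written against a present-day Java standard library, not the Java 1.1.8 and JGL 3.0 repository used in the experiments. The first method relies on a library component (`java.text.NumberFormat`); the second rebuilds the three-digit grouping from basic string operations. Both assume a non-negative integer without leading zeros.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical illustration of Task 3; assumes non-negative integers
// without leading zeros.
class ChineseNumberFormatSketch {

    // Route 1: a reusable component does the grouping.
    // Limited to values that fit in a long.
    static String withLibrary(String chineseFormat) {
        long value = Long.parseLong(chineseFormat.replace(",", ""));
        return NumberFormat.getNumberInstance(Locale.US).format(value);
    }

    // Route 2: basic statements only. Drop the four-digit grouping,
    // then insert a comma before every group of three digits counted
    // from the right.
    static String byHand(String chineseFormat) {
        String digits = chineseFormat.replace(",", "");
        StringBuilder western = new StringBuilder();
        for (int i = 0; i < digits.length(); i++) {
            int remaining = digits.length() - i;
            if (i > 0 && remaining % 3 == 0) {
                western.append(',');
            }
            western.append(digits.charAt(i));
        }
        return western.toString();
    }

    public static void main(String[] args) {
        System.out.println(withLibrary("100,0000")); // prints 1,000,000
        System.out.println(byHand("100,0000"));      // prints 1,000,000
    }
}
```

A programmer who does not know about the formatting class has to write and debug something like the second method, which is where off-by-one errors in the grouping tend to appear.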
The CodeBroker system has an automatic log mechanism that logs the reuse queries
extracted by Listener, the components retrieved by Fetcher, the
components removed by Presenter, and both system-initiated and
user-initiated changes to discourse models and user models.
All experiments, including interviews, were
videotaped. Subjects were asked to think aloud during the
experiments. However, because thinking aloud may interfere with normal
programming practice, this was not stressed.
Analysis of the system was based on the log data, the video tapes, and
transcribed interviews.
The analysis was not concerned with the quality and productivity of
programming; instead, it examined how CodeBroker affects the process of
programming by encouraging programmers to reuse. Quantitative assessment was based
on log data, and qualitative evaluation was based on interviews and
think-aloud protocols.
Ph.D. Dissertation by Yunwen Ye, April 20, 2001, Department of Computer Science, University of Colorado