1.1 Motivation

A wide gap exists between the constantly increasing demand for complex software systems and the capability of the software industry to deliver quality software systems in a timely and cost-effective manner. Software reuse, a development method that creates new programs from existing reusable software components, has been shown through empirical studies to improve both the quality and the productivity of software development [Basili et al., 1996; Boehm, 1999]. Software reuse also increases the evolvability of software systems because complex systems evolve faster when they are built from stable subsystems [Simon, 1996].

Programmers are knowledge workers, and programming is a process of progressive crystallization of their knowledge into a program. Knowledge needed during programming comes either from the programmer's head or from such external sources as books, manuals, peer workers, and computerized information systems [Norman, 1993]. A lack of needed knowledge is one of the major reasons for poor quality and productivity in programming. With the advent of object-oriented technology, reusable software components now comprise the bulk of programming knowledge. Easy access to needed external information, in particular to reusable software components that complement the insufficient knowledge of programmers, is thus critical to the improvement of programming quality and productivity.

If programmers know a reusable software component well enough, they may integrate it into their programs whenever it is applicable without even realizing that they are reusing, because such components become "ready-to-hand" to them [Winograd and Flores, 1986]. However, repositories of reusable software components are often so large that programmers cannot learn about all of the components before they start programming. Nor are software component repositories static; they evolve constantly as new components are added and old components are updated. As an example, Table 1.1 shows the rapid growth of the Java Core API (Application Programmer Interface) library, a repository of reusable components in the form of classes and methods. Few Java programmers, if any, can claim that they know all the components in this library.


Table 1.1: The rapid growth of the Java Core API library
Version    No. of Packages    No. of Classes    Year of Release
Java 1.0   8                  211               1996
Java 1.1   23                 503               1997
Java 1.2   59                 1525              1998
Java 2     70+                2100+             1999


Programmers who have not yet learned a software component have to go through the reuse process if they want to use it in their programs. The reuse process consists of three steps: location, comprehension, and modification (Figure 1.1). Programmers have to locate in the component repository those components that are potentially reusable in the current programming task, comprehend their functionality and usage, and make the necessary modifications if the components do not completely fit their needs [Fischer et al., 1991].

Figure 1.1: The location-comprehension-modification process of reusing components

Successful reuse requires that programmers be able to locate, comprehend, and modify the reusable components they need.

The foremost obstacle to the success of component reuse is that programmers cannot locate needed software components quickly and easily. Locating reusable software components is often supported by component repository systems, also called reuse repository systems. As in many other information repository systems, browsing and querying have long served as the principal techniques by which programmers locate reusable software components. More innovative schemes, such as query by reformulation [Williams et al., 1982; Fischer and Nieper-Lemke, 1989; Henninger, 1993], information filtering [Belkin and Croft, 1992], and Latent Semantic Analysis [Landauer and Dumais, 1997], have introduced new possibilities. Unfortunately, the problem remains that programmers simply do not actively search for components and so make no attempt to reuse. According to a study by Frakes and Fox [Frakes and Fox, 1995], making no attempt to reuse is the leading failure mode of software reuse (Figure 1.2). This inhibiting factor has been reported again and again by software companies that have tried to introduce reuse into their organizations [Devanbu et al., 1991; Rosenbaum and DuCastel, 1995; Fichman and Kemerer, 1997].

Figure 1.2: Software reuse failure modes

In the Frakes and Fox (1995) paper, sev... ...is shows the percentage each condition plays in causing the failure of reuse.


Ph.D. Dissertation by Yunwen Ye, April 20, 2001, Department of Computer Science, University of Colorado