
4.1 No Attempt to Reuse

4.1.1 Three Reuse Modes

As part of the knowledge-intensive programming process, reuse is the process of applying knowledge of reusable components to programs. Because few programmers know everything about the available reusable components, component repository systems are introduced to make those components easier to apply during programming. Based on the source of the knowledge of reusable components, three modes of reuse exist: reuse-by-memory, reuse-by-recall, and reuse-by-anticipation.

Programmers show little resistance to the first two modes of reuse. As Isoda has reported, programmers reuse components repeatedly once they have learned them [Isoda, 1995]. In their empirical study of programming and reuse strategies, Lange and Moher found that programmers search extensively for components they know exist, even if they cannot name them a priori [Lange and Moher, 1989]. This explains why individual ad hoc reuse takes place while organization-wide systematic reuse has not achieved the same success: programmers have individual reuse repositories in their memories, so they can reuse by memory or by recall [Mili et al., 1995]. For components that have not yet been internalized into their memories, programmers have to resort to the reuse-by-anticipation mode. The activation of the reuse-by-anticipation mode relies on two enabling factors:


4.1.2 Information Islands in Component Repositories

Unfortunately, programmers' anticipation of available reusable components does not always match real repository systems. Empirical studies on the use of high-functionality computer systems (of which component repository systems are typical examples) have found that users' knowledge about a computing system falls into four levels (Figure 4.1) [Fischer, 2001].

Figure 4.1: Different levels of programmers' knowledge about a component repository


In Figure 4.1, each oval represents the collection of components at a particular level of a programmer's knowledge, and the rectangle, labeled L4, represents the actual information space (namely, the whole collection of items in an information system). L1 includes those reusable components that are well known, easily employed, and regularly reused by a programmer. L1 corresponds to the reuse-by-memory mode. L2 contains components known vaguely and reused only occasionally by a programmer; they often require further confirmation before being reused. L2 corresponds to the reuse-by-recall mode. L3 represents what programmers believe about the repository. L3 corresponds to the reuse-by-anticipation mode.

Many components exist in the area (L4 - L3), and their existence is not known to programmers. Consequently, programmers cannot reuse them, simply because people do not ask for what they do not know [Fischer and Reeves, 1995]. Components in (L4 - L3) thus become information islands [Engelbart, 1990; Ye and Fischer, 2000], inaccessible to programmers without appropriate tools. Repositories are not static: they are expected to evolve over time, which will increase the size of (L4 - L3).
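The region (L4 - L3) is a plain set difference, which the following minimal sketch illustrates. All component names are hypothetical, and the nesting of L1 inside L2 inside L3 is an assumption made for the example rather than something stated in the text.

```python
# Toy model of the knowledge levels in Figure 4.1.
# All component names are hypothetical; the nesting L1 <= L2 <= L3 is assumed.
L4 = {"sort", "hash_map", "json_parse", "regex_match", "date_format", "matrix_mul"}
L1 = {"sort"}                           # well known, reused regularly (reuse-by-memory)
L2 = L1 | {"hash_map"}                  # vaguely known (reuse-by-recall)
L3 = L2 | {"json_parse", "xml_parse"}   # believed to exist (reuse-by-anticipation)

# Components that exist but are unknown to the programmer: information islands.
information_islands = L4 - L3

# Beliefs that do not match the real repository (anticipated but absent).
mismatched_anticipation = L3 - L4

# A repository that evolves without the programmer noticing enlarges (L4 - L3).
L4_evolved = L4 | {"csv_read"}
islands_after_evolution = L4_evolved - L3

print(sorted(information_islands))
print(sorted(mismatched_anticipation))
print(len(islands_after_evolution) - len(information_islands))
```

In this toy model, no query is ever issued for the components in `information_islands`, because nothing in L3 prompts the programmer to look for them.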

Many reports on the reuse experiences of industrial software companies illustrate this inhibiting factor. Devanbu et al. have reported that because developers are unaware of reusable components, they repeatedly re-implement the same function; in one case, this occurred ten times [Devanbu et al., 1991]. This kind of behavior was also observed as typical among the four companies investigated by Fichman and Kemerer [Fichman and Kemerer, 1997]. From their experience of promoting reuse, Rosenbaum and DuCastel have concluded that making components known to developers is a key factor for successful reuse [Rosenbaum and DuCastel, 1995].

4.1.3 Low Reuse Utility

Human beings often try to be utility-maximizers in the decision-making process [Reisberg, 1997], and programmers are no exception. When programmers perceive that reuse utility, which is the ratio of reuse value to reuse cost, is too low, they do not make an attempt to reuse [Sen, 1997]. Because there is no easy way for programmers to estimate reuse value and reuse cost objectively, the estimation made by programmers during programming is quite subjective and suffers from cognitive biases against reuse; they tend to underestimate reuse value and overestimate reuse cost.
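The ratio definition of reuse utility, and the effect of the two biases on it, can be sketched as follows. This is only an illustration: the function names, the bias factors, the numbers, and the decision threshold of 1.0 are all assumptions introduced here, not quantities given in the text.

```python
def reuse_utility(value: float, cost: float) -> float:
    """Reuse utility as defined in the text: the ratio of reuse value to reuse cost."""
    if cost <= 0:
        raise ValueError("reuse cost must be positive")
    return value / cost


def perceived_utility(value: float, cost: float,
                      value_discount: float, cost_inflation: float) -> float:
    """Utility as a biased programmer might perceive it (hypothetical model).

    value_discount < 1 models underestimated reuse value;
    cost_inflation > 1 models overestimated reuse cost.
    """
    return reuse_utility(value * value_discount, cost * cost_inflation)


# Hypothetical figures in arbitrary units.
actual = reuse_utility(value=12.0, cost=4.0)
perceived = perceived_utility(value=12.0, cost=4.0,
                              value_discount=0.5, cost_inflation=2.0)

THRESHOLD = 1.0  # assumed cut-off below which no reuse attempt is made
print(actual, perceived, perceived >= THRESHOLD)
```

With these assumed factors, a component whose actual utility is well above the threshold is perceived as falling below it, so no reuse attempt is made.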

4.1.3.1 Underestimated Reuse Value

The value of reuse is multifold. As stated in Section 2.4, reuse value includes:
(1) reduced development time
(2) improved quality
(3) easy maintenance
(4) improved evolvability
(5) increased problem-framing ability
However, not all programmers recognize reuse value when they are under a tight schedule to finish their current program. Most of the reuse value is long-term and shows its benefit only after the program has been developed; what interests programmers most are the short-term benefits. In his investigation of reuse at NTT (Nippon Telegraph and Telephone Corporation), Isoda concludes that unless programmers find immediate benefits in applying reusable components, they will not, of their own free will, perform reuse [Isoda, 1995]. It is human nature to attend only to immediate benefits and ignore long-term ones [Grudin, 1994], because human beings are unable to think coherently about the remote future, and particularly about the distant consequences of their actions [Simon, 1996]. To encourage programmers to recognize the full benefits of reuse, many researchers have called for reuse education. Despite its importance, reuse education alone has not brought reuse to fruition [Joos, 1994], because being told that "it is for your own good" seldom provides adequate motivation for programmers to change their behavior [Simon, 1996]. Some organizations have also tried offering monetary rewards to programmers who reuse, which has not been successful either [Frakes and Fox, 1995].

4.1.3.2 Overestimated Reuse Cost

As analyzed in Section 3.4.3, the cost of reuse incurred at reuse time includes:
(1) the cost of forming reuse intentions
(2) the cost of formulating reuse queries
(3) the cost of operating the repository system to retrieve components
(4) the cost of choosing components
(5) the cost of understanding and modifying components
(6) the cost of integrating components
In addition, when the reuse repository system is separate from the current programming environment, reuse cost includes the cost of switching back and forth between the programming environment and the repository system, which causes the loss of working memory and the disruption of workflow.

Depending on the reuse mode, only some of these costs may be involved. In the reuse-by-memory mode, the cost of reuse is reduced to cost (6) only. In the reuse-by-recall mode, costs (1), (2), and (4) are quite small because programmers know what to look for and where to find the components. In the reuse-by-anticipation mode, all of these costs are involved, and because of the following two cognitive biases against reuse, Einstellung and loss aversion, those costs are often overestimated.
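The mapping from reuse mode to cost items described above can be written down as a small lookup. This is a minimal sketch: the function name and the qualitative levels "none", "small", and "full" are labels introduced here for illustration, and the environment-switching cost is left out.

```python
from typing import Dict

# The six reuse-time cost items, numbered as in the list above.
COST_ITEMS: Dict[int, str] = {
    1: "forming reuse intentions",
    2: "formulating reuse queries",
    3: "operating the repository system to retrieve components",
    4: "choosing components",
    5: "understanding and modifying components",
    6: "integrating components",
}


def cost_profile(mode: str) -> Dict[int, str]:
    """Return a qualitative level for each cost item under the given reuse mode.

    The levels "none", "small", and "full" are hypothetical labels.
    """
    if mode == "reuse-by-memory":
        # Only the integration cost (6) remains.
        return {i: ("full" if i == 6 else "none") for i in COST_ITEMS}
    if mode == "reuse-by-recall":
        # Costs (1), (2), and (4) are quite small; the rest still apply.
        return {i: ("small" if i in (1, 2, 4) else "full") for i in COST_ITEMS}
    if mode == "reuse-by-anticipation":
        # All costs are involved.
        return {i: "full" for i in COST_ITEMS}
    raise ValueError(f"unknown reuse mode: {mode}")


for mode in ("reuse-by-memory", "reuse-by-recall", "reuse-by-anticipation"):
    print(mode, cost_profile(mode))
```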


Ph.D. Dissertation by Yunwen Ye, April 20, 2001, Department of Computer Science, University of Colorado