The CodeBroker system presents information on components to programmers in three layers. The first layer is the RCI-display, in which each component is accompanied by its rank of similarity, its similarity value, its name, and a short description (Figures 7.2 and 7.3). The second layer of information is triggered by the programmer's mouse movements. Component names and short descriptions in the RCI-display are mouse-sensitive. When the mouse cursor moves over a component name, the signature of the component is shown in the mini-buffer (see the last lines in Figures 7.2 and 7.3). When the mouse cursor is over a short description, the terms contributing to the concept similarity between the component and the concept query are shown in the mini-buffer (Figure 7.4); this reveals why the component was retrieved and helps programmers refine their queries if necessary. The third layer of information, the most complete description of a component, is shown in an external HTML browser, such as Netscape Navigator. When the programmer left-clicks on a component, the full Javadoc documentation for that component is displayed in the browser. The HTML tag extracted at the time of indexing is used so that the browser opens at the exact place of the component description.
CodeBroker captures this larger context in a discourse model (Section 5.2.3) that represents the previous interactions between the programmer and the system in one development session. The discourse model is used by Presenter as a filter to remove components in which the programmer is not interested in the current development session, although they are retrieved by Fetcher based on incomplete task models.
Java component repositories are organized hierarchically according to packages and classes, and packages and classes are often designed for particular application domains. For most programming tasks, only a part of the repository is involved. CodeBroker uses negative discourse models, which capture the parts of the repository that are not of interest to programmers. A negative model is used because discourse models are evolved incrementally by programmers during their interactions with the CodeBroker system, and in many cases it takes less effort for programmers to identify apparently irrelevant components. Section 7.5 explains in detail how the discourse model is incrementally augmented by programmers.
A discourse model in CodeBroker is in the format of a Lisp association list (Figure 7.5). It specifies packages or classes in which the programmer has no interest for the current development session. Before components retrieved by Fetcher are delivered to programmers, Presenter compares each component against the discourse model, and if the component belongs to a class or a package in the discourse model, it is removed.
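The filtering step described above can be sketched in Java as follows. This is a minimal illustration, not CodeBroker's implementation (which stores the model as a Lisp association list): the class `DiscourseFilter`, the nested `Component` type, and the two sets of excluded names are all hypothetical names introduced for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; these names are hypothetical, not CodeBroker's.
final class DiscourseFilter {

    /** A retrieved method component, identified by package, class, and method name. */
    static final class Component {
        final String pkg;
        final String cls;
        final String method;

        Component(String pkg, String cls, String method) {
            this.pkg = pkg;
            this.cls = cls;
            this.method = method;
        }
    }

    /**
     * Keeps only the components whose package and fully qualified class
     * are both absent from the (negative) discourse model.
     */
    static List<Component> filter(List<Component> retrieved,
                                  Set<String> excludedPackages,
                                  Set<String> excludedClasses) {
        List<Component> kept = new ArrayList<>();
        for (Component c : retrieved) {
            boolean excluded = excludedPackages.contains(c.pkg)
                    || excludedClasses.contains(c.pkg + "." + c.cls);
            if (!excluded) {
                kept.add(c);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Component> retrieved = List.of(
                new Component("java.awt", "CardLayout", "first"),
                new Component("java.util", "Collections", "shuffle"));
        // Assume the programmer has excluded the whole java.awt package.
        for (Component c : filter(retrieved, Set.of("java.awt"), Set.of())) {
            System.out.println(c.pkg + "." + c.cls + "." + c.method);
        }
    }
}
```

In this sketch, excluding the package java.awt removes the CardLayout method and leaves only the java.util component for delivery.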
Discourse models also reduce the delivery of irrelevant components caused by polysemy, a difficult problem for any information retrieval system. They do so by limiting the search domain, because polysemous words often have different meanings in totally different domains. For example, if the programming task is to shuffle a deck of cards, the programmer may use the word "card" in doc comments. That would make the system deliver components from the class java.awt.CardLayout, a GUI (Graphical User Interface) class in which "card" means a graphical element. If the current development project does not involve interface building, this whole class is irrelevant. The programmer can add the class (java.awt.CardLayout) or even the whole package (java.awt) to the discourse model to prevent components belonging to it from being delivered in this development session.
User models in CodeBroker contain a list of components known to the programmer, namely, those components from L1 and L2. An example user model is shown in Figure 7.6. Each item in the list is a package, a class, or a method. Each component retrieved from the component repository is looked up in the user model before it is delivered. If a method component matches a method in the user model, and the user model indicates that the programmer has used it more than three times (this number is adjustable by the programmer), the system assumes the programmer already knows it and removes it from the delivery. A method with no use time was added by the programmer, who thereby declared that he or she knows it very well and does not want it delivered. If the class of the method (entered in the user model with no method list) or the package of the method (entered with no class list) is included in the user model, the method is removed as well.
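The lookup rules above can be summarized in a short Java sketch. The representation is an assumption made for illustration: whole packages and whole classes are kept in two sets, and each method maps to its list of use times, with an empty list standing for a method the programmer added by hand. The names `UserModelFilter` and `isKnown` are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; the data layout is an assumption, not CodeBroker's format.
final class UserModelFilter {

    /**
     * Returns true if the method should be withheld from delivery because
     * the user model indicates that the programmer already knows it.
     *
     * @param wholePackages  packages listed in the user model without a class list
     * @param wholeClasses   fully qualified classes listed without a method list
     * @param methodUseTimes fully qualified method name -> recorded use times;
     *                       an empty list means the programmer added the method
     * @param threshold      number of uses that must be exceeded (three by default)
     */
    static boolean isKnown(String pkg, String cls, String method,
                           Set<String> wholePackages,
                           Set<String> wholeClasses,
                           Map<String, List<Long>> methodUseTimes,
                           int threshold) {
        if (wholePackages.contains(pkg)) {
            return true;
        }
        String qualifiedClass = pkg + "." + cls;
        if (wholeClasses.contains(qualifiedClass)) {
            return true;
        }
        List<Long> useTimes = methodUseTimes.get(qualifiedClass + "." + method);
        if (useTimes == null) {
            return false; // not in the user model at all
        }
        // No use time: added explicitly. Otherwise require more than `threshold` uses.
        return useTimes.isEmpty() || useTimes.size() > threshold;
    }
}
```

With a threshold of three, a method used four times is filtered out, while a method used only once is still delivered.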
VariableDeclarationStatement := LocalVariableDeclaration;
LocalVariableDeclaration := final_opt Type VariableDeclarators
VariableDeclarators := VariableDeclarator |
    VariableDeclarators, VariableDeclarator
VariableDeclarator := VariableDeclaratorId |
    VariableDeclaratorId = VariableInitializer
VariableDeclaratorId := Identifier | VariableDeclaratorId [ ]
VariableInitializer := Expression | ArrayInitializer
Type := PrimitiveType | ReferenceType
ReferenceType := TypeName | Type []
TypeName := Identifier | PackageOrTypeName.Identifier
PackageOrTypeName := Identifier | PackageOrTypeName.Identifier
Each time a new variable is declared by a programmer, the Listener stores the variable name and its class name in an association list. When an object method name and its variable name are extracted, Listener looks up the variable-class association list to find to which class the method belongs. For example, in Figure 7.7, the variable name for the method addElement (line 9) is tmpVec, which is declared as a Vector (line 6).
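A minimal Java sketch of the variable-class lookup, using the tmpVec example from the text, might look like this. It assumes the declaration and the method call have already been extracted by a parser; the class `VariableClassTable` and its methods are hypothetical names, and a hash map stands in for the association list.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative sketch only; a map stands in for the variable-class association list.
final class VariableClassTable {
    private final Map<String, String> classByVariable = new HashMap<>();

    /** Records a declaration such as "Vector tmpVec". */
    void declare(String variableName, String className) {
        classByVariable.put(variableName, className);
    }

    /**
     * Resolves a call such as tmpVec.addElement(...) to "Vector.addElement".
     * Returns empty when the variable has no recorded declaration.
     */
    Optional<String> resolveCall(String variableName, String methodName) {
        String className = classByVariable.get(variableName);
        if (className == null) {
            return Optional.empty();
        }
        return Optional.of(className + "." + methodName);
    }

    public static void main(String[] args) {
        VariableClassTable table = new VariableClassTable();
        table.declare("tmpVec", "Vector");
        System.out.println(table.resolveCall("tmpVec", "addElement").orElse("unresolved"));
    }
}
```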
ImportDeclaration := import TypeName; |
    import PackageOrTypeName.*;
and creates a list of imported packages and classes. If the package of a class is unique in the imported package list, then the imported package becomes the package of the class; otherwise, the programmer has probably made a mistake, and the extracted method is ignored.
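The uniqueness check can be sketched in Java under one stated assumption: for on-demand imports (those ending in .*), some index of which classes each package contains must be available, represented here by the hypothetical map `classesByPackage`. The class `ImportResolver` and the method `packageOf` are likewise illustrative names, not part of CodeBroker.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Illustrative sketch only; classesByPackage is an assumed repository index.
final class ImportResolver {

    /**
     * Returns the package of className if exactly one import can supply it,
     * and empty otherwise (no candidate, or an ambiguous class name).
     *
     * @param imports import targets without the "import" keyword or semicolon,
     *                for example "java.util.Vector" or "java.util.*"
     */
    static Optional<String> packageOf(String className,
                                      List<String> imports,
                                      Map<String, Set<String>> classesByPackage) {
        Set<String> candidates = new LinkedHashSet<>();
        for (String imp : imports) {
            if (imp.endsWith(".*")) {
                // On-demand import: consult the index for the package's classes.
                String pkg = imp.substring(0, imp.length() - 2);
                if (classesByPackage.getOrDefault(pkg, Set.of()).contains(className)) {
                    candidates.add(pkg);
                }
            } else {
                // Single-type import: the last segment is the class name.
                int dot = imp.lastIndexOf('.');
                if (dot > 0 && imp.substring(dot + 1).equals(className)) {
                    candidates.add(imp.substring(0, dot));
                }
            }
        }
        if (candidates.size() == 1) {
            return Optional.of(candidates.iterator().next());
        }
        return Optional.empty();
    }
}
```

For instance, with both java.util.* and java.awt.* imported, Vector resolves to java.util, whereas List is supplied by both packages and is therefore left unresolved.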
For ease of explanation, the three steps above were described in the reverse order of their execution. Listener first creates the list of imported packages, then creates the variable-class list, and finally extracts method names. When Listener successfully extracts a method name and determines its class name and package name, it adds the component, including its class and package, to the user model, with the current time as its use time. Listener adds only methods to the user model; it does not add a class or a package, because the use of a class or a package does not mean that the programmer knows the whole class or package.