7.5 The Retrieval-by-Reformulation Mechanism
To compensate for the incompleteness of reuse queries,
CodeBroker supports two forms of retrieval-by-reformulation (Section 3.4.3):
direct manipulation and query
refinement. After examining the components initially
delivered by CodeBroker, programmers can either
refine the query to make it more complete and
more precise or directly manipulate the delivered components by
removing apparently irrelevant ones.
7.5.1 Direct Manipulation
Direct manipulation of the delivered components serves two
purposes: to make it easier to choose components and to
augment the discourse model or the user model. Each component
in the RCI-display is associated with a pop-up menu, the
Skip Components Menu (Figure 7.8),
which appears when the component name is right-clicked.
Figure 7.8: The Skip Components Menu
The Skip Components Menu allows programmers to remove those components
that are apparently not related to their current development
task so that they can find needed information more easily. The
first item of the menu is the method component itself;
the second, its class; and the third, its package.
If programmers want to remove
the method or all of the components in the class or the package
from the RCI-display, they can choose the appropriate item. Each
item has three choices: This Buffer Only,
This Session Only, and All Sessions.
When the command This Buffer Only is chosen, the corresponding
components are removed from the RCI-display. When the command
This Session Only is chosen, the components are
not only removed from the RCI-display, they are also added to the
discourse model and will not be delivered
later in this development session. The discourse model is
empty when a development session starts, and it grows
incrementally as programmers interact with the system.
When the command All Sessions is chosen, the components
are removed from
the current RCI-display and are added to the user model. Components added to
user models through the Skip Components Menu do not have the use time field (see the last line in Figure 7.6).
With this design,
the system can obtain information to evolve discourse models
and user models without adding much extra work for programmers,
who also benefit immediately: removing apparently
irrelevant components makes the needed components
easier to choose.
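The three scopes described above can be illustrated with a minimal sketch. It is not CodeBroker's actual code: the class SkipModels, its method names, and the representation of components as dotted names (package.Class.method) are assumptions made for illustration, and persistence of the user model is omitted.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of the three Skip Components scopes; not CodeBroker's actual code. */
class SkipModels {
    enum Scope { THIS_BUFFER_ONLY, THIS_SESSION_ONLY, ALL_SESSIONS }

    // Discourse model: empty at the start of each development session.
    final Set<String> discourseModel = new HashSet<>();
    // User model: kept across sessions (saving it to disk is omitted here).
    final Set<String> userModel = new HashSet<>();

    /** A skipped method, class, or package covers every component at or below it. */
    static boolean covers(String skipped, String component) {
        return component.equals(skipped) || component.startsWith(skipped + ".");
    }

    /** Removes covered components from the display and records the skip per scope. */
    void skip(String name, Scope scope, List<String> rciDisplay) {
        rciDisplay.removeIf(c -> covers(name, c));
        if (scope == Scope.THIS_SESSION_ONLY) {
            discourseModel.add(name);
        } else if (scope == Scope.ALL_SESSIONS) {
            userModel.add(name);
        }
        // THIS_BUFFER_ONLY changes the current display only.
    }

    /** Filters a later delivery against both models, preserving the original order. */
    List<String> filter(List<String> retrieved) {
        List<String> shown = new ArrayList<>();
        for (String c : retrieved) {
            boolean skipped = false;
            for (String s : discourseModel) if (covers(s, c)) skipped = true;
            for (String s : userModel) if (covers(s, c)) skipped = true;
            if (!skipped) shown.add(c);
        }
        return shown;
    }

    /** A new development session starts with an empty discourse model. */
    void startNewSession() {
        discourseModel.clear();
    }
}
```

In this sketch, skipping a class with THIS_SESSION_ONLY hides its methods from later deliveries until startNewSession() is called, whereas ALL_SESSIONS entries survive that call.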
For example, in Figure 7.9(a), in response to the doc comment,
CodeBroker delivers some components (No. 1 through
No. 4) belonging to the class java.awt.CardLayout (a GUI class)
because of the term ``card.'' However, the
current task is not related to the
class java.awt.CardLayout, so the programmer can remove it through the
direct manipulation interface. This manipulation brings
the needed component randomShuffle, previously
obscured, to the salient fourth place (Figure 7.9(b)).
The fact that the programmer is not interested in the class
java.awt.CardLayout can be added to the discourse model
at the same time if the programmer chooses the This Session Only
command. No components from the class
java.awt.CardLayout will then be delivered later in this
development session, even if the programmer uses the word ``card''
in doc comments again, which is quite possible because the programmer
is developing programs about card shuffling.
Figure 7.9: The Direct Manipulation interface
7.5.2 Query Refinement
Query refinement is invoked by choosing the Query
Refinement command in the same pop-up menu, or
by typing it directly as an Emacs command. A buffer
(Figure 7.10) then
appears in which programmers can refine the automatically
extracted reuse queries and start another round of
component location.
Figure 7.10: The Query Refinement interface
Programmers can refine the
concept query by choosing more appropriate terms, or they can
modify the constraint query to make it less restrictive or
more restrictive, depending on the situation. To narrow
the search range for relevant components, the query
refinement interface also provides two additional fields:
- Filtered Components: for specifying classes or packages that
are not of interest, and
- Interested Components: for instructing the system to return
components from the specified classes or packages only.
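As a rough illustration of how these two fields could narrow a retrieved list, consider the sketch below. The class RefinementFields and its methods are hypothetical, components are assumed to be dotted names, and two behaviors are assumptions rather than documented CodeBroker semantics: an empty Interested Components field imposes no restriction, and Filtered Components takes precedence when both fields match.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the two extra refinement fields; not CodeBroker's actual code. */
class RefinementFields {
    /** True if the component is the named class or package, or lies beneath it. */
    static boolean within(String scope, String component) {
        return component.equals(scope) || component.startsWith(scope + ".");
    }

    /** True if any of the given classes or packages contains the component. */
    static boolean withinAny(List<String> scopes, String component) {
        for (String s : scopes) {
            if (within(s, component)) return true;
        }
        return false;
    }

    /** Applies both fields to a retrieved list, preserving the original order. */
    static List<String> apply(List<String> retrieved,
                              List<String> filtered,
                              List<String> interested) {
        List<String> kept = new ArrayList<>();
        for (String c : retrieved) {
            // Assumption: Filtered Components wins if both fields match.
            if (withinAny(filtered, c)) continue;
            // Assumption: an empty Interested Components field means "no restriction".
            if (!interested.isEmpty() && !withinAny(interested, c)) continue;
            kept.add(c);
        }
        return kept;
    }
}
```

For instance, passing java.awt as a filtered package drops every java.awt component, while passing java.util.Random as an interested class keeps only that class's methods.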
Component repository systems could provide a mechanism to let
programmers specify either of these fields before they first
use the system. However, programmers who do not know
the structure of the repository well enough may not be able to specify
these two fields. Even a system-guided dialog mechanism for soliciting user
specifications, as explored in the KID system [Nakakoji, 1993],
is not suitable for repository systems because component repositories
are often very large and it would take a long time to obtain a meaningful
specification. The CodeBroker system does not assume that programmers
know the repository structure well enough, and it solicits user
input only after its delivered components have acquainted programmers with
the structure of the component repository, especially the structure
of the part of the repository that might be relevant to the task
at hand.
The retrieval-by-reformulation mechanism in CodeBroker is a more
comprehensive approach to improving the retrieval performance
than the relevance feedback mechanism used in many
information retrieval systems [Buckley et al., 1994].
By adjusting the terms used in a query through query expansion or other
techniques, relevance feedback in information retrieval
systems focuses mainly on improving the
retrieval process itself. In contrast, retrieval-by-reformulation
focuses on improving
the relevance of information to the working
context of programmers, not to the query per se.
Direct manipulation tries to establish a shared
understanding of the context between the component repository system
and the programmer. It uses programmers'
previous interactions with the system as filters for later
deliveries. Although it does not affect what the Fetcher agent
returns, it does modify what gets shown. The system also
takes advantage of the fact that software components are
organized into a hierarchy (packages, classes, and
methods) according to their application domains to let
programmers limit the retrieval range to their interests.
Ph.D. Dissertation by Yunwen Ye, April 20, 2001, Department of Computer Science, University of Colorado