BU CAS CS 680 and CS 591: Readings in Computer Graphics

Class commentary on articles: Virtual Reality II



Jeremy Biddle


The ALIVE System: Full-body Interaction with Autonomous Agents

This paper is a description of a computer vision implementation of a
wireless interactive system.  One of the motivations behind this approach
is the restrictive set of constraints imposed by wearing VR gear.

The world used in the system contains both inanimate objects as well as
agents.  Agents are autonomous entities that act on their own in response
to their own sensory perception of the system.  Interaction with the system
(and the agents) is done indirectly, through the use of gestures and
signals.

Agents appear semi-intelligent because of a set of internal needs and
motivations that they have, which are balanced in real-time so as to
provide feedback about their behavior.  The feedback comes as both physical
movement in the system and auditory response.  The specification of an
agent includes:  sensors to judge the positions of objects;  motivations,
which indicate an agent's desires/needs;  and activities, which tell the
agent what to do under different circumstances.  The activities are perhaps
the most interesting aspect of the agent, as they start out very high level
(identifying actions such as playing, feeding, etc.).  These high-level
activities are broken into lower-level child activities (for feeding, these
would be chewing, looking for food, etc.).  These, in turn, are broken down
further until the lowest-level activities are reached, which specify
particular motor-system commands.
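
To make the hierarchy concrete, here is a minimal sketch (in Python) of how
such an activity tree might be represented; the class, activity names, and
motor commands are my own illustrative assumptions, not code from the paper.

    # Illustrative activity hierarchy: high-level activities decompose into
    # children, and only the leaves issue motor-system commands.
    class Activity:
        def __init__(self, name, children=None, motor_command=None):
            self.name = name
            self.children = children or []       # lower-level child activities
            self.motor_command = motor_command   # only leaves drive the motor system

        def is_leaf(self):
            return not self.children

    feeding = Activity("feeding", children=[
        Activity("look-for-food", children=[
            Activity("turn-toward-food", motor_command="turn(food_direction)"),
            Activity("walk-to-food", motor_command="walk(food_position)"),
        ]),
        Activity("chewing", motor_command="chew()"),
    ])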

These different activities compete for control over the agent, so that only
one activity will be pursued at a time.  As different parameters are
modified due to different conditions, the most important activity to the
agent will gain control and have the agent do something.
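
A rough sketch of this winner-take-all arbitration, under the assumption that
each activity is scored from the agent's current motivations and sensor
readings (the scoring function is illustrative, not the paper's actual model):

    # Only the highest-valued activity gains control of the agent each time step.
    def select_activity(activities, motivation_level, sensed_relevance):
        def value(activity):
            # e.g. hunger raises the value of "feeding"; a nearby user raises "playing"
            return motivation_level(activity.name) + sensed_relevance(activity.name)
        return max(activities, key=value)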

In order for the system to pick out the salient details of the scene, there
are several visual routines used.  The first, figure-ground processing,
isolates the user of the system.  This is achieved with color and motion
cues to determine the differences in the scene.  After the user is
identified, a bounding box is created around the user.  Next, the user's Z
dimension is determined by casting a ray from the camera to the user's
feet.  Since it is assumed that the bottom of the user is on the ground,
and the ground is fixed and precomputed, the Z distance is computed and the
user becomes a 2D plane in the 3D environment.  After the body is located,
the hands and feet are found using domain constraints.  Finally, in order
for the system to register an indirect command given by the user, it must
recognize gestures.  Some of the gestures recognized are pointing, waving,
and kicking.
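
The depth-recovery step can be sketched as a simple ray/ground-plane
intersection; the pinhole camera setup and coordinates below are assumptions
for illustration, not the paper's implementation.

    import numpy as np

    def user_position(camera_pos, feet_ray_dir):
        """Intersect the ray through the feet pixel with the ground plane z = 0."""
        # camera_pos + t * feet_ray_dir must have z = 0 (the user stands on the ground)
        t = -camera_pos[2] / feet_ray_dir[2]
        return camera_pos + t * feet_ray_dir      # 3D point where the user stands

    # camera 2 m above the floor, ray angled down toward the feet in the image
    feet_world = user_position(np.array([0.0, 0.0, 2.0]), np.array([0.0, 0.6, -0.8]))

Once this 3D point is known, the user's silhouette can be placed as a 2D
plane at that depth in the environment, as described above.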

In ALIVE I, the first incarnation of the ALIVE system, there was a Puppet
agent and a Hamster agent.  The Puppet simply tried to follow the user in
3D and hold the user's hand.  The Hamster would avoid predators, eat food,
and beg the user for food if there was none around.  The Hamster was also a
social little fellow, and enjoyed having its stomach scratched.  The user
could let out the predator, which would try to catch the Hamster but was
afraid of the user.

Evaluating the results revealed that the system is very user friendly, and
also that users are more likely to be patient with agents than with
inanimate objects.

Roberto Downs


The ALIVE System: Full-body Interaction with Autonomous Agents
by Maes, Trevor, Blumberg, and Pentland at MIT Media Laboratory

This paper discusses the design and implementation of a system called
the Artificial Life Interactive Video Environment (ALIVE), which allows
wireless full-body interaction between a human participant and a rich
graphical world inhabited by autonomous agents.  The authors argue that
such a system offers more complex and very different experiences than
traditional virtual reality systems.  Examples of the uses of this
extended model are in the areas of training and teaching, entertainment,
and digital assistants or interface agents.  Traditional VR interfaces
require cumbersome equipment and a limited interaction, which has
restricted the range of applications which could use this type of
technology.

In the model proposed in this paper, full-body interaction between a
human participant and a richer graphical world inhabited by autonomous
agents is accomplished through the use of a single video camera to
obtain a color image of a person, which is then composited into a 3D
graphical world.  The resulting image is projected onto a large screen
facing the user (referred to as a magic mirror) which shows the user
embedded within the 3D world.  Computer vision techniques are used to
extract information about the human participant, such as 3D location,
the position of various body parts, and gestures.  The 3D environment
contains inanimate objects as well as agents.  These agents are modeled
as autonomous behaving entities which have their own sensors and goals
and which can interpret the actions of the human participant and react
to them in interactive time.  Such a system offers a more powerful,
indirect style of interaction in which gestures can have more complex
meanings, which may vary according to the situations presented.

This system resides at the MIT Media Laboratory, where it has undergone
extensive testing by real users.  The results show that (1) the magic
mirror approach has several advantages over head-mounted display-based
virtual reality systems, and (2) virtual worlds including autonomous
agents can provide more complex and very different experiences than
traditional virtual reality systems.  The authors conclude that the
ALIVE system significantly broadens the range of potential applications
of VR systems; in particular, applications in the areas of training and
teaching, entertainment, telecommunication, and interface agents.

The modeling of autonomous agents is broken up into three components:
(1) the sensors of the agent, (2) the motivations or internal needs of
the agent, and (3) the activities and actions of the agent.  Given this
information, the system automatically infers which activities are most
relevant to the agent at any particular moment in time, given the state
of the agent, the situation it finds itself in, and its recent behavior
history.  By using a vision-based interface, the authors hoped to create
an interface to a computer graphics world which is as non-intrusive as
possible, while allowing a rich and intuitive set of gestures to be used
in controlling and navigating the world.  Ultimately, users who have
used this system have found that they concentrate more on the
environment itself, rather than on the complex and unfamiliar equipment
being used to interact with that environment.

Visual search tasks, also referred to as active vision, are solved using
the paradigm of goal-directed computation.  Specific search tasks are
carried out depending on the state of the agents in the world, as
opposed to computations performed uniformly across the image at each
time step.  Four basic types of visual routines have been developed for
interacting with a VR world using the magic-mirror paradigm: (1)
figure-ground processing, (2) body localization, (3) hand and feet
localization, and (4) gesture spotting.  Implementation of the four
routines includes conventional image processing and computer vision
techniques.

By combining the behavior modeling and the vision techniques described
in this paper, the authors have constructed a system for video-based
interaction with artificial agents.  Using the two, ALIVE allowed the
user to interact with both agents and inanimate objects.  The ALIVE-II
project expanded upon this interaction by creating a more sophisticated
repertoire of behaviors than the previous agents had, as well as simple
auditory output consisting of prerecorded samples.

This system seems to be well constructed, but it seems awkward to
constantly refer to a huge screen in the area.  In order to equal other
interactive methods for VR, this system will have to somehow be cut down
to a smaller size which does not compromise its power.  Issues dealt
with by real users included the inability of the system to recognize
hand gestures in front of the body (due to the silhouette imaging).  A
more dynamic system would perhaps move around so as to get a better
sense of the subject, or better yet offer multiple references to the
subject (multiple cameras).  Both of these would be engineering issues
in how the imaging technology is utilized.

---
"Rich Interaction in the Digital Library" by Rao, Pedersen, Hearst,
Mackinlay, Card, Masinter, Halvorsen, and Robertson
------------------------------------------------------------------------------
The authors discuss the development of techniques which support
various aspects of the process of user/information interaction in order
to increase the "bandwidth and quality" of the interactions between
users and information in an information workspace (an environment
designed to support information work).  Current access tools and
applications could be considered information workspaces, but they
limit the effectiveness of information access and the larger process of
information work.  Conventional retrieval interfaces are based on the
view of information retrieval as an isolated task in which the user
formulates a query against a homogeneous collection to obtain
matching documents.  The authors argue that this view "misses the
reality of users doing real work".  A strong system should then offer
iterative query refinement; source heterogeneity; parallel, interleaved
access; and support for a larger work process.  The authors address each
of these four through presentation of examples from their own work that
lead towards information workspaces supporting rich interaction.

Exploration of these examples leads to some categories of meta-information
which should be addressed: (1) content, (2) provenance,
(3) form, (4) functionality, and (5) usage statistics.  Emphasis is
placed on visualizations which map sources into spatial and graphical
elements based on meta-information, allowing interactions in which users
select sources as well as build a spatial memory of sources.
The rendering of retrieved data can be explored much in the same
way.  Access management of the data extends into analysis of time and
cost, asynchronous performance, and status feedback to the client.  The
latter increases the user's ability to formulate and execute multiple
operations in parallel and to manage search strategies more
effectively (refer to the GAIA protocol).

While this paper covers existing applications which contain commendable
features, the authors might have considered the design of a system which
incorporated all of these features.  While remaining in the realm of
theory, this proposed system (any system at all) would still add more
content to this paper than a synopsis of existing systems and their
limitations.

Bob Gaimari


"The ALIVE System: Full-body Interaction with Autonomous Agents"

This paper describes a virtual environment system which is implemented in a
unique way: instead of messing around with goggles, gloves, helmets, etc.,
the user sees him/herself in a mirror.  The user is shown in the virtual
world, interacting with the objects and agents there as if looking into what
they call the "magic mirror".  A large screen in front of the user displays
the image, and the user's position and actions are recorded using a video
camera mounted above the screen.


Section 2 discusses how agents are modeled in this environment.  These
agents are semi-intelligent, and have the following things.  

a set of goals, motivations, and needs: The agents will want certain things,
such as food or attention.

a set of activities which they can perform: They have a set of activities
which will allow them to meet these needs, such as walking around searching
for food, or hopping up and down for attention.  These activities are
hierarchically designed, with a high-level activity made up of several
low-level activities.

a set of visual sensors: These are used to watch the user for gestures and
cues, or to look around in the environment.  Rays are shot out of the
sensors, and the first thing they hit is recorded as seen.

a behavior system: This, given the above sets, will determine what the
agent will do in the next time step.  There is a certain persistence, so
that the agent won't flip-flop between actions, but it also won't keep
doing the same one forever (a rough sketch of this idea follows).
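
A hedged sketch of that persistence idea: the ongoing behavior gets a small
bonus so the agent doesn't flip-flop, while an accumulating fatigue term
keeps it from pursuing the same behavior forever.  The constants are
illustrative assumptions, not values from the paper.

    def choose_behavior(behaviors, scores, current, fatigue,
                        bonus=0.2, fatigue_rate=0.05):
        adjusted = {}
        for b in behaviors:
            adjusted[b] = scores[b] - fatigue.get(b, 0.0)
            if b == current:
                adjusted[b] += bonus          # hysteresis: favor the ongoing behavior
        winner = max(behaviors, key=lambda b: adjusted[b])
        fatigue[winner] = fatigue.get(winner, 0.0) + fatigue_rate  # winner tires over time
        return winner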


Section 3 discusses the interface between the user and the environment.  It
is based upon vision, and the user can perform simple gestures to make
things happen in the virtual world.  The user can point at things, wave,
pick up or manipulate inanimate objects, touch agents, etc.  The camera
which is pointed at the user records his/her actions, and the system
interprets them by 1) isolating the user from the background; 2)
localizing the position of the body; 3) localizing the positions of the
hands and feet; and 4) spotting gestures by the positions and orientations
of the hands and feet.  Gestures are temporal as well as positional;
waving, for example, is a gesture which can only be discerned as a
side-to-side motion over time.  Also, the user may not want something
interpreted as a gesture if he/she was simply moving his/her hand from one
place to another.
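
Waving is a good example of a temporal gesture: it can only be detected from
side-to-side hand motion over a window of frames.  A minimal sketch, with
thresholds and window length as illustrative assumptions:

    def is_waving(hand_x_history, min_reversals=3, min_amplitude=0.05):
        """hand_x_history: recent horizontal hand positions, most recent last."""
        reversals = 0
        for a, b, c in zip(hand_x_history, hand_x_history[1:], hand_x_history[2:]):
            # count direction changes large enough to rule out jitter
            if (b - a) * (c - b) < 0 and abs(b - a) > min_amplitude:
                reversals += 1
        return reversals >= min_reversals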


Section 4 discusses some environments they have used with the system, such
as a puppet world, a hamster world, and a virtual dog.  Each of them has
specific gestures and interpretations of these gestures by the agents.  The
agents also have different motivations and abilities.


Section 5 compares the system with other systems, and section 6 evaluates
the results.  I think that the strongest success is the naturalness of the
"magic mirror" interface.  This is something people are already familiar
with, and so will have very little trouble getting used to.  Users don't
have to wear special goggles and gloves, and they avoid the possibility of
disorientation and tripping.  Also, simple, everyday gestures can be used.
Finally, section 7 discusses possible applications, in such areas as
entertainment, training, and interfacing to digital assistants.


"Rich Interaction in the Digital Library"

This paper discusses the need for better interaction between users
searching for information and on-line information sources, such as
databases, file servers, and digital libraries.  Using current information
retrieval tools, users have to overcome a number of barriers to accessing
information, such as unfamiliarity with different interfaces and
functionalities of sources, lack of ability to interleave operations, or
lack of smooth integration with the overall work process.  It then goes on
to discuss various methods of overcoming these barriers using better
interface tools.

The first tool discussed is the "Scatter/Gather" paradigm of browsing.
Here, a user will select a topic area to search from a list.  The possible
documents in this topic area are all "scattered", forming a small number of
clusters of similar document areas, each with a list of key words, and a
sample document title.  The user can select one or more of these clusters,
and they will be "gathered" together to form a subcollection of the larger
group, with its own list of topic areas, and the process can continue until
the user finds a set of documents to directly access.  Another tool is a
"Snippet Search" which, given a keyword, will return "snippets" of context
showing where these keywords occur.  This can give the user a better idea
of what other words may be useful to search for.  Combining this with
"TileBars", the user can see how often these key phrases are used in a
document, whether they occur together often, etc.  These show graphically
where the words occur by filling in black spaces in a white rectangle,
according to frequency of use.
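
A minimal sketch of a single "scatter" step, assuming TF-IDF vectors and
k-means clustering; the actual Scatter/Gather work relies on its own fast
clustering algorithms, so this only illustrates the interaction pattern:

    from sklearn.cluster import KMeans
    from sklearn.feature_extraction.text import TfidfVectorizer

    def scatter(documents, k=5):
        vectors = TfidfVectorizer(stop_words="english").fit_transform(documents)
        labels = KMeans(n_clusters=k, n_init=10).fit_predict(vectors)
        clusters = {i: [] for i in range(k)}
        for doc, label in zip(documents, labels):
            clusters[label].append(doc)
        return clusters

    # clusters = scatter(corpus)
    # subcollection = clusters[1] + clusters[3]   # "gather" two clusters, scatter again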

Next, they discuss how visualization tends to be a strong aid in gathering
and sorting through information.  They show what I think is the best method
of visualizing data I have seen yet, the Butterfly, which combines search
and browsing capabilities.  This is used to search for articles through
bibliographic sources.  The display has the currently active article in the
center (the head).  The left wing contains a list of the articles which the
paper references, and the right wing contains a list of articles which
reference the current one.  So the user can browse through a space of
interrelated articles, and select particular ones for saving.
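
The underlying data for one Butterfly display can be sketched very simply;
the lookup functions are assumed to exist (e.g. backed by a citation index):

    def butterfly_view(article, references_of, cited_by):
        return {
            "head": article,                      # currently active article
            "left_wing": references_of(article),  # works this article cites
            "right_wing": cited_by(article),      # works that cite this article
        }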

References to the 3-D display systems from the previous paper are also
mentioned toward the end, as examples of other methods of visualizing and
browsing data.

Daniel Gentle


John Isidoro


The paper we had to read, "The ALIVE System", was very interesting.  It
described a system where a user, using a "magic mirror" interface, could
interact with virtual agents as if they were actually touching these agents.
The thing that struck me the most about this paper is how well the magic
mirror system worked.  I think if I were to use the system, I'd keep looking
away from the dog's image in the magic mirror (supposing I was playing with
the virtual dog) and expect to find the dog in the real world.  I guess it
just takes a little getting used to. :^)

As for real-world applications, one thing I thought of is using this system
as an interactive assistant to animators doing rotoscoping of 3D or cartoon
models.  The model or cartoon could be made to mimic the real-life person,
and the user could get feedback on how his movements look when an animated
actor is attempting to do them.  The user could then adjust his movements
according to how he wants the actor to act.  I wonder if they did something
like this for "Toy Story" or some other computer-rendered movies?

Dave Martin

      The ALIVE System: Full-body Interaction with Autonomous Agents

	 Pattie Maes, Trevor Darrell, Bruce Blumberg, Alex Pentland


The idea of ALIVE is so simple it is truly inspired: instead of strapping
expensive and entangling sensing gear onto a VR world participant, use a
single well-placed camera and image understanding techniques in order to
locate the participant in 3-space; and instead of depositing a bulky
display device on the participant's head, just put a really big screen on
the wall.  This puts almost all of the system implementation in software,
giving the developers enormous flexibility to change their system without
retooling.  The participants have complete freedom of motion, and since
their natural view isn't obstructed, any lack of clarity and realism in the
system will more likely annoy than nauseate them.  This system would excel
in VR applications where only rough user controls are required.  For
instance, one could imagine a workstation with a camera pointed at the
user's face for imaging, and another camera having a profile view of the
user for gesture interpretation.  The workstation screen would depict the
user in an information space; just by rolling the hands, pointing,
grabbing, and so on, the user would be able to very quickly change
viewpoints and activate encountered objects.

The authors describe a system that works with a single camera, sensing the
user and rendering the synthesized agents at about 10 frames per second.
This fine performance is due in part to what they cite as "active vision",
wherein certain analyses are performed only under appropriate circumstances.
For instance, it is not important to determine the precise location of a hand
if it is not close to anything in the VR world.
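
A hedged sketch of that goal-directed shortcut: run the expensive
hand-localization routine only when the coarse body position is close enough
to something in the virtual world for the hand to matter.  The radius and
function names are illustrative assumptions.

    def update_hand_estimate(body_position, nearby_objects, locate_hand_precisely,
                             interaction_radius=0.5):
        if any(obj.distance_to(body_position) < interaction_radius
               for obj in nearby_objects):
            return locate_hand_precisely()   # goal-directed: only computed when needed
        return None                          # skip the analysis this frame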

Their system features several critters with primitive but cute behavior.
Much of the paper describes the implementation of these agents' behaviors.
While this helps the reader envision the running system, I did not find
that part of the implementation particularly compelling.  Nor do I see
any reason to deify the agents by capitalizing them ("Puppet", "Hamster",
"Dog"), but I guess that's just my pet peeve.

The ALIVE system internally reduces the input image to silhouettes in
order to extract gestures, so it loses much potentially valuable
information.  As mentioned in the paper, hands cannot be located if
they are held close to the body.  The authors suggest looking for
particular flesh colors or using a second camera to resolve this
problem.  A second camera would also make it easier in principle for
the system to track more than one participant (no extra equipment
required for more participants!), but I'm sure I don't fully appreciate
the complexity of the multi-camera multi-participant problem.

Most people point their heads pretty much directly at what they want to
see, so I wonder if users of the ALIVE system suffer from some kind of
stiff-neck-rigid-torso fatigue.  But even this sounds preferable to having
to wear sensing gear.  All told, the system sounds very cool, and I want
one.  

		  Rich Interaction in the Digital Library

  Rao, Pedersen, Hearst, Mackinlay, Card, Masinter, Halvorsen, Robertson



This article describes the authors' approaches towards improving
interaction with large on-line databases.  They list four basic
requirements and describe techniques for addressing them.  Many of these
techniques have been implemented by the authors.  The requirements are (1)
iterative query refinement, (2) source heterogeneity, (3) interleaved
access, and (4) larger work process.

The idea behind iterative query refinement is that most users of large
databases do not know precisely what they seek in terms of the database
content and organization.  One of the main roles of a reference librarian
is to create the dialog that leads to query refinement, but in a completely
on-line system, there is typically no such presence.  Therefore, it is
important that the system make it very easy for the user to ask broad
questions and subsequently refine the questions based on results.  The
authors describe a technique called scatter/gather to facilitate this need.
In this technique, the system dynamically creates sort of a hierarchical
view of the data.  By navigating this view, the user implicitly refines the
query.  Beyond that, I did not really understand how the technique works:
exactly what is scattered?  The authors' identification of scattering with
clustering is confusing.  How much up-front effort is required to adapt a
database for this system?  It is hard to imagine that the system would
automatically know to put "text print format menu page word font image mac
size" in one category and "agency office government department contract
center" in another.  Finally, how quickly does the system run?  Are the
views built in advance, or are they really computed at the time of use?
Are they computed at the client or server end?

TileBar lets the user specify how often certain terms must appear in a
document for the search to find it.  In the result view, a little graphic
shows the distribution of the terms in the result documents.  These strike
me as novel and useful techniques for whittling down the query space.
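
A rough sketch of how one TileBar row could be rendered in text, assuming the
document is split into segments and each cell is darkened by the term's
frequency in that segment (shading characters stand in for grayscale squares):

    def tilebar_row(segments, term, shades=" .:#"):
        counts = [seg.lower().split().count(term.lower()) for seg in segments]
        top = max(max(counts), 1)
        return "".join(shades[c * (len(shades) - 1) // top] for c in counts)

    # tilebar_row(paragraphs, "network") might return " .#:  .   "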

The authors note that professional database searchers spend much of their
time sharing information about sources, while the average user does not
have this kind of knowledge.  The GAIA protocol was built to provide an
interface for uniform access to a wide variety of heterogeneous sources.
GAIA also sends time estimates and status messages to the client while
searches are in progress, which is an important means of enabling
interleaved access.  However, the authors do not describe GAIA in any
detail; instead, they concentrate on a GAIA client, the butterfly viewer.

The butterfly viewer displays matching articles on the left and articles
that cite the matched articles on the right.  It can also show 3D plot of
the citing-cited relation between articles over time, making it very easy
to identify fringe versus "hot spot" papers.  The viewer allows the user to
create "piles" of references at will and automatically generates references
to related works in an unspecified manner.  It seems like a very helpful
searching tool. 

The authors also describe a tool called Protofoil for navigating image and
OCR versions of scanned documents.  Protofoil supports multiple search
methods and result visualizations, including a TileBar display.  It is
particularly successful in its use of thumbnail images, which greatly speed
searching of familiar documents.  I have also found that I remember much
more than text when reading: the page layout, paragraph shapes, shape of
displayed equations, etc. are very important in helping me locate a passage
in a technical text.

Finally, the authors refer to a variety of data visualization techniques,
many of which we have seen: the perspective wall, cone tree, and others.
This is a fine article.  It describes both abstract goals and the
state-of-the-art with references, so that an interested reader can follow
up if desired.



John Petry

"THE ALIVE SYSTEM: FULL-BODY INTERACTION WITH AUTONOMOUS AGENTS,"
by Maes, Darrell, Blumberg and Pentland.

The ALIVE System is a combination of a full-body user interface and a
variety of autonomous agents which interact with an image of the user.
I'd like to consider three aspects of this paper independently:
	the autonomous agents;
	the full-body interface;
	the interaction of the agents with the interface.

1) Autonomous agents

These are moderately complex, but in themselves are not particularly
original.  The most involved one seems to be a virtual dog which has
an action-generating algorithm which consists of a tree several layers
deep, with the upper levels denoting high-level actions (e.g., eat, move)
and the leaves specific sub-tasks (open mouth, chew, swallow).  This
is fairly tangential to the core of the paper, which is the interface.

As such, the agents simply give some purpose to the interface.  They could
just as easily be replaced with file system icons, video game controls,
a virtual jukebox (pick your music interactively): the possibilities
are wide-ranging.

2) The full-body interface

This is the most superficially appealing aspect of the work, but to be 
honest, I found it to be just that:  superficial.

The vision portion is difficult, but the results don't justify the effort.
A vision system that could read American Sign Language would be excellent,
if incredibly hard; a vision system that could interpret facial motions of
quadriplegics would be very useful; this is much cruder than either, and
of much less use, since anyone who can make these types of gestures can
also control a mouse or joystick, or type.

I'm reminded that a person moving in front of a giant screen while moving 
her arms to interact with a program is much like the same person sitting 
in front of a monitor moving a mouse, except that the former uses 16' x 16'
of floorspace, a wall-sized monitor, an incredibly imprecise mechanism
to specify actions (compared to the fine control presented by a mouse), 
and needs a person to be standing the whole time if she wants access to the
full range of controls.  Oh, and it is much more expensive.

While gloves and headsets have significant limitations, I think this system
is at least as limited in its own way.  I strongly suspect a combination
of the two is the way to proceed.  Perhaps wearing gloves or headwear that
was visually distinctive without requiring wiring or an internal power source
might work.  For instance, a pair of IR-sensitive gloves and headband could
vastly simplify the vision task without causing any appreciable difficulties
for the user, as well as being quite inexpensive.

3) The interaction between the autonomous agents and the full-body interface

This is somewhat interesting, but to some extent I'm not sure if it is meant
to be the focus of the paper, or if the interface is the key part.  To the
extent that it is the main topic, I don't really understand what it is meant
to show.

Overall, the application strikes me as very cute.  It's quite easy for
inexperienced users to learn how to operate, which certainly deserves
credit, and it is a novel implementation that will no doubt generate
considerable ideas as a result.  But overall, I don't think there's a lot 
of depth to this paper.

Robert Pitts


The ALIVE System: Full-body Interaction with Autonomous Agents
by Maes et al.
==============

This paper describes a virtual reality system that uses a different
interface than traditional systems and that has a richer virtual
environment.

The authors begin by contrasting their system to the traditional
virtual reality approach.  Their system differs in that:

o There are no devices attached to the user.  The system uses computer
  vision techniques to determine the position and orientation of a user.

o The virtual environment contains autonomous agents that can interact
  with the user in addition to static objects.

The model used for the autonomous agents in the virtual world has
several realistic properties that help to produce believability;
they are:

o Sensors for detecting aspects of the environment.  The basic method
  uses a "what is in my line of sight" approach; however, the agents
  are privy to more than just visual properties of the objects in their
  sight (since all objects are modeled in the virtual system, including
  the user).

o Motivations and goals.  These give the agents a reason to do
  something other than just sitting there.

o Activities/actions that the agent can perform.  This will be a pool
  of behaviors that an agent knows how to perform.  They are organized
  in a hierarchical manner, with more specific behaviors at the
  leaves.

o A motor system that is "told to do something" based on what the
  current behavior is. The model of the motor system uses both physical
  and kinematic modeling.

o A control system that, in real-time, takes sensor information,
  determines what the current goal should be and activates activities
  to satisfy that goal.

I believe this model is a good tradeoff between the believability and
complexity of the agents.  Humans often attribute deeper cognition to
behavior that can be generated by simple motivations interacting with
an environment.  Thus, agents behave as expected, but with less
computation.
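
A minimal sketch of the "what is in my line of sight" sensor described above:
a ray is cast from the agent's eye along its gaze direction, and the nearest
object hit is what the agent sees.  The intersection routine is assumed.

    def sense(agent, world_objects, intersect_ray):
        hits = []
        for obj in world_objects:              # the user is modeled as an object too
            t = intersect_ray(agent.eye, agent.gaze, obj)   # distance along ray, or None
            if t is not None:
                hits.append((t, obj))
        return min(hits, key=lambda h: h[0])[1] if hits else None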

In the third section, the authors describe the human interface to the
virtual world.  Advantages of their computer vision approach are that
users are not tethered to unnatural devices.  Because the system models
users in the virtual system, they describe how using computer vision
techniques, the system can efficiently produce an internal model of the
user.  Their efficient techniques make assumptions about the positions
of certain body parts, etc. based on other information.  It would have
been nice for them to mention limitations to this technique in this
section.  For example, do certain natural movements present
difficulties to the system?  Are there limitations to the the speed of
users' movements for reliable tracking?

Lastly, they describe "gestures" that are supported by this system and
are used to control the behavior of agents.  It is not clear in this
section whether users experience any of the usual problems faced in a
"mirror" interface.  For example, humans have trouble coordinating
certain tasks when faced with a mirror image of themselves.  This issue
is somewhat covered later.

Next, some of the virtual environments and the agents found in them are
described, and the ALIVE system is contrasted with previous work.  The
primary improvements are: the use
of vision techniques, using 3-D user positions, using more complicated
gestures and more sophisticated agents.

There are certain limitations to the ALIVE system.  Nonetheless, an
interesting observation is that users are more tolerant of mistakes by
the agents than by inanimate objects.  For example, an agent may have
missed a gesture.  This is a clever way to use human expectations to
ignore technological limitations.

The authors do a decent job in addressing some of the limitations of
the "mirror" metaphor, though they do not adequately address the
question of whether users have trouble coordinating because of the
mirrored image of themselves.

Finally, their discussion gives a good description of possible uses of
the magic mirror technology.  The authors concentrated on future
applications.  Although they briefly describe possible improvements to
the underlying implementation (e.g., improved hand detection), I would
have liked to hear more about these issues.

Rich Interaction in the Digital Library
by Rao et al.
=============

This article addresses the issue of constructing "information
workspaces," by allowing users to perform queries on heterogeneous sets
of information and form a unified or multi-aspect view of that
information.

There are a few typical characteristics of queries performed by humans
(when not limited by computer interface design or resources).  Humans
typically collect information from "multiple sources" and perform
"parallel/interleaved retrieval" and "non-sequential processing."

The authors present some schemes for improved querying of homogeneous
sources.  Techniques include iterative refinements of searches and
geometric representations of document contents (both based on keywords).

The preceding examples describe techniques for improving queries of
homogeneous sources.  To handle heterogeneous sources, a set of meta-
information, information about the sources themselves, has to be
constructed.  The important pieces of meta-information identified are:
information about the source, content, structure and how to access a
service as well as who is using an information service.

The GAIA protocol, a protocol for querying multiple sources, is
presented.  It supports access to the pieces of meta-information that
were identified as important as well as characteristics of human query
sessions mentioned above.  It is not evident when meta-information must
be programmed into GAIA or when it can obtain this information
automatically.  However, the authors do mention that the system
supports a certain set of sources, suggesting that it has been
programmed to do so.
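
A sketch, under my own assumptions (the paper does not detail GAIA's actual
interface), of what uniform access to heterogeneous sources might look like:
each source carries its meta-information and a query adapter behind a common
method.

    from dataclasses import dataclass
    from typing import Callable, List

    @dataclass
    class Source:
        name: str
        content: str                       # what kind of documents it holds
        provenance: str                    # where the collection comes from
        query: Callable[[str], List[str]]  # source-specific search adapter

    def search_all(sources, query_text):
        # a real client would issue these asynchronously, with per-source time
        # estimates and status feedback; shown sequentially here for brevity
        return {s.name: s.query(query_text) for s in sources}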

The importance of real-time constraints in a parallel, asynchronous
query system is mentioned.  A user (or the system) needs an estimate of
how long a query will take before deciding to initiate that query.  It
is mentioned that these times can be estimated by GAIA, but it is not
mentioned how this is done.  Furthermore, the authors later state that
"Query results typically have an unpredictable size and require an
unpredictable amount of time to enumerate."

Finally, the authors describe how they organized multiple views for a
particular search operation.  Even though they identify categories of
different views, I did not find their example to be general enough.
Their description is only of a single problem and doesn't really
contain any theory about the task in general.

In summary, this article provides a couple of good strategies for keyword-
based search and does a minimal job of identifying a framework for
dealing with multiple sources.  It lacks convincing examples of how
these principles can be applied.  Last, the authors do not speculate as
to what path this work should take in the future.


Stan Sclaroff
Created: Mar 13, 1996
Last Modified: Apr 3, 1996