Jeremy Biddle
Merging Virtual Objects with the Real World: Seeing Ultrasound Imagery within the Patient
Bajura, Fuchs, Ohbuchi

In this paper, the authors describe their work in creating a 3D ultrasound system used for visualizing ultrasound images of objects in combination with real video of the object being viewed. Primarily, this system is being developed to assist in medical situations, where it is necessary to "see" inside a patient, or for other reasons. The authors do not limit the use of the system to simply facilitating medical examination, and discuss several other possible uses at the end of the paper, including fire fighting, architecture, and service information. There has been previous work in the field in two different sub-fields, ultrasound acquisition and display. In acquisition, there has been work to develop a real-time 3D data scanner, although such a beast had not [as of 1992 at least] been developed. A closer-to-real-time 3D ultrasound scanner is under development [finished now?] with manual guidance and mechanical tracking that can make a fair number of 2D slices of an object in just a few seconds. In display, the paper talks of presenting 3D image data either in visual form or as calculated values. The visual form can be either geometrically or image based. An extension to the display is a system with location and orientation tracking, in order to overlay the 2D image with other 2D images and convey their intersection with 3D space. Another study used 3D gray-scale data. In order to visualize the data, their experiments led them to visualize the 3D volume through a sequence of 2D ultrasound images. The final images are rendered with an image-order ray casting algorithm. In their "virtual environment", a sophisticated array of hardware devices is used to visualize the ultrasound images.
Two Polhemuses track the motion of the ultrasound scanhead as well as the VPL EyePhone (used for viewing real images with ultrasound imagery overlays), and dedicated processors are devoted to image generation. From the first experiments with a real human, the researchers found a number of problems and inadequacies with the system. Most notable were that the ultrasound images did not realistically appear to be within the patient, and the system was still too slow and the images were not updated quickly enough to be convincing. Also, the image tracking was not robust enough for their needs, the head-mounted display was inadequate, and the display engines were not as powerful as they needed to be.

Information Management Using Virtual Reality-Based Visualizations
Fairchild

This paper discusses problems with current systems of information management, and how visualizing large amounts of information can reduce the inherent complexity of comprehending such large information systems. Fairchild starts off by addressing several methods to deal with large amounts of information. Visualizing a single complex object is the task of displaying the object so that position, shape, and color represent different attributes of the object. Because a particular object may have too much information to be expressed at once, only a subset of the information may be chosen. The Automatic Icon model can be used to specify what kinds of visualizations can be used in multimedia representations of data. These visualizations consist of four sub-parts: semantics, normalization, graphic vector, and graphic object. Visualizing a large collection of complex objects can be handled by developing degrees of interest or fish-eye views. If the collections of objects are put into a 3D visualization, instead of 2D, the apparent complexity of the information is brought to a more manageable level.
Presumably it would give the semblance of there being more room in which to shuffle around information, even though the screen resolution would be the same. The ability of the user to freely navigate through such a defined information space is also critical, as without proper locomotion, the data is not easily attained. Several navigation techniques mentioned are: relative, absolute, teleportation, hyperspace, and transformation. The first three are straightforward, hyperspace is analogous to the world wide web, and transformation is potentially powerful, but has not been explored as much as the others. Since the human body is so adapted to gestures, a Gesture Sequence Navigation technique is proposed for simple, natural navigation through such a system. Three important aspects of such a system are: response, where the system reacts to a gesture of the user; semantic paths, creating new pathways based on gestures; and stimulus, the ability for the system to be configured and to modify itself according to usage. Next, several different management systems are discussed, including MCC's SemNet, Xerox PARC's perspective wall and cone tree, Silicon Graphics' 3D Landscape, and the Institute of Science's Sphere-based visualizations.

Roberto Downs
----------------------------------------------------------
"Merging Virtual Objects with the Real World"
Bajura, Fuchs, and Ohbuchi
----------------------------------------------------------

This paper proposes a solution to the problem of "live" ultrasound echography data visualization using an 'ultimate' 3D system which acquires and displays 3D volume data in real time. The general idea behind the data acquisition proposed in this paper is the use of 2D imaging primitives to reconstruct 3D data. The ideal would be to have a 3D scanner which would use a 2D phased array transducer to sweep out an imaging volume; however, such a real-time 3D scanning system is not yet available due to required advances in both 3D volume data acquisition and 3D volume data display. The procedure for reconstructing 3D images from images of lesser dimensions requires that the location and orientation of the imaging primitives be known. Coordinate values may be tracked acoustically, mechanically, or optically. Other systems allow for a human or a machine to scan at predetermined locations and/or orientations. The authors review an incremental, interactive, 3D ultrasound visualization technique which visualizes a 3D volume as it is incrementally updated by a sequence of registered 2D ultrasound images. An image-order, ray-casting algorithm renders the final images incrementally, using a hierarchical "ray cache" for quick composition and fast rendering of polygons properly composited with volume data. The authors move on to virtual environment ultrasound imaging, in which each echography image is acquired by an ultrasound scanner. The position and orientation of the scanhead and of a head-mounted display are tracked in 3D world space with 6 degrees of freedom. The resultant images are video mixed with real-world images from a miniature camera mounted on the head-mounted display, yielding the 2D ultrasound data registered in its true 3D location.
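The image-order, ray-casting approach described above can be illustrated with a minimal sketch of front-to-back compositing along each viewing ray. This is an outline of the general technique only, not the authors' implementation; the function names and the early-termination threshold are assumptions, and the hierarchical ray cache itself is omitted.

```python
# Illustrative sketch of image-order ray casting with front-to-back
# compositing. Not the authors' code: cast_ray is a stand-in that would
# sample the ultrasound volume along one viewing ray.

def composite_ray(samples):
    """Composite (color, opacity) samples front to back along one ray."""
    color, alpha = 0.0, 0.0
    for c, a in samples:
        color += (1.0 - alpha) * a * c
        alpha += (1.0 - alpha) * a
        if alpha > 0.99:            # early ray termination: ray is opaque
            break
    return color, alpha

def render(width, height, cast_ray):
    """Image-order traversal: cast one ray per output pixel."""
    return [[composite_ray(cast_ray(x, y)) for x in range(width)]
            for y in range(height)]
```

The incremental aspect of the authors' system would then amount to re-compositing only those rays whose cached samples are invalidated by a newly registered 2D slice.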
Technical problems addressed by the authors include conflicting visual cues, system lag, tracking system range and stability, head-mounted display system resolution, and the current power of display engines. This paper gives an accurate assessment of the current limitations of hardware and the resultant visualizations, proposing a model which appears to solve the initial problem posed. In agreement with one point the authors raised, the method behind the integration of scan and video still seems to raise questions as to the synchronization of the two to produce the apparent 3D visualization. Calibration only handles the initial image, and subsequent images may be off due to human error. A challenging task would be to create an algorithmic solution to compensate for/minimize human error, then integrate that into this model. Otherwise, the general model seems to be well formed.

-------------------------------------------------------------------------------
"Information Management Using Virtual Reality Based Visualizations"
Fairchild
-------------------------------------------------------------------------------

This paper begins by pointing out that the growing power of computers has made representing over-complex amounts of information a problem. One solution would be the encoding of subsets of the information using multimedia techniques and placing the resulting visualizations into a perceptual three-dimensional space in order to increase the amount of information which a human can meaningfully manage. Extending this concept, an appropriate approach to this problem would be to create a flexible system which would be able to represent just about any collection of abstract information in a similar VR visualization. User comfort and controls should be stressed in this model.
The success of this approach depends on three things: (1) the visualizations of the individual pieces of information, (2) the visualization of the larger collection of these individual pieces, and (3) the amounts of these larger collections of information which should be visible at any given time (allowing user control of subsets of the entire amount of information presented). With regard to the management of large amounts of complex information, there are three subproblems which need to be addressed: (1) how to make meaningful visualizations of single objects, (2) how to make meaningful visualizations of collections of objects, and (3) how to allow the users to control the selection of the visualizations efficiently. Visualization of a single complex object must take into account the limitations of human comprehension when determining the semantic information of objects. In essence, humans are ill-suited for readily understanding complex encodings which extend beyond the theoretical limitations addressed in Miller's "The Magical Number 7 Plus Or Minus 2...". The model should be able to dynamically change its visualization to aid the user's understanding of the semantics of the visualized data. To solve the problem of visualization of a large collection of complex objects, the author introduces the notion of degrees of interest (DOI) or fish-eye views (FEVs). The DOI model associates two values with each object, semantic distance and a priori importance (API). Semantic distance is a measure of how far the viewpoint is from the object. The API is a measure of the object's importance to the user. FEVs contain a mixture of objects with high and low levels of detail. This is equivalent to perspective in our real world. Its apparent limitation is that all objects have the same DOI value. As the Euclidean distance to objects increases, their apparent size decreases.
When DOI values are imposed upon these objects, objects of the same physical size but with different DOI values would reflect this difference in their apparent sizes. The goal is to create a model which allows for the automatic assignment of semantic distance and API to all objects in a visual scene. The resulting view would illustrate the concept of information fidelity: some objects being shown with higher information content than other objects. The user definition of these visualizations presents an interface problem, since current VR hardware has not advanced to an acceptable level for "natural interaction." In addition to the interaction limitation, the development of VR space for information management requires two additional interaction methods: users must be able to efficiently (1) define the encodings of subsets of individual object semantics to visualizations, and (2) define the subsets of objects that should be shown in higher information fidelity. Using these methods, visualizations of a single complex object now refer to a user-defined multimedia encoding which reflects some combination of the semantic values of the original object. The visualization of large collections of complex objects begins with the reduction of apparent complexity of the information by placing the objects within a 3-dimensional display as opposed to a 2-dimensional display. Further reduction of the complexity involves allowing a limited view of the information, in this case the subset which the user would like to examine. Therefore, perspective is introduced, with objects nearer the viewpoint appearing larger and thus helping the user to examine local neighborhoods more effectively. The DOI function provides for the distortion of the original space to reflect the design task requirements of the users and provides a FEV; instead of only growing larger as an object gets closer to the user, the object increases in information fidelity as well.
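The degree-of-interest computation summarized above can be sketched in a few lines. The linear weighting and the fidelity thresholds below are invented for illustration; Fairchild's chapter does not specify the exact formula.

```python
# Illustrative sketch of the DOI model: each object gets a DOI computed
# from its semantic distance and its a priori importance (API), and the
# DOI is then mapped to an information-fidelity level. The weight and
# the thresholds are assumptions, not taken from the chapter.

def degree_of_interest(semantic_distance, a_priori_importance, weight=1.0):
    """Lower semantic distance and higher importance yield a higher DOI."""
    return a_priori_importance - weight * semantic_distance

def fidelity(doi):
    """Map a DOI value to a level of rendered detail (thresholds assumed)."""
    if doi >= 5:
        return "full detail"
    if doi >= 0:
        return "summary"
    return "dot"
```

Under this sketch, an important file one directory away would render at full detail, while an unimportant file buried many directories deep would collapse to a dot.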
AutoIcons referencing the DOI values determine the amount of semantic information the user will see of the objects in the view at any given time. User definition of visualizations needs to offer some manner of navigation of the space and allow end-users to develop efficient task-specific visualizations. Several navigation techniques have been discussed in previous work, but new ones must be developed. To this end, the author introduces Gesture Sequence Navigation, which, based upon sequences of a small set of gestures, takes advantage of the human ability to respond rapidly to recognized stimuli. The final issue is an information visualization architecture which would allow for the dynamic specification of how semantic information should be represented. Surrogate object classes are introduced to handle this problem, implementing an access protocol for different information types. The author concludes by reviewing prototype visualization systems and, using the requirements which he established, evaluating these systems. Overall, this paper was well written and had a clear focus as to what should be dealt with in theory, as well as an accurate appraisal of the merits of existing systems.

-------------------------------------------------------------------------------
-------------------------------------------------------------------------------

Bob Gaimari
"Merging Virtual Objects with the Real World" This paper discusses the goal of being able to project a computer-generated image onto a real-life scene in real time. The example used is a woman getting an ultrasound, and having the image of the fetus projected correctly onto her body. An obstetrician would be able to wear a special device which is a combination viewing screen and camera, and he/she could walk around the patient, viewing her from various angles, and the images on the viewing screen would change with the orientation; the information about the orientation would be gathered from the head-mounted camera. A true 3-D ultrasound scanner is not available (at the time of the paper), and so the 3-D image had to be constructed from a series of 2-D slices, using volume rendering. In section 3, they discuss how this was done, adapting previous techniques to this dynamic system. The imaging was first done in 2-D on a regular screen for testing, and was moved to the desired 3-D environment (discussed in section 4). This experiment was only the first steps in this application and showed some of the problem areas which must be overcome: image appears to lay on top of the subject instead of inside; lag in image generation; problems with tracking and display resolution; and hardware not yet sophisticated enough to display in real time. This paper discusses a very important application of computer graphics; however, I don't feel that it was written very well. I felt that they over-used acronyms to such an extent that I had to stop several times to look back to what they were refering to. Also, I felt that the discussion of the specific hardware devices used (in section 4), while important, was not very interesting. "Information Management Using Virtual Reality-Based Visualizations" According to this paper, people have a very difficult time comprehending the huge amounts of data which are available. 
For a person to be able to wade through it all, there has to be a way of simplifying the search space. This paper discusses techniques to make this space more manageable, by representing different aspects of pieces of data, such as importance, type, creator, date of creation, etc., using multimedia. Also, in addition to individual pieces of data, collections of data need to be represented as well. The set of objects is quite large, and so must be grouped according to degrees of interest. The degree of interest of an object can be shown visually in many ways: putting significant objects in the line of sight; putting more important objects closer to the user than less important ones; making the level of detail greater in more important objects; color and shape cues; and so on. Also, the user needs to be able to navigate in this space. The paper suggests "gesture sequence navigation", where the user makes small movements (with hands, for example) to tell the system what objects to move toward in the space. This paper was quite interesting. I had a difficult time picturing what they were talking about some of the time when they described the visual spaces, but this is probably because of the nature of what they were trying to convey.

Daniel Gentle
John Isidoro
The two papers we read this week have to do with representing information, or augmenting provided information, by using 3D graphics. The first paper, "Merging Virtual Objects with the Real World", was very interesting, and definitely seems to be pushing hardware limitations. The first limitation is the ultrasound technique itself. They mentioned how ultrasound suffers from noise and refraction degradation. Noise isn't too bad because it can be eliminated by doing several passes on the same dataset, but refraction can severely deform a 3D object, especially the non-linear refraction of organic tissue. It's probably as difficult as doing motion tracking with a camera using a liquid bubble lens. :^) (I don't think anybody would want to do this anyway.) This accuracy problem seems to be a big problem if these images are used as overlays to perform surgery. If the overlay is incorrect, the doctor will be impeded instead of aided by this technology. The second limitation of this technology is the rendering speed of the ultrasound/3D system. The problem is, as the person moves, the data becomes more and more inaccurate and the accuracy problem stated above appears. A real-time ultrasound/3D system would be nice, but the authors seemed to think that it wasn't really feasible. The ultrasound devices might also get in the way of the doctor, if six different devices are used for each axis (negative and positive). Overall the paper describes a novel system, and some of the other applications of this device talked about in the paper seem to suit this system a little better. The second paper also uses 3D to add additional information to a dataset, but seems to suit a different purpose: how to represent files (or other structured data sets) in a 3D environment. Overall the paper brought up some good points, but the author seemed to neglect one minor point.
This point is that, while navigating in a 3D world, there is the inherent problem of objects being behind the camera in the 3D space. Maybe these objects are meant to be non-visible to the user; it depends on the application. Anyway, I suppose all that would be needed to solve this problem is a rear view "window". :^) :^)

Dave Martin
Merging Virtual Objects with the Real World: Seeing Ultrasound Imagery within the Patient
Bajura, Fuchs, and Ohbuchi

This paper describes the authors' progress towards the following ideal: a system that combines 3D imaging data with natural images for an observer in real time, thereby using the 3D data to supply an additional level of detail in an otherwise natural 3D VR world. The authors give an impressive list of applications in their conclusion, including firefighting in reduced visibility, on-site architectural visualization, and part identification in complicated servicing processes. Another application would be in the targeting computers used to identify the tiny air ducts ("no bigger than a womprat") that turned out to be the Death Star's weakness; the X-wing fighters apparently did not have a state-of-the-art display system! The particular system developed by the authors uses ultrasound imaging to collect the 3D data. By wearing a HMD, an observer can see the ultrasound images combined with the real-world images. As the observer moves, the ultrasound image moves accordingly in the observer's HMD. This seems like a tremendous cognitive advantage for the observer when compared to a traditional ultrasound monitor. There are apparently not many good choices for nonintrusive medical imaging. Having chosen ultrasound as their probing technique, the authors investigated 3D extensions to ultrasound (which is primarily still a 2D process). Section 3 describes a system they developed that incrementally incorporates newly-obtained 2D images into a 3D model. Although this system appears to work well, it is too slow for their VR application: it required 15-20 seconds to render the adjusted 3D model after each new 2D slice arrived. In their VR application, the authors chose to obtain simple 2D image data and render it in its real-world 3D position.
Even this turned out to be a subtler problem than they imagined; their initial attempts showed the data in correct position, but it appeared to be more "pasted on" the patient than within the patient. Adding depth cues to the imaging data helped alleviate this problem, but this introduced unnatural occlusion of visible objects. This is a very good paper. Having described their dream system, they described the state of the art (as they know it) on both the imaging and the rendering ends. The system is complex, and they are not able to give precise details in the space provided--- just listing all of the equipment involved takes half of a page. Instead, they concentrated on the difficulties they encountered. The "Remaining Technical Problems" section is extensive and uninhibited. It would have been very easy for them to omit or reduce this section; if they had done so, the pictures would have suggested that the system was much better than it currently is. Information Management Using Virtual Reality-Based Visualizations Kim Michael Fairchild This chapter of "Virtual Reality: Applications and Explorations" addresses the question of how VR and multimedia can be used to increase the amount of information that users can manage. The author treats this question in three parts: how to visualize an object, how to visualize a collection of objects, and how to allow the user to specify subsets of objects of interest. There is obviously no fixed and generally useful way to visualize arbitrary objects, but the AutoIcon model offers flexibility by providing a framework in which visualizations can be easily constructed. This framework permits the association of various kinds of semantic information with corresponding graphic requests; these associations are called the AutoIcons. Objects can have multiple AutoIcons for visualization in different contexts. 
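The AutoIcon framework described above (semantic information mapped to graphic requests, with multiple visualizations per object chosen by context) can be sketched roughly as follows. The class and method names are invented for illustration; Fairchild's actual framework is not specified at this level of detail.

```python
# Rough sketch of the AutoIcon idea: an object carries semantic values,
# and each AutoIcon maps those values to a graphic request for one
# viewing context. All names here are illustrative assumptions.

class AutoIcon:
    def __init__(self, context, render_fn):
        self.context = context        # e.g. "overview", "close-up"
        self.render_fn = render_fn    # semantic values -> graphic request

class VisualObject:
    def __init__(self, semantics):
        self.semantics = semantics    # dict of semantic attributes
        self.icons = {}

    def add_icon(self, icon):
        self.icons[icon.context] = icon

    def visualize(self, context):
        """Pick the AutoIcon registered for this context and apply it."""
        return self.icons[context].render_fn(self.semantics)

doc = VisualObject({"title": "report", "size": 12})
doc.add_icon(AutoIcon("overview", lambda s: f"box labelled {s['title']}"))
doc.add_icon(AutoIcon("close-up", lambda s: f"{s['title']} ({s['size']} pages)"))
```

The point of the structure is that the same object renders differently in different contexts without the object itself knowing anything about graphics.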
The AutoIcons are an important part of the author's proposed solution to the second question, i.e., how to visualize a collection of objects. The essence of this solution is that each object should have a DOI (degree of interest) value that is computed from two subvalues: semantic distance, measuring the "real" distance between the viewpoint and the object (such as the number of intervening directories between two files in a file system), and a priori importance, measuring the object's intrinsic importance to the user. The object is then rendered in a 3D space at a position computed from the object's DOI. Furthermore, the DOI is used to choose an appropriate AutoIcon for that object. For instance, a graphic object with a large DOI might display as a thumbnail image, but the same object with a small DOI might not be visually distinguished from a text object. In order to define subsets of objects of interest, the user navigates through the virtual space and selects objects. The author proposes "Gesture Sequence Navigation" as a general navigation technique. In this technique, the user knows a limited set of gestures that can be combined into longer meaningful sequences. For instance, typing "cd" is a well-known gesture to return to the home directory. The innovation is that the gesture input devices are imagined to be appropriate for the virtual world, such as an eyeball tracker, and that pausing during a gesture sequence results in explicit visual cues ("stimuli") regarding further options. The author goes on to claim that this design allows users to create new meaningful gesture sequences and to create new visualizations for these stimuli but provides no details. In the final "proposed system" section of this chapter, the author describes an extremely high-level architecture for implementing the system. The rest of the paper describes four existing visualization systems. 
These descriptions are useful, but the clear emphasis in this chapter is on the author's dream system: each existing system is evaluated according to the standards suggested by the author's own unimplemented design. It strikes me as rather irresponsible to use so much of this chapter to describe the design of a nonexistent system. If its purpose is to teach the reader about the state of the art in VR information spaces, then the author should have concentrated on existing systems. Otherwise, I would have liked to read about how the designed system actually worked, not how it is imagined to work.

John Petry
"INFORMATION MANAGEMENT USING VIRTUAL REALITY-BASED VISUALIZATIONS," by Kim Michael Fairchild

This was a relatively interesting paper. The key points of interest to me were the various types of data visualization suggested, and some subsidiary issues such as the need to provide tools that automatically generate visual representations of data for unsophisticated users. The paper was designed to cover information management within a Virtual Reality (VR) environment. To me, the two issues -- data visualization, and navigation using VR methods -- are fairly independent. I think the most interesting parts of the paper are where the author describes visualization approaches or related issues such as the icon editor. VR tools are currently quite modest. No one is going to use VR to look at a representation of his Unix file system. It doesn't add much to a joystick, for instance. VR adds an extra degree or two of freedom and permits better data representation, but I think this is a difference of degree and not of kind. Some of the better points in the data visualization sections were: the importance of using skewed representations such as fish-eye views to increase ease of information presentation, and the point that (preferably automatic) methods for assigning relationships are vital. The cone and wall models from Xerox PARC were very powerful. The cone file system representation was very clear, for instance, and I found it much more powerful than the typical flat Mac/Windows file icon approach, where usually just one level of a directory tree is shown. Some good VR points were made as well, especially the need to allow users to define shortcuts, i.e. to "chunk" command sequences, and the need/ability to simplify an interface so that naive users can take advantage of it without significant training.

Robert Pitts
MERGING VIRTUAL OBJECTS WITH THE REAL WORLD: Seeing Ultrasound Imagery within the Patient
by Bajura, Fuchs and Ohbuchi

This article describes a working system for real-time viewing of 3-D ultrasound information superimposed over the observed object via a head-mounted display. The authors begin by characterizing the limitations of ultrasound data acquisition. They continue by describing previous work in 3-D ultrasound visualization. Two aspects of 3-D ultrasound are mentioned: acquisition and display. For acquisition of 3-D ultrasound data, traditional techniques acquire 3-D information from sets of 2-D information. A particular problem that has been addressed in display is interpolating between 2-D slices to form a 3-D volume. The authors discuss their technique for rendering 3-D information using an incremental approach that limits the amount of information needed for rendering. The technique sounds interesting, but I would have liked a concrete example to understand it better. They discuss constraints that must be satisfied for their technique to work in real time. A hierarchical caching method used in rendering the image is described. I was not sure what type of hierarchy this was. Does it contain layers of low to high detail, so that lower-detail images could be presented if higher-detail rendering was not possible because of time constraints? The imaging system that renders ultrasound images over live images viewed in a head-mounted display is described. The system includes: an image acquisition device, a head-mounted display, a tracking device (for the image acquisition device and head-mounted display), an image generation system, a camera for the real images, and a mixer to mix the real and rendered images. Descriptions of each part of the system are given. Of interest is the technique for combining the real world and rendered ultrasound image (based on the luminance of pixels in the rendered image).
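The luminance-based combination mentioned above can be pictured as a simple key: where the rendered ultrasound pixel is dark, the live camera pixel shows through. The threshold and the hard cut-over below are assumptions for illustration, not the authors' exact mixer behavior, and the luma weights are the standard Rec. 601 coefficients rather than anything specified in the paper.

```python
# Sketch of luminance-keyed mixing of a rendered ultrasound pixel with
# the live camera pixel. The threshold value and the all-or-nothing
# blend rule are illustrative assumptions.

def mix_pixel(rendered, camera, threshold=0.1):
    """rendered, camera: (r, g, b) tuples with components in [0, 1]."""
    r, g, b = rendered
    luminance = 0.299 * r + 0.587 * g + 0.114 * b   # Rec. 601 luma weights
    if luminance < threshold:
        return camera          # dark rendered pixels are keyed out
    return rendered            # ultrasound overlay shows elsewhere
```

A real video mixer would likely cross-fade near the threshold rather than switch abruptly, but the keying principle is the same.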
In addition, the calibration of the system to "match" real world points with the tracking and rendering systems seems an area where automated techniques are needed. Finally, a test case using a pregnant subject is presented with compelling images of what the user would see--ultrasound images displayed over the patient's abdomen. The superimposition of ultrasound images is somewhat flat because of the lack of visual cues suggesting that the image resides inside the patient. They improve this by adding depth cues. Limitations of the system are discussed. They include system lag, tracking accuracy, display resolution and rendering power. I feel that the problems caused by immature technology will hamper development of a practical system until the technology improves. They conclude by suggesting other uses of their techniques in which rendered information is overlaid on real world images. I thought this work was promising because they have an actual working system. In large part, making the system practical requires better hardware devices. Nonetheless, I think there will be opportunities to explore rendering techniques to improve visualization and deal with complexity. INFORMATION MANAGEMENT USING VIRTUAL REALITY-BASED VISUALIZATION by Fairchild This article presents the problem of visualizing large sets of information using virtual reality techniques, outlines the requirements of solving that problem, and presents the author's solution. The author points out three requirements for solving the visualization problem: 1) How to visualize single objects; 2) How to visualize collections of objects; and 3) How users can efficiently control the visualization. He explains how one can take advantage of perceptual tasks that humans perform well to highlight important information and solve (1) and (2) above (this idea is nothing new). His approach uses perspective, depth and detail. 
An interesting idea is the use of hierarchical levels of detail to model a user's "degree of interest" in a particular feature of an object. This approach has the added benefit of reserving more rendering power for objects of interest. The author describes how the user may control a visualization through gestures (from various VR input devices). Although it is easy to see how gestures can move through the 3-D data space, it is not clear that gestures to control other parameters of visualization, such as the features of the information the user is currently interested in, work well. In addition, his description of how "views" of information are created sounds more like a task for a programmer than something a user can perform with gestures. The author gives an example illustrating how his approach works. The example, however, is weak because his figure does not convince me that the interface is intuitive. An illustration of his paradigm on a more common visualization problem would have been better. The author finishes by describing previous work in visualization. Nonetheless, he does a poor job of relating previous work in visualization to his own work. He does make a passing remark that some of the previous models can be implemented using his model, but does not describe how to do so. While he levels criticism at the prior works, he does not discuss how his paradigm fixes these problems. It was also odd for him to discuss previous work at the end of the article. A better approach would have been to present previous work earlier. Since these earlier models were simpler, describing them first would have given a few simple examples to make the idea of visualization concrete in the reader's mind. Instead, the author first makes us wade through the lengthy description of his ideas.
Stan Sclaroff
Created: Mar 13, 1996
Last Modified: Apr 3, 1996