BU GRS CS 680
Graduate Introduction to Computer Graphics


Readings for April 16, 1997

  1. P. Debevec, C. Taylor, and J. Malik. Modeling and rendering architecture from photographs: a hybrid geometry- and image-based approach. In Computer Graphics Proceedings, ACM SIGGRAPH, pages 11--20, 1996.

  2. S. E. Chen. QuickTime VR -- an image-based approach to virtual environment navigation. In Computer Graphics Proceedings, ACM SIGGRAPH, pages 29--38, 1995.

Participants


Commentary

Alia Atlas

Timothy Frangioso

"Modeling and Rendering Architecture from Photographs: A hybrid geometry- and image-based approach" by Paul E. Debevec, Camillo J. Taylor, and Jitendra Malik

This technique combines geometry-based and image-based components to produce realistic renderings of architecture. Photogrammetric modeling captures the basic geometry, and a model-based stereo algorithm captures the differences between the scene and the model.

The goal of this paper was to create a process for modeling architectural scenes that is more convenient, more accurate, and more photorealistic than current techniques allow. Three new modeling and rendering techniques are introduced: photogrammetric modeling, view-dependent texture mapping, and model-based stereo.

Photogrammetric modeling uses the photographs to recover the basic geometry. In the paper this was implemented in an application called Facade, which stores the geometric primitives created by the user as polyhedral blocks. The reason for this primitive representation is to reduce the number of parameters that the reconstruction algorithm needs to recover. The reconstruction algorithm works by minimizing the difference between the edges of the model and the edges of the image that were marked in Facade.
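The edge-based minimization described above can be sketched numerically. The toy example below is my own illustration, not Facade's actual solver: the box geometry, the focal length, and the one-parameter bisection search are all assumed for the sake of the example. It recovers a single block parameter (a height) by minimizing the squared distance between projected model-edge endpoints and user-marked image edges:

```python
# Hypothetical sketch (not the Facade implementation): recover one block
# parameter -- the height h of a box -- by minimizing the squared distance
# between projected model edges and user-marked image edges.

def project(point3d, focal=1.0):
    """Pinhole projection onto the image plane (camera at origin, looking +z)."""
    x, y, z = point3d
    return (focal * x / z, focal * y / z)

def edge_error(h, marked_edges):
    """Sum of squared distances between projected model-edge endpoints
    (for a box of height h) and the corresponding marked image edges."""
    # Model: a vertical box edge from (1, 0, 5) up to (1, h, 5).
    model_edges = [((1.0, 0.0, 5.0), (1.0, h, 5.0))]
    err = 0.0
    for (p0, p1), (m0, m1) in zip(model_edges, marked_edges):
        for p, m in ((p0, m0), (p1, m1)):
            u, v = project(p)
            err += (u - m[0]) ** 2 + (v - m[1]) ** 2
    return err

def recover_height(marked_edges, lo=0.0, hi=10.0, steps=60):
    """Bisection search for the h minimizing the edge error."""
    for _ in range(steps):
        mid = (lo + hi) / 2
        if edge_error(mid - 1e-4, marked_edges) < edge_error(mid + 1e-4, marked_edges):
            hi = mid       # error is increasing at mid; minimum lies to the left
        else:
            lo = mid
    return (lo + hi) / 2

# The user "marked" this edge in the photo; it is the projection of the
# edge (1,0,5)-(1,3,5), so the true height is 3.
marked = [((0.2, 0.0), (0.2, 0.6))]
h = recover_height(marked)
```

Facade solves a much larger nonlinear problem over many block parameters and camera poses at once; the point here is only the structure of the objective, model edges reprojected against marked image edges.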

View-dependent texture mapping projects the photographs onto the approximated model. This technique was found to work well when the model conforms closely to the structure of the scene and when the photographs used for the projection were taken under similar lighting conditions.

Model-based stereo differs from traditional stereo in that it measures how the actual scene deviates from the approximated model, rather than trying to measure the structure of the scene without any prior information. By warping the offset image of the scene and computing stereo from the key image and the warped offset image, the technique provides an additional level of detail.

"QuickTime VR -- An Image-Based Approach to Virtual Environment Navigation" by Shenchang Eric Chen

This technique's objective is to create an available, extensible, and high-quality system for creating and navigating through virtual environments. Using an image-based approach solves the problems that come from the real-time constraints associated with other virtual reality systems.

This simplification leads to better image quality because the images can be rendered beforehand and stored on disk. When the images are called for, they don't have to be calculated on the fly; they can just be played back with the viewer. The QuickTime VR system allows for the simulation of all of a camera's six degrees of freedom. These are broken up into camera rotation and object rotation.

The system uses environment maps to accomplish camera rotation. It allows users to move through an interactive environment by stitching pieces of images together and including special objects of interest called hot spots. There are two types of movies: panoramic and object. The creation process for a panoramic movie consists of five steps: selecting nodes to link together, stitching images together to make a panorama, marking any hot spots, manually registering the viewing directions, and compressing all the components together to make a movie. An object movie is created with a special device called an object maker, which allows the object to be filmed from all views.
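The run-time warping behind panoramic playback can be sketched roughly as follows. This is a toy nearest-neighbour version; the function names and the pinhole-to-cylinder mapping details are my assumptions, not the QuickTime VR implementation. Each column of the planar view is mapped back to a column of the 360-degree cylindrical panorama:

```python
import math

# Toy cylindrical warp: render one scanline of a planar (pinhole) view
# with a given pan angle and field of view from a 360-degree panorama row.

def screen_to_pano_column(col, width, pan, fov, pano_width):
    """Map a screen column to the panorama column seen in that direction."""
    half = width / 2.0
    focal = half / math.tan(fov / 2.0)
    # Angle of this column relative to straight ahead, plus the pan angle.
    theta = pan + math.atan((col - half) / focal)
    theta %= 2.0 * math.pi                      # wrap into [0, 2*pi)
    return int(theta / (2.0 * math.pi) * pano_width) % pano_width

def render_row(pano_row, width, pan, fov):
    """Resample one panorama scanline into a planar view (nearest neighbour)."""
    pw = len(pano_row)
    return [pano_row[screen_to_pano_column(c, width, pan, fov, pw)]
            for c in range(width)]

# A panorama row where each pixel stores its own angle in degrees.
pano = list(range(360))
view = render_row(pano, 90, pan=0.0, fov=math.pi / 2)
```

Panning is then just a change of `pan` with no re-rendering of geometry, which is why playback reduces to real-time image warping.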


Scott Harrison

Leslie Kuczynski

"Modeling and Rendering Architecture from Photographs: A hybrid geometry- and image-based approach", by Paul E. Debevec, Camillo J. Taylor, Jitendra Malik

In this paper, the authors present a method for modeling and rendering architectural scenes from a "sparse" set of photographs. Their modeling approach uses a combination of geometry-based and image-based techniques and consists of two components: (1) Facade, an interactive modeling program that allows a user to construct a geometric model of a scene from existing photographs, and (2) a model-based stereo algorithm, which differs from traditional stereo in that it measures how the model deviates from the actual scene. The built model is used to give context to the images used in reconstruction so that correspondence between the images becomes an easier problem to solve. The authors render the final scene using a method termed "view-dependent texture mapping". This method maps multiple views of a scene onto the final scene, producing more realistic results.

Some problems discussed and directions for future work were: (1) surfaces of revolution (e.g., domes) are not recovered in Facade, which uses "blocks" to facilitate building of a scene; (2) material properties of the scene are not recognized, thus rendering the scene under different lighting conditions (i.e., times of day) is not possible without redoing the entire process on a different set of initial images; and (3) the selection of images for rendering (view-dependent texture mapping) is not well defined and results in seams in the final scene.

I found that the ideas presented and the paper itself were well structured. I believe that this is due in part to the fact that the problem was constrained to handle architectural scenes. This decision helped limit the scope of the problem.


Geoffry Meek
Modeling and Rendering Architecture from Photographs: A hybrid geometry- and image-based approach

Debevec, Taylor, Malik

The authors' system for constructing 3D architecture from photographs has three main steps:

photogrammetric modeling

In this step the points of a photograph are hand-mapped to a collection of user-specified 3D primitives. This is a human step, so, fortunately, the authors paid a lot of attention to the user interface and created a point-and-click GUI program they call Facade to help with this step.

view-dependent texture mapping

In this step a virtual view is constructed using weighted averages from two views (photos) of the same pixels in order to accurately map textures.
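The weighted averaging can be sketched as follows. The inverse-angular-distance weighting here is an assumption chosen for illustration; the paper's actual per-pixel fitness functions are more involved. The idea is simply that the photograph whose viewing direction is closest to the virtual view gets the most say:

```python
import math

# Toy sketch of view-dependent texture mapping: blend the colour seen by
# each camera, favouring the photo whose viewing direction is closest to
# the virtual viewing direction. (Weighting scheme is assumed, not the
# paper's exact formulation.)

def blend_views(virtual_angle, photos):
    """photos: list of (view_angle, colour). Returns the blended colour,
    with weights falling off with angular distance from the virtual view."""
    weights = []
    for angle, _ in photos:
        d = abs(virtual_angle - angle)
        weights.append(1.0 / (d + 1e-6))     # closer view -> larger weight
    total = sum(weights)
    return sum(w * c for w, (_, c) in zip(weights, photos)) / total

# Two photos of the same surface point, taken 90 degrees apart.
photos = [(0.0, 100.0), (math.pi / 2, 200.0)]
mid_colour = blend_views(math.pi / 4, photos)   # halfway between the views
```

As the virtual camera moves toward one photograph's viewpoint, the blend smoothly approaches that photograph's colour, which is what hides the model's geometric approximations.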

model-based stereo

In the authors' model-based stereo method, two views are not tested directly for disparity; rather, the key image is left alone and the offset image is warped into the same perspective as the key image. The resulting stereo disparity information is very good.

In general, it seems that the authors took the best pieces of all the different methods and pieced them together into an accurate architectural rendering system. It seems that it is still labor intensive and would require a lot of training to use this system. I guess I question the utility of rendering the architecture in the first place. There is a lot of information that is "created" in this method, which always seems like a bad idea to me, but I guess it works in some cases.


Romer Rosales

QuickTime VR -- An Image-Based Approach to Virtual Environment Navigation

Shenchang Eric Chen
(Article Review)

This is an alternative approach to 3D environment navigation which uses 360-degree cylindrical panoramic images to compose a virtual environment.

It doesn't require laborious modeling or special-purpose hardware for rendering 3D geometrical entities, which require limiting scene complexity in order to guarantee a minimum level of quality.

Some other approaches use branching movies, which have limited navigability and interaction and require a lot of storage space, although they do not require 3D modeling and rendering.

In this approach, the 360-degree image is warped at run-time in order to simulate camera zooming and panning. Different viewing positions and orientations are required, so it is necessary to have six degrees of freedom.

Panoramas can be created with specialized panoramic cameras or by composing overlapping photographs.

This is an image-based approach that uses real-time image processing to generate 3D perspective viewing effects. According to the author, it is similar to the movie-based approach, but the movies are replaced with orientation-independent images called panoramas. They are rotation independent because they contain all the information needed to look around 360 degrees; because of this, the movie player has to be a real-time image processor. When connected, these images form a walkthrough sequence.

The paper describes how camera rotation, camera movement, zooming, and object rotation can be computed.

For camera rotation, pitch and yaw can be obtained by reprojection of an environment map (a projection of a scene onto a simple shape). For rotating an object, it is necessary to have all allowable orientations of the object and to use interpolation if we want the rotation angle to be arbitrary. For camera movement, changes in view direction can be obtained using an environment map. Changing the viewpoint is harder, and various solutions are presented; the view interpolation method generates new views from a coarse grid of environment maps, interpolating the nearby environment maps at each point to generate a smoother path. Changing the field of view is like zooming: no more detail is obtained, and aliasing problems are generated. A multiple-resolution image can be one solution, although I think it requires a lot of storage.
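The view interpolation over a coarse grid of environment maps might be sketched like this. It is a deliberately simplified 1-D blend; the grid layout and linear weighting are my assumptions, not Chen's algorithm, which also handles the pixel correspondences between maps:

```python
# Minimal sketch of view interpolation: synthesise the environment map at
# an intermediate camera position by blending the two nearest maps on the
# coarse grid, weighted by distance along the path. (Assumed scheme.)

def interpolate_maps(pos, grid):
    """grid: sorted list of (position, env_map), env_map a list of values.
    Returns the blended map for a camera at pos between two grid nodes."""
    for (p0, m0), (p1, m1) in zip(grid, grid[1:]):
        if p0 <= pos <= p1:
            t = (pos - p0) / (p1 - p0)          # fraction of the way to node 1
            return [(1 - t) * a + t * b for a, b in zip(m0, m1)]
    raise ValueError("position outside the grid")

# Two environment maps captured one unit apart along a corridor.
grid = [(0.0, [0.0, 0.0]), (1.0, [10.0, 20.0])]
mid_map = interpolate_maps(0.5, grid)
```

The appeal of the scheme is that only the sparse grid of maps is stored, while intermediate viewpoints are synthesized at playback time.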

This work has a lot of useful and interesting applications, mainly related to the exploration of scenes. It still requires a good set of input data, especially if more quality is required. Benefits include an increase in speed without requiring special-purpose hardware and the fact that all possible view orientations can be obtained. It also provides the appropriate level of detail for the objects in the scene, and the display speed is independent of image complexity. Some limitations exist, such as the problem of looking straight up or down and the requirement that the scene be static, but in general I think that the idea of using cylindrical panoramic images is a good and 'efficient' solution (for now) to this problem. The rotation-independent images allow for a greater degree of freedom in navigation.

The problem in general with image-based approaches is that we may need a 3D description of the environment, and images do not provide that directly (at least not without special processing).

The system also includes viewing of objects from different directions and hit-testing through orientation-independent hot spots.

Modeling and Rendering Architecture from Photographs: A hybrid geometry- and image-based approach

Debevec, Taylor, Malik
(Article Review)

This paper presents an approach for modeling and rendering architectural scenes from a set of images (still photographs). It combines geometry- and image-based techniques.

They use a method to recover the basic geometry of the scene that exploits the constraints common in architectural scenes. A model-based stereo algorithm is also used to recover how the real scene deviates from the simplified model. According to the authors, this technique recovers depth using widely-spaced image pairs.

For the rendering process they use view-dependent texture mapping, which is a technique for compositing multiple views of a scene that better simulates geometric detail on basic models.

Techniques for modeling the appearance of the real world are divided into geometry-based and image-based methods. The first requires a lot of work, is not realistic looking, and cannot be verified easily. The second produces photorealistic images by relying on computational stereopsis to automatically determine the structure of the scene from the available photographs. But there are problems related to the necessity of having very similar pictures, needing user input to pair features in images, or needing a large number of images.

Their technique uses both approaches: it requires only a sparse set of photographs and can produce realistic renderings from arbitrary viewpoints. A basic geometric model is recovered interactively (with a photogrammetric modeling system), new views are created using view-dependent texture mapping, and additional geometric detail can be recovered automatically using stereo correspondence.

These are three new modeling and rendering techniques. Basically they interact in the following way: the photogrammetric modeling algorithm creates a basic volumetric model of the scene, which is used to constrain stereo matching, and information from multiple images is composited with view-dependent texture mapping, which is like projecting the original photographs onto the model.

This approach can be very useful. It doesn't require surveying equipment or laborious manual modeling of the architecture, just photographs, even though it cannot be fully automatic; it requires user input (it seems like it may be a lot). Another advantage is that they can model the scene with fewer photographs than current image-based approaches. The approach seems to produce good photorealistic scenes. They mention some improvements that could be made.


Lavanya Viswanathan

1) P. Debevec, C. Taylor, and J. Malik. Modeling and rendering architecture from photographs: a hybrid geometry- and image-based approach. In Computer Graphics Proceedings, ACM SIGGRAPH, pages 11--20, 1996.

This paper discusses an approach for modeling and rendering existing architectural scenes from a sparse set of still photographs. This is done in two steps. The first, called photogrammetric modeling, involves the recovery of the basic geometry of the scene. (An obvious issue to consider here is how to convert a 2D photograph into 3D positions in the world, which is a one-to-many mapping: given a point in 2D image space where an object is known to be present, its actual location in 3D space could be any point on the ray connecting the camera to that object.) The second, the model-based stereo algorithm, recovers how the real scene differs from the basic model. One problem with the method proposed here is that it does not recover components of architecture that are surfaces of revolution, which comprise a large class including domes, columns, and minarets.
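The ray ambiguity mentioned above is easy to make concrete: a pixel fixes only a viewing ray through the camera center, and it is the model geometry that picks out a single point on that ray. In the sketch below (my own illustration; the wall plane z = z_wall and the unit focal length are assumed, not from the paper) the ambiguity is resolved by intersecting the ray with a hypothesized wall plane:

```python
# A pixel constrains a 3-D point only up to a ray through the camera
# centre; intersecting that ray with model geometry (here an assumed
# wall plane z = z_wall) yields the unique consistent 3-D position.

def pixel_to_ray(u, v, focal=1.0):
    """Direction of the viewing ray for image point (u, v), camera at origin."""
    return (u / focal, v / focal, 1.0)

def intersect_with_plane(ray, z_wall):
    """Point on the ray whose depth equals z_wall."""
    dx, dy, dz = ray
    t = z_wall / dz
    return (t * dx, t * dy, t * dz)

# The pixel (0.2, 0.6) could be any point on its ray; a wall at depth 5
# resolves it to a single world point.
ray = pixel_to_ray(0.2, 0.6)
point = intersect_with_plane(ray, 5.0)
```

This is exactly why the recovered block model makes the correspondence and depth problems in the paper tractable: it supplies the missing constraint along each viewing ray.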

2) S. E. Chen. QuickTime VR -- an image-based approach to virtual environment navigation. In Computer Graphics Proceedings, ACM SIGGRAPH, pages 29--38, 1995.

This paper contains a very nice introduction to some of the methods used in creating walk-through virtual reality environments, such as branching movies. Some key aspects like camera rotation, object rotation, camera movement and camera zooming are discussed. This paper also introduces a new approach that composes a virtual environment from 360 degree cylindrical panoramic images. These images allow the user to look at any spot in the scene in a 360-degree horizontal sweep and a 180-degree vertical sweep. One interesting technique discussed in this paper is "stitching" in which overlapping shots taken by a camera in a 360-degree horizontal sweep are combined to form one continuous image of the entire sweep. However, this technique assumes that the image is stationary and would produce weird artifacts if there were a moving train or car, say, in the foreground. It would be interesting to see if an extension to this technique could be found that would also be able to take motion into account.


Stan Sclaroff
Created: Jan 21, 1997
Last Modified: Jan 30, 1997