Readings:
This article, "Extracting Shape from Shading," introduces the idea of extracting surface shape from the shading within homogeneous regions of an image. Two methods were developed for this: one called local shape-from-shading and the other global shape-from-shading. The global method combines the local method with boundary conditions to solve the problem. The article says that the local algorithm provides robust estimates of surface shape, while the global algorithm produces accurate estimates of the 3D surface shape.
This article was quite informative; however, it was like a double-edged sword. On one hand the authors give all the math involved in each process, which is helpful but can get slightly confusing, especially if you start to ignore the equations and they are then referred to five pages later. Despite the overall nature of the article, the introduction was not as informative as it should have been; in fact, if I had not read the title before reading the article I might not have had a clue as to what was going on. As the reading continued, though, the ideas became clearer.
It is interesting how the Gaussian seems to be appearing everywhere. I would never have assumed it to be related to shading. The article did not explain what Lambertian reflectance is. It was fortunate for me that my professor showed the class a Lambertian image prior to my reading the article. It is also interesting that for the local algorithm, concepts from the early human visual system are employed. One would assume the model to be more complex. What is an isointensity line?
Many objects have relatively smooth, homogeneous surfaces, and shading is important in perceiving opaque and solid bodies, so we can model images as being composed of homogeneous patches. Existing vision techniques do well at edges but not in the interior of a homogeneous region. Other techniques are needed to recover 3D structure from an image.
This paper reviews several previous works by other researchers, such as Horn's thesis and the work of Oliensis and Dupuis. Horn's idea is to fill in the surface between edges detected by the usual vision techniques. This approach is too general for most real situations. Oliensis and Dupuis's work made the problem feasible by using a conservative minimization process to successively approximate the solution surface, which leads to a stable solution. The authors combine the main idea of Oliensis and Dupuis's method with the simple characteristic-strips method originally used by Horn to produce a stable, accurate, and efficient solution.
There are two classes of algorithms: local algorithms and global algorithms. A local algorithm is similar to human visual processing; it attempts to estimate shape from local variations in image intensity, but it cannot recover metrically accurate estimates of surface shape. Global algorithms attempt to propagate information across a shaded surface starting from points with known surface orientation.
In the paper, the authors use the reflectance map to represent the reflectance properties of a surface. In a small region, the reflectance map can be approximated by a linear function of the partial derivatives. A filter is used to remove noise and nonlinear components of the image to improve the recovery process. This is a local algorithm similar to the human visual system; that is, it recovers surface shape from a filter set. To get a global solution, the authors link adjoining patches together using the reflectance map, integrating the information along the direction of steepest descent on the reflectance map. That is the method of characteristic strips. A conservative "minimum downhill" rule is used to deal with singular-point ambiguity. Both general and discrete implementations are given, and the Lambertian reflectance law is applied in the test cases.
In the experiments, different amounts of noise and erroneous light-source tilt are added to test the sensitivity of the algorithm. However, the resulting images are hard to interpret with the human visual system.
In this paper, Pentland and Bichsel discuss the problem of extracting shape from shading given homogeneous, uniformly lit regions and present two solutions, one for local estimation and one for combining local estimates into global estimates. While these assumptions are usually not accurate for real-world images, this is an important first step in extracting 3D data from an image, as most images can be segmented into regions for which these assumptions are reasonably accurate.
The first method presented, the local one, is initially based on a planar approximation of the reflectance map. It is then transformed into the Fourier domain and eventually an equation for the Fourier transform of the z (depth) data is obtained. This equation is then improved with some further assumptions they mention. While the math seems complex, it is apparently more efficient than the planar form. It also seems that this could be expanded over a much larger homogeneous region, though the increase in calculation would probably not be worth it.
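The Fourier-domain idea can be illustrated with a toy sketch (my own, not the authors' code; the surface, the coefficients k2 and k3, and the image are all made up here): under a linear reflectance model E ≈ k2·∂z/∂x + k3·∂z/∂y, the image spectrum is the depth spectrum times a known transfer function H, so dividing by H wherever it is nonzero recovers the depth.

```python
import numpy as np

# Hypothetical test surface: a single sinusoidal corrugation.
N = 64
y, x = np.mgrid[0:N, 0:N]
z_true = np.cos(2 * np.pi * (3 * x + 2 * y) / N)

# Assumed linear reflectance-map coefficients (e.g. from a Taylor
# expansion about the mean surface gradient); the constant term
# carries no shape information and is dropped.
k2, k3 = 1.0, 0.5

# Transfer function from depth to image for E = k2*dz/dx + k3*dz/dy,
# with spatial frequencies in FFT ordering (cycles per sample).
U, V = np.meshgrid(np.fft.fftfreq(N), np.fft.fftfreq(N))
H = 1j * 2 * np.pi * (k2 * U + k3 * V)

# Synthesize the image spectrally, then invert H to recover depth.
Z = np.fft.fft2(z_true)
E = np.real(np.fft.ifft2(H * Z))

mask = np.abs(H) > 1e-8
Z_rec = np.zeros_like(Z)
Z_rec[mask] = np.fft.fft2(E)[mask] / H[mask]
z_rec = np.real(np.fft.ifft2(Z_rec))

# Recovery is exact except at frequencies where H = 0 (here none of
# the surface's energy lies on that null line).
print(np.max(np.abs(z_rec - z_true)))
```

In a real image the frequencies where H vanishes are genuinely unrecoverable from shading alone, which is one reason the local estimates are only qualitatively correct.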
The second method presented is for the integration of patches obtained from local methods. They build on the "method of characteristic strips" for the extrapolation of the surfaces, but introduce the "minimum downhill" rule to add stability, by only allowing changes to propagate in one direction while also trying to stay in regions of minimum change.
These methods seem to work well and a lot of information towards implementation is given. At some points though, they seem to skip steps in the analysis, especially in the math, but enough clues are given so that someone familiar with the literature and the necessary mathematics could probably reproduce it.
"The problem is defined as follows: Solve the brightness equation R(n(x,y)) = E(x,y), assuming that brightness can be represented by a function R(n) = R(n1,n2,n3)." In other words, to find the correct normal one has to correlate neighboring pixels and find which normal applies to the brightness over this particular surface. The key to the working of the first algorithm is the idea that, in the relevant areas of the reflectance map, the isointensity lines are almost parallel. "So they provide a good way to approximate the reflectance map by a linear function of the partial derivatives (p,q)." The second algorithm is not as simple. It tries to find the correct way of correlating the various patterns by using the method of characteristic strips. The key to making this method work while avoiding the ambiguities that can arise is to follow the "downhill rule". This rule allows one to avoid ambiguities by guaranteeing that there will be a smooth, continuous surface.
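For concreteness, here is a minimal sketch (mine, not the paper's code) of evaluating the right-hand side of the brightness equation in the Lambertian case, where R(n) is just the cosine of the angle between the unit surface normal and the light direction:

```python
import numpy as np

def lambertian_brightness(normal, light):
    """Evaluate R(n) for a Lambertian surface: the cosine of the angle
    between the unit normal n = (n1, n2, n3) and the unit light
    direction, clamped at zero for patches facing away from the light."""
    light = np.asarray(light, dtype=float)
    light = light / np.linalg.norm(light)
    return max(0.0, float(np.dot(normal, light)))

# A patch facing the light exactly has brightness 1; a patch tilted
# 60 degrees away has brightness cos(60 deg) = 0.5.
L = np.array([0.0, 0.0, 1.0])
n_facing = np.array([0.0, 0.0, 1.0])
n_tilted = np.array([np.sin(np.pi / 3), 0.0, np.cos(np.pi / 3)])
print(lambertian_brightness(n_facing, L))   # → 1.0
print(lambertian_brightness(n_tilted, L))   # ≈ 0.5
```

The ambiguity the paper wrestles with is visible here: many different normals share the same cosine with L, so a single brightness value does not pin down the orientation.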
Both of these algorithms are good approximations of the shape of Lambertian surfaces. The strength of both of them is their ease of use and general simplicity. The code given in appendix B is fairly clear after working through the point notation. The major question that I have is why the experiments rely so heavily upon the Lambertian reflectance law. Granted, the assumption of ambient light is a good one, but it seems very restrictive to stress only this surface type. The one example given with a shiny surface seemed to work relatively well, and more experiments like this would have helped to more fully demonstrate the robustness of this technique.
The goal of shape-from-shading is to extract information about the shape and 3D orientation of homogeneous surfaces in images. Because the surfaces are smooth, the only visual cues to the shape of the surface in the image are given by the intensity of the light reflected from the surface. This article presents techniques for finding the shape of a surface based on the variations in intensity within the surface in a given image. Two general approaches are described. Local algorithms extract information about the surface shape based on variations in intensity within a small neighborhood. Global algorithms, starting from a point with a known orientation, sweep out the shape of the surface by finding the orientation of any one area based on the orientation of the area next to it. It is noted that the local method is thought to be the best model of the shape-from-shading process that takes place in the human visual system.
The article describes both a local and a global procedure for finding shape from shading. The mathematics of each approach are described in detail. Both are shown to have performed well in the experiments that were done.
The method discussed in the article seems like a reasonably good way to recover three-dimensional information from a 2D image by looking at the shading. Although the algorithm seems like it could be useful, it is also very limited in the sorts of images that it can interpret.
The method assumes that the object is a uniform color and does not produce any specular highlights. It assumes that the object is completely matte. This would be a problem if you were using it to reconstruct real world objects. In a controlled environment where you are trying to produce a 3D description of a clay model this algorithm should work well. If it were trying to create 3D information about something like a face or, even worse, a car it would fail.
The algorithm definitely has its uses, but it is by no means a universal way of creating 3D information from an image. I don't think the authors claim that it is.
Extracting the shape of a surface from an intensity
image is a very important, yet difficult problem in
machine vision.
The authors Alex Pentland and Martin Bichsel distinguish
two general approaches which have been used to solve the problem:
local algorithms and global algorithms.
Two specific techniques are outlined, the first taking
the local approach, and another using global methods.
1. Small regions of the reflectance map (gradient space)
are considered, so the isointensity lines can be approximated
as linear.
Multiplying the Fourier spectrum of this approximated region
by the inverse of the transfer function H gives the approximate
shape of the surface.
Some improvements are made, one of which is to
use Wiener filtering for noise removal.
Several examples show that the technique performs well
for a nice Lambertian surface, but is less accurate when
surfaces are shiny, or specular.
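The Wiener-filtering step mentioned above can be sketched in a toy 1D setting (entirely made up here; the signal, noise level, and spectra are my own assumptions, not the paper's): each frequency of a noisy observation is attenuated by S/(S + W), the ratio of signal power to signal-plus-noise power.

```python
import numpy as np

rng = np.random.default_rng(0)

# A made-up smooth "shading" signal plus white noise.
N = 256
t = np.arange(N)
signal = np.sin(2 * np.pi * 3 * t / N) + 0.5 * np.cos(2 * np.pi * 7 * t / N)
observed = signal + 0.4 * rng.standard_normal(N)

# Wiener filter in the Fourier domain: attenuate each frequency by
# S / (S + W), where S is the signal power spectrum and W the noise
# power per bin.  Both spectra are assumed known here -- the idealized
# textbook setting.
S = np.abs(np.fft.fft(signal)) ** 2
W = N * 0.4**2          # expected white-noise power per frequency bin
gain = S / (S + W)
denoised = np.real(np.fft.ifft(gain * np.fft.fft(observed)))

err_noisy = np.mean((observed - signal) ** 2)
err_wiener = np.mean((denoised - signal) ** 2)
print(err_wiener < err_noisy)  # → True: filtering moves us toward the signal
```

In practice the signal spectrum is not known exactly, so a smooth model of it is substituted; the filter then trades a little bias for a large reduction in noise variance.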
2. The second technique is a modification to
Horn's method of characteristic strips. Two
rules are added to improve performance. For each point,
the line with the steepest slope passing through is selected
as the path for integration.
The downhill rule says that the direction of this line should be
chosen as moving away from the light (decreasing intensity) to
integrate along.
The minimum distance rule says to pick the height with
minimal distance to the light source for all angles. Doing so
will ensure the algorithm converges.
Examples are shown demonstrating that the algorithm is fairly robust
to noise, and its results are accurate.
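The propagation idea can be illustrated with a toy 1D sketch of my own (not the authors' 2D algorithm): with the light directly overhead, a Lambertian image fixes only the magnitude of the surface slope, and a downhill rule resolves the sign ambiguity by always descending as we move away from the brightest (singular) point.

```python
import numpy as np

# With the light overhead, a 1D Lambertian image is
# E(x) = 1 / sqrt(1 + p(x)^2), which fixes the slope magnitude |p|
# but not its sign -- the ambiguity the downhill rule resolves.
N = 101
x = np.linspace(-1.0, 1.0, N)
z_true = -x**2                        # a hill, brightest at its top (x = 0)
p_true = np.gradient(z_true, x)
E = 1.0 / np.sqrt(1.0 + p_true**2)    # synthetic image

i0 = int(np.argmax(E))                # singular point: surface faces the light
p_mag = np.sqrt(np.maximum(1.0 / E**2 - 1.0, 0.0))

# Propagate heights away from the singular point, always choosing the
# sign of the slope that moves downhill.
z = np.zeros(N)
dx = x[1] - x[0]
for i in range(i0 + 1, N):            # rightward sweep
    z[i] = z[i - 1] - p_mag[i] * dx
for i in range(i0 - 1, -1, -1):       # leftward sweep
    z[i] = z[i + 1] - p_mag[i] * dx

# The recovered profile matches the true hill up to first-order
# integration error (about 0.02 at this resolution).
print(np.max(np.abs(z - z_true)))
```

The 2D case is harder precisely because the sweeps can disagree, which is what the authors' minimum-distance bookkeeping is there to prevent.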
The experimental results were quite impressive. I liked the
fact that they provided code in the appendix, and the discussion
on implementation for the global algorithm. The comparison of
local methods with the human visual system was interesting.
However, this technique is very limited, as it can only be applied to solid-color Lambertian surfaces. I suppose if texture information and the Phong illumination model were provided for a particular surface, this technique could perhaps be applied to textured, shiny surfaces.
It's very nice how they provide code in their paper; it shows that they have nothing to hide and that their paper isn't purely theoretical, with specially tweaked examples to make it seem like it works.
Authors A. P. Pentland and M. Bichsel discuss two general approaches to recovering the shape of an object in an image from variations in image intensity. The two classes of algorithms presented are local algorithms and global algorithms.
The local algorithms concentrate on 'small' patches of an image. If the patch is sufficiently small, the computations become linear and thereby relatively simple to compute. However, the authors note that although local algorithms produce good qualitative estimates of shape, they do not recover metrically accurate estimates of surface shape. As an example, consider the recovered images of the nickels shown in the paper. We definitely see success in the fact that, even without access to the original image, we could still identify the recovered image; but notice the distortions in the recovered shape. If we had never seen a nickel before, we would not know that its shape should be a circle; we would assume that an oval was correct. Therefore, it appears that local algorithms require fundamental knowledge of the shape being recovered.
Global shape-from-shading is accomplished quite differently. Instead of concentrating on small patches of image intensity variation, the focus is on 'solving' the global intensity variation equation. The algorithm is rather clever in that it begins at one point (area) in the image and 'grows' outward from there. This is accomplished by choosing a step size and increasing it on each iteration through the algorithm. The recovered image is quite accurate in its representation of the actual 3D shape; the Elvis example in the paper demonstrates this. However, actual surface properties are not as successfully recovered as they were using local algorithms.
I will conjecture that global algorithms would be useful in applications where you are searching for something in an image, while local algorithms would be more useful if you are actually interested in the details of what you are searching for. It appears that the two algorithms are quite complementary; that is, use a global algorithm to locate something and then a local algorithm to identify its properties.
In modeling an image as a mosaic of homogeneous patches, recovering 3D structure based only on the most familiar vision techniques, such as stereo and motion analysis, is difficult. By employing the shading information from the varying intensity of a single homogeneous patch, not only the depth information at the edges but also the 3D structure of the image can be extracted.
Two classes of such algorithms are local and global:
a local algorithm estimates shape from local variations in image
intensity by use of linear filters and point nonlinearities, and so
provides robust estimates. A global algorithm propagates information
across a shaded surface starting from points with known surface
orientation, using boundary conditions and local shading information
together, and thus produces more accurate estimates.
Obtaining a local solution requires first approximating the
reflectance map by a linear function of the partial derivatives in
surface-orientation space. The surface shape can then be estimated
in closed form with the inverse transfer function. Wiener filtering
can be used to improve the recovery process. To obtain a global
solution, the method of characteristic strips can be used, together
with regularization methods that help produce a stable solution. In
the integration step, a minimum downhill rule is applied to avoid the
problem of singular-point ambiguity, and the algorithm is proven to
converge to a unique, correct surface.
It is interesting that this method is based on the intensity information produced by illuminants over an infinitesimal area of an image. Beyond the results shown in the book, some curious questions come up about the lighting conditions needed to get an optimal 3D structure from an image: how bright should the illuminant be, from which direction should the light come, and so on. There may be no universal answer to these questions; otherwise, exhaustive simulation could be performed.
This paper, by Alex P. Pentland and Martin Bichsel, describes algorithms that can extract shape from the variations in intensity that exist within a single homogeneous patch. The explanation of the mathematical foundation for the algorithms was very brief and lacked the details necessary to understand it at all. The examples and images provided by the authors did not help me to understand the methods used to obtain them. Basically, I found this paper very confusing and very difficult to read.
Images are composed of homogeneous patches. Depth information from stereo, motion, or focus is provided at the patch edges. Therefore, extracting shape from shading is a very important machine-vision problem. Alex P. Pentland and Martin Bichsel discuss their algorithm for extracting shape from the variations in intensity that exist within a single homogeneous patch in chapter six of 'Handbook of Pattern Recognition and Image Processing: Computer Vision' (Academic Press, 1994, pp. 161-183).
Two solutions were proposed in this chapter. The first solution uses linear filters and point nonlinearities to obtain a local estimate of shape. The second solution links boundary conditions and local shading information together to obtain a global solution.
In their discussion of the first solution, they link it to biological mechanisms. They specifically point out that the early stages of the human visual system can be regarded as composed of filters tuned to orientation, spatial frequency, and phase. They illustrate this linkage with three examples.
The article by Pentland and Bichsel describes two methods for retrieving the shape of a surface from a gray-scale image. The methods can be thought of as a bottom-up and a top-down approach; the authors call them local and global, respectively. The article contends that both methods are feasible and yield similar results, except for one ambiguity which I did not understand. They state that the local algorithm generates shape robustly, while the global algorithm generates 3D shape accurately. I did not understand why they didn't use the same terms for the comparison.
The local algorithm takes its inspiration from biological models which separate the incoming image into its component frequencies. These data are independent of each other and as such are calculated in parallel. The results are then normalized and phase-shifted so that all neighbors talk the same "language", and local shapes are estimated from this process. The biological model then shifts attention to different areas of the image to obtain a full 3D understanding.
The global algorithm is more synthetic and has stability problems which must be addressed by sophisticated ambiguity-resolution logic. The authors give basic code which contains this logic and also point out that one of the earlier approaches to the problem used this technique. The basic idea is that the shape of a surface can be estimated if we know the orientation of a particular patch on it. Given the orientation of the starting point, the image's illumination angle is used to plot a gradient line. The direction of the line is determined by using a "minimum downhill" hill-climbing algorithm. The authors go on to describe what happens in the computer world, where you have a discrete grid but still need to look at the neighbors.
The algorithms are deficient in dealing with multiple light sources and surfaces with mixed reflectivity properties. The objects are also assumed to be the same color all around. Multi-colored objects would throw off the global logic; the local logic might have hope in dealing with them.
Although many computer vision techniques are oriented toward analyzing images as a series of homogeneous texture patches, the three-dimensionality of the real world makes this approach work poorly in the interior of these homogeneous regions. This is why extraction of shape by considering image shading (or intensity) is likely to be a more useful model for recognition. Our perception makes us think of the real world as a mosaic of relatively smooth and homogeneous surfaces.
This work is based on the extraction of shape from variations in intensity that can exist within a single homogeneous patch.
Many assumptions have been made to simplify this very complicated problem; for example, the depth and orientation of some points have to be known (the brightest or the edge points), and there is no consideration of smooth variations in surface color, nearby illuminants, reflection from other surfaces, etc.
The main idea: this work defines the brightness E(x,y) at the image plane as dependent only on the orientation of the surface; to this end, it specifically assumes that surface patches are homogeneous and uniformly lit by distant light sources. The brightness is then represented as a function R(n) = R(n1,n2,n3).
In this way, the shape-from-shading problem is considered as the solution of the brightness equation R(n(x,y)) = E(x,y).
According to this work, there is a problem: an infinite number of normals (n1,n2,n3) satisfy this equation. Another assumption is made: we cannot observe brightness directly, so a measurement of image-plane intensity is used and considered a good approximation.
For sufficiently small image patches, the isointensity lines of the reflectance map can be considered parallel. This means that an approximation of the reflectance is possible using a linear function of the partial derivatives,
that is, E(x,y) = k1 + k2*p + k3*q. (Refer to the article.)
An approximation specific to the Lambertian reflectance function, for which the ki are known, is then given.
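The Lambertian case can be made concrete with a small sketch (a toy example of mine, with an assumed light direction, not the paper's numbers): Taylor-expanding the exact Lambertian reflectance map about (p,q) = (0,0) yields the linear coefficients k1, k2, k3 directly.

```python
import numpy as np

def lambertian(p, q, light):
    """Exact Lambertian reflectance map R(p, q): the surface normal for
    gradient (p, q) is (-p, -q, 1) normalized, dotted with the light."""
    lx, ly, lz = light
    return (-lx * p - ly * q + lz) / np.sqrt(1.0 + p**2 + q**2)

# An assumed light direction (unit vector), tilted a little off the
# viewing axis -- purely an example, not a value from the paper.
light = np.array([0.2, 0.1, 1.0])
light = light / np.linalg.norm(light)
lx, ly, lz = light

# Taylor expansion about (p, q) = (0, 0): the normalization factor has
# zero first derivative there, so E ~ k1 + k2*p + k3*q with
k1, k2, k3 = lz, -lx, -ly

# For small gradients the linear model tracks the exact map closely.
p, q = 0.05, -0.03
exact = lambertian(p, q, light)
approx = k1 + k2 * p + k3 * q
print(abs(exact - approx))  # small: only the neglected quadratic terms remain
```

The parallel-isointensity-lines observation above is the same statement geometrically: a linear function of (p, q) has straight, parallel level sets in gradient space.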
This approach was tested on an image with a very linear reflectance function, so, as we would expect, the resulting image gives an accurate impression of the real surface shape. A second image, from a more shiny metal surface (same object, different reflectance), was used and the recovery was somewhat less accurate; however, the differences were very small. A third, higher-detail image was used and the recovered surface was generally correct.
All these experiments were done over a sufficiently small area of the image. To obtain a global solution to the brightness equation, adjoining patches are linked together, i.e., boundary conditions and local shading information are combined.
For global recovery, a technique that grows the solution by integrating the information along the direction of steepest descent on the reflectance map was used: the method of characteristic strips.
This work tries to solve a very complicated problem, characterized by the need to control more variables and environmental distortions than we can actually represent in a model. Because of this, many assumptions are made to build it. The result is a model that may be very helpful under controlled situations, but weak in the unconstrained real-world environment.
We can expect that when these assumptions hold in the image, the method will provide a solution that can be considered a good recovery of the actual surface. This is what the experiments showed. In the case of the global algorithm, a good estimate of the 3D surface shape is provided, although in some cases, such as the Elvis bust, some deviations from the true surface (around the edge of the nose) appear, according to this work due to self-illumination of the surface.
Stan Sclaroff
Created: Sep 26, 1996
Last Modified: Nov 1, 1996