BU CAS CS 585: Image and Video Computing --- Class commentary on articles

Applications of Pyramid Image Representations
September 26, 1996

Readings:

P. Burt and E. Adelson, A multiresolution spline with application to image mosaics, ACM Trans. Graphics 2, 1983, 217-236.
P. Burt and E. Adelson, The Laplacian pyramid as a compact image code, IEEE Trans. Comm. 31, 1983, 532-540.

Participants:

Judd Bourgeois
Kriss Bryan
Bin Chen
Jeffrey Considine
Cameron Fordyce
Timothy Frangioso
Jason Golubock
Jeremy Green
Daniel Gutchess
John Isidoro
Tong Jin
Leslie Kuczynski
Hyun Young Lee
Ilya Levin
Yong Liu
Nagendra Mishr
Romer Rosales
Natasha Tatarchuk
Leonid Taycher
Alex Vlachos

The reading "A Multiresolution Spline with Application to Image Mosaics" deals with techniques for using splining to create better images. These images range from an actual image to a computer-generated image of an actual image.

This technique of splining is quite interesting to me because it seems to have many practical uses, ranging from special effects in films to video games and virtual reality simulators. This is illustrated in figure 8, with the eye in the hand, and in figure 1, with the merging of the two halves of the planet surface. For films, this method could reduce production costs, since it is technically using some of the methods of film production. For instance, the eye in the hand in figure 8 is done in the same way that it is done in a film: the parts you want are filmed, areas are created to store the picture, and finally the two are merged with a few adjustments.

There was one problem that arose for me: What would happen if two images were not of the same size and they were merged? This was somewhat solved by the technique for splining nonoverlapped images. Even with this technique, to avoid creating a double exposure, an area has to be designated in advance for an image in order to get a smoother merge, as with the eye in the hand in figure 8.

The Laplacian pyramid, which is very useful in splining, reminds me of a tree structure, or rather a linked list, with the root node being the original image and each pointer being a filter. You may only traverse the list from right to left or from left to right. Traversing the list gives the different pictures at each level. As in a list, you have to go through one level to get to the next.

Bin Chen

In image encoding, we need a technique that is easy to implement, requires relatively simple computations, and provides greater data compression. The paper "The Laplacian Pyramid as a Compact Image Code" proposes a compressed image representation called the Laplacian pyramid.

The Laplacian pyramid combines the advantages of predictive and transform methods. It is implemented by first applying a low-pass filter (blur) to the original image to get a reduced version, and repeating this to get a series of further reduced images, which together are called the Gaussian pyramid. The Laplacian pyramid is a sequence of error images, each of which is the difference between two levels of the Gaussian pyramid. The generating kernel w and the equivalent weighting function play an important role in pyramid generation. By choosing different values of the kernel parameter a, we obtain different effects to suit the situation at hand. The Laplacian pyramid is something like a complement of the Gaussian pyramid: it can be decoded to recover the original image by expanding and then summing all the levels of the pyramid. Quantization can be used to reduce the entropy; the number and distribution of quantization levels should be chosen so that the degradation is imperceptible to humans. So, knowledge of human vision is needed in this case.
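The construction described above can be sketched in a few lines. This is a minimal one-dimensional sketch, not the authors' implementation: it assumes the five-tap generating kernel w = [1/4 - a/2, 1/4, a, 1/4, 1/4 - a/2], and the function names (reduce_level, expand_level, laplacian_pyramid, collapse) and the edge-clamping boundary rule are choices made here for illustration.

```python
def kernel(a=0.4):
    """Five-tap generating kernel; a is the centre weight."""
    return [0.25 - a / 2, 0.25, a, 0.25, 0.25 - a / 2]


def sample(sig, i):
    """Read sig[i], clamping i to the valid range (an assumed boundary rule)."""
    return sig[min(max(i, 0), len(sig) - 1)]


def reduce_level(sig, w):
    """Low-pass filter with w, keeping every second sample."""
    n = (len(sig) + 1) // 2
    return [sum(w[m + 2] * sample(sig, 2 * i + m) for m in range(-2, 3))
            for i in range(n)]


def expand_level(sig, w, n):
    """Interpolate sig up to length n with the same kernel."""
    out = []
    for i in range(n):
        acc = 0.0
        for m in range(-2, 3):
            if (i - m) % 2 == 0:  # only terms that land on a coarse sample
                acc += 2 * w[m + 2] * sample(sig, (i - m) // 2)
        out.append(acc)
    return out


def laplacian_pyramid(sig, levels, a=0.4):
    """Return the error bands L0..L(levels-1) followed by the coarsest image."""
    w = kernel(a)
    gauss = [list(sig)]
    for _ in range(levels):
        gauss.append(reduce_level(gauss[-1], w))
    bands = []
    for k in range(levels):
        predicted = expand_level(gauss[k + 1], w, len(gauss[k]))
        bands.append([g - p for g, p in zip(gauss[k], predicted)])
    bands.append(gauss[-1])
    return bands


def collapse(bands, a=0.4):
    """Decode: expand the running image and add each band, coarse to fine."""
    w = kernel(a)
    sig = bands[-1]
    for band in reversed(bands[:-1]):
        sig = [b + p for b, p in zip(band, expand_level(sig, w, len(band)))]
    return sig


signal = [float((7 * i) % 11) for i in range(17)]  # arbitrary test data
pyramid = laplacian_pyramid(signal, levels=3)
restored = collapse(pyramid)
```

Without quantization, the decoded signal matches the input up to floating-point rounding, because each band stores exactly what the expanded coarser level failed to predict.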

The Laplacian pyramid is a versatile data structure with many attractive features for image processing. It has many applications, such as progressive image transmission. A more interesting application of the Laplacian pyramid is discussed in the second paper, "A Multiresolution Spline with Application to Image Mosaics".

In many situations, we may want to combine two or more images into a larger image mosaic. A common technical problem in image mosaics is joining two images so that the edge between them is invisible. The pyramid structure turns out to be ideally suited to performing the splining steps. The multiresolution spline algorithm is defined in terms of the basic pyramid operations and can be generalized to construct mosaics from overlapped and nonoverlapped square images as well as images of arbitrary shape. The images to be splined are first decomposed into a set of band-pass filtered images; the images in the same spatial frequency band are then assembled into a band-pass mosaic; finally, the band-pass mosaics are summed to obtain the desired image mosaic. In this way, the multiresolution spline can eliminate visible seams between component images.
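The three steps just listed (decompose, assemble per band, sum) can be illustrated on one-dimensional signals. The sketch below is an illustration under assumptions, not the paper's code: it uses a five-tap kernel with a = 0.4, clamps at the borders, joins the two signals at the centre sample of every level, and averages the two pyramids on that centre sample. All function names are hypothetical.

```python
def kernel(a=0.4):
    return [0.25 - a / 2, 0.25, a, 0.25, 0.25 - a / 2]


def sample(sig, i):
    # Clamp the index at the borders (an assumed boundary rule).
    return sig[min(max(i, 0), len(sig) - 1)]


def reduce_level(sig, w):
    n = (len(sig) + 1) // 2
    return [sum(w[m + 2] * sample(sig, 2 * i + m) for m in range(-2, 3))
            for i in range(n)]


def expand_level(sig, w, n):
    out = []
    for i in range(n):
        acc = 0.0
        for m in range(-2, 3):
            if (i - m) % 2 == 0:
                acc += 2 * w[m + 2] * sample(sig, (i - m) // 2)
        out.append(acc)
    return out


def laplacian_pyramid(sig, levels, w):
    gauss = [list(sig)]
    for _ in range(levels):
        gauss.append(reduce_level(gauss[-1], w))
    bands = [[g - p for g, p in zip(gauss[k],
                                    expand_level(gauss[k + 1], w, len(gauss[k])))]
             for k in range(levels)]
    return bands + [gauss[-1]]


def multires_spline(left, right, levels, a=0.4):
    """Left half from one pyramid, right half from the other, band by band."""
    w = kernel(a)
    pa = laplacian_pyramid(left, levels, w)
    pb = laplacian_pyramid(right, levels, w)
    merged = []
    for band_a, band_b in zip(pa, pb):
        centre = len(band_a) // 2
        band = []
        for i, (x, y) in enumerate(zip(band_a, band_b)):
            if i < centre:
                band.append(x)
            elif i > centre:
                band.append(y)
            else:
                band.append(0.5 * (x + y))  # average on the join itself
        merged.append(band)
    # Sum the band-pass mosaics, coarse to fine.
    sig = merged[-1]
    for band in reversed(merged[:-1]):
        sig = [b + p for b, p in zip(band, expand_level(sig, w, len(band)))]
    return sig


left = [10.0] * 33    # a flat dark signal
right = [20.0] * 33   # a flat bright signal
mosaic = multires_spline(left, right, levels=4)
```

A direct cut at the centre would leave a step of 10 between neighbouring samples. Here the largest step is a small fraction of that, because the coarsest band is blended over the widest zone.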

Jeffrey Considine

"The Laplacian pyramid as a compact image code"

Burt and Adelson present the Laplacian pyramid used in the previous article. The idea behind it is that the value of a pixel is related to the values of those around it. In constructing it, the image is iteratively Gaussian blurred and decreased in resolution. The Laplacian pyramid is the series of differences between these images, starting from the smallest image with the least detail. Without compression of any sort, the Laplacian pyramid would be about 4/3 the size of the original image (a sum of powers of 1/4, since each level has a quarter as many pixels as the one below it), but since each level has so little information (only the difference from the previous level) it can be greatly compressed.
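The uncompressed size of the pyramid can be checked by counting samples. This is a small sketch, assuming each level halves the resolution along every axis and that the pyramid is carried all the way down to a single sample; pyramid_samples is a name chosen here. For a 2-D image the total comes to about 4/3 of the original, while a factor of about two is what a 1-D signal would give.

```python
def pyramid_samples(side, dims):
    """Total samples over all levels, halving the side length at each level."""
    total = 0
    while True:
        total += side ** dims
        if side == 1:
            break
        side = (side + 1) // 2
    return total


ratio_2d = pyramid_samples(257, 2) / 257 ** 2  # image of 257 x 257 pixels
ratio_1d = pyramid_samples(257, 1) / 257       # signal of 257 samples
```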

The Laplacian pyramid takes a simple representation of an image and shows its usefulness. Besides image compression, this technique could aid in speeding up progressive transmission of images.

"A multiresolution spline with application to image mosaics"

Burt and Adelson present a technique for merging images that avoids the problem of obvious edges. They start by presenting the use of weighted averages along the edge and show its disadvantages for pictures having details over a wide range of frequencies. As an alternative, they suggest breaking the images up into frequency bands and splining each band separately before recombining them. This avoids problems from blending the images at too high or too low a frequency. To me this sounded like blending the components of each image's Fourier transform separately, but they used a Laplacian pyramid to subdivide the images, which seems much more computationally efficient. They also discuss using this method for irregularly shaped (not straight) edges, but they seem vague about the use of a Gaussian weighting when apparently the method does not need to change.

Except for the section on irregular shapes, this paper is easy to follow. The math is simplified, but not overly so, and appears correct for the most part. I believe the summation on p. 222 should run from -2 to 2, but the point is clear and this is minor.

Overall, this article is clear and to the point and presents a clean and efficient algorithm.

Cameron Fordyce

The authors present an efficient method for the compression of images that minimizes the loss of image information. The algorithm presented appears to be relatively fast and not very computationally expensive. The method is compared to other compression methods, both causal and noncausal. The operations of the Laplacian pyramid have the advantage of being local and applicable iteratively over the image and over the successive images produced at each stage of the process. The authors also propose other uses for their method, such as progressive image transmission.

It is difficult to judge the contributions of this paper given my lack of knowledge in this field and the age of the paper. However, I will try. The method in and of itself appeals to me because of its simplicity and repeatability. The convolution and subtraction cycle occurs at each level of compression, and the reverse of these steps is used in the expansion of the image. Also, each intermediate image keeps some recognizable information, so that the use proposed by the authors, progressive image transmission, becomes possible. As Prof. Sclaroff noted in class, there might be other uses for the intermediate images, such as using an image at the top of the 'pyramid' to segment an image into useful and non-useful subimages. Processing could then continue only on the subimages judged to be important.

I wonder whether any method that relies on predicting the value of a pixel from its surrounding pixels will perform badly on images that have great differences in intensity between adjacent areas, such as binary images.

Unfortunately, the quality of the copying impeded evaluation of the examples provided by the authors.

Timothy Frangioso

A Multiresolution Spline With Application to Image Mosaics.

This paper describes a process of merging or combining two images using a process called multiresolution splining. It uses a filtering scheme to separate each image into a series of low-pass images. This is carried out by producing a Gaussian pyramid. After the creation of this pyramid, the low-pass images are converted to band-pass images by constructing a Laplacian pyramid. Now that the various levels of the image have been separated out into their elementary parts, they can be splined together at this lower level. At this point the splining algorithm is applied to each of these levels to create a smooth transition between the combined images.

This process has multiple benefits. Intuitively, one is breaking up the image into all of its component parts. Only after each of these levels has been joined are they integrated to get the resulting image. So, multiresolution splining has the effect of making an image without any distortion; that is, it solves the problems of boundary lines appearing in the resulting image and of the double exposure effect. It also makes computing the image efficient, because all of the computation is done on the lower-level images.

The Laplacian Pyramid as a Compact Image Code.

This paper discusses a technique for using the Laplacian pyramid as a data structure to encode information about an image. This idea of encoding is based on the assumption that pixels are highly correlated with neighboring pixels and that an average or a summation over areas of pixels would be a more efficient way of storing images. First a Gaussian pyramid is created by convolving the weighting (averaging) function with the image. The prediction error is then computed by subtracting the low-pass image from the original image. The result of this process is a low-pass image and a prediction error, which can be used to store the information about the correlated pixels and the differences between the levels of the Gaussian pyramid. These differences between the levels of the Gaussian pyramid are in fact the various levels of the Laplacian pyramid. By repeating this process you get various levels of the image and an effective way to store and compress an image.

Jason Golubock

A Multiresolution Spline With Application to Image Mosaics

This article describes a new technique for joining two or more images into one image, or splining. The main problem with splining is that if two images are simply glued together along an edge, there is a visible seam, due to differences in shading and texture along the edges. The technique described in the article smooths the edge between the two images by taking a weighted average of the two images along the edge, which smooths out the seam. The width of the strip over which the images are smoothed is variable, of course, and has a big effect on the outcome of the process. Instead of just splining the whole image at once, the images are first decomposed into a series of band-pass filtered component images, which are splined individually using the weighted average. The width of the smoothing zone varies depending on which wavelengths are represented in the band, which tends to give optimal results. The individual splined images are then added together to get the final result. The authors seem rather pleased with this procedure, although they mention that no really good splining technique has yet been found. Obviously I'm not exactly qualified to critique this technique or comment on its usefulness...

The Laplacian Pyramid as a Compact Image Code

This article describes a technique for compact storage of images which uses a Laplacian pyramid... It is based on the fact that neighboring pixels in any image are usually highly correlated, and that storing the value of each pixel separately is rather redundant. The procedure described here convolves a Gaussian weighting function with the image to obtain a low-pass filtered image, which is then subtracted from the original. So rather than storing the original image, the low-pass filtered image is stored, along with the difference image between that and the original. This results in a net reduction in necessary storage space. To make storage even more efficient, this process is repeated on the low-pass filtered image which was obtained from the original, and then repeated again, and so on. The result is the Laplacian pyramid structure. The article also describes the process of decoding the pyramid to obtain the original image. The procedure is described in detail, but I was unable to follow it well enough to discuss it here. Again, the authors seem to believe that this technique is a good one, since the storage space required is significantly reduced, and the computation required is relatively simple. One nice side effect of this encoding procedure is that one automatically has access to the already-computed band-pass filtered images from the original.

Jeremy Green

(The Laplacian Pyramid as a Compact Image Code by Peter J Burt and Edward H Adelson)

The paper discussed using Laplacian pyramids as a method of compactly storing images and of progressively storing or transmitting images.

The method discussed in the paper seems like a very useful way to store an image that is compact and easily retrievable. The decoding of the image is exactly the reverse of the encoding, and both of them are very simple. The Laplacian pyramid algorithm would also be easily scalable to multiple processors running in parallel.

The idea of using it for progressive transmission of an image is a very useful one. By sending each level of the pyramid separately, the image would start out blurry (with no high frequencies) and then get sharper as more of the image is received. All the receiving computer would have to do is add each layer of the pyramid to the previous layer. This makes the user much less likely to become bored waiting for the image to coalesce. He would be able to see something at once and have it get more detailed as time went on.
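The receiving side of progressive transmission can be sketched as follows. This is a one-dimensional illustration under assumptions (five-tap kernel with a = 0.4, border clamping, hypothetical function names), not the paper's implementation. After each band arrives, the receiver expands what it has so far, adds the new band, and blows the result up to full size for display.

```python
def kernel(a=0.4):
    return [0.25 - a / 2, 0.25, a, 0.25, 0.25 - a / 2]


def sample(sig, i):
    # Clamp the index at the borders (an assumed boundary rule).
    return sig[min(max(i, 0), len(sig) - 1)]


def reduce_level(sig, w):
    n = (len(sig) + 1) // 2
    return [sum(w[m + 2] * sample(sig, 2 * i + m) for m in range(-2, 3))
            for i in range(n)]


def expand_level(sig, w, n):
    out = []
    for i in range(n):
        acc = 0.0
        for m in range(-2, 3):
            if (i - m) % 2 == 0:
                acc += 2 * w[m + 2] * sample(sig, (i - m) // 2)
        out.append(acc)
    return out


def laplacian_pyramid(sig, levels, w):
    gauss = [list(sig)]
    for _ in range(levels):
        gauss.append(reduce_level(gauss[-1], w))
    bands = [[g - p for g, p in zip(gauss[k],
                                    expand_level(gauss[k + 1], w, len(gauss[k])))]
             for k in range(levels)]
    return bands + [gauss[-1]]


def progressive(bands, w):
    """Yield the full-size picture after each band arrives, coarsest first."""
    sizes = [len(b) for b in bands]
    received = len(bands) - 1          # index of the finest band received so far
    sig = bands[received]
    while True:
        view = sig
        for k in range(received - 1, -1, -1):
            view = expand_level(view, w, sizes[k])   # enlarge for display only
        yield view
        if received == 0:
            break
        received -= 1
        predicted = expand_level(sig, w, sizes[received])
        sig = [b + p for b, p in zip(bands[received], predicted)]


w = kernel()
signal = [float((7 * i) % 11) for i in range(17)]  # arbitrary test data
bands = laplacian_pyramid(signal, 3, w)
views = list(progressive(bands, w))
errors = [max(abs(x - y) for x, y in zip(signal, v)) for v in views]
```

The first view is a heavily blurred version of the signal, and the last view reproduces it up to rounding.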

(A Multiresolution Spline with Application to Image Mosaics by Peter J. Burt and Edward H. Adelson)

The paper discussed using splines and pyramids to combine two similar or dissimilar images into one. The method discussed in the paper was a very interesting and seemingly useful way to disguise the border between the two (or more) images. The idea of splitting the image into many images of decreasing size and decreasing frequency seems like a very intuitive and useful way to merge the different frequencies of the image separately. By splitting the image into different frequency bands, it is much easier to find an appropriate spline for each band than to find one spline that works for all of the frequencies.

Daniel Gutchess

Title: The Laplacian Pyramid as a Compact Image Code
Authors: Peter J. Burt, Edward H. Adelson

In this paper, the authors introduce a method of image compression using a structure called a "Laplacian pyramid". To begin, they provide a brief, easy to understand summary of causal and noncausal image encoding methods, listing the pros and cons of each. They claim that their Laplacian-pyramid code provides higher compression than causal methods, yet is more computationally efficient than noncausal methods which use image transforms.

The steps of the coding process are then described in detail. The first step, Gaussian pyramid generation, is accomplished by low-pass filtering the image over and over, forming a pyramid of reduced images. The Laplacian pyramid is then constructed from the Gaussian. Each level of the pyramid is equal to its corresponding Gaussian level minus the expanded Gaussian level above. Since the entropy of an image is the lower bound on the number of bits needed to represent each pixel, it is desirable to decrease entropy for greater compression. A quantization technique is described which will slightly degrade the image, but hopefully not enough to be visible to the human eye. At the end of the paper, it is mentioned that Laplacian pyramid image coding facilitates progressive transmission. In progressive transmission, top levels of the pyramid are sent first (e.g. over a computer network) and are expanded immediately to give the viewer a rough image of what is to come.
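The link between quantization and entropy can be seen on a handful of numbers. The sketch below is only an illustration: the band values are made up, entropy is the first-order entropy of the histogram, and quantize is an assumed uniform quantizer that snaps values to multiples of a bin size (Python's round sends exact halves to the even neighbour, which may differ from the bin edges used in the paper).

```python
from collections import Counter
from math import log2


def entropy(values):
    """First-order entropy in bits per sample, from the histogram of values."""
    counts = Counter(values)
    n = len(values)
    return sum((c / n) * log2(n / c) for c in counts.values())


def quantize(values, bin_size):
    """Uniform quantization: snap each value to a multiple of bin_size."""
    return [bin_size * round(v / bin_size) for v in values]


band = [-7, -3, -2, -1, -1, 0, 0, 0, 0, 1, 1, 2, 3, 6]  # made-up band values
coarse = quantize(band, 4)
bits_before = entropy(band)
bits_after = entropy(coarse)
```

Merging histogram bins can never raise the entropy, so fewer bits per sample are needed after quantization. The price is the rounding error in each sample.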

I made the mistake of reading Burt and Adelson's paper on multiresolution splines before reading this one. I was very confused about Laplacian pyramids before reading this paper. I felt this paper did an excellent job of describing how Gaussian pyramids are used to generate Laplacian pyramids, and of showing why this is a compact code. The discussion of optimal values for the parameter a in the generating kernel was informative, as was the section on progressive transmission; I always wondered how web browsers did that.

Title: A Multiresolution Spline With Application to Image Mosaics
Authors: Peter J. Burt, Edward H. Adelson

The introduction talks about some applications which deal with image mosaics, and in doing so shows the motivation for producing a good splining method. The goal is to make the seams (line segments where the smaller images meet) invisible, but not to lose much valuable data from the original pictures. A weighted average splining method is shown, where pixels in the overlapped area are simply multiplied by a weight, which is just some function that decreases as you move through the area. However, it is shown through a (nifty) example that finding an ideal weighting function and transition width is not always possible. They conclude that images that have a spatial frequency band over one octave cannot be nicely splined this way.
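The weighted average spline described above is simple to write down for one-dimensional signals. This is a sketch under assumptions: the weight is taken to be a linear ramp across a transition zone of a given width centred on the join, which is only one possible decreasing function, and ramp_weight and weighted_spline are names chosen here.

```python
def ramp_weight(i, centre, width):
    """Weight of the left image: 1 well left of the join, 0 well right, 0.5 on it."""
    if width == 0:
        return 1.0 if i < centre else 0.0 if i > centre else 0.5
    return min(1.0, max(0.0, 0.5 - (i - centre) / width))


def weighted_spline(left, right, centre, width):
    """Blend two equal-length signals across a transition zone of the given width."""
    out = []
    for i, (a, b) in enumerate(zip(left, right)):
        h = ramp_weight(i, centre, width)
        out.append(h * a + (1.0 - h) * b)   # the two weights always sum to 1
    return out


left = [10.0] * 17
right = [20.0] * 17
smooth = weighted_spline(left, right, centre=8, width=8)
hard = weighted_spline(left, right, centre=8, width=0)
```

With a width of zero the join is an abrupt step. A wide zone hides the step, but both images remain visible inside the zone, which is where the double exposure effect comes from when the images contain fine detail.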

After a description of Laplacian pyramids, they state the main idea behind multiresolution splining. That is, Laplacian pyramids should be made of both images, and a new pyramid should be made, consisting of all the nodes on the left half of one pyramid and all the nodes on the right half of the other. The recovered image of this new pyramid will be your splined image. Some nice example pictures are shown, including an apple and an orange spline, which looks quite good (despite the poor quality of the photocopy). They then extend this basic procedure to solve two other problems: splining irregular regions, and splining images which don't overlap.

Figure 3 is a very good advertisement for this technique. The other methods of splining pale in comparison to multiresolution splining. I wondered, however, when I read their comment that "no fully satisfactory splining technique has yet been found", how many other techniques existed, aside from those mentioned. (Maybe none back in 1983?) As I mentioned in #1, I was left confounded by the explanation of Laplacian pyramids; it just wasn't clear enough for me. So I am thankful for having the first reading. Otherwise, I felt that this paper was very readable and interesting. In particular, the authors made good use of pictures to show how well their technique actually works.

John Isidoro

The Laplacian Pyramid is quite a clever concept; it's sort of a cheesy Fourier Transform in the same way environment mapping can be used to make cheesy Phong shading. It's a really computationally inexpensive way to do spectral decomposition. It can be used for such things as spectral decomposition, progressive transmission, and edge finding. I don't see why they use a 5x5 kernel instead of a 2x2 evenly weighted kernel, and use the output of the 2x2 for the Gaussian pyramid. Oh well... Maybe I'll try it and see what happens.
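The 2x2 experiment is easy to try in one dimension, where the analogue is averaging neighbouring pairs. The sketch below is an illustration with hypothetical function names, and it assumes the signal length is divisible by two at every level. Reconstruction is still exact, because the decoder adds back whatever the prediction missed. What changes is the prediction itself: repeated expansion gives a staircase rather than a smooth curve. Whether that is the reason for the 5x5 kernel is not settled here.

```python
def reduce_box(sig):
    """Average neighbouring pairs (1-D analogue of a 2x2 evenly weighted kernel)."""
    return [(sig[i] + sig[i + 1]) / 2 for i in range(0, len(sig), 2)]


def expand_box(sig):
    """Repeat every sample twice."""
    return [v for v in sig for _ in range(2)]


def box_pyramid(sig, levels):
    """Error bands followed by the coarsest averaged signal."""
    gauss = [list(sig)]
    for _ in range(levels):
        gauss.append(reduce_box(gauss[-1]))
    bands = [[g - p for g, p in zip(gauss[k], expand_box(gauss[k + 1]))]
             for k in range(levels)]
    return bands + [gauss[-1]]


def collapse_box(bands):
    """Decode by expanding and adding, coarse to fine."""
    sig = bands[-1]
    for band in reversed(bands[:-1]):
        sig = [b + p for b, p in zip(band, expand_box(sig))]
    return sig


signal = [float((7 * i) % 11) for i in range(16)]  # arbitrary test data
bands = box_pyramid(signal, 3)
restored = collapse_box(bands)
staircase = expand_box(expand_box([0.0, 4.0]))     # two coarse samples, enlarged
```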

The Multiresolution Spline paper really cleared up some questions I had about seamless image blending. Before I read this paper my approach would have been to just overlap the two images, and then just fade one into the other using a smooth alpha gradient. They explained in the paper why this doesn't always produce good results, and how sometimes it can produce a "double exposure" effect. I can visualize how the spline works now, and how it will only smooth about one wavelength of data in each frequency band. This one should be fun to implement for project 2?? 3??

Tong Jin

Leslie Kuczynski

Summary/Critique for "A Multiresolution Spline With Application to Image Mosaics"

Summary

Authors P.J. Burt and E.H. Adelson define a multiresolution spline technique (a digital technique for distorting two surfaces so that they can be joined together with a smooth seam), based on a weighted average approach, for combining two or more images into a larger image mosaic. They solve the 'visible boundary' problem while still maintaining an acceptable quality-of-image to the human eye by first decomposing the images into a set of band-pass component images. This is achieved by constructing a Gaussian pyramid for each image, resulting in a sequence of low-pass filtered images, from which a Laplacian pyramid can be constructed. The Laplacian pyramid is the sequence of band-pass component images. This means that each original image is now represented by a sequence of component images, where each component image (in a sequence) is representative of a relatively narrow spatial frequency band of the original image. Corresponding band-pass component images from each sequence can then be splined together, producing a sequence of band-pass mosaics. Expanding and summing the band-pass mosaics results in the desired boundary-free mosaic.

Critique

A multiresolution spline algorithm was presented in this paper, along with variations for splining overlapped and nonoverlapped square images and for splining images of arbitrary shape. The algorithm seems straightforward for merging overlapped square images (to produce a mosaic image consisting of half of each original image) with relatively similar characteristics at the boundary line (i.e. close gray levels). They simply took the average of the nodes (from the Laplacian pyramids) along the center line. However, when combining two quite different images, they made the transition more gradual by averaging nodes on either side of the center nodes. This sounds easy enough, but I suspect that choosing the ratio of weights is not as simple as guessing and getting it right, and no algorithm was given to compute these ratios. Splining images of arbitrary shape follows roughly the same procedure, except that a mask is created to indicate the location of an image (i.e. which nodes should be taken from each image and which nodes should be averaged). In this case, weighting is applied at the edge of the mask and to a distance of two sample positions on either side of it. Again, how the weighting is decided is not specified. Splining nonoverlapped images again follows the same basic procedure, except that the image edges (at boundaries) are extrapolated to produce an overlapping region. I am curious to see how well the technique works for images that do not appear to belong together, unlike the example where the resulting mosaic was the result of splining an image composed of mosaics.

Summary/Critique for "The Laplacian Pyramid as a Compact Image Code"

Summary

Authors P.J. Burt and E.H. Adelson describe a technique for image encoding which differs from previous techniques in that the code elements are localized in both spatial frequency and space. They accomplish this by first removing image correlation using a hybrid of predictive and transform methods. Essentially, this means that the image is represented by its Laplacian pyramid code (a sequence of band-pass images). Each image in the sequence comprising the Laplacian pyramid is then quantized. Reconstruction is accomplished by expanding and summing the quantized images.

Critique

To obtain the Laplacian pyramid representation, an appropriate low-pass filter must first be applied to the image. By applying the low-pass filter successively, a sequence of low-pass filtered images is obtained (the Gaussian pyramid). It is the difference (error) between each pair of successive low-pass filtered images that makes up the sequence of images representing the Laplacian pyramid. The authors introduce two fast algorithms for obtaining the Gaussian and Laplacian pyramids. They discuss something they term the generating kernel. The generating kernel is a pattern of weights that is convolved with the image, producing a low-pass filtered image. The same pattern is then applied to the low-pass filtered image to generate another low-pass filtered image. This is the sequence of images that makes up the Gaussian pyramid. They state that the size of the weighting function is not critical and that the 5-by-5 pattern used in their examples was chosen because it provided adequate filtering at low computational cost. I do not find this reason satisfactory and would prefer a more intuitive one. Would a larger pattern result in a too dramatically ramped pyramid? Would interpolation upon expanding degrade image detail too much? The choice of a generating kernel is subject to four constraints which I will not reiterate here. However, I could not justify all of them: why should it be separable, and why is the one-dimensional length normalized? The algorithm for computing the Laplacian pyramid from the Gaussian pyramid was straightforward, as was the idea of quantizing the image to reduce entropy. However, not much was said about how to choose the proper number and distribution of quantization levels so that degradation (due to quantization errors) will be imperceptible to the human eye.
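The two questions above about the generating kernel have at least a practical answer, which the following sketch tries to make concrete. It is an illustration under assumptions (the five-tap kernel with a = 0.4, border clamping, hypothetical function names). A separable 5-by-5 kernel can be applied as a row pass followed by a column pass, costing 10 multiplications per pixel instead of 25, and a kernel whose weights sum to one leaves the average brightness unchanged.

```python
def kernel_1d(a=0.4):
    c = 0.25 - a / 2
    return [c, 0.25, a, 0.25, c]


def kernel_2d(a=0.4):
    """Separable 5x5 kernel: the outer product of the 1-D kernel with itself."""
    w = kernel_1d(a)
    return [[wm * wn for wn in w] for wm in w]


def px(img, r, c):
    """Read a pixel, clamping both indices at the border (an assumed rule)."""
    r = min(max(r, 0), len(img) - 1)
    c = min(max(c, 0), len(img[0]) - 1)
    return img[r][c]


def blur_direct(img, a=0.4):
    """Full 2-D convolution: 25 multiplications per pixel."""
    w2 = kernel_2d(a)
    return [[sum(w2[m + 2][n + 2] * px(img, r + m, c + n)
                 for m in range(-2, 3) for n in range(-2, 3))
             for c in range(len(img[0]))]
            for r in range(len(img))]


def blur_separable(img, a=0.4):
    """Row pass, then column pass: 10 multiplications per pixel."""
    w = kernel_1d(a)
    rows = [[sum(w[n + 2] * px(img, r, c + n) for n in range(-2, 3))
             for c in range(len(img[0]))]
            for r in range(len(img))]
    return [[sum(w[m + 2] * px(rows, r + m, c) for m in range(-2, 3))
             for c in range(len(img[0]))]
            for r in range(len(img))]


image = [[float((3 * r + 5 * c) % 7) for c in range(6)] for r in range(5)]
direct = blur_direct(image)
separable = blur_separable(image)
flat = blur_separable([[9.0] * 6 for _ in range(5)])  # a constant image
```

The two blurs agree up to rounding, and the constant image comes out unchanged. The kernel also satisfies a + 2c = 2b (with b = 1/4 and c = 1/4 - a/2), so every sample at one level contributes the same total weight to the next.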

Hyun Young Lee

The Laplacian Pyramid as a Compact Image Code

To encode an image by removing pixel-to-pixel correlations, Gaussian pyramid and Laplacian pyramid images are generated and quantized to yield the compressed code. This method combines the causal predictive and noncausal transform techniques, so as to achieve a noncausal, possibly parallelizable, fast scheme with relatively simple and localized computations. The pyramid data structure is produced by encoding the prediction error for each pixel, repeated n times, giving levels L0, L1, ..., Ln, each of which has a lower sample density than its predecessor by a factor of 1/2.

The Laplacian pyramid is well suited to efficient compression and to progressive transmission. Variable-length code words are used for each node image in the pyramid to take advantage of the nonuniform distribution. If a variable-size bin technique were adopted and applied to a node image with a skewed distribution, the encoding would compress the image more efficiently.

A Multiresolution Spline With Application to Image Mosaics

In combining two or more images into a larger image, the technical problem of a visible edge can be resolved with the multiresolution spline technique. The noticeable difference between the surfaces of the two images is adjusted by image splining. The parameter T of the weighted average splining technique, which is the width of the transition zone, plays a crucial role in avoiding the double exposure effect as well as in the degree of smoothness of the resulting image.

To accomplish the image filtering, Gaussian pyramid and Laplacian pyramid operations are performed. To spline images for which Laplacian pyramids have been constructed, another Laplacian pyramid is constructed from those pyramids, and its levels are expanded and summed to obtain the splined image. The splining algorithm applies not only to joining two overlapping images half-and-half but also to regions of arbitrary shape, and to nonoverlapped images with image extrapolation.

I doubt that this method can work when combining more than two images, where the image for a certain region is made to border more than one other image; this implies that the region has to employ two or more T values and Laplacian pyramids for the center image.

Ilya Levin

Yong Liu

What do image splining and image encoding have in common? They both require a high degree of data integrity after the operation. The requirements arise in different contexts, but Peter J. Burt and Edward H. Adelson found a unified solution for maintaining data integrity at low computational cost. The unified solution, known as the Laplacian Pyramid Algorithm, is discussed in their articles 'A Multiresolution Spline with Application to Image Mosaics' (ACM Transactions on Graphics, Vol. 2, No. 4, October 1983, Pages 217-236) and 'The Laplacian Pyramid as a Compact Image Code'.

The context for image splining arises when traditional image merging methods fail to meet the requirements of a smooth transition at the boundary and data fidelity on each side. Applying a traditional boundary condition algorithm either produces overexposure for pixels in the overlapping zone or creates a sharp edge at the boundary.

The Laplacian Pyramid algorithm proposed by the authors solves this problem by decomposing the images to be splined into a set of band-pass filtered images. Then, the decomposed images form a corresponding band-pass mosaic through interpolation. Next, these band-pass mosaic images are summed together to form the splined image. This method has been found widely applicable in various image splining situations. Images do not have to overlap. They do not have to be of the same kind. The authors demonstrated that they can spline the right half of an orange and the left half of an apple together without creating a sharp edge in between.

The context for image encoding arises more directly, because images take a huge amount of space. The pixels in an image are highly correlated, so much of the encoded information is redundant and compression is possible.

The authors applied the same algorithm they used for image splining. This time, they created a series of low-pass images from the original image. Instead of encoding the original pixels, they encode the differences between successive levels of low-pass images (called Laplacian pyramid images) together with the coarsest low-pass image. The original image is recovered by summing the encoded information in the low-pass image and the differences. Effectively, the authors adopted a compression strategy for image encoding. A common problem with image compression is how much information is lost between compression and decompression. The authors use several examples to demonstrate repeatedly that enough information is kept intact to maintain good visual quality.

An interesting example was used by the authors to demonstrate that their algorithm saves not only computer storage but also data communication capacity. In data communication, large capacity is needed to transmit images. However, not every detail is necessary to get the information across. The authors demonstrated that the low-pass images and the difference information on low-pass images can be transmitted first, which saves considerable transmission capacity. The receiving end can decide whether even more detailed information is necessary. If so, the remaining levels can be delivered, expanded and summed, so that the image is gradually decompressed.

Nagendra Mishr

Multi-res splines...

The article by Burt and Adelson describes a method for creating photomosaics without placing a restriction on the input images. They describe a technique which they have developed called "Multiresolution Splines."

Consider any two images that you want to place next to each other so that the transition from one image to the next is unnoticeable. You can approach this problem by simply placing the images next to each other. This approach only works if the two images are very close in texture, color, shape and granularity of detail. What if the images differ in color? For example, you want a red circle to turn into a blue circle. The natural result should be a circle that starts out red and ends up blue. What if the shape of an object changes? Well, since objects are a higher-order notion, in a graphical sense it is really still just the color situation. The idea is that any two images can slowly be turned into each other by altering colors. How texture is handled is still unclear to me, and it probably will not work with the technique described. Another use for mosaics would be to take irregularly formed shapes and merge one into the other.

The spline technique involves performing a Laplacian operation on the images. These Laplacian numbers can then be added to and subtracted from each other, which in theory yields images that are intermediate between the two. A second technique required to perform the merge operation takes the Laplacian numbers and generates image pyramids from them. These pyramids in turn can regenerate the original image. Each level in the pyramid contains information extracted from the lower layers, and thus information at multiple levels of the images can be merged. So information that goes beyond the simple color component of the image in the lower layers is also considered in the merge operation.

The Laplacian Pyramid

The Laplacian pyramid relies on the notion of spatial continuity to compress data. This means that we pick a point in the image and imagine that all points around it have an intensity specified by the Laplacian function. Using this mathematically constructed image, we find the actual differences from the original. Since the original is "spatially continuous", we find that the differences are minute. We then store the differences and the center points of the Laplacians. We apply this method repeatedly to build a pyramid out of the image and gain further compression.

Much of this will break if the original image does not follow the spatial continuity assumption. For example, imagine random noise that is very dense, as on a TV screen. In that case the density of the pyramid (its entropy) will be greater. A dense pyramid means that it is not very efficient at compacting the data. The entropy can be reduced by quantizing the image first, i.e. removing some data from the picture will reduce the density. In the TV example, the snow will be converted into a smeared picture.
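
The link between quantization and entropy can be illustrated with a few lines of code. This is a sketch under assumptions: the sample values are made up, and the quantizer uses floor-aligned bins (chosen so that each coarser bin is an exact merge of finer ones), which is not necessarily the quantizer used in the paper.

```python
import math
from collections import Counter


def entropy(symbols):
    """First-order entropy in bits per symbol."""
    total = len(symbols)
    return -sum(c / total * math.log2(c / total)
                for c in Counter(symbols).values())


def quantize(values, bin_size):
    """Map each value to the index of its bin (floor-aligned, an assumption)."""
    return [math.floor(v / bin_size) for v in values]


# Hypothetical Laplacian-level values: mostly small, a few large.
laplacian_values = [-7.5, -3.0, -1.2, -0.4, 0.0, 0.3, 0.9, 1.1,
                    2.6, 3.4, 6.8, 0.2, -0.1, 0.5, -2.2, 1.9]

# Entropy of the quantized values for increasingly coarse bins.
bits = {n: entropy(quantize(laplacian_values, n)) for n in (1, 2, 4, 8)}
```

Because every coarser bin merges finer bins, the entropy in `bits` cannot go up as the bin size doubles; the price is that values falling in the same bin can no longer be told apart.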

A second benefit of Laplacian pyramids is that they break the image into successively more detailed layers. The highest level of the pyramid contains the least detailed view of the image, but using that level and the incoming next layer we can generate another picture that contains more detail. This is an excellent way to focus in on images, i.e. we may first transmit the highest layer of the picture. The image when rendered will be the entire picture but with little detail. As successive layers are transmitted, they can be interpolated to display a more detailed view of the picture, somewhat like the way interlaced GIFs are viewed in a web browser.

Constructing the Laplacian pyramid not only compresses the data, but also band-passes the data much more efficiently than FFTs.

Romer Rosales

The Laplacian Pyramid as a Compact Image Code

The article describes a way of encoding intensity images by applying consecutive low-pass filters to a series of generated images. The algorithm starts with the original image and produces a second image, which is in turn filtered using the same function, and so on.

Normally, neighboring image pixels are highly correlated, so the encoded information can be redundant. This technique can decorrelate the image pixels by combining features of predictive and transform methods.

Let L0 be defined:

L0(i,j)=g0(i,j)-g1(i,j) (Eq. 1)

where g1 is the image that results from applying a low-pass filter to g0 (the original image). L0 represents the prediction error. The generated image g1 is a version of g0 that has reduced resolution and sample density.

Data compression is achieved by encoding L0 (which is decorrelated) and g1 (which can be encoded at a reduced sample rate) instead of g0.

The images g2...gn are obtained in the same way as g1, that is, by applying the weighting function Reduce described in the article. Once the gi are available, L1..Ln are obtained in the same way as in Eq. 1. The Laplacian pyramid Li, i=(0..n), is the sequence of error images between successive levels of the Gaussian pyramid (g0,g1...gn).

To decode the image we use an Expand function to get gi from gi+1. This function reverses the process of Reduce by interpolating new node values between the given values. The values of Li are obtained by using the equation

Li = gi - Expand(gi+1)

An efficient way to decode is to expand LN once and add it to LN-1, then expand the result and add it to LN-2, and so on until level 0 is reached and g0 is recovered. Note: gN=LN
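
The encode and decode steps above can be written out for a 1D signal as a rough sketch. The kernel weights, the clamped borders, the sample signal and the function names `encode` and `decode` are assumptions for illustration, not the routines from the article.

```python
# 1D sketch of Li = gi - Expand(gi+1) and of the top-down decoding.
# Kernel, border clamping and names are illustrative assumptions.
KERNEL = [0.05, 0.25, 0.4, 0.25, 0.05]


def reduce_(g):
    """g(i+1) from g(i): low-pass filter, then keep every second sample."""
    n = len(g)
    return [sum(w * g[min(max(2 * i + k - 2, 0), n - 1)]
                for k, w in enumerate(KERNEL))
            for i in range((n + 1) // 2)]


def expand(g, n):
    """Interpolate new node values between the given ones, up to n samples."""
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(KERNEL):
            j = i + k - 2
            if j % 2 == 0:
                acc += w * g[min(max(j // 2, 0), len(g) - 1)]
        out.append(2.0 * acc)
    return out


def encode(g0, n_levels):
    """Return [L0, ..., LN] with LN = gN."""
    levels, g = [], list(g0)
    for _ in range(n_levels):
        g_next = reduce_(g)
        levels.append([x - y for x, y in zip(g, expand(g_next, len(g)))])
        g = g_next
    levels.append(g)  # gN = LN
    return levels


def decode(levels):
    """Expand LN, add LN-1, expand the sum, add LN-2, ... down to level 0."""
    g = levels[-1]
    for lap in reversed(levels[:-1]):
        g = [x + y for x, y in zip(lap, expand(g, len(lap)))]
    return g


signal = [float((7 * i) % 11) for i in range(17)]
pyramid = encode(signal, 3)
restored = decode(pyramid)
max_error = max(abs(x - y) for x, y in zip(signal, restored))
```

As long as no level is quantized, `restored` matches `signal` up to floating-point rounding, because each Li stores exactly what Expand fails to predict.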

The Laplacian Pyramid method is an effective technique that can be useful in many instances: reducing image storage needs, transmitting images in an additive way, pattern recognition and image contrasting. The speed of the algorithm is, according to this work, acceptable, thanks to the use of uncomplicated computations and of the same routine during almost all the processing. The point against this method is the degradation that occurs in the image after decomposing it, quantizing the levels and recomposing it. Although this loss of accuracy can be reduced by some methods to a point where it is imperceptible to the human eye, identical retrieval may be needed on special occasions.

Natasha Tatarchuk

Considering my background in the image processing area (or the lack of it), this paper was very helpful to me on a variety of topics. It gave me a much clearer understanding of digital image encoding. It was helpful of the authors to introduce some simpler encoding techniques at the beginning, such as predictive coding and transform techniques, both of which were briefly but, in my view, clearly explained in the first few paragraphs of the paper.

I don't think I would be able to point to any specific places in the paper where I could critique the authors' approach or techniques, given my vague knowledge of the area, but this paper did a great job of introducing and explaining both Gaussian and Laplacian pyramids. After reading this paper, I finally understand what a Gaussian pyramid is much better than just knowing that it's a technique for image encoding. I also found the section on the generating kernel very useful, as well as the section on the equivalent weighting function. Reading this paper after reading the 'Multiresolution Spline' paper finally helped me understand many of the things that were only slightly touched on in that paper, and yet are necessary for a complete understanding.

The sections on Quantization and Entropy were a bit less clear to me, possibly because the authors seemed to assume the reader's prior knowledge of the area they were discussing.

Multiresolution Spline

This was a very interesting paper. On the first pass, some parts of the paper weren't clear, but if you read it after reading the Laplacian Pyramid paper, then many of the areas that were unclear in this one make sense.

I liked how the authors gave a good introduction to the subject of this paper in terms of actual examples arising from real-life applications. That helped me associate the topic with reality, and thus helped me understand it much better. Actually, after reading this paper, I realized why some of the techniques for combining images in a somewhat smooth manner (such as Photoshop's layering, alpha channels plus gradient combining, and so on) wouldn't work perfectly for many images. The explanation of the weighted average splining technique in this paper, and of its drawbacks, was useful for precisely that reason. The example with the stars (although the choice of images was not very good for reproduction of the paper), and especially the explanation of the double-exposed look, helped me understand the need for the multiresolution spline.

One thing I wasn't sure about was why the authors assumed overlapping images only at the beginning of the paper, and only at the end finally explained how the technique could be used for other types of images.

Leonid Taycher

In the papers "A multiresolution spline with application to image mosaics" and "The Laplacian pyramid as a compact image code", Burt and Adelson discuss different applications of Laplacian pyramids (image splining and coding, respectively). A Laplacian pyramid is an image pyramid in which each higher level is created by lowpass filtering (usually Gaussian) the previous level, subtracting the lowpass version from that level, and subsampling the result. Each level in such a pyramid is analogous to a band-pass filtered version of the original image, where the width of the passed band depends on the filter.

Laplacian pyramids seem useful in image splining because of the need to satisfy two (almost always opposing) requirements: that the width of the averaging area between the images be comparable to both the lowest and the highest frequencies in the image. This is obviously unattainable with normal images, but easy with corresponding levels of the Laplacian pyramids of those images, since each level contains only a limited band of frequencies. The averaging area for lower frequencies is effectively greater than for higher ones, because the transition zone is the same at every level while the lower-frequency images are smaller in size. The only problem I had with this paper is that there seems to be a mistake in the expansion formula: they attempt to interpolate the points of the expanded version of a level by averaging the pixels around each point, but instead of dividing by 4, they multiply by 4.
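
One possible reading of the factor of 4, offered as a sketch and not as a settled answer to the question raised above: in the expansion sum only the terms that land on an existing coarse pixel contribute, so each output pixel sees only part of the 5x5 kernel. Assuming the separable generating kernel with centre weight a = 0.4 (the variable names below are hypothetical), the weights that actually contribute can be added up for each of the four even/odd position cases:

```python
# Generating kernel for offsets -2..2, assuming centre weight a = 0.4.
a = 0.4
w = [0.25 - a / 2, 0.25, a, 0.25, 0.25 - a / 2]

# In 1D, an output sample at an even position only uses offsets -2, 0, 2;
# one at an odd position only uses offsets -1, 1.
even_sum = w[0] + w[2] + w[4]
odd_sum = w[1] + w[3]

# In 2D the kernel is the product w(m) * w(n), giving four position cases.
phase_sums = [row * col
              for row in (even_sum, odd_sum)
              for col in (even_sum, odd_sum)]
```

Under these assumptions every entry of `phase_sums` comes out at one quarter, so multiplying by 4 would bring the contributing weights back to a total of one, which is what an average needs.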

In the second paper Burt and Adelson discuss using Laplacian pyramids for image coding, relying on the fact that the high-frequency levels can be coded efficiently (with large quantization bins) without much visible effect on the human perception of the image, since humans are more sensitive to low-frequency noise than to high-frequency noise. Laplacian pyramid coding is also very useful in progressive image transmission, since the low-frequency levels carry most of the information contained in the image and can be sent over the network efficiently (their size is a fraction of that of the original image) to give the viewer the main idea of the image first.

Alex Vlachos

Reading #1
The Laplacian Pyramid

The reading gives an explanation of the encoding and decoding of an image using a pyramid-style representation. The end result of encoding an image in this style is basically a collection of images that progressively reduce in size (by one half) and in resolution.

The images are computed by applying a low-pass filter and storing the result. An error image can be produced by subtracting the filtered image from the original image. Then, the filtered image and the error image are stored instead of the original image, resulting in a smaller file size.

The decoding of the image uses the error and filtered images by basically adding them together. The process is done in reverse by decoding the smaller images first and then the larger ones.

The article starts by explaining a Gaussian Pyramid, which uses a low-pass filter, and then explains the Laplacian Pyramid. The article also mentions that the Laplacian Pyramid representation works well for progressive image transmission. Although the images aren't very clear, the decompression of the image slightly resembles the way progressive JPEG stores images.

Reading #2
A Multiresolution Spline with
Application to Image Mosaics

This article looks into the problem of image mosaics, the combining of images. The main problem with mosaics is getting a smooth edge between the combined images.

The first possible solution discussed is the weighting function. This is where the boundary is smoothed by combining both edges (over a defined width) using a fade-in / fade-out method. As one image begins to fade out, the other image starts to fade in.
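
A fade-in / fade-out of this kind can be sketched in a few lines for one row of pixels. The linear ramp, the parameter names `center` and `half_width` and the sample rows are assumptions for illustration; the article allows other weighting functions.

```python
def crossfade(left, right, center, half_width):
    """Blend two equally long rows with a linear ramp around `center`."""
    out = []
    for i, (l, r) in enumerate(zip(left, right)):
        # Position inside the transition zone, from 0 (start) to 1 (end).
        t = (i - (center - half_width)) / (2.0 * half_width)
        weight_left = 1.0 - min(max(t, 0.0), 1.0)
        out.append(weight_left * l + (1.0 - weight_left) * r)
    return out


left = [0.0] * 21
right = [100.0] * 21
mosaic = crossfade(left, right, center=10, half_width=4)
```

Outside the transition zone the mosaic is a copy of one image or the other; inside it, both images show through at once, which is where the blurred or double-exposed look comes from when the zone is wide.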

The next method discussed uses Laplacian Pyramids. Three pyramids are constructed: one for each of the original images, and one which is constructed by combining parts of the other two Laplacian Pyramids to form the final Laplacian Pyramid. Summing the levels of this third pyramid gives the smoothed mosaic image.

The final method discussed can be applied to images containing different shapes. It uses both Laplacian Pyramids and Gaussian Pyramids. As before, two Laplacian pyramids are formed, one for each of the original images. Then a Gaussian Pyramid of the region to be taken from the first image is formed and used as the weighting factor for combining the two Laplacian Pyramids. The resulting Laplacian Pyramid, once its levels are summed, is the smoothed mosaic.
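
A rough 1D sketch of this mask-weighted combination is given below. The kernel, the clamped borders, the sample rows and the names `blend` and `mask` are assumptions for illustration, not the article's implementation; `mask` is 1 where the first image should show and 0 where the second should.

```python
# 1D sketch: Gaussian pyramid of a mask weights two Laplacian pyramids.
# Kernel, border handling and names are illustrative assumptions.
KERNEL = [0.05, 0.25, 0.4, 0.25, 0.05]


def reduce_(g):
    n = len(g)
    return [sum(w * g[min(max(2 * i + k - 2, 0), n - 1)]
                for k, w in enumerate(KERNEL))
            for i in range((n + 1) // 2)]


def expand(g, n):
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(KERNEL):
            j = i + k - 2
            if j % 2 == 0:
                acc += w * g[min(max(j // 2, 0), len(g) - 1)]
        out.append(2.0 * acc)
    return out


def gaussian_pyramid(g, levels):
    pyramid = [list(g)]
    for _ in range(levels):
        pyramid.append(reduce_(pyramid[-1]))
    return pyramid


def laplacian_pyramid(g, levels):
    gauss = gaussian_pyramid(g, levels)
    bands = [[x - y for x, y in zip(fine, expand(coarse, len(fine)))]
             for fine, coarse in zip(gauss, gauss[1:])]
    return bands + [gauss[-1]]


def collapse(pyramid):
    g = pyramid[-1]
    for band in reversed(pyramid[:-1]):
        g = [x + y for x, y in zip(band, expand(g, len(band)))]
    return g


def blend(a, b, mask, levels):
    """Weight each Laplacian level by the matching Gaussian level of the mask."""
    la, lb = laplacian_pyramid(a, levels), laplacian_pyramid(b, levels)
    gm = gaussian_pyramid(mask, levels)
    mixed = [[m * x + (1.0 - m) * y for m, x, y in zip(ms, xs, ys)]
             for ms, xs, ys in zip(gm, la, lb)]
    return collapse(mixed)


a = [float(i) for i in range(17)]
b = [2.0 * (16 - i) for i in range(17)]
mask = [1.0] * 8 + [0.5] + [0.0] * 8
mosaic = blend(a, b, mask, 2)
only_a = blend(a, b, [1.0] * 17, 2)
only_b = blend(a, b, [0.0] * 17, 2)
```

Because the mask is blurred more at coarser levels, low frequencies are mixed over a wide zone and fine detail over a narrow one. A mask of all ones or all zeros simply returns the first or the second row.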



Stan Sclaroff

Created:  Sep 26, 1996

Last Modified: Sep 30, 1996