Readings:
The article "Snakes: Active Contour Models" is a good article. It explains the snake, a spline that uses various forces to distinguish features in an image. These forces include external constraint forces and image forces. The features include such things as lines, edges and terminations.
This article at first appeared not to have any real direction, but as I read, the concepts became more obvious to me. It is interesting how the snake uses the forces to almost wrap itself around the image. The snake is proposed as a natural reactor because of its continuous reactions to the actions or forces of the image. Once in motion it becomes its own entity.
There are a few reasons why snakes interest me, but there were also a few things that I didn't quite understand. For instance, what was the zero-crossing concept? Why did the snake have a sharp corner, when I thought the snake was a spline, a connected curve which does not have any sharp edges? In the line functional, which sign of w will make the snake attract to light lines, a negative or a positive sign? Does it vary depending on the implementation? Why did giving a snake mass allow it to predict its next position?
It is interesting; one would think that adjusting monocular edge-finding based on binocular matches would be stated the other way around, in that you would determine a binocular image from a monocular one. I assume, however, that the monocular edge finding is determined by the averages of the binocular matches.
The article as a whole was informative; however, some parts were vague and unclear.
This paper introduces a new method, called "scale-space filtering", for producing a useful general-purpose qualitative description of signals. The paper arose from the observation that in many sophisticated signal-understanding tasks, the problem of scale is a fundamental source of difficulty. Using the raw numerical signal values directly, or a single parameter of scale, is problematic, because for many tasks no one scale of description is categorically correct. Descriptions at different scales, related to each other in an organized, natural and compact way, are needed to solve this problem.
Scale-space filtering is then introduced. It describes signals qualitatively, in terms of extrema in the signal or its derivatives. This paper deals with one-dimensional signals: it first expands the one-dimensional signal into a two-dimensional scale-space image, by Gaussian convolution over a continuum of kernel sizes. It then uses the connectivity of extremal points tracked through scale-space (called coarse-to-fine tracking) and the singular points at which new extrema appear. The image is then collapsed into an interval tree which can describe the signal simultaneously at all scales, giving a complete qualitative description covering all scales of observation. The tree is further refined using a maximum-stability criterion to identify events that persist over large changes in scale.
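The expansion step just summarized can be sketched compactly. Below is a toy illustration (the test signal and sampled scale values are my own assumptions, and SciPy's gaussian_filter1d stands in for the continuous Gaussian convolution):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def scale_space_image(signal, sigmas):
    """Stack Gaussian-smoothed copies of a 1-D signal: rows are scales."""
    return np.stack([gaussian_filter1d(signal, s) for s in sigmas])

def count_extrema(row):
    """Number of local extrema, found as sign changes of the first difference."""
    d = np.diff(row)
    return int(np.sum(np.sign(d[:-1]) != np.sign(d[1:])))

x = np.linspace(0, 10, 500)
signal = np.sin(x) + 0.3 * np.sin(7 * x)        # fine ripple on a coarse wave
sigmas = np.linspace(0.5, 50, 40)               # sampled continuum of scales
image = scale_space_image(signal, sigmas)

# Gaussian smoothing can only remove extrema, never create them, so the
# counts shrink as the scale parameter grows.
counts = [count_extrema(row) for row in image]
```

Tracking how those extrema merge and vanish from row to row is what the paper's coarse-to-fine tracking and interval tree then organize.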
It is a useful tool in computational vision research. One of its applications appears in the next paper, "Active Contour Models". The paper does not say much about two-dimensional images, let alone three-dimensional ones: how could this method apply to them, and what are the results after the Gaussian convolutions?
This paper discusses an active contour model named the snake. The idea arose because in low-level computational vision tasks, such as edge detection, stereo matching and motion tracking, previous approaches could not serve high-level processes; we need sets of new organizations among which high-level processes may choose.
Energy minimization is the main concept in this paper: it is treated as a framework, and how its function is designed plays an important role. The authors design this function so that its local minima comprise the set of solutions available to higher-level reasoning. An interactive approach is used by adding suitable energy terms to the minimization. The model is always minimizing its energy functional and therefore exhibits dynamic behavior; this is called a snake. The snake's energy function is given mathematically in the paper using three terms, i.e. internal energy, image forces and external constraint forces. The scale-space technique was applied to the image-force energy functional to attract the snake to zero crossings while still retaining its smoothness.
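As a rough illustration of how those terms combine, here is a toy discretization (not the authors' exact formulation; the test image and all weights are assumptions of mine):

```python
import numpy as np

def snake_energy(pts, grad_mag, alpha=0.1, beta=0.1, w_edge=1.0):
    """Internal (tension + bending) energy plus an edge-attraction image term.

    pts: (n, 2) array of snake points as (row, col); grad_mag: 2-D array of
    image gradient magnitudes. External constraint terms are omitted here.
    """
    d1 = np.diff(pts, axis=0)                  # first differences  ~ v_s
    d2 = np.diff(pts, 2, axis=0)               # second differences ~ v_ss
    e_internal = alpha * np.sum(d1 ** 2) + beta * np.sum(d2 ** 2)
    r, c = pts[:, 0].astype(int), pts[:, 1].astype(int)
    e_image = -w_edge * np.sum(grad_mag[r, c] ** 2)   # edge functional
    return e_internal + e_image

# Toy image: a bright vertical stripe, so gradient magnitude peaks at its border.
img = np.zeros((20, 20))
img[:, 8:12] = 1.0
gy, gx = np.gradient(img)
grad_mag = np.hypot(gx, gy)

on_edge  = np.stack([np.arange(2.0, 18.0), np.full(16, 8.0)], axis=1)
off_edge = np.stack([np.arange(2.0, 18.0), np.full(16, 2.0)], axis=1)
```

A snake lying on the stripe border has lower energy than the same snake in a flat region, which is all that the minimization needs.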
The paper gives examples of two of the snake's applications, stereo matching and motion tracking. Although the paper gives an energy functional, better ones may be found to make the snake more effective and a unified solution for all levels of visual processing.
In this paper, Witkin presents scale-space filtering, a method of tracking zero crossings through changes in resolution. Witkin uses it to produce a general-purpose qualitative description that is not tied to any particular scale. It seems to me that this algorithm could be very useful as an edge detector if extended into 2D.

While Witkin explains his reasoning fairly clearly, he is annoyingly vague with the details. The only formulas are simple and basic and could probably have been left out. He apparently covers all the concepts but is vague about the analysis of the final data and its refinement, leaving me wondering how many "magic numbers" would be needed for its implementation.

Kass, Witkin and Terzopoulos present snakes as a general method for edge
detection and motion tracking. They simplify the problem by reducing everything to energy minimization. Factors like the Laplacian and other ways of classifying local areas with respect to edges and the like are converted into energy levels. Smoothness constraints are also converted into energy levels. The advantage of this approach is that it is very simple to make use of more information, possibly determined after later processing, by simply converting it into energy levels, adding them to the existing ones, and minimizing the energy again. This allows arbitrary data, like attraction or repulsion to a point clicked on by the user, to be added.

This seems like a very useful technique, especially for processing video.
The ability to trivially add data, like our own identification of an object that would otherwise take huge amounts of time to identify (if it were possible at all), is enormous. The ability to keep this information and adapt it to later frames seems very powerful. One comment is that it seems to do a lot of computation for each segment of the snake, but the benefit of letting the user help identify objects is immense.
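The "just add another energy term" point can be made concrete with a sketch (the names and constants here are mine, not the paper's): a spring that pulls a chosen snake point toward a user-clicked location, plus a simple repulsion term.

```python
import numpy as np

def constraint_energy(pts, attract_to=None, repel_from=None, k=1.0, m=1.0):
    """Extra user-supplied energy: a spring toward a clicked point, and/or
    a repulsive term centered on another point."""
    e = 0.0
    if attract_to is not None:
        idx, target = attract_to
        e += k * np.sum((pts[idx] - np.asarray(target)) ** 2)   # spring
    if repel_from is not None:
        d2 = np.sum((pts - np.asarray(repel_from)) ** 2, axis=1)
        e += np.sum(m / np.maximum(d2, 1e-6))                   # repulsion
    return e

pts = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
near = constraint_energy(pts, attract_to=(0, (0.1, 0.0)))   # click near point 0
far  = constraint_energy(pts, attract_to=(0, (5.0, 0.0)))   # click far away
```

This term is simply added to the snake's existing energy before re-minimizing; nothing else in the pipeline changes.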
The author describes a method for extracting qualitative information from
an image. The method described is relatively simple to implement and is
useful in extracting extrema and derivative information from the
signal while allowing a choice of multiple scales with which to
examine this information. In addition, the use of multiple scales
allows the examination of information that persists from one scale to another, providing further information about the image.
The point is made that the primary problem had been to choose the
appropriate scale with which to view separate events in an image. One
scale does not work for all events. However, the author's method,
'scale-space filtering', also appears to avoid the problem of data explosion. One could simply scale the image and extract the desired information, but one would be left with multiple images and an overabundance of data. Further, this method allows a closer relationship between the different scales.
To be honest, I am not sure that I understand this technique fully
enough to make any particular comments on it. The most troubling
aspect is the description of the 'tree' and how this relates to
organizing the continuous information such as zero crossings and
derivatives extracted from the image. I believe that the author could
have given more examples of the possible applications of this method.
Of particular interest would have been the pattern recognition
applications and perceptual phenomena descriptions that this method
purports to provide.
This paper talks about a technique that solves the problem of scale in describing a signal in qualitative terms, that is, in terms of its extrema. The paper points out that one of the major problems with describing a signal in these terms is scale. The process of scale-space filtering follows an intuitive idea: scale the description as you scale the image. That is, as the image is scaled, so is the process of tracking; when you change the scale of the image, the extrema are tracked at the new scale. This information is stored in a tree which then provides a description of the signal's extrema over varied scales.
This paper describes a contour model for finding and approximating edges. Basically, the snakes work by finding a local minimum of an energy over an area. The paper defines these constraints as image forces that pull the snake towards the particular parts of the image that are of interest. This energy function is expressed as the integral of E(v(s)) for s from 0 to 1. The energy breaks up into two types of constraints, internal and external.
To attract the snakes to particular areas or objects within the image, one uses an energy functional. The paper describes three different functionals, for finding lines, edges and terminations. By modifying or changing these functionals one can make different snakes that will find various objects. These changes and modifications are not limited to still images; they can also be used in motion tracking. This is accomplished by giving the snake a local minimum to keep track of.
This article presents a new method for finding a good
qualitative description of a signal. A qualitative
description is said to be one that characterizes the
signal by its extrema and those of its first few derivatives.
The main problem here is that of scale... qualitatively describing a signal at different scales often results in very different outcomes. There seems to be no good general method for finding the appropriate scale for any situation, if there even is one. The solution presented here is to vary the scaling parameter and find the signal description for each scale. These signal descriptions at different scales are laid side by side so that they make a surface called the scale-space image. This makes it possible to note changes and appearances of new extrema over different scales.
The next step is to take the information about the connectivity of extremal points over the scale space, and the points at which new extrema appear, and "collapse" this information into what they call an interval tree. Although it is not described in detail how the interval tree is derived, it is described in general terms, with all of its important features pointed out. The interval tree is a complete qualitative description of the signal over all the different scales used. The tree description is improved upon by imposing a stability criterion which favors extrema that are present over wide ranges of scale changes. The stabilization procedure is likewise described briefly and in general, not in any detail.
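A toy version of that stability idea (my own simplification, not the article's exact procedure): an extremum that survives a wide range of smoothing scales is stable, while one that vanishes under mild smoothing is likely noise.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def extrema_positions(y):
    """Sample indices of local extrema (sign changes of the first difference)."""
    d = np.diff(y)
    return np.where(np.sign(d[:-1]) != np.sign(d[1:]))[0] + 1

x = np.linspace(0, 1, 400)
# A broad bump plus a tiny narrow blip riding on its shoulder.
signal = np.exp(-((x - 0.5) / 0.2) ** 2) + 0.05 * np.exp(-((x - 0.8) / 0.01) ** 2)

def survives(pos, sigma, tol=15):
    """Does an extremum near sample `pos` still exist after smoothing at `sigma`?"""
    ex = extrema_positions(gaussian_filter1d(signal, sigma))
    return len(ex) > 0 and int(np.min(np.abs(ex - pos))) < tol

big_bump, small_bump = 200, 320   # approximate indices of the two maxima
```

The broad bump's maximum persists across a wide band of sigmas, while the blip's maximum disappears quickly; persistence over scale is exactly what the stability criterion rewards.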
The final result of the article seems to do what it sets out to do fairly well... The article is rather short and does not talk about the robustness of the method over a wide range of applications. It does not attempt to prove the goodness of the method so much as to describe the new idea.
This article presents a type of energy-minimizing spline
which the authors call a snake. One of the most important
features of the snake seemed to me to be that it has the
ability to take input from the outside which can be used
to change its behavior. By allowing the snake to be used in different ways, the problem of finding complete and accurate edge and contour recognition for any given image is addressed; the snake can be tuned for any particular application.
The snake uses a technique called energy minimization which
causes it to change its shape and wrap itself around the
edge or contour which is nearest to it from its starting
point. The article describes in some detail the mathematics
behind their energy-minimizing system. It assumes a prerequisite
level of understanding of splines and energy minimization
which I do not have; I did not understand the body of the
article well enough to comment on it.
Intuitively, the idea of a stretchable, changeable, shape-fitting edge and contour detector seems to be a very good one.
The snake described in the article seems to do a good job
of implementing this idea. Unfortunately, I have to take
their word for it.
In this paper, Andrew Witkin describes a compact representation of a signal which is not dependent on scale. The motivation behind doing this is to classify signals qualitatively, without the need for a parameter of scale, which introduces ambiguity, or for multi-scale descriptions, which increase storage requirements.
The first step
in scale-space filtering is to draw a scale-space image, which is
done by adding a new dimension to the signal, representing the
scale parameter. We now have a surface on which we can find
local extrema of any derivative. Mapping out the slope extrema
(zero-crossings of the second derivative) gives the contours of
a scale-space image, which then may be used to generate the so-called
interval tree. This tree is a simplified representation of the
contour sketch. By inspection, you could make a good sketch of the
original signal, but you lose quite a bit of detail. The author notes that
two-dimensional signals can also be represented by a 3-D scale space.
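The contour extraction just described fits in a few lines; the sketch below (the test signal and sampled scales are assumptions of mine) finds the zero-crossings of the second derivative at each scale:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def inflection_points(signal, sigma):
    """Zero-crossings of the second derivative of the smoothed signal."""
    fxx = gaussian_filter1d(signal, sigma, order=2)
    return np.where(np.diff(np.sign(fxx)) != 0)[0]

rng = np.random.default_rng(0)
signal = np.cumsum(rng.standard_normal(600))       # a rough 1-D test signal

# Plotting these per-scale crossing sets side by side gives the contour
# diagram of the scale-space image; contours close off as sigma grows.
n_fine   = len(inflection_points(signal, 2))
n_coarse = len(inflection_points(signal, 40))
```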
This paper is short and to the point, in that it clearly describes
a method of representing the prominent features of a signal.
It basically stops here, which was a disappointment. Yes, figure
5 shows that the method does what it is supposed to, but it might
have been a good idea to extend this in two areas: first, a discussion of how to implement this for recognition purposes and of its effectiveness in that area; second, a brief summary of some other applications this would be useful for.
The authors, Kass, Witkin, and Terzopoulos describe a technique
of locating (and tracking in motion video) edges, lines, and
subjective contours. The main idea is to place a representation
of a mechanical system onto an image, and just as it happens in
the real world, minimize the energy in the system. This mechanical
model consists of "snakes" and springs, where the snake is somewhere
between a metal rod and a bendable wire. Using one of three
energy functionals as a guide, and the mathematical formulas in the
appendix, a snake (or group of snakes) will
actively move itself toward the nearest edge or contour.
This paper did a good job of explaining snakes intuitively
as well as providing mathematical detail. The wealth of examples
helped my understanding, and showed how effective this technique
can be at finding edges. The role of snakes in the grand scheme
of vision systems was well described as one reliant upon
higher level mechanisms, which possess some knowledge about
the particular application.
John Isidoro
The Scale-Space filtering paper is perhaps one of the most unclear white papers I have ever read. First of all, I am uncertain as to the exact problem the author is trying to solve. Is he trying to find a way to provide a high-level interpretation of a signal? And if so, what does this high-level interpretation provide you with? Also, his criteria for success seemed a little ambiguous. What exactly does a complete qualitative representation mean; isn't noisiness a quality of the signal? Maybe I'm missing something?
In contrast to the Scale-Space paper, the Snake paper was wonderful! Snakes are a really popular machine vision technique because they can be used for so many things, such as segmentation, edge finding, feature tracking, and many others. When I first encountered the concept last year, I found that visualizing how snakes work was very easy. Imagine the image as a mountain range, where the valleys correspond to what you want to track (light regions, dark regions, contours, etc.). The snake is an extra-heavy, thin and stretchable gummy worm that rolls down these mountains into the valleys to form a shape. It's an almost anthropomorphic approach to finding details in images. I think that implementing this one will be the true test of this paper though!
Author A. P. Witkin proposes a technique for describing a signal in terms of its 'general-purpose' qualities. This method, termed 'scale-space filtering', describes a signal qualitatively, in terms of its local extrema, its first few derivatives, and the intervals bounded by the extrema. The benefit we derive from such a representation is that we now have a set of primitives associated with a signal. That is, we have a qualitative description of the signal in an understandable and familiar form (Witkin draws the comparison with the techniques used to characterize functions in calculus). However, the real issue lies not in the form of the representation, but in addressing the issue of scale. For example, the task of computing extrema, derivatives, etc. is not difficult if we are concerned with a particular segment of a signal whose relative size is known to us, but what if we are concerned with the whole signal and all 'events' within that signal, large and small? Witkin takes a typical engineering approach whereby the goal is reduced to managing and reducing, rather than overcoming and eliminating, the ambiguities associated with the issue of scale.

The approach used by Witkin to obtain the signal representation starts by
convolving the signal with a Gaussian convolution kernel to obtain a smoothed version of the signal. Extrema and derivatives are then computed on the smoothed signal. The amount of smoothing increases as the standard deviation of the Gaussian increases. Thus, by varying the standard deviation (in a continuous manner), a qualitative description of the signal over all scales of observation can be obtained. When the standard deviation is large, a coarse scale of the signal is obtained. It is in these coarse representations that extrema can be identified. Localizing occurs at a finer scale (obtained by decreasing the standard deviation). The idea is comparable to using the Gaussian pyramid in image recognition, whereby the top levels are used to locate regions of interest while the lower levels are used in the recognition process.

I was unable to fully understand the concept introduced in regard to the interval tree. Precisely, how small does the Gaussian standard deviation get (i.e., when does the process terminate)? I am also left with questions regarding the structure of the interval tree, specifically its ternary nature. I am unable to evaluate issues of correctness and robustness in regard to Witkin's methods without first obtaining satisfactory answers to the above questions.

Authors M. Kass, A. Witkin, and D. Terzopoulos investigate the use of
energy minimization as a framework within which the task of finding
recognizable image contours (i.e. edges, lines) can be realized. Their
somewhat unorthodox method addresses the problem from a top-down perspective
rather than the traditional bottom-up approach. Attacking the problem
top-down requires knowledge of the object in question. The authors
accommodate this necessity by providing a user interface, thereby allowing a
user to inject forces to guide the procedure. For example, a user places a
contour (snake) in an image somewhere near (how near?) a contour in the
image. Depending on the image energy (a weighted combination of the image intensity, the gradient magnitude, and the derivative of the gradient angle taken along the direction perpendicular to the gradient), the internal snake energy, and the external constraint forces, the snake will deform itself into conformity with the nearest recognizable contour. A particularly interesting use of snakes, presented by the authors, was their use in motion tracking. A snake can automatically track the local minimum of a 'slowly' (how slowly?) moving feature if it has an initial lock on that feature.

I found the concept of breaking the problem into two separate problems
(high-level and low-level) and concentrating on the low-level portion rather smart. However, perhaps due to severe ignorance on my part, or due to insufficient explanation by the authors, I was unable to fully comprehend the basic snake behavior. For example, I had no idea what the authors meant by the snake acting like a membrane or a thin plate in their discussion of the snake's internal energy. Nor did I fully understand how or why the snakes were attracted to features. My experience reading this paper is somewhat akin to being given a test and realizing that I had never been to class.
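For what it is worth, one common reading of the membrane/thin-plate language (my interpretation, not a quotation of the paper) is that the first-derivative term of the internal energy resists stretching, like a membrane, while the second-derivative term resists bending, like a thin plate. A toy comparison:

```python
import numpy as np

def internal_energy(pts, alpha=1.0, beta=1.0):
    """Return (membrane term, thin-plate term) for a discrete snake."""
    v_s  = np.diff(pts, axis=0)        # first differences  -> stretching cost
    v_ss = np.diff(pts, 2, axis=0)     # second differences -> bending cost
    return alpha * float(np.sum(v_s ** 2)), beta * float(np.sum(v_ss ** 2))

t = np.linspace(0, 1, 50)
straight = np.stack([t, np.zeros_like(t)], axis=1)
bent     = np.stack([t, 0.2 * np.sin(6 * np.pi * t)], axis=1)

mem_s, plate_s = internal_energy(straight)   # straight: no bending cost
mem_b, plate_b = internal_energy(bent)       # wiggly: pays for both terms
```

Tuning alpha and beta trades off the two behaviors along the snake.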
To find image contours, not only the image itself but also external constraints are considered, so that by minimizing the energy functional locally, an active and dynamic contour model can be obtained. The snake model is based on scale-space continuation and a controlled-continuity spline under the influence of three factors: internal spline forces, image forces, and external constraint forces.
Since a snake tries to adapt itself to the optimal contour dynamically, it can also be used effectively for motion tracking. In such local energy-minimizing systems, higher-level processing can be more flexible and adaptable compared to previous rigid methods.
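A bare-bones sketch of that dynamic adaptation (a plain gradient-descent step of my own devising, not the paper's more careful scheme): each iteration the snake slides downhill on an energy surface while a smoothing force keeps it coherent.

```python
import numpy as np

def snake_step(pts, energy_field, alpha=0.5, step=0.2):
    """One descent step: external pull down the energy gradient plus an
    internal pull of each point toward the midpoint of its neighbors."""
    gy, gx = np.gradient(energy_field)
    r = np.clip(pts[:, 0].astype(int), 0, energy_field.shape[0] - 1)
    c = np.clip(pts[:, 1].astype(int), 0, energy_field.shape[1] - 1)
    ext_force = -np.stack([gy[r, c], gx[r, c]], axis=1)      # downhill pull
    neighbors = (np.roll(pts, 1, axis=0) + np.roll(pts, -1, axis=0)) / 2
    int_force = alpha * (neighbors - pts)                    # smoothing pull
    return pts + step * (ext_force + int_force)

# Energy valley along column 10 of a 20x20 grid; the snake starts at column 4.
energy = np.tile((np.arange(20.0) - 10.0) ** 2, (20, 1))
pts = np.stack([np.arange(20.0), np.full(20, 4.0)], axis=1)
for _ in range(200):
    pts = snake_step(pts, energy)        # the snake settles into the valley
```

In motion tracking, the same iteration would simply continue from the previous frame's resting position.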
To describe signals qualitatively, a manageable way to describe the scale of signals is obtained by scale-space filtering. The method begins by continuously varying the scale parameter, sweeping out the scale-space image, which is then reduced to a tree that provides a concise but complete qualitative description of the signal.
In choosing the scales, a coarse scale is used to identify extrema and
a fine scale, to localize them. With the representation from such
various scales and its simplified tree, this method serves as a powerful description of signals.
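The coarse-to-fine recipe above can be sketched as follows (the window size, scales, and test signal are all assumptions of mine): identify the dominant peak at the coarsest scale, then re-localize it within a small window at each finer scale.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def coarse_to_fine_peak(signal, sigmas, window=10):
    """Find a peak at sigmas[0] (coarsest), then refine it at finer scales."""
    pos = int(np.argmax(gaussian_filter1d(signal, sigmas[0])))
    for s in sigmas[1:]:
        sm = gaussian_filter1d(signal, s)
        lo, hi = max(pos - window, 0), min(pos + window + 1, len(signal))
        pos = lo + int(np.argmax(sm[lo:hi]))
    return pos

x = np.linspace(0, 1, 500)
noisy = np.exp(-((x - 0.62) / 0.08) ** 2) + 0.01 * np.sin(200 * x)
found = coarse_to_fine_peak(noisy, sigmas=[40, 20, 10, 5, 2])  # true peak ~309
```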
The paper, written by Andrew P. Witkin, describes a method which can relate the descriptions of a signal at different scales to each other. This method, called scale-space filtering, consists of several stages. At the beginning, the scale parameter is varied continuously in order to sweep out the surface the author calls the scale-space image. This approach allows one to track down the extrema and to identify the singular points at which new extrema appear. The author suggests using the Gaussian convolution to compute the description of the signal that depends on scale. One of the reasons for this is that the Gaussian convolution satisfies the "well-behavedness" criteria, which is a very useful property. The coarse-to-fine tracking method is used to localize large-scale events. The interval tree method is used to describe the signal simultaneously at all scales and to generate a family of single-scale descriptions.
The paper is very well written and gives a good explanation of the methods the author is using, and the math seems correct, though difficult to follow. The author gives a few examples which show how his method works but does not give any examples of where this method can be used. If the author had shown some applications of his method, it would be easier to understand the paper.
This paper investigates the use of energy minimization as a framework to design energy functions whose local minima comprise the set of alternative solutions available to higher-level processes. A snake is an energy-minimizing spline guided by external constraint forces and influenced by image forces that pull it toward features such as lines and edges. Snakes have proven to be useful for interactive specification of image contours. The snake model provides a unified treatment of a collection of visual problems. In order to make snakes useful for early vision, we need energy functionals that attract them to salient features in images. The authors have derived energy functionals which can be applied to such tasks as finding edges in an image or finding terminations of line segments and corners. Snakes can also be applied to the problem of stereo matching.
The techniques implemented in the paper are very useful for dealing with a
number of visual problems. The authors explain their methods in a way that is easy to follow. This paper introduced me to a variety of methods for dealing with images which I find very useful.
In image processing and other signal-processing tasks, the extrema of a signal and of its first few derivatives are used as a qualitative description of the signal. For example, they define the edges in an image. The neighborhood within which the derivatives are calculated affects the result of the calculation. Therefore, a principle is needed to select the basis for neighborhood determination. Andrew P. Witkin's article, 'Scale-Space Filtering' (Proceedings of the 8th International Joint Conference on Artificial Intelligence, August 1983, pages 1019-1022, volume 2), proposed a scale-space filtering approach to solve this problem.

Under this approach, the representation of scale is done by a Gaussian
convolution to form a scale-space image. The reason a Gaussian convolution is needed is that the standard deviation of the Gaussian distribution serves as a gauge for the scales. In scale-space, the problem of localizing large-scale events is solved by coarse-to-fine tracking, which is done by adjusting the standard deviation along the zero-second-derivative contour. The problem of multi-scale integration is solved by an interval tree method, which reduces the scale-space image into a tree structure.

The topic and the methods proposed by the author are technically
important. However, I personally find the author's presentation hard to understand. Scale-space itself is an abstract concept. The author aggravated the level of difficulty by presenting the interval tree with even less illustration and almost no mathematical representation.

Application of the scale-space concept in the context of image
processing was discussed by Michael Kass, Andrew Witkin, and Demetri Terzopoulos in their article, 'Snakes: Active Contour Models' (International Journal of Computer Vision, 321-331, Kluwer Academic Publishers, Boston, 1987). Their approach is a variational method to find and determine contours and edges.

Basically, their method is to use special knowledge of the objects in
the image as control elements for the variational calculation. Edge finding and contour determination are accomplished through splining based on a minimum-energy principle, which is relatively well established in mathematics. The authors adopted the scale-space concept proposed by Andrew Witkin; their variational method proved to be more powerful with the property of scale-space continuation. Unlike the previous article, the authors emphasize tangible accomplishments more than introducing an algorithmic concept. The contour determination example in the orange-and-pear case offered a good illustration.
The Scale-space article written by A. Witkin is a very interesting approach to feature detection, which he claims describes the signal qualitatively without introducing artificial parameters. I don't know exactly what that means, but I do think that if it works it is very powerful.
The basic idea is that we humans process sound at multiple levels and that we can pretty clearly sift through garbled signals. The perfect example is that at a dinner party we can pick out individual conversations without much effort. A. Witkin describes a method by which we can detect these features using a bottom-up approach.
The notion is that when you analyze a signal, you first pick the scale of the features you wish to detect. In the scale-space filtering method, you apply an algorithm which ends up using all scales which would be perceptible in the signal. By convolving the signal with a Gaussian you simplify it. Each successive signal is further convolved until it is a simple signal. This technique is pretty similar to that used for image pyramids.
The key to detecting the features in the signal involves finding all the "information" segments. In information theory, a predictable signal contains little information; therefore, signals which contain lots of information will contain many zero-crossings across frequency boundaries. Since each convolution is in essence like taking a derivative of the signal, you find all the zero crossings in each layer.
Witkin takes the various zero crossings and generates an interval tree from them, and from that he can reconstruct all the details at any particular layer in the signal. But what is important is that he claims we detect those features which transcend convolution boundaries, i.e., signals which span many frequency octaves. So if a zero crossing transcends the different layers, then it is used to divide the neighboring signals into different features. Those features which transcend the largest range of frequencies are more noticeable.
The article is not very good about giving examples, and more examples would elucidate the subject and convince the reader. The idea is very original, and useful if it is true. The points which I question are: how is the interval tree constructed, how is the convolution performed, and what proof is there that humans detect signals which transcend frequency-octave boundaries?
The snake article by Kass, Witkin and Terzopoulos is useful. It describes a formal model for specifying a template which a computer can use to identify objects given the correct parameters.
A snake is a mathematical model which follows the laws of physics to help quantify physical objects in a discrete world. All images are discrete and as such contain quantization errors when objects are rendered in different real-world examples. The objects themselves may be such that they do not reproduce identically. The snake helps to identify these objects by allowing a fuzzy fit.
Snakes contain some properties which allow them to interact with
the values of an image at any particular point. These properties
can make the snake get attracted to some points and to be repelled
from others. An example which the paper gives is that in a motion
sequence a snake can be placed around some lips. In each
subsequent frame, the snake will trace the changed shape of the lips
because it is following the contours in the image. In another
example with stereo vision, two snakes which identify two objects in
two pictures can identify that the same object appears in both
images.
Snakes can be made to contain predefined shapes and will subsequently be attracted to and match only those objects which match the predefined shape of the snake, without being constrained by the size of the object.
Snakes know that they have recognized objects when their energy values stabilize at some value. Since the energy of a snake can be specified by any mathematical property, snakes can be made to match corners, to be repelled from certain areas, and to be attracted to certain features.
(Article Review)
Filtering a signal by using its first derivatives is normally useful for obtaining a relatively accurate qualitative description of it. But discrete filters have to be of a fixed size; in other words, the extent of the neighborhood is a value that we should know in order to use these techniques. The problem is that the exact size of the filter cannot be established without having some extra information, like the size and location of objects in the image.
Meaningful objects or events in the signal are frequently located by obtaining local extrema and derivatives; edges in images are located in this way. But the problem of scale makes unrestricted processing very difficult: the meaningful objects may be different if the image is presented at a different extent and size.
A scale parameter would be a quick solution: we could create a mask
whose size is determined by this parameter. But ambiguous
descriptions would then appear, since every different scale value
generates a different description, and no single scale value can be
considered correct.
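The point that every scale value yields a different description can be shown with a small experiment. This is an illustrative sketch only; the test signal, noise level, and sigma values are my own, not from the paper:

```python
import numpy as np

def gaussian_smooth(signal, sigma):
    """Convolve a 1-D signal with a sampled, normalized Gaussian kernel."""
    radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return np.convolve(signal, kernel / kernel.sum(), mode="same")

def count_local_extrema(signal):
    """Count interior points where the slope changes sign."""
    d = np.diff(signal)
    return int(np.sum(d[:-1] * d[1:] < 0))

rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 400)
noisy = np.sin(t) + 0.3 * rng.standard_normal(t.size)

# Descriptions at two scales disagree: the fine scale sees mostly
# noise extrema, the coarse scale sees only the underlying sine.
fine = count_local_extrema(gaussian_smooth(noisy, sigma=1.0))
coarse = count_local_extrema(gaussian_smooth(noisy, sigma=15.0))
```

Neither count is "the" description of the signal, which is exactly the ambiguity scale-space filtering sets out to handle.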
Other methods have been used to attack this problem, such as pyramids
and zero-crossings, but according to this work, no clear criteria for
obtaining edge pyramids, nor an adequate test of zero-crossing
techniques, have been reached.
This work relates descriptions of a signal at different scales by
continuously varying the scale parameter, sweeping out a surface
called the scale-space image. The singular points at which new
extrema appear are found, and the image is then reduced to a tree
structure. This yields a qualitative description of the signal at any
scale.
Gaussian convolution is used to expand the signal across scales; the
result is then collapsed and the inflection points are found.
These inflection points (F_xx = 0) mark the appearance of contours.
Two assumptions are made: that inflection points found at different
scales relate to the same event, and that the location of this event
is the contour (horizontal) location as gamma tends to infinity.
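This step can be sketched with a discrete second difference standing in for F_xx and a sampled Gaussian kernel; the test signal and sigma values here are illustrative assumptions, not the paper's:

```python
import numpy as np

def inflection_points(signal, sigma):
    """Gaussian-smooth the signal, then return indices where the
    second difference (a discrete F_xx) changes sign."""
    radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    smoothed = np.convolve(signal, kernel / kernel.sum(), mode="same")
    fxx = np.diff(smoothed, n=2)
    return np.nonzero(fxx[:-1] * fxx[1:] < 0)[0]

rng = np.random.default_rng(1)
# A noisy step: one "real" event at index 200, plus noise events.
signal = np.where(np.arange(400) < 200, 0.0, 1.0) \
         + 0.2 * rng.standard_normal(400)

# Inflection points disappear as sigma grows; none should appear.
n_fine = len(inflection_points(signal, sigma=2.0))
n_coarse = len(inflection_points(signal, sigma=12.0))
```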
To get the interval tree for the signal, the scale-space image is
reduced to a structure that describes the qualitative form of the
signal completely over all scales of observation. When a sign change
appears as gamma varies, the interval divides into sub-intervals.
The nodes of the tree are generated so that a parent node represents
the large interval from which the node's interval emerges, and its
offspring are the smaller intervals into which it subdivides. This
interval tree represents the signal at all defined scales.
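The parent/offspring subdivision can be sketched as follows. This is a hypothetical minimal builder that works from precomputed crossing positions per scale, not the paper's actual algorithm:

```python
# Input: inflection (zero-crossing) positions at each scale, listed
# from coarsest to finest. An interval subdivides at the first finer
# scale that introduces new crossings inside it; the resulting
# sub-intervals become its children.

def build_tree(crossings_per_scale, start, end):
    for level, crossings in enumerate(crossings_per_scale):
        inner = sorted(c for c in crossings if start < c < end)
        if inner:
            bounds = [start] + inner + [end]
            rest = crossings_per_scale[level + 1:]
            return [{"interval": (lo, hi),
                     "children": build_tree(rest, lo, hi)}
                    for lo, hi in zip(bounds[:-1], bounds[1:])]
    return []  # no finer scale splits this interval: a leaf

# Example: one crossing at 50 at the coarsest scale, three at a
# finer one, over a signal of length 100.
tree = build_tree([[50], [20, 50, 80]], 0, 100)
```

Each top-level node holds a coarse interval; descending the tree refines the description, so reading the tree at any depth gives the signal's structure at the corresponding scale.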
This technique is based on the localization of large-scale events. I
think that is a very clever way to deal with the ambiguity of the
scale problem in a natural way, i.e. avoiding the introduction of
arbitrary parameters.
(Article Review)
Low-level tasks in machine vision have followed an inflexible
sequential approach, using only the information available in the
image itself; low-level mechanisms thus form the basis of image
analysis, and no alternative organizations are provided. The approach
of this article is to provide alternative organizations that are
useful for higher-level processes, and it uses energy-minimization
concepts to achieve this.
Energy-minimization functionals provide local minima that can
comprise the set of alternative solutions available to higher-level
processes. The technique uses high-level knowledge to guide it.
This work aims to find edges, lines, and subjective contours, and to
follow contours in order to match them. It is based on the idea that
an interpretation of an object is difficult to justify and depends on
the kind of knowledge we have about it. Because a signal's exact
interpretation is almost impossible to define, the authors suggest
finding local minima instead of a global minimum. This is considered
an active mechanism because it is always minimizing its energy
functional.
Snakes depend on other mechanisms to place them near a contour, so
they can be used as a semi-automatic image representation. From any
starting point, the snake changes its shape to match the nearest
contour (lines, edges, and subjective contours); it is pushed toward
contours by the image energy.
From a particular starting point, perhaps user-defined (although it
can be set by other techniques), energy minimization pulls the snake
to the edge. This feature provides a basic tool for image
interpretation.
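A minimal sketch of this pull-to-the-edge behavior, assuming plain explicit gradient descent on E_edge = -|∇I|^2 plus a simple neighbor-smoothing term. The paper's actual method uses a semi-implicit spline solver; the disk image, step size tau, and weight alpha here are all illustrative:

```python
import numpy as np

# A soft bright disk of radius 25 centered at (50, 50).
yy, xx = np.mgrid[0:100, 0:100]
r = np.hypot(xx - 50, yy - 50)
img = 1.0 / (1.0 + np.exp((r - 25.0) / 3.0))

# Edge energy: large gradients give low (attractive) energy.
gy, gx = np.gradient(img)
energy = -(gx ** 2 + gy ** 2)
energy /= -energy.min()            # normalize to [-1, 0]
ey, ex = np.gradient(energy)

# A circular snake started outside the disk, at radius 30.
theta = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
pts = np.stack([50 + 30 * np.cos(theta),
                50 + 30 * np.sin(theta)], axis=1)

tau, alpha = 0.5, 0.05
for _ in range(300):
    xi = np.clip(pts[:, 0].astype(int), 0, 99)
    yi = np.clip(pts[:, 1].astype(int), 0, 99)
    force = -np.stack([ex[yi, xi], ey[yi, xi]], axis=1)  # -grad(E)
    smooth = (np.roll(pts, 1, axis=0) + np.roll(pts, -1, axis=0)
              - 2.0 * pts)                               # keeps points together
    pts = pts + tau * force + alpha * smooth

radii = np.hypot(pts[:, 0] - 50, pts[:, 1] - 50)
# The snake has settled on the disk boundary, near radius 25.
```

From a rough starting circle, the image energy alone locates the edge; that is the "basic tool" the review describes.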
This work defines three energy functionals that attract the snake to
features; the total energy is expressed as a weighted combination of
them:
E_line = I(x, y), which attracts the snake depending on the intensity
of the nearby contour;
E_edge = -|∇I(x, y)|^2, which attracts the snake to contours with
large image gradients (∇ = gradient);
E_term, which finds terminations of line segments and corners (it
uses the curvature of level lines in a slightly smoothed image).
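The first two functionals can be sketched directly with numpy finite differences for the gradient; the weights are illustrative, and E_term (which needs level-line curvature) is omitted:

```python
import numpy as np

def image_energy(img, w_line=1.0, w_edge=1.0):
    """Weighted combination of E_line and E_edge over an image."""
    i = img.astype(float)
    gy, gx = np.gradient(i)
    e_line = i                        # E_line = I(x, y)
    e_edge = -(gx ** 2 + gy ** 2)     # E_edge = -|grad I(x, y)|^2
    return w_line * e_line + w_edge * e_edge

# A vertical step edge: using E_edge alone, the energy is minimized
# at the step, so a snake minimizing it is drawn to the edge.
img = np.zeros((10, 10))
img[:, 5:] = 1.0
energy = image_energy(img, w_line=0.0, w_edge=1.0)
col = np.unravel_index(np.argmin(energy), energy.shape)[1]
```

Note that with a positive w_line the snake is drawn to dark regions (low I minimizes the energy), while a negative w_line attracts it to light lines.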
The paper discusses how snakes can be applied to the problems of
stereo matching and motion. Motion is tracked by snakes because of
their ability to lock onto a salient visual feature: under slow
movement the snake tracks the same local minimum, while fast movement
may change the local minimum that the snake is tracking.
This approach is based on iterative computation, finding image
contours and minimizing energy at each step. Besides all of the
above, it can also be useful for matching 3-D models to images. The
effectiveness of the method can be improved by finding new useful
energy functionals or by combining existing ones; for example, by
combining E_edge and E_term it was possible to create a snake that is
attracted to both edges and terminations. This provides a uniform way
to find different points of interest in the image (edges, contours,
subjective contours). The work introduces a new proposal: to let all
levels of visual processing influence the lowest-level visual
interpretation, and to perform a dynamic approximation to points of
interest.
The technique of implementing snakes is presented in this paper. A
snake is an energy-minimizing spline that is guided by external constraint
forces and influenced by image forces that pull it toward image features
such as edges and lines. Snakes are very useful creatures: there are many
applications in computer vision for this technique, such as detecting
edges, lines, and subjective contours. The authors also give
a good account of how snakes can be useful in stereo matching and motion
tracking. Also, there are other areas for which snakes can prove to be very
useful. One of those would be recognition, I imagine. Given a contour of an
object that we'd be looking for, one could use snakes to find an object in
the image, lock the snake onto it, and then match the found contour with
the given one. Also, once that's done, one could track particular
objects and study their behavior using snakes. The government is
probably already working on implementing snakes to track down all the
foreign spies, at the very least.
This paper was helpful in understanding the nature and behavior of
snakes, although it wasn't clear to me how to implement a snake given
just this paper. The authors do a great job of explaining how a snake
would behave in different cases, such as in detecting lines and
edges, as well as in motion tracking and stereo matching. What I found most
useful is the explanation of some of the formulas for snake behavior, such
as the energy formulas. I found this paper very interesting and neat.
The scale-space filtering technique is a way of describing signals
qualitatively at all scales simultaneously, in terms of the extrema
in the signal or its derivatives. I find very interesting the
authors' approach of escaping the ambiguity of choosing some specific
scale for general processing of the signal. The interval tree is a
simple way to represent all scale levels of the signal: a family of
single-scale descriptions is generated and then combined into a tree
of nodes. My only comment on this technique concerns the efficiency
of the representation's storage. The authors don't address the issue
of storing the actual tree, which, I imagine, can 'grow' quite large
for complicated signals.
I greatly appreciate the authors' effort to present their idea in a
consistent, coherent manner, which made the explanation of the
material much more readable. This paper was quite interesting, though
I am not entirely sure where this technique is going to be used.
Well, not much to say about this one...except that I was
thoroughly confused by the end of the second page...so,
consequently, I stopped reading, and went on to the next
paper.
The most interesting aspect of snakes is that they provide
a way to detect edges and follow them through motion sequences.
The article gives an example of a speaker's lips. Through
8 images, the snake followed the contour of the lips throughout
each frame. The initial snake was placed in the general vicinity
of the speaker's lips by someone. Then the snake found the edges
of the lips and followed them through the sequence of images.
This seems like a very useful tool. A possible application
of this technology would be in robots. Robots could effectively
follow someone (like someone's dog...or pet rabbit) by using
a camera, conveniently mounted on the robot's head, and process images
of the person being followed. The result of this would be
a very annoying pet dog...or rabbit...that would constantly
follow you around your house.
On the technical side of this paper, I followed a good part of
the equations, but some of the issues being discussed obviously
depended on the reader (being me) having some other knowledge
of the material (which I didn't have) to follow those parts
of the article. But overall, it was a very interesting paper.
Reviews by: Bin Chen, Jeffrey Considine, Cameron Fordyce, Timothy
Frangioso, Jason Golubock, Jeremy Green, Daniel Gutchess, John
Isidoro, Tong Jin, Leslie Kuczynski, Hyun Young Lee, Ilya Levin, Yong
Liu, Nagendra Mishr, Romer Rosales, Natasha Tatarchuk, Leonid
Taycher, Alex Vlachos
Articles reviewed: "Scale-Space Filtering" by A. Witkin; "Snakes:
Active Contour Models" by Michael Kass, Andrew Witkin and Demetri
Terzopoulos
Stan Sclaroff
Created: Sep 26, 1996
Last Modified: Sep 30, 1996