Readings:
The article "Snakes: Active Contour Models" is a good article. It explains the snake, a spline that uses various forces to distinguish features in an image. These forces include external constraint forces and image forces. The features include such things as lines, edges and terminations.
This article at first appeared not to have any real direction, but as I read, the concepts became more obvious to me. It is interesting how the snake uses the forces to almost wrap itself around the image. The snake is proposed as a natural reactor because of its continuous reactions to the actions or forces of the image. Once in motion it becomes its own entity.
There are a few reasons why snakes interest me, but there were also a few things that I didn't quite understand. For instance, what was the zero-crossing concept? Why did the snake have a sharp corner, when I thought the snake was a spline, a connected curve which does not have any sharp edges? In the line functional, which sign of w will make the snake attract to light lines, a negative or a positive sign? Does it vary depending on the implementation? Why did giving a snake mass allow it to predict its next position?
It is interesting; one would think that adjusting monocular edge-finding based on binocular matches would be stated the other way around, in that you would determine a binocular image from a monocular one. I assume, however, that the monocular edge finding is determined by the averages of the binocular matches.
The article as a whole was informative; however, some parts were vague and unclear.
This paper introduces a new method, called "scale-space filtering", for producing a useful general-purpose qualitative description of signals. The paper arose from the observation that in many sophisticated signal-understanding tasks, the problem of scale is a fundamental source of difficulty. Using the raw numerical signal values directly, or a single parameter of scale, is problematic, because for many tasks no one scale of description is categorically correct. Descriptions at different scales, related to each other in an organized, natural and compact way, are needed to solve this problem.
Scale-space filtering is then introduced. It describes signals qualitatively, in terms of extrema in the signal or its derivatives. This paper deals with one-dimensional signals: it first expands the one-dimensional signal into a two-dimensional scale-space image, by Gaussian convolution over a continuum of kernel sizes. It then uses the connectivity of extremal points tracked through scale-space (called coarse-to-fine tracking) and the singular points at which new extrema appear. The image is then collapsed into an interval tree which can describe the signal simultaneously at all scales, giving a complete qualitative description covering all scales of observation. The tree is further refined using a maximum-stability criterion to identify events that persist over large changes in scale.
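The expansion step just summarized can be sketched compactly. Below is a toy illustration (the test signal and sampled scale values are my own assumptions, and SciPy's gaussian_filter1d stands in for the continuous Gaussian convolution):

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def scale_space_image(signal, sigmas):
    """Stack Gaussian-smoothed copies of a 1-D signal: rows are scales."""
    return np.stack([gaussian_filter1d(signal, s) for s in sigmas])

def count_extrema(row):
    """Number of local extrema, found as sign changes of the first difference."""
    d = np.diff(row)
    return int(np.sum(np.sign(d[:-1]) != np.sign(d[1:])))

x = np.linspace(0, 10, 500)
signal = np.sin(x) + 0.3 * np.sin(7 * x)        # fine ripple on a coarse wave
sigmas = np.linspace(0.5, 50, 40)               # sampled continuum of scales
image = scale_space_image(signal, sigmas)

# Gaussian smoothing can only remove extrema, never create them, so the
# counts shrink as the scale parameter grows.
counts = [count_extrema(row) for row in image]
```

Tracking how those extrema merge and vanish from row to row is what the paper's coarse-to-fine tracking and interval tree then organize.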
It is a useful tool in computational vision research. One of its applications appears in the next paper, "Active Contour Models". The paper does not say much about two-dimensional images, let alone three-dimensional ones: how could this method apply to them, and what are the results after the Gaussian convolutions?
This paper discusses an active contour model named the snake. The idea arose because in low-level computational vision tasks, such as edge detection, stereo matching and motion tracking, previous approaches could not serve high-level processes; we need sets of new organizations among which high-level processes may choose.
Energy minimization is the main concept in this paper: it is treated as a framework, and how its function is designed plays an important role. The authors design this function so that its local minima comprise the set of solutions available to higher-level reasoning. An interactive approach is used by adding suitable energy terms to the minimization. The model is always minimizing its energy functional and therefore exhibits dynamic behavior; this is called a snake. The snake's energy function is given mathematically in the paper using three terms, i.e. internal energy, image forces and external constraint forces. The scale-space technique was applied to the image-force energy functional to attract the snake to zero crossings while still retaining its smoothness.
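As a rough illustration of how those terms combine, here is a toy discretization (not the authors' exact formulation; the test image and all weights are assumptions of mine):

```python
import numpy as np

def snake_energy(pts, grad_mag, alpha=0.1, beta=0.1, w_edge=1.0):
    """Internal (tension + bending) energy plus an edge-attraction image term.

    pts: (n, 2) array of snake points as (row, col); grad_mag: 2-D array of
    image gradient magnitudes. External constraint terms are omitted here.
    """
    d1 = np.diff(pts, axis=0)                  # first differences  ~ v_s
    d2 = np.diff(pts, 2, axis=0)               # second differences ~ v_ss
    e_internal = alpha * np.sum(d1 ** 2) + beta * np.sum(d2 ** 2)
    r, c = pts[:, 0].astype(int), pts[:, 1].astype(int)
    e_image = -w_edge * np.sum(grad_mag[r, c] ** 2)   # edge functional
    return e_internal + e_image

# Toy image: a bright vertical stripe, so gradient magnitude peaks at its border.
img = np.zeros((20, 20))
img[:, 8:12] = 1.0
gy, gx = np.gradient(img)
grad_mag = np.hypot(gx, gy)

on_edge  = np.stack([np.arange(2.0, 18.0), np.full(16, 8.0)], axis=1)
off_edge = np.stack([np.arange(2.0, 18.0), np.full(16, 2.0)], axis=1)
```

A snake lying on the stripe border has lower energy than the same snake in a flat region, which is all that the minimization needs.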
The paper gives examples of two of the snake's applications, stereo matching and motion tracking. Although the paper gives an energy functional, better ones may be found to make the snake more effective and a unified solution for all levels of visual processing.
In this paper, Witkin presents scale-space filtering, a method of tracking zero crossings through changes in resolution. Witkin uses it to produce a general-purpose qualitative description that is not tied to any particular scale. It seems to me that this algorithm could be very useful as an edge detector if extended into 2D.

While Witkin explains his reasoning fairly clearly, he is annoyingly vague with the details. The only formulas are simple and basic and could probably have been left out. He apparently covers all the concepts but is vague about the analysis of the final data and its refinement, leaving me wondering how many "magic numbers" would be needed for its implementation.

Kass, Witkin and Terzopoulos present snakes as a general method for edge
detection and motion tracking. They simplify the problem by reducing everything to energy minimization. Factors like the Laplacian and other ways of classifying local areas with respect to edges and the like are converted into energy levels. Smoothness constraints are also converted into energy levels. The advantage of this approach is that it is very simple to make use of more information, possibly determined after later processing, by simply converting it into energy levels, adding them to the existing ones, and minimizing the energy again. This allows arbitrary data, like attraction or repulsion to a point clicked on by the user, to be added.

This seems like a very useful technique, especially for processing video.
The ability to trivially add data, like our own identification of an object that would otherwise take huge amounts of time to identify (if it were possible at all), is enormous. The ability to keep this information and adapt it to later frames seems very powerful. One comment is that it seems to do a lot of computation for each segment of the snake, but the benefit of letting the user help identify objects is immense.
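The "just add another energy term" point can be made concrete with a sketch (the names and constants here are mine, not the paper's): a spring that pulls a chosen snake point toward a user-clicked location, plus a simple repulsion term.

```python
import numpy as np

def constraint_energy(pts, attract_to=None, repel_from=None, k=1.0, m=1.0):
    """Extra user-supplied energy: a spring toward a clicked point, and/or
    a repulsive term centered on another point."""
    e = 0.0
    if attract_to is not None:
        idx, target = attract_to
        e += k * np.sum((pts[idx] - np.asarray(target)) ** 2)   # spring
    if repel_from is not None:
        d2 = np.sum((pts - np.asarray(repel_from)) ** 2, axis=1)
        e += np.sum(m / np.maximum(d2, 1e-6))                   # repulsion
    return e

pts = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])
near = constraint_energy(pts, attract_to=(0, (0.1, 0.0)))   # click near point 0
far  = constraint_energy(pts, attract_to=(0, (5.0, 0.0)))   # click far away
```

This term is simply added to the snake's existing energy before re-minimizing; nothing else in the pipeline changes.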
The author describes a method for extracting qualitative information from
an image. The method described is relatively simple to implement and is
useful in extracting extrema and derivative information from the
signal while allowing a choice of multiple scales with which to
examine this information. In addition, the use of multiple scales
allows the examination of information that persists from one scale to another, providing further information about the image.
The point is made that the primary problem had been to choose the
appropriate scale with which to view separate events in an image. One
scale does not work for all events. However, the author's method,
'scale-space filtering', also appears to avoid the problem of data explosion. One could simply scale the image and extract the desired information, but one would be left with multiple images and an overabundance of data. Further, this method allows a closer relationship between the different scales.
To be honest, I am not sure that I understand this technique fully
enough to make any particular comments on it. The most troubling
aspect is the description of the 'tree' and how this relates to
organizing the continuous information such as zero crossings and
derivatives extracted from the image. I believe that the author could
have given more examples of the possible applications of this method.
Of particular interest would have been the pattern recognition
applications and perceptual phenomena descriptions that this method
purports to provide.
This paper talks about a technique that solves the problem of scale in describing a signal in qualitative terms, that is, in terms of its extrema. The paper points out that one of the major problems with describing a signal in these terms is scale. The process of scale-space filtering follows an intuitive idea: scale the description as you scale the image. That is, as the image is scaled, so is the process of tracking; when you change the scale of the image, the extrema are tracked at the new scale. This information is stored in a tree which then provides a description of the signal's extrema over varied scales.
This paper describes a contour model for finding and approximating edges. Basically, the snakes work by finding a local minimum of an energy over an area. The paper defines these constraints as image forces that pull the snake towards the particular parts of the image that are of interest. This energy function is expressed as the integral of E(v(s)) for s from 0 to 1. The energy breaks up into two types of constraints, internal and external.
To attract the snakes to particular areas or objects within the image, one uses an energy functional. The paper describes three different functionals, for finding lines, edges and terminations. By modifying or changing these functionals one can make different snakes that will find various objects. These changes and modifications are not limited to still images; they can also be used in motion tracking. This is accomplished by giving the snake a local minimum to keep track of.
This article presents a new method for finding a good
qualitative description of a signal. A qualitative
description is said to be one that characterizes the
signal by its extrema and those of its first few derivatives.
The main problem here is that of scale... qualitatively describing a signal at different scales often results in very different outcomes. There seems to be no good general method for finding the appropriate scale for any situation, if there even is one. The solution presented here is to vary the scaling parameter and find the signal description for each scale. These signal descriptions at different scales are laid side by side so that they make a surface called the scale-space image. This makes it possible to note changes and appearances of new extrema over different scales.
The next step is to take the information about the connectivity of extremal points over the scale space, and the points at which new extrema appear, and "collapse" this information into what they call an interval tree. Although it is not described in detail how the interval tree is derived, it is described in general terms, with all of its important features pointed out. The interval tree is a complete qualitative description of the signal over all the different scales used. The tree description is improved upon by imposing a stability criterion which favors extrema that are present over wide ranges of scale changes. The stabilization procedure is likewise described briefly and in general, not in any detail.
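A toy version of that stability idea (my own simplification, not the article's exact procedure): an extremum that survives a wide range of smoothing scales is stable, while one that vanishes under mild smoothing is likely noise.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def extrema_positions(y):
    """Sample indices of local extrema (sign changes of the first difference)."""
    d = np.diff(y)
    return np.where(np.sign(d[:-1]) != np.sign(d[1:]))[0] + 1

x = np.linspace(0, 1, 400)
# A broad bump plus a tiny narrow blip riding on its shoulder.
signal = np.exp(-((x - 0.5) / 0.2) ** 2) + 0.05 * np.exp(-((x - 0.8) / 0.01) ** 2)

def survives(pos, sigma, tol=15):
    """Does an extremum near sample `pos` still exist after smoothing at `sigma`?"""
    ex = extrema_positions(gaussian_filter1d(signal, sigma))
    return len(ex) > 0 and int(np.min(np.abs(ex - pos))) < tol

big_bump, small_bump = 200, 320   # approximate indices of the two maxima
```

The broad bump's maximum persists across a wide band of sigmas, while the blip's maximum disappears quickly; persistence over scale is exactly what the stability criterion rewards.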
The final result of the article seems to do what it sets out to do fairly well... The article is rather short and does not talk about the robustness of the method over a wide range of applications. It does not attempt to prove the goodness of the method so much as to describe the new idea.
This article presents a type of energy-minimizing spline
which the authors call a snake. One of the most important
features of the snake seemed to me to be that it has the
ability to take input from the outside which can be used
to change its behavior. By allowing the snake to be used in different ways, the problem of finding complete and accurate edge and contour recognition for any given image is addressed; the snake can be tuned for any particular application.
The snake uses a technique called energy minimization which
causes it to change its shape and wrap itself around the
edge or contour which is nearest to it from its starting
point. The article describes in some detail the mathematics
behind their energy-minimizing system. It assumes a prerequisite
level of understanding of splines and energy minimization
which I do not have; I did not understand the body of the
article well enough to comment on it.
Intuitively, the idea of a stretchable, changeable, shape-fitting edge and contour detector seems to be a very good one.
The snake described in the article seems to do a good job
of implementing this idea. Unfortunately, I have to take
their word for it.
In this paper, Andrew Witkin describes a compact representation of a signal which is not dependent on scale. The motivation behind doing this is to classify signals qualitatively, without the need for a parameter of scale, which introduces ambiguity, or for multi-scale descriptions, which increase storage requirements.
The first step
in scale-space filtering is to draw a scale-space image, which is
done by adding a new dimension to the signal, representing the
scale parameter. We now have a surface on which we can find
local extrema of any derivative. Mapping out the slope extrema
(zero-crossings of the second derivative) gives the contours of
a scale-space image, which then may be used to generate the so-called
interval tree. This tree is a simplified representation of the
contour sketch. By inspection, you could make a good sketch of the
original signal, but you lose quite a bit of detail. The author notes that
two-dimensional signals can also be represented by a 3-D scale space.
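The contour extraction just described fits in a few lines; the sketch below (the test signal and sampled scales are assumptions of mine) finds the zero-crossings of the second derivative at each scale:

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def inflection_points(signal, sigma):
    """Zero-crossings of the second derivative of the smoothed signal."""
    fxx = gaussian_filter1d(signal, sigma, order=2)
    return np.where(np.diff(np.sign(fxx)) != 0)[0]

rng = np.random.default_rng(0)
signal = np.cumsum(rng.standard_normal(600))       # a rough 1-D test signal

# Plotting these per-scale crossing sets side by side gives the contour
# diagram of the scale-space image; contours close off as sigma grows.
n_fine   = len(inflection_points(signal, 2))
n_coarse = len(inflection_points(signal, 40))
```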
This paper is short and to the point, in that it clearly describes
a method of representing the prominent features of a signal.
It basically stops here, which was a disappointment. Yes, figure
5 shows that the method does what it is supposed to, but it might
have been a good idea to extend this in two areas: first, a discussion of how to implement this for recognition purposes and of its effectiveness in that area; second, a brief summary of some other applications this would be useful for.
The authors, Kass, Witkin, and Terzopoulos describe a technique
of locating (and tracking in motion video) edges, lines, and
subjective contours. The main idea is to place a representation
of a mechanical system onto an image, and just as it happens in
the real world, minimize the energy in the system. This mechanical
model consists of "snakes" and springs, where the snake is somewhere
between a metal rod and a bendable wire. Using one of three
energy functionals as a guide, and the mathematical formulas in the
appendix, a snake (or group of snakes) will
actively move itself toward the nearest edge or contour.
This paper did a good job of explaining snakes intuitively
as well as providing mathematical detail. The wealth of examples
helped my understanding, and showed how effective this technique
can be at finding edges. The role of snakes in the grand scheme
of vision systems was well described as one reliant upon
higher level mechanisms, which possess some knowledge about
the particular application.
John Isidoro
The Scale-Space filtering paper is perhaps one of the most unclear white papers I have ever read. First of all, I am uncertain as to the exact problem the author is trying to solve. Is he trying to find a way to provide a high-level interpretation of a signal? And if so, what does this high-level interpretation provide you with? Also, his criteria for success seemed a little ambiguous. What exactly does a complete qualitative representation mean; isn't noisiness a quality of the signal? Maybe I'm missing something?
In contrast to the Scale-Space paper, the Snake paper was wonderful! Snakes are a really popular machine vision technique because they can be used for so many things, such as segmentation, edge finding, feature tracking, and many others. When I first encountered the concept last year, I found that visualizing how snakes work was very easy. Imagine the image as a mountain range, where the valleys correspond to what you want to track (light regions, dark regions, contours, etc.). The snake is an extra-heavy, thin and stretchable gummy worm that rolls down these mountains into the valleys to form a shape. It's an almost anthropomorphic approach to finding details in images. I think that implementing this one will be the true test of this paper though!
Author A. P. Witkin proposes a technique for describing a signal in terms of its 'general-purpose' qualities. This method, termed 'scale-space filtering', describes a signal qualitatively, in terms of its local extrema, its first few derivatives, and the intervals bounded by the extrema. The benefit we derive from such a representation is that we now have a set of primitives associated with a signal. That is, we have a qualitative description of the signal in an understandable and familiar form (Witkin draws the comparison with the techniques used to characterize functions in calculus). However, the real issue lies not in the form of the representation, but in addressing the issue of scale. For example, the task of computing extrema, derivatives, etc. is not difficult if we are concerned with a particular segment of a signal whose relative size is known to us, but what if we are concerned with the whole signal and all 'events' within that signal, large and small? Witkin takes a typical engineering approach whereby the goal is reduced to managing and reducing, rather than overcoming and eliminating, the ambiguities associated with the issue of scale.

The approach used by Witkin to obtain the signal representation starts by
convolving the signal with a Gaussian convolution kernel to obtain a smoothed version of the signal. Extrema and derivatives are then computed on the smoothed signal. The amount of smoothing increases as the standard deviation of the Gaussian increases. Thus, by varying the standard deviation (in a continuous manner), a qualitative description of the signal over all scales of observation can be obtained. When the standard deviation is large, a coarse scale of the signal is obtained. It is in these coarse representations that extrema can be identified. Localizing occurs at a finer scale (obtained by decreasing the standard deviation). The idea is comparable to using the Gaussian pyramid in image recognition, whereby the top levels are used to locate regions of interest while the lower levels are used in the recognition process.

I was unable to fully understand the concept introduced in regard to the interval tree. Precisely, how small does the Gaussian standard deviation get (i.e., when does the process terminate)? I am also left with questions regarding the structure of the interval tree, specifically its ternary nature. I am unable to evaluate issues of correctness and robustness in regard to Witkin's methods without first obtaining satisfactory answers to the above questions.

Authors M. Kass, A. Witkin, and D. Terzopoulos investigate the use of
energy minimization as a framework within which the task of finding
recognizable image contours (i.e. edges, lines) can be realized. Their
somewhat unorthodox method addresses the problem from a top-down perspective
rather than the traditional bottom-up approach. Attacking the problem
top-down requires knowledge of the object in question. The authors
accommodate this necessity by providing a user interface, thereby allowing a
user to inject forces to guide the procedure. For example, a user places a
contour (snake) in an image somewhere near (how near?) a contour in the
image. Depending on the image energy (a weighted combination of the image intensity, the gradient magnitude, and the derivative of the gradient angle taken along the direction perpendicular to the gradient), the internal snake energy, and the external constraint forces, the snake will deform itself into conformity with the nearest recognizable contour. A particularly interesting use of snakes, presented by the authors, was their use in motion tracking. A snake can automatically track the local minimum of a 'slowly' (how slowly?) moving feature if it has an initial lock on that feature.

I found the concept of breaking the problem into two separate problems
(high-level and low-level) and concentrating on the low-level portion rather smart. However, perhaps due to severe ignorance on my part, or due to insufficient explanation by the authors, I was unable to fully comprehend the basic snake behavior. For example, I had no idea what the authors meant by the snake acting like a membrane or a thin plate in their discussion of the snake's internal energy. Nor did I fully understand how or why the snakes were attracted to features. My experience reading this paper is somewhat akin to being given a test and realizing that I had never been to class.
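For what it is worth, one common reading of the membrane/thin-plate language (my interpretation, not a quotation of the paper) is that the first-derivative term of the internal energy resists stretching, like a membrane, while the second-derivative term resists bending, like a thin plate. A toy comparison:

```python
import numpy as np

def internal_energy(pts, alpha=1.0, beta=1.0):
    """Return (membrane term, thin-plate term) for a discrete snake."""
    v_s  = np.diff(pts, axis=0)        # first differences  -> stretching cost
    v_ss = np.diff(pts, 2, axis=0)     # second differences -> bending cost
    return alpha * float(np.sum(v_s ** 2)), beta * float(np.sum(v_ss ** 2))

t = np.linspace(0, 1, 50)
straight = np.stack([t, np.zeros_like(t)], axis=1)
bent     = np.stack([t, 0.2 * np.sin(6 * np.pi * t)], axis=1)

mem_s, plate_s = internal_energy(straight)   # straight: no bending cost
mem_b, plate_b = internal_energy(bent)       # wiggly: pays for both terms
```

Tuning alpha and beta trades off the two behaviors along the snake.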
To find image contours, not only the image itself but also external constraints are considered, so that by minimizing the energy functional locally, an active and dynamic contour model can be obtained. The snake model is based on scale-space continuation and a controlled-continuity spline under the influence of three factors: internal spline forces, image forces, and external constraint forces.
Since a snake tries to adapt itself to the optimal contour dynamically, it can also be used effectively for motion tracking. In such local energy-minimizing systems, higher-level processing can be more flexible and adaptable compared to previous rigid methods.
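A bare-bones sketch of that dynamic adaptation (a plain gradient-descent step of my own devising, not the paper's more careful scheme): each iteration the snake slides downhill on an energy surface while a smoothing force keeps it coherent.

```python
import numpy as np

def snake_step(pts, energy_field, alpha=0.5, step=0.2):
    """One descent step: external pull down the energy gradient plus an
    internal pull of each point toward the midpoint of its neighbors."""
    gy, gx = np.gradient(energy_field)
    r = np.clip(pts[:, 0].astype(int), 0, energy_field.shape[0] - 1)
    c = np.clip(pts[:, 1].astype(int), 0, energy_field.shape[1] - 1)
    ext_force = -np.stack([gy[r, c], gx[r, c]], axis=1)      # downhill pull
    neighbors = (np.roll(pts, 1, axis=0) + np.roll(pts, -1, axis=0)) / 2
    int_force = alpha * (neighbors - pts)                    # smoothing pull
    return pts + step * (ext_force + int_force)

# Energy valley along column 10 of a 20x20 grid; the snake starts at column 4.
energy = np.tile((np.arange(20.0) - 10.0) ** 2, (20, 1))
pts = np.stack([np.arange(20.0), np.full(20, 4.0)], axis=1)
for _ in range(200):
    pts = snake_step(pts, energy)        # the snake settles into the valley
```

In motion tracking, the same iteration would simply continue from the previous frame's resting position.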
To describe signals qualitatively, a manageable way to describe the scale of signals is obtained by scale-space filtering. The method begins by continuously varying the scale parameter, sweeping out the scale-space image, which is then reduced to a tree that provides a concise but complete qualitative description of the signal.
In choosing the scales, a coarse scale is used to identify extrema and
a fine scale, to localize them. With the representation from such
various scales and its simplified tree, this method serves as a powerful description of signals.
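The coarse-to-fine recipe above can be sketched as follows (the window size, scales, and test signal are all assumptions of mine): identify the dominant peak at the coarsest scale, then re-localize it within a small window at each finer scale.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

def coarse_to_fine_peak(signal, sigmas, window=10):
    """Find a peak at sigmas[0] (coarsest), then refine it at finer scales."""
    pos = int(np.argmax(gaussian_filter1d(signal, sigmas[0])))
    for s in sigmas[1:]:
        sm = gaussian_filter1d(signal, s)
        lo, hi = max(pos - window, 0), min(pos + window + 1, len(signal))
        pos = lo + int(np.argmax(sm[lo:hi]))
    return pos

x = np.linspace(0, 1, 500)
noisy = np.exp(-((x - 0.62) / 0.08) ** 2) + 0.01 * np.sin(200 * x)
found = coarse_to_fine_peak(noisy, sigmas=[40, 20, 10, 5, 2])  # true peak ~309
```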
The paper, written by Andrew P. Witkin, describes a method which can relate the descriptions of a signal at different scales to each other. This method, called scale-space filtering, consists of several stages. At the beginning, the scale parameter is varied continuously in order to sweep out the surface the author calls the scale-space image. This approach allows one to track down the extrema and to identify the singular points at which new extrema appear. The author suggests using the Gaussian convolution to compute the description of the signal that depends on scale. One of the reasons for this is that the Gaussian convolution satisfies the "well-behavedness" criteria, which is a very useful property. The coarse-to-fine tracking method is used to localize large-scale events. The interval tree method is used to describe the signal simultaneously at all scales and to generate a family of single-scale descriptions.
The paper is very well written and gives a good explanation of the methods the author is using, and the math seems correct, though difficult to follow. The author gives a few examples which show how his method works but does not give any examples of where this method can be used. If the author had shown some applications of his method, it would be easier to understand the paper.
This paper investigates the use of energy minimization as a framework to design energy functions whose local minima comprise the set of alternative solutions available to higher-level processes. A snake is an energy-minimizing spline guided by external constraint forces and influenced by image forces that pull it toward features such as lines and edges. Snakes have proven to be useful for interactive specification of image contours. The snake model provides a unified treatment of a collection of visual problems. In order to make snakes useful for early vision, we need energy functionals that attract them to salient features in images. The authors have derived energy functionals which can be applied to such tasks as finding edges in an image or finding terminations of line segments and corners. Snakes can also be applied to the problem of stereo matching.
The techniques implemented in the paper are very useful for dealing with a
number of visual problems. The authors explain their methods in a way that is easy to follow. This paper introduced me to a variety of methods for dealing with images which I find very useful.
In image processing and other signal-processing tasks, the extrema of a signal and of its first few derivatives are used as a qualitative description of the signal. For example, they define the edges in an image. The neighborhood within which the derivatives are calculated affects the result of the calculation. Therefore, a principle is needed to select the basis for neighborhood determination. Andrew P. Witkin's article, 'Scale-Space Filtering' (Proceedings of the 8th International Joint Conference on Artificial Intelligence, August 1983, pages 1019-1022, volume 2), proposed a scale-space filtering approach to solve this problem.

Under this approach, the representation of scale is done by a Gaussian
convolution to form a scale-space image. The reason a Gaussian convolution is needed is that the standard deviation of the Gaussian distribution serves as a gauge for the scales. In scale-space, the problem of localizing large-scale events is solved by coarse-to-fine tracking, which is done by adjusting the standard deviation along the zero-second-derivative contour. The problem of multi-scale integration is solved by an interval tree method, which reduces the scale-space image into a tree structure.

The topic and the methods proposed by the author are technically
important. However, I personally find the author's presentation hard to understand. Scale-space itself is an abstract concept. The author aggravated the level of difficulty by presenting the interval tree with even less illustration and almost no mathematical representation.

Application of the scale-space concept in the context of image
processing was discussed by Michael Kass, Andrew Witkin, and Demetri Terzopoulos in their article, 'Snakes: Active Contour Models' (International Journal of Computer Vision, 321-331, Kluwer Academic Publishers, Boston, 1987). Their approach is a variational method to find and determine contours and edges.

Basically, their method is to use special knowledge of the objects in
the image as control elements for the variational calculation. Edge finding and contour determination are accomplished through splining based on a minimum-energy principle, which is relatively well established in mathematics. The authors adopted the scale-space concept proposed by Andrew Witkin; their variational method proved to be more powerful with the property of scale-space continuation. Unlike the previous article, the authors emphasize tangible accomplishments more than introducing an algorithmic concept. The contour determination example in the orange-and-pear case offered a good illustration.
The Scale-space article written by A. Witkin is a very interesting approach to feature detection, which he claims describes the signal qualitatively without introducing artificial parameters. I don't know exactly what that means, but I do think that if it works it is very powerful.
The basic idea is that we humans process sound at multiple levels and that we can pretty clearly sift through garbled signals. The perfect example is that at a dinner party we can pick out individual conversations without much effort. A. Witkin describes a method by which we can detect these features using a bottom-up approach.
The notion is that when you analyze a signal, you first pick the scale of the features you wish to detect. In the scale-space filtering method, you apply an algorithm which ends up using all scales which would be perceptible in the signal. By convolving the signal with a Gaussian you simplify it. Each successive signal is further convolved until it is a simple signal. This technique is pretty similar to that used for image pyramids.
The key to detecting the features in the signal involves finding all the "information" segments. In information theory, a predictable signal contains little information; therefore, signals which contain lots of information will contain many zero-crossings across frequency boundaries. Since each convolution is in essence like taking a derivative of the signal, you find all the zero crossings in each layer.
Witkin takes the various zero crossings and generates an interval tree from them, and from that he can reconstruct all the details at any particular layer in the signal. But what is important is that he claims we detect those features which transcend convolution boundaries, i.e., signals which span many frequency octaves. So if a zero crossing transcends the different layers, then it is used to divide the neighboring signals into different features. Those features which transcend the largest range of frequencies are more noticeable.
The article is not very good about giving examples, and more examples would elucidate the subject and convince the reader. The idea is very original, and useful if it is true. The points which I question are: how is the interval tree constructed, how is the convolution performed, and what proof is there that humans detect signals which transcend frequency-octave boundaries?
The snake article by Kass, Witkin and Terzopoulos is useful. It describes a formal model for specifying a template which a computer can use to identify objects given the correct parameters.
A snake is a mathematical model which follows the laws of physics to help quantify physical objects in a discrete world. All images are discrete and as such contain quantization errors when objects are rendered in different real-world examples. The objects themselves may be such that they do not reproduce identically. The snake helps to identify these objects by allowing a fuzzy fit.
Snakes contain some properties which allow them to interact with
the values of an image at any particular point. These properties
can make the snake get attracted to some points and to be repelled
from others. An example which the paper gives is that in a motion
sequence a snake can be placed around some lips. In each
subsequent frame, the snake will trace the changed shape of the lips
because it is following the contours in the image. In another
example with stereo vision, two snakes which identify two objects in
two pictures can identify that the same object appears in both
images.
Snakes can be made to contain predefined shapes and will subsequently be attracted to and match only those objects which match the predefined shape of the snake, without being constrained by the size of the object.
Snakes know that they have recognized objects when their energy values stabilize at some value. Since the energy of a snake can be specified by any mathematical property, snakes can be made to match corners, to be repelled from certain areas, and to be attracted to certain features.
(Article Review)
Filtering a signal by using its first derivatives is normally useful for obtaining a relatively accurate qualitative description of it. But discrete filters have to be of a fixed size; in other words, the extent of the neighborhood is a value that we should know in order to use these techniques. The problem is that the exact size of the filter cannot be established without having some extra information, like the size and location of objects in the image.
Meaningful objects or events in the signal are frequently located by obtaining local extrema and derivatives; edges in images are located in this way. But the problem of scale makes unrestricted processing very difficult: the meaningful objects may be different if the image is presented at a different extent and size.
A scale parameter would be a quick solution: we could create a mask
whose size is determined by this parameter. But ambiguous
descriptions would then appear, since every different scale value
generates a different description, and no single scale value can be
considered correct.
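The point that every scale value yields a different description can be shown with a small experiment. This is an illustrative sketch only; the test signal, noise level, and sigma values are my own, not from the paper:

```python
import numpy as np

def gaussian_smooth(signal, sigma):
    """Convolve a 1-D signal with a sampled, normalized Gaussian kernel."""
    radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    return np.convolve(signal, kernel / kernel.sum(), mode="same")

def count_local_extrema(signal):
    """Count interior points where the slope changes sign."""
    d = np.diff(signal)
    return int(np.sum(d[:-1] * d[1:] < 0))

rng = np.random.default_rng(0)
t = np.linspace(0.0, 4.0 * np.pi, 400)
noisy = np.sin(t) + 0.3 * rng.standard_normal(t.size)

# Descriptions at two scales disagree: the fine scale sees mostly
# noise extrema, the coarse scale sees only the underlying sine.
fine = count_local_extrema(gaussian_smooth(noisy, sigma=1.0))
coarse = count_local_extrema(gaussian_smooth(noisy, sigma=15.0))
```

Neither count is "the" description of the signal, which is exactly the ambiguity scale-space filtering sets out to handle.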
Other methods have been used to attack this problem, such as pyramids
and zero-crossings, but according to this work, no clear criteria for
obtaining edge pyramids, nor an adequate test of zero-crossing
techniques, have been reached.
This work relates descriptions of a signal at different scales by
continuously varying the scale parameter, sweeping out a surface
called the scale-space image. The singular points at which new
extrema appear are found, and the image is then reduced to a tree
structure. This yields a qualitative description of the signal at any
scale.
Gaussian convolution is used to expand the signal across scales; the
result is then collapsed and the inflection points are found.
These inflection points (F_xx = 0) mark the appearance of contours.
Two assumptions are made: that inflection points found at different
scales relate to the same event, and that the location of this event
is the contour (horizontal) location as gamma tends to infinity.
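This step can be sketched with a discrete second difference standing in for F_xx and a sampled Gaussian kernel; the test signal and sigma values here are illustrative assumptions, not the paper's:

```python
import numpy as np

def inflection_points(signal, sigma):
    """Gaussian-smooth the signal, then return indices where the
    second difference (a discrete F_xx) changes sign."""
    radius = int(4 * sigma)
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x ** 2 / (2.0 * sigma ** 2))
    smoothed = np.convolve(signal, kernel / kernel.sum(), mode="same")
    fxx = np.diff(smoothed, n=2)
    return np.nonzero(fxx[:-1] * fxx[1:] < 0)[0]

rng = np.random.default_rng(1)
# A noisy step: one "real" event at index 200, plus noise events.
signal = np.where(np.arange(400) < 200, 0.0, 1.0) \
         + 0.2 * rng.standard_normal(400)

# Inflection points disappear as sigma grows; none should appear.
n_fine = len(inflection_points(signal, sigma=2.0))
n_coarse = len(inflection_points(signal, sigma=12.0))
```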
To get the interval tree for the signal, the scale-space image is
reduced to a structure that describes the qualitative form of the
signal completely over all scales of observation. When a sign change
appears as gamma varies, the interval divides into sub-intervals.
The nodes of the tree are generated so that a parent node represents
the large interval from which the node's interval emerges, and its
offspring are the smaller intervals into which it subdivides. This
interval tree represents the signal at all defined scales.
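The parent/offspring subdivision can be sketched as follows. This is a hypothetical minimal builder that works from precomputed crossing positions per scale, not the paper's actual algorithm:

```python
# Input: inflection (zero-crossing) positions at each scale, listed
# from coarsest to finest. An interval subdivides at the first finer
# scale that introduces new crossings inside it; the resulting
# sub-intervals become its children.

def build_tree(crossings_per_scale, start, end):
    for level, crossings in enumerate(crossings_per_scale):
        inner = sorted(c for c in crossings if start < c < end)
        if inner:
            bounds = [start] + inner + [end]
            rest = crossings_per_scale[level + 1:]
            return [{"interval": (lo, hi),
                     "children": build_tree(rest, lo, hi)}
                    for lo, hi in zip(bounds[:-1], bounds[1:])]
    return []  # no finer scale splits this interval: a leaf

# Example: one crossing at 50 at the coarsest scale, three at a
# finer one, over a signal of length 100.
tree = build_tree([[50], [20, 50, 80]], 0, 100)
```

Each top-level node holds a coarse interval; descending the tree refines the description, so reading the tree at any depth gives the signal's structure at the corresponding scale.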
This technique is based on the localization of large-scale events. I
think that is a very clever way to deal with the ambiguity of the
scale problem in a natural way, i.e. avoiding the introduction of
arbitrary parameters.
(Article Review)
Low-level tasks in machine vision have followed an inflexible
sequential approach, using only the information available in the
image itself; low-level mechanisms thus form the basis of image
analysis, and no alternative organizations are provided. The approach
of this article is to provide alternative organizations that are
useful for higher-level processes, and it uses energy-minimization
concepts to achieve this.
Energy-minimization functionals provide local minima that can
comprise the set of alternative solutions available to higher-level
processes. The technique uses high-level knowledge to guide it.
This work aims to find edges, lines, and subjective contours, and to
follow contours in order to match them. It is based on the idea that
an interpretation of an object is difficult to justify and depends on
the kind of knowledge we have about it. Because a signal's exact
interpretation is almost impossible to define, the authors suggest
finding local minima instead of a global minimum. This is considered
an active mechanism because it is always minimizing its energy
functional.
Snakes depend on other mechanisms to place them near a contour, so
they can be used as a semi-automatic image representation. From any
starting point, the snake changes its shape to match the nearest
contour (lines, edges, and subjective contours); it is pushed toward
contours by the image energy.
From a particular starting point, perhaps user-defined (although it
can be set by other techniques), energy minimization pulls the snake
to the edge. This feature provides a basic tool for image
interpretation.
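A minimal sketch of this pull-to-the-edge behavior, assuming plain explicit gradient descent on E_edge = -|∇I|^2 plus a simple neighbor-smoothing term. The paper's actual method uses a semi-implicit spline solver; the disk image, step size tau, and weight alpha here are all illustrative:

```python
import numpy as np

# A soft bright disk of radius 25 centered at (50, 50).
yy, xx = np.mgrid[0:100, 0:100]
r = np.hypot(xx - 50, yy - 50)
img = 1.0 / (1.0 + np.exp((r - 25.0) / 3.0))

# Edge energy: large gradients give low (attractive) energy.
gy, gx = np.gradient(img)
energy = -(gx ** 2 + gy ** 2)
energy /= -energy.min()            # normalize to [-1, 0]
ey, ex = np.gradient(energy)

# A circular snake started outside the disk, at radius 30.
theta = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
pts = np.stack([50 + 30 * np.cos(theta),
                50 + 30 * np.sin(theta)], axis=1)

tau, alpha = 0.5, 0.05
for _ in range(300):
    xi = np.clip(pts[:, 0].astype(int), 0, 99)
    yi = np.clip(pts[:, 1].astype(int), 0, 99)
    force = -np.stack([ex[yi, xi], ey[yi, xi]], axis=1)  # -grad(E)
    smooth = (np.roll(pts, 1, axis=0) + np.roll(pts, -1, axis=0)
              - 2.0 * pts)                               # keeps points together
    pts = pts + tau * force + alpha * smooth

radii = np.hypot(pts[:, 0] - 50, pts[:, 1] - 50)
# The snake has settled on the disk boundary, near radius 25.
```

From a rough starting circle, the image energy alone locates the edge; that is the "basic tool" the review describes.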
This work defines three energy functionals that attract the snake to
features; the total energy is expressed as a weighted combination of
them:
E_line = I(x, y), which attracts the snake depending on the intensity
of the nearby contour;
E_edge = -|∇I(x, y)|^2, which attracts the snake to contours with
large image gradients (∇ = gradient);
E_term, which finds terminations of line segments and corners (it
uses the curvature of level lines in a slightly smoothed image).
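The first two functionals can be sketched directly with numpy finite differences for the gradient; the weights are illustrative, and E_term (which needs level-line curvature) is omitted:

```python
import numpy as np

def image_energy(img, w_line=1.0, w_edge=1.0):
    """Weighted combination of E_line and E_edge over an image."""
    i = img.astype(float)
    gy, gx = np.gradient(i)
    e_line = i                        # E_line = I(x, y)
    e_edge = -(gx ** 2 + gy ** 2)     # E_edge = -|grad I(x, y)|^2
    return w_line * e_line + w_edge * e_edge

# A vertical step edge: using E_edge alone, the energy is minimized
# at the step, so a snake minimizing it is drawn to the edge.
img = np.zeros((10, 10))
img[:, 5:] = 1.0
energy = image_energy(img, w_line=0.0, w_edge=1.0)
col = np.unravel_index(np.argmin(energy), energy.shape)[1]
```

Note that with a positive w_line the snake is drawn to dark regions (low I minimizes the energy), while a negative w_line attracts it to light lines.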
The paper discusses how snakes can be applied to the problems of
stereo matching and motion. Motion is tracked by snakes because of
their ability to lock onto a salient visual feature: under slow
movement the snake tracks the same local minimum, while fast movement
may change the local minimum that the snake is tracking.
This approach is based on iterative computation, finding image
contours and minimizing energy at each step. Besides all of the
above, it can also be useful for matching 3-D models to images. The
effectiveness of the method can be improved by finding new useful
energy functionals or by combining existing ones; for example, by
combining E_edge and E_term it was possible to create a snake that is
attracted to both edges and terminations. This provides a uniform way
to find different points of interest in the image (edges, contours,
subjective contours). The work introduces a new proposal: to let all
levels of visual processing influence the lowest-level visual
interpretation, and to perform a dynamic approximation to points of
interest.
The technique of implementing snakes is presented in this paper. A
snake is an energy-minimizing spline that is guided by external constraint
forces and influenced by image forces that pull it toward image features
such as edges and lines. Snakes are very useful creatures: there are many
applications in computer vision for this technique, such as detecting
edges, lines, and subjective contours. The authors also give
a good account of how snakes can be useful in stereo matching and motion
tracking. Also, there are other areas for which snakes can prove to be very
useful. One of those would be recognition, I imagine. Given a contour of an
object that we'd be looking for, one could use snakes to find an object in
the image, lock the snake onto it, and then match the found contour with
the given one. Also, once that's done, one could track particular
objects and study their behavior using snakes. The government is
probably already working on implementing snakes to track down all the
foreign spies, at the very least.
This paper was helpful in understanding the nature and behavior of
snakes, although it wasn't clear to me how to implement a snake given
just this paper. The authors do a great job of explaining how a snake
would behave in different cases, such as in detecting lines and
edges, as well as in motion tracking and stereo matching. What I found most
useful is the explanation of some of the formulas for snake behavior, such
as the energy formulas. I found this paper very interesting and neat.
The scale-space filtering technique is a way of describing signals
qualitatively at all scales simultaneously, in terms of the extrema
in the signal or its derivatives. I find very interesting the
authors' approach of escaping the ambiguity of choosing some specific
scale for general processing of the signal. The interval tree is a
simple way to represent all scale levels of the signal: a family of
single-scale descriptions is generated and then combined into a tree
of nodes. My only comment on this technique concerns the efficiency
of the representation's storage. The authors don't address the issue
of storing the actual tree, which, I imagine, can 'grow' quite large
for complicated signals.
I greatly appreciate the authors' effort to present their idea in a
consistent, coherent manner, which made the explanation of the
material much more readable. This paper was quite interesting, though
I am not entirely sure where this technique is going to be used.
Well, not much to say about this one...except that I was
thoroughly confused by the end of the second page...so,
consequently, I stopped reading, and went on to the next
paper.
The most interesting aspect of snakes is that they provide
a way to detect edges and follow them through motion sequences.
The article gives an example of a speaker's lips. Through
8 images, the snake followed the contour of the lips throughout
each frame. The initial snake was placed in the general vicinity
of the speaker's lips by someone. Then the snake found the edges
of the lips and followed them through the sequence of images.
This seems like a very useful tool. A possible application
of this technology would be in robots. Robots could effectively
follow someone (like someone's dog...or pet rabbit) by using
a camera, conveniently mounted on the robot's head, and process images
of the person being followed. The result of this would be
a very annoying pet dog...or rabbit...that would constantly
follow you around your house.
On the technical side of this paper, I followed a good part of
the equations, but some of the issues being discussed obviously
depended on the reader (being me) having some other knowledge
of the material (which I didn't have) to follow those parts
of the article. But overall, it was a very interesting paper.
Reviews by: Bin Chen, Jeffrey Considine, Cameron Fordyce, Timothy
Frangioso, Jason Golubock, Jeremy Green, Daniel Gutchess, John
Isidoro, Tong Jin, Leslie Kuczynski, Hyun Young Lee, Ilya Levin, Yong
Liu, Nagendra Mishr, Romer Rosales, Natasha Tatarchuk, Leonid
Taycher, Alex Vlachos
Articles reviewed: "Scale-Space Filtering" by A. Witkin; "Snakes:
Active Contour Models" by Michael Kass, Andrew Witkin and Demetri
Terzopoulos
Stan Sclaroff
Created: Sep 26, 1996
Last Modified: Sep 30, 1996