BU CLA CS 835: Seminar on Image and Video Computing

Class commentary on articles: Texture



	
		


Paul Dell

H. Tamura, S. Mori, and T. Yamawaki. "Textural features corresponding to visual perception." IEEE Transactions on Systems, Man, and Cybernetics, SMC-8(6):460--472, 1978.

The Tamura article examines the correlation between human vision studies and computational models of six textural features: coarseness, contrast, directionality, line-likeness, regularity, and roughness. The study included 48 subjects and 16 patterns. It is impressive that the study looked at a large variety of textural features instead of limiting itself to one or two. The model and the human studies corresponded well for coarseness, contrast, and directionality. Overall similarity measurements based on simple combinations of the computational measures, however, did not compare well with the human subject results, and the authors suggested further studies using more sophisticated combinations of the features. They attributed the failure not to the individual feature measurements but to the modeling of how the various cues combine. The article was published in 1978, which suggests it may be one of the first to examine this range of textural features, so it is not fair to compare a 1978 article to a 1995 one in model complexity or accuracy of results. One shortcoming of the Tamura study, though, was the use of only 16 patterns. I believe there should have been two to four times as many patterns, and that at least half of the patterns should have come from actual physical objects.

R. Picard and M. Gorkani. "Finding perceptually dominant orientations in natural textures." Spatial Vision, 8(2):221--253, 1994.

Picard and Gorkani describe a study comparing a computational method for determining dominant orientations with a human study. The human study results are used to refine the computational method so that it more closely models the human responses. The image analysis system utilizes a steerable pyramid of steerable filters.
The pyramid consisted of four levels; a histogram was recorded for each level, and these were later combined for comparison with the human studies. Histogram smoothing and contrast compensation were used to aid in finding prominent peaks. The human studies used forty MIT students with normal or corrected-to-normal vision. Two interesting notes on the results of the human studies: all subjects used the full range of strengths with an approximately uniform distribution, and only 4 subjects had variances greater than 10%. This surprised the reader, who expected greater variation in the responses. Because of the uniformity of the human results, they seem a reasonable target to model. The computational system had 7 variables that could be adjusted to optimize against the human target. After some optimization, the correlations achieved were:

- dominant orientations of 68/111 images were matched without contrast normalization
- dominant orientations of 74/111 images were matched with contrast normalization
- matching just 1 dominant orientation brings the success rate up to 95/111

The large number of images, and the fact that the images are natural textures, adds much to the applicability of this study. Overall the article is readable and gives sufficient background information for the reader.
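As a concrete illustration of the smoothing-plus-peak-finding step described above, here is a minimal sketch in Python; the bin count, smoothing width, and threshold are illustrative choices, not the parameters from the paper:

```python
import numpy as np

def dominant_orientations(angles, magnitudes, bins=45, sigma=1.0, thresh=0.5):
    """Histogram gradient angles (weighted by magnitude), smooth with a
    circular Gaussian, and return prominent peak orientations in degrees."""
    # Fold angles into [0, 180): orientation is direction-insensitive.
    angles = np.mod(angles, 180.0)
    hist, edges = np.histogram(angles, bins=bins, range=(0, 180),
                               weights=magnitudes)
    # Circular (wrap-around) Gaussian smoothing of the histogram via FFT.
    k = np.arange(bins)
    d = np.minimum(k, bins - k)
    kernel = np.exp(-0.5 * (d / sigma) ** 2)
    kernel /= kernel.sum()
    smooth = np.real(np.fft.ifft(np.fft.fft(hist) * np.fft.fft(kernel)))
    # A bin is a peak if it beats both circular neighbours and the threshold.
    left, right = np.roll(smooth, 1), np.roll(smooth, -1)
    peaks = (smooth > left) & (smooth > right) & (smooth > thresh * smooth.max())
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[peaks]
```

With a tight cluster of angles near 90 degrees and one stray angle, the smoothed histogram yields a single dominant peak near 90, while the stray falls below the strength threshold.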

John Petry

TEXTURAL FEATURES CORRESPONDING TO VISUAL PERCEPTION, Tamura, Mori and Yamawaki
____________________________________________________

The authors computed six different texture measures and compared them to human perceptions of texture, in an effort to determine which computationally tractable approaches replicate human perception. They performed their experiments on what they believed was a representative subset of 16 of the 112 images in the Brodatz texture album.

Of the six measures, coarseness, directionality and contrast performed well, with some limitations. For instance, they measured directionality by histogramming Sobel images by angle (after passing a certain magnitude threshold), then peak-finding. This may work well for some fine features, e.g. a checkerboard, which would return a set of 0/180 and 90/270 degree edges, but not for a set of checkers arrayed in a similar pattern, since there would be an even distribution of angles. In other words, there was no work at multiple resolutions to account for scale. Similarly, their contrast measurement is weak in that it is directly linked to the number of edges in a scene, which can also be a function of resolution.

Their other three measures did not perform well. The line-likeness test was apparently very parameter-dependent, which doesn't give one much confidence given the tiny size of their sample set (16 images). The other two measures, regularity and roughness, were not direct measurements, but combinations of the results of the four basic measurements described. Neither corresponded well with human judgements of the same attributes. Regularity was apparently very dependent on scale choices. Roughness didn't match human choices well, perhaps in part because the authors believe their choice of terminology may have biased the human classifiers.

Some of these measures seem to have worked well, or at least promised that possibility if problem areas were taken into account.
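The directionality measure described above (Sobel gradients, a magnitude threshold, then an angle histogram) can be sketched roughly as follows; the threshold and bin count here are illustrative, not Tamura's exact formulation:

```python
import numpy as np

def directionality_histogram(img, bins=16, mag_thresh=0.1):
    """Sketch of a Tamura-style directionality measure: Sobel gradients,
    keep pixels above a magnitude threshold, histogram the edge angles."""
    img = np.asarray(img, dtype=float)
    # 3x3 Sobel kernels applied via explicit shifts (no SciPy needed).
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            # shifted[i, j] == img[i + di, j + dj] (with wrap-around).
            shifted = np.roll(np.roll(img, -di, axis=0), -dj, axis=1)
            gx += kx[di + 1, dj + 1] * shifted
            gy += ky[di + 1, dj + 1] * shifted
    mag = np.hypot(gx, gy)
    # Fold angles into [0, 180): an edge and its reverse are one orientation.
    ang = np.mod(np.degrees(np.arctan2(gy, gx)), 180.0)
    keep = mag > mag_thresh * mag.max()
    hist, _ = np.histogram(ang[keep], bins=bins, range=(0, 180))
    return hist / hist.sum()
```

For an image with a single vertical step edge, all the retained gradient energy lands in the bin containing 0 degrees, i.e. a horizontal gradient direction.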
One of the largest such problems seems to have been scale, which affected almost all of their tests. In addition, their reliance on a sample set which could include only a few samples of each case of interest leaves us unable to extrapolate too much from their results.

FINDING PERCEPTUALLY DOMINANT ORIENTATIONS IN NATURAL TEXTURES, Picard and Gorkani
______________________________________________________________

Picard and Gorkani took one of the measures from the first paper, finding dominant orientations in natural texture, and investigated it in much more detail. They did not use the identical algorithm. Instead, they divided the measure into four separate tests at different resolutions and used steerable filters to measure the dominant angle(s) and peak strengths. [Unfortunately, two pages describing the details of their algorithm were missing from my copy.] They also performed Gaussian smoothing to reduce noise, and contrast normalization to minimize the effects of shadowing and uneven lighting.

Their basic approach was an iterative one, modifying parameters for strength thresholds, angle quantization size, and the design of their Gaussian filter in an attempt to maximize performance over their database. Since they ran their tests over the entire Brodatz database, we can have more confidence in their results. For most classes of images, good results were obtained when comparing the computer measurements of dominant orientation(s) with those of human judges. Overall, they did a better job of determining human judgements than Tamura et al, who merely ranked images according to each scale being tested.

Picard and Gorkani had some problems with the human test; most notable to me was that the humans viewed circular windows of the original Brodatz samples, while the computer operated on rectangular windows.
While the authors identified a major problem in only one case, it would seem best to perform both tests on circular images -- it didn't seem that difficult to adjust the algorithm to cope with this [again, as far as I could tell given the two missing pages]. Overall, the authors had good results. Given that the results were tuned to a specific database, I'd be very curious to see how well their parameterization generalizes to other sets of images. One positive note is that man-made scenes are often dominated by strong orthogonally-oriented features, and so might work even better than natural scenes for this type of approach.

THE DESIGN AND USE OF STEERABLE FILTERS, by Freeman and Adelson
_______________________________________

The authors address the problem that while it is often desirable to apply filters at particular angular orientations, it is computationally very expensive to run such filters over an image at every conceivable desired angle. Instead, they show an approach which relies on applying a set of basis filters to the image, then interpolating their results to determine the response at a particular angle. The number of basis filters needed to steer a function can be determined from its Fourier representation. Representing its terms as a polynomial expression in Cartesian coordinates is often easier, and will yield a sufficient number of basis filters for the task, but it is not guaranteed to produce the *minimum* number of basis filters needed; e.g. 2N+1 filters are sufficient to steer an Nth order polynomial, but fewer may be acceptable. The authors point out that steerable filters work with discretely sampled functions as well as continuous ones, so they are applicable to machine vision.
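The interpolation idea is easiest to see in the simplest case: a first derivative of a Gaussian is exactly steerable from just two basis filters. A minimal sketch (the filter size and sigma are arbitrary choices):

```python
import numpy as np

def gaussian_derivative_basis(size=9, sigma=1.5):
    """x- and y-derivative-of-Gaussian basis filters (unnormalized)."""
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return -x * g, -y * g   # G_x, G_y

def steer(gx, gy, theta):
    """First-derivative filter at angle theta (radians) as a linear
    combination of the two basis filters (the steering theorem)."""
    return np.cos(theta) * gx + np.sin(theta) * gy
```

The practical payoff is that the two basis filters are applied to the image only once; by linearity of convolution, the filtered image at any angle theta is then just cos(theta) times the G_x response plus sin(theta) times the G_y response.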
They also give several sample applications, such as determining dominant local orientations (note Picard and Gorkani, above); filtering noise on angularly aligned features; distinguishing line data from edge data in a way that edge filters such as Sobel or Canny (or, conversely, line filters) cannot; and shape-from-shading computations. Finally, the authors demonstrate that their approach extends to three-dimensional computations.

Shren Daftary

Synopsis for Finding Perceptually Dominant Orientations in Natural Textures by Picard and Gorkani

This paper presents methods to determine texture orientation, since orientation is an important visual cue. Algorithms mentioned for extracting orientation information include Gabor filters, wavelets, and a steerable pyramid. The steerable pyramid determines information for all orientations at a small computational cost. This may not be necessary in many cases, though, since as the authors state, the majority of human visual experiences correspond to either horizontal or vertical patterns. The algorithm for orientation detection using the pyramid is given; the orientations it produces can then be used by simpler algorithms.

Once the orientation data is gathered, the authors suggest smoothing to reduce noise, then finding prominent peaks of the resultant histogram. Dominant peaks are identified either by locally fitting the histogram to a Gaussian or by taking derivatives. Each peak is assigned a weight, and initially these weights do not necessarily correspond to human perception, although it seems the formula can eventually be tweaked to do so. One problem presented is the inability of the standard algorithm to compensate for contrast differences between images. This may not be a significant problem, because humans may perceive orientations more strongly in higher-contrast images.

The next part of the paper discusses the testing of humans on texture orientation. The differences between computer identification of orientation and human identification may be used to strengthen the computer algorithm. For most of the images, humans and the algorithm agreed. The cases of disagreement showed the ability of computer analysis to pick up hidden patterns and avoid confusion between symmetry and orientation.
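The Gaussian curve-fitting step for locating a peak can be approximated compactly: fitting a parabola to the logarithm of the histogram around the peak bin is a local Gaussian fit. A sketch, not the authors' exact procedure:

```python
import numpy as np

def refine_peak(hist, centers, i):
    """Refine the location of histogram peak bin i by fitting a
    parabola to the log-counts at bins i-1, i, i+1 (equivalent to a
    local Gaussian fit). Assumes 0 < i < len(hist) - 1 and strictly
    positive counts at those three bins."""
    y = np.log(hist[i - 1:i + 2])
    # Vertex offset (in bins) of the parabola through the three points.
    denom = y[0] - 2.0 * y[1] + y[2]
    offset = 0.5 * (y[0] - y[2]) / denom
    step = centers[1] - centers[0]
    return centers[i] + offset * step
```

Because the log of a Gaussian is exactly a parabola, a histogram sampled from a Gaussian profile is refined to the true peak location even when it falls between bin centers.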
Overall this paper brought out the important issues in orientation identification, but did not explain or prove the necessity of a steering technique for orientation determination, since the majority of the images did not present any orientation other than horizontal, diagonal, or vertical, which can be handled by other, less intensive techniques.

Synopsis for Textural Features Corresponding to Visual Perception by Tamura, Mori, and Yamawaki

This classical paper also presents techniques for computers to deal with textural cues. It presents methods to approximate coarseness, contrast, directionality, line-likeness, regularity, and roughness. Coarseness refers to the scale of objects and the degree of repetition. Contrast is concerned with the part of the gray scale that an image uses. Directionality relates to orientation. Line-likeness is concerned with whether there are lines or blobs in an image. Regularity relates to the variation of structure. Roughness corresponds to the perception our sense of touch would have of an object.

Psychological experiments were performed to rank a set of 16 images; these experiments demonstrated the unclear nature of some of the above definitions. An algorithm is presented to calculate coarseness, followed by methods for contrast, directionality, line-likeness, regularity, and roughness. With these algorithms, the differences between human perception and computer analysis corresponded well for contrast and coarseness. For directionality, line-likeness, and regularity the correspondence was reasonable, while for roughness the algorithm was too crude to provide an accurate estimate.

This paper presented some interesting concepts, but as can be expected for a classical paper, current research has expanded on its concepts and improved its techniques, while probably relying on its premises for ideas on texture determination.
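Tamura's coarseness measure, as summarized above, can be sketched in miniature: compare box averages at several scales and, per pixel, keep the window size with the largest difference between opposite non-overlapping windows. This loose sketch ignores the paper's exact border handling and parameter choices:

```python
import numpy as np

def tamura_coarseness(img, kmax=3):
    """Rough sketch of Tamura coarseness: for each pixel, pick the
    window size 2^k whose non-overlapping neighbour averages differ
    most, then return the mean best window size over the image."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    best = np.zeros((h, w))   # best average difference seen so far
    size = np.ones((h, w))    # window size achieving it
    for k in range(1, kmax + 1):
        s = 2 ** k
        # s x s box averages via a 2-D integral image (cumulative sums).
        ii = np.pad(img, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
        avg = (ii[s:, s:] - ii[:-s, s:] - ii[s:, :-s] + ii[:-s, :-s]) / s**2
        # avg[i, j] is the mean of img[i:i+s, j:j+s].
        # Differences between box averages s apart (non-overlapping windows).
        eh = np.abs(avg[:, s:] - avg[:, :-s])
        ev = np.abs(avg[s:, :] - avg[:-s, :])
        # Compare on the common valid region for this scale.
        e = np.maximum(eh[:h - 2 * s + 1, :], ev[:, :w - 2 * s + 1])
        bsub = best[:e.shape[0], :e.shape[1]]
        szsub = size[:e.shape[0], :e.shape[1]]
        mask = e > bsub
        bsub[mask] = e[mask]
        szsub[mask] = s
    return size.mean()
```

On a period-2 checkerboard every box average is 0.5, so no scale ever wins and the measure stays at its minimum, while an image of large uniform blocks picks larger windows near block boundaries and scores higher, matching the intuition that it is the coarser texture.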

Lars Liden

"Textural Features Corresponding to Visual Perception" Tamura, Mori & Yamawaki

The authors of this paper attempted to create computational measures which mimic human perception for six basic textural features: coarseness, contrast, directionality, line-likeness, regularity and roughness. They began by asking subjects to make pairwise comparisons between a set of images for a given textural property, ordered the images based on the subjects' perceptions, and then examined the correlations between the properties. Next, the authors applied computational heuristics for each psychological category and similarly ordered the images by each textural property. They found that the heuristics for coarseness, contrast and directionality ordered the images in a way similar to the human subjects, whereas the heuristics for line-likeness, regularity and roughness were unsuccessful.

The utility of creating computational measures of this type is not clear. Psychometric measurement of the kind performed with subjects does not necessarily correspond to the processing methods subjects use for image segmentation and identification. Many studies have shown clear discrepancies between human introspection about decision making and perception and the actual processing used by subjects. Additionally, I find it unlikely that finding heuristics capable of mimicking human perceptual judgements will yield useful image processing techniques. As the authors point out, humans can apply an enormous library of somatosensory information to label images with features such as roughness. Given the amount of prior knowledge available to human subjects and the sheer amount of higher-level processing that must occur for a human to make a verbal response, it is unlikely that this method will be directly applicable to image processing.
"The Design and Use of Steerable Filters" Freeman and Adelson

The paper examines the ability of what the authors call a "steerable filter" to perform tasks such as edge detection. A steerable filter is one in which a filter of arbitrary orientation can be formed as a linear combination of a set of "basis filters". It is desirable to find the minimum number of basis functions which can form all other filters by interpolation. The authors show that any function which can be expressed as a Fourier series in angle, or as a polynomial expansion in x and y times a radially symmetric window function, can be represented by a steerable filter. The authors examine the use of steerable filters for analyzing local orientation, detecting contours, and shape-from-shading.

I found this paper far superior to the previous one, as it directly addressed issues relevant to image processing and demonstrated the utility of a particular method. Additionally, the type of processing discussed in this article is similar to the type of low-level processing that has been suggested to occur in the human brain. Although higher-level human processing is unlikely to be relevant, lower-level systems may be.


Stan Sclaroff
Created: Sept 25, 1995
Last Modified: