BU CLA CS 835: Seminar on Image and Video Computing

Class commentary on articles: Texture



	
		


Paul Dell

H. Tamura, S. Mori, and T. Yamawaki. "Textural features corresponding to visual perception." IEEE Transactions on Systems, Man, and Cybernetics, SMC-8(6):460--472, 1978.

The Tamura article examines the correlation between human vision studies and computational models of six textural features: coarseness, contrast, directionality, line-likeness, regularity, and roughness. The study included 48 subjects and 16 patterns. It is impressive that the study looked at a large variety of textural features instead of limiting itself to one or two. The model and the human studies corresponded well for coarseness, contrast, and directionality. Overall similarity measurements based on simple combinations of the computational measures, however, did not compare well with the human subject results, and the authors suggested further studies using more sophisticated combinations of the features. They attributed the failure not to the individual feature measurements but to the modeling of how the various cues combine. The article was published in 1978, which suggests it may be one of the first to examine this range of textural features, so it is not fair to compare a 1978 article to a 1995 one in model complexity or accuracy of results. One shortcoming of the Tamura study, though, was the use of only 16 patterns. I believe there should have been two to four times as many patterns, and that at least half of the patterns should have come from actual physical objects.

R. Picard and M. Gorkani. "Finding perceptually dominant orientations in natural textures." Spatial Vision, 8(2):221--253, 1994.

Picard and Gorkani describe a study comparing a computational method for determining dominant orientations with a human study. The human study results are used to refine the computational method so that it more closely models the human responses. The image analysis system utilizes a steerable pyramid of steerable filters.
The pyramid consisted of four levels; a histogram was recorded for each level, and these were later combined for comparison with the human studies. Histogram smoothing and contrast compensation were used to aid in finding prominent peaks. The human studies used forty MIT students with normal or corrected-to-normal vision. Two interesting notes on the results of the human studies: all subjects used the full range of strengths with an approximately uniform distribution, and only 4 subjects had variances greater than 10%. This surprised the reader, who expected greater variation in the responses. Because of the uniformity of the human results, they seem a reasonable target to model. The computational system had 7 variables that could be adjusted to optimize against the human target. After some optimization, the correlations achieved were:

- dominant orientations of 68/111 images were matched without contrast normalization
- dominant orientations of 74/111 images were matched with contrast normalization
- matching just 1 dominant orientation brings the success rate up to 95/111

The large number of images, and the fact that the images are natural textures, adds much to the applicability of this study. Overall the article is readable and gives sufficient background information for the reader.
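As a concrete illustration of the smoothing-plus-peak-finding step described above, here is a minimal sketch in Python; the bin count, smoothing width, and threshold are illustrative choices, not the parameters from the paper:

```python
import numpy as np

def dominant_orientations(angles, magnitudes, bins=45, sigma=1.0, thresh=0.5):
    """Histogram gradient angles (weighted by magnitude), smooth with a
    circular Gaussian, and return prominent peak orientations in degrees."""
    # Fold angles into [0, 180): orientation is direction-insensitive.
    angles = np.mod(angles, 180.0)
    hist, edges = np.histogram(angles, bins=bins, range=(0, 180),
                               weights=magnitudes)
    # Circular (wrap-around) Gaussian smoothing of the histogram via FFT.
    k = np.arange(bins)
    d = np.minimum(k, bins - k)
    kernel = np.exp(-0.5 * (d / sigma) ** 2)
    kernel /= kernel.sum()
    smooth = np.real(np.fft.ifft(np.fft.fft(hist) * np.fft.fft(kernel)))
    # A bin is a peak if it beats both circular neighbours and the threshold.
    left, right = np.roll(smooth, 1), np.roll(smooth, -1)
    peaks = (smooth > left) & (smooth > right) & (smooth > thresh * smooth.max())
    centers = (edges[:-1] + edges[1:]) / 2
    return centers[peaks]
```

With a tight cluster of angles near 90 degrees and one stray angle, the smoothed histogram yields a single dominant peak near 90, while the stray falls below the strength threshold.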

John Petry

TEXTURAL FEATURES CORRESPONDING TO VISUAL PERCEPTION, Tamura, Mori and Yamawaki
____________________________________________________

The authors computed six different texture measures and compared them to human perceptions of texture, in an effort to determine which computationally tractable approaches replicate human perception. They performed their experiments on what they believed was a representative subset of 16 of the 112 images in the Brodatz texture album.

Of the six measures, coarseness, directionality and contrast performed well, with some limitations. For instance, they measured directionality by histogramming Sobel images by angle (after passing a certain magnitude threshold), then peak-finding. This may work well for some fine features, e.g. a checkerboard, which would return a set of 0/180 and 90/270 degree edges, but not for a set of checkers arrayed in a similar pattern, since there would be an even distribution of angles. In other words, there was no work at multiple resolutions to account for scale. Similarly, their contrast measurement is weak in that it is directly linked to the number of edges in a scene, which can also be a function of resolution.

Their other three measures did not perform well. The line-likeness test was apparently very parameter-dependent, which doesn't give one much confidence given the tiny size of their sample set (16 images). The other two measures, regularity and roughness, were not direct measurements, but combinations of the results of the four basic measurements described. Neither corresponded well with human judgements of the same attributes. Regularity was apparently very dependent on scale choices. Roughness didn't match human choices well, perhaps in part because the authors believe their choice of terminology may have biased the human classifiers.

Some of these measures seem to have worked well, or at least promised that possibility if problem areas were taken into account.
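The directionality measure described above (Sobel gradients, a magnitude threshold, then an angle histogram) can be sketched roughly as follows; the threshold and bin count here are illustrative, not Tamura's exact formulation:

```python
import numpy as np

def directionality_histogram(img, bins=16, mag_thresh=0.1):
    """Sketch of a Tamura-style directionality measure: Sobel gradients,
    keep pixels above a magnitude threshold, histogram the edge angles."""
    img = np.asarray(img, dtype=float)
    # 3x3 Sobel kernels applied via explicit shifts (no SciPy needed).
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            # shifted[i, j] == img[i + di, j + dj] (with wrap-around).
            shifted = np.roll(np.roll(img, -di, axis=0), -dj, axis=1)
            gx += kx[di + 1, dj + 1] * shifted
            gy += ky[di + 1, dj + 1] * shifted
    mag = np.hypot(gx, gy)
    # Fold angles into [0, 180): an edge and its reverse are one orientation.
    ang = np.mod(np.degrees(np.arctan2(gy, gx)), 180.0)
    keep = mag > mag_thresh * mag.max()
    hist, _ = np.histogram(ang[keep], bins=bins, range=(0, 180))
    return hist / hist.sum()
```

For an image with a single vertical step edge, all the retained gradient energy lands in the bin containing 0 degrees, i.e. a horizontal gradient direction.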
One of the largest such problems seems to have been scale, which affected almost all of their tests. In addition, their reliance on a sample set which could include only a few samples of each case of interest leaves us unable to extrapolate too much from their results.

FINDING PERCEPTUALLY DOMINANT ORIENTATIONS IN NATURAL TEXTURES, Picard and Gorkani
______________________________________________________________

Picard and Gorkani took one of the measures from the first paper, finding dominant orientations in natural texture, and investigated it in much more detail. They did not use the identical algorithm. Instead, they divided the measure into four separate tests at different resolutions and used steerable filters to measure the dominant angle(s) and peak strengths. [Unfortunately, two pages describing the details of their algorithm were missing from my copy.] They also performed Gaussian smoothing to reduce noise, and contrast normalization to minimize the effects of shadowing and uneven lighting.

Their basic approach was an iterative one, modifying parameters for strength thresholds, angle quantization size, and the design of their Gaussian filter in an attempt to maximize performance over their database. Since they ran their tests over the entire Brodatz database, we can have more confidence in their results. For most classes of images, good results were obtained when comparing the computer measurements of dominant orientation(s) with those of human judges. Overall, they did a better job of determining human judgements than Tamura et al, who merely ranked images according to each scale being tested.

Picard and Gorkani had some problems with the human test; most notable to me was that the humans viewed circular windows of the original Brodatz samples, while the computer operated on rectangular windows.
While the authors identified a major problem in only one case, it would seem best to perform both tests on circular images -- it didn't seem that difficult to adjust the algorithm to cope with this [again, as far as I could tell given the two missing pages]. Overall, the authors had good results. Given that the results were tuned to a specific database, I'd be very curious to see how well their parameterization generalizes to other sets of images. One positive note is that man-made scenes are often dominated by strong orthogonally-oriented features, and so might work even better than natural scenes for this type of approach.

THE DESIGN AND USE OF STEERABLE FILTERS, by Freeman and Adelson
_______________________________________

The authors address the problem that while it is often desirable to apply filters at particular angular orientations, it is computationally very expensive to run such filters over an image at every conceivable desired angle. Instead, they show an approach which relies on applying a set of basis filters to the image, then interpolating their results to determine the response at a particular angle. The number of basis filters needed to steer a function can be determined from its Fourier representation. Representing its terms as a polynomial expression in Cartesian coordinates is often easier, and will yield a sufficient number of basis filters for the task, but it is not guaranteed to produce the *minimum* number of basis filters needed; e.g. 2N+1 filters are sufficient to steer an Nth order polynomial, but fewer may be acceptable. The authors point out that steerable filters work with discretely sampled functions as well as continuous ones, so they are applicable to machine vision.
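The interpolation idea is easiest to see in the simplest case: a first derivative of a Gaussian is exactly steerable from just two basis filters. A minimal sketch (the filter size and sigma are arbitrary choices):

```python
import numpy as np

def gaussian_derivative_basis(size=9, sigma=1.5):
    """x- and y-derivative-of-Gaussian basis filters (unnormalized)."""
    r = np.arange(size) - size // 2
    x, y = np.meshgrid(r, r)
    g = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return -x * g, -y * g   # G_x, G_y

def steer(gx, gy, theta):
    """First-derivative filter at angle theta (radians) as a linear
    combination of the two basis filters (the steering theorem)."""
    return np.cos(theta) * gx + np.sin(theta) * gy
```

The practical payoff is that the two basis filters are applied to the image only once; by linearity of convolution, the filtered image at any angle theta is then just cos(theta) times the G_x response plus sin(theta) times the G_y response.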
They also give several sample applications, such as determining dominant local orientations (note Picard and Gorkani, above); filtering noise on angularly aligned features; distinguishing line data from edge data in a way that edge filters such as Sobel or Canny (or, conversely, line filters) cannot; and shape-from-shading computations. Finally, the authors demonstrate that their approach extends to three-dimensional computations.

Shren Daftary

Synopsis for Finding Perceptually Dominant Orientations in Natural Textures by Picard and Gorkani

This paper presents methods to determine texture orientation, since orientation is an important visual cue. Algorithms mentioned for extracting orientation information include Gabor filters, wavelets, and a steerable pyramid. The steerable pyramid determines information for all orientations at a small computational cost. This may not be necessary in many cases, though, since as the authors state, the majority of human visual experiences correspond to either horizontal or vertical patterns. The algorithm for orientation detection using the pyramid is given; the orientations it produces can then be used by simpler algorithms.

Once the orientation data is gathered, the authors suggest smoothing to reduce noise, then finding prominent peaks of the resultant histogram. Dominant peaks are identified either by locally fitting the histogram to a Gaussian or by taking derivatives. Each peak is assigned a weight, and initially these weights do not necessarily correspond to human perception, although it seems the formula can eventually be tweaked to do so. One problem presented is the inability of the standard algorithm to compensate for contrast differences between images. This may not be a significant problem, because humans may perceive orientations more strongly in higher-contrast images.

The next part of the paper discusses the testing of humans on texture orientation. The differences between computer identification of orientation and human identification may be used to strengthen the computer algorithm. For most of the images, humans and the algorithm agreed. The cases of disagreement showed the ability of computer analysis to pick up hidden patterns and avoid confusion between symmetry and orientation.
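The Gaussian curve-fitting step for locating a peak can be approximated compactly: fitting a parabola to the logarithm of the histogram around the peak bin is a local Gaussian fit. A sketch, not the authors' exact procedure:

```python
import numpy as np

def refine_peak(hist, centers, i):
    """Refine the location of histogram peak bin i by fitting a
    parabola to the log-counts at bins i-1, i, i+1 (equivalent to a
    local Gaussian fit). Assumes 0 < i < len(hist) - 1 and strictly
    positive counts at those three bins."""
    y = np.log(hist[i - 1:i + 2])
    # Vertex offset (in bins) of the parabola through the three points.
    denom = y[0] - 2.0 * y[1] + y[2]
    offset = 0.5 * (y[0] - y[2]) / denom
    step = centers[1] - centers[0]
    return centers[i] + offset * step
```

Because the log of a Gaussian is exactly a parabola, a histogram sampled from a Gaussian profile is refined to the true peak location even when it falls between bin centers.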
Overall this paper brought out the important issues in orientation identification, but did not explain or prove the necessity of a steering technique for orientation determination, since the majority of the images did not present any orientation other than horizontal, diagonal, or vertical, which can be handled by other, less intensive techniques.

Synopsis for Textural Features Corresponding to Visual Perception by Tamura, Mori, and Yamawaki

This classical paper also presents techniques for computers to deal with textural cues. It presents methods to approximate coarseness, contrast, directionality, line-likeness, regularity, and roughness. Coarseness refers to the scale of objects and the degree of repetition. Contrast is concerned with the part of the gray scale that an image uses. Directionality relates to orientation. Line-likeness is concerned with whether there are lines or blobs in an image. Regularity relates to the variation of structure. Roughness corresponds to the perception our sense of touch would have of an object.

Psychological experiments were performed to rank a set of 16 images; these experiments demonstrated the unclear nature of some of the above definitions. An algorithm is presented to calculate coarseness, followed by methods for contrast, directionality, line-likeness, regularity, and roughness. With these algorithms, the differences between human perception and computer analysis corresponded well for contrast and coarseness. For directionality, line-likeness, and regularity the correspondence was reasonable, while for roughness the algorithm was too crude to provide an accurate estimate.

This paper presented some interesting concepts, but as can be expected for a classical paper, current research has expanded on its concepts and improved its techniques, while probably relying on its premises for ideas on texture determination.
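Tamura's coarseness measure, as summarized above, can be sketched in miniature: compare box averages at several scales and, per pixel, keep the window size with the largest difference between opposite non-overlapping windows. This loose sketch ignores the paper's exact border handling and parameter choices:

```python
import numpy as np

def tamura_coarseness(img, kmax=3):
    """Rough sketch of Tamura coarseness: for each pixel, pick the
    window size 2^k whose non-overlapping neighbour averages differ
    most, then return the mean best window size over the image."""
    img = np.asarray(img, dtype=float)
    h, w = img.shape
    best = np.zeros((h, w))   # best average difference seen so far
    size = np.ones((h, w))    # window size achieving it
    for k in range(1, kmax + 1):
        s = 2 ** k
        # s x s box averages via a 2-D integral image (cumulative sums).
        ii = np.pad(img, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
        avg = (ii[s:, s:] - ii[:-s, s:] - ii[s:, :-s] + ii[:-s, :-s]) / s**2
        # avg[i, j] is the mean of img[i:i+s, j:j+s].
        # Differences between box averages s apart (non-overlapping windows).
        eh = np.abs(avg[:, s:] - avg[:, :-s])
        ev = np.abs(avg[s:, :] - avg[:-s, :])
        # Compare on the common valid region for this scale.
        e = np.maximum(eh[:h - 2 * s + 1, :], ev[:, :w - 2 * s + 1])
        bsub = best[:e.shape[0], :e.shape[1]]
        szsub = size[:e.shape[0], :e.shape[1]]
        mask = e > bsub
        bsub[mask] = e[mask]
        szsub[mask] = s
    return size.mean()
```

On a period-2 checkerboard every box average is 0.5, so no scale ever wins and the measure stays at its minimum, while an image of large uniform blocks picks larger windows near block boundaries and scores higher, matching the intuition that it is the coarser texture.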

Lars Liden

"Textural Features Corresponding to Visual Perception" Tamura, Mori & Yamawaki

The authors of this paper attempted to create computational measures which mimic human perception for six basic textural features: coarseness, contrast, directionality, line-likeness, regularity and roughness. They began by asking subjects to make pairwise comparisons between a set of images for a given textural property, ordered the images based on the subjects' perceptions, and then examined the correlations between the properties. Next, the authors applied computational heuristics for each psychological category and similarly ordered the images by each textural property. They found that the heuristics for coarseness, contrast and directionality ordered the images in a way similar to the human subjects, whereas the heuristics for line-likeness, regularity and roughness were unsuccessful.

The utility of creating computational measures of this type is not clear. Psychometric measurement of the kind performed with subjects does not necessarily correspond to the processing methods subjects use for image segmentation and identification. Many studies have shown clear discrepancies between human introspection about decision making and perception and the actual processing used by subjects. Additionally, I find it unlikely that finding heuristics capable of mimicking human perceptual judgements will yield useful image processing techniques. As the authors point out, humans can apply an enormous library of somatosensory information to label images with features such as roughness. Given the amount of prior knowledge available to human subjects and the sheer amount of higher-level processing that must occur for a human to make a verbal response, it is unlikely that this method will be directly applicable to image processing.
"The Design and Use of Steerable Filters" Freeman and Adelson

The paper examines the ability of what the authors call a "steerable filter" to perform tasks such as edge detection. A steerable filter is one in which a filter of arbitrary orientation can be formed as a linear combination of a set of "basis filters". It is desirable to find the minimum number of basis functions which can form all other filters by interpolation. The authors show that any function which can be expressed as a Fourier series in angle, or as a polynomial expansion in x and y times a radially symmetric window function, can be represented by a steerable filter. The authors examine the use of steerable filters for analyzing local orientation, detecting contours, and shape-from-shading.

I found this paper far superior to the previous one, as it directly addressed issues relevant to image processing and demonstrated the utility of a particular method. Additionally, the type of processing discussed in this article is similar to the type of low-level processing that has been suggested to occur in the human brain. Although higher-level human processing is unlikely to be relevant, lower-level systems may be.


Stan Sclaroff
Created: Sept 25, 1995
Last Modified: