My project is to create a computer environment in which virtual creatures can evolve, in a manner similar to that described in [11]. One difference is that my creatures will need to learn how to move in order to perform specific simple acts, such as "move in a straight path", "turn left", "detect food", etc. These simple tasks can be combined to build higher-level actions, such as "search for food". Eventually, these creatures will be able to interact in the environment, working and competing to survive.

The environment will be a 2-D plane. Scattered randomly about the plane will be obstacles, food particles, and creatures, distinguishable by color and shape. Obstacles will simply be immobile polygons which the creatures will have to move around. Food will consist of green spots scattered around the plane. As time passes, some food may disappear and some may grow in other areas, to simulate plants growing and dying. If a creature runs over a food particle, the particle is eaten and disappears.

The creatures will be made up of 2-D rectangles connected by joints. Each joint will have a maximum motion of 90 degrees in either direction. Effectors will control how much a particular joint bends; this determines not only the new angle but how quickly the joint reaches it, providing information about both direction and force. Each side of a rectangle can have a maximum of two limbs branching off, and each side can have one or zero visual sensors. A sensor will be able to detect objects within a certain distance in a visual arc and will record the color of the object. The sensors will act as inputs to a neural network, along with sensors reporting the current angle to which each limb is bent; the outputs of the network will be connected to the effectors. The neural network will be a combination of methods described in [2], [7], [8], [9], and [10], the exact details of which I am still working out.
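As a rough illustration, the sensor-to-effector pipeline above amounts to one feedforward pass per time step. This is a minimal sketch under assumed names and a single fully connected layer; the actual network will combine the methods of [2], [7], [8], [9], and [10].

```java
// Hypothetical sketch of the sensor -> neural network -> effector
// loop. Inputs are sensor readings (visual colors, current joint
// angles); outputs drive the joint effectors.
public class ControlStep {
    static double[] step(double[] sensors, double[][] weights) {
        double[] effectors = new double[weights.length];
        for (int i = 0; i < weights.length; i++) {
            double sum = 0.0;
            for (int j = 0; j < sensors.length; j++)
                sum += weights[i][j] * sensors[j];
            // squash to [-1, 1]: sign = bend direction, magnitude = speed
            effectors[i] = Math.tanh(sum);
        }
        return effectors;
    }
}
```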
As discussed in [7] and [8], I will use directed graphs and hierarchical expressions to represent a creature's genotype. Each genotype will contain information about the physical structure of a creature, the types of joints and muscles it contains, the types and locations of sensors, the connections and weights of neurons for control, etc. The specific setup of their bodies, the way their muscles work, the locations of sensors, their reactions to stimuli, etc. will all be evolved using genetic algorithms, implemented as discussed in [3], [4], [7], and [8].

The evolution will take place in several steps. First, the creature must learn how to move and how to interpret its visual signals. Each creature will be tested on several fitness functions, according to how well it moves forward, turns left, and turns right given certain types of visual stimulation. Once we have a population of creatures which can move about, we can put them into the simulated environment. In the environment, they will have to eat food to survive: moving around will deplete a supply of energy, and eating will add to it. If there is time, I would also like to add the ability for creatures to find each other and mate in the environment, using genetic algorithms.

By April 3, I plan to be able to design a creature by hand. I will be able to create a genotype representing the design of a creature, and the system will be able to generate a creature based upon this design. The creature will move according to its instructions. I will also have a good part of the user interface ready, showing the environment the creatures will be living in. By May 1, I will have added the genetic algorithm portion of the program, which will allow the creatures to evolve rather than being hand-designed. Using this, I will have the system evolve groups of creatures which are successful at their specific tasks.
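One generation of the evolution step described above (score each genotype on a task-specific fitness function, keep the best, refill with mutated copies) might be sketched as follows. The genotype here is just a flat weight vector and all names are hypothetical; the real representation will be the directed graphs of [7] and [8].

```java
import java.util.Arrays;
import java.util.Random;

// Hedged sketch of one generation: rank by fitness, keep the top
// half unchanged, and fill the rest with mutated copies of survivors.
public class Generation {
    interface Fitness { double score(double[] genotype); }

    static double[][] nextGeneration(double[][] pop, Fitness f, Random rng) {
        double[][] sorted = pop.clone();          // shallow copy; pop is untouched
        Arrays.sort(sorted, (a, b) -> Double.compare(f.score(b), f.score(a)));
        int survivors = Math.max(1, pop.length / 2);
        double[][] next = new double[pop.length][];
        for (int i = 0; i < pop.length; i++) {
            double[] child = sorted[i % survivors].clone();
            if (i >= survivors)                   // mutate only the refill copies
                child[rng.nextInt(child.length)] += rng.nextGaussian() * 0.1;
            next[i] = child;
        }
        return next;
    }
}
```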
References

[1] Braitenberg, Valentino, _Vehicles_, MIT Press, 1987.
[2] de Garis, Hugo, "Genetic Programming: Building Artificial Nervous Systems Using Genetically Programmed Neural Network Modules," Proceedings of the 7th International Conference on Machine Learning, 1990.
[3] Koza, John, _Genetic Programming_, MIT Press, 1992.
[4] Michalewicz, Zbigniew, _Genetic Algorithms + Data Structures = Evolution Programs_, Springer-Verlag, 1992.
[5] Reynolds, Craig, "An Evolved, Vision-Based Model of Obstacle Avoidance Behavior," Artificial Life III Proceedings, June 1992.
[6] Reynolds, Craig, "Competition, Coevolution and the Game of Tag," Artificial Life IV Proceedings, July 1994.
[7] Sims, Karl, "Evolving 3D Morphology and Behavior by Competition," Artificial Life IV Proceedings, July 1994.
[8] Sims, Karl, "Evolving Virtual Creatures," SIGGRAPH 94 Computer Graphics Proceedings, July 1994.
[9] Travers, Michael, "Animal Construction Kits," Artificial Life Proceedings, September 1987.
[10] van de Panne, Michiel, and Eugene Fiume, "Sensor-Actuator Networks," SIGGRAPH 93 Computer Graphics Proceedings, August 1993.
[11] Yaeger, Larry, "Computational Genetics, Physiology, Metabolism, Neural Systems, Learning, Vision and Behavior or Polyworld: Life in a New Context," Artificial Life III Proceedings, June 1992.
Problem Definition/Motivation/Goals:

The problem posed is the design of efficient navigation controls which demonstrate user control and a sense of location as well as content while traveling through representations of a standard tree-based directory-structure information space. The motivation is the desire to create a "user-friendly" representation of, and navigational control over, a 3D space. The goal of this project is to create an interface and navigational technique which is intuitive and easy on the user, while maintaining a data representation (of the space returned from searches) which maximizes the assimilation of the information without increasing the complexity of the data presented to the user.

Overview of Theoretical Issues/Previous Work:

In regards to information space visualization, Fairchild [0] describes a system structured around three main components in the creation of visualizations of an information space: (1) the representation of a single object, (2) the representation of a set of objects, and (3) user control over the representations (encodings) used to view sets of the information. Visualizing the entire contents of an information space in one viewport introduces a complexity which reduces the benefit of using a 3D model (refer to [1]). An emphasis is placed on representations which the user finds most agreeable to his/her comprehension needs. Subsets are used to give focus to the portion of the data in which the user is most interested. In [2], a model is developed which emphasizes a "3D interface environment in which the workspace gives the user a high-level view of the information, a sense of place within the information, and a sense of movement as the user examines and navigates the structure."
A graphically recursive representation of the information structures was chosen since the application domains themselves were recursive in this model, allowing the user to interact with and manipulate data within the 3D recursive workspace, which itself was represented using a simple encoding of spatial arrangement/same-level definition, temporal arrangement, nesting, containment, and level of depth within a 3D coordinate system. In [3], the authors express their belief that virtual environments should convey the widest possible range of perceptual and symbolic information; accordingly, their interest is in testing the informational limits of immersive environments. The topics of orientation and navigation are explored in an attempt to find more efficient ways of experiencing and conveying information. Echoing the message of [1], [4] stresses "representing information in a form that matches our perceptual capabilities (mainly visual) and a problem's particular needs [in order to make] the process of getting the information and digesting it easier and more effective." Visualization is directly linked to perceptualization in [5], where the workspace is just as important as the visualization; in other words, user interaction with the data should be the prime concern when developing a visualization. The authors introduce the concept of sense-making tools, which help users understand information by associating and combining it, re-representing retrieved information to make patterns visible. Simplification of the physical view and the optimization of comprehension are covered in [6], where three major node-and-link display problems are discussed: (1) display clutter, (2) node positioning, and (3) perceptual tension.
In regards to navigation, as seen in [7], there are roughly five techniques which can be used:

(1) relative, which uses a sequence of small steps and is considered the worst technique;
(2) absolute, which consists of pointing on a map to indicate where you would like to go, but lacks accuracy;
(3) teleportation, which can be used once a position can be named and is very fast, but requires that the user have been there before in one way or another;
(4) hyperspace, which follows the links between objects and is useful when relationships between objects are important; and
(5) transformation, which instead of moving the viewpoint moves the desired objects to the viewpoint, and is potentially very powerful, especially in querying by reformulation.

In other words, a form of rapid transit would be preferable in the information space, but the structure of such navigation must be thought out. The main concerns when considering navigation should be the sense of location within the information, a sense of the content of the information, and the perceptual friendliness of the actual traveling. Work in [8] simplifies the interaction with and comprehension of an information space into the following cycle: (1) browsing and searching, (2) selecting data, and (3) digesting and creating new information. The message of this paper is that "improv[ing] the quality of [the information space] requires making the processes of interacting with the information easier and more efficient." Finally, [9] deals with the influence of a walking technique in an information space, centering around human subject response within an immersive 3D environment. Navigational preference leaned towards a simple pointing technique which reduced the amount of energy expended by the user, while sense of presence was reinforced by simulated walking through the environment.
Creating a more powerful as well as friendly interface requires a sense of user preference as well as a clear representation of the current status at all times.

Problem Constraints, System I/O, Evaluation Criteria:

A constraint placed upon this project will be the data to be represented. We shall only consider a minimal directory structure, with directory entries including symbolic pointers, data files, executable files, and one further directory. Input/output will be through the keyboard and mouse, with a query box for searches.

Possible Approaches, Structure of Research/Coding:

The implementation would use the standard directory hierarchy in representing the space, the one subdirectory leading to a recursive representation. The current location would be denoted by an overview map, to give the user a sense of where he/she is in the information space. The current level space itself would be represented in a minimal symbolic manner, since navigation is the main concern: data files by blocks, executables by pyramids, symbolic pointers by arrows, and directories by raised spheres in the distance. Upon entering a level of the search space, the user will only be able to clearly see the first eight elements of that space (four elements per row) as well as the fainter subdirectory spheres. The reasoning behind this design choice stems from the work done in [1]. In order to see the other elements of that space, the user will use the mouse to select a button which advances the viewport within the space using a deformation navigational technique, as touched upon in [7]. The user can at any time specify either a local search (performed on the elements of the current directory) or a global search (performed on all directories within the entire information space) to bring to the forefront the files related to the search query. A local search would bring the files with the higher search-related values to the forefront.
Global searches would do the same but with the larger set. "Browsing" of the returned data would be done in a fashion similar to the initial deformation, allowing the user to move "into" the data, but by placing the previous first row at the last row of the space. Research will involve the study of various implementations of the technique described to transform the information space. Coding of this project will begin with the representation of the information space, then the search-resultant representation of the information space, and then progress to the navigation ("browsing") of that space. This project will be coded using Open Inventor(TM), part of the OpenGL Technical Library, to specify views after search values have been obtained. Basic pre-defined 3D objects will be used in the representation, with Gouraud shading; the camera viewpoint remains the same, since the information is "brought" to the scene rather than the camera venturing into the scene; and the user interface will be created using Open Inventor's interaction resources.

Specific Objectives/Deliverables:

Objectives for this project include creating a clear representation of the information space and providing a clear, yet powerful, tool to search/browse through the space. Deliverables for this project include a graphical tool which implements a well-constructed interface for a deformation navigation which maintains the user's comfort and comprehension of the information.

Timetable:

By April 3: A visualization of the initial information space, consisting of one camera viewpoint and geometric forms with Gouraud shading.
By May 1: The entire tool: a navigation tool for the represented information space.
Additional: Cosmetic refinements to the visualization or additional navigational methods revolving around the same premise.
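The search-driven forefront reordering described above (higher-valued files brought forward, four elements per row, the first eight clearly visible) could be sketched as ranking entries by relevance and assigning row-major grid slots. All names here are hypothetical, and the scoring itself is assumed to come from the search query.

```java
import java.util.Arrays;

// Sketch of the local-search reordering: sort directory entries by
// descending relevance and assign {row, column} slots, four per row,
// so the two front rows hold the eight best matches.
public class Forefront {
    static int[][] layout(double[] relevance) {
        Integer[] order = new Integer[relevance.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Double.compare(relevance[b], relevance[a]));
        int[][] slots = new int[relevance.length][2]; // {row, column} per entry
        for (int rank = 0; rank < order.length; rank++) {
            slots[order[rank]][0] = rank / 4;  // row: 0 is the forefront
            slots[order[rank]][1] = rank % 4;  // column within the row
        }
        return slots;
    }
}
```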
References:

[0] Fairchild, K. M., "Information Management Using Virtual Reality Based Visualizations," Virtual Reality: Applications and Explorations, Academic Press.
[1] Miller, G., "The Magical Number 7 Plus or Minus 2: Some Limits in our Capacity for Processing Information," Psychological Review, 81-97, 1956.
[2] Freeman, E., and S. Hupfer, "A Model for 3D Interaction with Hierarchical Information Spaces," http://www.cs.yale.edu/HTML/YALE/CS/Linda/CHI95/CHI.html
[3] Bolter, J., F. Hodges, T. Meyer, and A. Nichols, "Integrating Perceptual and Symbolic Information in VR," IEEE Computer Graphics and Applications, July 1995.
[4] Gershon, N., and J. Brown, "The Role of Computer Graphics and Visualization in the GII," IEEE Computer Graphics and Applications, March 1996.
[5] Card, S., "Visualizing Retrieved Information: A Survey," IEEE Computer Graphics and Applications, March 1996.
[6] Eick, S. G., "Aspects of Network Visualization," IEEE Computer Graphics and Applications, March 1996.
[7] Fairchild, K. M., S. E. Poltrock, and G. W. Furnas, "SemNet: Three-dimensional Graphic Representations of Large Knowledge Bases," in Cognitive Science and Its Applications for Human-Computer Interaction, Fairlawn, NJ: Lawrence Erlbaum Associates, 1988.
[8] Gershon, N., "Moving Happily through the World Wide Web," IEEE Computer Graphics and Applications, March 1996.
[9] Slater, M., M. Usoh, and A. Steed, "Taking Steps: The Influence of a Walking Technique on Presence in Virtual Reality," ACM Transactions on Computer-Human Interaction, Vol. 2, No. 3, September 1995.
Problem Definition, Motivation and Goals
----------------------------------------

Nowadays, people have easy access to tens of thousands of digital images on the Internet. Moreover, the number of people who have access to the Internet and the number of online image databases are increasing exponentially. Consequently, searching these databases becomes more and more important, and manual searching through them becomes much harder. This explains the current (and future) presence of search engines on the World Wide Web that allow users to easily query image databases. An important part of a search engine, which affects its usefulness, is its user interface. Current interfaces of image database search engines use image keywords, sample images, or image content (such as color, texture, and shape) as the basis for their queries. My project is to build a user interface to these kinds of search engines in which the user can draw or paint his/her "example" image, which the search engine uses to find the closest matching images. But why did I choose this kind of query interface, called Query By Example? The answer to that question summarizes the goals of my project. First of all, in order for the user interface to be successful, it is necessary to communicate image data to and from the image database in a user-friendly manner. For instance, a user provides the content of the image to be searched for by drawing/painting a rough sketch of it. This offers greater ease and convenience than specifying color, texture, or shape as visual attributes of the required image. Furthermore, the result of the query will be more accurate, since the user specifies exactly and explicitly what he/she wants. On the whole, this will give the user more satisfaction.
Theoretical Issues, Related Works
---------------------------------

My project emphasizes searching directly from an "example" image, without any further specifications from the user, either about the database images or about the particulars of the search itself. In the following, I consider the interfaces of other image database query systems. An important system for querying databases by image content, developed at IBM, is called QBIC [4] (Query By Image Content). In QBIC, you can perform a text-only search (using image keywords), a graphical-only search, or a combined text and graphical search. In a graphical search, the user may specify a particular color composition (up to 5 colors), i.e., x% of color 1, y% of color 2, etc., of the target image. He/she may also specify the color layout of the target image by filling the rectangles of the drawing area provided (9x12 rectangles) with the appropriate colors. The result of the query is 50 images displayed in a user-selected format, up to 4 rows by 8 columns of images. Also, the user can select one of the displayed images to be the basis of the next search, or randomize his/her search. The Virage on-line image database query [5] has a simpler interface. First, 8 (configurable to 4, 12, or 16) random images, out of 10,000, are displayed to you. You can either begin the search directly or invoke another set of random images. Pressing any of the displayed images starts the search, and the images which are "visually similar" are returned. Each of these images has a number below it which reflects the distance between it and the target image; the lower the number, the closer the image is to the target image. The term "visually similar" is based on four visual attributes: color, composition, texture, and structure. You can vary the importance of each attribute by giving it an appropriate coefficient (closer to 0.0 means less important, closer to 1.0 more important).
Compared to the interface of my project, this one has the following drawback: assume you are looking for a particular kind of picture that can easily be painted. While through my interface this can be done quickly, you might spend more than 10 minutes looking through the random images displayed to select a query image close to your request. Another query-by-image-content system whose interface is similar to mine is the work done by Charles Jacobs et al. [2]. The query image can be a hand-drawn sketch or a scan of the image to be retrieved. Their program is written in C++ using OpenGL and Motif and runs on SGI workstations. Their interface can be summarized as follows: in a big rectangle, the user paints his image, and the 20 highest-ranked targets appear in the same window to the right of the rectangle. Also, the work of Hirata and Kato [1], and Kato and Kurita [3], has a style of user interaction similar to mine. In their system, called Query By Visual Example, they describe a sketch retrieval method for a full-color image database. The user has only to draw a rough sketch to retrieve the original image, and the system automatically evaluates the similarity between it (the visual example) and each part of the image data in the database. They do this by performing edge extraction on the user's "example" queries and matching these edges against those of the database images.

References
----------

[1] Hirata, K. and T. Kato, "Query by visual example - content based image retrieval," in A. Pirotte, C. Delobel, and G. Gottlob, editors, Advances in Database Technology (EDBT '92), pp. 56-71, Austria, 1992.
[2] Jacobs, C., et al., "Fast Multiresolution Image Querying," ACM Computer Graphics Proceedings, Los Angeles, CA, August 1995.
[3] Kato, T. and T. Kurita, "A sketch retrieval method for full color image database - query by visual example," in Proceedings of the 11th IAPR International Conference on Pattern Recognition, pp. 530-533, Los Alamitos, CA, USA, 1992.
[4] Niblack, W., et al., "The QBIC project: Querying Images By Content Using Color, Texture, and Shape," IBM Research Division, February 1993. http://wwwqbic.almaden.ibm.com/~qbic/qbic.html
[5] The Virage On-line Image Database Search Demonstration, http://www.virage.com/online
[6] Tutorial for the Java Programming Language, http://java.sun.com/tutorial/index.html

Overview and System Input/Output
--------------------------------

The program consists of a sketch tool that allows the user to manipulate images. Using this tool, the user builds his/her example image, which constitutes the input to the main program. The latter processes this image and outputs a GIF-format image, which is the input to the search program. The results of this program are then displayed as "thumbnail" images. (See User Interface.)

Problem Constraints and Functional Components
---------------------------------------------

The program will be implemented using Java applets. The reason is to allow the program to be run on-line using a Java-compatible browser, like Netscape 2.0 or higher. In addition, I will constrain the problem in the sense that only simple drawn/painted 2D images can be considered. That is, my sketch tool is not suitable for complicated and detailed images, like the picture of a person wearing a blue jacket, for example. It is really up to the user how patient and skillful he or she is in drawing or painting a complex image. The program has two main modules: one to handle the sketch tool and one to process the image once the user submits it. The sketch tool includes the main characteristics of a drawing and painting package. Mainly, you can draw a point, a line, a rectangle, a circle, or a polygon. There is a color palette (10 colors, each light or dark) from which you can choose your color. Also, you can draw filled rectangles, circles, and polygons with the color already selected. For painting functions, a paint brush with six levels of thickness and a spray can are supported.
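The 10-color palette with light and dark variants described above could be derived from a set of base hues. A minimal sketch, assuming java.awt.Color's brighter()/darker() as the light/dark variants; the base colors themselves are placeholders:

```java
import java.awt.Color;

// Sketch of the 10-color palette: each base hue yields a light
// and a dark variant via java.awt.Color.brighter()/darker().
public class Palette {
    static Color[][] build(Color[] bases) {
        Color[][] palette = new Color[bases.length][2];
        for (int i = 0; i < bases.length; i++) {
            palette[i][0] = bases[i].brighter(); // light variant
            palette[i][1] = bases[i].darker();   // dark variant
        }
        return palette;
    }
}
```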
Evaluation Criteria
-------------------

The program will be evaluated using miscellaneous examples and tested by running the image-database-search program written by Leonid Taylor (http://www.cs.bu.edu/students/grads/leo-lion/academic/WebWonder/jew.html) with the output of my program as its input. First, I will try simple examples, like painting a flower or a simple image having two layers of colors. Then I will try more complicated examples/tests.

User Interface
--------------

The user interface consists of a big window divided into two parts. The right part is dedicated to the output of the images after the query is submitted. In fact, 21 "thumbnail" images corresponding to the highest-ranked targets will be displayed in the order of their closeness to the "example" image, from left to right in row-major ordering. That is, the upper-left image is the one that matches the query most closely. Below the displayed images, there is a "view more" push button which allows you to view more pictures, if any, that match the query. The left part of the big window is the sketch tool. It consists of a canvas where the user draws/paints his/her image. To the right of it is the color palette, consisting of 10 colors, each of which can be light or dark. Below the canvas appears the tool bar, which includes drawing tools such as a point, a line, a rectangle, a circle, and a polygon. Also, a filled rectangle, a filled circle, and a filled polygon are available. The tool bar also includes painting tools such as a paint brush and a spray can. There are six icons representing six levels of thickness to be used by the paint brush. Selecting a color or a tool is done by clicking on the appropriate icon using the left button of the mouse. Drawing and painting are done using the right button of the mouse. Below the tool bar, there are two push buttons: one for clearing the canvas, and one for submitting the query after finishing drawing/painting.
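The ranked thumbnail display described above implies ordering the returned images by similarity before laying them out row-major. A hypothetical sketch, assuming the search program returns one similarity score per database image with higher meaning a closer match:

```java
import java.util.Arrays;

// Sketch of selecting the 21 highest-ranked targets for the
// thumbnail area; top[0] would become the upper-left thumbnail.
public class Thumbnails {
    static int[] topMatches(double[] similarity, int k) {
        Integer[] order = new Integer[similarity.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Double.compare(similarity[b], similarity[a]));
        int n = Math.min(k, order.length);
        int[] top = new int[n];
        for (int i = 0; i < n; i++) top[i] = order[i];
        return top;
    }
}
```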
Possible Approach
-----------------

Since I have not programmed in Java before, I would first like to spend some time becoming familiar with its style. With the help of on-line tutorials, I will start by having a simple program running, such as drawing a line or a rectangle. Then I will add to that little by little (see Objectives) until I include all the drawing tools and then the painting tools. This will finish the sketch tool. After that, I will change the format of the query image to fit the image query program written by Leonid Taylor (see "http://www.cs.bu.edu/students/grads/leo-lion/academic/WebWonder/jew.html"). Having finished the integration of the two programs, I will redirect the output of Leonid's program into the interface of my program.

Objectives
----------

I break the project down into 3 phases: Phase 1 to be completed by April 3, Phase 2 to be completed by May 1, and a third phase left for my Master's Project. The first phase includes support for the drawing tools, mainly those that allow the user to draw a point, a line, a rectangle, a circle, a polygon, a filled rectangle, a filled circle, and a filled polygon. Also, I should have the color palette done and the clear push button working. The second phase adds the painting tools, mainly the paint brush with six levels of thickness and the spray can. Also, an "eraser" tool should be added, and the general shape of the user interface (including the part corresponding to the output of the image query program, Leonid's program) will be done. Furthermore, the program should be integrated with Leonid's, and we should have the total image query system working. The last phase will add some enhancements to the sketch tool and the general shape of the user interface.
Such enhancements include a Text tool with a few fonts, and support for an Undo operation and Select, Cut, and Paste operations, so that the user can select a part of his sketch to be the query image or combine two parts to form the query image. These enhancements will also include any refinements found necessary after testing the program in phase 2.
Problem definition, motivation, goals
=====================================

When an artificial neural network is working as expected, an attempt to intimately understand what units in the network are encoding is (unfortunately) often foregone. Nonetheless, when the network is producing other than the desired behavior, it is necessary to determine what these units are doing, and this is where visualization techniques are useful. There are many software packages that simulate neural networks and that incorporate approaches to visualization that have evolved over time. However, these packages often require the user to design networks in their own formalism, and either have few mechanisms to visualize what a network is doing or include the "kitchen sink" and thus have steep learning curves. Many times, a modeler writes his own neural network simulations using a procedural or object-oriented language. This reduces the learning curve and gives the modeler much flexibility. Nonetheless, writing visualization tools for a particular model can be a daunting task. My goal is to create a generic toolkit of Java classes that encapsulates visualization of neural network states and development. I chose Java because of its platform independence and object-oriented design.

Theoretical issues, previous work
=================================

Visualization techniques, especially of neural networks, have often been addressed as something less than a science. Moreover, since the state of a neural network is encoded in its weights, any visualized information about the network will display information about these weights, either directly or indirectly. It is typical for modelers to display information such as the responses of cells in the network; however, the decision as to what to display and how to display it can be critical for giving intuition about what a network is doing.
For certain problems, such as networks with a single layer or where it is clear what the weights of the network should be doing, determining what information to display can be straightforward. For example, in a simple associative memory [1], where the task of the network is to learn an association, displaying the output of the network can give a lot of intuition about the success of the network. Similarly, with a self-organizing feature map [2] being trained to produce a topographical map of a 2-dimensional input space, plotting the weights in two dimensions can show how the network unfolds from a random shape into an orderly map of the input space. These types of networks, however, tend to be an exception, in that there is some natural or geometric interpretation of what they are doing. It is not at all clear, for example, what the internal, "hidden" units of a backpropagation network should be doing. Some techniques used in packages are to plot error rates of the network and to display the magnitude of particular weights or cell responses using squares of various sizes [3]. There is even one method that uses this squares technique and tries to represent each cell's weighted connections with all other cells. While these techniques can give relevant global and local information, it can be difficult to reconstruct "the big picture" from these pieces. There are other visualization techniques that have been motivated by other disciplines. For example, in neural networks used to model biological visual processing, modelers often produce "tuning curves" of unit responses, similar to the tuning curves obtained from single-cell recordings in the brain [4]. Another method is to use statistical techniques to cluster inputs based on the response of a particular cell [5]. The shortcoming of these techniques is that they are applied after the fact, i.e., after training is complete.
One of the more natural visualization techniques is to use color to show the evolution of a network [6]. Colors have an intuitive relation with hot (red) and cold (blue) that extends nicely to the concepts of "excitation" and "inhibition" in neural networks.

Problem constraints, system input/output, evaluation criteria
=============================================================

I will limit testing of my visualization techniques to small or medium-sized backpropagation networks. Such networks are an appropriate testing model because their learning algorithm is well understood, but the consequent behavior and organization of hidden units as a result of the learning algorithm is a difficult problem. I will evaluate my system on a simple learning problem, such as the XOR problem, in which it can be determined by other means what the hidden units are encoding. Since evaluating how good a visualization is can be very subjective, I will justify my approach by explaining what the hidden units are encoding and then describing how the visualized information displays that same information.

Possible approach(es) you will take, and how you will structure the research/coding task
========================================================================================

The primary approach I would like to pursue is the use of color coding to display the evolution of weight changes and unit responses as a network trains and as testing patterns are presented. Nonetheless, if color coding does not meet the evaluation criteria, then other techniques that lead to better intuition will be explored. I will write the graphical user interface using Java's windowing library. There will be some initial coding time spent on creating the infrastructure for the user interface (e.g., designing the main window layout, pull-down menus, etc.). In parallel, the basic design of the Java class hierarchy for visualization will be performed.
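The hot/cold color coding described above might map each weight to an RGB value as in the following sketch: positive (excitatory) weights shade toward red, negative (inhibitory) weights toward blue, with intensity proportional to magnitude. The clipping range wmax is an assumed normalization parameter, not part of any existing design.

```java
import java.awt.Color;

// Sketch of a weight-to-color strategy: red for excitation,
// blue for inhibition, brightness proportional to |weight|.
public class WeightColor {
    static Color colorFor(double w, double wmax) {
        double t = Math.min(Math.abs(w) / wmax, 1.0); // 0 = weak, 1 = strong
        int intensity = (int) Math.round(255 * t);
        return w >= 0 ? new Color(intensity, 0, 0)    // excitation: red
                      : new Color(0, 0, intensity);   // inhibition: blue
    }
}
```

A connection's color can then be refreshed after each training epoch, so the display shows the network's evolution rather than only its final state.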
The rest of the cycle will involve taking a particular learning problem, running it with the visualization, evaluating what works and what doesn't, and making appropriate changes. Among the issues to resolve in this testing stage is how effectively updates can be made to the graphical display while a network is training.

Specific objectives/deliverables: code, demo, algorithm
=======================================================
One objective will be to determine a Java class design for visualization that acts as a layer above and separate from the classes that implement the neural network functionality, i.e., a traditional separation of "user interface" from "functionality". In addition, a strategy for assigning colors to weights and connections must be described. The final report will describe the above design issues, implementation issues (e.g., any physical limitations), and an evaluation of my visualization approach on one backpropagation learning problem other than the problem used during the development cycle.

What will be done by April 3? What will be done by May 1?
==========================================================
By April 3, I will have:
o The initial Java windowing code;
o A basic design of the Java classes for visualization; and
o The ability to run a simple backpropagation learning problem with visualization.
By May 1, everything will be done.

---------
References:

[1] Kosko, B. (1987) Constructing an associative memory. Byte: The Small Systems Journal, 12(10), 137-144.
[2] Kohonen, T. (1982) Self-organized formation of topologically correct feature maps. Biological Cybernetics, 43, 59-69.
[3] Miyata, Y. (1991) A User's Guide to PlaNet Version 5.6: A Tool for Constructing, Running, and Looking into PDP Networks.
[4] Graziano, M. S. A., Andersen, R. A., and Snowden, R. J. (1994) Tuning of MST Neurons to Spiral Motion. Journal of Neuroscience, 14(1), 54-67. - This article contains an example of the tuning curves plotted by physiologists.
[5] Gorman, R. and Sejnowski, T. (1988) Analysis of Hidden Units in a Layered Network Trained to Classify Sonar Targets. Neural Networks, 1, 75-89. - This article uses a clustering technique to evaluate how a particular hidden unit responds to sonar patterns.
[6] 23rd meeting, Society for Neuroscience. (1993) Neural Simulation Software Demonstration. - This is a web page that reprints descriptions of the software demonstrated at this meeting. The software package NEMOSYS is one that used color coding. (I can dig up the URL if you ask me.)
I. GOAL
_______
Use energy minimization functions to control the motion of multiple dependent graphical objects. The use of energy constraints to animate a few objects was demonstrated by Witkin, Fleischer and Barr. In their presentation, however, a few significant limitations appeared which they do not specifically address. The two I wish to handle in this project are:

automatic local minima avoidance: e.g., parts A and B wish to connect, but are separated by C, which is not moving. Or C separates A and B, while B separates C and D, so both B and C are gridlocked. An automatic approach is desired to move out of gridlock without requiring user intervention. Such a system might involve random movement for a prescribed distance once gridlock is detected, or prioritizing objects such that C must make way for B, etc., if they become gridlocked. A particular approach must be determined and implemented.

handling complex objects: e.g., parts A and B move into alignment as part of each one's constraints. How can a merged object AB be formed which will handle all open constraints of A and B? This may be necessary for two reasons: 1) Simplification. It is desirable to form hierarchical objects whose constraints can be simplified once their interrelationship constraints are satisfied. 2) Avoiding dithering or blocked motion. Consider objects A, B, C and D which are to merge to form ABCD. Suppose A and B merge, and C and D merge. If objects are moved sequentially, A cannot move toward CD since B blocks it. B will not move toward CD since that would mean detaching itself from A, or it may bounce between A and C. Similarly, C and D would not move. However, if A and B form a new object, AB, which inherits their unresolved constraints (namely, that B must join C), the new object will then be able to move toward C (or CD).

time-dependent constraints (optional): Create a simple mechanism for the user to specify dynamic constraints rather than just constant, a priori constraints.

II.
BACKGROUND
______________
The approach of Witkin et al. is very elegant, and permits an operator to specify complex interactions with just a few mouse clicks, once basic energy functions have been written and correctly parameterized. An advantage is that once the functions have been created, they can be easily re-used, or copied for use by multiple objects simultaneously. As shown in their paper, there are some serious limitations to overcome for a more powerful system consisting of multiple, self-animating objects. These limitations are those mentioned in the Goals section above. This work has relevance both to low-level motion and to slightly higher-level animation such as that done for autonomous actors in virtual life programs. In particular, the idea of using energy minimization functions to handle gridlock is really a higher level of abstraction. It is also possible to see how such an approach could be used both by high-level intention functions (which is essentially what is involved in the decision to "get out of gridlock") as well as by low-level motion functions (e.g., "move closer to object A").

Some relevant papers:
"Energy Constraints on Parameterized Models," by Witkin, Fleischer and Barr, in Computer Graphics, Vol. 21, No. 4, July 1987.
"Through-the-Lens Camera Control," by Gleicher and Witkin, in Computer Graphics, Vol. 26, No. 2, July 1992.
"Dynamic Constraints," by Barzel and Barr.
"Using Dynamic Analysis to Animate Articulated Bodies Such As Humans and Robots," by Wilhelms and Barsky, Graphics Interface 1985.

III. CONSTRAINTS
________________
Since the purpose of this project is to concentrate on self-animation, the objects involved will be graphically simple, but will embody all of the key areas identified in the Goals section. Toward that end, the objects will be pieces of a 2-dimensional puzzle which will occupy less than 50% of a display window.
The pieces will be randomly distributed about the space when the program begins, and the object of the program will be to "solve" the puzzle purely through self-animation of the pieces. The pieces will each have at least three types of energy functions which they must satisfy:
1) adjacency: each must connect to its neighbors in the proper pattern.
2) boundary: pieces cannot leave the "table" on which they are initially placed.
3) intersection: pieces must remain in the plane; they should not overlap each other at any time.
In addition, a fourth type of constraint may be added, to cause the final puzzle to be placed in the precise center of the screen when complete. This last constraint could be implemented by forcing any single piece to end in its correct absolute position. All other pieces would (should) then end in their correct destinations solely on the basis of their relationships to the anchoring piece.

IV. APPROACH
____________
The implementation will use polygon structures filled with portions of an image (the picture on the puzzle). In the initial phase, only a few rectangular pieces will be allowed, and they will only be translated, not rotated. Pieces will move sequentially, a few pixels each step (step size to be determined). All motion will be determined solely on the basis of energy minimization functions (including the overlap and border detection cases). The emphasis in Phase 1 will be on getting the basic energy functions and motion to work. In Phase 2, the local minima problem will be addressed, along with the formation of complex objects and the inheritance of sub-object constraints. In Phase 3 (time permitting), the complexity of the puzzle and of the graphics involved will be expanded. In order of preference, possibilities include rotating objects; increasing the number of objects; cutting the puzzle into non-rectangular objects; and handling 3-D objects.

V. OBJECTIVES
_____________
The basic objectives are to complete Phases 1 and 2, which will meet the overall goals.
In addition to showing the pieces of the puzzle and the table boundaries, user-selectable connection points will be displayed. The puzzle pieces will then be pseudo-randomly scattered about the table, and will automatically move themselves back into the original puzzle image. Some mechanism will be provided to avoid gridlock. This mechanism will preferably use the energy minimization functions, though it may not in the first iteration of the program.

VI. SCHEDULE
____________
Phase 1 to be completed by April 3, Phase 2 by May 1, and the write-up by the last cs680 meeting or the instructor's deadline. Phase 3 will be done only if time permits.
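The Phase 1 motion scheme described above (sequential, a few pixels per step, driven purely by energy minimization) can be sketched as follows for one translating piece. This is only an illustration under stated assumptions: the names are hypothetical, and the energy here is just the squared distance from the piece's connection point to its target, standing in for the full adjacency/boundary/intersection terms. A piece that cannot lower its energy by any trial move is exactly the gridlock case that Phase 2 must detect and escape.

```java
// Hypothetical sketch of one energy-minimizing step for a single
// translating puzzle piece. The real system would sum adjacency,
// boundary, and intersection energy terms; here energy is just the
// squared distance from (x, y) to the target (tx, ty).
class EnergyStep {
    static double energy(double x, double y, double tx, double ty) {
        double dx = x - tx, dy = y - ty;
        return dx * dx + dy * dy;
    }

    // Try small moves of `step` pixels and keep the one with the
    // lowest energy; returns the new {x, y}. If no move lowers the
    // energy, the piece stays put -- a crude gridlock indicator that
    // could trigger random motion or piece prioritization.
    static double[] step(double x, double y, double tx, double ty, double step) {
        double[][] moves = {{step, 0}, {-step, 0}, {0, step}, {0, -step}, {0, 0}};
        double bestE = Double.MAX_VALUE;
        double[] best = {x, y};
        for (double[] m : moves) {
            double e = energy(x + m[0], y + m[1], tx, ty);
            if (e < bestE) { bestE = e; best = new double[] {x + m[0], y + m[1]}; }
        }
        return best;
    }
}
```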
Modelling transparent and semitransparent solids in computer graphics can be very computationally expensive due to the raytracing required to simulate the behavior of light when passing through these media. In my project I propose to use environment mapping and transparency in order to approximate the refraction and reflection of light waves. From this project I hope to gain a better understanding of some of the physics of light behind refraction and reflection, at both the microscopic and macroscopic levels, and the math to model these properties. I also want to learn about the wavelength dependence of the index of refraction, and how this can be used to model various types of materials.

Refraction Basics, the physics: Refraction is the macroscopic phenomenon in which light rays, when passing from one refractively transparent medium to the next, change their trajectory. According to Foley and van Dam, the concept behind refraction is based upon the index of refraction. The index of refraction of a medium is the speed of light in a vacuum divided by the speed of light in that medium. This is not entirely accurate, because the speed of light itself is a constant; light only appears slower in a refractive material because, at the atomic level, it undergoes a series of interactions with the molecules of the material, giving the illusion of a slower speed. Also, because this interaction at the atomic level is highly dependent on wavelength (and phase, for that matter), the index of refraction is wavelength dependent. These properties can make modelling of light very difficult at microscopic detail levels.

Modelling Refraction: To model refraction at the macroscopic level, however, one approach is to use Snell's law.
Snell's law states that n1 sin(t1) = n2 sin(t2), where t1 and t2 are the angles of incidence and refraction measured from the normal to the surface where the two refractive materials meet (usually the surface of the object), and n1 and n2 are the indices of refraction of the two materials. The most common way to apply Snell's law to 3D modelling is to cast a ray for every pixel that needs to be displayed and use Snell's law to trace the path of light back to each light source.

My Approach to Modelling Refraction: My approach will be to cast rays only for each vertex of the refractive object's model and use the results of the raycasting to find the direction vectors of the rays exiting the object model. These direction vectors can be transformed into environment map texture coordinates. Once there are sets of texture coordinates for each vertex of each polygon that makes up the object, the object can simply be drawn using standard texture mapping techniques. For chromatic dispersion I will use the same approach, but with multiple passes over 6-12 different frequency ranges of light to encompass the entire visible spectrum. To generate the model, the polygons will be drawn using 6-12 layers of transparency, each layer being a filtered color map of the environment map representing one color frequency range.

Bibliography so far:
Graphics Gems II, Section V.8, "A Body Color Model: Absorption of Light Through Translucent Media," by Lee and Uselton.
Computer Graphics, Principles and Practice, Chapter 16, "Illumination and Shading," by Foley, van Dam, Feiner, and Hughes.
SIGGRAPH 94, "Wavelength Dependent Reflectance Functions," by Gondek, Meyer, and Newman.
The Visual Computer, 2(1), January 1986, 3-8, "Dispersive Refraction in Ray Tracing," by Thomas, S.W.
Proceedings of Graphics Interface '89, "Prisms and Rainbows: A Dispersion Model for Computer Graphics," London, Ontario, June 19-23, 1989, 227-234.
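The per-vertex ray bending at the heart of the approach above can be sketched as follows. This computes only the refracted direction from Snell's law, given unit incident and normal vectors; the environment-map lookup and the multi-pass dispersion layers are not shown, and the class name is hypothetical.

```java
// Hypothetical sketch: bend a ray at a surface using Snell's law
// (n1 sin t1 = n2 sin t2). I is the unit incident direction, N the
// unit surface normal pointing toward the incident side; n1 and n2
// are the indices of refraction on each side.
class Refract {
    // Returns the 3-component refracted direction, or null on total
    // internal reflection (no refracted ray exists).
    static double[] refract(double[] I, double[] N, double n1, double n2) {
        double eta = n1 / n2;
        double cosI = -(I[0]*N[0] + I[1]*N[1] + I[2]*N[2]); // -dot(I, N)
        double k = 1.0 - eta * eta * (1.0 - cosI * cosI);
        if (k < 0.0) return null; // total internal reflection
        double b = eta * cosI - Math.sqrt(k);
        return new double[] {eta*I[0] + b*N[0], eta*I[1] + b*N[1], eta*I[2] + b*N[2]};
    }
}
```

Evaluating this once per vertex, and once per frequency band for dispersion, gives the exit directions to convert into environment-map texture coordinates.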
Motivation, Project Goals I would like to address the problem of data structure visualization. Instructors spend an enormous amount of time drawing data structures on the blackboard. Much of this information is not really absorbed by the audience; for instance, it is very difficult to copy an undirected graph from the blackboard and be able to pay attention to the discussion as well. When it comes time to do homework problems, students may find it hard to recall the procedures clearly or to envision what kind of solution is possible. One way to alleviate this is to put interactive versions of the algorithms and data structures on-line so that students can tweak them at their own pace and get a feeling for how they work. The goal of this project is to maximize the potential audience for such a system while minimizing the effort it takes for an instructor to prepare an interactive example. To maximize the audience, I intend to implement the project as Java applets, so that students do not have to stand in line waiting for licensed software on a special-purpose computer just in order to see how, say, heapify() works. To minimize the instructor's implementation effort, I plan to locate most of the interaction (display, picking, modification) code as methods in the data structures themselves, so that the algorithms remain as pure as possible. Theoretical Issues, Previous Work There are not very many hard-core graphics issues in this project; it mostly amounts to a user interface problem, where the two kinds of users are students and instructors. In maximizing the audience, I mean to support as many display platforms as possible. This means I cannot make many assumptions about screen real estate or font availability. One of the earliest disseminated examples of Java code was the visual sorting applet written by James Gosling. 
Following is the entire bubblesort algorithm:

class BubbleSortAlgorithm extends SortAlgorithm {
    void sort(int a[]) throws Exception {
        for (int i = a.length; --i >= 0; ) {
            for (int j = 0; j < i; j++) {
                if (stopRequested) {
                    return;
                }
                if (a[j] > a[j+1]) {
                    int T = a[j];
                    a[j] = a[j+1];
                    a[j+1] = T;
                }
                pause(i, j);
            }
        }
    }
}

Most of the above code is pure bubblesort; only the lines containing stopRequested and pause() have to do with the interactive animation. All of the rest of the visualization code is hidden in superclasses. This is the sort of low-impact adaptation of standard algorithms that I'm looking for. Jeffrey McWhirter at WPI has done extensive work on visual programming and has built ``Escalante'', a large C++ framework for easily building visual systems. See his PhD thesis at http://cs.wpi.edu/~jeffm/research/papers/thesis.ps.Z. One application of his Escalante system is Faret: An Interactive Visual Learning Environment for Finite Automata and Regular Languages, a brief description of which is available at http://cs.wpi.edu/~jeffm/research/papers/Faret95.ps.Z. Although I have not yet investigated this material very deeply, I do expect that it will offer insight and hopefully point out some pitfalls. The BU Computer Science Department has used the Turing's World software, still available in the department Macintosh lab, in its Automata classes. Its main disadvantage is that it is licensed software available only for the Macintosh, and therefore not extensible for other applications. A very good description (with an icky background color) is available at http://csli-www.stanford.edu/hp/Turing1.html. Sklenar, Kudelka, and Snasel (diacriticals callously dropped) at the Palacky University of Olomouc describe a tool for algorithm visualization and provide a package of examples. Their UI may be useful, though I could not find much English description. Another potentially important (but still unexplored) resource is the journal ``The Computing Teacher''.
Problem Constraints, I/O, Evaluation Criteria
At first I will concentrate my attention on developing displayable and mutable (by mouse and keyboard) undirected graph and array objects. This should be enough to get a rudimentary DFA acceptor algorithm running. I have chosen this as the initial goal because neither the algorithm nor the data structures are very complex, and the finished program could play a useful role in the department's curriculum. It will be a straightforward modification of the basic deterministic accepting algorithm to illustrate different acceptance semantics such as existential and universal nondeterminism. This kind of modification will take place at the source code level but will hopefully be simple enough that anyone mildly familiar with Java will be able to tackle it without having to touch the UI code. Other data structures of interest include trees, heaps, etc. Having not yet programmed in Java, it remains unclear to me how much background UI code needs to be developed. Certainly some good high-level support is provided. For instance, see http://www.javasoft.com/applets/applets/GraphLayout/example4.html for a very reasonably-sized applet that uses energy minimization to realize geometry for a graph. On the other hand, I haven't seen scroll bars or menus in a Java applet yet, and it's not clear if I will have to implement them myself.

Structure of the Project
Initially I will more carefully examine existing work to sort out design issues that I'd rather not have surprise me at the wrong time. For instance, I would like this system to be able to illustrate the subset construction for removing nondeterminism from an NFA. This means that the underlying algorithm has an input graph and an output graph, and so the graphs must negotiate appropriate screen positions. This is rather more complicated than just highlighting individual nodes to indicate a DFA computation.
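The rudimentary DFA acceptor mentioned above might look like the following minimal sketch; the class and method names are hypothetical, and in the real system the marked line would notify the graph display to highlight the current node.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the deterministic acceptor to be animated.
// States are integers; delta maps a (state, symbol) pair to the next
// state. A UI layer would highlight the current state at each step.
class DFA {
    private final Map<String, Integer> delta = new HashMap<>();
    private final int start;
    private final Set<Integer> accepting;

    DFA(int start, Set<Integer> accepting) {
        this.start = start;
        this.accepting = accepting;
    }

    void addTransition(int from, char sym, int to) {
        delta.put(from + ":" + sym, to);
    }

    boolean accepts(String input) {
        int state = start;
        for (char c : input.toCharArray()) {
            Integer next = delta.get(state + ":" + c);
            if (next == null) return false; // no transition: reject
            state = next; // <-- visualizer would highlight `state` here
        }
        return accepting.contains(state);
    }
}
```

For example, a two-state machine accepting binary strings with an even number of 1s needs only four addTransition() calls, which is roughly the scale of example an instructor would prepare.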
I would eventually like the system to be able to illustrate the heapify() function for maintaining a heap. This means, in addition to drawing a heap structure, the system has to be able to represent recursive calls and multiple scopes. On the other hand, I am not intending to write a full-fledged debugger, so these issues must be carefully balanced. Ultimately I imagine implementing highly virtualized class definitions. For instance, an array object should know when its 5th element is being referenced so that it can highlight that box on the display if appropriate; this requires intercepting almost all ordinary accesses. A standard objection is that this slows things down too much, but in a visual, step-by-step environment, this does not seem like a serious concern.

Deliverables
Source code; a demonstration; a report describing design and implementation tradeoffs and instructions for writing algorithms using the supported data structures.

Milestones
Designing the class structure for supported data types
Displaying / editing an array: setting bounds, writing elements
Displaying / editing a graph: adding and deleting nodes, setting edge labels, specifying layout, etc.
(Above hopefully completed by April 3)
Multiple objects negotiating display real estate with each other, framed and scrollboxed if necessary
Running the visual NFA computation algorithm
Running the visual NFA subset construction algorithm
Writing the final paper
(Above to be completed by May 1)
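The "highly virtualized" array idea above can be sketched as follows (names hypothetical): every read and write goes through a method, so a display layer can be notified of exactly which cell was touched and highlight it. Here the notification is reduced to recording the last index touched.

```java
// Hypothetical sketch of a virtualized array: all access is
// intercepted so a display layer can highlight the touched cell.
class VisualArray {
    private final int[] data;
    int lastTouched = -1; // index the display layer would highlight

    VisualArray(int size) {
        data = new int[size];
    }

    int get(int i) {
        lastTouched = i;      // notify display on read
        return data[i];
    }

    void set(int i, int v) {
        lastTouched = i;      // notify display on write
        data[i] = v;
    }
}
```

The overhead of routing every access through a method call is exactly the slowdown the standard objection raises, and exactly what a step-by-step visual environment renders moot.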
For the final project, I plan to model the growth of plants under a variety of conditions. The plants will begin as seedlings, merely a simple stalk, and will grow into various forms based on gravity, stress, sunlight, and robustness (thickness and bushiness). Plants will be relatively simple in design, consisting of only branches (the stem will be considered a branch) and leaves.
The plants' growth patterns and development will be based on L-systems for their shape, due to the natural plant-like self-similarity that arises. Additionally, constraints due to gravity and plant strength, the plant's desire for sunlight, and a factor of randomness will contribute to the growth of different types of plants. In previous work, Prusinkiewicz et al. developed techniques for modeling herbaceous plants using L-system representations and axial structures. Hearn and Baker discuss fractal methods and self-similarity in relation to the modeling of trees and branches. I will be using stochastic, context-sensitive L-systems to model the branches through several generations of growth. This will provide the simulation with random "motion" in the growth, similar to the randomness found in the chaos of nature. The context-sensitive nature of the L-system will provide different growth patterns based on information from the parent. I will be representing the L-systems not as a string which is parsed and re-written, but rather as a tree, with various nodes that can modify their own structure, and their surrounding structures, in order to advance the growth. So the nodes will only affect their local area, but the tree will globally arise in parallel.
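As a baseline illustration of the rewriting idea, a deterministic, context-free string rewriter is sketched below. The planned system would instead use stochastic, context-sensitive rules over a tree structure, so this shows only the simplest form of the mechanism, and the bracketed branching rule used is a hypothetical example.

```java
import java.util.Map;

// Minimal deterministic, context-free L-system rewriter, shown only
// as a baseline for the stochastic, context-sensitive, tree-based
// version planned for the project.
class LSystem {
    // Apply the production rules to every symbol in parallel; symbols
    // without a rule (brackets, turn commands) are copied through.
    static String rewrite(String axiom, Map<Character, String> rules) {
        StringBuilder next = new StringBuilder();
        for (char c : axiom.toCharArray())
            next.append(rules.getOrDefault(c, String.valueOf(c)));
        return next.toString();
    }
}
```

For example, with the single (hypothetical) rule F -> F[+F]F, two generations starting from the axiom "F" already show the self-similar branching that motivates the approach.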
The problem will be constrained in several ways. First, an upper bound on the size of the plant will be imposed (but is as yet undecided upon). This is necessary because theoretically a plant could keep growing until its computational complexity becomes too much for the simulator to bear. In the simulation, outside elements such as weather, predators, nutrients, etc. will not be included, due to the obvious complexity they would add to the system. The plant itself will be composed simply of branches and leaves. Flowers, buds, and other attributes of a plant will not be modeled. The input to the system will consist of user interface widgets (sliders, etc.) in order to adjust the peculiarities of the specific system. It is still undecided whether dynamic manipulation of the system will be allowed during the growth process. The output of the system will be in the form of an animation on the screen as the plant is being rendered. I am also debating whether or not to include output in the form of a text file which could be read by a popular ray tracing package (such as POV) for detailed ray-traced images.
The tree will be represented in 3D, with a hierarchical structure of branches. The branches will hold information about themselves (size, children, etc.), and the terminating nodes of the tree will be considered leaves. I will be writing the demo code in OpenGL on the SGIs in the graphics lab. The interface will consist of an output window, which displays the rendered view of the tree at each iteration, and an input window, which allows the user to modify various settings of the environment.
A general outline of the program's run will be as follows:
1) Get parameters (from user input).
2) Generate L-system rules based on the tree parameters.
3) Compute the next generation of growth.
4) Compute modifications on the branches of the tree.
5) Modify the tree's structure based on the pull of gravity.
6) Unless the limit of growth has been reached, go to 3.

By April 3rd I hope to have the framework for the application completed. This includes the user interface, a general code structure outline, and stubs for all functions that are to be implemented. I would like to have all my research on the subject completed by this time, so that in the following weeks I can concentrate on building a working model. I also plan to have at least simple L-system-generated plants working, although they may not yet be affected by the previously mentioned forces acting on them. By May 1st I plan to have a complete working model of the plants, complete with animation and user control/interaction.
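Step 5 of the outline could be sketched as below. The bending model here (droop proportional to the supported load and the branch's horizontal lever arm) is a hypothetical simplification for illustration, not the final physics, and all names are made up.

```java
// Hypothetical sketch of gravity acting on one branch. The angle is
// measured in radians from vertical (0 = straight up); the droop added
// each growth step grows with the load the branch supports and with
// its horizontal lever arm, and shrinks with the branch's stiffness.
class GravityBend {
    // load: combined weight of the branch and everything it supports.
    // stiffness: resistance to bending; larger = stiffer branch.
    static double bend(double angle, double load, double stiffness) {
        double lever = Math.sin(angle);            // horizontal lever arm
        return angle + (load / stiffness) * lever; // droop toward horizontal
    }
}
```

A perfectly vertical branch has no lever arm and does not droop, while a tilted branch droops more as it carries more growth, which is the qualitative behavior wanted from step 5.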
Stan Sclaroff
Created: April 15, 1996