A Proposal for Versioning Support for the Chimera System

E. James Whitehead, Jr.,
Kenneth M. Anderson,
Richard N. Taylor

Department of Information and Computer Science
University of California, Irvine
Irvine, California 92717-3425
FAX +1.714-856-4056
e-mail: {ejw,kanderso,taylor}@ics.uci.edu

Abstract

This paper proposes a method of versioning hypertext structures which are stored separately to versionable objects. Key requirements are listed, including the need to maintain subsets of the hypertext structure which might be affected by edits to a given object version, and a means of associating these hypertext structure subsets to externally stored object versions. Central to meeting these requirements are the concepts of configuration and version association table.

Key Words and phrases: hypertext versioning, configuration management, heterogeneous databases, hypertext, link servers, separation of concerns, software development environments.

This material is based upon work sponsored by the Advanced Research Projects Agency under Grant Number MDA972-91-J-1010. The content of the information does not necessarily reflect the position or the policy of the Government and no official endorsement should be inferred.

1. Introduction

Existing software development environments are characterized by heterogeneity, especially with respect to object management. When developing software it is desirable to capture relationships between requirements, design, test cases, and code using hypertext links. However, the design tool may have a proprietary data repository (such as Software thru Pictures [8]) which cannot be versioned, and the requirements documents and the code files may be under the control of a configuration management system with its own proprietary object database (such as CaseWare/CM [2]). This makes it unreasonable to assume that hypertext data (such as anchors and links) can share the same object repository as the data they relate. Neither the Chimera [1] nor the Microcosm [3] system make this assumption, instead providing a strong separation between hypertext data storage and object data storage.

Versioning of hypertext structures in situations where the hypertext structure is maintained in one repository and the objects being related are maintained in one or more separate repositories is significantly more difficult than the case where both hypertext structures and objects are stored in the same repository. However, separating hypertext structure from objects allows a clear separation of responsibility between hypertext structure versioning and object versioning. This separation is the fundamental departure point between this proposal, the Palimpsest data model [4] and the CoVer version server [5], which address versioning of hypertext structure and objects simultaneously. The versioning system used in HyperPro [6] is better in that it separates anchor storage from the objects themselves, but it too takes responsibility for versioning objects. However, with versioning systems such as RCS [7] taking up 13K lines of code, and modern commercial configuration management systems such as CaseWare exceeding 750K lines of code [2], it makes more sense to separate hypertext structure versioning from object versioning and thus avoid duplication of existing object versioning solutions.

This paper presents a proposed extension to the concepts presented in the Chimera architecture to allow for hypertext versioning support in a heterogeneous object storage environment containing a configuration management system. While this work has not yet been validated by the augmentation of the Chimera system according to this proposal, it is expected that this work will occur in the Fall of 1994.

2. Requirements

Hypertext versioning support for a heterogeneous object store environment shares many of the requirements identified for the non-heterogeneous case. To summarize an excellent description in [5], these requirements are: support for versioned and unversioned objects, versioning for all hypertext concepts (anchors and links), allowing for multiple version branches, and the ability to regress to a given version of an object. There are also additional requirements for the heterogeneous object storage case:

  1. Objects stored external to the hypertext system and hypertext structures stored internal to the hypertext system must be capable of independent development. The hypertext concepts must be capable of manipulation independent of the external system, and vice-versa. Hypertext versions will get out of synchronization with object versions -- a hypertext versioning system must accommodate this.
  2. It must be possible to have multiple named and versionable subsets of the overall hypertext structure. Using these subsets, it is possible to capture the portion of the hypertext structure that might be affected by changes to an object.
  3. There must be some method of associating an external object version to a hypertext structure subset version, and vice-versa. This association provides the bridge between the external object versions and hypertext structure versions, allowing a given external object version to be mapped to the matching hypertext structure version.
  4. The versioning of hypertext structures should occur as transparently to the user as possible. The versioning system should not constantly prompt the user for version names, instead automatically selecting and assigning version names where possible.
  5. Demands on external tools should be minimized. Hypertext server systems assume that some level of modification of external tools will be required to use the hypertext functionality of the server. However, this modification should be as minimal as possible to ease tool integration.
  6. An external object repository must be able to retrieve an arbitrary version of an object. This is necessary because a link may contain an anchor which refers to an older version of an object, which must be retrieved to complete a link traversal.
  7. An external object repository must provide a function which, given an object, return its version. This function is needed to identify an object's version so the correct hypertext structure version can be used while the object is being manipulated.
  8. An external object repository must provide a function to return the immediately preceding version of an object. This function is employed by the hypertext system to determine if an external object has been checked-out since the last time its associated hypertext structure was accessed. This will occur every time an external object is checked-in, then checked-out, as external object repositories are not required to inform the hypertext system of actions upon objects.

The next section will briefly describe the Chimera system, and is followed by a section describing how the Chimera system may be augmented to meet the hypertext versioning requirements, numbers 1 - 5. Meeting the external versioning object management system requirements, numbers 6 - 8, is not discussed, as it is assumed these features can be found in modern version control systems. For example, the RCS command `co' meets requirement 6, `ident' meets requirement 7, and parsing the output of `co' or `rlog' meets requirement 8. The requirements do assume the use of a check-in/check-out system, and it is unclear whether the approach outlined in this paper would extend to systems that did not employ a check-in/check-out protocol.

3. Briefly, Chimera

The Chimera system, described in great detail in [1], is a hypertext server which provides persistent storage of a hypertext structure, functions for manipulating that structure, and support for traversing links within the structure. Key differences between Chimera and other hypertext systems are that Chimera does not store hypertext structure within objects related by that structure, and Chimera associates anchors with views of objects, not the objects themselves. This latter difference is critical for supporting multiple simultaneous views of an object, such as when spreadsheet data is being viewed simultaneously as a grid and as a graph -- it is quite possible that there will be anchors on the graph that will not appear in the grid representation.

Chimera hypertext structures are composed of instances of the following concepts:

Throughout this paper, an instance of any of the above concepts is referred to as a concept instance. For example, an individual anchor or an individual link is a concept instance. All concept instances within the Chimera system are tagged with a unique concept instance identifier upon creation. An instance of a particular concept may be described more specifically, as with an instance of an anchor being described as an anchor instance. Similarly, a particular version of an individual anchor is referred to as an anchor instance version.

Chimera clients access the functionality of the Chimera server through a programmatic interface known as the Chimera application program interface (API). Chimera currently has an API for the Ada language and the C language.

4. Meeting the Requirements

The primary change required of Chimera concepts is to add the time dimension by making all concepts versionable. With versioning added, each concept instance becomes a forest of directed acyclic graphs (DAGs) of component instance versions. Directed acyclic graphs naturally occur in versioning systems which allow parallel development. Tree-like structures occur when branch development is allowed, with a fork in the development path occurring at one or more objects. When the possibility of merging development paths is permitted (necessary for systems such as CaseWare which allow simultaneous parallel development of a single object), the tree structure evolves into a DAG, as two branches (and two objects) are merged together.

A forest of DAGs gives the advantage that long independent branches can be completely separated from their original branch, easing the identification of these parallel development paths. In code development, independent branches are useful in code reuse situations where an object may be derived off of an existing version, then used in a different product. The object will evolve through a largely independent development path, but still be capable of merging changes from other DAGs in the forest as errors are repaired. It is anticipated that the flexibility inherent in forests of DAGs will be necessary for hypertext structure versioning.

Versions are named using the reverse outline numbering scheme. With this scheme, the next version function is defined as follows. Assume an initial version number of 1. If a branch is made from an object, add the string `.1' to the end of the object's version. If a new sibling is derived from an object, add 1 to the last digit of the version. Given the version `1.1', the next version function returns `1.2'. Given the version `1.2', a new branch would be assigned the version `1.2.1'.

The Chimera server is responsible for providing storage for concept instance versions. It is anticipated that RCS-like backward deltas will be used to minimize the storage requirements of each version.

The Chimera hypertext structure versioning scheme presented in this section allows for modification of the hypertext structure independent of object modification. For example, modifications to a hypertext may be made using a Chimera web editor without any objects being notified or changed. This meets hypertext versioning requirement 1, independent development of hypertext structure and objects. Note that such independent development will cause the hypertext structure and the objects to become unsynchronized. The following paragraphs present a policy, supported by specialized data structures, that maintains synchronization between separately versioned objects and hypertext structure. To this end, the Chimera server provides two new concepts, configuration and version association table, described below:

A configuration is a named set of concept instance versions, with the special property that each concept instance will be represented by at most one concept instance version. A concept instance version may be a member of multiple configurations and thus a configuration does not exclusively contain a concept instance version. For example, anchor instance 4 may be represented by only one version in a given configuration version, and so either anchor 4,1 or anchor 4,2 could be a member of a configuration version, but anchor 4,1 and anchor 4,2 could not both be members of the same version of a configuration. However, anchor 4,1 could be a member of two different configurations. A configuration can be thought of as a collection of pointers to concept instance versions. A configuration is versionable, and the membership of the configuration and the versions of concept instances may all change from one version to the next. A hypertext, and two configurations derived from it are shown in Figure 1.

+----------------------------------+      +-----------------------------------+
|+---------+         +---------+   |      |+---------+           +---------+  |
||Anchor1,1|         |Anchor2,5|   |      ||Anchor1,1|           |Anchor2,5|  |
|+---------+\        +---------+\  |      |+---------+\          +---------+  |
|        |   \                   \ |      |     |      \1,1            |      |
|View1,6 |    \1,1                \|      |     |2,5    \              |3,4   |
+--------|-----\-------------------\3,4   |+---------+ +---------+ +---------+|
         |2,5   \                   \     ||Anchor3,2| |Anchor4,2| |Anchor7,4||
+--------|-------\-----------------+ |    |+---------+ +---------+ +---------+|
|+---------+      \+---------+     | |    |                 ________________  |
||Anchor3,2|       |Anchor4,2|     | |    +----------------|Configuration1,5|-+
|+---------+       +---------+     | |                      ----------------
|           +---------+            | |
|           |Anchor5,1|            | |    +-----------------------------------+
|View2,2    +---------+            | |    |     +---------+      +---------+  |
+---------------/------------------+ |    |     |Anchor5,1|      |Anchor2,5|  |
               /7,3                  /    |     +---------+      +---------+  |

+-------------/--------------------+/     |        /7,3               |       |
|       +---------+                /      | +---------+               |3,4    |
|       |Anchor6,2|               /|      | |Anchor6,2|           +---------+ |
|       +---------+   +---------+/ |      | +---------+           |Anchor7,4| |
|                     |Anchor7,4|  |      |                       +---------+ |
|View3,4              +---------+  |      |                 ________________  |
+----------------------------------+      +----------------|Configuration2,5|-+
                                                            ----------------

Figure 1: Configuration Example. On the left hand side, a sample hypertext structure is given, showing three views, anchors defined on those views, and lines representing links containing these anchors. Number pairs are concept instance identifier, version number pairs. Number pairs next to lines refer to links. On the right, Configuration 1,5 contains all hypertext objects that would be affected if the object presented in View 1,6 is modified. Configuration 2,5 displays the same subset of the hypertext structure for View 3,4. Chimera objects and viewers have been omitted for clarity.

The purpose of the configuration concept is to name, collect, and version the subset of a hypertext structure which might be affected by modifications to an external object. With this named, versioned subset, a mapping to an external object version is possible and easy to express. A configuration also makes it convenient to create new children or branches of a hypertext subset upon the creation of a new object version. The configuration concept meets hypertext versioning requirement 2.

A configuration supports several operations upon itself and its members. Since a configuration is a set, it is possible to create and delete a configuration, and add and delete concept instance versions from a configuration. It is also possible to create new versions of all concept instances currently within the configuration, and replace the current concept instance versions with the new, writable versions. This is equivalent to a `check-out' of all concept instance versions in a configuration. Another operation allows all concept instance versions within a configuration to be transitioned to a non-writable state. This is equivalent to a `check-in' of all concept instance versions in the configuration.

While checking-in a configuration is a relatively straightforward operation, checking-out a configuration is a tricky process. When a configuration is checked-out, new versions will first be created for all container concepts in the configuration -- views and links. Next, new versions of all contained concept instance versions will be created. For example, once all new link versions have been created, all anchors contained by these links will have new versions created. Finally, the new versions will be placed within their containers. Finishing the example, new anchor versions are placed inside the links that contain them, replacing the previous anchor versions.

The check-in and check-out operations on a configuration, combined with the automatic version numbering provided by the next version function, provide functionality which partially satisfies hypertext versioning requirement 4 (hypertext versioning transparency).

The version association table maps external object versions to configuration instance versions. Given the name and version of an object stored external to the hypertext system, the version association table will return the associated configuration instance version. Similarly, the version association table will return an object version associated with a configuration instance version. The version association table is managed by the Chimera server, which provides version association table lookup functions. The version association table directly satisfies hypertext versioning requirement 3.

The requirement that provides for independence of versioning between the hypertext system and the external object store (requirement 1) ensures that lookup failures will occur. Determining what to do when a configuration lookup does not return a configuration instance version is one of the most difficult problems inherent in this approach to hypertext versioning. Proposed methods for handling this problem are addressed in the example section below. A sample version association table is shown in Figure 2.

            Object Version | Configuration Version
            ---------------+----------------------
                defs.h,1   |    C2,1
                defs.h,2   |    C2,2
                defs.h,3   |    C2,3
                defs.h,3.1 |    C2,3.1
                defs.h,3.2 |
                defs.h,4   |    C2,4

Figure 2: Version Association Table Example. External object versions are listed on the left side of the table, while corresponding configurations are listed on the right hand side of the table. Note that this is an association table, which can perform lookups of object versions from configuration versions, and vice versa.

To support the requirement for transparency (requirement 4), the Chimera server will maintain a set of active configuration instance versions for each user. If an operation is performed on a given concept instance, it is assumed to be performed on a concept instance version found in one of the active configurations. Thus the current version of a concept can be automatically selected without user intervention. This allows the existing Ada and C application program interfaces for the Chimera system to be used without modification, since they refer to concept instances using a concept instance identifier, and have no built-in support for providing version names. This should help to minimize the impact of the adding hypertext versioning capability to existing applications, partially meeting hypertext versioning requirement 5 (minimize tool impact).

5. Effects on the Chimera API

Support for configurations and versioning of hypertext structures has several implications for existing functions in the Chimera API. First, the Chimera API needs to provide access to operations which act upon configurations and version association tables. Additionally, existing functions within the Chimera API require modification to support versioning, and to support the policy which maintains configuration membership as the subset of the hypertext structure that might be affected by changes to an external object. As a result, the Chimera API is an integral aspect of meeting requirement 2.

Entries within the API can be broadly categorized as creative, destructive, and modification functions. Creative functions encompass creating new concept instances, such as creating a new anchor instance version. Destructive functions encompass deleting concept instance versions, such as deleting a link instance version. Modification functions concern either modifying the concept's attributes, or in the case that the concept is a container, modifying the set of objects contained. For example, a link can be considered a container of anchors since a Chimera link is a set of anchors. Deleting an anchor from a link is considered a modification function on the link.

Concepts within Chimera may be contained by other concepts. For example, an anchor is always contained within one view, and may be contained in zero or more links. This containment relationship affects what occurs whenever a destructive function is called. If concept instance version (CIV) A contains CIV B, and a destroy(B) is performed, then A must remove B from its members. The implication for versioning for this case is that both CIV A and CIV B must be writable to perform a destroy(B). As a result, a check for this precondition is added to all destroy functions for Chimera anchors, viewers and objects. A destroy(B) also removes CIV B from all writable configurations. If CIV B cannot be removed from all configurations to which it belongs, then it is simply removed from all writable configurations, but still remains in existence.

When a creative function is called, a concept instance is created, and an initial version of that instance is also created. This new concept instance version is not added to a configuration until it is either manually placed there (through a function call), or the concept instance version either contains or is contained by a concept instance version that is already a member of a configuration. For example, if a new link instance version is created, it does not automatically get added to a configuration.

When a modification function is called on non-container concepts (such as anchors), there are no effects on a configuration. These modification functions simply require the concept instance version to be in a writable state. Similarly, if only the attributes of a container concept are modified, there are no effects on any configuration. However, if the membership of a container concept is modified, then there are possible effects on a configuration. If members are deleted from a container concept, then there are no effects on the configuration. Thus deleting an anchor from a link would not remove that anchor from the link's configuration. However, if a member is added to a container concept, then the container concept is added to the member's configuration, and the member is added to the container's configuration, if applicable. For example, if an anchor instance version from configuration A is added to a link instance version which belongs to configuration B, then the link instance version is added to configuration A, and the anchor instance version is added to configuration B. Since anchors in Chimera are automatically included in a view, the creation of an anchor instance version automatically causes each anchor to be added to the view's configuration.

6. Example

6.1 Initial Creation

In this example, a new object has just been created using an external configuration management tool. It has the name `main.c' and a version of `1'. A text editor which has been enhanced with hypertext editing capabilities is used to edit the main.c file. The text editor reads main.c and looks for a version identification string within the file. If one is found, it reads the version from the identification string. If none is found, it queries the configuration management tool for the version of the main.c instance it is editing.

Object version in hand, the text editor queries the version association table of the Chimera server. Since this is a new object, the query does not return a configuration instance version. The text editor then queries the configuration management tool for the previous version of the main.c object. Since the configuration management tool responds that there is no previous version, the text editor instructs the Chimera server to create a new configuration instance, with version 1 (C1,1). The text editor, through calls to the Chimera API, next creates a new object instance version and a new view instance version, then adds them to the newly created configuration instance version. The text editor adds this configuration to the set of active configurations for this user, and also informs the Chimera server to add an entry to the version association table mapping main.c,1 to C1,1.

6.2 Adding an Anchor and a Link

During the editing session, an anchor, A1,1, is created. This anchor is automatically added to the active configuration. A link, L1,1, is created but it is not initially added to the configuration. Only when anchor A1 is added to link L1,1 does link L1,1 get added as a member of the active configuration C1,1. The situation at this point is depicted in Figure 3.

After a few more changes, the user saves their work and quits the text editor session. Configuration C1,1 is removed from the user's list of active configurations. The user does not check-in the object version, and a little while later starts up another text editor session on main.c, version 1. This time the text editor queries the version association table and finds a match -- configuration C1,1. The text editor instructs Chimera to add configuration C1,1 to the set of active configurations, and resumes editing of main.c,1, eventually quitting the text editor.

+================================================+    +=============+
|                Chimera Server                  |    |   Chimera   |
+------------------------------------------------+    |    aware    |
| Hypertext Structure      Active Configurations |<..>| text editor |
|                         +---------------------+|    +=============+
|    V1,1     L1,1        |     V1,1     L1,1   ||           ^
|   /  |  \    |          |    /  |  \    |     ||           V
|  |   |   \   |          |   |   |   \   |     ||    +===============+
|  v   v    v  v          |   v   v    v  v     ||<..>| Configuration |
|Vr1,1 O1,1 A1,1          |Vr1,1  O1,1 A1,1     ||    |  Management   |
|                         +---------------------+|    |   System      |
|            Version Association Table           |    +---------------+
| |          -------------------------           |    |               |
| |            main.c,1  |  C1,1                 |    |  main.c,1     |
| v = contains                                   |    |               |
+================================================+    +===============+

V = View, Vr = Viewer, O = Object,      ^ = <..> = communicates
A = Anchor, L = Link                    V 

Figure 3: Example snapshot. A Chimera-aware text editor is editing file main.c, version 1. Configuration C1,1 is active and contains that subset of the hypertext structure which might be affected by modifications to main.c,1.

6.3 Editing a Newly Checked-out Object

This time the user starts up the text editor on a different object named defs.h, version 5. The object defs.h has been edited before, but has just been checked out to create version 5. The text editor starts up and queries the version association table, but does not find any matching configurations for this version. The text editor then queries the configuration management system for the previous version of defs.h, and gets the answer of version 4. Next, the text editor queries the version association table and is informed that defs.h is associated with configuration C2,4. The text editor then informs the Chimera server to create new versions for all concepts in configuration C2,4 to populate a new configuration instance version, C2,5. The text editor then informs the Chimera server to add an entry to the version association table mapping defs.h,5 to C2,5.

6.4 Link Traversal

Next, a previously defined link is traversed. The traversed link contains anchor A3,2 which belongs to the view created on defs.h,5 and previously created anchor A5,3 which belongs to a view containing an mpeg movie. The Chimera server receives the link traversal command from the text editor, then searches for a view instance version which contains anchor A5,3. Once found, the Chimera server retrieves the object version instance contained by the view. The Chimera server next retrieves the external object from the configuration management system. The Chimera server now spawns a viewer to view the object, and sends the viewer a link traversal message. The viewer presents a view of the object to the user, completing the link traversal. To the user, the retrieval of the appropriate object version has been completely transparent.

7. Conclusion

Requirements and a proposed method for providing hypertext versioning support to hypertext structures stored in a separate repository from the objects related by the hypertext structures have been presented. The versioning method requires all hypertext structures that might be affected by editing of an object to be collected into a configuration, which is a subset of the total hypertext structure stored by the hypertext repository. External object versions are related to configuration versions via a version association table. Policies which enable the editing of the hypertext structure while maintaining a configuration as the subset of the hypertext that might be affected by modification to the structure have been presented. While this method for hypertext versioning has not yet been validated by a functioning system, it is anticipated that the method described within will meet the stated requirements for versioning hypertext structure in a heterogeneous object storage environment.

Acknowledgments

The author would like to thank Rebecca Grinter for her thoughtful comments and helpful review.

References

[1] Kenneth M. Anderson, Richard N. Taylor, and E. James Whitehead, Jr. Chimera: Hypertext for Heterogeneous Software Environments. In Proceedings of the ACM Conference on Hypertext, Edinburgh, Scotland, September 1994.

[2] CaseWare, Inc. 108 Pacifica, Irvine, CA 92718-3332. CaseWare User's Guide, 1993.

[3] Hugh Davis, Wendy Hall, Ian Heath, Gary Hill, and Rob Wilkins. Toward an Integrated Information Environment with Open Hypermedia Systems. In Proceedings of the ACM Conference on Hypertext, Milano, Italy, November, 1992.

[4] David G. Durand. Palimpsest: A Data Model for Revision Control. Paper accompanying poster presentation at Hypertext '93, November, 1993.

[5] Anja Haake. CoVer: A Contextual Version Server for Hypertext Applications. In Proceedings of the Fourth ACM Conference on Hypertext, pages 33-42, Milano, Italy, December, 1992.

[6] Kasper Osterbye. Structural and Cognitive Problems in Providing Version Control for Hypertext.. In Proceedings of the Fourth ACM Conference on Hypertext, pages 33-42, Milano, Italy, December, 1992.

[7] Walter F. Tichy. Design, Implementation, and Evaluation of a Revision Control System. In Proceedings of the Sixth International Conference on Software Engineering, pages 58-67, Tokyo, Japan, September, 1982.

[8] Anthony I. Wasserman and Peter A. Pricher. A Graphical, Extensible Integrated Environment for Software Development. In Proceedings of the Second ACM SIGSOFT/SIGPLAN Software Engineering Symposium on Practical Software Development Environments, pages 131-142, December 1986. Appeared as SIGPLAN Notices 22(1), January 1987.