Nested Composite Nodes and Version Control in Hypermedia Systems

Luiz Fernando G. Soares     
Noemi L. R. Rodriguez       

Depto. de Informatica, PUC-Rio    

Marco Antonio Casanova

Centro Científico Rio
IBM Brasil

1 - Introduction

We describe in this paper an extension of the Nested Context Model [Casa91] which, among other features, supports version sets, permits exploring and managing alternate configurations, maintains document histories, supports cooperative work and provides automatic propagation of version changes. The concept of version context is used to group together nodes that represent versions of the same object at some level of abstraction. Support for cooperative work is based on the idea of public hyperbase and private bases. The automatic propagation of versions uses the concept of current perspective to limit the proliferation of versions. All of the proposed facilities have as a goal the minimization of the cognitive overhead imposed on the user by version manipulation. Although the discussion about version control is phrased in terms of the Nested Context Model, the major ideas apply to any hypermedia conceptual model offering nested composite nodes, such as HyperBase [ScSt90] and HyperPro [Oste92]. The Nested Context Model is the conceptual model of the HyperProp system, whose architecture is described in [SoCC93].

We adopt as the metric for our versioning mechanism the remarks posed in [Hala88,Hala91], in [Oste92] and in [Haak92], briefly described by the following requirements:

In addition to these requirements, we believe that the notion of version should also cover two other situations:

In the remainder of this paper we will present our model and discuss how it addresses the requisites stated above.

2 - The Extended Nested Context Model

2.1 - Basic Concepts

The definition of hypermedia documents in the Nested Context Model (NCM) [Casa91] is based on two familiar concepts, namely nodes and links. Nodes are fragments of information and links interconnect nodes into networks of related nodes. The model goes further and distinguishes two basic classes of nodes, called terminal and composite nodes, the latter being the central concept of the model. Figure 1 illustrates the class hierarchy proposed. In this paper we discuss only some classes of this hierarchy; for further information, the reader is referred to [SoCR93].

    |-- link
    `-- node
          |-- composite
          |     |-- trail
          |     `-- context
          |           |-- annotation
          |           |-- private base
          |           |-- public hyperbase
          |           |-- version context
          |           `-- user context **
          `-- terminal **
                |-- text
                |-- graphic
                |-- video
                `-- ...

                Figure 1 - NCM Class Hierarchy

Each node has a set of anchors that acts as the external interface of the node. Anchors encapsulate the definition of regions. In conformance with the Dexter Model [HaSc90], every anchor has an associated id and value. Anchors are used as end points of links.

A terminal node contains data whose internal structure, if any, is application dependent and will not be part of the model. A composite node groups together entities, called components, including other composite nodes. The class of composite nodes may be specialized into other classes, including the class of context nodes.

A context node groups together sets of links, terminal nodes, trails and context nodes, recursively. Note that, unlike the components of a general composite node, the components of a context node C form a set. We say that a component A of C is contained in C. Note that the same component A can also be contained in a composite node B that is itself contained in C, since the "is contained" relation is not recursive. Context nodes are specialized into five subclasses.

A user context node C is a context node which groups together:

  1. a set S of terminal or user context nodes. Nodes in S are said to be contained in C.
  2. a set L of links, such that each link l in L has its base nodes contained in C. Links in L are also said to be contained in C.

To identify through which sequence of nested context nodes a given node is being observed and which links actually touch the node from that nesting, we define the notion of perspective for a node. A perspective for a node N is a sequence P=(N1,...,Nm), with m>=1, such that N1 = N, Ni+1 is a composite node and Ni is contained in Ni+1, for all i in [1,m). Since N is implicitly given by P, we will refer to P simply as a perspective. Note that there can be several different perspectives for the same node N, if this node is contained in more than one composite node. The current perspective of a node is that traversed by the last navigation to that node.
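
The definition of a perspective can be illustrated with a small sketch. It is only an illustration under simplifying assumptions, not the HyperProp implementation: the `Node` class, its `composite` flag and `components` list, and the helpers `is_perspective` and `perspectives` are hypothetical names, and containment is assumed to be acyclic.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class Node:
    """Hypothetical node; composite nodes carry a list of components."""
    name: str
    composite: bool = False
    components: list = field(default_factory=list)


def is_perspective(p):
    """True if p = (N1, ..., Nm), m >= 1, and every Ni is contained
    in the composite node Ni+1."""
    if len(p) < 1:
        return False
    return all(
        outer.composite and any(c is inner for c in outer.components)
        for inner, outer in zip(p, p[1:])
    )


def perspectives(node, roots):
    """Enumerate the perspectives for `node` that end at one of `roots`
    (assumes containment is acyclic)."""
    found = []

    def walk(current, path):
        # `path` runs from the outermost composite down to `current`.
        path = path + [current]
        if current is node:
            found.append(tuple(reversed(path)))
        for child in current.components:
            walk(child, path)

    for root in roots:
        walk(root, [])
    return found


# M is contained in both C1 and C2, which are contained in N.
m = Node("M")
c1 = Node("C1", composite=True, components=[m])
c2 = Node("C2", composite=True, components=[m])
n = Node("N", composite=True, components=[c1, c2])
for p in perspectives(m, [n]):
    print([x.name for x in p])
```

In this example the node M has two perspectives ending at N, one through C1 and one through C2, which is the situation the text describes for a node contained in more than one composite node.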

A hyperbase is any set H of nodes such that, for any node N in H, if N is a composite node, then all nodes contained in N also belong to H.
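
The closure condition in this definition can be checked mechanically. The sketch below assumes, purely for illustration, that nodes are identified by strings and that containment is given as a dictionary from composite node to its set of components; `is_hyperbase` is a hypothetical helper name.

```python
def is_hyperbase(nodes, contained_in):
    """Check the closure condition: every component of a composite node
    in `nodes` is itself in `nodes`.

    nodes        -- set of node identifiers
    contained_in -- dict mapping a composite node to the set of its components
                    (terminal nodes simply have no entry)
    """
    return all(contained_in.get(n, set()) <= nodes for n in nodes)


containment = {"C": {"A", "B"}}
print(is_hyperbase({"C", "A", "B"}, containment))
print(is_hyperbase({"C", "A"}, containment))
```

The first call satisfies the condition; the second does not, because B is contained in C but is missing from the set.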

The problem of exploring alternative configurations of a document (as stated in R1) is trivially solved by creating alternative user context nodes over the same set of nodes, reflecting distinct views of the same document tuned to different applications or user classes. However, other classes are needed in order to address the other requirements posed in the introduction, adding new entities for versioning and cooperative work. Four new subclasses of context nodes (annotation, public hyperbase, private base and version context) are thus introduced.

In NCM, only terminal and user context nodes, marked with stars in Figure 1, are subject to versioning. Each attribute (including content) of a user context or terminal node may be specified as versionable or non-versionable. Versionable and non-versionable attributes thus help to meet R7. Some kind of notification mechanism will be needed to enhance version support. Also, in NCM, the user may specify if the addition of new attributes is permitted without creating a new version of the object (also helping to meet R7).

We introduce the notion of state of a terminal node and a user context node to control consistency across interrelated nodes, to support cooperative work and to allow automatic creation of versions. A terminal node or a user context node N can be in one of the following states: committed, uncommitted or obsolete. N is in the uncommitted state upon creation and remains in this state as long as it is being modified. When it becomes stable, N can be promoted to the committed state either explicitly at the user's request, or implicitly by certain operations the model offers (helping to meet R5). As an example of implicit change of state, an uncommitted user context or terminal node N becomes committed when a primitive for version creation is applied to it. A committed node cannot be directly updated or deleted, but the user can make it obsolete, allowing nodes that reference it or that are derived from it to be notified.

The concept of a node state is in fact only relevant for user context and terminal nodes that have versionable attributes. Therefore when we say, for example, that committed nodes cannot be modified, we mean that the versionable attributes cannot be modified.
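
The state rules above can be summarized in a small sketch. It is an illustration only: the class `VersionedNode` and its method names are hypothetical, and two points the text leaves open are filled in as assumptions, namely that only uncommitted nodes accept changes to versionable attributes and that only committed nodes can be made obsolete.

```python
from enum import Enum


class State(Enum):
    UNCOMMITTED = "uncommitted"
    COMMITTED = "committed"
    OBSOLETE = "obsolete"


class VersionedNode:
    """Hypothetical terminal or user context node with versionable attributes."""

    def __init__(self, name, versionable=("content",)):
        self.name = name
        self.state = State.UNCOMMITTED      # state upon creation
        self.versionable = set(versionable)
        self.attributes = {}

    def set_attribute(self, key, value):
        # Assumption: versionable attributes may change only while uncommitted.
        if key in self.versionable and self.state is not State.UNCOMMITTED:
            raise PermissionError(f"{key!r} is versionable and {self.name} is {self.state.value}")
        self.attributes[key] = value

    def commit(self):
        """Explicit promotion to the committed state."""
        if self.state is not State.UNCOMMITTED:
            raise ValueError("only an uncommitted node can be committed")
        self.state = State.COMMITTED

    def make_obsolete(self):
        # Assumption: only committed nodes are made obsolete.
        if self.state is not State.COMMITTED:
            raise ValueError("only a committed node can be made obsolete")
        self.state = State.OBSOLETE

    def derive_version(self):
        """Version creation implicitly commits an uncommitted source node."""
        if self.state is State.UNCOMMITTED:
            self.commit()
        new = VersionedNode(self.name + "'", self.versionable)
        new.attributes = dict(self.attributes)
        return new
```

For instance, after `n = VersionedNode("N")` and `n2 = n.derive_version()`, the node `n` is committed and `n2` is a new uncommitted node; a later `n.set_attribute("content", ...)` is rejected, while a non-versionable attribute of `n` can still be set.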

Finally, we have not included link versioning in our model, since we believe that this facility adds more complexity than functionality to a system and that, if necessary, it can be modeled through user context node versioning. It remains to be studied whether this facility becomes important when actions and conditions are associated with links.

2.2 - Version Contexts

A version context V groups together a set of user context or terminal nodes that represent versions of the same object, at some level of abstraction, without necessarily implying that one version was derived from the other. The nodes in V are called correlated versions, and they need not belong to the same node class (helping to meet R13). The derivation relationship is explicitly captured by the links in V. We say that v2 was derived from v1 if there is a link from v1 to v2 in V. A version context induces a (possibly) unconnected graph structure over all versions. There is no restriction on links (helping to meet R12), except that the "derives from" relation must be acyclic. It should be noted that a version context node can contain user context nodes, since these nodes can be versioned. This provides us with an explicit versioning of the document structure, thus meeting R9.
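
As an illustration of the acyclicity constraint, the following sketch keeps the derivation links of a version context as a directed graph and rejects any link that would close a cycle. The class `VersionContext` and its methods are hypothetical names chosen for this example; versions are represented by plain strings.

```python
class VersionContext:
    """Hypothetical version context: correlated versions plus derivation links."""

    def __init__(self):
        self.versions = set()
        self.derived = {}  # v1 -> set of versions derived from v1

    def add_version(self, v):
        # Versions may be added without any derivation link (unconnected graph).
        self.versions.add(v)

    def add_derivation(self, v1, v2):
        """Record a link v1 -> v2, meaning 'v2 was derived from v1'."""
        if v1 not in self.versions or v2 not in self.versions:
            raise KeyError("both versions must belong to the version context")
        if v1 == v2 or self._reaches(v2, v1):
            raise ValueError("the 'derives from' relation must be acyclic")
        self.derived.setdefault(v1, set()).add(v2)

    def _reaches(self, start, target):
        """Depth-first search along derivation links."""
        stack, seen = [start], set()
        while stack:
            v = stack.pop()
            if v == target:
                return True
            if v not in seen:
                seen.add(v)
                stack.extend(self.derived.get(v, ()))
        return False


vc = VersionContext()
for v in ("v1", "v2", "v3", "v4"):
    vc.add_version(v)
vc.add_derivation("v1", "v2")
vc.add_derivation("v2", "v3")
# "v4" stays unconnected; adding v3 -> v1 would raise ValueError.
```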

A user may either manually add nodes (to explicitly indicate that they are versions of the same object) and links (to explicitly indicate how the versions were derived) to a version context, or he may create a new node from another by invoking a versioning operation, which will then automatically update the appropriate version context.

An application has several options to define the node it considers to be its current version in a version context V, according to a specific criterion. One of them is to reserve an anchor of V to maintain the reference to the current version. Other anchors may specify other versions following other criteria of choice. The reference may be made through a query. The query does not need to be part of the anchor of the version context, since it may be defined in a link and even in a more general way (helping to meet R5 and R4), as we will see when we discuss private bases. It should be noted that the query which defines the current version may return several versions (for example, "all versions created by John"), which can be interpreted as alternatives and presented as a user context (a version context view). Therefore, version contexts meet R4 since they provide an automatic reference update facility.

An application may use version contexts to maintain the history of a document d (R3), as well as for automatic reference update (R4), for example, as follows. Suppose that d has a component c whose versions the application is interested in. Let C be the version context containing the nodes C1,...,Cn that represent the versions of c. The application will refer to C, and not directly to any of the Ci's, in the user context node D it uses to model d. All links in D touching C will point to the same anchor of C, which will always point to the node Ci the application considers to be the current version. If the application wants to recover previous versions of d with respect to c, it simply navigates inside C. Indeed, since version contexts are just a special class of context nodes, users may, in principle, navigate through the document history using the basic navigation mechanisms of NCM [Casa91]. Alternatives of the sub-part c of the document can be accessed, for example, by a query, which may return a set of alternatives, meeting R12.
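
The scheme just described, in which links point to an anchor of the version context rather than to a specific version, can be sketched as follows. This is a simplified illustration: versions are plain dictionaries, anchors are modeled as queries over the list of versions, and the names `VersionContext`, `define_anchor`, `resolve`, `latest` and `by_author`, as well as the `author` and `created` attributes, are assumptions of this example.

```python
class VersionContext:
    """Hypothetical version context whose anchors are defined by queries."""

    def __init__(self):
        self.versions = []
        self.anchors = {}  # anchor id -> query over the list of versions

    def add_version(self, version):
        self.versions.append(version)

    def define_anchor(self, anchor_id, query):
        self.anchors[anchor_id] = query

    def resolve(self, anchor_id):
        """Evaluate the anchor's query; it may return several versions."""
        return self.anchors[anchor_id](self.versions)


def latest(versions):
    """Query: the most recently created version."""
    return [max(versions, key=lambda v: v["created"])]


def by_author(author):
    """Query factory: all versions created by a given author."""
    return lambda versions: [v for v in versions if v["author"] == author]


C = VersionContext()
C.add_version({"name": "C1", "author": "John", "created": 1})
C.add_version({"name": "C2", "author": "Mary", "created": 2})
C.define_anchor("current", latest)
C.define_anchor("john", by_author("John"))

# A link in D touches the anchor "current" of C, not a specific Ci.
link_in_D = (C, "current")
before = [v["name"] for v in link_in_D[0].resolve(link_in_D[1])]

# Adding a version updates what the unchanged link reaches.
C.add_version({"name": "C3", "author": "John", "created": 3})
after = [v["name"] for v in link_in_D[0].resolve(link_in_D[1])]
```

The link itself never changes; only the result of resolving the anchor does, which is the sense in which references are updated automatically. The second anchor shows a query that returns a set of alternatives.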

2.3 - Public Hyperbase and Private Bases

We define the public hyperbase, denoted by HB, as a special type of context node that groups together sets of terminal nodes and user context nodes. All nodes in HB must be committed or obsolete and, as in all hyperbases, if a composite node C is in HB, then all nodes in C must also belong to HB. The public hyperbase contains information which is public and stable.

We also define a private base as a special type of context node that groups together any entity, except the public hyperbase and version context nodes, such that:

Intuitively, a private base collects all entities used during a work session by a user, according to the paradigm (work session) proposed by the Dexter Model. Note that one specific (version of a) terminal or user context node can belong to one and only one base (public or private).

In order to avoid the cognitive overhead imposed by defining the selection criteria for retrieval of a current version, every version context node has two special anchors, defined by default queries. One of these default queries is specified in the version context itself. The other one is specified in a more general way (helping to meet R5 and R4), in an attribute of the private base. When a link is created, the destination node is examined. If the node is a component of a version context not explicitly specified by the user (either directly or through a query), the link will be created using the selection criterion defined in the private base PB which contains the context node in which the link is contained. If this query is not specified in this private base, the link will use the default query defined in the version context.
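
The order in which the selection criterion is chosen at link creation can be written as a simple fallback. The function name and its parameters are hypothetical; queries are represented here by arbitrary values, with `None` standing for "not specified".

```python
def selection_query(explicit_query, private_base_query, version_context_default):
    """Pick the query a new link uses to select a version.

    1. a criterion given explicitly by the user, if any;
    2. otherwise the query defined in the private base containing the link's context;
    3. otherwise the default query of the version context itself.
    """
    if explicit_query is not None:
        return explicit_query
    if private_base_query is not None:
        return private_base_query
    return version_context_default


print(selection_query(None, "latest committed by me", "latest committed"))
print(selection_query(None, None, "latest committed"))
```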

A user may move a user context node or a terminal node from a private base into the public hyperbase through the use of the "check-out" primitive, as long as the node is committed. If a committed user context node C is moved into HB, then all terminal and user context nodes in C must also be moved into HB.
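
The conditions on this move can be sketched as follows. This is an illustration under assumptions: bases are modeled as Python sets of node objects, states as strings, and `Node` and `move_to_public_hyperbase` are hypothetical names. The sketch also assumes that components already in the public hyperbase simply stay there, and that the whole move is refused if any node to be moved is not committed.

```python
class Node:
    """Hypothetical terminal or user context node."""

    def __init__(self, name, state="uncommitted", components=()):
        self.name = name
        self.state = state
        self.components = list(components)


def closure(node):
    """The node together with everything it contains, recursively."""
    seen, stack = {}, [node]
    while stack:
        n = stack.pop()
        if id(n) not in seen:
            seen[id(n)] = n
            stack.extend(n.components)
    return list(seen.values())


def move_to_public_hyperbase(node, private_base, public_hyperbase):
    """Move `node` and all nodes it contains from the private base into HB."""
    nodes = closure(node)
    to_move = [n for n in nodes if n in private_base]
    for n in nodes:
        if n not in private_base and n not in public_hyperbase:
            raise ValueError(f"{n.name} is in neither base")
    # Check everything first so that a refused move changes nothing.
    if any(n.state != "committed" for n in to_move):
        raise ValueError("only committed nodes may enter the public hyperbase")
    for n in to_move:
        private_base.remove(n)
        public_hyperbase.add(n)


m = Node("M", "committed")
c = Node("C", "committed", [m])
pb, hb = {c, m}, set()
move_to_public_hyperbase(c, pb, hb)
```

After the call, both C and its component M are in the public hyperbase and the private base is empty.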

A user cannot move a user context or terminal node N from the public hyperbase to a private base, but he may create a new node N' as a version of N in the private base. In NCM, work on a document implies the creation of new versions of all visited user context or terminal nodes in the current private base. These new versions may be derived from committed nodes or correspond to the creation of completely new information (the first node in a version context node). These versions correspond to instantiations in the Dexter Model. Two primitives, "open" and "check-in", are available for the creation of a new uncommitted version of a user context or terminal node N in a private base PB. They differ when N is a user context node. In this case, "open" creates an uncommitted version N' of N in PB, as well as of each of the components in N, and so on recursively. N' will contain the new versions of the components in N, and its links will be created so as to appropriately reflect links in N. If a committed component belongs to more than one context, only one uncommitted version will be created for this node. On the other hand, "check-in" creates an uncommitted version N' of N, in PB, that contains the original nodes contained in N.

Interesting consequences arise from the different behavior of the "open" and "check-in" primitives. Let N' contain nodes C1 and C2, which in turn contain the same node M. If N' is created through the "check-in" operation, and node M is modified through the two perspectives, C1 and C2, two different versions, M' and M'', will be created. On the other hand, if N' is created through the "open" operation, a single new uncommitted version M' will be created and will be modified through both perspectives.
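
The structural difference between the two primitives can be sketched as follows. The sketch is a simplification that ignores links, private bases and version contexts; `Node`, `open_node` and `check_in` are hypothetical names for this example.

```python
class Node:
    """Hypothetical node; `components` is empty for terminal nodes."""

    def __init__(self, name, components=(), committed=True):
        self.name = name
        self.components = list(components)
        self.committed = committed


def open_node(n, memo=None):
    """'open': uncommitted version of n and, recursively, of every component.
    A component shared by several contexts gets a single new version."""
    memo = {} if memo is None else memo
    if id(n) not in memo:
        children = [open_node(c, memo) for c in n.components]
        memo[id(n)] = Node(n.name + "'", children, committed=False)
    return memo[id(n)]


def check_in(n):
    """'check-in': uncommitted version of n that contains the original components."""
    return Node(n.name + "'", n.components, committed=False)


m = Node("M")
c1 = Node("C1", [m])
c2 = Node("C2", [m])
n = Node("N", [c1, c2])

opened = open_node(n)
checked = check_in(n)
# Under "open", both new contexts share one new uncommitted version of M.
shared = opened.components[0].components[0] is opened.components[1].components[0]
# Under "check-in", N' still contains the committed originals C1 and C2.
originals = checked.components[0] is c1 and checked.components[1] is c2
```

With "open", a change to M' is seen through both perspectives. With "check-in", M itself is still the committed original, so each perspective through which it is modified leads to its own new version.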

2.4 - Version Propagation

In any system with composite nodes, one may ask what happens to a node when a new version of one of its components is created. A system is said to offer automatic version propagation when new versions of the composite nodes that contain a node N are automatically created each time a new version of N is created.

In NCM, a node may be contained in many different user context (composite) nodes. Thus, version propagation may cause the creation of a large number of often undesirable nodes. As a solution to this problem, we propose to let the user decide whether he wants automatic version propagation or not, and to limit automatic propagation to those user context nodes that belong to the perspective through which the new version was created. We also limit propagation to those user context nodes that are committed, in line with the restriction that an uncommitted node cannot be used to derive versions. This amounts to providing a mechanism that supports sets of coordinated changes, thus meeting R5, R6 and R10.
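
A sketch of propagation restricted to the current perspective is given below. It is one possible reading, not the model's definition: `Node` and `propagate` are hypothetical names, and the sketch assumes that an uncommitted container is updated in place (so that nothing above it needs a new version) while each committed container along the perspective receives a new uncommitted version.

```python
class Node:
    """Hypothetical node; `components` is empty for terminal nodes."""

    def __init__(self, name, components=(), committed=True):
        self.name = name
        self.components = list(components)
        self.committed = committed


def propagate(perspective, new_version):
    """Propagate `new_version` of perspective[0] along the perspective only.

    Returns the list of container versions created automatically.
    """
    old, new = perspective[0], new_version
    created = []
    for container in perspective[1:]:
        swapped = [new if c is old else c for c in container.components]
        if not container.committed:
            # Assumption: an uncommitted container is modified in place
            # and propagation stops here.
            container.components = swapped
            break
        derived = Node(container.name + "'", swapped, committed=False)
        created.append(derived)
        old, new = container, derived
    return created


m = Node("M")
c1 = Node("C1", [m])
c2 = Node("C2", [m])
n = Node("N", [c1, c2], committed=False)

m_new = Node("M'", committed=False)
created = propagate((m, c1, n), m_new)
```

Here only C1, which lies on the perspective (M, C1, N), gets a new version; C2 also contains M but is left untouched, which is how the perspective limits the number of versions created.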

3 - Conclusions

The Nested Context Model with versioning is the conceptual basis for the hypermedia project under development at the Computer Science Department of the Catholic University of Rio de Janeiro and the Rio Scientific Center of IBM Brazil. A single-user prototype system incorporating the basic Nested Context Model has been completed. Currently, some applications run on this prototype. A second prototype, conforming to the MHEG proposal and including versioning, is nearly completed.