
Proceedings of the ECSCW'95 Workshop on the Role of Version Control in CSCW Applications

David Hicks, Anja Haake, David Durand, and Fabio Vitali


Abstract

The workshop entitled "The Role of Version Control in Computer Supported Cooperative Work Applications" was held on September 10, 1995 in Stockholm, Sweden in conjunction with the ECSCW'95 conference. Version control, the ability to manage relationships between successive instances of artifacts, organize those instances into meaningful structures, and support navigation and other operations on those structures, is an important problem in CSCW applications. It has long been recognized as a critical issue for inherently cooperative tasks such as software engineering, technical documentation, and authoring. The primary challenge for versioning in these areas is to support opportunistic, open-ended design processes requiring the preservation of historical perspectives in the design process, the reuse of previous designs, and the exploitation of alternative designs.

The primary goal of this workshop was to bring together a diverse group of individuals interested in examining the role of versioning in Computer Supported Cooperative Work. Participation was encouraged from members of the research community currently investigating the versioning process in CSCW as well as application designers and developers who are familiar with the real-world requirements for versioning in CSCW. Both groups were represented at the workshop, resulting in an exchange of ideas and information that helped to familiarize developers with the most recent research results in the area, and to provide researchers with an updated view of the needs and challenges faced by application developers. In preparing for this workshop, the organizers were able to build upon the results of their previous one entitled "The Workshop on Versioning in Hypertext" held in conjunction with the ECHT'94 conference.

The following section of this report contains a summary in which the workshop organizers report the major results of the workshop. The summary is followed by a section that contains the position papers that were accepted to the workshop. The position papers provide more detailed information describing recent research efforts of the workshop participants as well as current challenges that are being encountered in the development of CSCW applications. A list of workshop participants is provided at the end of the report.

The organizers would like to thank all of the participants for their contributions which were, of course, vital to the success of the workshop. We would also like to thank the ECSCW'95 conference organizers for providing a forum in which this workshop was possible.

David Hicks GMD-IPSI
E-Mail: hicks@darmstadt.gmd.de
Anja Haake GMD-IPSI
E-Mail: ahaake@darmstadt.gmd.de
David Durand Boston University
E-Mail: dgd@cs.bu.edu
Fabio Vitali University of Bologna
E-Mail: fabio@cirfid.unibo.it

Table of Contents

Workshop Summary

Position Papers

Using Database Versions to Support Awareness in Group Interactions
Marcos Borges and Genevieve Jomier

Conceiving Collaborative Version Control for agent-based conceived Hypertext
Antonina Dattolo and Vincenzo Loia

The Role of Version Control in CSCW Applications: A Position Statement
Prasun Dewan and Jon Munson

Requirements for a CSCW System for Software Development Organizations
John Gintell

On Merging Hypertext Networks
Anja Haake, Jörg Haake, and David Hicks

Accessibility of Versions as Means of Handling Large Interdependent Object Spaces in Corporate Planning Environments
Heiko Ludwig

Fine-Grained Version Control in COOP/Orm
Boris Magnusson

Version Control in Microcosm
Mylene Melly and Wendy Hall

A Multiversion Database Model for CSCW Applications
Waldemar Wieczerzycki

List of Participants

Workshop Summary

David Hicks & Anja Haake
GMD - German National Research Center for Information Technology
IPSI - Integrated Publication and Information Systems Institute
Dolivostr. 15, D - 64293 Darmstadt, F.R. Germany
e-mail: {hicks, ahaake}@darmstadt.gmd.de

This section of the report provides an overview of the activities and results of the workshop. First the schedule and format of the workshop are briefly described. This is followed by a list of issues for version control in CSCW applications that was generated at the workshop. A more detailed discussion of the issues follows in a section in which the organizers have summarized the discussions that took place during the workshop.

1 Workshop Format

The workshop began with introductions. Each participant was asked to briefly introduce themselves and to provide a statement of their goals and expectations for the workshop including any specific versioning related issues they would like to see explored. The following session of the workshop was organized around participant position papers. Each participant was asked to spend 10-15 minutes elaborating on their position statement. Each position description was followed by a 15-20 minute discussion section in which the other participants were encouraged to ask questions and make comments. Throughout the discussions during the introduction and position presentations, a list was maintained of the issues that arose.

The final session of the workshop consisted of a discussion in which the participants examined and expanded upon the list of issues that had been generated throughout the day. An attempt was made to identify the requirements introduced by each of the issues and to list current systems that address the issue in some form.

2 Version Control Issues For CSCW Applications

The list of issues generated at the workshop included the following items (presented here in the order originally introduced):

- granularity of versions
- variants of versions
- overhead
- single or multiple system architecture
- merging
- group awareness
- access control
- work process specification
- version selection
- version propagation
- interoperability with existing systems
- integration with existing tools
- transition between different modes of collaborative work
- transaction models

During the discussions at the workshop it became evident that many of these issues are quite broad and actually represent a number of subissues. In many cases the subissues represented by one item are closely related to those of another, preventing them from being considered in isolation. Therefore, the mechanisms and solutions proposed to address one issue or category of issues must be considered carefully with respect to their effect on other issues.

3 Summary of Workshop Discussions

The following material summarizes the discussion of the issues that occurred at the workshop. The discussion of each issue begins with an examination of the requirements that it introduces, along with its relationship to other issues or categories of issues. Additionally, those systems and research efforts represented at the workshop that address the issue are identified. It is important to note that the lists of systems addressing each issue are not intended to be exhaustive. Instead they reflect the discussions that occurred at the workshop, in which the details of those systems and research efforts represented by participants were examined.

To facilitate the discussion of the issues contained in this report, they have been grouped into four categories: usability, technology enabling, extensions required, and environmental factors. This grouping was developed after the workshop during the course of preparing this report as the organizers analyzed workshop notes and discussions. The categories are not mutually exclusive in that some issues may represent functionality or requirements belonging to more than one category. The material in this section is intended to provide an overview of discussions at the workshop. The individual position papers contained in the following section should be consulted for details regarding specific systems or positions.

3.1 Usability

This category contains issues that affect the way in which the version related services of a CSCW environment are used. The term "user" in this case can correspond to an end user utilizing the versioning facilities of a particular application or a developer who is incorporating the versioning functionality of the surrounding environment into an application. The overall goal for addressing issues in this category is to develop solutions that are powerful enough to meet user needs, but that are not too difficult to use or confining for the user.

Overhead - The overhead issue concerns the amount of effort associated with using the version related part of a system. There are various kinds of overhead that must be considered, including user overhead and system overhead. The goal, of course, should be to minimize all types of overhead. However, there are usually tradeoffs involved in that lowering one type of overhead can increase another. For example, lowering system overhead can have the side effect of increased user overhead. Machines will inevitably continue to get faster, whereas a poorly designed version control facility requiring too much user overhead will always be difficult and cumbersome to use. When conflicts occur and tradeoffs are necessary, priority should therefore be given to minimizing user overhead. However, system overhead should not be allowed to grow to the point where longer processing times raise user waiting time to unacceptable levels.

Mechanisms suggested to address the overhead issue include a combination of front-end versioning, as done in the Suite system (cf. position paper Dewan/Munson) and back-end versioning as performed in the COSMOS system (cf. position paper Wieczerzycki).

Granularity - This issue involves the ability to perform version operations and interact with the version control part of a system at different levels of detail or granularity. Version creation is an important aspect of the granularity issue. The level of granularity required for version creation varies across application areas, and some applications require the capability to create versions at more than one level of granularity. For example, versioning an entire document might be appropriate when it has been changed extensively while versioning only a section, paragraph, sentence, or some other specific part is useful when only a limited number of changes have been made. The ability to create links between versions created at different granularities is also important.

The ability to interact with the system at various levels of detail is another important aspect of the granularity issue. For example, in some applications it might be useful to allow the granularity level of the version history inspection operation to be adjustable ranging from a very coarse level where only the major elements of an evolving artifact being inspected are shown to a very fine level where the version histories of all individual subcomponents of the artifact are visible.

Microcosm (cf. position paper Melly/Hall) and COSMOS (cf. position paper Wieczerzycki) are two examples of systems that address the version creation granularity issue. They support multiple levels of version creation. Additionally, both the COOP/Orm (cf. position paper Magnusson) and Microcosm systems provide capabilities for interaction at varying levels of detail.

Version Selection - Version selection involves the identification of specific versions of objects within an application. Ease of use for the end user is an important requirement for strategies used to address this issue. The ability to select consistent versions of interdependent or related objects is also an important goal. Finally, a mechanism to address this issue should provide flexibility and expressiveness in the specification of versions.

The conflicting goals for mechanisms employed to resolve this issue, such as easy to use but expressive and flexible, must be carefully considered and resolved, likely influenced by or based on the specific needs of the target application area. Since different version selection strategies will require varying degrees of system support, the version selection issue is closely related to the overhead issue. The version selection issue is also affected by version granularity since more effort may be required for the selection process when versioning is performed at multiple levels of granularity.

Both the Microcosm (cf. position paper Melly/Hall) and COSMOS (cf. position paper Wieczerzycki) systems provide facilities to help cope with the version selection issue, especially for maintaining the consistency of interdependent objects.

Variants - To enable support for a variety of object evolution patterns, the ability to maintain alternative versions or variants of objects is required. This capability has been found to be important in areas involving inherently cooperative development activities, such as software engineering. Additionally, this basic capability is related to several other version control issues in CSCW. Specifically, it can be useful in developing mechanisms to support: work process specification, the smooth transition between various modes of cooperation, and transaction models.

Propagation - The development process of a large document or other complex evolving artifact is often mapped onto several physical objects. The need often arises to establish dependencies between related elements of such an evolving object. This is particularly true when version control services are introduced, since the creation and availability of a new version of an object can have an effect on related objects. System support is needed to allow dependency relationships to be modeled, and for the appropriate actions to occur as new versions are created.

The basic functionality required is the ability to provide automatic notification within a system as new versions are created. This will allow a system to respond with the appropriate actions as new versions of related objects become available.

Important decisions must be made related to this issue. The scope of the effect of operations must be specified. For example, when a user "checks out" a version, what should the effect be on closely related objects? Should they also be considered implicitly checked out? When the object is "checked in" and a new version of it is created, should new versions of its closely related objects also be automatically created? Decisions are also required regarding the visibility of propagation operations. For example, as propagation operations occur (such as version creation), should their results automatically be forwarded to and become visible in all user sessions, or only to selected views of a data space?

The mechanism provided to define these relationships should be flexible, allowing dependencies to be specified either in a version generic way between the objects themselves, or between specific versions of objects. It should also enable the problems associated with (uncontrolled) version propagation to be avoided.

Many databases support the specification of version generic propagation relationships. The SEPIA system (cf. position paper Haake/Haake/Hicks) allows different propagation policies to be established for different user views of data managed by a central database. Additionally, the Suite (cf. position paper Dewan/Munson) and PlanKo (cf. position paper Ludwig) systems allow the specification of propagation relationships between specific versions of objects.
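The notification requirement described above can be illustrated with a minimal sketch. It is not taken from SEPIA, Suite, PlanKo, or any other system named in this report; the class VersionStore and all of its method names are hypothetical. The sketch assumes that a dependency is either version generic (registered against an object) or version specific (registered against one version of an object), and that creating a version only records notifications rather than triggering further version creation, which is one simple way to avoid uncontrolled propagation.

```python
from collections import defaultdict


class VersionStore:
    """Toy store (hypothetical) that notifies dependents of new versions."""

    def __init__(self):
        self.versions = defaultdict(list)      # object id -> list of contents
        self.generic_deps = defaultdict(set)   # object id -> dependent ids
        self.specific_deps = defaultdict(set)  # (object id, version no) -> dependent ids
        self.notifications = []                # (dependent, source, new version no)

    def depend_on_object(self, dependent, source):
        # Version generic: notify on every new version of `source`.
        self.generic_deps[source].add(dependent)

    def depend_on_version(self, dependent, source, version_no):
        # Version specific: notify only when a version is derived
        # directly from `version_no` of `source`.
        self.specific_deps[(source, version_no)].add(dependent)

    def create_version(self, obj, content, derived_from=None):
        self.versions[obj].append(content)
        new_no = len(self.versions[obj])  # versions are numbered from 1
        targets = set(self.generic_deps[obj])
        if derived_from is not None:
            targets |= self.specific_deps[(obj, derived_from)]
        # Record notifications only; no dependent version is created here.
        for dependent in sorted(targets):
            self.notifications.append((dependent, obj, new_no))
        return new_no


store = VersionStore()
v1 = store.create_version("chapter1", "draft")
store.depend_on_object("toc", "chapter1")
store.depend_on_version("review", "chapter1", v1)
v2 = store.create_version("chapter1", "revised", derived_from=v1)
v3 = store.create_version("chapter1", "final", derived_from=v2)
```

In this usage the object "toc" is notified of both new versions, while "review" is notified only of the version derived from version 1. Whether a notification should then lead to an implicit check-out or to a new version of the dependent object is exactly the policy decision discussed above, and is deliberately left open here.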

3.2 Technology Enabling

Version control techniques can serve as a basis from which new CSCW related technologies can be developed and existing ones can be improved. The issues contained in this category relate to areas in which version control capabilities can be exploited to support and enable CSCW technologies.

Transition between different modes of collaborative work - Since some types of collaboration can require a number of different modes of activities, support is needed to switch from one mode to another. For example, in a collaborative writing project, some authors may prefer to work together in a tightly coupled synchronous mode during an initial idea generation and outlining phase, work separately and asynchronously on writing the individual components of the documents, and then work together again synchronously to combine and reconcile the individual sections. In other cases, such as during a collaborative meeting, the need to change from one type of collaboration to another may occur more frequently requiring a significant number of mode switches.

To meet this demand, systems designed to facilitate these types of collaboration must support multiple modes of operation. Versioning functionality, such as the ability to support alternative versions, increases the number of collaborative modes possible in a CSCW application. For example, a user starting an editing session involving an object that is currently being edited by others might be allowed to either join the collaborative session or, through the use of a parallel version, initiate a private editing session on the object. Increasing the number of collaborative modes in a CSCW application increases the importance of support for transitions between them. Specifically, the transition between various modes should be as smooth and simple as possible. Additionally, the user should not have to change tools or encounter a drastically different user interface when making a transition from one mode to another.

The transition between various modes of collaboration is supported by the hypothetical merge facility of the COOP/Orm system (cf. position paper Magnusson). The check-out and merge operations of the Suite system (cf. position paper Dewan/Munson) can also be used to support the transition between asynchronous and synchronous modes. Transaction models, like the join/connect-to of the COSMOS system (cf. position paper Wieczerzycki) also provide this type of functionality.

Group Awareness - This issue is central to the ability of an application to support cooperative work. Mechanisms are required that allow members of a group to be aware of each other's presence and actions. The introduction of versioning into a CSCW application can support new types of group awareness strategies. For example, temporary or alternative versions can be used directly to provide a "delayed awareness" capability in which development can take place initially in private and then be made accessible to the other members of a group.

Interestingly, the introduction of version control into a CSCW environment introduces additional group awareness issues. For example, mechanisms are required to inform users of: the presence of other users who are currently investigating a version graph, the creation and availability of new versions, and changes to the content of current versions.

The mechanisms to support group awareness should be flexible so that varying degrees of awareness are possible. Additionally, it should be possible to select the type of display and update mechanisms to be used for presenting awareness information. For example, it should be possible to specify that group awareness information is updated and displayed automatically as related events occur, or that it is only updated on demand when requested by a user.

COOP/Orm (cf. position paper Magnusson) is an example of a system that provides group awareness facilities. Also the position papers by Borges/Jomier and by Ludwig address the issue of group awareness.

Work Process - When a work flow model or some similar strategy is being used to describe and/or control the work process during collaborative activities, integration with the versioning services is an important consideration: the ability to freeze versions of data objects enables the capture and preservation of the starting, ending, and important intermediate states of developing data objects. Consequently, it should be possible for work process specification tools as well as other ones within the overall CSCW environment to access and use versions as operands.

In general, tools supporting work process specification should accommodate deviations from standard organizational practices when necessary. Versioning technology can be used to enhance their flexibility. For example, parallel or alternative versions allow multiple work processes involving the same data items to progress concurrently.

The activity model of the COSMOS system (cf. position paper Wieczerzycki) provides support for work process specification.

Transaction Models - Transaction models are important in the support of collaborative work. In a comprehensive CSCW environment, transaction models can benefit from the presence of version control services. Versioning related functionality can be used to extend the capabilities of transaction models in ways that make them more appropriate for cooperative activities. For example, less restrictive transaction models can be implemented that allow increased concurrency by creating new versions when "locked" or "checked out" objects are accessed rather than denying access to such objects. Version control functionality could also be used to enhance support for long term transactions. For example, when a long term transaction must be aborted, progress made since the start of the transaction can be saved by creating a local version rather than lost due to the effect of a rollback procedure.

The ability for users to dynamically join with or separate from long term transactions is important in a collaborative environment. It supports the important capability mentioned earlier to allow smooth transitions between asynchronous and synchronous work.
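The two uses of versioning in transaction models mentioned above, deriving a new version instead of denying access to a checked out object and saving the progress of an aborted long term transaction as a local version, can be sketched as follows. This is an illustrative sketch under simplifying assumptions, not the transaction model of COSMOS or of any other system discussed at the workshop; the class VersionedObject and its method names are hypothetical.

```python
class VersionedObject:
    """Toy object (hypothetical) whose check-out branches instead of blocking."""

    def __init__(self, content):
        self.versions = {1: content}  # version no -> content
        self.parent = {1: None}       # version no -> version it was derived from
        self.checked_out = {}         # version no -> user holding it
        self.local = set()            # versions saved by aborted transactions
        self._next = 2

    def check_out(self, user, version_no):
        """Return the number of the version the user may edit."""
        if version_no not in self.checked_out:
            self.checked_out[version_no] = user
            return version_no
        # Already held by someone else: derive a parallel version
        # for this user rather than denying access.
        branch = self._next
        self._next += 1
        self.versions[branch] = self.versions[version_no]
        self.parent[branch] = version_no
        self.checked_out[branch] = user
        return branch

    def check_in(self, user, version_no, content):
        if self.checked_out.get(version_no) != user:
            raise PermissionError(f"{user} does not hold version {version_no}")
        del self.checked_out[version_no]
        new_no = self._next
        self._next += 1
        self.versions[new_no] = content
        self.parent[new_no] = version_no
        return new_no

    def abort(self, user, version_no, partial_content):
        # Instead of rolling back, keep the work done so far as a local version.
        saved = self.check_in(user, version_no, partial_content)
        self.local.add(saved)
        return saved


doc = VersionedObject("original text")
a = doc.check_out("alice", 1)
b = doc.check_out("bob", 1)  # version 1 is held by alice, so bob gets a branch
a_new = doc.check_in("alice", a, "alice's edit")
b_saved = doc.abort("bob", b, "bob's partial edit")
```

The parallel versions produced this way must eventually be reconciled, which is why this use of versioning depends on the merge support discussed in Section 3.3.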

3.3 Extensions Required

This category contains issues related to areas in which versioning concepts require additional functionality in order to be useful in a CSCW environment. Effective mechanisms to address these issues will increase the usability of version support mechanisms within a CSCW environment.

Access Control - Access control to versions is an important consideration in a cooperative work environment. Users must have ways to protect and preserve the results of their work, and to specify to what degree and in what ways their work should be shared and made accessible to others. To provide a mechanism to preserve objects, a freeze operation is required. It should signal that a specified version of an artifact is to be permanently preserved.

Mechanisms are also required that allow read and other version related privileges, such as version derivation permissions, to be specified. These mechanisms should be flexible and support capabilities such as the ability to specify access control information according to group organizations or by dependencies specified among the versions themselves. They should also allow the specification of access control capabilities in asymmetric as well as symmetric modes.

Access control has been addressed to varying degrees in many systems. The PlanKo system (cf. position paper Ludwig) supports both group organizational and version dependent specification of access control capabilities. The PlanKo and Suite (cf. position paper Dewan/Munson) systems provide asymmetric as well as symmetric specification of access control capabilities.
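As a rough illustration of the freeze operation and of separate read and derivation privileges, consider the following sketch. It does not reproduce the access control model of PlanKo or Suite; the class Version, the exception FrozenVersionError, and the per-version sets of readers and derivers are assumptions made purely for illustration, and group based or asymmetric specification is not modeled.

```python
class FrozenVersionError(Exception):
    """Raised when someone tries to modify a frozen version."""


class Version:
    """Toy version (hypothetical) with a freeze flag and per-user privileges."""

    def __init__(self, content, owner):
        self.content = content
        self.owner = owner
        self.frozen = False
        self.readers = {owner}   # users allowed to read this version
        self.derivers = {owner}  # users allowed to derive a new version

    def freeze(self):
        # From now on the content of this version is preserved unchanged.
        self.frozen = True

    def update(self, user, content):
        if self.frozen:
            raise FrozenVersionError("frozen versions cannot be changed")
        if user != self.owner:
            raise PermissionError(f"{user} may not modify this version")
        self.content = content

    def read(self, user):
        if user not in self.readers:
            raise PermissionError(f"{user} may not read this version")
        return self.content

    def derive(self, user):
        if user not in self.derivers:
            raise PermissionError(f"{user} may not derive from this version")
        # The derived version belongs to the deriving user and is not frozen.
        return Version(self.content, owner=user)


report = Version("budget v1", owner="alice")
report.readers.add("bob")
report.derivers.add("bob")
report.freeze()
copy = report.derive("bob")
copy.update("bob", "budget v2")
```

Freezing protects the original while still allowing work to continue on a derived version, so preservation and sharing do not have to be traded off against each other.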

Merging - Support is needed in certain types of collaborative work activities to coalesce the work of individual users and groups to produce an integrated and coherent whole. The importance of this issue is increased by other requirements of a comprehensive CSCW environment including support for transaction models, variants, work process specification, and the ability to transition smoothly between asynchronous and synchronous work modes.

Merging is a complex issue that brings up many subissues. For example, decisions are required concerning which type or level of merging a system should support. Several granularities are possible ranging from the merging of two or more individual atomic objects to the merging of entire networks of information. A mechanism for specifying merge results must also be selected. Choices for this subissue include manual, interactive, and automatic specification of merge results. After the granularity and result specification mechanism for merging have been determined, an appropriate user interface must be designed.

The Suite system (cf. position paper Dewan/Munson) addresses many aspects of the merge issue. It supports the merging of individual objects and allows the selection from a range of merge result specification strategies. The merging strategies proposed for the VerSE versioning support environment (cf. position paper Haake/Haake/Hicks) also address the merge issue, supporting a range of different granularities of merging and merge result specification strategies. The COOP/Orm system (cf. position paper Magnusson) also provides flexible support for the specification of merge results.
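The choice between manual, interactive, and automatic specification of merge results can be made concrete with a small sketch of a three-way merge. This is a generic textbook technique, not the algorithm used by Suite, VerSE, or COOP/Orm. The sketch assumes, for illustration only, that a document is a mapping from section names to text, so that the granularity of merging is the section; the function three_way_merge and its resolve parameter are hypothetical names.

```python
def three_way_merge(base, ours, theirs, resolve=None):
    """Merge two variants of a document given their common ancestor.

    Each argument maps a section name to its text. Returns (merged, conflicts).
    If `resolve` is given, it is called as resolve(key, base, ours, theirs)
    for each conflicting section and must return the text to keep (automatic
    or interactive resolution). Otherwise the conflict is reported and the
    base text is kept, leaving the decision to the user (manual resolution).
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(key), ours.get(key), theirs.get(key)
        if o == t:        # both sides agree (including both deleted)
            value = o
        elif o == b:      # only "theirs" changed this section
            value = t
        elif t == b:      # only "ours" changed this section
            value = o
        elif resolve is not None:
            value = resolve(key, b, o, t)
        else:
            conflicts.append(key)
            value = b
        if value is not None:  # None means the section was deleted
            merged[key] = value
    return merged, conflicts


base = {"intro": "A", "body": "B", "end": "C"}
ours = {"intro": "A2", "body": "B-ours", "end": "C"}
theirs = {"intro": "A", "body": "B-theirs"}  # "end" was deleted

manual_result, manual_conflicts = three_way_merge(base, ours, theirs)
auto_result, auto_conflicts = three_way_merge(
    base, ours, theirs, resolve=lambda key, b, o, t: o  # always prefer "ours"
)
```

An interactive strategy would pass a resolve function that asks the user which text to keep. Merging entire networks of information, as opposed to flat sets of sections, raises further questions about links and structure that this sketch does not address.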

3.4 Environmental Factors

Issues placed into this category are those which arise from and relate to the circumstances of the surrounding CSCW environment and system architecture. They include issues that impact both the design and implementation strategies employed for version related services. Design related issues, such as integration with existing tools and single or multiple system architecture, influence the shape of versioning services by placing requirements on the functionality that must be provided. Implementation related considerations, such as interoperability, have more impact on the way a design is implemented within a particular environment.

Integration with existing systems - Multiple tools are likely to be present in a comprehensive CSCW environment. Many of these tools will be able to benefit from version related functionality. The version control services offered in the CSCW environment should be capable of being accessed and utilized by existing and new tools so that they can enhance their functionality to include version related services. As the amount of version control functionality required will vary from one CSCW tool to the next, a range of degrees of integration should be possible. Integrating basic versioning services should require minimal effort.

A wrapper approach is used by the Microcosm system (cf. position paper Melly/Hall) to allow integration with existing systems.

Single or multiple system architecture - This issue concerns the overall architectural approach employed in designing a CSCW environment. When the environment is composed of multiple systems, issues concerning their interaction must be considered. Well defined interfaces must be established. This issue is closely related to other issues such as interoperability and integration with existing tools.

Interoperability - A number of existing tools and, more recently, some database management systems, provide various types and degrees of versioning related functionality. The version services within a CSCW environment should be capable of interoperating with and building upon the services of such systems when appropriate. This will enable the services of existing systems to be exploited when possible as CSCW versioning facilities are created.

Microcosm (cf. position paper Melly/Hall) is an example of a system that interoperates with tools from its surrounding environment. It utilizes the facilities of the RCS and Exodus systems in providing version control functionality.

Position Papers

Using Database Versions to Support Awareness in Group Interactions

Marcos Borges*


Santa Clara University
USA
e-mail: mborges@otl.scu.edu

Genevieve Jomier
University of Paris - Dauphine
France
e-mail: jomier@lamsade.dauphine.fr

The problem

Awareness mechanisms are essential to group support systems in order to transform irregular interactions of group members into a consistent and perceptive performance over time. They keep members up to date with important events and thereby contribute to more conscious action on their part. This is especially true for asynchronous interactions.

When members of a group are working in a geographically distributed and asynchronous way, there is a need for a group memory that stores all group interactions, because no member is guaranteed to have complete knowledge of all interactions. The group memory is then the only means a user can count on to become acquainted with group activities.

Group support systems have been relying on database systems to manage the group memory. One of the most important features of the DBMS is the ability to handle multiversions. The version capability is essential to sustain the history of objects that are relevant to group activities [7]. However, most version mechanisms were designed without considering the requirements for groupware. Moreover, to be fully aware of a new version it is not enough to know of its existence; one must also know the reasons that justify it. In some applications the creation process is even more important than the resulting version itself.

In this paper we explain our point of view that normal multiversion mechanisms are insufficient to handle all forms of awareness. We claim that besides supporting versions the mechanism should also support the development process. Furthermore, different levels of awareness should coexist in the system. We introduce some concepts that we believe will help in designing awareness mechanisms integrated with multiversion capability that will handle these situations.

1 Introduction

Groupware is claimed to represent a paradigm shift for computer science, one in which human-human rather than human-machine communication is emphasized [1, 9]. In order to support this new paradigm one of the main requirements is to provide mechanisms for awareness. Awareness can be defined as an "understanding of the activities of others, which provides a context for your own activity" [5]. Without awareness a member cannot build his/her sense of a group and the human-human paradigm will remain mainly at the intentional level [11].

Following the definition above, being aware of other people's work has several facets in groupware. First of all, group members require selective awareness, meaning that not everything is of interest to every member. Second, there are different levels of interest. Some relate to the member's current work, others relate to past or future work, and some may be of secondary interest. Therefore, awareness should be adjustable to different degrees. Third, people have time-related levels of attention, meaning that awareness also varies according to how ready a member is to accept interference from other group members.

As an analogy, let us use the example of an answering machine. Most of the time people use the machine to record messages from the outside world when they are not available. However, some people use it to select whom they want to communicate with. A third use is to store messages when one is present but does not want to be interrupted. An awareness mechanism, however, must be much richer than the services provided by an answering machine (or e-mail). For example, a group member should be able to tune his/her perception based on the type of events or the origin of the information, and this tuning can change over time.

Also, for groupware activity it is not enough to know that there is a new version of an object and its origin. It is very important to know the motives behind this new version and why the old version has been discarded. In a decision process for software engineering, for example, it is very important to store the discarded alternatives in order to facilitate a possible return to this discussion in the future.

A very important use of versions in groupware has much to do with awareness. Unlike some database applications that store versions to preserve history, groupware can use multiversions to support awareness by storing the evolution of an object in order to explain the transformation process. Together with a representation of the user memory, the versions can also help in selectively retrieving the new additions that are of interest to the user.

By definition, awareness requirements vary from one member to another. The same is true for the member's perception of the group memory. Our approach then is to support groupware at both the group and member levels. At the group level there is the database working as the group memory. For each member we suggest mechanisms for awareness based on that member's perception of the group memory.

2 Our Approach to the Awareness Problem

To provide relevant awareness mechanisms a repository of the group memory should support the following services:

1. store all relevant versions of groupware objects;

2. store all relevant information about the evolution of a version, for example the reasons behind a proposal for an update;

3. store users' current perception or knowledge of the group memory;

4. store users' intentional level of awareness.

The combination of all this information will enable a system to provide the basic elements for awareness. The first service is normally provided by multiversion databases [6, 10]. The second service can be supported by associating each object version or each diff with elements of the data model that explain the reasons for the change. An argumentation model such as IBIS [8] can be used for this purpose. These two services are illustrated in Figure 1.

Figure 1 - A multiversion database integrated with IBIS
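The first two services can be sketched as follows. This is a minimal in-memory illustration, not the design of any cited system: the names `Rationale`, `Version`, and `GroupMemory` are assumptions introduced here, and the IBIS record is reduced to an issue, a position, and a list of arguments.

```python
from dataclasses import dataclass, field


@dataclass
class Rationale:
    """IBIS-style explanation attached to a version (illustrative structure)."""
    issue: str
    position: str
    arguments: list = field(default_factory=list)


@dataclass
class Version:
    number: int
    content: str
    author: str
    rationale: Rationale
    discarded: bool = False  # discarded alternatives are kept, never deleted


class GroupMemory:
    """Hypothetical repository that keeps every version of every object."""

    def __init__(self):
        self._objects = {}  # object id -> list of Version

    def add_version(self, obj_id, content, author, rationale):
        history = self._objects.setdefault(obj_id, [])
        version = Version(len(history) + 1, content, author, rationale)
        history.append(version)
        return version

    def history(self, obj_id):
        return list(self._objects.get(obj_id, []))

    def why(self, obj_id, number):
        """Return the rationale recorded for a given version."""
        return self._objects[obj_id][number - 1].rationale


memory = GroupMemory()
memory.add_version("spec", "Use flat files", "ana",
                   Rationale("Where to store data?", "Flat files", ["simple"]))
memory.add_version("spec", "Use a database", "rui",
                   Rationale("Where to store data?", "Database", ["concurrent access"]))
memory.history("spec")[0].discarded = True  # the rejected alternative stays available

print(memory.why("spec", 2).position)
```

Keeping the discarded version together with its rationale is what allows the group to return to an earlier discussion later.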

The user perception of the group memory is the third service, and a new one. It represents the particular knowledge that each member has of each object in the database. Therefore it can be represented by a set of pointers to object versions in the group memory. It is thus neither a time-based nor an author-based version; it is a particular, dynamic view of the current state of the group memory. In the simplest mode, objects can be classified into two categories for each user: those whose existence and content the user knows, and those he/she does not.

The user perception can also be multiversion when we need to keep track of the progress of his/her perception of the memory, such as in a cooperative learning application. It is interesting to note that while traditional database objects have new versions usually associated with updates, the user perception is updated as a result of read-only queries.
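A minimal sketch of the third service, assuming the perception is stored as one pointer (a version number) per user and object. The class `PerceptionTracker` and its method names are hypothetical. Note that `read` modifies no object, only the reader's perception.

```python
class PerceptionTracker:
    """Tracks, per user, the latest version of each object the user has seen."""

    def __init__(self):
        self.latest = {}  # object id -> latest version number in the group memory
        self.seen = {}    # user -> {object id -> version number last read}

    def publish(self, obj_id):
        """Record that a new version of obj_id exists and return its number."""
        self.latest[obj_id] = self.latest.get(obj_id, 0) + 1
        return self.latest[obj_id]

    def whats_new(self, user):
        """Objects whose latest version the user has not yet read."""
        known = self.seen.get(user, {})
        return sorted(o for o, v in self.latest.items() if known.get(o, 0) < v)

    def read(self, user, obj_id):
        """A read-only query: it changes the user's perception, not the object."""
        self.seen.setdefault(user, {})[obj_id] = self.latest[obj_id]
        return self.latest[obj_id]


tracker = PerceptionTracker()
tracker.publish("agenda")
tracker.publish("budget")
tracker.read("ana", "agenda")
tracker.read("ana", "budget")
tracker.publish("agenda")  # a second version that ana has not seen yet

print(tracker.whats_new("ana"))  # the same query gives a different answer per user
print(tracker.whats_new("rui"))
```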

The fourth service will allow agents or query mechanisms to respond differently depending on the user's setting. For example, the user can set the system to send him/her a message when events of a defined category occur.

The awareness mechanism can be provided through agents that will work for each group member according to his/her setting, by selectively querying the database. Alternatively, the users themselves can query the database using a higher level language that makes use of the elements described above. The main idea is that the result of a query should vary according to the user perception of the group memory, which varies from one member to another.

Besides serving a particular member's interest, the awareness mechanism can also be used by users playing coordination roles. For example, a discussion facilitator may want to know if all users are aware of a certain proposal before casting a vote.
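The fourth service and the coordination check can be sketched in the same spirit. The representation below (categories and origins as sets, an empty origin set meaning "anyone") is an assumption made for illustration, as are the names `AwarenessSetting`, `dispatch`, and `all_aware`.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    category: str  # e.g. "proposal" or "comment"
    origin: str    # member who caused the event
    obj_id: str


@dataclass
class AwarenessSetting:
    """A member's chosen level of awareness (hypothetical representation)."""
    categories: set = field(default_factory=set)
    origins: set = field(default_factory=set)  # empty set means "anyone"

    def wants(self, event):
        return (event.category in self.categories
                and (not self.origins or event.origin in self.origins))


def dispatch(event, settings):
    """Return the members whose agents would notify them of this event."""
    return sorted(m for m, s in settings.items()
                  if m != event.origin and s.wants(event))


def all_aware(obj_id, members, seen):
    """Coordination check: has every listed member read the given object?"""
    return all(obj_id in seen.get(m, set()) for m in members)


settings = {
    "ana": AwarenessSetting({"proposal"}),
    "rui": AwarenessSetting({"proposal", "comment"}, {"ana"}),
    "eva": AwarenessSetting({"comment"}),
}
event = Event("proposal", "eva", "p7")
print(dispatch(event, settings))
print(all_aware("p7", ["ana", "rui"], {"ana": {"p7"}, "rui": set()}))
```

A facilitator would call something like `all_aware` before opening a vote on proposal `p7`.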

To support these requirements without any limitation imposed by the version model, we will use the database version approach [4, 6]: the multiversion database stores and manages as many versions of the universe modeled by the database, called database versions, as necessary. Working on a database version is similar to working on a classical monoversion database, and users are allowed to query several database versions simultaneously. This model allows:

- to consider versions at different levels of granularity (simple or complex objects, part or the whole universe), to follow their histories and document their changes;

- to follow the history of an object or a group of objects across versions, even if at some point an object changes its type, is split into several other objects, etc.
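As a rough illustration of the database version idea, the toy model below treats each database version as a complete logical state of the universe. It is only a sketch: the class and method names are assumptions, and each derived version is copied eagerly, whereas a real multiversion DBMS would share unchanged object versions between database versions.

```python
class MultiversionDatabase:
    """Toy model: each database version is a full logical state of the universe."""

    def __init__(self):
        self.versions = {"dv0": {}}  # database version id -> {object id -> value}
        self.parent = {"dv0": None}

    def derive(self, parent_id, new_id):
        """Create a database version that starts as a logical copy of its parent."""
        self.versions[new_id] = dict(self.versions[parent_id])
        self.parent[new_id] = parent_id

    def put(self, dv_id, obj_id, value):
        """Update an object inside one database version only."""
        self.versions[dv_id][obj_id] = value

    def get(self, dv_id, obj_id):
        return self.versions[dv_id].get(obj_id)

    def across(self, obj_id, dv_ids):
        """Query the same object in several database versions at once."""
        return {dv: self.get(dv, obj_id) for dv in dv_ids}

    def lineage(self, dv_id):
        """Follow the derivation history back to the root database version."""
        chain = []
        while dv_id is not None:
            chain.append(dv_id)
            dv_id = self.parent[dv_id]
        return chain


db = MultiversionDatabase()
db.put("dv0", "title", "Draft")
db.derive("dv0", "dv1")
db.put("dv1", "title", "Final")
print(db.across("title", ["dv0", "dv1"]))
print(db.lineage("dv1"))
```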

Regarding the implementation of the additional services described, our approach is to build a groupware layer on top of such a multiversion database management system. The layer will interface with the database management system and the applications. It will deal with queries, agents, and the representation of user knowledge and level of awareness. A change at the user interface level will also be required.

3 Conclusions and Next Steps

The main goal of our project is to build what we call a "database-supported cooperative work" through the integration of several services in a database system to support groupware. Our main claim is that database management systems, though very important to groupware, lack a number of repository services that are important for groupware. The alternative so far is to incorporate these services into the application. The awareness mechanisms and the user-sensitive queries are only two examples of this.

Figure 2 - A multi hierarchy version model of group memory evolution

Regarding awareness and versions, our next step is to refine the user awareness model in order to support user-sensitive database queries. In parallel we are adapting the version model to provide better support for the representation of evolution. We are especially interested in a multi-hierarchy model of both versions and evolution, such as that illustrated in Figure 2.

Some of these ideas are being implemented in SISCO [2]. In addition, we will need a special interface to deal with awareness and evolution. A preliminary step towards this is described in [3].

Acknowledgments

This work is being partially supported by the Brazilian Research Council under the grant #200219/94-6.

References

[1] Baecker, R. Readings in Computer-Supported Cooperative Work, Morgan Kaufmann Publishers, 1994.

[2] Bellassai, G., Borges, M., Fuller, D., Pino, J.A., Salgado, A.C. SISCO: A tool to improve meetings productivity. In Proceedings of the 1995 Cyted-Ritos International Workshop on Groupware, to appear.

[3] Borges, M. & Pino, J.A. Additions to the card metaphor for designing human-computer interfaces, 4th Workshop on Information Technologies and Systems (WITS '94), Vancouver, Canada, December 1994.

[4] Cellary, W. & Jomier, G. Consistency of Versions in Object-Oriented Databases. In Proceedings of VLDB '90, Brisbane, Australia, 1990.

[5] Dourish, P. & Bellotti, V. Awareness and Coordination in Shared Workspaces. In Proceedings of CSCW '92, Toronto, Canada, 1992.

[6] Gancarski, S. & Jomier, G. Managing entity versions within their context: a Formal Approach. In Proceedings of DEXA '94, Athens, Greece, pp: 400-409.

[7] Haake, A. & Haake, J.M. Take CoVer: Exploiting Version Support in Cooperative Systems. In Proceedings of INTERCHI '93, Amsterdam, The Netherlands, April 1993.

[8] Kunz, W. & Rittel, H. Issues as Elements of Information Systems, Working Paper # 131, Institute of Urban and Regional Development, University of California at Berkeley, 1970.

[9] Khoshafian, S. & Buckiewicz, M. Introduction to Groupware, Workflow, and Workgroup Computing, John Wiley & Sons, Inc., 1995.

[10] Minör, S. & Magnusson, B. A model for Semi-(a)Synchronous Collaborative Editing. Technical Report LU-CS-TR:93-109, Lund University, 1993. Also in Proceedings of ECSCW '93, Milano, Italy, 1993.

[11] Sohlenkamp, M. & Chwelos, G. Integrating Communication, Cooperation, and Awareness: The DIVA Virtual Office Environment. In Proceedings of CSCW '94, Chapel Hill, USA, 1994, pp: 331-343.

Conceiving Collaborative Version Control
for agent-based conceived Hypertext*

Antonina Dattolo & Vincenzo Loia
Dipartimento di Informatica ed Applicazioni,
Università di Salerno, 84081 Baronissi (SA), ITALY
e.mail: {antos,loia}@dia.unisa.it

Abstract

In this paper we discuss collaborative version control issues in a hypertext architecture which has been defined using an agent-level design metaphor. Within this perspective all the basic activities which rule the functioning of the hypertext are formulated in terms of collaborative, parallel mechanisms which organize global tasks by disseminating knowledge and duties over a population of computational agents. We focus on those agent behaviors designed to accomplish the task of version control. A concurrent version control mechanism suitable for a single user is described first; then the extension of our model towards CSCW (Computer Supported Collaborative Work) environments is presented.

1 Introduction.

The impact of agent software technology on software development represents one of the most important breakthroughs, as illustrated by recent studies [16]. Figure 1 illustrates the impact of this novel software technology on different application fields [10].

In the agent-based approach, knowledge and duties are distributed over populations of computational agents. To reach global solutions, specific cooperation schemes are formulated and implemented so that the agents can pursue their goals autonomously and independently [5]. In the hypertext community, distributed processing was introduced in the mid-1980s as a technological support for multi-user access in order to (de)centralize databases [3]. More recently, distribution has been used as a key element in CSCW architectures to allow cooperative activity in organizational environments [12, 14, 6]. In recent years, the notion of ``agent'' has extended its role in the hypertext domain: in [15] knowledge agents are attached to an information object to provide simple and flexible user access capabilities, and in [2] an active use of distributed knowledge is introduced to improve operational efficiency and to maintain consistency. In a previous paper [4], the agent metaphor was adopted to define a general distributed framework for hypertext systems. In that work we discussed how a number of basic issues of hypertext environments can be formulated in terms of cooperative tasks of independent agents. In particular, we showed how the agent approach improves the efficiency and simplicity of version management, which presents considerable difficulties in the realization of efficient hypertext systems. Herein we discuss how the same approach is useful for realizing version control in CSCW environments [1, 8, 13]. We underline how this task can be handled through simple extensions which do not change the basic features of our initial single-user model. To facilitate our discussion, the next section briefly introduces the agent-based hypertext architecture that underlies the concurrent version control mechanism. The CSCW extensions are discussed in Section 3, focusing on version control management for collaborative users. Conclusions in Section 4 close the paper.

2 The Model

Our model is very simple: the hypertext is a population of independent agents, which are the basic computational entities of our platform. Each agent maintains local information in its acquaintances (slots) and interacts with the agent society by activating its scripts (subprograms) when specific stimuli are received. The internal knowledge of a single agent is very limited; its external view is given by its neighbours. This means that each agent possesses only partial, local knowledge of the hypertext but can potentially, through message passing, obtain a global perspective of the overall net. Each agent works asynchronously in a parallel universe. For common goals, some agents may cooperate to attain global plans. By decentralizing control and data, we accomplish all the basic activities, such as version control, concurrently. The model is composed of two basic levels: the structural and the meta levels. The HypAgents compose the structural layer, the Collectors the meta one.

The HypAgent entity plays the same role as well-known objects such as notecards, frames, nodes, entities, etc. [9]. The collection of all HypAgents represents the most general, complete user perspective of the hypertext. The pseudo-code in Figure 3 shows its definition.

As the reader may note, we emphasize the two basic aspects of the agent, i.e., the data part (defined by the keyword `with-acquaintances') and operational part (introduced by the construct `with-communication-list').

Following the pseudo-code in Figure 3, each HypAgent holds the following information:

- name: to address the agent;

- text/image/sound: to store textual/graphical/acoustic information;

- incoming/outcoming: to identify the ``traditional'' input/output nodes, which in our model are input/output HypAgents;

- mbox: to maintain external messages;

- author: to identify the author;

- config: to define to which configurations the agent belongs;

- ...
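To make the acquaintance/script split concrete, here is a small sketch of such an agent. It is not the pseudo-code of Figure 3: the slot names follow the list above, but the message dispatch and the two scripts (`set-text`, `add-link`) are assumptions introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class HypAgent:
    # Acquaintances (slots), named after the list above
    name: str
    text: str = ""
    incoming: set = field(default_factory=set)
    outcoming: set = field(default_factory=set)
    mbox: list = field(default_factory=list)
    author: str = ""
    config: set = field(default_factory=set)

    def receive(self, message, *args):
        """Store the message, then run the matching script if one exists."""
        self.mbox.append((message, args))
        script = getattr(self, "script_" + message.replace("-", "_"), None)
        return script(*args) if script else None

    # Scripts: actions triggered by specific stimuli (names are illustrative)
    def script_set_text(self, new_text):
        self.text = new_text
        return self.text

    def script_add_link(self, target):
        # Each agent only records its own neighbours: knowledge stays local.
        self.outcoming.add(target.name)
        target.incoming.add(self.name)
        return target.name


n1 = HypAgent("N1", author="antos")
n2 = HypAgent("N2")
n1.receive("add-link", n2)
n2.receive("set-text", "hello")
print(n2.incoming, n2.text, len(n1.mbox))
```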

The functions enclosed in the script section represent all the possible actions which the agent can perform. For instance, `replace-HypAgent' is the script that is activated when a version mechanism is applied. The meta-level layer contains the agents designed to manage alternative browsing structures and views of partial sections of the hypertext. The distinction from the HypAgent is due to the need to create and handle separate collections of HypAgents, providing a more abstract treatment in order to better shape browsing techniques or, in general, to provide more abstract operations. The Collectors, like composites, "would provide a means of capturing nonlink-based organizations of information, making structuring beyond pure networks an explicit part of hypertext functionality" [7].

The pseudo-code of this agent population is shown in Figure 4.

In particular, we note that the acquaintance `collection' maintains the set of HypAgent addresses corresponding to a given collection and the scripts `create/optimize/search-config' handle and improve the configuration management.

2.1 Concurrent Version Management.

We use the term "version control" [11] to indicate the different control strategies suitable for handling node-based and structure-based versions. In our model, the approach to version control is uniform, i.e. the node/structure distinction is removed: in the agent model, each single entity acquires the global net not by accumulating data in a single entity, but by applying concurrent cooperation schemes among decentralized entities in such a way as to reach common goals. Thanks to this new perspective, node-based versioning becomes a particular case of the more general structure-based versioning. This means that the same versioning schemes can be applied to structural and meta-level agents. Hence, in this paper we use the term "configuration or version" with the same meaning as the term configuration adopted in software engineering [11], i.e. to indicate a specific state of the hypertext (generally, program) structure as a whole. Here we summarize the agent-level version management with an informal description. Figure 5 shows the general situation which occurs when the user decides to create a new state of the hypertext with a new version mark. The original nodes are identified by Nr, whereas the notation Nrvj identifies a node Nr existing in a subsequent version vj.

Figure 5a depicts the state of the hypertext associated with a version labeled ti. Each node of the hypertext, i.e. each HypAgent, contains, as local information, the list of all the versions to which it belongs (for simplicity we suppose that the only existing version is ti). The cognitive activity of the user is located on the node N2. The user modifies the node and stores the new content. This command triggers a session-based1 versioning operation, with a new stored state of the hypertext indicated by ti+1. This situation triggers the Collector script `create-config', which essentially stores the previous state of the system, via a collaboration with the neighbours of N2 (in our example only the HypAgent N1). To optimize this management, the storing consists in duplicating only the section of the agent web which is likely to change; this section is composed of two different entities:

- the current agent(s);

- the collection of the neighbour agents, named the "frontier".

Regarding the frontier, the possible alterations concern only the links, i.e. only the incoming/outcoming links associated with the current agent(s). For this reason, the duplication is not propagated recursively over the net (to the HypAgents N3 and N4 in the example of Figure 5), but stops at the frontier (the HypAgent N1 in the example), since it is the only section subject to possible modifications. In this way, the storing process consists in generating clones of the neighbouring HypAgents (N1) and of the current agent itself (N2).

The clones are perfect duplicates of the original agents. The only difference is that when an agent is cloned, the cloned HypAgent takes the place of the original one, which becomes an inhibited entity, i.e. it remains deaf to any external stimulus. After all of the HypAgents have generated the corresponding clones, the clones reset their version field, since they represent a new configuration of the system and thus do not belong to the previous contexts. For example, in Figure 5b, the reader can observe that the node labeled N2v1 identifies the node N2 in the version v1. Once a new version is established, it is necessary to include the other active HypAgents in this new configuration. Figure 5b sketches the configuration after the cloning of the nodes N1 and N2. The new state is reflected in the fact that the nodes N3 and N4 now belong to both the configurations marked ti and ti+1, whereas the clones N1v1 and N2v1 exist only in the latter version ti+1. In order to distinguish the active entities from the inhibited ones, our graphical representation uses bold objects to depict the active agents. We underline the fact that an agent can be active in only one configuration. Thus, in the configuration ti+1 of Figure 5b, the incoming links for N4 come only from the nodes N1v1 and N3 (since the node N1 is invisible in this configuration). Each time a new configuration is created, the designated Collector records it, in order to allow the identification of all the nodes belonging to that configuration. The Collector holds a perspective of the hypertext in terms of inhibited and active HypAgents. To execute a given version selection, the Collector tells all the agents belonging to that configuration to awaken, in order to surface an ``old'' version of the hypertext. This task is performed by obscuring the HypAgents which do not match the configuration.
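The frontier-cloning step can be illustrated with a sequential toy model. It is a sketch under stated assumptions: the link topology (N1 to N2, N1 to N4, N3 to N4) is a guess consistent with the description of Figure 5, the class `VersionedHypertext` is hypothetical, and the concurrent message passing between agents is replaced by ordinary method calls. The `configs` dictionary plays the role of the Collector's record of which agents are active in each configuration.

```python
class VersionedHypertext:
    """Toy model of frontier cloning; node names and topology are assumptions."""

    def __init__(self, links, first="t0"):
        self.links = set(links)  # directed links as (source, target) pairs
        # configuration label -> names of the agents active in it
        self.configs = {first: {node for link in self.links for node in link}}

    def frontier(self, node, version):
        """Active agents directly linked to `node` in the given configuration."""
        linked = {s for s, t in self.links if t == node}
        linked |= {t for s, t in self.links if s == node}
        return linked & self.configs[version]

    def new_version(self, edited, old, new, tag):
        """Clone the edited agent and its frontier; everything else is shared."""
        active = self.configs[old]
        to_clone = {edited} | self.frontier(edited, old)

        def clone(name):
            return name + tag if name in to_clone else name

        # Only links touching a cloned agent are duplicated.
        for s, t in list(self.links):
            if s in active and t in active and (s in to_clone or t in to_clone):
                self.links.add((clone(s), clone(t)))
        # Originals remain in the old configuration only (inhibited in the new one).
        self.configs[new] = {clone(name) for name in active}
        return sorted(clone(name) for name in to_clone)

    def incoming(self, node, version):
        return sorted(s for s, t in self.links
                      if t == node and s in self.configs[version])


web = VersionedHypertext([("N1", "N2"), ("N1", "N4"), ("N3", "N4")])
clones = web.new_version("N2", old="t0", new="t1", tag="v1")
print(clones)
print(web.incoming("N4", "t1"))
print(sorted(web.configs["t0"] & web.configs["t1"]))
```

Selecting a version then amounts to treating the agents in `configs[version]` as active and all others as inhibited.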

2.2 Extending the model for CSCW.

The extension has been applied in two directions:

- Improve the HypAgent with local intelligence to support lock/unlock operations on the acquaintances of the node. Essentially, it is necessary to introduce new script entities which rule the access to the resources, in order to differentiate the operation mode for each media `text, image, sound'. Furthermore, such scripts may apply to links (the acquaintances `incoming/outcoming'), since this resource is defined as a local HypAgent object.

- Define new agent populations to adapt our previous model to CSCW environments.

This last extension has led to the definition of a new agent population (see Figure 6), which acts as an interface between the HypAgent community and the different users. Such agents are named UserAgents; each new user has a corresponding UserAgent, which manages possible concurrent actions with the other users, who may or may not be cooperating concurrently in reading/writing operations.

As for the HypAgents, there exists a meta-level object for the UserAgents. The Collector that organizes the UserAgents is named UserCollector. Its main role is to synchronize the different active UserAgents. In the following section, we discuss the behavior of these new agents, focusing our attention on a CSCW environment.

3 Cooperation Schemes.

To generalize our discussion we suppose that n users are active on the net at a given instant. In particular, k users are focused on the same node N1, while the remaining n-k are located outside N1. In this paper we concentrate our discussion on these agent categories:

- UA: the set of UserAgents on N1 which may require r/w (read/write) operations;

- UR: the set of the remaining UserAgents.

To simplify our discussion, we treat only the concurrent reading/writing processes occurring on the HypAgent N1, taking into account that this restriction does not affect the generality of our method.

In the rest of the paper, we list the possible situations which demand the use of cooperation activities between the UserAgent, HypAgent and the Collector. For each of these cases, we provide the basic features of the collaborative version control management, showing how this is carried out in a concurrent way.

3.1 Case 1: Reading Activity.

Users are involved only in reading activity, so only standard browsing is required.

3.2 Case 2: a single writer UA.

This situation is depicted in Figure 7a2. The first task is to apply the versioning procedure on N1. This task is accomplished by the HypAgent N1. Then the acquaintances selected by the user for modification are locked. This process is executed in parallel, since the HypAgent exploits the multicast message passing facility. Afterwards, a notification mechanism must be executed: all the UserAgents must be informed about the writing on N1. At the meta level, the UserCollector knows all the UserAgents; thus it is up to the UserCollector to send the notification message in multicast. In this situation, the users' behaviors fall into two categories:

- those who want to share the view of the new N1 according to the WYSIWIS principle;

- those who do not want to share.

More precisely, if UA belong to the first category, then the link between each UA and N1 is removed and a new link is established between each UA and N1v1; if UR belong to the same category, they may simply acquire a new link towards N1v1. No particular action is required for the second user category, i.e. the remaining UA and UR. This behavior is depicted in Figure 7.
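The steps of this case (versioning, locking, notification, relinking) can be sketched sequentially as follows. The function `single_writer` and its arguments are hypothetical, the multicast is reduced to computing the list of recipients, and two details are assumptions not stated in the text: the writer is linked to the clone, and the writer is excluded from the notification.

```python
def single_writer(node, writer, acquaintance, links, sharers, tag="v1"):
    """Sketch of Case 2. `links` maps each user to the set of nodes it is linked to.

    Returns (locks, notified, updated_links).
    """
    clone = node + tag                           # 1. versioning of the node
    locks = {(clone, acquaintance): writer}      # 2. lock the selected acquaintance
    notified = sorted(u for u in links if u != writer)  # 3. multicast recipients
    updated = {}
    for user, nodes in links.items():            # 4. relink those who share the view
        nodes = set(nodes)
        if user == writer or user in sharers:
            nodes.discard(node)  # a user on the node leaves it; no-op for outsiders
            nodes.add(clone)
        updated[user] = nodes    # users who do not share keep their links
    return locks, notified, updated


links = {"w": {"N1"}, "a": {"N1"}, "b": {"N1"}, "r": {"N3"}, "s": {"N3"}}
locks, notified, updated = single_writer("N1", "w", "text", links, sharers={"a", "r"})
print(locks)
print(notified)
print(updated)
```

Here `a` (on N1) and `r` (outside N1) chose WYSIWIS sharing, while `b` and `s` did not, so only the links of `w`, `a`, and `r` change.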

3.3 Case 3: p writers UA on separate acquaintances.

Let UA ... UA be the UserAgents wanting to write on the HypAgent N1 in different acquaintances, as shown in Figure 8a. Parallel versioning mechanisms are executed independently for each of the UAw writers (see Figure 8b). A locking operation is then applied and a notification is sent by the UserCollector to all the UserAgents in multicast. At this point, the UA ... UA may work according to the following schemes:

Figure 7 - One writer UA, k-1 readers on N1 and n-k readers outside N1

- UA want to work in WYSIWIS mode. In this case, the relative cloned sub-sections (in Figure 8b, N1v1, ...) associated with the writing UserAgents (UA) are merged into a unique texture, i.e. a unique cloned subsection (N1v1x in Figure 8c) on which the different users work concurrently.

- UA want to work in separate mode. In this case, no particular actions are required. The different cloned HypAgents (N1vx+1, ..., N1vp in Figure 8c) are maintained for each UserAgent UA. At the end of the modification session, the cloned sub-sections could be collapsed into a unique texture.

- UA want to work in loose cooperation mode. This means that some UA users want to see the writing operations accomplished by other users on N1 without being observed in their own writing activity. This situation is treated by adding a link from each UA to the selected cloned sub-sections. This action is represented in Figure 8c, where additional links are shown between the UserAgents UA and the cloned sub-sections N1v1x, ... .

Regarding the remaining UR, they will have the same behavior as described in the previous section 3.2 (the situation is not depicted in Figure 8c to avoid graphical confusion).

3.4 Case 4: more than one writer among UA on a same acquaintance.

The only difference with respect to the previous case is the need to synchronize the access to the same resource. The synchronization is monitored by the meta-level object UserCollector to better control the lock-unlock activities and the notification process. In particular, the synchronization is fundamental in WYSIWIS mode.
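The contrast between Cases 3 and 4 can be sketched with a lock table keyed by acquaintance. This is only an illustration: the class `AcquaintanceLocks` is hypothetical, and a first-come, first-served queue is assumed as the synchronization policy, which the text does not specify.

```python
from collections import deque


class AcquaintanceLocks:
    """Lock table kept per acquaintance; waiting writers are queued in order."""

    def __init__(self):
        self.holder = {}   # (node, acquaintance) -> UserAgent holding the lock
        self.waiting = {}  # (node, acquaintance) -> deque of waiting UserAgents

    def lock(self, user, node, acquaintance):
        key = (node, acquaintance)
        if key not in self.holder:
            self.holder[key] = user
            return True    # Case 3: separate acquaintances never conflict
        self.waiting.setdefault(key, deque()).append(user)
        return False       # Case 4: same acquaintance, the writer must wait

    def unlock(self, user, node, acquaintance):
        """Release the lock and return the next holder, or None if nobody waits."""
        key = (node, acquaintance)
        if self.holder.get(key) != user:
            raise ValueError("only the current holder may unlock")
        queue = self.waiting.get(key)
        if queue:
            self.holder[key] = queue.popleft()  # hand over to the next writer
        else:
            del self.holder[key]
        return self.holder.get(key)


table = AcquaintanceLocks()
print(table.lock("u1", "N1", "text"))    # first writer on `text'
print(table.lock("u2", "N1", "image"))   # different acquaintance: no conflict
print(table.lock("u3", "N1", "text"))    # same acquaintance: queued
print(table.unlock("u1", "N1", "text"))  # lock passes to the queued writer
```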

3.5 Case 5: there exists at least one writer among UA and at least one writer on a N1 neighbour among UR.

Let Nneib be the neighbour of N1. This situation must be considered during the version mechanism applied on N1. In fact, if UAw and URw work on separate acquaintances, then the cloning is not applied on Nneib; in this way UAw may refer to the Nneib HypAgent without affecting the consistency of the model. The sharing of resources may occur if both the UAw and URw want to modify the same link between N1 and Nneib. In this case, the UserCollector provides appropriate synchronization mechanisms, according to the previously described cases.

4 Concluding remarks.

The objective of this paper has been to discuss how an agent-oriented hypertext design environment previously defined for single-user applications has been extended to CSCW environments. The paper has shown that this goal has been accomplished without expensive effort in terms of design prototyping. Having conceived the hypertext architecture as a society of collaborative, independent agents has allowed us to introduce the mechanisms which support cooperative functionality without modifying the bulk of our distributed framework.

References

[1] U. M. Borghoff, G. Teege. Application of Collaborative Editing to Software Engineering Projects. Proc. of International Symposium of Software Testing and Analysis, ACM SIGSOFT, ACM Software Notes, 18(3), July pp.56-64, 1993.

[2] H. Chang, T. Hou, A. Hsu, S. K. Chang. Tele-Action Objects for an Active Multimedia System. Proc. of Second Int'l Conf. on Multimedia Computing and Systems, May 15-18, 1995.

[3] J. Conklin. Hypertext: An introduction and survey. IEEE Computer, 20(9), September pp.17-41, 1987.

[4] A. Dattolo, V. Loia. Hypertext Version Management in an Actor-based Framework. Proc. of the Seventh Conference on Advanced Information Systems Engineering - CAiSE*95, 12-16 June 1995, Jyväskylä, Finland. Springer-Verlag Lecture Notes in Computer Science (932), Ed. by J. Iivari, K. Lyytinen, M. Rossi, pp. 112-125, 1995.

[5] L. Gasser. Agent organizations for information retrieval and electronic commerce: the next frontier. Proc. of the Workshop on Heterogeneous Cooperative Knowledge-Base, International Symposium FGCS'94, Japan, December 15-16, pp.49-63, 1994.

[6] K. Gronbaek, J. A. Hem, O. L. Madsen, L. Sloth. Hypermedia Systems: a Dexter-based architecture. Communications of the ACM, 37(3), pp.65-74, 1994.

[7] K. Gronbaek, R. H. Trigg. Design issues for a Dexter-based hypermedia system. Communications of the ACM, 37(3), pp.40-49, 1994.

[8] A. Haake, J. M. Haake. Take CoVer: Exploiting Version Support in Cooperative Systems. Proc. of INTERCHI'93, Amsterdam, April pp. 406-416, 1993.

[9] F. Halasz, M. Schwartz. The Dexter hypertext reference model. (K. Grønbaek and R. H. Trigg Eds.) Communications of the ACM, 37(3), pp.30-39, 1994.

[10] N. R. Jennings, M. Wooldridge. Applying Agent Technology. IJCAI-95, Workshop on Agent Theories, Architectures and Languages, Montreal, Quebec, August 1995.

[11] R. H. Katz. Toward a Unified Framework for Version Modeling in Engineering Databases. ACM Computing Surveys, 22(4), pp.375-408, 1990.

[12] C. C. Lin, C. S. Kao, W. C. Shang, S. K. Chang. The Transformation from Multimedia Data Schema to Multimedia Communications Schema in Distributed Multimedia Systems. Proc. of Pacific Workshop on Distributed Multimedia Systems, February 26, pp.1-13, 1994.

[13] B. Magnusson, U. Asklund, S. Minor. Fine-Grained Revision Control for Collaborative Software Development. Proc. of the First ACM-SIGSOFT Symposium on the Foundation of Software Engineering, ACM Software Notes, Los Angeles, CA, USA, December 7-10, 18(5), pp.33-41, 1993.

[14] D. E. Shackelford, J. B. Smith, F. D. Smith. The architecture and Implementation of a Distributed Hypermedia Storage System. ACM Hypertext '93 Proceedings, Seattle, Washington USA, November 14-18, pp.14-24, 1993.

[15] Y. Shibata, M. Katsumoto. Dynamic Hypertext and Knowledge Agent Systems for Multimedia in Information Networks. ACM Hypertext '93 Proceedings, Seattle, Washington USA, November 14-18, pp.82-93, 1993.

[16] M. Wooldridge, N. R. Jennings. Intelligent Agents: Theory and Practice. Knowledge Engineering Review, 10(2), June 1995.

The Role of Version Control in CSCW Applications:
A Position Statement

Prasun Dewan & Jon Munson
University of North Carolina at Chapel Hill
e-mail: dewan@cs.unc.edu

A CSCW application is an interactive application that allows multiple, distributed users to interact with it. It does so by creating copies of its user-interface state on the workstations of the different users, and allowing the users to manipulate these copies. In CSCW applications supporting WYSIWIS (What You See Is What I See) collaborations, these copies are physical replicas of a common logical state and are kept consistent. However, in other CSCW applications, these copies are separate versions derived from some common object.

There are many reasons for maintaining user copies of the interaction state as separate logical versions rather than consistent physical replicas:

(a) Disconnected Users: It is sometimes impossible or too costly to connect the users of a CSCW application, especially if they are mobile.

(b) Immediate Local Feedback: Users may wish to receive immediate feedback to their local operations rather than wait until these operations have been invoked on all of the user copies of the interaction state. As a result, the local copies can become, at least temporarily, inconsistent with each other.

(c) Asynchronous Collaboration: Users may wish to independently manipulate the application state to try different alternatives. For instance, two users interacting with a multiuser spreadsheet might wish to try different alternatives for a budget.

In a system supporting versions, users need to be able to perform the following operations:

(a) Check-Out: derive a private version from some base object.

(b) Check-In: make the state of a private version public.

(c) Merge: merge two versions.

These operations have been supported by traditional version control systems managing persistent objects such as files and databases. A replicated CSCW application can provide the services of these systems to users. In particular, it can Check-Out versions of persistent objects into its in-core data structures, allow users to modify these versions, and invoke, on behalf of the user, the Check-In and Merge operations provided by the version management system.
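The three operations can be illustrated on in-core data with a short sketch. It assumes versions are plain dictionaries (for example, spreadsheet cells by name) and uses a three-way merge against the common base, which is one common merge strategy rather than the behavior of any particular version control system. The function names are illustrative, and `None` is reserved to mean "absent".

```python
import copy


def check_out(base):
    """Derive a private version from a base object."""
    return copy.deepcopy(base)


def merge(base, mine, theirs):
    """Three-way merge of dict-valued versions; returns (merged, conflicts)."""
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(key), mine.get(key), theirs.get(key)
        if m == t:
            value = m              # both agree (or neither changed)
        elif m == b:
            value = t              # only "theirs" changed
        elif t == b:
            value = m              # only "mine" changed
        else:
            conflicts.append(key)  # both changed differently: keep "mine" for now
            value = m
        if value is not None:
            merged[key] = value
    return merged, conflicts


def check_in(public, private):
    """Make the state of a private version public (updates `public` in place)."""
    public.clear()
    public.update(copy.deepcopy(private))


budget = {"rent": 100, "food": 50}
mine = check_out(budget)
theirs = check_out(budget)
mine["rent"] = 120
theirs["food"] = 60
theirs["travel"] = 30
merged, conflicts = merge(budget, mine, theirs)
check_in(budget, merged)
print(budget, conflicts)
```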

However, it is not sufficient for a CSCW application to rely completely on these systems to manage its versions, for several reasons:

(a) Fixed Physical Granularity: Users of the application might want to create versions of logical, application objects such as outline items and spreadsheet cells. Moreover, they might want to determine the granularity of versions. For instance, a user of a document editor may wish to create a version of the complete document, a section, or a paragraph. These systems support fixed-granularity versions of ``physical'' persistent objects such as files.

(b) Heavyweight Versions: Check-In/Check-Out operations provided by these systems are heavyweight actions in that (i) the user must explicitly invoke these operations and (ii) the application must access persistent storage. The user and system overhead of these operations is justified if the active lifetime of a version is long. Users, however, sometimes need to rapidly create versions with short lifetimes, especially when they are creating versions of small objects such as document sentences or spreadsheet cells.

(c) Persistent Versions: These systems create persistent versions of persistent objects, whereas the users might want to create transient in-core versions of both persistent and transient objects.

These problems can be solved by systems that manage versions of in-core application data structures rather than persistent objects. This idea has been used in the design of some CSCW applications. For instance, Greif's Chronicle spreadsheet supports variable-grained logical versions by supporting versions of user-defined ranges of spreadsheet cells. Our work on Suite tries to meet these requirements in an application-independent framework. It supports two main abstractions for creating CSCW applications, active variables and interaction variables. An active variable is a program variable of arbitrary type (such as integer, record, sequence) that can be manipulated by multiple users. An interaction variable is the local buffer for an active variable in which the user composes changes to the variable. An interaction variable and its active variable have identical types. Thus, for every active variable component (e.g. record field or array element), there is a component in the interaction variable, which can itself be considered an interaction variable for the active variable component.

Suite provides several primitives for keeping an active variable and its interaction variables consistent. It allows an interaction variable to be coupled with selected corresponding remote variables created for other users. It provides two commands, Transmit and Accept, which can be invoked by a user on a local interaction variable. The Transmit command updates all remote coupled variables with the value of the local variable. The Accept command is similar, except that after updating all coupled variables, it invokes a callback in the application program, which can update the corresponding active variable and take other actions. Suite also allows Transmit and Accept to be invoked implicitly on interaction variables when some condition occurs. These conditions are specified by special attributes of these variables. These attributes can be used to implicitly execute the Accept command on every incremental change to an interaction variable. Thus, the framework allows an active variable and all its interaction variables to always hold consistent values, thereby making them physical replicas rather than logical versions.
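The coupling behaviour described above can be illustrated with a minimal Python sketch. It is not Suite's actual interface: the class InteractionVariable, its methods, and the commit callback are hypothetical names chosen for illustration, and the active variable is modelled as a plain dictionary.

```python
from typing import Any, Callable, List, Optional


class InteractionVariable:
    """Hypothetical local buffer in which one user composes changes."""

    def __init__(self, user: str, value: Any,
                 on_accept: Optional[Callable[[Any], None]] = None):
        self.user = user
        self.value = value
        self.coupled: List["InteractionVariable"] = []
        self.on_accept = on_accept  # application callback, run by accept()

    def couple(self, other: "InteractionVariable") -> None:
        # Coupling is selective: only listed variables receive updates.
        self.coupled.append(other)

    def transmit(self) -> None:
        # Push the local value to every coupled remote variable.
        for remote in self.coupled:
            remote.value = self.value

    def accept(self) -> None:
        # Like transmit, then let the application update the active variable.
        self.transmit()
        if self.on_accept is not None:
            self.on_accept(self.value)


active = {"budget": 100}  # stands in for the shared active variable


def commit(value: Any) -> None:
    active["budget"] = value


alice = InteractionVariable("alice", 100, on_accept=commit)
bob = InteractionVariable("bob", 100)
carol = InteractionVariable("carol", 100)
alice.couple(bob)   # carol is deliberately left uncoupled

alice.value = 120   # first edit: alice now holds a private logical version
alice.transmit()    # bob's buffer is updated; the active variable is not
alice.accept()      # the callback commits the value to the active variable
```

In this sketch an implicit Accept on every incremental change would amount to calling accept() from whatever code assigns to value, which is how the replicas would stay consistent at all times.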

The framework also provides the Merge command to merge the values of two interaction variables. The scheme used to merge these interaction variables is specified by special merge parameters of the variables.

This framework can be considered a version management scheme. The first command to edit an interaction variable transforms a physical replica into a new logical version, and thus can be considered as a Check-Out command. The Accept and Transmit commands can both be considered as Check-In commands. The framework has many features missing in persistent version control systems:

r (a) Logical Granularity: It directly creates versions of "logical" application objects rather than "physical" persistent objects to which they may be mapped.

r (b) In-Core Versions: It creates in-core versions of in-core application objects, some of which may also be persistent objects.

r (c) Implicit Check-Out: It does an implicit Check-Out of an active variable of an application when a user first tries to edit the corresponding interaction variable.

r (d) Implicit and Explicit Check-In: By allowing the Transmit and Accept commands to be invoked both explicitly and implicitly, it supports both implicit and explicit Check-In.

r (e) Selective Check-In: A user has control over which remote interaction variables are coupled to an interaction variable. As mentioned above, the Transmit and Accept commands operate only on the coupled interaction variables. Thus, a user can select the set of collaborators to which a version is checked in.

r (f) Variable-Grained Check-In: As mentioned above, a user can (implicitly or explicitly) select the interaction variable to be checked-in. The interaction variable can be not only a complete structure but also a substructure of a structured interaction variable. Thus, a user can control the granularity of the Check-In operation.

Thus, the framework solves the problems of persistent version management systems. However, it currently has several limitations:

r (a) Temporary Versions: An interaction variable is a transient object which goes away when the user disconnects from the application. The system does not provide a method for creating persistent versions of active variables.

r (b) One Version Per User: A user is limited to one version per active variable. An undo command is provided to restore previous versions, but the system does not allow a user to simultaneously manipulate multiple versions of an active variable.

r (c) One-level Check-Out: Similarly, a user can Check-Out a base object (active variable), but not a version of it (interaction variable).

r (d) Inter-Application Versions: The framework does not support sharing of versions among multiple applications.

r (e) Storage of Complete Version State: Persistent version control systems do not store the complete state of checked-in versions. Instead, they store only the differences between the base object and the version. Our framework stores the complete state of the version. One reason for this is that an interaction variable serves not only as a version of an active variable but also as a cache of it. However, if a user is allowed to create multiple versions, it would be important to compute and store diffs of these variables.

r (f) Explicit 2-Way Merging: The framework requires users to explicitly invoke the Merge command, which can do only two-way merges. It would be useful to do implicit n-way merges of versions.
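Limitation (e) contrasts storing the complete state of a version with storing only its differences from the base object. As a rough sketch of the latter, assuming a version can be represented as a list of items, the following Python fragment records only the changed regions and rebuilds the version on demand. The function names make_delta and apply_delta are hypothetical. The comparison uses difflib.SequenceMatcher from the Python standard library, which is a swapped-in differencing technique and not the one used by any particular version control system.

```python
from difflib import SequenceMatcher


def make_delta(base, version):
    """Record only the regions of `version` that differ from `base`.

    Each entry is (start, end, replacement): base[start:end] is replaced
    by `replacement` to obtain the version.
    """
    matcher = SequenceMatcher(a=base, b=version, autojunk=False)
    return [(i1, i2, version[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def apply_delta(base, delta):
    """Rebuild the version from the base object and its stored delta."""
    result, pos = [], 0
    for start, end, replacement in delta:
        result.extend(base[pos:start])  # unchanged run copied from the base
        result.extend(replacement)      # changed region taken from the delta
        pos = end
    result.extend(base[pos:])
    return result


base = ["rent", "food", "travel"]
version = ["rent", "food and drink", "travel", "books"]
delta = make_delta(base, version)
rebuilt = apply_delta(base, delta)
```

Only the changed items appear in delta, so several versions of a large variable would share the unchanged content of the base object.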

Future research is required to overcome these problems. Such research can lead to a truly general purpose version management system for CSCW applications.

Requirements for a CSCW System for Software Development Organizations

John W. Gintell
JWG Software Systems
9 West Street, Cambridge, MA 02139
e-mail: gintell@shore.net

1 Introduction

This paper outlines the requirements for a CSCW System to be used by Software Development organizations to develop, maintain, and manage changes to a software system. The software system consists of a large number of artifacts (documents, designs, source code, "build" directives, and executable software). During the development of initial and subsequent releases, new versions of these artifacts are created; these artifacts are stored and managed by a version management system. There is a software development process enacted by people to make and manage the transformations of these artifacts. The CSCW system is designed to support three major project functions that involve collaboration between these people: change management, inspection/review, and planning/scheduling.

Section 2 of this paper describes the attributes of a software development organization. It then describes the components of the CSCW system used by such an organization. Section 3 contains a scenario to show how such a system is used. Section 4 describes some of the challenges faced by such a system. Finally, Section 5 contains background information about the author.

2 A CSCW System for Software Development Organizations

2.1 Attributes of a Software Development Organization

Prior to describing the system, consider the following attributes of a software development organization:

The product and its database

r Multiple versions of the products produced must be supported.

r A history of artifacts and their changes needs to be maintained.

The people

r A set of people are organized in some structure with relationships, responsibilities, and activities assigned.

r Considerable cooperation between people and sub organizations is needed to achieve results.

The process

r There is a defined process that governs how and when the work is done.

r Each project must have custom plans and schedules made.

r Review of proposed changes and of the work products produced is needed.

The environment

r A set of existing tools is used to perform the tasks.

r A CSCW system to support this organization should interoperate with these tools.

2.2 A high-level view of a CSCW system to meet these needs

The following system is envisioned to meet these needs. This CSCW system handles the artifacts and the interaction between the people in three collaborative aspects critical to the software development process. The components of this system are:

r The software database:

A database that contains multiple versions of the work products that comprise the system, together with their history information, and a version management system that maintains this database.

r The change management system

The mechanism for people to collaborate while determining what the changes are to be. A CSCW application used to determine, review, and manage the changes that people propose and make.

r The Inspection / Review system

The mechanism for people to collaborate while reviewing the correctness of items in the database. A CSCW application for inspection and review used by people to examine the work products for defects during the course of the entire development process.

r The Planning / Scheduling system

The mechanism for people to collaborate while determining who will make the changes and when they are to be made. A CSCW application for planning and scheduling that is used by people to plan the activities and to report on progress of these activities.

These components all interoperate to support a consistent and robust software development process. Actions in one subsystem usually result in actions in one or more of the other subsystems. Changes to the database are controlled by the version management system, frequently as a result of interactions between people using the other components of the system. A large number of configurable roles are defined. People are assigned to roles in a manner consistent with the organizational structure and the process used by the organization. The possible interactions are organized into stages to reflect major activities in the software development process. The role, stage, subsystem state, and access rights of the participants determine which actions may be performed.
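The way role, stage, and access rights jointly determine the permitted actions can be sketched as a simple lookup. The role names, stage names, and actions below are purely illustrative assumptions and are not taken from any existing system; a real configuration would be defined by the organization's process.

```python
from enum import Enum


class Stage(Enum):
    # Hypothetical stages of a change; real stages are process-specific.
    PROPOSAL = 1
    REVIEW = 2
    IMPLEMENTATION = 3


# (role, stage) -> actions permitted; all entries are illustrative only.
PERMISSIONS = {
    ("submitter", Stage.PROPOSAL): {"propose_change", "comment"},
    ("reviewer", Stage.REVIEW): {"comment", "approve", "reject"},
    ("developer", Stage.IMPLEMENTATION): {"check_out", "check_in"},
}


def may_perform(role, stage, action, has_write_access=True):
    """Return True if `role` may perform `action` during `stage`."""
    # Access rights act as an additional gate on database updates.
    if action == "check_in" and not has_write_access:
        return False
    return action in PERMISSIONS.get((role, stage), set())
```

Keeping the table as data rather than code is one way to obtain the configurability of roles that the text calls for.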

2.3 The Software Database and version management system

The software database contains multiple versions of the components as well as versions of the relationships between them. The database is managed by a version management system. The version system meets a number of needs of the organization:

r some components differ for varying hardware and system platforms

r old releases must be supported and maintained

r work is frequently done in parallel during the development/integration process

r multiple versions of some components exist for short periods of time to support varying degrees of frozenness and fluidity

2.4 The Change Management subsystem

All proposed changes to the product are processed by the change management subsystem. Changes are proposed by many participants in the software development process for such reasons as meeting new product requirements, improving the architecture and design of the system, or correcting defects found during testing, review, or use of the system. These proposals are reviewed and may cascade into further change proposals as well. Participants follow a set of protocols for reviewing, modifying, and approving these changes, in which organizations and individuals hold specific roles.

2.5 The Inspection / Review subsystem

This subsystem is used at varying points in the software development lifecycle to examine work products for defects. The reviews and inspections follow a defined process with roles assigned to participants. The results of a review produce future work items such as lists of defects to be removed and suggested improvements to be made in the future. These results become additional artifacts in the database and are linked to the artifacts that have been reviewed. As new versions of the reviewed artifacts are made to resolve the review issues or remove defects in subsequent stages of the development process, new versions of the remaining defect/issue lists are made. Roles and stages are defined to support various models of inspection and review. The review system supports the review of versioned objects, so that changes to previous versions can be reviewed, and assists with the merging and resolution of conflicting versions.

2.6 The Planning / Scheduling subsystem

This subsystem is used to manage when tasks are to be performed and who will do them. It is also used to determine status and to manage changes to the plans and schedules as a result of new events or situations.

3 Usage Scenario with this System

The following scenario illustrates the use of such a system to process a proposed change to a product. This scenario contains a number of interactions between people and organizations to accomplish the work according to the process for the organization.

r A marketing-submitted request for new features is sent to the Change Management system.

r The development project leader and architect collaborate to produce a design specification which includes versions of existing designs.

r The Inspection system is used to review this specification with members of the development team participating.

r Further design analysis is done and a number of detailed changes to modules in the product are proposed in the Change Management system and reviewed by project participants.

r The development project leader and the test manager use the Planning and Scheduling system to schedule the work; affected people comment upon and accept these plans and how they affect other existing plans; the plan is then approved by the project manager.

r Some of the modules planned for changes have lists of other proposed changes stored from a prior review; these changes will be added to the work needed for this change proposal.

r Some of the modules are being changed for another project; consequently an additional branch is made in the version system and a task is scheduled with the Scheduling system to merge and reconcile later.

r The changes are made and inspected with a number of invocations of the Inspection system. The inspection results in some defects to be removed immediately and some change requests to be handled in a subsequent project.

r Progress on these activities is tracked and reviewed on a regular basis with the Scheduling system.

4 Challenges for this CSCW System for Software Development

For a system like this to be successful and meet the requirements of the software development enterprise it should have the following properties:

r The system must be robust and reliable with adequate backup procedures and auditing capabilities.

r It must be flexible and customizable for variations of the process.

r The system must support considerable parallel activities on shared components in the database.

r It should not incur undue overhead or unfavorably affect the productivity of the participants.

r The development enterprise wants to use off-the-shelf tools to perform their work. It already has an investment in these tools and thus the CSCW system should be able to be integrated with these tools.

As a reality test, it would be worth looking at some existing products and experimental systems to see how far away they are from this model. It is beyond the scope of this paper to discuss these in any degree of detail, but a few comments are offered (with apologies for any erroneous inferences).

r Lotus Notes and other groupware systems are excellent for handling change management or scheduling but are difficult to integrate with a robust versioning system.

r Commercial version management systems (e.g. ClearCase from Atria or Continuus/CM from Continuus Software Corp) provide excellent support for versioned objects and their relationships, support a customizable change management process, and have APIs and hooks in their object management systems to store other attributes and linkages to other packages. They contain no direct support for the CSCW aspects and permit limited integration with CSCW systems.

r CSCW software inspection systems (e.g. Scrutiny, from Bull or CSRS from the University of Hawaii) are designed to support the review process but not to be tightly integrated with a version management system.

r Process Centered environments (e.g. OZ from Columbia University, ProcessWeaver from Cap Gemini) allow flexibility of process description and some integration with tools, and allow some support for CSCW aspects.

r IPSEs (e.g. EAST from SFGL) are complete systems aimed at supporting strict software development processes, contain their own versioning systems that support change management and scheduling, permit tool integration, but don't support the CSCW aspects very well.

5 Background of Author

My professional experience has been (in reverse order): Independent groupware and software engineering consultant; Architect and Project manager (and co-founder) of an Applied Research Lab at Bull in Billerica, MA, working with collaborative applications; Senior staff member of company-wide team in Bull introducing improved software engineering practices and tools to entire company; Second level manager of several software product development organizations at Honeywell including 18 years of technical and management experience with Multics; and a software developer designing and implementing operating systems and compilers. The relevant experience is:

r Architect and project manager for the Scrutiny project at Bull. Scrutiny is a project to experiment with a CSCW toolkit by building a CSCW application, obtaining usage of it by software engineers and evolving the application as a result. The toolkit used was ConversationBuilder (CB) from the University of Illinois at Urbana Champaign. Lead author and presenter of several papers about Scrutiny. "Trade-show" planning and demonstration. Managed experimental and pilot use of Scrutiny by software development teams in Bull and explored use by a number of potential customers.

Scrutiny is a distributed system for geographically separated users performing the entire inspection process (from participant invitation to defect tracking). Roles defined by the inspection process are assigned to participants and the actions permissible are dictated by the role and state of the inspection. The inspection is divided into four sequential stages (called Initiation, Preparation, Resolution, and Completion) to reflect the workflow of inspection. The artifacts managed by Scrutiny are the work products that are imported into it, participant-created annotations and defects associated with these work products, and inspection results that are exported for use by other tools.

r Participated in CSCW and CHI workshops on collaborative technologies for the past 3 years. Designed an extensive multi-user distributed document review system for a potential client. Studied and/or experimented with current integration technologies including ToolTalk, the Message Bus from UIUC, OLE, OpenDoc, and CORBA. These are key technologies permitting composite applications to be built with existing applications interoperating as components of a larger system.

r Worked with users to determine requirements and critical success factors for introduction of new tools for software development (UNIX workstation introduction, electronic meetings and other communications systems, and CASE tools). Made acquisition, development and deployment plans for tools to support improved software development process.

On Merging Hypertext Networks

Anja Haake, Jörg Haake & David Hicks
GMD - German National Research Center for Information Technology
IPSI - Integrated Publication and Information Systems Institute
Dolivostr. 15, D - 64293 Darmstadt, F.R. Germany
e-mail: {ahaake, haake, hicks}@darmstadt.gmd.de

1 Introduction

Versioning is a key problem in many hypertext applications. For a single user, versioning provides the ability to keep a history of the evolution of a hypertext network and the ability to explore several design alternatives at a time. In a multi-user environment supporting group work, keeping track of the evolution of data and allowing multiple alternatives, possibly created by different group members, is even more important. In particular, version support in a CSCW environment enables the group members to work asynchronously at the same time on the same artifacts by keeping their changes in separate revisions.

Allowing parallel versions or alternatives, either due to alternative designs or parallel work, requires the ability to merge parallel versions into a single consistent version from time to time: the different design alternatives have to be merged into an overall optimal design or the different contributions of the group members have to be integrated.

Whereas several version models for hypertext applications have been proposed that allow for the development of parallel versions [9, 16, 7], little work has been done on merging hypertext versions. By merging hypertext versions we mean merging not only atomic hypertext nodes but also nested composites that may contain complex networks.

In this position paper we raise several issues relevant for merging hypertext networks. We first summarize the state of the art on merging technology in general in Section 2 before raising six issues relevant for merging hypertext networks in Section 3. It turns out that the interface offered to merge hypertext network versions plays a major role for the merging task. We propose three different interfaces for merging hypertext networks in Section 4. Finally, we will summarize our claims in Section 5.

2 Related Work

Merging is normally based on a comparison of the versions to be merged. During merging, it has to be determined which changes should be placed into the integrated version and which should not. The goal is to make as many merge decisions as possible automatically by a merge procedure and to prompt the user only in the event of unresolvable conflicts that require a manual, intellectual decision.

The first work on merging versions was prompted in the application area of software engineering: parallel software development requires the integration of several software modules. Initial work focused on merging text files, namely the text files containing the source code. In syntax-directed programming environments some work taking into account the structure of the programming language has also been done.

In comparing text files for merging it turns out that not all changes are always relevant; for example, the insertion of extra blanks or comment lines can often be ignored. Therefore, tools for flexible comparison have been developed [13]. In addition, [15] have mentioned the importance of flexible differencing techniques to highlight the changes between text objects at different levels of detail and abstraction.

However, even those merge tools based on a flexible comparison algorithm are based on a more or less fixed set of assumptions on what kind of information is relevant for the merge procedure. Recently, [12] proposed a flexible object merging framework for CSCW applications. They argue that standard merging policies are not applicable to all group-work situations. Actually, the merging decisions that can be made automatically and the merging conflicts that have to be resolved manually (interactively by the users) depend on the cooperative situation. For example, a merge policy implemented in many merge tools is as follows:

Imagine co-workers A and B having each changed a common starting version V-I of a document. Doing so, they created two parallel versions, namely V-A and V-B. A common strategy to merge V-A and V-B is to take everything from V-I that has neither been changed by A nor by B, take everything from V-A that has not been changed by B with respect to V-I, take everything from V-B that has not been changed by A with respect to V-I, and forward everything that has been changed by A and B as a conflict to an interactive merge tool that can be operated by the co-workers. But what if A is a teacher and B a student and A has to see all changes of B before they can be accepted for a merged version? [12] mention several more examples illustrating the need for a flexible merge environment.
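The common strategy just described can be written down as a short sketch. It assumes, for illustration only, that a document version is a mapping from part names to content and that a part missing from a version counts as deleted there; the function name three_way_merge and this representation are not taken from any particular merge tool.

```python
def three_way_merge(v_i, v_a, v_b):
    """Merge parallel versions V-A and V-B of the common start version V-I.

    Each argument maps part name -> content. Returns (merged, conflicts),
    where conflicts maps a part name to the pair (A's content, B's content).
    """
    merged, conflicts = {}, {}
    for part in sorted(set(v_i) | set(v_a) | set(v_b)):
        base, a, b = v_i.get(part), v_a.get(part), v_b.get(part)
        if a == base and b == base:
            chosen = base               # changed by neither A nor B
        elif b == base:
            chosen = a                  # changed by A only
        elif a == base:
            chosen = b                  # changed by B only
        else:
            conflicts[part] = (a, b)    # changed by both: ask the co-workers
            continue
        if chosen is not None:          # None means the part was deleted
            merged[part] = chosen
    return merged, conflicts


v_i = {"title": "Budget", "intro": "Draft", "body": "Old text"}
v_a = {"title": "Budget 1996", "intro": "Draft", "body": "A's text"}
v_b = {"title": "Budget", "intro": "Draft", "body": "B's text"}
merged, conflicts = three_way_merge(v_i, v_a, v_b)
```

Here the title is taken from V-A, the introduction from V-I, and the body is forwarded as a conflict. The policy is fixed in the code, which is exactly the inflexibility that the teacher and student example points out.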

In the environment of Munson & Dewan it is therefore possible to define merge rules for structured objects. The merge rules apply to the constituents of objects and define which changes of which co-worker should be selected in a certain situation, or whether the changes should be reported as a conflict to the co-workers. If a constituent is a reference to another object, the merge procedure will be recursively applied to the referenced object.
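To make the idea of per-constituent merge rules concrete, the following sketch applies a rule table to nested structures. It is not the rule language of [12]: the policy names "prefer_a", "prefer_b", and "conflict", the dotted-path keys, and the use of nested dictionaries in place of object references are all assumptions made for illustration.

```python
def merge_with_rules(a, b, rules, default="conflict", path=""):
    """Merge two structured versions under per-constituent rules.

    `a` and `b` are (possibly nested) dicts; `rules` maps a dotted path such
    as "body.p2" to a policy. Returns (merged, conflicts), where conflicts
    is a list of dotted paths left for the co-workers to resolve.
    """
    merged, conflicts = {}, []
    for key in sorted(set(a) | set(b)):
        here = f"{path}.{key}" if path else key
        v_a, v_b = a.get(key), b.get(key)
        if isinstance(v_a, dict) and isinstance(v_b, dict):
            # Structured constituent: apply the procedure recursively.
            sub, sub_conflicts = merge_with_rules(v_a, v_b, rules, default, here)
            merged[key] = sub
            conflicts.extend(sub_conflicts)
            continue
        policy = rules.get(here, default)
        if v_a == v_b or policy == "prefer_a":
            chosen = v_a
        elif policy == "prefer_b":
            chosen = v_b
        else:
            conflicts.append(here)
            continue
        if chosen is not None:
            merged[key] = chosen
    return merged, conflicts


teacher = {"title": "Essay", "grade": "B+",
           "body": {"p1": "x", "p2": "teacher note"}}
student = {"title": "My Essay", "grade": "A",
           "body": {"p1": "x", "p2": "student text"}}
rules = {"grade": "prefer_a", "title": "prefer_b"}
result, open_conflicts = merge_with_rules(teacher, student, rules)
```

With these rules the teacher's grade always wins, the student's title is accepted, and differing body paragraphs are reported rather than merged silently. Changing the rule table changes the policy without touching the merge procedure.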

As for the domain of hypertext, little work has been done on merging versions of hypertext networks, although several version models for hypertext applications have been proposed that also allow for the development of parallel versions [9, 16, 7]. Only systems that store the content of atomic nodes in files, and maintain versions of that content with standard file versioning packages [3], also offer standard merge software such as rcsmerge [20]. [6] and [7] propose considering all versions to be merged as alternatives and show how this approach has been used in the SEPIA system [19], but this results in a hypertext network consisting of alternative versions rather than merged versions. We are not aware of any work considering the merging of versions of hypertext networks.

3 Issues for Merging Hypertext Networks

Allowing parallel versions or alternatives, either due to alternative designs or parallel work, occasionally requires the ability to merge these parallel versions into one single consistent version. As for hypertext, not only do we have to consider the merging of atomic nodes, it is also necessary to merge links and nested composites, the latter possibly containing complex networks. So, in our opinion, a major challenge is to support merging of structured hypertext networks. This raises the following issues:

r (1) We agree with Munson & Dewan [12] that it should be possible to flexibly determine and select a merge procedure. Applied to hypertext applications, this translates into the need for the flexible determination of a merge procedure based on the hypertext application data model and the (group-)work situation.

r (2) We anticipate that it may always be possible for merge conflicts to occur that cannot be resolved automatically. Therefore interactive merge tools have to be developed for hypertext applications.

r (3) As stated in [15], flexible diff-ing of text files is important, in particular, to support manual merge decisions on textual hypertext content. For this reason flexible diff-ing should always include a suitable presentation of the differences to the user. We propose that every interactive merging environment for hypertext applications should provide such techniques to enable the interactive comparison of text content of atomic hypertext node versions.

r (4) Hypertext also comprises non-textual data types and includes additional media types (hypermedia). If available, comparison techniques for other media (e.g., vector graphics, video) should be integrated into a merge environment for hypermedia. Following the previous claim, they should be as flexible as possible to serve different application needs.

r (5) In particular, support for structural merging of hypertext networks has to be provided. This includes flexible comparison techniques for hypertext graphs and adaptable tools that communicate the relevant differences to the users in a comprehensible way to support merge decisions.

r (6) Interactive merge tools not only have to support flexible diff-ing for multimedia content and hypertext network structure, but should also give the opportunity to specify the merge result interactively.

Considering these six items, we focussed our work on the development of interactive merge tools (issues 2 - 6) for merging hypertext network versions. As noted above, it is possible for merge conflicts to occur that cannot be resolved automatically (issue 2), so interactive merge tools for hypermedia will be required. Additionally, we are not aware of any standard merge rules that are common for merging versions of hypertext networks (cf. Section 2). By implementing and using interactive merge tools to investigate different merge situations and perform the merge manually, we hope to define different sets of merge rules (merge styles) depending on different hypertext data models, hypertext applications, and group-work situations (issue 1) while at the same time improving the merge tools we developed.

Looking at the merge tools, we are not focussing on the development of enhanced or new flexible comparison techniques. For atomic content, we expect to integrate existing mechanisms developed by others. For structural comparison, we can benefit from our VerSE hypertext version environment [9]: In VerSE both versioned hypertext objects and their individual versions carry an identifier, which makes comparing composites a simple task.
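Why identifiers simplify structural comparison can be seen in a small sketch. It assumes, hypothetically, that a composite version can be summarized as a mapping from object identifier to version identifier; this representation and the function name are illustrative and do not reflect VerSE's actual data structures.

```python
def compare_composites(left, right):
    """Compare two composite versions given as {object_id: version_id}."""
    left_ids, right_ids = set(left), set(right)
    shared = left_ids & right_ids
    return {
        # Objects present in only one of the two composite versions.
        "only_left": sorted(left_ids - right_ids),
        "only_right": sorted(right_ids - left_ids),
        # Shared objects, split by whether the same version is contained.
        "same_version": sorted(o for o in shared if left[o] == right[o]),
        "different_version": sorted(o for o in shared if left[o] != right[o]),
    }


v_a = {"node1": "v1", "node2": "v3", "link1": "v1"}
v_b = {"node1": "v1", "node2": "v4", "node3": "v1"}
report = compare_composites(v_a, v_b)
```

Because identity is given by the identifiers, no content matching is needed to decide which constituents correspond; only constituents listed under different_version require a comparison of their content.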

Rather, we are focussing on the interface of the merge tools. The interfaces of merge tools for merging hypertext networks play a key role. The interface has to adequately visualize the different kinds of differences between the versions that have been defined by flexibly-defined comparison criteria. In addition, it must support simple ways to specify merge decisions.

The next section introduces our work on interfaces for merging hypertext networks.

4 Interfaces for Merging Hypertext Networks

Before introducing three interfaces for merging versions, we will first describe the setting in which our research takes place.

4.1 VerSE: A Version Support Environment

To have a test-bed for investigating merging for hypertext networks, we are currently adapting the approach of Munson & Dewan [12], which has been designed for and implemented in the Suite system [1], to our VerSE version support environment. VerSE is based on ideas of the HB3 version support framework [9] and the CoVer version model [5, 7]. It is based on the Smalltalk Frame Kit (SFK) [2].

SFK is an extension to Smalltalk-80 which provides typed frames along with the ability to attach declarative descriptions as facets to the frame properties (slots). In particular, SFK provides inverse slots which are very useful to consistently implement highly connected structures like hypertext networks. SFK has been used to implement the SEPIA [19] and DOLPHIN [18] cooperative hypertext systems and was also the basis for the CoVer version server that has been used to integrate version support into SEPIA. In SFK, merge rules according to [12] are applied to frame slots. However, as mentioned in Section 3, there are no common merge rules for hypertext applications. So for now, all differences are reported as conflicts to the user.

For our initial approach on merging hypertext networks the basic assumption is that each hypertext network can be described as a composite node. A composite contains a set of references to hypertext objects: atomic nodes, binary links, or other composites. Consequently the network versions can be described as versions of the respective composite node. A composite version contains a set of hypertext versions: atomic node versions, binary link versions or other composite versions.

To begin our experiments on merging versions of hypertext networks we made the following assumption: each composite version of a network (transitively) contains at most one version of each of its versioned constituent objects. This assumption is a consistency constraint maintained by many version models that assume the application is interested in at most one version of each constituent object of a complex system [16, 17, 4]. With this constraint it is possible to express all references to the versioned objects of a composite on a more abstract level: given a certain composite version, the appropriate version of each of its versioned constituent objects is automatically and uniquely defined.
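The consistency constraint can be checked mechanically. The sketch below assumes a hypothetical nested-dictionary representation of a composite version, with the keys "object", "version", and "parts" chosen purely for illustration, and reports every versioned object that occurs transitively in more than one version.

```python
def find_violations(composite):
    """Return {object_id: [version_ids]} for objects contained in more than
    one version anywhere inside `composite` (searched transitively)."""
    seen = {}
    stack = [composite]
    while stack:
        node = stack.pop()
        seen.setdefault(node["object"], set()).add(node["version"])
        stack.extend(node.get("parts", []))  # descend into nested composites
    return {obj: sorted(versions)
            for obj, versions in seen.items() if len(versions) > 1}


consistent = {"object": "net", "version": "v3", "parts": [
    {"object": "n1", "version": "v1"},
    {"object": "c1", "version": "v2", "parts": [
        {"object": "n2", "version": "v1"}]},
]}
inconsistent = {"object": "net", "version": "v3", "parts": [
    {"object": "n1", "version": "v1"},
    {"object": "c1", "version": "v2", "parts": [
        {"object": "n1", "version": "v2"}]},
]}
```

When the check returns an empty result, the version of each constituent is uniquely determined by the composite version, which is the property the assumption is meant to guarantee.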

In contrast, many hypertext version models allow different versions of the same versioned hypertext object to be contained in a single composite version [9, 7]. This implies that a composite version to be merged may already consist of alternative constituents. We will not elaborate the details on how to merge these composite versions in this position paper, but rather stick to the assumption that each composite version transitively contains at most one version of each other versioned hypertext object available. However, we will take advantage of the possibility to represent merge conflicts as alternatives to construct powerful merge tools (cf. Figures 4 and 5 in Section 4.3 and Section 4.4, respectively).

At the moment, we envision three different merge tools supporting the interactive merging of hypertext versions: The List-Merger, the Graph-Unification-Merger and the Graph-Comparison-Merger.

4.2 The List-Merger

The List-Merger is a basic tool to merge versioned frames in general. It does so by showing the differences between all attributes of two or more versions to be merged. It allows the user to select for each attribute one of the alternative values that will serve as the merged value of the attribute. Alternatively, a new value can be specified for the merged value of the attribute. The List-Merger is intended to support structural merging at an attribute level, as proposed by Munson & Dewan. It can be characterized as frame- or object-centered. As we will see by the example used to explain its interface, it can also be used to merge hypertext objects and hypertext networks.

The design for the List-Merger is as follows (cf. Figures 1 (a) - (c)):





Given a versioned hypertext object, the version graph of the hypertext object to be merged is displayed in the upper window of the List-Merger. In the version graph each version may use arrows to point to its predecessor version(s). The users may select two or more versions of the object that should be merged. The list views underneath the version graph will be configured depending on the number of versions chosen by the users.

The first column (frame-column) shows the identifiers of all versioned frames that belong to the contents of the selected versions of the versioned hypertext object and the versioned hypertext object itself. As mentioned above, to merge two or more hypertext network versions, we assume that every hypertext network version contains at most one version of each versioned constituent. By mentioning a versioned constituent in a given composite version, the version of the versioned constituent is uniquely defined. Therefore, the frame-column shows the versioned frames contained in at least one of the versions to be merged and not the individual versions of the constituents themselves.

In addition, the definition of content varies for the different hypertext objects: an atomic hypertext object contains text only, so it does not consist of other hypertext objects and the hypertext object to be merged will be the only entry in the list (cf. Figure 1 (a)). A link has the start and destination object as its `content', and a composite has its component hypertext objects as content (cf. Figure 1 (b)). The expansion of content can be applied recursively.

The next column (slot-column) shows all attributes/slots defined for a selected object. Specifying different so-called filters, it should be possible to show just certain slots the user is interested in. For example, the creation date of versions by definition is not subject to the merging process and does not need to be shown. Or the user may just be interested in seeing the values of those slots that are actually different with respect to the versions selected.

To the right of the slot-column is a number of columns (version-columns), one for each selected version. Whenever the user selects a slot from the slot-column, each version-column displays the value of the slot as of the version the column represents. In this way, the user can compare the different values for the selected slot. For atomic values like text, flexible diff-ing mechanisms should be available to high-light the differences (cf. Figure 1 (a)). For composite versions the content attribute value is a set of references to the versioned objects it contains (cf. Figure 1 (b)). For each referenced versioned object the actual version can be investigated individually using the List-Merger: the respective versioned object has to be selected from the frame-column and the values of its versions with respect to the composite versions to be merged will be shown in the version-columns (cf. Figure 1 (c)). If, for some version of a composite, there is no version of a particular constituent hypertext object, the corresponding version-column will indicate that no value for the attribute exists for that version of the composite.

The rightmost column (merge-column) contains the merge decisions made by the users. The users may either directly choose a value of a certain version by adding the version identifier to the column for the selected slot, or they can compose a new merged value, for example for text. When the merge column is filled in for every slot for every frame of the content that differs for the selected versions, the merge is completed.
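The column layout described above can be summarized in a small Python sketch. It is not the SFK-based implementation; the dictionary representation, the `MISSING` marker and the function names are assumptions made for illustration. Each version is modelled as a mapping from slot names to values, a filter is a tuple of slot names to ignore, and the merge-column is a mapping from slot names to the chosen or newly composed value.

```python
MISSING = object()  # marks "no value for this slot in this version"


def slot_table(versions):
    """versions: {version_id: {slot: value}} -> {slot: {version_id: value}}."""
    slots = sorted({slot for frame in versions.values() for slot in frame})
    return {
        slot: {vid: frame.get(slot, MISSING) for vid, frame in versions.items()}
        for slot in slots
    }


def differing_slots(versions, ignored=()):
    """Slots whose values are not identical across all selected versions."""
    result = []
    for slot, row in slot_table(versions).items():
        if slot in ignored:
            continue
        values = list(row.values())
        if any(value != values[0] for value in values[1:]):
            result.append(slot)
    return result


def merge_is_complete(versions, merge_column, ignored=()):
    """True when every differing slot has an entry in the merge-column."""
    return all(slot in merge_column for slot in differing_slots(versions, ignored))
```

The sketch handles a single frame; a full merge would repeat the completeness check for every frame listed in the frame-column.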

Although this interface allows a comparison of all slots of the versions to be merged, we expect it to be mainly used for simple merge tasks. One problem is that only one slot value can be examined at a time, and, depending on the values for other slots, the user might have made different merge decisions. As the example of merging two composite versions, i.e. two versions of a hypertext network, shows, investigation of component versions requires the user to temporarily give up the display of the network versions in order to show the component versions. For example, in Figure 1 (b) the user might get interested in investigating the version(s) of the component N1, which is present in both versions of the composite C1. The user may then select the node N1 in the frame-column, as shown in Figure 1 (c). Since the List-Merger is object-centered, the display of the composite is given up to display its component and the user risks the loss of context. Of course, the user may revise decisions in the merge-column, but it is difficult to get an overview of all options available using the List-Merger. This is also true due to another problem with this tool, namely that the representation of the hypertext network versions is totally different from the usual interface used for creating and manipulating hypertext networks.

Assuming that many hypertext tools support a graphical view of the hypertext network structure, we plan to overcome these problems with special graphical browsers for merging hypertext networks. At the moment, we envision two graphical browsers for merging: the Graph-Unification-Merger and the Graph-Comparison-Merger.

4.3 The Graph-Unification-Merger

The Graph-Unification-Merger is basically a graphical browser tuned to support the merging of hypertext networks. Like the List-Merger, given two or more versions of a hypertext network (represented by a composite) to be merged, it determines the union of all versioned objects of which a version is directly referenced in the content of one of the composite versions. Whereas the List-Merger computes the union recursively, the Graph-Unification-Merger considers direct constituents only. While the List-Merger adds all constituent versioned objects to the frame-column, the Graph-Unification-Merger presents the unified content of the composite versions as a graph. The union of those versions of all versioned objects that are contained in all composite versions to be merged may be laid out using automatic visualization techniques, such as those that have recently been developed at IPSI [10]. Thereby, spatial information of the individual component versions may be ignored. Therefore, the Graph-Unification-Merger is particularly useful when the spatial attributes of hypertext network versions are not that important.

If the union of the composite version content contains more than one version for a versioned component, then these versions will be displayed as alternatives. Alternatives may either be displayed by piling the versions on each other (cf. Figure 2) or by indicating them using tabs (cf. Figure 3). Piling of alternatives has been implemented in the versioned SEPIA system [6] and was later replaced by the tab display to deal with alternatives.



Figure 4 shows a sketch of the Graph-Unification-Merger envisioned using tabs to indicate alternatives. Using the version browser in the upper window, the user can specify the composite versions to be merged. After determining the composites to be merged (e.g., the versions V-A, V-B and V-C of the composite C1 in Figure 4), the unification of their content will be displayed in the lower window as a graph with alternatives. Whenever one of the composite versions to be merged is selected in the upper window (e.g., version V-C in Figure 4), the Graph-Unification-Merger high-lights all constituent versions of the composite version in the lower window. Versions not belonging to this composite version will not be high-lighted. In this way, the user can examine the structure of the different hypertext network versions.
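The unification and high-lighting behaviour can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the envisioned tool: each composite version is modelled as a mapping from object ids to the single version id it contains (direct constituents only), and a component is treated as having alternatives when the union holds more than one of its versions.

```python
def unify_contents(composite_versions):
    """composite_versions: {composite_version_id: {object_id: version_id}}.

    Returns {object_id: set of version ids found in any composite version}.
    """
    unified = {}
    for content in composite_versions.values():
        for object_id, version_id in content.items():
            unified.setdefault(object_id, set()).add(version_id)
    return unified


def alternatives(unified):
    """Objects that occur in more than one version and need a pile or tabs."""
    return {obj: vers for obj, vers in unified.items() if len(vers) > 1}


def highlighted(composite_versions, selected):
    """(object_id, version_id) pairs to high-light for the selected composite."""
    return set(composite_versions[selected].items())
```

With hypothetical contents for V-A, V-B and V-C of C1, selecting V-C high-lights exactly the pairs stored for V-C, while every other version in the unified graph stays unmarked.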

We are considering two options to specify the merge result. The first is to specify the merge by eliminating alternatives or creating new (versions of versioned) objects in the lower window. When only one version for each constituent remains, we consider the merge completed. Another idea is to provide a default empty alternative for each constituent version that is assigned to the merged composite version and also visualized by a tab. The merge result can then be high-lighted and compared with the original versions. By assigning constituent versions to the merge result versions, eliminating a merge result version altogether, or creating new merge result versions, the user may then interactively specify the merge of the composite versions. When all empty merge versions have been assigned a specific version, we consider the merge complete.
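The second option, with an empty merge alternative per constituent, might look as follows. This Python sketch is only one possible reading of the idea; the `EMPTY` marker and all function names are hypothetical, and `unified` is assumed to map each object id to the set of version ids offered as alternatives.

```python
EMPTY = None  # placeholder for a merge alternative that is not yet assigned


def new_merge_result(unified):
    """Start with one empty merge alternative per constituent."""
    return {object_id: EMPTY for object_id in unified}


def assign(result, unified, object_id, version_id):
    """Assign one of the offered versions to the merge result."""
    if version_id not in unified[object_id]:
        raise ValueError(f"{version_id!r} is not an alternative of {object_id!r}")
    return {**result, object_id: version_id}


def drop(result, object_id):
    """Eliminate the merge result version of a constituent altogether."""
    return {o: v for o, v in result.items() if o != object_id}


def is_complete(result):
    """The merge is complete when no empty merge alternative is left."""
    return all(version is not EMPTY for version in result.values())
```

Creating a new version during the merge would correspond to adding that version to `unified` before calling `assign`.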

The lower window of the Graph-Unification-Merger is in particular useful to inspect the structure of the composite versions content. To inspect further attributes of the composite versions, e.g. the title, a simplification of the List-Merger could be added to the Graph-Unification-Merger.

4.4 The Graph-Comparison-Merger

If the spatial information of the hypertext network components is important [11], we expect it is more suitable to open separate browsers for each composite version to be merged. But the separate browsers should then be linked in such a way, that whenever the user selects a component version in one browser, all other browsers of composite versions that contain the same version or other versions of the same versioned component will high-light these. Two different types of high-lighting will be required to differentiate between the presence of the same version and the presence of another version of the same versioned object. In this way, the user can examine the different alternatives and positions of component versions.
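The linked high-lighting can be expressed compactly. The Python sketch below is an assumption-laden illustration, not a description of an existing tool: each browser is modelled as a mapping from object ids to the version id shown in it, and the labels "same" and "other" stand for the two types of high-lighting.

```python
def linked_highlights(selected, browsers):
    """Decide how each browser should high-light the selected component.

    selected: (object_id, version_id) chosen by the user in one browser.
    browsers: {browser_name: {object_id: version_id}}.
    Returns {browser_name: "same" | "other"}; browsers that contain no
    version of the object are omitted.
    """
    object_id, version_id = selected
    marks = {}
    for name, content in browsers.items():
        if object_id not in content:
            continue
        marks[name] = "same" if content[object_id] == version_id else "other"
    return marks
```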

To ease the comparison, the browsers for the composite versions should either be displayed next to each other (cf. Figure 5(a)), like the version-columns of the List-Merger (cf. also [14] for the advantage of using a column-based interface for comparison) or as overlays (cf. Figure 5(b)). An extra browser or overlay which represents the merged composite version can be used to copy versions from the other browsers to construct the merge result.



Figure 5 (a) shows the versions of a composite in separate windows and an extra window for the merge result. The gray component version has been moved from the right group in the upper version to the left group in the lower version of the composite.

Figure 5 (b) shows three overlays for the same situation as shown in Figure 5 (a), one overlay for each composite version and another for the merge result. The positions of the overlays are adjusted with respect to the 0,0 position of the individual composite versions. Using a version browser or tabs (indicating the composite versions to be merged and the merge result versions) added to the Graph-Comparison-Merger, the user may move one composite version to the front. In Figure 5 (b), version V-B of C1 is moved to the front (gray nodes in Figure 5 (b)). The user may then select a component version, which will be high-lighted by the Graph-Comparison-Merger (gray rectangle with bold edges in Figure 5 (b)). In addition, the Graph-Comparison-Merger uses two other kinds of high-lighting to indicate occurrences of the same or another version of the selected component in other composite versions. Using appropriate color coding, the Merger can also visualize to which version of the composite the different occurrences belong. For example in Figure 5 (b), another version of the component is present in version V-A (light-gray in Figure 5 (b)), but no version of the component is included in the merge version (the tab of the merge version is still white). So, pushing specific overlays to the front and selecting a component version will cause the Graph-Comparison-Merger to show the occurrence of other versions of the same versioned component. In this way, the user can examine and compare the different positions of the hypertext network elements.

Again, these tools are suitable to investigate the structure and spatial arrangement of the network content. It may be useful to add functionality as offered by the List-Merger to merge non-graphical attributes of the network versions.

5 Conclusions and Future Work

In this position paper, we raise six issues on merging hypertext networks. Adopting the position of [12] we propose to provide a flexible merge environment for hypertext applications. The main claim is that the resolution of merge conflicts has to be interactively supported by appropriate merge tools, in particular for merging hypertext networks, but also for the (multimedia) content. We propose that merging of structured artifacts such as hypertext network versions should be supported by tools that visualize the structure of the versions explicitly. Thereby the presentation of differences to the user for comparing the versions to be merged and the functionality offered to specify the merge decisions are very important.

Our current work concentrates on the implementation of the different interfaces. We are aiming at both single user and synchronous multi-user versions of the tools. The latter could then be used by a distributed group of users in a kind of merge session.

Our research agenda for the future includes the following questions:

• Are the tools anticipated applicable in a real-world context?

• Which tool is better in which situation?

• Is a fixed number of tools sufficient?

• How to support the merging of composites that already include alternative constituents?

• What are good hypertext network merging styles?

References

[1] Dewan, P., & Choudhary, R. A high-level and flexible framework for implementing multi-user user interfaces. ACM Transactions on Information Systems 10, 4 (October 1992), 345-380.

[2] Fischer, D.H. & Rostek, L. SFK: A Smalltalk Frame Kit - Concepts and Use. GMD-IPSI, Darmstadt, Germany, January 1993.

[3] Garg, P.K. & Scacchi, W. On Designing Intelligent Hypertext Systems for Information Management in Software Engineering. In Hypertext '87 Papers, pages 409-432, Chapel Hill, N.C., November 1987.

[4] Goldstein, I. & Bobrow, D. A Layered Approach to Software Design. In: Barstow, D., Shrobe, H. & Sandewall, E. (Eds.), Interactive Programming Environments, pages 387-413. Mc Graw Hill, 1984.

[5] Haake, A. CoVer: A Contextual Version Server for Hypertext Applications. In: Lucarella, D., Nanard, J., Nanard, M. & Paolini, P. (Eds.), Proceedings of the 4-th ACM Conference on Hypertext, Milano, Italy, November 30 - December 4, 1992, ACM press, pages 43-52.

[6] Haake, A. & Haake, J. Take CoVer: Exploiting Version Support in Cooperative Systems. In: Human Factors in Computing Systems, INTERCHI'93 Conference Proceedings, Amsterdam, 24 - 29 April, 1993, ACM Press, pages 406-413

[7] Haake, A. Under CoVer: The Implementation of a Contextual Version Server for Hypertext Applications. In ECHT94 Proceedings, ACM European Conference on Hypermedia Technology, Edinburgh, September 18 - 23, 1994, pages 81-93

[8] Haake, A. & Hicks, D. VerSE: Towards Hypertext Versioning Styles. Accepted for Hypertext '96.

[9] Hicks, D. A version control architecture for advanced hypermedia environments. Dissertation. Department of Computer Science, Texas A&M University, College Station, TX, 1993.

[10] Kamps, T., Reichenberger, K. A dialogue approach to graphical information access. In: Schuler, W., Hannemann, J., and Streitz, N. (Eds.), Designing user-interfaces for hypermedia. Springer 1995, pages 141-155

[11] Marshall, C. & Shipman, F. Searching for the Missing Link: Discovering Implicit Structure in Spatial Hypertext. In Proceedings of the 5th ACM Conference on Hypertext, Seattle, WA, USA, Nov. 14 - 18, 1993, pages 217-230.

[12] Munson, J.P. & Dewan, P. A Flexible Object Merging Framework. In: Proceedings CSCW'94, October 1994, Chapel Hill, NC, ACM-Press, pages 231-241

[13] Nachbar, D. Spiff - A Program for Making Controlled Approximate Comparisons of Files. In: Proceedings of the Summer 1988 USENIX Conference (San Francisco, CA, June 21-24). USENIX Association, Berkeley, CA, 1988, pages 73-84.

[14] Neuwirth, C.M., Kaufer, D.S., Chandok, R. & Morris, J. Issues in the design of computer support for co-authoring and commenting. In Proceedings of the Conference on Computer Supported Cooperative Work (CSCW '90), Los Angeles, California, October 7-10, ACM press, 1990.

[15] Neuwirth, C.M., Chandok, R., Kaufer, D.S., Erion, P., Morris, J., & Miller, D. Flexible Diff-ing in a Collaborative Writing System. In Proceedings of the Fourth Conference on Computer Supported Cooperative Work (CSCW 92), ACM Press, pages 147-154.

[16] Østerbye, K. Structural and Cognitive Problems in Providing Version Control for Hypertext. In: Lucarella, D., Nanard, J., Nanard, M. & Paolini, P. (Eds.), Proceedings of the 4th ACM Conference on Hypertext, Milano, Italy, November 30-December 4, 1992, ACM-Press, pages 33-42.

[17] Prevelakis, V. Versioning Issues for Hypertext Systems. In: Tsichritzis, D. (Ed.), Object Management, pages 89-105. Atélier d' Impression de l' Université de Genève, July 1990.

[18] Streitz, N., Haake, J., Hannemann, J., Lemke, A., Schuler, W., Schütt, H. & Thüring, M. SEPIA: A Cooperative Hypermedia Authoring Environment. In: Lucarella, D., Nanard, J., Nanard, M. & Paolini, P. (Eds.), Proceedings of the 4-th ACM Conference on Hypertext (ECHT '92), Milano, Italy, November 30 - December 4, 1992, ACM press, pages 11-22.

[19] Streitz, N., Geissler, J., Haake, J. & Hol, J. DOLPHIN: Integrated meetings support across Liveboards, local and remote desktop environments. In: Proceedings of the Conference on Computer Supported Cooperative Work (CSCW'94), Chapel Hill, North Carolina, October 22-26, 1994, pages 345-358.

[20] Tichy, W.F. RCS - A System for Version Control. Software-Practice and Experience, 15(7):637-654, July 1985.

Accessibility of Versions as Means of
Handling Large Interdependent Object Spaces
in Corporate Planning Environments

Heiko Ludwig
University of Bamberg
Prof. Dr. W. Augsburger
e-mail: ludwig@buva.sowi.uni-bamberg.de
http://www.buva.sowi.uni-bamberg.de/mitarbeiter/ludwig.html

1 The Problem We Are Working on

In the PlanKo project the construction and use of integrated cooperation systems in organizations is investigated. There is a prototype which offers an integrated environment for planning and designing teams in organizations. Examples are teams working on investment budgeting or marketing strategy. They can use various groupware components for their cooperation such as Group Decision Support Systems (GDSS) or workflows. All user-generated data - the plan or design objects - are stored in a hypertext system to provide a common planning database [6].

In large scale organizations several planning groups having different goals manipulate the same semantic objects, e.g. advertising spendings in a newspaper. Imagine a scenario where a group uses a GDSS based on the idea of an issue-based information system (IBIS) as proposed by Kunz and Rittel [4] and implemented, for example, by Conklin [7]. In an IBIS, users state a problem in terms of issues. Positions represent possible resolutions for an issue, and arguments support or defeat a position. Applied to a planning scenario, the issues are the appropriate values for the plan items, and positions are alternative values for plan items.

In that case alternative propositions for a plan's detail are accessible by all participants of the GDSS-session and can be considered within the decision process. Propositions of others not working in the same GDSS session but on the same planning items will not be available.

It is obviously not sufficient to lock data objects for a workgroup in the way transaction oriented software systems would do to guarantee the consistency of the database. Different groups should be able to work on the same planning items. However, to be aware of each other's interests, everyone should be notified of the activities of others on the same objects. Most of today's groupware systems do not support this kind of awareness, because they are designed as single cooperation systems.

The question we are currently working on at the University of Bamberg is how to realize inter-group awareness in order to reach an inter-group decision. How can a GDSS be integrated with a database capable of versioning? What should one group let others know about the current state of its work, and what kind of information from others should a group take over into its own decision process?

2 Solution Space - Making Alternatives Visible

Groups should be able to design the appearance of their work for other people. Some groups might have a current state of their work which they want to publish to others. They are not willing to show their internal propositions. Another group may like to offer all versions of a plan item. There also may be a group that wants to offer historic versions of agreements on plan items.

Figure 1 - Presentation of group work to people outside

Fig.1 shows different behaviour of two groups working on the semantic object Plan Item. Rectangles left of the vertical line represent the part of a versioned object which is visible to the outside of the object. The rectangles in the left row of a group area represent alternative values for the object Plan Item which are visible to other groups or individuals working on the objects. The semantics of the edges is "<right> is an alternative value for <left>". It is not the semantics of a revision tree!

In our case, group A does not want to make its internal decision process visible to other groups working on the same object. The group has an alternative Current Agreement which has the value Proposition A2 at the moment. The bright color of the two rectangles represents the decision. Group B works in an open way. All alternatives can be accessed by others, in our case the group A.

3 Solution Space - Importing Alternatives

A strategy for integrating the propositions of other groups into the solution space of a group's own GDSS is necessary, too. An integration of all accessible information might lead to an information overload. Should only propositions other groups have agreed on be integrated? Which kind of transparency is useful for which decision situation?

Figure 2 - Integration of external versions in the decision process

4 Functional Dependent Objects

One group does not only affect the other by working on the same plan item, but by deciding a functionally determining item as well. Imagine a group of controllers who want to forecast the marketing spendings of an enterprise in the next period, which is a sum of the marketing spendings for the products A, B and C. (Functions are represented as ovals in the figures.) The product management group of product B discusses the production level of the next period and the marketing spendings necessary to reach this level. Even if the two groups do not decide on the same objects, they have to be aware of each other's decisions.

Figure 3 - Functional dependent objects

In which way could this happen? Should the function be evaluated to generate a new version of the total spendings for each combination of propositions for the items product A to C? To whom does the generated position belong? There may be many combinations of positions in a deep aggregation hierarchy.
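The growth in the number of combinations is easy to make concrete. The Python sketch below is illustrative only and does not describe the PlanKo implementation; the proposition values are invented, and the aggregation function defaults to the sum used in the controller example.

```python
from itertools import product


def generated_versions(propositions, func=sum):
    """Evaluate `func` for every combination of propositions.

    propositions: {item_name: [alternative values]}.
    Returns a list of (chosen values per item, function result) pairs.
    """
    names = sorted(propositions)
    results = []
    for combination in product(*(propositions[name] for name in names)):
        results.append((dict(zip(names, combination)), func(combination)))
    return results


# Hypothetical marketing spendings proposed for products A, B and C.
spendings = {"A": [10, 12], "B": [20], "C": [5, 6, 7]}
totals = generated_versions(spendings)
```

Here two, one and three propositions already yield 2 x 1 x 3 = 6 generated versions of the total; in a deep aggregation hierarchy the number multiplies at every level.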

5 The PlanKo Solution

To solve the problem, we extended the hypertext system with the concepts control group and version. There are two global states of a node: If there is just one value and no alternatives for an object, the object's state is called static. When several groups or individuals work on a node in our system, its state is called dynamic. A control group represents the "presence" of users in a node and may consist of a single individual or a whole group. Versions are generated by these control groups. A node's version is also a node. Its status can be static or dynamic. In this way hierarchical versioning can be implemented like group A in Fig.1. If a version is represented by a static object, it is just a usual value like the positions of Group B in Fig.1.

Functional relationships between objects are always evaluated and treated as versions of the dependent object. A special control group called generated is used for it.

Versions can be seen by peers, i.e. the value of a (top level) node can be seen by anybody, versions of nodes can be accessed by control groups on the same versioning level. In Fig.2 group B accesses the version current agreement of the object plan item. It may do this, because group A and B are both control groups of plan item. Group B cannot access proposition A1. An API provides these functions which can be used when designing and implementing GDSSs.
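One way to read this peer rule is sketched below in Python. The sketch is not the PlanKo API, whose functions are not specified here; the class, the function name and the assignment of propositions A1 and A2 to a nested control group "A" are assumptions made for illustration.

```python
class Node:
    """A hypertext node; its versions are nodes too, each owned by a control group."""

    def __init__(self, name, value=None):
        self.name = name
        self.value = value
        self.versions = []  # list of (control_group, Node) pairs

    def add_version(self, control_group, node):
        self.versions.append((control_group, node))
        return node

    @property
    def state(self):
        # "static": a single value and no alternatives; "dynamic" otherwise.
        return "dynamic" if self.versions else "static"

    def control_groups(self):
        return {group for group, _ in self.versions}


def accessible_versions(node, group):
    """Names of the versions of `node` that `group` may access (peers only)."""
    if group not in node.control_groups():
        return []
    return [version.name for _, version in node.versions]


plan_item = Node("Plan Item")
agreement = plan_item.add_version("A", Node("Current Agreement", "Proposition A2"))
agreement.add_version("A", Node("Proposition A1"))
agreement.add_version("A", Node("Proposition A2"))
plan_item.add_version("B", Node("Proposition B1"))
```

Under these assumptions group B sees Current Agreement, because A and B are both control groups of Plan Item, but it gets an empty list for the versions nested inside Current Agreement, which matches the situation in Fig.2.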

In the PlanKo system several tools use this mechanism. There is a version view which allows a user to browse versions of an object, but only if the user is part of a control group of the object. In the GDSS the participants can choose to work in an open mode like group B in Fig.1 and Fig.2 or in a closed mode like group A. Versions within other control groups may be included dynamically during the GDSS-session.

6 Related Work

There are multi-user hypertext systems like SEPIA ([2], [3]) which notify the "presence" of other individuals working on the same node. Users can even perceive operations of others at the same time they are performed. But usually users are individuals and cannot decide which of the alternatives they have generated are visible to others.

Several database systems like LINCKS [5] allow different parallel versions of an object. Usually they do not support the access and control of versions by a group. A generator of a version is an individual user.

Many other groupware tools such as group editors use versioning to a certain extent, but in almost every tool versions are owned by single users and do not meet the requirements of a decision making environment for complex object spaces.

7 Summary of the Research Interest

Corporate or infrastructure planning of complex, highly interdependent subjects always involves many groups and committees with - maybe conflicting - goals. These groups work simultaneously or in temporal sequence on common objects. Some results of the work of others may be useful for them, but groups usually do not want to bring all the internal material to the public. Groups should be able to design their output and input of versions and they need supporting software tools.

References

[1] Fuchs, L.; Pankoke-Babatz, U.; Prinz, W. Ereignismechanismen zur Unterstützung der Orientierung in Koordinationsprozessen. In: U. Hasenkamp (ed.): Einführung von CSCW- Systemen in Organisationen, D-CSCW '94, Vieweg, Braunschweig, 1994, pp. 31-45.

[2] Haake, A; Haake, J.M. Take CoVer: Exploiting version support in cooperative systems. In Proceedings of the InterCHI'93 (Amsterdam, Netherlands, April 26 - 29), pp. 406-413, ACM Press, New York, 1993.

[3] Haake, J. M.; Wilson B. Supporting Collaborative Writing of Hyperdocuments in SEPIA. In: Turner, J.; Kraut, R. (eds): Proceedings of the CSCW `92 - Sharing Perspectives (Toronto, Canada October 31 - November 4), pp. 138-146, ACM Press, New York, 1992.

[4] Kunz, W.; Rittel, H. Issues as Elements of Information Systems. Working Paper 131, University of California, Berkeley, 1970.

[5] Sjölin, M. LINCKS - a platform for computer supported cooperative work. In: Demo at the Conference on Computer-Supported Cooperative Work (Chapel Hill, USA, October 22-26), 1994.

[6] Wittke, M.; Mekinic, G.: Kooperierende Informationsräume: Ein Ansatz für verteilte Führungsinformationssysteme. In: Wolf Rauch, Franz Strohmeier, Harald Hiller, Christian Schlögl (eds.): Mehrwert von Information - Professionalisierung der Informationsarbeit. Proceedings des 4. Internationalen Symposiums für Informationswissenschaft (ISI '94, Graz, November 2 - 4), pp. 399-407, Konstanz, 1994.

[7] Yakemovic, B.C.K; Conklin, E.J. Report on a Development Project Use of an Issue-Based Information System. In: Proceedings of the CSCW `90 (Los Angeles, USA, October 7 - 10), pp. 105-118, ACM Press, New York, 1990.

Fine-Grained Version Control in COOP/Orm

Boris Magnusson
Dept of Computer Science, Lund University
Box 118, S-221 00 Lund, Sweden
e-mail: Boris@dna.lth.se

Abstract

In this paper, we discuss the role of version control in collaborative applications. We argue that many of the mechanisms needed in a collaborative editing environment can be built on top of a fine-grained version control mechanism, which also provides the integration mechanism. We outline how such a system has been designed in the COOP/Orm effort.

1 Background

When sharing information with other people performing simultaneous changes, maintaining versions of the shared information is a very critical issue. Not only do we have to answer questions like "What changes did I make to the program (since it is no longer working)?", or "What changes have I made since the last release?", but also "What has happened to the system during my vacation?" and "What changes did my colleague make to the paper since I last saw it?" Such questions regarding the history of a document are very frequent. One should expect them to be easy to answer and such facilities should be considered basic functionality. In a common environment, where people actually work together to develop technical systems such as software systems, structured documentation, etc. this is not so. We will first try to summarize the needs related to version control in collaborative environments and then describe how we in COOP/Orm have set out to try to solve these problems.

2 Problems and demands

2.1 Integrated representation.

Version control systems are traditionally implemented on top of an existing file system, as a secondary concern. The primary representation of a document is a snapshot of its development and several versions are realized through several files. The tools in the environment (editors, compilers, text processors etc.) do not understand versioning. This way of representing versions of documents as independent files thus limits the support that can be offered to the user. We have chosen a model where the representation of a document includes its history with all its versions. As a consequence all tools in our environment must understand this representation and version control.

2.2 Support for structured documents

Most interesting documents are structured in the sense that they are made up of small information units that are related to each other. A very common structure is the hierarchy, or tree. A book with chapters, sections and paragraphs containing text or figures is one example; programs with classes and procedures are another important example. Typically, hierarchical documents have a shared revision history in the sense that a version of a composite (e.g., the second printing of a book) determines the version of all its components and that a change to one of its components creates a new version of the composite. Traditional version control systems have no notion of hierarchical documents, and the user has to choose between creating large documents, which are version controlled as one unit but can be awkward to work with or share, and creating many small components and managing the versioning of the composite manually. Some recent version control systems such as CVS have acknowledged the need for a collection of documents with a common version history and offer versions of Unix directories as a mechanism for this. Again we have chosen to integrate support for structured documents into the fundamental storage model.

2.3 Versioned configurations

Most systems are actually configurations of smaller parts. One important reason for this is to make it possible to include a component into several different systems. Program libraries are a trivial example of this situation as well as the reuse of drawings and illustrations in papers and articles. Configurations are different from structured documents since a change to one component should not automatically generate new versions of all systems where it is included. If it does, which is often the case in simple check-in/check-out versioning systems, the possibility to track the history of the configurations is lost. The mechanism needed here is to have links which include information about the version of the destination document. A pre-requisite is that all versions of a document are readily available. Again, these mechanisms are included in our model.
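The idea of a link that carries the version of its destination can be illustrated with a minimal Python sketch. The names `Document` and `VersionedLink`, and the example library, are hypothetical and not taken from COOP/Orm; the sketch only shows that a configuration keeps referring to the component version it was built with when the component evolves.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A component whose versions are all kept and never modified."""
    name: str
    versions: dict = field(default_factory=dict)  # version id -> content

    def add_version(self, version_id, content):
        if version_id in self.versions:
            raise ValueError("existing versions are immutable")
        self.versions[version_id] = content


@dataclass
class VersionedLink:
    """A link that records which version of the destination it refers to."""
    target: Document
    version_id: str

    def resolve(self):
        return self.target.versions[self.version_id]


library = Document("stringlib")
library.add_version("1.0", "def upper(s): ...")

# The configuration pins the library at version 1.0.
system_a = {"stringlib": VersionedLink(library, "1.0")}

# A new library version does not implicitly change system_a.
library.add_version("2.0", "def upper(s, locale): ...")
pinned = system_a["stringlib"].resolve()
```

Moving `system_a` to version 2.0 would be an explicit change to the configuration itself, and therefore a new version of the configuration with its own place in the history.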

2.4 Alternatives and merge

In our opinion, pessimistic check-out ('locking') is too restrictive to use in a collaborative, distributed environment. The alternative is to allow parallel development of alternatives and then provide easy-to-use facilities for merging. Alternatives and merging also result from planned development and need to be supported anyway, but allowing optimistic creation of new alternatives will make this, as well as merging, even more frequent. Current version control systems focus on managing changes to information in single files. Changes to the structure of a large document, or system, are not supported and can be very awkward to perform in a multi-user environment.

2.5 Intuitive user interface

We have argued that facilities for seeing old versions, comparing versions, and indeed also merging documents are fundamental and frequently needed in a single user environment and are even more important in a collaborative environment. It should thus be simple and easy to perform these tasks. Our model of active diffs, interactively presenting differences between documents, provides a uniform and intuitive model. As we shall see below it is used during editing, when comparing versions, during a merge, and in synchronous editing mode. A graphical presentation of the version graph provides an overview of the development history of the document and enables the user to navigate and compare versions.

2.6 Distributed shared documents

Users may be geographically distributed and need to work together despite a network failure, although there might be some degradation of system capabilities. In order to support such a situation, the storage model must cope with the replication and later merge of documents. Our model, which considers the full history of a document to be the identifiable unit, rather than each individual version, makes it almost trivial to synchronize modified replicas because each modification is always just an addition of versions.

2.7 Modes of synchronization

Teams of users tend to work with different needs for synchronization during different phases of a project. There is thus a need for support for a variety of synchronization modes, from asynchronous (where each user works independently) to synchronous where users can see what others are doing with a fine grain of detail, both regarding time and detail of modifications. It should be possible to change between different modes of interaction without much effort.

2.8 User awareness

Users of a collaborative system need to be aware of what other users are doing, or have been doing in the system. The level of detail needed depends on the mode of synchronization the user is engaged in. In our model modifications are always represented through creation of a new version of the document. All users share the same version graph which provides the lowest level of user awareness. In synchronized mode each single modification can be shown through active diffs.

3 The COOP/Orm model for cooperative applications

An analysis like the one presented in the previous section has resulted in a model built on a fine-grained version control mechanism. We will here very briefly describe the model and discuss how it addresses the needs outlined above.

The COOP/Orm model is built on the idea of representing the complete development history of a document. Rather than having some versions of a document represented in full and other versions as deltas, all this information is stored in one document. The storage model also supports a hierarchical structure of information. When retrieving information, the client has to provide the tree address of a node in the hierarchy and a version identifier. The result is a representation of the information in the node in full and, if it is not the latest revision of a node in an alternative, also some deltas. With this information the client can recreate the version of the node that was asked for. When storing information, the client has to provide both the node in full and backward deltas to the version it started from.

The storage representation is compact since it stores the full version of an information node only in the last revision of an alternative of a node. Older versions of the node are represented by backward deltas. Nodes that are unchanged between versions are shared. Unchanged nodes are also shared between alternatives, which makes creation of an alternative a cheap operation (while in other models it means copying full documents). It should be noted that in this model a version of a document can never be changed; the only way to change a document is to create a new version.
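The combination of a full latest revision and backward deltas can be sketched in Python as follows. This is an illustration under simplifying assumptions, not the COOP/Orm storage format: it handles a single node with a linear history (one alternative), treats the content as a list of text lines, and computes the delta inside `store`, whereas in the model described above the client supplies it. The class and function names are hypothetical.

```python
from difflib import SequenceMatcher


def backward_delta(new_lines, old_lines):
    """Describe how to rebuild old_lines from new_lines."""
    delta = []
    matcher = SequenceMatcher(a=new_lines, b=old_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))              # reuse lines of the newer text
        else:
            delta.append(("insert", old_lines[j1:j2]))  # lines only the older text has
    return delta


def apply_delta(new_lines, delta):
    """Rebuild the older revision from the newer one and a backward delta."""
    old_lines = []
    for op in delta:
        if op[0] == "copy":
            old_lines.extend(new_lines[op[1]:op[2]])
        else:
            old_lines.extend(op[1])
    return old_lines


class NodeHistory:
    """Latest revision stored in full, older revisions as backward deltas."""

    def __init__(self, first_lines):
        self.latest = list(first_lines)
        self.deltas = []  # deltas[k] rebuilds revision k from revision k + 1

    def store(self, new_lines):
        self.deltas.append(backward_delta(new_lines, self.latest))
        self.latest = list(new_lines)

    def retrieve(self, revision):
        lines = self.latest
        for delta in reversed(self.deltas[revision:]):
            lines = apply_delta(lines, delta)
        return lines


history = NodeHistory(["title", "first draft"])
history.store(["title", "first draft", "conclusion"])
history.store(["title", "second draft", "conclusion"])
```

Retrieving the latest revision applies no deltas at all, which matches the expectation that the most recent state is the one asked for most often.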

3.1 Structural representation

The storage model supports managing the version history and the hierarchical structure but the information stored is not fixed in the model. This means that it can be used by a variety of editors for text, graphics, syntax trees etc. An editor provides information nodes in full and delta form. The same editor is responsible for combining information and deltas to recreate a version of an information node.

The storage model includes information about which versions exist and how they relate to each other. It therefore supports construction of a version graph. When comparing versions on the structural level it is immediately evident from the model which nodes are unchanged between the versions since they are shared. For each node in a document it is also possible to request a 'local version graph'. This is a presentation indicating in which versions of the document this particular node is actually changed. The version graph, which is dynamically updated, is the primary source of user awareness in our system.

Providing redundancy in a distributed environment is simple through replication. Each replica might, during a network failure, evolve through the addition of new versions. Later, merging is almost trivial since each replica has only additions of new versions, starting from some common state.
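Why synchronizing replicas is almost trivial can be seen in a small Python sketch. It assumes, beyond what the text states, that version identifiers are globally unique and that a replica can be modelled as a mapping from version identifier to a (parent, content) pair; the site names and contents are invented for the example.

```python
def merge_replicas(replica_a, replica_b):
    """Merge two replicas of one document history.

    Each replica maps a version id to (parent id, content). Versions are
    immutable, so a version id present in both replicas must carry the
    same data and the merge reduces to a set union.
    """
    merged = dict(replica_a)
    for version_id, record in replica_b.items():
        if version_id in merged and merged[version_id] != record:
            raise ValueError(f"version {version_id} differs between replicas")
        merged[version_id] = record
    return merged


common = {"v1": (None, "draft"), "v2": ("v1", "draft + intro")}

# During a network failure each site adds versions of its own.
site_a = {**common, "v3-a": ("v2", "draft + intro + model")}
site_b = {**common, "v3-b": ("v2", "draft + intro + results")}

merged = merge_replicas(site_a, site_b)
```

The merged history contains two alternatives branching from `v2`. Combining their contents into one version is a separate, user-guided merge; synchronizing the replicas themselves never loses or overwrites anything.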

3.2 Editors

The editors in our system are responsible for presenting, editing and storing the information in a particular node and are thus free to determine its storage format. The current text editor we are using supports the metaphor of 'active diffs' to present differences between two versions of a node. This is used to mark the changes made in the current editing session (comparing with the start version), to present differences between any two sequential versions, and to present differences during the merge of two alternatives. Merging is guided by a set of default rules used to calculate a suggested merged version and to identify conflicts. Two alternatives can be 'hypothetically merged' to explore the potential conflicts. This mechanism together with a mechanism to communicate changes from one user to another gives a very detailed level of user awareness. It is used in the synchronous mode of collaboration to present what other users are doing. It should be noted that each user is free to make whatever changes he likes since he is creating his own version. Although we have so far only worked with an editor for text, the model supports arbitrary information. We plan to add editors for abstract syntax trees and graphics.
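The default merge rules are not spelled out here, so the following Python sketch uses the conventional three-way rules as an assumption: a change made in only one alternative is accepted, and a node changed differently in both alternatives is reported as a conflict. The function name and the node-to-content mappings are hypothetical. Because the function only returns a suggestion and stores nothing, calling it corresponds to a 'hypothetical merge'.

```python
def suggest_merge(base, ours, theirs):
    """Return (suggested, conflicts) for three mappings of node -> content.

    base is the common ancestor version; ours and theirs are the two
    alternatives. A node missing from a mapping counts as deleted there.
    """
    suggested, conflicts = {}, []
    for node in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(node), ours.get(node), theirs.get(node)
        if o == t:        # both alternatives agree (or both deleted the node)
            chosen = o
        elif o == b:      # only "theirs" changed the node
            chosen = t
        elif t == b:      # only "ours" changed the node
            chosen = o
        else:             # changed differently in both: report a conflict
            conflicts.append(node)
            chosen = o    # keep "ours" as a placeholder in the suggestion
        if chosen is not None:
            suggested[node] = chosen
    return suggested, conflicts


base = {"intro": "A", "model": "B", "end": "C"}
ours = {"intro": "A2", "model": "B", "end": "C1"}
theirs = {"intro": "A", "model": "B2", "end": "C2"}

suggested, conflicts = suggest_merge(base, ours, theirs)
```

In this example the changes to `intro` and `model` are combined automatically, while `end` is flagged for the user to resolve.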

Editing on the structural level involves adding and removing information nodes and 'folders' (nodes that only contain other nodes, e.g., representing chapters). Changes and differences here are represented using the same 'active diff' approach used by the content editors. Diffs on this level also make it easy to identify changes in a tree-structured environment, and in particular to recognize sub-trees that are equal and thus of less interest. This mechanism provides an intermediate level of user awareness, between the version graph and the detailed information in a node.

4 Development and future work

The basic mechanisms of COOP/Orm have been designed and implemented and are now operational. The interaction model supporting both asynchronous and semi-synchronous collaboration is described in more detail in a paper at the last ECSCW [6]. The storage model has been described from a technical perspective in [2] and in [1]. Finally the text editor has been presented in [7]. The implementation work has now converged to a client-server architecture and the interactive update of active diffs in the synchronous collaboration mode is the next step. With this step completed the COOP/Orm collaborative editor can be used for evaluating the functionality in a multi user setting. Integration with traditional tools and environments which demand files with single version documents can be achieved through dumping complete versions to file. The COOP/Orm system can thus be used as a front-end in existing environments.

The server-server protocol has been designed and outlined in [4], but not yet implemented. The design focuses on distributed aspects and on collaborative aspects at the configuration management level.

Future development includes re-hosting our grammar interpreting syntax directed editor (SbyS) to this fine-grained version control environment. This will allow us to have control of the versioning of both the document and the defining grammar, and thus unique control of the consistency relations in evolving formal notations. We are thinking not only of traditional programming languages, but also 'application languages' which are much more likely to change rapidly.

Another aspect, perhaps more central to CSCW, is support for the work process. Here we plan to explore the possibilities of using formal descriptions defined by our syntax directed environment, and to achieve a unique level of flexibility and tailorability by allowing the users to modify and extend these descriptions. Using the version controlled grammar descriptions will, hopefully, make it possible to maintain some level of consistency in such a situation and at the same time give the users sufficient freedom so that they do not feel restrained. In such an environment one could envision that the users in the end will define the work process, rather than the managers, who too often know too little about the work involved. The consequence would be that the work process descriptions that emerge could be seen as a synthesis of experience rather than prescriptions prepared in advance.

References

[1] Ulf Asklund. Identifying Conflicts During Structural Merge. In [5].

[2] Boris Magnusson, Ulf Asklund, and Sten Minör. Fine-Grained Revision Control for Collaborative Software Development. In Proceedings of ACM SIGSOFT'93 - Symposium on the Foundations of Software Engineering, Los Angeles, California, 7-10 December 1993.

[3] Boris Magnusson and Ulf Asklund. XXX: A Model for Fine Grained Version Control of Configurations. Draft report, Lund University 1995.

[4] Boris Magnusson and Rachid Guerraoui. Support for Collaborative Object-Oriented Development. Draft report, Lund University 1995.

[5] Magnusson, Minör and Hedin (Eds.): Proceedings of Nordic Workshop on Programming Environments, Lund 1994, LU-CS-TR:94-127.

[6] Sten Minör and Boris Magnusson. A Model for Semi-(a)Synchronous Collaborative Editing. In Proceedings of the Third European Conference on Computer Supported Cooperative Work, Milano, Italy, 1993. Kluwer Academic Publishers.

[7] Torsten Olsson. Group Awareness Using Fine-Grained Revision Control. In [5].

Version Control in Microcosm

M. Melly*

and W. Hall
Department of Electronics and Computer Science,
University of Southampton,
Southampton, England, SO17 1BJ
Tel:+441703594492
Fax:+441703592865
e-mail: {mm91r, wh}@ecs.soton.ac.uk

1 Introduction

Version control is essential in a cooperative working environment, since different authors share the same information and sometimes update it in different ways for later integration. Version control for a hypermedia system goes beyond the usual version control for documents, as it adds the concept of version control for links between documents. A hypermedia version control system maintains the history of updates for a given hypermedia web, and also allows the user to experiment with different configurations of links for a given web.

Microcosm is an open hypermedia system, originally developed for a PC environment. Currently there is a Unix version of Microcosm that has tools to support co-authoring. The cooperative tools offered are:

- supply of awareness information between participants. The following cooperative tools were integrated into Microcosm and are available to users:

  * electronic mail

  * talk sessions

  * whiteboard sessions allowing users to make notes and discuss ideas

- for a whiteboard session, we supply the following functionalities:

  * which sessions are active

  * participants of each session

  * facilities to allow a new session to start

  * insertion of a person in a session

  * request to join a specific session

- version control

- provision of shared access to the linkbase (link database)

- after node editing and revision, supply of a tool to help co-authors create links between each other's nodes, using information retrieval techniques.

In this position paper, we will concentrate on the version control aspects of the cooperative version of Microcosm. More information about general aspects of Microcosm can be found in [1], [2]. Aspects of cooperative Microcosm can be found in [3], [4]. The original Microcosm already had a rudimentary version control mechanism, since it allowed the definition of different webs over the same set of nodes. This is possible because Microcosm keeps the link information separate from the nodes. The problem was that nodes and individual webs were not versioned objects. Intermedia [5], [6] had the same kind of version control mechanism. Some of the problems of implementing version control mechanisms in an open hypermedia system are described in [7]. In order to extend the version control concept in Microcosm to a cooperative environment, we developed version control mechanisms for links and nodes.

We did not create a mechanism to allow users to create versions of individual objects, as in systems such as CoVer [8] [9] [10] or HyperPro [11], since this mechanism, although powerful, can cause overheads and be confusing for the end user. An application in Microcosm is a collection of nodes and linkbases that are, from the user's point of view, correlated. This allows users to use versions to keep different states of the whole application. The user interface allows the creation of versions for the whole application in a temporal mode but, as will be explained later, users are free to navigate from one version to another if they wish. So, we offer an easy way for the users to create versions, and at the same time we allow the user to navigate through versions, so that no temporal constraints are imposed during browsing. All the individual objects of an application have individual versions, but an application has a very well defined set of versioned objects. Although one can say that this approach is not very open because it does not allow users to choose the individual versions of objects that form an application, it seems to us that it represents very well the meaning of versions: the evolution from an old, well defined state to a new one that represents changes made in relation to the old one. It also removes the need for mechanisms that prevent combinations of conflicting versions in one application.

PIE [12] is a system developed to assist version control in software engineering applications. It implements different layers of deltas for every different version. Basically there is a first layer, and new layers are created in order to accommodate sets of changes to one or more entities in the network. So a layer could be referenced as the set of changes that implemented feature Y. A fixed combination of layers can be stored as contexts. The concept of context, existent in PIE, does not exist in our implementation. We restricted the layered network model to represent only temporal relationships, so a new layer created constitutes a new version of an entire application, and an evolution of the layer below it. Users can jump from one layer to another during browsing. From one point of view, this browsing facility is more flexible than the contexts in PIE, since no rigid structure is imposed, but on the other hand, there is no mechanism, at the moment, to save a combination of objects belonging to different versions in a context. The functionalities that the user can achieve with our system are:

- the user can create a new version whenever he/she thinks it is convenient

- the user can list all available versions and get a brief description of them

- the user can swap from one version to another in a very simple way

- the concept of version can be abstracted even further by allowing users to choose on the fly which version of a link they want to follow. The user can swap from one layer to another without even noticing that he/she is navigating through completely different versions. Suppose that a vast hypermedia application was created and that different versions of this network were created to accommodate different levels of detail and complexity. Readers could jump from one layer (version) to another according to the amount of detail that they would like to see as a result of following a hypertext link. From the cooperative work point of view, supposing that co-authors made modifications in different versions, it is possible to follow links to different versions of the same node and compare the points of view of different authors. Using the awareness information mentioned in the introduction of this paper, co-authors can agree on a final version, for instance. Co-authors are also able to discuss the different versions of links created by the different co-authors.

- as various linkbases can be active at one time, more than one version of a given application can be active at the same time. Co-authors can use this functionality in conjunction with awareness mechanisms for merging versions. The same functionality can be achieved with the item above.

Although we offer mechanisms for co-authors to compare their versions of a given piece of work, they still have to put a lot of manual effort into merging their versions. This is a problem of many other systems, and in our opinion needs further attention from the developers of version control mechanisms.

2 Implementation Details

A network (or web or application) is a set of nodes and links that form a hypertext/hypermedia structure. An application, the name that we are going to use from now on, can have different versions. Each version represents a different set, in some way, of nodes and links. Application Version is the name we are going to use to refer to each one of these version numbers.

The following sections describe how we implemented version control on nodes and links.

2.1 Version Control on Links

Exodus [13], [14], [15], [16], [17], the database chosen for our linkbase, has some primitive mechanisms for controlling versions of the objects stored inside it. Basically, to create a new version of an object, it is necessary to freeze the object and then create a new version of this frozen object. A frozen object cannot be modified further, and a new Object Identifier (OID) is given to the new version of the frozen object. As our links are stored inside a database, this primitive mechanism for controlling versions was refined and used to control the versions of the links. Kasper Østerbye [11] discusses two interesting problems normally found in hypermedia systems. Freezing a node in a hypermedia system that stores the link information inside the node would not allow the creation of annotations. He proposes the concept of "jel" a node, in order to allow certain operations like annotations to be made in previous versions of a node. On the other hand, freezing a node in a hypermedia system that stores the links separately from the nodes allows the creation of links in a frozen version. He suggests a "frost radiation" mechanism that would radiate the information of a frozen node to some kinds of links. Microcosm can deal with both situations. When an application is frozen, all nodes and linkbases belonging to it are also frozen. On the other hand, if users want to make annotations to frozen nodes they can always do so in a separate linkbase. This completely separates the links belonging to the application from users' annotations. This mechanism is also interesting because each user can have their own linkbase of annotations.
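The freeze-then-derive discipline described at the start of this section can be sketched generically in Python. This is not the EXODUS interface: the class `StoredObject`, its methods and the OID counter are hypothetical and only model the two rules stated above, namely that a frozen object cannot be modified and that a new version receives a new OID.

```python
import itertools

_next_oid = itertools.count(1)  # stand-in for the database's OID allocation


class StoredObject:
    """An object that must be frozen before a new version can be derived."""

    def __init__(self, data):
        self.oid = next(_next_oid)
        self.data = dict(data)
        self.frozen = False

    def update(self, key, value):
        if self.frozen:
            raise PermissionError("a frozen object cannot be modified")
        self.data[key] = value

    def freeze(self):
        self.frozen = True

    def new_version(self):
        if not self.frozen:
            raise RuntimeError("freeze the object before deriving a version")
        return StoredObject(self.data)  # copy of the data under a fresh OID


link_v1 = StoredObject({"source": "intro", "destination": "chapter1"})
link_v1.freeze()
link_v2 = link_v1.new_version()
link_v2.update("destination", "chapter2")
```

After these calls the frozen link still points at `chapter1`, while its successor, which has a different OID, points at `chapter2`.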

One of the strongest characteristics of Microcosm is the separation between node information and link information. Following this philosophy, and using the resources available in the Unix environment, we decided to use RCS to implement version control for nodes. The linkbase stores information about the nodes, such as the node name, the collection of all source anchors, the collection of all destination anchors, etc. (an object of type Document). For each node reference stored in the linkbase we also store the collection of all its versions, and a reference to the application version. The next section gives an idea of how node versions are created.

2.2 Versions on Nodes

The Document Control System (DCS) is the module in Microcosm that is responsible for activating the appropriate viewers for the nodes (text, bitmap, video, etc.). As there is already a division of responsibilities between the Filter Management System (FMS) and the DCS, it was natural to delegate to the DCS the responsibility for creating new versions of the nodes, while the linkbase generates new versions of the links. The user is prompted to give a brief description to be used for all nodes frozen by RCS. The new RCS version number is calculated from the previous RCS version existing for each of the nodes. This new version number is inserted in the collection of versions for each of the nodes, in the slot allocated for the application version number just frozen. We assume that the group of authors working together have already organised their work environment as follows: all authors belong to the same group (in the Unix sense) and all of them have rights to read and create new versions of all files belonging to an application. In the same way, we assume that all users have the right to read and write all objects stored in the linkbase for the particular application being manipulated. Although we assume an organised environment, the DCS is able to detect when versions cannot be created for nodes, so it can send a warning message to the user, reporting the problem and expecting the user to fix it before continuing the version creation.

3 Version Recovery

To access the objects stored in the linkbase, we have to read, for each node, the complete collection of available versions for it, and compare the field application version number with the current application version number, and then get the right OID for the object document. It is the responsibility of the DCS to check out the required version of the node using the command 'co' from RCS. When the right RCS version of the requested node is checked out, the DCS loads it in the appropriate viewer. As updates are not allowed in old versions, the DCS disables the functionalities 'Start Link' and 'End Link'. Also, for every node requested, the DCS checks if the node being requested was part of the application when the application version number being used was created. The linkbase then accesses, in the same way as described above, the right version for the objects and it is able to present the correct buttons for the node/version being used, follow the right links, etc.
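The lookup described above can be sketched in Python as follows. The record layout, the field names and the example OIDs are assumptions made for illustration and do not reproduce the Microcosm linkbase schema. The RCS command is only constructed, not executed; `co -r<rev>` selects a revision, and the use of `-p` (print the revision to standard output instead of writing the working file) is likewise an assumption about how a viewer might be fed.

```python
from dataclasses import dataclass, field


@dataclass
class NodeRecord:
    """Linkbase entry for one node (illustrative field names)."""
    filename: str
    # application version -> (OID of the Document object, RCS revision)
    versions: dict = field(default_factory=dict)


def recover(record, app_version):
    """Return (oid, checkout command), or None if the node was not part
    of the application when app_version was created."""
    entry = record.versions.get(app_version)
    if entry is None:
        return None
    oid, rcs_revision = entry
    command = ["co", f"-r{rcs_revision}", "-p", record.filename]
    return oid, command


intro = NodeRecord("intro.txt", {1: (101, "1.1"), 2: (205, "1.2")})
appendix = NodeRecord("appendix.txt", {2: (206, "1.1")})

old_intro = recover(intro, 1)        # node existed in application version 1
old_appendix = recover(appendix, 1)  # node was added later, so nothing to load
```

Returning `None` for a node that did not yet exist corresponds to the check the DCS performs for every requested node.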

In our implementation the right to access a version is the same as that defined for the original application, see [4]. The right to access a node is the right to access the respective Unix file. Some systems, like CoVer [9], implement access rights for individual versions of the same original. This can be useful in cases where different versions are generated from one single ancestor version by different co-authors working individually. We assume in our system that co-authors working on the same application do not need this kind of access control, since the purpose is not destruction, and there is no need to hide information between contributors. Therefore we do not implement any facilities to change the access rights for individual versions of the same application. However, it is possible to implement this in our system if the users find it necessary.

4 Merging Versions

The merging algorithm in Neptune [18] works well if authors split their work and make modifications in independent parts. It is possible to simulate in Microcosm a merge model similar to the one in Neptune. In Microcosm, we can create two successor versions of a given version (say, versions 1.1.1 and 1.1.2, both originating from 1.1). Each of these descendant versions will have all nodes and links from version 1.1. If the authors know how to split the work, we can just suppress in each version the nodes that are not desired, and each author can make the necessary updates. Merging the two pieces of work is then just a matter of starting up the two linkbases: when a link is followed, both linkbases are consulted.
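A minimal Python sketch of this merge-by-activation idea follows. Linkbases are modelled as plain mappings from a source anchor to a list of destinations, which is a simplification chosen for the example; the function name and the sample webs are hypothetical.

```python
def follow_link(anchor, active_linkbases):
    """Collect the destinations offered by every active linkbase."""
    destinations = []
    for linkbase in active_linkbases:
        for destination in linkbase.get(anchor, []):
            if destination not in destinations:
                destinations.append(destination)
    return destinations


# Versions 1.1.1 and 1.1.2 both descend from 1.1; each author kept only
# the part of the web he or she is responsible for.
linkbase_1_1_1 = {"overview": ["chapter1"], "chapter1": ["figure1"]}
linkbase_1_1_2 = {"overview": ["chapter2"], "chapter2": ["table1"]}

merged_view = follow_link("overview", [linkbase_1_1_1, linkbase_1_1_2])
```

No merged linkbase is ever written: the combined web exists only as long as both linkbases are active, which is why this works best when the authors have split the work into independent parts.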

In Microcosm it is possible to have any number of branches emerging from a given version, and these versions can in the end be merged. But we do not offer any tools for the user to start up a new application by copying nodes and their respective links from completely different versions, as CoVer, HyperPro, Neptune and other systems do. This mechanism, although very interesting, can in our opinion cause many inconsistencies, and is perhaps difficult for the user to understand. It is possible to implement this functionality in Microcosm, but our priority when we developed this version control mechanism was to support version control for the hyperdocument as a whole.

In [19], Davis mentions that the separation between links and nodes is both a problem and an advantage of Microcosm and other systems with a similar architecture. Editing nodes can cause the removal or change of position of anchors, and lead to inconsistencies in the linkbase. At the moment, there is no automatic mechanism to prevent these problems: to keep nodes and links consistent for each version of an application, the offset updates in the linkbase have to be made manually. But even if all versions are consistent, the merging operations can generate new inconsistencies. As Hicks mentioned in [20], this problem is common to many hypermedia systems, and it depends on the viewer being used. In this first attempt to introduce version control in Microcosm, we did not create any mechanism to solve this problem, but we were happy at this stage with the functionalities obtained and described throughout this paper. We intend to continue our research in this field and find a solution for this problem in the near future. At the same time, we have to say that in Microcosm this problem is reduced by the use of generic and local links.

5 Conclusion

In general, version control is still a subject that needs a lot of research in all areas related to cooperative work. We have to worry not only about creating an efficient version control mechanism, but also about creating an easy-to-use interface for users. We all know that users give up using a tool that is too complicated or that imposes too much overhead. Also, we have to develop efficient tools to allow users to merge their work, since this activity normally needs too much intervention from users.

References

[1] A.Fountain, W.Hall, I.Heath, and H.Davis. Microcosm: an open model for hypermedia with dynamic linking. In The Proceedings of The European Conference on Hypertext '90, (France), Cambridge University Press, November 1990.

[2] W.Hall, I.Heath, G.Hill, H.Davis, and R.Wilkins. Microcosm: State of the art. Computer Science Technical Report CSTR 92-18, University of Southampton, Southampton, England, 1992.

[3] M.Melly. CSCW in an open hypermedia system. A thesis submitted for transfer to PhD registration, July 1993.

[4] M.Melly. An object oriented linkbase for Microcosm. April 1994.

[5] N.Yankelovich, B.J. Haan, N.Meyrowitz, and S.Drucker. Intermedia: The concept and the construction of a seamless information environment. IEEE Computer Magazine, vol.21, pp.81-96, January 1988.

[6] L.Garret, K.Smith, and N.Meyrowitz. Intermedia: Issues, strategies, and tactics in the design of a hypermedia document system. IEEE Computer Magazine, vol.21, pp.81-96, January 1988.

[7] H.Davis, W.Hall, I.Heath, G.Hill, and R.Wilkins. Towards an integrated information environment with open hypermedia systems. In Fourth ACM Conference on Hypertext (ECHT '92) (D.Lucarella, J.Nanard, N.M., and P.Paolini, eds.), (Milan, Italy), pp.181-190, December 1992.

[8] J.Haake and B.Wilson. Supporting collaborative writing of hyperdocuments in Sepia. In CSCW'92, (Toronto, Canada), pp.138-146, ACM, Oct 31-Nov 4 1992.

[9] A.Haake and J.Haake. Take CoVer: Exploiting version support in cooperative systems. In INTERCHI'93 - Human Factors in Computing Systems, (Amsterdam, The Netherlands), pp.406-413, ACM, April 23-29 1993.

[10] A.Haake. How to integrate state-oriented and task-oriented versioning. Report of the HT'93 Workshop on Hyperbase Systems. Department of Computer Science Technical Report No. TAMU-HRL 93-009, Hypertext Research Lab, Texas A&M University, College Station, TX, November 1993.

[11] K.Østerbye. Structural and cognitive problems in providing version control for hypertext. Hypertext'92, (Milano), pp.33-42, ACM, November 30-December 4 1992.

[12] I.Goldstein and D.Bobrow. A layered approach to software design. In Interactive Programming Environment (D.Barstow, ed.), pp.387-413, McGraw Hill, 1984.

[13] M.Zwilling. Using the Exodus Storage Manager, v3.0. University of Wisconsin - Madison, April 1993.

[14] Available from anonymous ftp Address:ftp.cs.wisc.edu - Wisconsin University, EXODUS Installation Manual.

[15] Available from anonymous ftp Address:ftp.cs.wisc.edu - Wisconsin University, Using the EXODUS Storage Manager.

[16] Available from anonymous ftp Address:ftp.cs.wisc.edu - Wisconsin University, The E Reference Manual.

[17] E.Hanson, T.Harvey, and M.Roth. Experiences in dbms implementation using an object oriented persistent programming language and a database toolkit. In OOPSLA'91, pp.314-328, 1991.

[18] N.Delisle and M.Schwartz. Contexts - a partitioning concept for hypertext. ACM Transactions on Office Information Systems, vol.5, pp.168-186, April 1987.

[19] H.Davis. Applying the Microcosm link service to very large document collection. Report of the HT'93 Workshop on Hyperbase Systems. Department of Computer Science Technical Report No. TAMU-HRL 93-009, Hypertext Research Lab, Texas A&M University, College Station, TX, November 1993.

[20] D.Hicks, J.Leggett, and J.Schnase. Version control in hypertext systems. Department of Computer Science Technical Report No. TAMU-HRL 91-004, Texas A&M University, College Station, TX, July 1993.

A Multiversion Database Model for CSCW Applications

Waldemar Wieczerzycki
Franco-Polish School of New Information and Communication Technologies
Mansfelda 4, 60-854 Poznan, POLAND
e-mail: wiecz@efp.poznan.pl
http://www.efp.poznan.pl/dbgroup/wieczerzycki

1 Introduction

The integration of database technology into the area of CSCW has been discussed and investigated for many years [1, 2, 3, 6, 10, 11, 12, 13, 15], and various solutions have been proposed for a number of the problems that database systems face here [16]. One of the key issues is the appropriate logical organization of database support for design applications, since it has to account for specific requirements including object versions, their consistency, and their management and manipulation within design transactions. This paper describes recent and current work on both flexible activity modeling and efficient transaction management in multiversion databases intended to support CSCW applications.

The approach is based on the fundamental perception that valuable database support for CSCW applications must reflect the specific way in which cooperating team members (designers) carry out their activities (projects) and make use of a database in such activities [9]. Large design projects are a specific instance of CSCW in which many people with different levels of expertise and competence experiment over considerable periods of time and eventually merge their results into (versions of) the final artifact. The following problems occur in such environments: How should the cooperative work be organized around the database? How should objects that are both composite and multiversion be managed? How should database consistency be defined and maintained? To solve these problems we propose a new organizational framework for CSCW databases which offers adequate support for complex database structures and long-duration database transactions, yet remains easy to use because only several simple types of transactions are involved. It thus differs considerably from previous proposals, which emphasize simple database structures at the expense of complex transactional concepts. The framework we present is based on versioning particular subsets of database objects, which are the units of object allocation to team members (designers).

The organization of the paper is as follows. In Section 2 a new approach to flexible CSCW activity modeling is presented. Section 3 briefly describes the database version approach, which is very relevant to the requirements of CSCW applications. Section 4 proposes new extensions of the database version approach necessary to support long-duration CSCW transactions. In Section 5 related work is briefly described. Finally, Section 6 concludes the paper.

2 Activity modeling

2.1 Data model requirements

Group work can be structured in terms of a range of activities occurring within various working environments which represent the humans, resources, projects and groups within organizations [4, 14]. Multiple activities might share the same resources or might be assigned relative priorities. Activities evolve dynamically over time due to changes in activity management policies. This evolution has three different aspects. First, new activities are created from scratch, which extends the list of activities performed by the enterprise. Second, some activities may become unnecessary or invalid, in which case they should be removed from the list of activities performed. Third, some activities are modified, which means that new versions of them are created which replace the previous (old) versions.

Both deletion and modification of activities may be temporary, i.e. valid for a particular time period. This reflects suspension of activities rather than deletion. Thus, removed activities and old versions of activities should in fact be frozen, but still available. In the near future they may become active again or they may be used as a basis for the creation of a new activity version.

There is another aspect of activity evolution. A single activity may sometimes be performed in slightly different ways in parallel, depending on triggering events, particular conditions, temporary results obtained, etc. In this case, instead of a single activity version, we deal with many variants of the same activity version.

Management policies are associated with managed objects. In the case of activity management, managed objects are defined by the components of activity definitions. A common formal notation for specifying communication structures and patterns within activities is SDL (Structure Definition Language - COSMOS Project) [5], in which users can configure their own communication. SDL distinguishes the following classes of managed objects: roles, message objects, rules and actions. Roles are allocated (assigned) to people involved in a particular action to describe their required contribution. Message objects are used to exchange information between different roles, and to model external information needed to run an activity or produced by an activity. Actions are the components of roles. They describe both the exchange and the manipulation of message objects by various roles. Finally, rules are used to group and trigger actions. Roles, messages, rules and actions, together with the description of the people and resources involved in the activity and their mutual relationships (assignments), are called the activity intent, as proposed in [17]. Because activities are versioned, the activity intent is also versioned. Thus, an activity intent corresponds to a single activity, while an activity intent version corresponds to a single activity version or variant.

Most activities, when performed, create objects which are the primary or secondary goals of the activity, i.e. its expected deliverables. For example, in the case of design activities, particular design artifacts are expected as the output of the activity. These objects are dynamically created and modified by activity actions in the way described by the activity schema. Because they exist outside of the intent, they will be called the activity extent.

Objects included in an activity extent, especially very complex ones, e.g. airplanes, are developed step by step through the creation of improved versions. This reflects the nature of CSCW applications, which are interactive processes composed of different stages. After many iterations of those stages, going ahead or rolling back, the final version of the product (artifact) is released. Thus, many versions of an activity extent are associated with a single version of the activity intent. Two types of versions of an activity extent are distinguished: variants and revisions. Variants reflect the concurrent nature of cooperative work, while revisions reflect its progressive nature. Both variants and revisions make it possible to roll back an activity if its succeeding stages are unsatisfactory.

2.2 Implementation

The database is composed of a database background and a set of database configurations. The database background models an enterprise (or a design team) without activities. It contains all the objects which may be used to start (instantiate) an activity: the resources and people available, roles and actions that are well defined but not yet performed, rules which sooner or later will be applied to activities, and messages in the standard forms typically used in the enterprise. In other words, the database background contains all the objects which are likely to be useful for starting most of the foreseen activities.

Every database configuration models exactly one activity. Initially, i.e. after so-called activity instantiation, it is composed of logical copies of some objects contained in the database background. Afterwards, as the activity is defined in detail, it is dynamically extended by new description objects local to the particular activity, reflecting the specifics of that activity. It is also extended by semantic relationships among the objects mentioned above, which order actions, assign them to particular roles, bind rules to the actions they trigger, etc. Once properly defined, the activity is started and local configuration objects are created which represent temporary or final results of the activity.

Activities may be instantiated in two different ways. In the first way, a particular selection mechanism is applied to the database background, in order to identify necessary activity objects, which are then logically copied to the corresponding database configuration. Of course, physically, they are shared by the background and the configuration. In the second way, the whole background is logically copied to the corresponding database configuration, which is then browsed and shortened in order to eliminate (remove) unnecessary objects.

The database background may be seen as a generic database configuration, reflecting common elements of the description of future activities, which has an intent only and which is not versioned. A database configuration may initially be seen as the intent of the corresponding activity, which is a subset of background objects subsequently extended by objects local to the configuration, together with an empty extent. When the activity is started, the extent grows through the creation of objects according to patterns included in the activity intent. Moreover, in contrast to the background, the configuration may be versioned in order to reflect the evolution of the activity. Notice that, due to the specific nature of the activity intent and extent, the intent is a part of the database schema, while the extent is a part of the database contents which may be properly interpreted only via the part of the database schema corresponding to it.

Updates in the database background are allowed and are automatically propagated to those configurations which include logical copies of the updated objects. In contrast, objects newly created in the background are not included in configurations that already exist. They may be used in subsequent instantiations of new database configurations. Updates in a configuration, concerning either the activity intent or the extent, are local; that is, they do not influence the background or other configurations (configuration versions). In particular, if an update concerns a shared object, then before it is performed the logical copy of the object concerned is replaced by a physical one, and the object is no longer shared with other configurations (or the background).
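The sharing rules in this paragraph can be illustrated with a minimal Python sketch. The class and method names (Background, Configuration, read, update) are hypothetical and are not part of the framework described here; a dictionary lookup stands in for a logical copy, and a locally stored value stands in for a physical copy.

```python
class Background:
    """Database background: objects available for instantiating activities."""

    def __init__(self, objects):
        self.objects = dict(objects)


class Configuration:
    """One database configuration, instantiated from selected background objects."""

    def __init__(self, background, selected):
        self.background = background
        self.shared = set(selected)  # logical copies, physically shared
        self.local = {}              # physical copies and configuration-local objects

    def read(self, name):
        if name in self.local:
            return self.local[name]
        if name in self.shared:
            # A logical copy always reflects the current background value.
            return self.background.objects[name]
        raise KeyError(name)

    def update(self, name, value):
        # Replace the logical copy by a physical one; the update stays local.
        self.shared.discard(name)
        self.local[name] = value


bg = Background({"role": "designer", "rule": "r1"})
c1 = Configuration(bg, ["role", "rule"])
c2 = Configuration(bg, ["role"])

bg.objects["role"] = "lead"      # background update reaches both configurations
c1.update("role", "reviewer")    # local update: c1 stops sharing "role"
bg.objects["new_rule"] = "r2"    # created later: not part of c1 or c2
```

After these steps, c1 reads its own value for "role" while c2 still follows the background, and neither configuration can see "new_rule".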

As mentioned before, activities may evolve over time. First we focus on the evolution caused by updates in the activity intent. Two types of activity intent updates are distinguished: non-versioning updates and versioning updates. In the former case, the update operation simply replaces the old value of the affected object by a new one. As a consequence, if the transaction commits, the old value is no longer available. In the latter case, directly before the update operation a new version of the configuration addressed, called the child version, is automatically derived as a logical copy of its parent version. The update is then performed in the child version.

Now we focus on the database evolution caused by updates in an activity extent. As with updates in an activity intent, we distinguish non-versioning and versioning updates of an extent. In the case of a non-versioning update, the old object value is replaced by a new one. In the case of a versioning update, before the update a new child version of the configuration addressed is automatically derived as a logical copy of its parent version. The update is then performed in the child version. This configuration derivation is orthogonal to the configuration derivation caused by the versioning updates mentioned previously, i.e. updates concerning the activity intent. Thus, every activity is modeled by a set of database configurations forming a tree whose nodes are the roots of other trees (i.e. forming a tree of trees). The outer tree nodes correspond to the available versions and variants of the activity intent, while the nodes of every inner tree correspond to revisions and variants of the activity extent under the same version of the activity intent. As a consequence, in order to address a particular configuration, first the activity intent version must be specified, and then the activity extent version.
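The two-level addressing scheme can be sketched as follows. This is only an illustration under simplifying assumptions: the names Activity, instantiate and address, and the string labels used as version identifiers, are hypothetical, and a plain dictionary copy stands in for the logical copy of a parent configuration.

```python
from dataclasses import dataclass, field


@dataclass
class Configuration:
    intent: dict  # activity description (schema level)
    extent: dict  # objects produced by the activity


@dataclass
class Activity:
    # (intent version, extent version) -> configuration
    configs: dict = field(default_factory=dict)
    intent_parent: dict = field(default_factory=dict)  # outer tree
    extent_parent: dict = field(default_factory=dict)  # inner trees

    def instantiate(self, intent_id, extent_id, intent):
        self.configs[(intent_id, extent_id)] = Configuration(dict(intent), {})

    def versioning_update_extent(self, intent_id, parent, child, key, value):
        """Derive a child extent version in the inner tree, then update it."""
        src = self.configs[(intent_id, parent)]
        new = Configuration(src.intent, dict(src.extent))  # same intent version
        new.extent[key] = value
        self.configs[(intent_id, child)] = new
        self.extent_parent[(intent_id, child)] = parent

    def versioning_update_intent(self, parent, child, extent_id, key, value):
        """Derive a child intent version in the outer tree, then update it."""
        src = self.configs[(parent, extent_id)]
        new = Configuration(dict(src.intent), dict(src.extent))
        new.intent[key] = value
        self.configs[(child, extent_id)] = new
        self.intent_parent[child] = parent

    def address(self, intent_id, extent_id):
        """Intent version first, then extent version."""
        return self.configs[(intent_id, extent_id)]


a = Activity()
a.instantiate("i1", "e1", {"role": "designer"})
a.versioning_update_extent("i1", "e1", "e2", "wing", "draft")
a.versioning_update_intent("i1", "i2", "e2", "role", "reviewer")
```

In this sketch the configuration ("i1", "e1") keeps its empty extent, ("i1", "e2") holds the new wing object, and ("i2", "e2") is the root of a new inner tree under the modified intent.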

3 Database Version Approach

The approach presented in Section 2 mainly addresses activity modeling. We now briefly recall the so-called database version approach, originally proposed in [7], which is much more general than the previous one and flexible enough to be used in almost every domain of CSCW application. For the sake of transaction execution efficiency, however, it must be extended, as proposed in Sections 4 and 5.

The main concept of this approach is that of a database version which comprises a version of each multiversion object stored in the database. Some objects may be hidden in a particular database version by the use of the nil values of their versions. In the database version approach, a database version is a unit of consistency and versioning. It is a unit of consistency, because each object version contained in a database version must be consistent with the versions of all the other objects contained in it. It is a unit of versioning, because an object version cannot appear outside a database version. To create a new object version, a new database version has to be created, where the new object version appears in the context of versions of all the other objects and respects the consistency constraints imposed. Database versions are logically isolated from each other, i.e., any changes made in a database version have no effect on the others.

To operate on database versions, dbv-transactions are used, while to operate on object versions inside database versions, object transactions are used. A dbv-transaction is used to derive a new database version, called a child, from an existing one, called its parent. To derive a child means to make a logical copy of all object versions contained in the parent. Once created, the child database version evolves independently of its parent; the parent is likewise not prevented from evolving if the application permits it.

To implement database versions efficiently, and to avoid version value redundancy, database versions are organized as a tree reflecting the derivation history, and are identified by version stamps. A version stamp makes it possible to easily identify the path in the derivation tree from a given database version to the root, i.e. to identify all the ancestors of the given database version. A multiversion object is implemented as a set of object version values and a control data structure called an association table. Each row of the association table of a multiversion object associates an object version value with one or several database versions. Some database versions are associated explicitly, i.e. their version stamps appear explicitly in the association table, while others are associated implicitly.

To update a shared version value in a database version d, the following simple algorithm is used. First, a new row is added to the association table, associating the new version value and the version stamp of the database version d. Then, in the original row concerning the old version value, the version stamp of d is replaced by the version stamps of those of its immediate successors (children) that do not explicitly appear in any row of the association table.
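A minimal Python sketch of the association table and of the update algorithm above is given below. It rests on assumptions that are not taken from the text: version stamps are encoded as tuples whose prefixes are the stamps of the ancestors, an implicitly associated database version is taken to see the value of its nearest explicitly associated ancestor, and the caller supplies the list of children. The class name MultiversionObject and its methods are hypothetical.

```python
class MultiversionObject:
    """Object version values plus an association table.

    A version stamp is modeled as a tuple such as (0, 2, 1); dropping the
    last element yields the stamp of the parent database version.
    """

    def __init__(self, root_stamp, value):
        # Each row: [version value, set of explicitly associated stamps]
        self.rows = [[value, {root_stamp}]]

    def _explicit_row(self, stamp):
        for row in self.rows:
            if stamp in row[1]:
                return row
        return None

    def _resolve(self, stamp):
        """Row visible in `stamp`: explicit, else that of the nearest ancestor."""
        while stamp:
            row = self._explicit_row(stamp)
            if row is not None:
                return row
            stamp = stamp[:-1]  # move up to the parent database version
        raise KeyError("no version value visible in this database version")

    def read(self, stamp):
        return self._resolve(stamp)[0]

    def update(self, stamp, new_value, children):
        """Update the object in database version `stamp`.

        `children` lists the immediate successors of `stamp`.
        """
        old_row = self._resolve(stamp)
        # Children sharing the old value only implicitly must now be listed
        # explicitly, otherwise they would start to see the new value.
        implicit = {c for c in children if self._explicit_row(c) is None}
        old_row[1].discard(stamp)
        old_row[1] |= implicit
        if not old_row[1]:
            # The old value is no longer visible in any database version.
            self.rows = [r for r in self.rows if r is not old_row]
        self.rows.append([new_value, {stamp}])


obj = MultiversionObject((0,), "v1")
# Database versions (0, 1) and (0, 2) are children of (0,) and share "v1".
obj.update((0,), "v2", children=[(0, 1), (0, 2)])
obj.update((0, 1), "v3", children=[])
```

After the two updates the table holds three rows: "v1" for (0, 2), "v2" for (0,) and "v3" for (0, 1), so no value is stored twice and each database version still reads its own value.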

The versioning mechanism described above permits two object transactions addressed to two different database versions to run in parallel. They do not conflict and need not be serialized, even if both write the object version whose value is shared by both database versions addressed [8]. This follows from the logical isolation of database versions: the update of a shared version value in one database version gives birth to a new one, while preserving the old one as explained above. The object transactions addressed to the same database version are serialized in exactly the same manner as in a monoversion database.

4 Long-duration transaction support

4.1 Delayed versioning

In CSCW environments, especially those supporting a design process, users usually access different sets of objects, which are components of the whole design artifact. Moreover, their transactions typically access different (private) database versions, which means that they never conflict with each other (cf. Section 3). When the objects are ready for release, designers merge their database versions into a single (public) one, reflecting the actual state of the common work. Sometimes, however, it is better for simultaneously working designers to address the same database version and to avoid the rather tedious merging process, which requires the users to decide which version of every object should be available in the merged database version. Notice that if the users really access different subsets of objects in the same database version, there is no need to isolate them by deriving private database versions. They may share the same public database version. What happens, however, if after many operations on disjoint subsets of objects, two transactions happen to access the same object? They are usually long-duration transactions, so aborting one of them is not recommended. The problem may be solved by providing a new database functionality called delayed database versioning, as proposed in [20].

Delayed database versioning consists in the automatic derivation of a new private database version dedicated to one of the conflicting transactions (after its first access conflict is detected). The new private database version is a logical copy of its parent (public) database version, including all non-committed updates performed by the corresponding transaction before the access conflict. These updates are also logically removed from the parent database version. At this point, the transaction for which the new database version has been derived is not informed about this event and continues as if it were accessing objects in the parent database version. The system, however, automatically re-addresses all subsequent access requests to the newly created private database version. From then on, locking conflicts in the private database version are no longer possible. If, for some reason (e.g. a user request), the transaction aborts, the private database version is simply deleted. Otherwise, i.e. if the transaction commits, the system informs the user about the derivation of the new database version, which becomes visible to other transactions. The user then has three options: merge the updates back into the parent database version; redo the transaction in the public database version using the log file and, if this succeeds, delete the private database version; or simply inform all the users concerned about the derivation of the new database version.
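The mechanism can be sketched in Python as follows. The sketch is deliberately reduced and makes several assumptions: only write accesses are modeled, uncommitted updates are kept in a per-transaction buffer rather than written in place, a dictionary copy stands in for a logical copy, and all names (DelayedVersioningDB, write, commit, abort) are hypothetical.

```python
class DelayedVersioningDB:
    def __init__(self, public):
        self.public = dict(public)  # committed state of the public version
        self.locks = {}             # object -> transaction holding its lock
        self.pending = {}           # txn -> its uncommitted updates
        self.private = {}           # txn -> derived private database version

    def write(self, txn, obj, value):
        if txn in self.private:                # already re-addressed
            self.private[txn][obj] = value
        elif self.locks.get(obj, txn) != txn:  # first access conflict
            self._derive_private(txn)
            self.private[txn][obj] = value
        else:
            self.locks[obj] = txn
            self.pending.setdefault(txn, {})[obj] = value

    def _release(self, txn):
        for obj in [o for o, t in self.locks.items() if t == txn]:
            del self.locks[obj]

    def _derive_private(self, txn):
        # The transaction's uncommitted updates move to the private version
        # and thereby disappear from the parent (public) version.
        updates = self.pending.pop(txn, {})
        self.private[txn] = {**self.public, **updates}
        self._release(txn)

    def commit(self, txn):
        """Return True if a private version was derived for `txn`."""
        if txn in self.private:
            return True  # the user must now decide what to do with it
        self.public.update(self.pending.pop(txn, {}))
        self._release(txn)
        return False

    def abort(self, txn):
        self.private.pop(txn, None)  # a private version is simply deleted
        self.pending.pop(txn, None)
        self._release(txn)


db = DelayedVersioningDB({"a": 1, "b": 2})
db.write("T1", "a", 10)
db.write("T2", "b", 20)
db.write("T2", "a", 30)  # conflict with T1: T2 is silently re-addressed
```

Here T2 ends up working in a private version containing its earlier update of b as well as the new value of a, while T1 continues undisturbed in the public version. The three follow-up options open to the user after commit are not modeled.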

To support delayed versioning, a new transaction type combining the features of the dbv-transaction and the object transaction has to be provided. Furthermore, two transaction sub-types should be distinguished: non-conditional and conditional. A non-conditional transaction implicitly derives a new database version immediately after an access conflict is detected. In contrast, a conditional transaction gives the system a special parameter which defines how long the transaction is willing to wait in the case of an access conflict. As a consequence, automatic database version derivation may be avoided if the access conflict is resolved before the specified amount of time elapses. Users working interactively may be informed by the system about access conflicts between their corresponding conditional transactions. The users may then decide dynamically how long they are willing to wait for particular conflicts to be resolved. They may also abort their transactions, or roll them back to the beginning or to a previously defined savepoint.
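The difference between the two sub-types can be sketched with a lock that supports a bounded wait. The function name on_access_conflict and its return values are hypothetical; a standard threading.Lock stands in for the lock on a database object.

```python
import threading


def on_access_conflict(lock, wait_seconds=None):
    """Decide how a transaction proceeds when it needs `lock`.

    wait_seconds=None models a non-conditional transaction (no waiting);
    a number models a conditional transaction willing to wait that long.
    Returns "shared" if the lock was obtained in the public database
    version, or "derive" if a private version should be derived instead.
    """
    if wait_seconds is None:
        acquired = lock.acquire(blocking=False)
    else:
        acquired = lock.acquire(timeout=wait_seconds)
    return "shared" if acquired else "derive"
```

A conditional transaction thus avoids derivation whenever the conflicting lock is released within its waiting time; an interactive variant could ask the user for a new waiting time instead of returning "derive" at once.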

4.2 Temporary database versions

Some long-duration CSCW transactions do not update objects, or their updates are conditional and quite possibly will not be performed at all. If transactions of this type are addressed to a public database version together with transactions which do update objects, the degree of concurrency may decrease substantially. The reason is obvious: long-duration transactions set many locks and keep them for a long time. Some examples of possibly long, mostly read-only transactions are given below.

- global queries which access all instances of a particular class (class subtree) or scan all objects in a particular database version;

- transactions with conditional updates;

- hypothetical reasoning transactions;

- short-duration transactions updating objects which are referenced by a relatively great number of integrity constraints.

To be executed efficiently, transactions like those mentioned above require a new database functionality called a temporary database version. When such a transaction appears in the system and addresses a particular public database version, a temporary database version, which is not visible to other transactions, is automatically derived from it. The transaction is then re-addressed to this private database version and performs all its operations in an exclusive environment. Because no other transactions are addressed to the temporary database version, there is no need to set locks [8]. The re-addressed transaction may be executed without delays. After the transaction commits, the temporary database version is automatically removed from the system. Because it is a leaf of the derivation hierarchy, its removal may be performed very efficiently.

All new versions of objects created in a public database version concurrently with the execution of a transaction re-addressed to a temporary database version are not percolated to the children.
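As a minimal sketch, a temporary database version can be modeled in Python as a context manager. The name temporary_version is hypothetical, and a dictionary copy stands in for the logical copy derived from the public database version.

```python
from contextlib import contextmanager


@contextmanager
def temporary_version(public):
    """Derive a temporary database version for one long, mostly reading transaction."""
    temp = dict(public)  # invisible to all other transactions
    try:
        yield temp       # no locks are needed inside
    finally:
        temp.clear()     # removed automatically when the transaction ends


public = {"a": 1, "b": 2}
with temporary_version(public) as tmp:
    public["a"] = 99            # concurrent update in the public version
    total = sum(tmp.values())   # the global query still sees its own state
```

The concurrent update of a in the public version is not percolated to the temporary version, so the query computes its result from the state that existed when the temporary version was derived.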

4.3 Savepoints

A new database version may be derived only from a consistent parent database version (cf. Section 3). This means that it is not possible to embed database version derivation in an object transaction. In the case of a long-duration transaction, it would be very useful to provide a mechanism, usually called a savepoint, to store on demand the (generally inconsistent) state of the database version accessed, at an arbitrary point of the transaction execution. Afterwards the transaction may be rolled back to a particular savepoint, not necessarily to the beginning.

A straightforward solution to the problem mentioned above is the following. Whenever a transaction demands a savepoint, a new database version is derived directly from the database version accessed. This database version is not visible to any transaction, including the transaction that demanded the savepoint. The reason is that it is temporary and potentially inconsistent. This kind of database version derivation may be performed without any delay, even during the commit phases of other transactions addressed to the same database version.

If the transaction commits or aborts, the database versions assigned to all its savepoints are automatically removed by the DBMS. This may be done very quickly, because the database versions being removed are leaves of the database version derivation tree.

If the transaction requests a roll-back to a particular savepoint SP, then the following actions are performed. First, all database versions assigned to savepoints demanded after SP are removed, since they are no longer useful. Next, the values of all objects in the database version accessed by the transaction in exclusive mode (i.e. the values of objects updated by the transaction) are replaced by the corresponding values from the database version assigned to SP. No conflict may occur, due to the nature of the X locks kept by the transaction. The values of other objects, i.e. objects not accessed by the transaction or only read by it, need not be changed. The transaction may now continue its execution normally. Database versions assigned to savepoints demanded before SP are not removed. They may still be useful until the transaction commits.
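The roll-back procedure can be sketched in Python as follows. The class and method names are hypothetical, a dictionary copy stands in for the hidden database version assigned to a savepoint, and the removal of objects created after SP is an assumption of the sketch that the text does not discuss.

```python
class SavepointTransaction:
    def __init__(self, db_version):
        self.db = db_version  # database version accessed by the transaction
        self.written = set()  # objects held under X locks (updated objects)
        self.savepoints = []  # hidden database versions, oldest first

    def write(self, obj, value):
        self.written.add(obj)
        self.db[obj] = value

    def savepoint(self):
        self.savepoints.append(dict(self.db))
        return len(self.savepoints) - 1

    def rollback_to(self, sp):
        # 1. Versions of savepoints demanded after SP are no longer useful.
        del self.savepoints[sp + 1:]
        # 2. Restore only the objects updated by this transaction.
        snapshot = self.savepoints[sp]
        for obj in self.written:
            if obj in snapshot:
                self.db[obj] = snapshot[obj]
            else:
                self.db.pop(obj, None)  # assumption: created after SP

    def finish(self):
        """Commit or abort: all savepoint versions are removed."""
        self.savepoints.clear()


db = {"x": 1, "y": 2}
t = SavepointTransaction(db)
t.write("x", 10)
sp1 = t.savepoint()
t.write("x", 11)
t.write("z", 5)
sp2 = t.savepoint()
t.write("y", 0)
t.rollback_to(sp1)
```

After the roll-back the database version again holds x = 10 and y = 2, the version assigned to the second savepoint is gone, and the one assigned to the first savepoint is kept for possible later use.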

5 Related work

5.1 Workspaces

Users of CSCW applications tend to access subsets of objects, rather than all objects stored in the database. Moreover, objects included in these subsets are usually bound to each other by different semantic relationships, e.g. composition, inheritance, derivation. The subset of objects accessed by a single user is called his/her workspace. Workspaces accessed by different users in the same working environment may be disjoint or they may overlap.

From the user's point of view, the system should support two important functionalities. First, a flexible and powerful mechanism for defining a workspace has to be provided. Second, workspaces which overlap must be managed in a way that avoids, where possible, conflicts between different users accessing the same objects.

Workspaces are defined in a high-level language each time a user logs in. The user refers to nodes of both the object composition hierarchy and the class inheritance DAG. If required, all objects included in the subtree (or sub-graph) rooted at the node referred to are automatically added to the workspace. Next, if the workspace is properly defined, the system compares it with the workspaces of all the other users working in parallel. If the intersection of the workspace being considered with other workspaces is not empty, the user is informed about potential future access conflicts. Then, to avoid conflicts, he/she may decide to derive a private working environment. Otherwise, a list of the names of users addressing conflicting workspaces is displayed. Information about the shared objects may be given as well. The user may now discuss his/her intended operations with the other users in order to find out whether they are consistent with each other. If they are, a particular concurrency control mechanism is applied to the objects discussed. It increases the level of concurrency between the users, on the one hand, and ensures database consistency after the user transaction commits or aborts, on the other hand. Otherwise, i.e. if the intended operations are not consistent with those of the other users, and the user does not derive his/her own environment, he/she must be aware of possible conflicts during the session.
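The expansion of a workspace from a hierarchy node and the comparison with other workspaces can be sketched as follows. The function names subtree and overlaps, the dictionary encoding of the composition hierarchy, and the example objects are all hypothetical; the high-level definition language itself is not modeled.

```python
def subtree(hierarchy, root):
    """All objects in the subtree (or sub-graph) rooted at `root`."""
    seen, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(hierarchy.get(node, ()))
    return seen


def overlaps(new_workspace, active_workspaces):
    """Map each user with a conflicting workspace to the shared objects."""
    return {
        user: new_workspace & ws
        for user, ws in active_workspaces.items()
        if new_workspace & ws
    }


hierarchy = {"plane": ["wing", "fuselage"], "wing": ["flap"]}
mine = subtree(hierarchy, "wing")
active = {"bob": subtree(hierarchy, "plane"), "eve": {"engine"}}
report = overlaps(mine, active)
```

In this example the report names bob as the only user with an overlapping workspace and lists wing and flap as the shared objects, which is the information the user needs before deciding whether to derive a private working environment.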

More details may be found in [19].

5.2 Object percolation

Private working environments associated with different members of the working team are usually derived directly or indirectly from the same public environment, created by the head of the team and representing the initial state of the cooperative work. The initial state is described, on the one hand, by a set of commonly accepted integrity constraints, reasoning rules and invariants, and, on the other hand, by a set of elementary objects useful to each member of the group. In private environments users "inherit" the objects and constraints given by the group head, improving them or extending them by new ones, and creating composite objects. Each of them uses a single transaction (or a sequence of transactions) executed in the dedicated environment. There is no need to access other environments or to propagate updates to them. Sometimes, however, the group head or a privileged person decides to introduce new (refined) constraints, which must be observed by the whole group, or new objects simplifying the work of his/her colleagues. In this case, he/she needs a special mechanism which will make all his/her changes visible to the other members of the design team. This mechanism is also necessary if the team head updates his/her initial object and wants to propagate this update immediately to the other designers.

To solve the problem outlined above, a database functionality called object percolation is provided, which makes it possible to update an object in a particular public environment and automatically propagate this update to all other environments or to a subset of them.

More details may be found in [20].

5.3 Merge operation

Finally, there is an evident need for a merge functionality, which makes it possible to integrate different private working environments (database versions) into a single (public) one. Notice that merging is relatively easy to perform if the intersection of the sets of objects composing the private environments is empty or its cardinality is low. In this case it may be done almost automatically. However, if the source environments share many objects, the problem is more complex. To solve it, a new transaction type, called a merging transaction, is proposed. The merging transaction takes two arbitrary private environments, each in a consistent state, and derives a new environment composed of a combination of object versions taken from the source environments. Of course, the resulting environment must also be consistent.
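A minimal Python sketch of such a merge is given below. It is not the merging transaction of [18]: the function name merge, the choose callback that resolves objects present with different versions in both environments, and the is_consistent predicate are assumptions introduced for illustration.

```python
def merge(env_a, env_b, choose, is_consistent=lambda env: True):
    """Combine two private environments into a new one.

    Objects present in only one source, or identical in both, are taken
    over automatically; for the others `choose(obj, a, b)` must decide.
    """
    merged = {}
    for obj in env_a.keys() | env_b.keys():
        if obj not in env_b:
            merged[obj] = env_a[obj]
        elif obj not in env_a:
            merged[obj] = env_b[obj]
        elif env_a[obj] == env_b[obj]:
            merged[obj] = env_a[obj]
        else:
            merged[obj] = choose(obj, env_a[obj], env_b[obj])
    if not is_consistent(merged):
        raise ValueError("merged environment violates consistency constraints")
    return merged


a = {"wing": "w2", "engine": "e1"}
b = {"wing": "w3", "tail": "t1"}
result = merge(a, b, choose=lambda obj, va, vb: vb)
```

Only the wing object requires a decision here; engine and tail are merged automatically, which corresponds to the easy case of a small intersection described above.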

More details may be found in [18].

6 Conclusions

To summarize, in order to model CSCW activities precisely and to provide a flexible framework for their execution, the database should provide two different levels of versioning (cf. Section 2). First, the activity intent, which is a detailed activity description, must be versioned. Second, the activity extent, which is a set of mutually related objects developed by a particular activity version, must be versioned.

In order to support the execution of long-duration transactions, the database should provide the functionalities proposed in Section 4. Delayed versioning is particularly beneficial for transactions corresponding to computer-aided design activities. Temporary database versions are strongly recommended for transactions which mostly read objects and rarely update them. Savepoints support transactions which often roll back only their most recent updates.

The database version approach is implemented in the Multiversion Object Manager (MOM) prototype, which is currently being extended with all the concepts briefly outlined in this paper.

References

1. R. Agrawal, H.V. Jagadish. On Correctly Configuring Versioned Objects. Proc. 15th Int. Conf. on Very Large Databases, 1989.

2. R. Ahmed, S.B. Navathe. Version Management of Composite Objects in CAD Databases. Proc. ACM SIGMOD Int. Conf., 1991.

3. F. Bancilhon, W. Kim, H.F. Korth. A Model of CAD Transactions. Proc. 11th Int. Conf. on Very Large Databases, 1985.

4. S. Benford. Requirements of Activity Management. In: Studies in CSCW: Theory, Practice and Design. North-Holland, 1991.

5. J. Bowers, J. Churcher, T. Roberts. Structuring Computer-Mediated Communication in COSMOS. Proc. of EUTECO88, Vienna 1988, North-Holland.

6. A.P. Buchmann, C.P. de Celis. An Architecture and Data Model for CAD Databases. Proc. 11th Int. Conf. on Very Large Databases, 1985.

7. W. Cellary, G. Jomier. Consistency of Versions in Object-Oriented Databases. Proc. 16th VLDB Conf., Brisbane, Australia, 1990.

8. W. Cellary, G. Jomier. Apparent Versioning and Concurrency Control in Object-Oriented Databases. Journal of Computing and Information, Vol. 1, No. 1, Special Issue: Proceedings of the 6th International Conference on Computing and Information, Peterborough, Ontario (Canada), May 1994.

9. W. Cellary, G. Vossen, G. Jomier. Multiversion Object Constellations: A New Approach to Support a Designer's Database Work. Engineering with Computers, Vol. 10, 1994.

10. H.T. Chou, W. Kim. A Unifying Framework for Version Control in CAD Environment. Proc. 12th Int. Conf. on Very Large Databases, 1986.

11. K.R. Dittrich, R.A. Lorie. Version Support for Engineering Database Systems. IEEE Transactions on Software Engineering 14, 1988.

12. R. Katz, E. Chang. Managing Change in CAD Database. Proc. 13th Int. Conf. on Very Large Databases, 1987.

13. H.F. Korth, W. Kim, F. Bancilhon. On Long Duration CAD Transactions. In: S. Zdonik and D. Maier (Eds.), Readings in Object-Oriented Database Systems, Morgan Kaufmann, 1990.

14. W. Prinz, P. Pennelli. Relevance of the X.500 Directory to CSCW Applications. In: J.M. Bowers, S.D. Benford (Eds.), Studies in Computer Supported Cooperative Work, North-Holland, 1991.

15. M.A. Ranft, S. Rehm, K.R. Dittrich. How to Share Work on Shared Objects in Design Databases. Proc. 6th Int. Conf. on Data Engineering, 1990.

16. G. Vossen. Data Models, Database Languages and Database Management Systems. Addison-Wesley, 1991.

17. W. Wieczerzycki. Activity Modeling in Office Information Systems. Proc. of 6th Intern. Conf. and Workshop on Database and Expert Systems Applications - DEXA, London, September 1995.

18. W. Wieczerzycki, J. Rykowski. Version Support for CAD/CASE Databases. East-West Database Workshop, Klagenfurt, Austria, September 1994.

19. W. Wieczerzycki. Transaction Management in CSCW Applications. 21st Euromicro Conference, Short Notes, Como, Italy, September 1995.

20. W. Wieczerzycki. Long-Duration Transaction Support in Design Databases. Accepted for 4th Intern. Conf. on Information and Knowledge Management - CIKM95, Baltimore, Maryland, November 1995.

List of Workshop Participants



Contributors unable to attend the Workshop