MIT/LCS/TR-342

FOUNDATIONS OF A THEORY OF SPECIFICATION FOR DISTRIBUTED SYSTEMS

Eugene W. Stark

Foundations of a Theory of Specification for Distributed Systems

by

Eugene William Stark

B.E.S. The Johns Hopkins University (1977)
S.M. Massachusetts Institute of Technology (1980)

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

at the

MASSACHUSETTS INSTITUTE OF TECHNOLOGY

August, 1984

Copyright Massachusetts Institute of Technology 1984

Signature of Author: Department of Electrical Engineering and Computer Science, August 24, 1984
Certified by: Prof. Nancy A. Lynch, Thesis Supervisor
Accepted by: Prof. Arthur C. Smith, Chairman, E.E.C.S. Department Committee on Graduate Students

Submitted to the Department of Electrical Engineering and Computer Science on August 24, 1984 in partial fulfillment of the requirements for the Degree of Doctor of Philosophy.

Abstract

This thesis investigates a particular approach, called state-transition specification, to the problem of describing the behavior of modules in a distributed or concurrent computer system. A state-transition specification consists of: (1) a state machine, which incorporates the safety or invariance properties of the module, and (2) validity conditions on the computations of the machine, which capture the desired liveness or eventuality properties. The theory and techniques of state-transition specification are developed from first principles to a point at which it is possible to write example specifications, to check the specifications for consistency, and to perform correctness proofs using the specifications. The utility of the techniques is illustrated through examples.
Major contributions of the thesis include: (1) the definition of a semantic model that incorporates hierarchy of abstraction and modular decomposition as fundamental notions; (2) specification and proof techniques that smoothly handle both safety and liveness properties; (3) techniques that use liveness properties stated in rely-/guarantee-condition form to obtain simple proofs of correctness; (4) an interesting and useful notion of consistency for specifications involving liveness properties.

Keywords: state-transition specification, verification, concurrency, hierarchy, modularity, temporal logic, safety, liveness, rely/guarantee conditions

Thesis Supervisor: Nancy A. Lynch
Title: Associate Professor of Computer Science and Engineering

Acknowledgements

I am deeply indebted to my thesis adviser, Nancy Lynch, but for whom this thesis would likely never have been completed. Nancy read many, many difficult drafts of this work with enthusiasm and promptness that went far beyond the mere call of duty. She always seemed to manage not only to identify the most troublesome portions of each draft, but to make insightful suggestions for improvement as well.

I am also grateful to John Guttag, Barbara Liskov, and Albert Meyer for their suggestions on improving the presentation. Discussions with Bill Weihl helped to formulate ideas during the early stages.

In an entirely different category are my parents, Joan S. Stark and William L. Stark, Jr., who made it seem natural that I should seek and complete graduate education, and whose love and support during this endeavor I cannot adequately acknowledge.

Julian Stanley of Johns Hopkins made it possible for me to pursue the undergraduate portion of my education at an accelerated rate.

Finally, I would like to thank the chessplayers at Au Bon Pain in Harvard Square for providing a much-needed diversion during the past year.

CONTENTS

1. Introduction
   1.1 Scope of the Thesis
   1.2 An Example
   1.3 Outline of the Thesis
   1.4 Related Work
2. Framework for a Theory of Specification
   2.1 Interfaces, Observations, and Behaviors
   2.2 Abstraction, Decomposition, and Interconnection
   2.3 Specification, Implementation, and Correctness
3. State-Transition Specifications
   3.1 Subset Specifications
   3.2 Machines and Computations
   3.3 Properties of Histories
   3.4 State-Transition Specifications
   3.5 The Correctness Theorem
   3.6 Possibilities Mappings
   3.7 Rely-/Guarantee-Conditions
4. The Synchronizer Implementation
   4.1 Notation
   4.2 Specification of the Synchronizer Module
   4.3 Specification of the Synchronizer Component Module
   4.4 Correctness of the Synchronizer Implementation
5. Consistency of Specifications
   5.1 I/O-Systems
   5.2 I/O-Behaviors and I/O-Consistency
   5.3 Machine Characterization of I/O-Behaviors
   5.4 Examples of I/O-Behaviors
   5.5 Composition of I/O-Behaviors
   5.6 Alternative Classes of Computable Behaviors
6. A Completeness Result
   6.1 Specification Domains
   6.2 Locally ,-consistent Subset Specifications
   6.3 Well-Formedness Properties of Specifications
   6.4 The Completeness Theorem
7. Conclusion
   7.1 Summary
   7.2 Ideas for Future Work
Appendix I. Formal Specification and Proof
   I.3 Event/State Algebras
   I.4 Description of Event/State Algebras
   I.5 Implementation Algebras
   I.6 Proof Techniques
   I.7 Rely-/Guarantee-Condition Proof Techniques
   I.8 I/O-Consistency Proof Technique
Appendix II. Additional Examples
   II.9 A Distributed Resource Management Algorithm
   II.10 A Message Transmission System
Appendix III. Index of Definitions

1. Introduction

The purpose of this thesis is to investigate a particular approach, called state-transition specification, to the problem of describing the behavior of modules in a concurrent or distributed computer system.
In the state-transition approach, the desired behavior is described in terms of a kind of state machine whose computations generate records of event occurrences, called observations. A state-transition specification consists of two parts: (1) the definition of the state machine, which incorporates the "safety" or invariance properties of the module, and (2) the definition of some validity conditions on the computations of the machine, whose purpose is to capture the desired module "liveness" or eventuality properties. A state-transition specification defines a set of "acceptable" observations, namely the observations produced by valid computations of the state machine. A module behavior satisfies such a specification if the module behavior contains only acceptable observations.

The idea of describing module behavior with the help of state machines is not new, having already been proposed in various forms by other authors [e.g. Parnas72, Yonezawa77, Lamport83]. However, previous work seems to be concerned primarily with how to write module specifications, and how to use proof rules to prove the correctness of implementations. The important issues of what constitutes the meaning of a specification, and what it means for an implementation to be correct, have not received satisfactory treatment. As a result, it is impossible to answer important questions such as: "What rules are sound for proving the correctness of an implementation?" and "When is a specification consistent?"

This thesis improves upon previous work by systematically developing the theory and techniques of specification from "first principles" to a point at which it is possible to write example specifications, to prove implementations correct, and to check specifications for consistency. The theory incorporates an underlying semantic model within which one can formulate language-independent definitions of the notions of "implementation" and "correctness."
The meaning of state-transition specifications is defined in terms of the model, and all proof techniques are shown to be sound with respect to the model.

The major contributions of this thesis are:

(1) The definition of a semantic model that incorporates hierarchy of abstraction and modular decomposition as fundamental notions.
(2) Specification and proof techniques that smoothly handle both safety and liveness properties.
(3) Techniques that use liveness properties stated in rely-/guarantee-condition form to obtain simple proofs of correctness.
(4) An interesting and useful notion of consistency for specifications involving liveness properties.
(5) Illustration of the utility of the ideas developed through specifications, implementations, and correctness proofs for three examples:
    (a) a synchronizer module, which is implemented by a ring-structured network of synchronizer component modules,
    (b) a resource management module, which is implemented by a tree-structured network of local resource manager modules,
    (c) a message transmission module, which is implemented by unreliable transmission line modules, a send protocol module, and a receive protocol module, which together obey the alternating bit protocol.

1.1 Scope of the Thesis

A specification is a piece of text whose purpose is to describe the desired operation of a module in a computer system. Specifications form an integral part of a "top-down" design method in which design proceeds by the successive decomposition of a module to be implemented into a collection of interacting component modules [Liskov79, Wirth71]. The purpose of specifications in such an approach is to serve as a contract between the user and the implementer of a module. This helps to limit complexity by permitting a system to be decomposed into modules of reasonable size, such that each module depends only upon the specifications, and not the implementations, of the modules with which it interacts.

To permit the possibility of rigorous reasoning about specifications, a specification language should be given a formal semantics in terms of an underlying mathematical semantic domain. In this thesis, we use the term behavior to refer to the elements of a semantic domain, since the purpose of these elements is to serve as a mathematical model of the behavior of a portion of a real-world computer system. The semantics of a specification language describe how each specification denotes a set of behaviors that satisfy the specification. If the semantics of a programming language are defined so that each important program fragment denotes a behavior, then it is possible to derive syntactic rules for proving that (the denotations of) program fragments satisfy (the denotations of) specifications.

The purpose of this thesis is not to propose particular formal specification or programming languages, but rather to investigate a collection of language-independent semantic concepts upon which particular specification and programming languages might be based. We therefore assume that specification and programming languages can have their meanings defined in terms of behaviors, and do not concern ourselves with the precise method by which this is accomplished.

In this thesis, we are concerned with concurrent or distributed systems. By this we mean systems that are most naturally viewed as a collection of independent, communicating modules, such that the effects of concurrent operation of the various modules form a significant part of the description of system behavior. This thesis is primarily concerned with the concurrency aspect of distributed computing; while the model and techniques do not rule out the possibility of treating other aspects such as node crashes and network failures, no special structure to deal with these problems is included.
The techniques of this thesis have been developed primarily with the idea that they would be applied to the problem of describing and reasoning about distributed algorithms. The examples presented are of this kind.

1.2 An Example

In this section, an example specification problem will be used to introduce informally the fundamental ideas about specification on which this thesis is based.

1.2.1 The Synchronizer Module

Consider the following scenario: A number of processes in a computer system require the use of a single resource to accomplish their respective tasks; however, because of limitations inherent in the resource, at most one process can be allowed to access the resource at any instant of time. To enforce this restriction, a synchronizer module is introduced, and the processes, which we will refer to as the user processes, are required to obtain permission from the synchronizer module before accessing the resource. It is the job of the synchronizer module to produce correct synchronization of the user processes' accesses to the resource. Our problem is to describe precisely the synchronizer module behaviors that are "acceptable" in the sense that they always produce "correct synchronization." This precise description is the specification of the synchronizer module.

When a user process desires to access the resource, it issues a try request to the synchronizer module. The user process is then supposed to wait until it receives a run response from the synchronizer module. When the user process is finished using the resource, it issues a rest request to the synchronizer module. We can capture these decisions in diagrammatic form as shown in Figure 1, in which the synchronizer module is depicted as a circle, and the possible requests and responses are drawn as arcs incident on and exiting from the circle, respectively. We assume that there are a total of N user processes accessing the synchronizer module, and have used a subscripted process number to distinguish the requests and responses corresponding to different processes.

[Fig. 1. The Synchronizer Module. For each user process i = 1, ..., N, arcs try_i and rest_i enter the Synchronizer Module and an arc run_i exits it.]

The set of all possible requests and responses for the synchronizer module can be thought of as an "alphabet" or "syntax" for describing the interaction of the synchronizer module with its environment. We call this set the interface of the synchronizer module, and refer to its elements as events. By observing the synchronizer module during an execution, we can obtain a record of the events that occurred during the execution. We call this record of event occurrences an observation, and assume that it takes the form of a finite or infinite sequence of events.

By fixing the interface of the synchronizer module to be a particular set of events, we determine a universe of possible observations. We next consider how to describe which observations in this universe are "acceptable." We must include in our description the idea that at most one user process at a time may access the resource. Also, we wish to require that the synchronizer module be fair in the sense that every try request by a user process is eventually answered by a run response, if it is possible to do so without violating the mutual exclusion property.

A natural way of describing which observations are acceptable is through the use of conceptual states. With this technique, we imagine that at any instant of time the synchronizer is in one of a number of possible internal states. These states may or may not have anything to do with the actual internal state of the synchronizer module; they are merely a tool for describing its observable behavior.
After defining the set of initial states, we then describe for each event the preconditions required for that event to occur, and how the conceptual state of the synchronizer changes as a result of the occurrence of that event.

The conceptual state of the synchronizer module at any instant of time is a vector that tells for each user process what the synchronizer module thinks that user is currently doing with respect to the resource, based on the requests and responses that have occurred so far. The possibilities are that the user is either trying to obtain permission to access the resource (trying), is actively using the resource (running), is done using the resource (resting), or has failed to correctly follow the protocol (error).

Footnote 1. The formal definition of observation used in this thesis is slightly more complicated than a finite or infinite sequence of events (see Chapter 2). This is done for technical reasons that are unimportant for the present, informal discussion.

Initially, the synchronizer module believes that each user process is resting. The state changes and preconditions are as follows: a try event for a process causes the state of that process to change to "trying" if it was previously resting, and to "error" otherwise; a run event for a process can occur only if that process is trying and no processes are currently running, and causes the state for that process to change to "running;" a rest event for a process causes the state of that process to change to "resting" if it was previously running, otherwise to "error."

A particular observation for the synchronizer module satisfies the description of the previous paragraph if to each finite prefix of the observation we can assign a conceptual state in such a way that each state change satisfies the conditions enumerated in the previous paragraph.
For example, assuming there are only two user processes, the observation

    try_1 try_2 run_1 rest_1 run_2 rest_2

satisfies the conditions above since we can assign internal states as follows:

    (resting, resting) try_1 (trying, resting) try_2 (trying, trying) run_1 (running, trying) rest_1 (resting, trying) run_2 (resting, running) rest_2 (resting, resting).

However, the observation

    try_1 try_2 run_1 run_2 rest_1 rest_2

does not represent a correct functioning of the synchronizer module since

    (resting, resting) try_1 (trying, resting) try_2 (trying, trying) run_1 (running, trying) run_2 (running, running) rest_1 (resting, running) rest_2 (resting, resting),

which is the only assignment of states that satisfies the state change requirements, has the property that the precondition for the run_2 event is not satisfied by the state (running, trying). We will use the term "history" to refer to an observation that has been annotated with states.

The state-transition description above tells us a significant amount about what are the correct observations of the synchronizer module, but it does not say everything that should be said. In particular, the requirement that every request by a user process should eventually be satisfied, if possible, is not captured by the state-transition description. Informally, the reason is that a state-transition description captures only properties of histories that are "local" in the sense that they involve only adjacent states, whereas the fairness property we would like is a "global" property that involves possibly widely separated portions of the history. If the conceptual state technique is to work, we must find some way to state global properties in a form compatible with the statement of the local properties. In Chapter 4 it will be shown how global properties can be expressed in the language of temporal logic.

A specification of the synchronizer module via the conceptual state approach therefore consists of a state-transition description of the local properties that must be satisfied by acceptable observations, plus a description of additional global properties satisfied by such observations.
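
The local (state-transition) part of the synchronizer description in this section can be sketched in executable form. The following Python sketch is illustrative only and is not part of the thesis: the names step and annotate, the encoding of an event as a (kind, process number) pair, and the encoding of a conceptual state as a tuple of strings are all assumptions made for the example. It handles only finite observations and checks only the local properties; the global fairness requirement is not checked.

```python
from typing import List, Optional, Tuple

# Hypothetical encodings, chosen for this sketch only.
RESTING, TRYING, RUNNING, ERROR = "resting", "trying", "running", "error"
Event = Tuple[str, int]   # (kind, process number), e.g. ("try", 1)
State = Tuple[str, ...]   # one entry per user process


def step(state: State, event: Event) -> Optional[State]:
    """Return the next conceptual state, or None if the event's
    precondition is not satisfied in the given state."""
    kind, proc = event
    i = proc - 1
    current = state[i]
    if kind == "try":
        # No precondition: "trying" if previously resting, else "error".
        new = TRYING if current == RESTING else ERROR
    elif kind == "run":
        # Precondition: this process is trying and no process is running.
        if current != TRYING or RUNNING in state:
            return None
        new = RUNNING
    elif kind == "rest":
        # No precondition: "resting" if previously running, else "error".
        new = RESTING if current == RUNNING else ERROR
    else:
        raise ValueError(f"unknown event kind: {kind}")
    return state[:i] + (new,) + state[i + 1:]


def annotate(observation: List[Event], n: int) -> Optional[List[State]]:
    """Annotate a finite observation with conceptual states, producing the
    state sequence of a history, or None if some precondition fails."""
    state: State = (RESTING,) * n   # initially every process is resting
    states = [state]
    for event in observation:
        next_state = step(state, event)
        if next_state is None:
            return None
        state = next_state
        states.append(state)
    return states


# The two two-process observations discussed in the text.
good = [("try", 1), ("try", 2), ("run", 1), ("rest", 1), ("run", 2), ("rest", 2)]
bad = [("try", 1), ("try", 2), ("run", 1), ("run", 2), ("rest", 1), ("rest", 2)]
good_history = annotate(good, 2)
bad_history = annotate(bad, 2)
```

Under these assumptions, annotate(good, 2) yields a state sequence for the first observation, while annotate(bad, 2) yields None because the run event for process 2 is attempted while process 1 is running.
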
A particular synchronizer module behavior is said to satisfy the synchronizer module specification if it contains only acceptable observations.

1.2.2 Implementation, Abstraction, and Composition

Now let us consider how the synchronizer module might be implemented. A possible organization is shown in Figure 2. In Figure 2, the synchronizer module is shown to be composed of a number of "synchronizer component" modules connected in a ring-like fashion. Each synchronizer component module interacts with exactly one user process and with its neighboring synchronizer component modules. The implementation operates as follows: There is a single conceptual token that circulates around the ring in the clockwise direction. A synchronizer component module must possess the token whenever it grants its associated user permission to access the resource. In addition to the try, run, and rest events with which communication with the user is accomplished, a synchronizer component module may pass the token to its clockwise neighbor with a token_out event, may receive the token from its counterclockwise neighbor with a token_in event, may request the token from its counterclockwise neighbor with a request_out event, and may accept a request from its clockwise neighbor with a request_in event.

[Fig. 2. Implementation of the Synchronizer Module]

We resolve the implementation relationship between the synchronizer component modules and the synchronizer module into two separate operations on systems: a composition operation, which takes a number of component modules and combines them into a larger system, and an abstraction operation, which takes the larger system and throws away internal details that are not of interest in the more abstract view. In the synchronizer example the composition operation takes a collection of synchronizer component modules and connects them into a ring network, and the abstraction operation throws away the details of the internal communication between the component modules, saving only the events that make up the interface with the user processes.

1.2.3 Correctness of an Implementation

Suppose we are given a specification for the synchronizer module, and specifications for each of the synchronizer component modules. Each specification determines a set of behaviors that satisfy it. The implementation is "correct" with respect to these specifications if, no matter what behaviors we "plug in" for the synchronizer component modules, as long as each component behavior satisfies its specification, then the resulting synchronizer module behavior, constructed from the components via the operations of composition and abstraction, satisfies the synchronizer module specification.

1.2.4 Summary

The ideas presented in this section can be summarized as follows:

(1) Every module in a system has a well defined interface, which is the syntax with which it interacts with other modules in the system.
(2) An interface defines a universe of observations, which are records of operation that might be produced by a module with that interface. These observations constitute the possible "functionings" of the module. The set of all observations that can be produced by a particular module instance serves as the behavior of that module instance.
(3) A module can be specified by describing a set of "acceptable" observations. A module behavior "satisfies" such a specification if it contains only acceptable observations.
(4) An implementation of an abstract module in terms of a collection of component modules consists of a composition operation for combining component module behaviors to form a "composite" behavior, and an abstraction operation for deleting information from the composite behavior to obtain a behavior of the abstract module.
(5) An implementation is correct with respect to given specifications if, whenever we apply the composition operation of the implementation to a collection of behaviors that satisfy the component module specifications, and then apply the abstraction operation of the implementation to the resulting composite behavior, we obtain a behavior that satisfies the abstract module specification.

1.3 Outline of the Thesis

This thesis is an attempt to elaborate and make more precise the ideas illustrated informally in the previous section. In particular, an attempt will be made to answer the questions:

(1) What is an appropriate mathematical framework that adequately captures the notions of interface, observation, composition, abstraction, implementation, specification, and correctness discussed above? (Chapter 2)
(2) How can we translate, in a natural and systematic way, an intuitive understanding of the function to be performed by a module into a precise specification? (State-Transition Specifications, Chapter 3)
(3) Once we have obtained such a specification, how can we be sure that it says something meaningful? (Consistency of Specifications, Chapter 5)
(4) How can we show, in a systematic way, that a particular implementation of an abstract module by a collection of component modules is correct with respect to given specifications? (Correctness Proofs, Chapters 3, 4, Appendix II)
(5) What general principles can we learn about how to organize specifications and proofs of correctness? (Rely-/Guarantee-Conditions, Chapters 3, 4, Appendix II)
(6) How might the specification and proof techniques developed in this thesis be formalized to permit the use of mechanical aids? (Event/State Algebras, Appendix I)

This thesis is organized as follows: Chapter 2 introduces formal definitions of the notions of interface, observation, abstraction, composition, implementation, and correctness. Some of the modeling choices embodied in these definitions are discussed.

In Chapter 3, the basic definitions of Chapter 2 are used to define formally the notion of a state-transition specification. The main result of Chapter 3 is the Correctness Theorem (Theorem 3.9), which shows how the structure of state-transition specifications can be exploited to obtain a systematic method for performing correctness proofs. Secondary results of Chapter 3 (Lemma 3.11, Lemma 3.12) suggest how the proof method embodied in the Correctness Theorem can be further systematized if module liveness specifications are expressed in terms of rely-/guarantee-conditions.

Chapter 4 applies the theory of Chapters 2 and 3 to the synchronizer example. The complete specifications of the synchronizer and synchronizer component modules are presented, and the synchronizer implementation is proved correct with respect to these specifications. The language of temporal logic is used as a notation for expressing liveness properties.

Chapter 5 is concerned with finding an appropriate notion of consistency of specifications that include nontrivial liveness properties. Intuitively, a specification ought to be consistent if and only if it is satisfiable by some behavior. However, if by the term "behavior" we mean "arbitrary set of observations," then we obtain a notion of consistency that is much too liberal. To obtain stronger notions of consistency, we must restrict our attention to "realizable" or "computable" behaviors.
Chapter 5 introduces a particular class of computable behaviors, the "I/O-behaviors," that is based on an underlying model of asynchronous concurrent computation called "I/O-systems." The corresponding notion of "I/O-consistency" is found to be useful for distinguishing between "obviously realizable" and "obviously unrealizable" liveness specifications. Chapter 5 develops a technique for proving state-transition specifications to be I/O-consistent and applies this technique to examples.

In Chapter 6 a kind of completeness result is proved (the Completeness Theorem, Theorem 6.4), which gives sufficient conditions under which a correct implementation has a proof by the Correctness Theorem. The statement and proof of Theorem 6.4 use in a crucial way the existence of a "specification domain," which is a class of behaviors, like the I/O-behaviors, with certain closure properties.

Finally, Chapter 7 summarizes what has been accomplished and suggests avenues for future investigation.

Additional important material is contained in Appendices I, II, and III. Appendix I provides a formal semantics for the temporal logic language used informally in Chapters 4-6, and shows how the correctness and consistency proof techniques developed in the thesis can be formalized within this language. Appendix II considers two additional examples: a distributed resource management system, and a reliable message transmission system based on the alternating bit protocol. Both of these systems are specified and proved correct using the techniques developed in the main body of the thesis. Appendix III is an index of definitions.

1.4 Related Work

The rather large body of work related to this thesis can be divided roughly into the following categories:

(1) Specification of sequential programs/abstract data types.
(2) Models of distributed/concurrent computation.
(3) Temporal logic specification techniques.
(4) Specification of communication protocols.
(5) Other distributed/concurrent system specification techniques.

Each of these categories will be discussed below. Further discussion is included at appropriate points in this thesis.

1.4.1 Specification of Sequential Programs/Abstract Data Types

Work in the area of specification of sequential programs can be classified into two categories: that concerned with the specification of the function to be performed by a program or program fragment, and that concerned with the specification of the data types manipulated by a program.

Sequential Program Function Specification

Specification of the function to be performed by a program or program fragment is a problem that must be addressed by any work on program correctness. In the sequential case, the semantics of a programming language assigns to each program fragment (statement, procedure, etc.) some mathematical object (denotation) representing the effect of executing that fragment. Typically (see, e.g. [Jones81]), this denotation takes the form of a partial function or a binary relation on program states. A specification for a program fragment consists of some properties that must be satisfied by the denotation of that fragment.

Often function specifications are expressed in the form of Floyd/Hoare partial correctness assertions (PCA's) [Floyd67, Hoare69], consisting of a precondition and a postcondition, which are predicates on states. A program fragment satisfies a PCA if, whenever execution of the fragment is begun in a state satisfying the precondition, then execution will terminate only in a state satisfying the postcondition. Thus, if binary relations are used as denotations of fragments, a PCA is satisfied by any relation R such that if (q, r) ∈ R and q satisfies the precondition of the PCA, then r satisfies the postcondition.
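
For a finite relation, the satisfaction condition just stated can be checked directly. The following Python sketch is illustrative only and is not taken from the thesis: the function name satisfies_pca, the use of integers as program states, and the example fragment (an increment restricted to the states 0 through 4) are assumptions made for the example.

```python
from typing import Callable, Iterable, Tuple

# Hypothetical encoding for this sketch: a program state is an integer.
State = int


def satisfies_pca(
    relation: Iterable[Tuple[State, State]],
    pre: Callable[[State], bool],
    post: Callable[[State], bool],
) -> bool:
    """True if, for every pair (q, r) in the relation such that q satisfies
    the precondition, r satisfies the postcondition."""
    return all(post(r) for (q, r) in relation if pre(q))


# Denotation of a fragment that increments a single variable,
# restricted to the initial states 0..4 so that the relation is finite.
increment = {(x, x + 1) for x in range(5)}

# A fragment that never terminates denotes the empty relation.
diverge: set = set()

holds = satisfies_pca(increment, lambda q: q >= 0, lambda r: r >= 1)
fails = satisfies_pca(increment, lambda q: q >= 0, lambda r: r >= 2)
vacuous = satisfies_pca(diverge, lambda q: True, lambda r: False)
```

In this sketch the empty relation satisfies every PCA, including one whose postcondition is never true, which reflects the fact that a partial correctness assertion says nothing about termination.
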
Besides being convenient for specifying the function that must be satisfied by a program fragment, partial correctness assertions can be used to construct a formal deductive system for reasoning about the behavior of program fragments. For a good overview of these "Hoare logics" of programs, see [Apt81].

The partial correctness assertion technique has been generalized with some success to systems of concurrent processes [e.g. Owicki76]. However, these techniques suffer from a lack of modularity in the sense that there is no notion of the behavior of a single process in isolation. Thus it is possible to specify the function of a complete parallel program, but not the behavior of its constituent processes. Although a logic of partial correctness assertions is used to prove that the behavior of a program satisfies its specification, the truth of PCA's associated with one process cannot be determined, except within the context of the PCA's for all other processes.

Partial correctness assertions are capable of expressing only safety properties of the form: "Whenever control is at point P, then relation R holds on the program variables." In general, one is interested in liveness specifications as well. For sequential programs, often the only liveness specifications of interest are statements that the program is guaranteed to terminate under certain conditions. Liveness properties of this simple form can be handled by incorporating termination into PCA's, as in Dijkstra's calculus of "weakest preconditions" [Dijkstra76], or by techniques completely outside of PCA's, such as Floyd's well-founded set technique [Floyd67]. For distributed or concurrent programs, it is almost always the case that more general liveness properties than simple termination are of interest, and these require alternative techniques.
Data Type Specification

The problem of describing the data objects manipulated by a program, especially the user-defined data objects, is usually referred to as "specification of abstract data types." There are actually two quite different problems that are addressed in the literature on abstract data type specification: the specification of immutable abstract data types, whose objects do not change their state during execution, and the specification of mutable abstract data types, whose objects have changeable state.

Specification of immutable abstract data types is the problem of describing and reasoning about static collections of values, functions, and relations. Usually a collection of interdependent immutable abstract data types is identified with the mathematical notion of a heterogeneous algebra, and algebras are described either axiomatically, as in [Guttag78, Goguen78, Kapur80], or via set-theoretic constructions, as in [Abrial80].

Specification of mutable abstract data types, on the other hand, can be thought of as the problem of describing and reasoning about the dynamic behavior of a collection of objects that can be manipulated using a limited set of procedures [Guttag80, Wing83]. Berzins [Berzins79] models a mutable abstract data type as a kind of state machine, which describes how the states of the mutable data objects evolve as a result of the invocation of the procedures.

The problem of specifying immutable abstract data types is not addressed by this thesis. In fact, the specification and proof techniques presented in this thesis assume as a prerequisite the ability to describe heterogeneous algebras and to perform reasoning about such algebras once they have been described. On the other hand, the problem of specifying mutable abstract data types can be viewed as a special
case of the general problem of module specification considered in this thesis, by thinking of a mutable abstract data type in terms of a "type manager" module, which encapsulates the objects of the data type and which performs manipulations on these objects in response to requests by the environment. Viewed in this way, the purpose of a mutable abstract data type specification is to describe the correct "observations" for the type manager module. The notion of observation appropriate here is that of a history of "events," where each event records either a request for the type manager to perform some manipulation on the objects, or a reply indicating the results of some previously requested manipulation.

1.4.2 Models of Concurrent Computation

Quite a number of models have been proposed for investigating concurrent and distributed computer programs [Brock83, Hoare81b, Hoare81a, Greif75, Hewitt77, Kahn74, Keller76, Lynch81, Pratt82, Rounds81]. In this thesis as well, specific assumptions are made about how to model the behavior of such systems. It is necessary to make these assumptions to reach a point at which concrete example specifications can be written and correctness proofs performed. However, a conscious effort has been made to assume no more structure than is necessary for the results of this thesis. An attempt has been made to identify a few fundamental concepts that are required of any model, if it is to serve as a semantic foundation for the theory of specification developed here. The fundamental concepts identified in this thesis are the notions of interface, observation, behavior, abstraction, and composition. These concepts, which have already been informally discussed, are given formal definitions in Chapter 2.
In this section, we will briefly review the features of a number of extant models of concurrency and attempt to identify the notions of event, interface, observation, behavior, abstraction, and composition used here with corresponding notions in each of the models. We will also be interested in whether each model is suitable as a semantic basis for a specification language -- in particular, whether the model is useful for specifications involving liveness properties.

Kahn-MacQueen Processes

A rather elegant model of concurrent computation is the stream processing model of Kahn [Kahn74] and Kahn and MacQueen [Kahn77]. In this model, a process communicates with its environment through a collection of named channels. A process uses each channel either as an input channel or an output channel, but never as both. During execution, a process can read input values from input channels and emit output values on output channels. We can imagine observing a process throughout an entire execution and recording the sequence of values transmitted on each channel. Such a sequence of values, which can be either finite or infinite, is called a stream.

A process is modeled by a continuous function from tuples of input streams to tuples of output streams. The notion of continuity used here is derived from the fact that streams under the prefix ordering form a partially ordered set which is complete under limits of increasing chains. Processes are deterministic in the sense that to each input tuple I, there is precisely one output tuple O that can be produced by a particular process, when that process is supplied with input I. This is a consequence of the fact that processes are modeled by functions.

In the stream processing model, the sets of input and output channels used by a process serve as the interface of that process.
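As a rough illustration of the stream-function view just described, the sketch below models a process with one input and one output channel as a Python function on finite tuples, which stand in for finite approximations of streams. It checks monotonicity with respect to the prefix ordering, which is a necessary condition for continuity. All names here (`running_sum`, `length_only`, `is_monotone`) are illustrative assumptions, not taken from [Kahn74] or [Kahn77].

```python
from itertools import product


def is_prefix(x, y):
    """Prefix ordering on finite streams represented as tuples."""
    return len(x) <= len(y) and y[:len(x)] == x


def running_sum(stream):
    """A process that emits the running total of its input values."""
    out, total = [], 0
    for value in stream:
        total += value
        out.append(total)
    return tuple(out)


def length_only(stream):
    """Not realizable as a stream process: the single output value would
    have to be retracted whenever more input arrives."""
    return (len(stream),)


def finite_streams(values, max_len):
    """All streams over `values` of length at most `max_len`."""
    for n in range(max_len + 1):
        yield from product(values, repeat=n)


def is_monotone(process, values=(0, 1), max_len=3):
    """Check that extending the input can only extend the output,
    over all sampled pairs of streams x, y with x a prefix of y."""
    streams = list(finite_streams(values, max_len))
    return all(is_prefix(process(x), process(y))
               for x in streams for y in streams if is_prefix(x, y))


print(is_monotone(running_sum))
print(is_monotone(length_only))
```

Because each process is an ordinary function, determinism comes for free: a given input tuple yields exactly one output tuple.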
The role of an observation of a process is played by a pair <I, O>, where I is a tuple of streams corresponding to the input channels, and O is a tuple of streams corresponding to the output channels. The usual identification of a function with its graph permits us to view a process behavior f as the set of all observations of the form <I, f(I)>.

A process network describes how to compose a collection of processes to form a composite system. Formally, a process network defines a kind of fixed point construction that maps a collection of component process behaviors to a behavior for the composite network. These fixed point constructions comprise the composition operations. The composition operations used by Kahn and MacQueen include features of both composition and abstraction as defined here, in the sense that once two processes have been connected by a communication channel, the stream of values transmitted over that channel is no longer of interest, and is ignored.

The Kahn/MacQueen model is unsuitable for the purposes of this thesis because it is incapable of representing processes with nondeterministic behavior.

Nondeterministic Process Nets

There have been several attempts to generalize the stream processing model of Kahn and MacQueen to incorporate nondeterminism. One such attempt is reported by Brock in [Brock83] (superseding the earlier version [Brock81] by Brock and Ackermann), where references to other attempts are given. In [Brock83], it is shown that the straightforward attempt to generalize the model of Kahn and MacQueen by permitting process behaviors to be relations, rather than functions, is doomed to failure. Intuitively, the reason is that the behavior of nondeterministic processes depends, in general, on the relative orders in which inputs are received and outputs produced.

In essence, Brock's approach is to replace the observations used by Kahn and MacQueen by scenarios.
Scenarios include, in addition to the streams of values transmitted on each of the channels, a partial ordering that records some of the information concerning the temporal order in which values were transmitted. The behavior of a process is defined to be the set of all scenarios that the process can produce in its various executions. Brock shows how composition operations on scenario sets can be defined, in analogy to the operations on continuous functions defined by Kahn and MacQueen.

Pratt [Pratt82] "repackages" Brock's model into a general framework for modeling processes and their composition, in which the behavior of a process is represented by the set of all traces (partially ordered multisets of events) it is capable of producing. As in the models of Kahn/MacQueen and Brock, the interface of a process can be identified with the set of all events in which the process can participate. The notion of trace plays the role of an observation. The notion of the restriction of a trace to a subset of its events is used to define composition of process behaviors. Restriction mappings on traces play essentially the same role in Pratt's model as decomposition maps play in the model of this thesis.

The models of Brock and Pratt admit the possibility of infinite scenarios or traces, and therefore do not a priori rule out the possibility of modeling processes that satisfy nontrivial liveness properties. However, this possibility is not addressed by either Brock or Pratt. Since we are interested in modeling processes with liveness properties, the models of Brock and Pratt are not suitable in their present state of development.

Communicating Sequential Processes

An important class of models of concurrency [Francez79, Hoare81a, Hoare81b, Rounds81] has been developed through attempts to give a formal semantics to the language of "Communicating Sequential Processes" (CSP) defined in [Hoare78].
In each of these models, the behavior of a process describes the traces (finite sequences of communication events) in which the process is willing to participate as it executes. The set of all events in which a process can ever participate plays the role of the interface of that process. The notion of a trace plays the role of an observation.

Although the particular notion of process behavior is different for different models, each of the models of CSP contains a collection of algebraic operations on process behaviors, which are used to define the meaning of the various constructs of CSP. In particular, each model has some sort of "restriction" or "hiding" operations, which cause events to be deleted from a process behavior, and some sort of "relabeling" operations, which allow events of a process to be renamed. These operations are used for essentially the same purpose as the abstraction operations used in this thesis. Each model also has one or more "composition" operations (composition by intersection, composition by interleaving, or a mixture of the two) corresponding to the composition operators of this chapter, whose effect is to combine process behaviors in various ways.

The important considerations for models of CSP derive from a feature peculiar to that language. A CSP process can refuse to communicate with its environment. If a CSP process refuses to perform any of the communications offered by its environment, then deadlock is the result. The different definitions of process behaviors in the various models of CSP arise from attempting to deal with (or to ignore) the subtleties of refusals and nondeterminism.

In [Hoare81b], a process behavior is a prefix-closed set of traces, which can be viewed equivalently as a behavior of the kind defined in this thesis. There are operations in [Hoare81b] for deleting and renaming the events of a process. These operations are examples of the abstraction operators used in this thesis.
Process behaviors are composed by the parallel composition operator ||, which is defined as follows: If A is the behavior of a process with interface E and B is the behavior of a process with interface F, then A || B is the set of all traces u formed from events in E ∪ F such that the restriction of u to E is in A and the restriction of u to F is in B. This notion of composition is a particular example of the composition operators defined in this thesis.

Hoare, Brookes, and Roscoe [Hoare81a] extend the work of [Hoare81b] to deal with the problems of refusals and nondeterminism. They do this by permitting behaviors to be more highly structured objects than just sets of traces. In particular, a behavior is a set of pairs <s, X>, where s is a trace, and X is a set of events that can be refused by the process after the trace s has been produced. Although they use a single universal set of events for all processes, we can imagine designating the set of all events that actually appear in a process as the interface of that process. As in the model of [Hoare81b], traces play the role of observations. There are "concealment" operators for deleting events, and "inverse image" operators that permit renaming of events. There are no "direct image" operators, apparently because they are not as well behaved as the inverse image operators. Two kinds of parallel composition operations are defined: composition by intersection, in which events of the component processes are connected, and composition by interleaving, in which the events of the components remain independent.

Rounds and Brookes [Rounds81] attempt to justify and extend the work of [Hoare81a] in the following way: A definition of process behaviors is made that includes somewhat more information than that of [Hoare81a], and is based on supposedly more fundamental intuitive considerations. A number of algebraic operations, including composition and abstraction, are defined on behaviors.
A notion of "observable equivalence" of behaviors is defined, and is shown to be a congruence. The quotient of the algebra of behaviors with respect to this congruence is then shown to be isomorphic to the model of [Hoare81a], thus providing evidence that this model exactly captures the externally observable properties of processes.

There seem to be problems associated with the use of models of CSP as a semantic basis for specification languages. These problems center around the following two questions:

(1) Do traces represent a "complete" record of execution of a process, or simply some finite portion of such a record?
(2) What is the meaning of a liveness specification such as "eventually event a will occur," if a process can be placed in an environment that refuses to permit the occurrence of event a?

With respect to question (1), it is difficult to see how the designers of the CSP models could have intended traces to represent complete observations. This is because in general a complete observation will be infinite, but the CSP models provide no method for extracting infinite traces from behaviors. Without a distinction between complete and incomplete observations, we have no way to determine whether a particular CSP process satisfies a liveness specification. It is clearly ridiculous to require that a specification such as "eventually event a will occur" be satisfied by all "incomplete" as well as all "complete" observations.

Question (2) arises from a desire to "assign the blame" for an unsatisfied liveness specification, either to a process or its environment. If a process can always be placed in an environment that can prevent the occurrence of event a, then the only reasonable conclusion we can draw is that the specification "eventually a will occur" is too strong (i.e. inconsistent). However, it is not clear how to weaken such a specification so that it can be regarded as consistent.
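The difficulty raised by question (1) can be seen in a small finite sketch built on the prefix-closed trace sets and the parallel composition of [Hoare81b] described earlier in this section. The code below is an illustration under stated assumptions: traces are Python tuples, composition is enumerated only up to a length bound, and treating the maximal traces of a finite behavior as its "complete" observations is a simplification made here, not something the CSP models provide.

```python
from itertools import product


def restrict(trace, events):
    """Restriction of a trace to a set of events."""
    return tuple(e for e in trace if e in events)


def prefix_closure(traces):
    """Smallest prefix-closed set containing the given traces."""
    return {t[:n] for t in traces for n in range(len(t) + 1)}


def parallel(A, E, B, F, max_len):
    """Traces over E ∪ F, up to length max_len, whose restriction to E
    lies in A and whose restriction to F lies in B."""
    alphabet = sorted(E | F)
    return {u
            for n in range(max_len + 1)
            for u in product(alphabet, repeat=n)
            if restrict(u, E) in A and restrict(u, F) in B}


def maximal(traces):
    """Traces that are not a proper prefix of another trace in the set."""
    return {t for t in traces
            if not any(len(u) > len(t) and u[:len(t)] == t for u in traces)}


def eventually(event, trace):
    return event in trace


# Two hypothetical processes that synchronize on the shared event "c".
E, F = {"a", "c"}, {"b", "c"}
A = prefix_closure({("a", "c")})
B = prefix_closure({("b", "c")})
AB = parallel(A, E, B, F, max_len=3)

print(sorted(AB))
print(all(eventually("c", t) for t in AB))           # fails: () is in AB
print(all(eventually("c", t) for t in maximal(AB)))  # holds on maximal traces
```

Since every prefix-closed behavior contains the empty trace, "eventually c" can never hold of all traces in the behavior; it becomes meaningful only once some traces are singled out as complete.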
The above problems associated with the models of CSP have been avoided here as follows: First, it is assumed here that the observations in a behavior represent complete records of execution. Second, we accept the obvious conclusion that the specification "eventually a will occur" is inconsistent with respect to a model (such as the model of [Hoare81b]) that admits the possibility of refusals. Instead of trying to find ways to weaken specifications like this so that they can be regarded as consistent even in the face of refusals, though, we construct a model in which refusals are not allowed. This is the idea behind the I/O-behaviors constructed in Chapter 5 of this thesis.

Calculus of Communicating Systems

Rather similar to the models of CSP discussed above is the "Calculus of Communicating Systems" (CCS) of [Milner80]. As in CSP, the notions of a communication event and a sequence of communication events are the fundamental concepts for describing the behavior of a process. The role of a process interface is played, in CCS as in CSP, by the set of communication events in which the process is capable of participating. The CCS notion of an observation is a sequence of events; in contrast to CSP, CCS admits the possibility of infinite observations.

To represent the behavior of a process, Milner introduces the notion of a communication tree whose paths represent all possible complete histories of communication for a process. In a communication tree there can be multiple arcs emanating from a single node, labeled with the same communication event, and arcs can be labeled with the special symbol τ, which represents an internal action of a process not associated with any communication event. Communication trees therefore contain more information about a process than just a simple set of traces. In fact, communication trees contain more information about a process than can be detected through composition with other processes.
Milner addresses this problem by defining several notions of "observable equivalence" of communication trees, and shows that these relations are congruences for an algebra of processes whose operations include operations of composition and abstraction. He suggests that the class of process behaviors be obtained by forming the quotient of the algebra of communication trees with respect to one of these congruences. He is unable to reach a conclusion, though, as to which of the congruences is "best," or to give explicit characterizations (not involving quotient constructions) of the quotient algebras.

Although communication between two processes in CCS, as in CSP, is synchronized in the sense that it is represented by the simultaneous occurrence of communication events for the participating processes, communication in CCS is unlike that in CSP in the sense that a CCS process cannot prevent another process from performing an event. This is because the definition of the composition operation in CCS states that, if process A can perform an event a, and process A' can perform the "complementary" event a', then the composition A || A' can perform either a, or a', or the communication represented by the simultaneous occurrence of both a and a'.

The fact that observations can be infinite in CCS raises the question of whether it is possible to define CCS processes that satisfy interesting liveness properties. However, it seems that this possibility is ruled out by Milner's composition operation. Milner's composition operation is "unfair" in the sense that there are paths in the communication tree corresponding to the composition of two processes along which only one of the component processes gets to run. This means that no process can satisfy a specification of the form: "eventually a will occur," in an environment that has the capability of producing an infinite observation.
Actors

One of the earlier event-based models of computation is the actor model [Greif75, Hewitt77]. An actor system consists of a collection of primitive computing agents (actors) that communicate by passing messages. A computation for an actor system is a partially ordered set of events, where an event marks the arrival of a message at its target. Receipt of a message activates the target actor, and may cause additional messages to be issued. The partial order represents a kind of temporal "precedes" relationship between events, formed by taking the transitive closure of the union of the "causes" relation and the "arrival" ordering, the latter of which linearly orders all events with the same target. Hewitt and Baker [Hewitt77] postulate certain laws that must be satisfied by the various orders.

The actor model was originally applied [Greif75] to the specification of synchronization problems such as the mutual exclusion and readers/writers problems. The specifications are written as axioms that constrain the possible computations of a system. The language used, although not formally defined, is essentially a propositional calculus in which the propositions are of the form "e -> e'," which means that event e must precede event e' in any computation of a system satisfying the specification. Although no notion of state was used in the specifications, the language has nevertheless sufficient expressive power to handle several important examples.

Subsequent work concentrated on applying the actor model to the specification of more complex systems, both distributed and centralized [Yonezawa77]. In contrast to the work of Greif, Yonezawa's specifications have a decidedly state-transition flavor, and although proponents of the actor model consistently argue that global state is not a well-defined notion for distributed systems, the "situations" used in Yonezawa's correctness proofs appear to be just such global states.
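The construction of the "precedes" relation described at the start of this discussion of actors, as the transitive closure of the union of the "causes" relation and the per-target "arrival" ordering, can be sketched on a toy computation. The event names, the two actors X and Y, and the helper `transitive_closure` are hypothetical and chosen only for illustration.

```python
def transitive_closure(pairs):
    """Smallest transitive relation containing the given pairs."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if new <= closure:
            return closure
        closure |= new


# Hypothetical computation: the order in which messages arrive at each actor.
arrivals = {"X": ["e1", "e3", "e4"], "Y": ["e2"]}

# Arrival ordering: linearly orders all events with the same target.
arrival = {(events[i], events[j])
           for events in arrivals.values()
           for i in range(len(events))
           for j in range(i + 1, len(events))}

# Causes relation: receipt of e1 issues the message received as e2, and
# receipt of e2 issues the message received as e4.
causes = {("e1", "e2"), ("e2", "e4")}

precedes = transitive_closure(causes | arrival)

print(sorted(precedes))
print(("e2", "e3") in precedes or ("e3", "e2") in precedes)
```

In this toy computation e2 and e3 are unrelated by "precedes", so the resulting order is genuinely partial.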
In the actor model, the notion of an actor is generally defined by informal axioms and description, which are insufficient to answer the question: "What is an actor?" We must know the answer to this question if we wish to obtain a meaningful notion of the collection of all actors that satisfy a given specification, and to show the validity of rules for deriving consequences of specifications of actor systems. The question of what actors are has only recently been dealt with by Clinger [Clinger81], who defines actors and their computations directly in terms of set-theoretic constructs. It is interesting to note that, although actor enthusiasts like to point out that viewing computations as partially ordered sets of events captures "true" concurrency better than linearly ordered computations, Clinger shows that the laws of Hewitt and Baker are in fact equivalent to the existence of a global linear ordering of events in a computation.

To relate the actor model to the model used in this thesis, we can attempt to identify notions of interface, observation, behavior, abstraction, and composition in the actor model. There seems to be no obvious notion of the interface of an actor. The notion of a partially ordered set of events plays the role of an observation. Roughly speaking, Clinger defines the behavior of an actor to be a function that describes the actor's response (i.e. its state change and message transmissions) to the receipt of a message. Although we can imagine composing a collection of independent actors into a composite system, there seems to be no formal notion in the actor model corresponding to such an operation. As mentioned above, the existence of the arrival ordering prevents the definition of an abstraction operation.

The actor model has certain defects that render it unsuitable for a theory of specification. The major difficulty is that the actor model does not support abstraction of systems in a uniform way.
There are notions of an actor and a system of actors, but no way to view a system abstractly as a single actor. The artificial "arrival ordering," imposed on all events that occur at a single actor, is the primary feature that prevents abstraction from being defined in a reasonable way. Another reason is the fact that every message must contain the name of its target actor, since this means that it is never possible to completely suppress the internal structure of an actor system.

Lynch/Fischer Processes

In the model of distributed computation proposed by Lynch and Fischer [Lynch81], the primitive objects are variables, processes, and systems of processes. A variable is a mailbox-like container for values, and a process is a kind of state machine that can perform input and output on variables. A system of processes consists of a collection of processes that communicate through variables. The variables of a system of processes are partitioned into external and internal variables. There is a kind of composition operation that combines a collection of systems of processes to form a larger system. There is also a kind of abstraction operation that transforms some of the external variables of a system into internal ones.

A correspondence between Lynch and Fischer's model and the model of this thesis can be established, if the notion of an event is identified with Lynch and Fischer's notion of a "variable action." A variable action describes the change in the value of a variable resulting from a single execution step. The interface of a system of processes is the set of all variable actions it can perform. The behavior of a system of processes is, as Lynch and Fischer define, the set of all finite and infinite sequences of variable actions the system is capable of performing.
To view Lynch and Fischer's operation of composition of systems of processes as a special case of the composition operators defined here, it is necessary to account for the requirement that the actions on a single variable in the computation of a system have consistent values. This is easily accomplished if variables are thought of as active entities with an interface and a behavior. The interface of a variable is the set of all variable actions that can be performed on it. The behavior of a variable is the set of all finite and infinite sequences of variable actions in which the value read in each variable action equals the value written in the immediately preceding variable action.

In terms of modeling power, the model of this thesis and that of Lynch and Fischer appear equivalent. Lynch and Fischer's model is certainly capable of handling nondeterminism and liveness properties. The main advantage of the model of this thesis over that of Lynch and Fischer is that the former contains fewer primitive concepts. It is not necessary to draw distinctions between variables, processes, and systems of processes, and the definitions of composition and abstraction are simplified by avoiding these distinctions.

1.4.3 Temporal Logic Specification

Several authors [Hailpern80, Lamport83, Schwartz81] have proposed the use of temporal logic as a specification language and a vehicle for expressing correctness proofs. The use of temporal logic as a specification language evolved gradually from its use as an assertion language, that is, as a language for expressing properties of program executions [Pnueli77, Lamport80]. There is a subtle difference, though, between the semantics appropriate for temporal logic used as an assertion language and temporal logic used as a specification language.
This difference, which has not been explicitly addressed in the literature, can be summarized as follows: Whereas temporal formulas as assertions express properties of single computations of a fixed program, temporal formulas as specifications express properties of the set of computations of an undetermined program. Stated another way, whereas a model for a temporal formula used as an assertion about a fixed program is a single computation of that program, a model for a temporal formula used as a specification is the set of all computations that can be produced by some program. This distinction has important ramifications for what notion of consistency is appropriate in each case. A temporal formula used as an assertion about the computations of a fixed program is consistent if and only if there exists a computation of that program that satisfies the formula. A temporal formula used as a specification is consistent if and only if there exists a program, all of whose computations satisfy the formula.

Another important issue that is not addressed explicitly in the literature on temporal logic specification is the ability to specify a single module in isolation from particular program context. (Footnote 1: Recent work [Barringer83], performed independently of the work described in this thesis, has begun to address some of the same issues, in particular: (1) temporal specifications express properties of sets of computations, rather than single computations, (2) specifications should have meaning that is independent of an enclosing context.) The notion of a program module satisfying a specification in isolation must be meaningful if specifications are to effect the beneficial separation between module use and implementation. Since extant work does not include the notion of the meaning of a specification in isolation, there has been no discussion of the following important question: How can we combine independent module specifications to perform a proof of correctness?
In particular, in what common language can the proof of correctness be expressed, and what deductions in this language are sufficient to imply the correctness of the implementation?

Among the papers on temporal specification of concurrent program modules, the approach developed by Lamport [Lamport83], contemporarily with work on this thesis, results in specifications that appear most similar to the state-transition specifications described here. In Lamport's approach, a specification consists of three parts: (1) A list of state functions, which define salient features of the program state; (2) A list of initial conditions, which represent assumptions on the initial values of the state functions; (3) A list of properties, which constitute the main body of the specification, and which can be viewed as standing for a collection of temporal logic formulas. The properties are of two kinds: safety properties and liveness properties. Safety properties describe the state transitions that are permissible for a program satisfying the specification, and liveness properties describe situations under which transitions are required. The way one writes a specification in Lamport's approach is quite similar to the way one writes state-transition specifications as described in this thesis.

At the semantic level, though, Lamport's approach seems rather different. The difference can be summed up briefly as follows: In Lamport's work, specifications for program modules play the role of assertions about the computations of a complete program in which the module appears. Whether or not a particular program module satisfies a specification can only be determined in such a context. In the framework presented in this thesis, whether a program module satisfies a specification can be determined without reference to any contextual information.

The meaning of the state functions used in Lamport's approach is obscure.
Lamport says that state functions in a specification "should describe information that must be contained in the program state of any real implementation." This statement apparently implies that the value of the state functions is part of the observable behavior of the module being specified, and in this sense is just as important a part of a module specification as the relationship between the arguments passed and results returned from an invocation of an operation on the module. Choosing state functions that provide too detailed a view of the internal operation of a module can result in overspecification, since an implementer wishing to satisfy the specification is constrained to include enough information in the state so that the state functions can be defined.

This thesis resolves the problem of overspecification by introducing the notion of an interface. By defining a module interface, one fixes a particular class of module instances (i.e. the behaviors of that interface) which serves as a domain of discourse for the temporal specifications. In this thesis, a module interface is a set of events. An interface does not contain any notion of module state. States are used merely as a device for increasing the expressive power of the specification language to permit the desired properties of observations to be expressed in a convenient and natural way. Since states are not part of the module interface, the state set in a state-transition specification can be chosen on the basis of convenience, without danger of overspecification.

Schwartz and Melliar-Smith have also proposed the use of temporal logic as a specification language. In [Schwartz80], specifications are developed for the alternating bit communication protocol. Appearing in these specifications are uninterpreted symbols such as "InQ" and "OutQ."
These symbols, like the state functions used by Lamport, are evidently intended to refer to portions of the state that must be identifiable in any program satisfying the specifications. Schwartz and Melliar-Smith present collections of temporal axioms which they claim completely characterize the send and receive processes supporting the alternating bit protocol. There is little basis for this claim, since it is impossible to determine what a process is, much less determine whether the specifications characterize a particular process or class of processes.

The axioms presented by Schwartz and Melliar-Smith involve complicated derived temporal operators such as "latches-until," which make the resulting specifications quite difficult to understand. The specifications have an ad hoc flavor, and it is difficult to obtain insight into how specifications for different examples would be obtained. In contrast, the state-transition approach discussed in this thesis suggests a systematic way of proceeding from an intuitive conception of the desired module behavior to a precise specification.

Schwartz and Melliar-Smith present no proof that their send and receive process specifications correctly implement the service specification for the alternating bit protocol. Experience gained from the examples presented in this thesis suggests that specifications that have not been used in a proof of correctness are quite likely to contain errors.

Hailpern and Owicki [Hailpern80] propose a style of temporal logic specification that is different from the styles of Lamport and of Schwartz and Melliar-Smith. Hailpern and Owicki also use the alternating bit protocol as an example to illustrate their approach to specification.
In addition to symbols representing components of the internal states of processes in the system, Hailpern and Owicki introduce the notion of a history variable, whose value at any instant of time represents the entire history of communication between two processes up until that instant of time. They state explicitly that history variables are simply a descriptive tool, and are not intended to be implemented. History variables appear to be quite useful for writing high-level, nonprocedural specifications. For example, the safety properties satisfied by a transmission line could be expressed by stating that the history of messages delivered is always a prefix of the history of messages sent.

The state-transition approach to specification presented in this thesis takes the history variable idea to its logical conclusion, by allowing arbitrarily structured history information (in the form of states) to be introduced into a specification, together with operations for manipulating this information. This can be done differently for each specification, without change to the underlying semantic model. For example, the specification of the reliable transmission module presented in Chapter 6 uses the notion of the history of all messages input to the reliable transmission module. In the specification of the send protocol module in Chapter 6 it is convenient to define the notion of "the history of all messages for which acknowledgements have been received." This history is a subhistory of the history of all messages transmitted by the send protocol module, and would not be directly accessible in the model of Hailpern and Owicki.

1.4.4 Specification of Communication Protocols

The problem of specification of communication protocols has received a good deal of attention, and can be viewed as a special case of the more general problem, investigated here, of specification of modules in a distributed system.
Two surveys of the protocol specification literature, written from different vantage points, can be found in [Sunshine78] and [Hailpern81].

Of the numerous papers on protocol specification and verification, that of Bochmann [Bochmann78] appears to be most directly relevant to this thesis. Bochmann models a system as a collection of finite-state machines that affect each other through coupled state transitions. This is highly analogous to the definition, given here, of composition of behaviors by identifying events. Bochmann also has a notion of abstraction by ignoring uninteresting transitions, which matches the concept of abstraction of behaviors used here.

Schwabe [Schwabe81a, Schwabe81b] exploits the analogy between the instantaneous state of a communication protocol and a value of an abstract data type, to translate state-transition specifications of protocols into equational axioms that define an abstract data type. This translation enables him to verify correctness properties of communication protocols using an automated verifier (AFFIRM) originally intended for proving properties of abstract data types. However, only certain kinds of correctness properties can be stated and proved using his technique. In particular, liveness properties cannot be handled. Schwabe pays little attention to the semantics of his specifications, leaving some ambiguity as to what objects satisfy a specification, and what constitutes correctness of a protocol.

It is interesting that the notions of hierarchy and modularity of systems, and the prerequisite concept of the interface of a system with its environment, are much more prominent in the literature on protocol specification than they are in the literature on specification in general. In protocol specification, a system is viewed as a nested set of layers: the bottom level corresponds to the communication hardware, and each layer provides an abstract service to the next higher layer.
The top level implements the service provided to the "end user." Typically the service provided by a level can be viewed as an abstract communication network connecting two users, which often have an asymmetric sender/receiver relationship. Higher levels of abstraction are implemented by interposing protocol processes between the users and the communication service provided by the next lower level. The interface between the users and a service comprises the set of operations (e.g. open connection, send message) they can perform. A distinction is drawn between the specification of an abstract service provided to a user (the service specification) and a description of the protocol processes (the protocol specification).

There are only a few specific correctness properties of interest for communication protocols: freedom from deadlock, completeness (i.e. definedness of the protocol in every reachable state), progress, and stability in the face of unexpected perturbations of the protocol. These properties are certainly also of interest for more general kinds of distributed systems.

All verification techniques in the communication protocol literature are ultimately based on representing the protocol processes and abstract communication media as finite-state machines, constructing a combined state-transition graph for the implementation, and performing various analyses on this graph. The state-transition approach to specification and verification is a natural generalization of this technique. It should be noted, however, that the machines used in the state-transition specifications in this thesis are not necessarily finite-state, and that reachability analysis of a system is performed by proving predicates to be invariant, rather than by explicit construction of the combined state-transition graph.
This means that the proof techniques discussed in this thesis need not be subject to the combinatorial explosion problem often referred to in the literature on protocol verification.

1.4.5 Other Concurrent System Specification Techniques

Chen [Chen81, Chen82] develops a concurrent system specification language called EBS (Event-Based Specification Language), and gives specifications for a number of examples, including the alternating bit protocol. The EBS language can be thought of as a generalized version of the language used in [Greif75] to specify various synchronization problems. An EBS specification expresses properties of an event history, which is a partially ordered set of events. The EBS notion of an event history corresponds to the notion of an observation used in this thesis.

Chen's work seems to be motivated by a number of the same concerns that motivated this thesis. In particular, Chen discusses the distinction between the user's view and the designer's or implementer's view of a system, and introduces a notion of interface to capture the way in which a system interacts with its environment. In Chen's approach, a module interface consists of a collection of ports. There is a notion of module interconnection by identifying ports, which is reminiscent of the composition operations used in this thesis. Chen's work does not, apparently, include a notion of behavior, or the idea that a module specification has meaning except with respect to a complete program context. Chen does not have a semantic definition of the correctness of an implementation from which the soundness of proof techniques can be derived. Rather, the notion of correct implementation seems to be identified with the notion of logical consequence.

An interesting property of Chen's specifications is that they tend to be "orthogonal." An orthogonal specification is a specification that is composed of a collection of independent subspecifications.
For example, Chen defines a number of different properties of a reliable transmission system, such as "no loss of messages," "no duplication of messages," and "no erroneous messages." It is not obvious how the state-transition technique presented in this thesis could support the writing of specifications with a comparable orthogonality property.

The Gypsy system [Good79, Good82] has some capability for the specification and verification of distributed systems. In the Gypsy model, a distributed system is viewed as a collection of independent processes that communicate through message buffers. Specifications of the communication function performed by a process are expressed in terms of properties of "buffer histories," which represent the sequences of messages transmitted on, or received from, message buffers. Gypsy seems capable of handling only safety properties. Correctness proofs in Gypsy are performed by deriving a collection of verification conditions from annotated program text, and then proving the validity of these verification conditions using a semi-automatic theorem prover. Evidently the validity of the verification conditions is taken as the definition of correctness; the literature shows no attempt to justify the sufficiency of the verification conditions in terms of any fundamental model of computation. Reasoning about the behavior of a system of processes in Gypsy is done in terms of relationships between buffer histories. The approach appears similar to Hailpern and Owicki's history variable approach.

An outgrowth of the Gypsy work is the work of DiVito [DiVito82], which is concerned with the description and mechanical verification of communication protocols. DiVito's specifications contain safety properties only, and are expressed in a decision table style that captures much the same information as the definitions of state-transition relations presented in this thesis.
The purpose of DiVito's work seems to be to quickly reach a point at which experimentation with mechanical verification is possible. His focus is primarily on linguistic issues, rather than their semantics.

Lansky and Owicki [Lansky83] have developed a language, called GEM, for the specification and verification of properties of concurrent systems. The underlying model of computation is an event-oriented model similar to the actor model [Greif75, Hewitt77], in which a computation of a system is represented as a set of events plus various relations on this set. The enable relation captures the notion of necessary temporal precedence, or causality, between events. The element partial ordering captures the notion of incidental temporal precedence, where one event precedes another because they happen to occur at the same point in space. The temporal partial ordering is the transitive closure of the union of the enable relation and the element ordering.

Besides the notion of an event and the relations on events discussed above, GEM includes a number of additional primitive notions. An element corresponds to a locus of activity or point in space. A group is a set of elements and other groups, which is used to collect semantically related objects. History sequences are certain increasing sequences of computation prefixes, and are used as a domain of interpretation for temporal logic formulas. Threads are a mechanism for dynamically grouping a sequence of related events.

The issues considered by GEM seem largely orthogonal to those examined in this thesis. The design of GEM seems to have been motivated primarily by a desire to describe, within a common framework, the semantics of a number of primitives of concurrent programming languages. For example, monitors and the CSP communication primitives are discussed.
In contrast, this thesis is not concerned with the description of programming language primitives, although this is a problem that must ultimately be addressed. A GEM specification describes constraints on computations of a single program, whereas in this thesis a specification is viewed as describing constraints on the entire set of computations of an undetermined program. GEM apparently does not include any notion of behavior, composition, or abstraction.

Yonezawa [Yonezawa77] develops techniques for the specification and verification of parallel programs, based on the actor model of computation. The central concepts used in these techniques are the notions of a conceptual state, and a situation. A conceptual state is a summary of the past communication history of an actor, and corresponds closely to the conceptual states used in the state-transition specifications of this thesis. A situation assigns a conceptual state to each actor in a system, and is used in verification in much the same way as the state of the "composite machine" is used in this thesis (see Chapter 3). The notion of an implementation invariant appears in Yonezawa's work, and plays roughly the same role there as it does in this thesis. Yonezawa's model seems to incorporate a notion of hierarchy of abstraction, in the sense that it is possible to view a system both at a more detailed level, where there is a larger collection of events and more detailed states, and at a less detailed level, where only a subset of the events is considered and less information is contained in the states.

Yonezawa's specifications look very much like the definitions of state-transition relations used here, in the sense that a specification describes, for each possible event, a precondition on the state that must hold for an event to occur, and a postcondition that describes the state that results after the event occurrence.
The semantics of the event/precondition/postcondition triples used by Yonezawa seems to differ from their counterparts in this thesis, in the sense that if the precondition of an event ever holds, then that event must eventually occur. Thus, Yonezawa's formalism appears, to a certain extent, to be capable of expressing liveness properties.

There are three major deficiencies with Yonezawa's work, which are improved upon in this thesis: (1) The semantics of Yonezawa's specifications are defined informally in terms of the actor model, whose precise definition is somewhat obscure. It is therefore not possible to address rigorously the question of what constitutes correctness in Yonezawa's model, and to show that his proof techniques suffice to prove correctness. (2) The actor model lacks a useful notion of modular decomposition. In particular, there is no reasonable way to view a system of actors as a single actor. (3) Yonezawa's techniques can handle only a very limited form of liveness property in specifications and proofs; namely, those of the form: "If the precondition of an event holds, then eventually that event must occur."

2. Framework for a Theory of Specification

The purpose of this chapter is to construct a framework of definitions that is suitable as a foundation for a theory of specification. We present and motivate formal definitions of the notions, discussed informally in Chapter 1, of "interface," "observation," "abstraction," "decomposition," "implementation," and "correctness."

2.1 Interfaces, Observations, and Behaviors

An event is an observable instantaneous occurrence during the operation of a computer system. If one were to examine a particular computer system in microscopic detail, the events of a system could be identified with physical events, such as voltage changes on signal lines.
However, we are generally not interested in such a large amount of detail, and instead regard large classes of physical events as equivalent and indistinguishable. Examples of such equivalence classes are: the event in which process A submits a message to a transmission system for delivery to process B, the event in which the variable x is set to three, and the event in which the synchronizer module receives a try request from user process p.

The first step in modeling a particular system is to identify and classify the interesting instantaneous occurrences. As a result of this procedure, we associate with each system and each particular level of abstraction at which the system is to be viewed, an "interface," which represents the set of all possible instantaneous occurrences of interest at the given level of abstraction, plus a single element $\lambda$, which represents all uninteresting occurrences. Lower levels of abstraction (those that incorporate more detail) are characterized by larger interfaces, corresponding to finer classifications of the instantaneous occurrences, whereas higher levels of abstraction are associated with smaller interfaces, corresponding to coarser classifications.

Definition - An interface is a structure $\langle E, \lambda_E, \ldots \rangle$, where $E$ is a set whose elements are called events, $\lambda_E$ is a distinguished element of $E$ called the null event, and the ellipsis indicates that further structure may be present.

We use the symbol $E$ to denote both the entire structure and the underlying set of events. When the interface $E$ is clear from the context, we will omit subscripts, writing $\lambda$ instead of $\lambda_E$.

In general, an interface $E$ will have additional structure besides the distinguished element $\lambda_E$. For example, in Chapter 5 we will be concerned with interfaces of the form $\langle E, \lambda_E, \mathrm{In}_E, \mathrm{Out}_E \rangle$, where $\mathrm{In}_E$ and $\mathrm{Out}_E$ are subsets of $E$ called the sets of "input events" and "output events," respectively.
Except for the material in Chapter 5, the only structure required is the existence of the distinguished null event $\lambda_E$.

If $E$ is an interface, then let $E^*$ denote the set of all finite strings, and $E^\infty$ the set of all finite and infinite strings, on the alphabet $E - \{\lambda_E\}$. It is convenient to view $E$ as a subset of $E^*$ and $E^\infty$, where the element $\lambda_E$ of $E$ is identified with the unique string of length zero, and each non-$\lambda$ element $e$ of $E$ is identified with the corresponding string $e$ of length one.

In the synchronizer example, the interfaces are defined as follows. Let $\mathrm{Proc}$ be the set of user processes. The synchronizer module has interface $E_{SM} = \{\mathrm{try}^p, \mathrm{run}^p, \mathrm{rest}^p : p \in \mathrm{Proc}\} \cup \{\lambda\}$. A synchronizer component module has interface $E_{SC} = \{\lambda, \mathrm{try}, \mathrm{run}, \mathrm{rest}, \mathrm{token\_in}, \mathrm{token\_out}, \mathrm{request\_in}, \mathrm{request\_out}\}$.

To describe the functioning of a system during a single execution, we postulate the existence of an omniscient observer, outside of the system under consideration. The observer is able to watch the operation of the system and compile a complete record of the events that occur, along with their time of occurrence. We refer to this record, the structure of which will be precisely defined below, as an "observation."

An observation is a function that maps each instant of time $t$ in the interval $[0, \infty)$ to the event that occurs at time $t$. We assume that at most finitely many non-$\lambda$ events can occur in any bounded interval. This assumption, which is used to permit inductive reasoning about observations, seems reasonable if we think of a computer as executing in discrete steps taken at a finite rate. The fact that an observation is a (single-valued) function implies that at most one event occurs at each instant of time. This is not to be interpreted as a fact about real-world systems, but rather as part of the definition of the term "event." That is, by definition no more than one event occurs at any instant.
To model a situation in which a number of primitive occurrences can happen simultaneously, we must use an interface that contains one event for each possible combination of primitive occurrences.

The reason why we define observations as functions from $[0, \infty)$ to events rather than simply as sequences of events (and in Chapter 3 define computations on $[0, \infty)$ as well), is a technical one. We shall often be interested in composing a collection of observations, one for each component module in a system of modules, to obtain a single observation of the composite system. If observations are defined to be sequences of events, then composition of observations corresponds (in the special case that the component modules do not interact) to interleaving of sequences. For example, if a module $M_1$ can produce the sequence of events $ab$ and module $M_2$ can produce the sequence of events $cd$, then the composite system consisting of modules $M_1$ and $M_2$ can produce the interleaved sequence of events $acbd$. The feature of interleaving that is inconvenient for our purposes is the fact that the indices of events change under interleaving. That is, the event $b$ appears as the second event in the sequence $ab$, but as the third event in the sequence $acbd$. The definitions of observation and composition we use have the more convenient property that an event appearing at time $t$ in an observation for a module in isolation always corresponds to the event appearing at time $t$ in a composite observation.

Definition - An observation over an interface $E$ is a function $x: [0, \infty) \to E$, such that $x(t) \neq \lambda$ for at most finitely many $t$ in each bounded interval.

Let $\Lambda$ denote the identically $\lambda$ observation, and let $\mathrm{Obs}(E)$ denote the set of all observations over $E$. If $x \in \mathrm{Obs}(E)$, and $a \in [0, \infty)$, then let $[x]$ denote the function that maps each $t \in [0, \infty)$ to the (finite) string of non-$\lambda$ events that occur during the interval $[0, t)$ in $x$.
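As a rough illustration of the definitions above, the following Python sketch represents an observation by the times of its non-null events and computes the history string $[x](t)$. It is only a sketch under simplifying assumptions: the class name `Observation`, the method name `history`, and the restriction to finitely many non-null events overall (rather than finitely many per bounded interval) are choices made here, not part of the thesis.

```python
from typing import Dict, List

NULL = "λ"  # stand-in for the null event (assumed encoding)


class Observation:
    """An observation with finitely many non-null events (simplifying assumption)."""

    def __init__(self, events: Dict[float, str]) -> None:
        if any(t < 0 for t in events):
            raise ValueError("event times must lie in [0, infinity)")
        # Store only the non-null events; every other instant maps to NULL.
        self.events = {t: e for t, e in events.items() if e != NULL}

    def __call__(self, t: float) -> str:
        # x(t): the event occurring at time t, or the null event.
        return self.events.get(t, NULL)

    def history(self, t: float) -> List[str]:
        # [x](t): the string of non-null events occurring in [0, t), in time order.
        return [self.events[s] for s in sorted(self.events) if s < t]


x = Observation({1.0: "a", 2.5: "b"})
print(x(1.0), x(2.0), x.history(2.5), x.history(3.0))
```

Because the dictionary maps each time to a single event, the requirement that at most one event occurs at each instant holds by construction.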
Let $\mathrm{suffix}_a(x)$ be the observation $y \in \mathrm{Obs}(E)$ such that $y(t) = x(t + a)$ for all $t \in [0, \infty)$.

By collecting the set of all observations that can be produced by a system in various environments, we obtain the "behavior" of that system.

Definition - A behavior of interface $E$ is a subset of $\mathrm{Obs}(E)$. Let $\mathrm{Beh}(E)$ be the set of all behaviors of interface $E$.

2.2 Abstraction, Decomposition, and Interconnection

In this section, we show how the concepts of hierarchy of abstraction and modular decomposition can be captured through the use of certain mappings between interfaces, which we call "translations," and the corresponding mappings they induce on observations.

Definition - A translation from an interface $E$ to an interface $F$ is a function $h: E \to F$ such that $h(\lambda_E) = \lambda_F$. A translation $h$ from $E$ to $F$ extends in a natural way to a function $h: \mathrm{Obs}(E) \to \mathrm{Obs}(F)$, under the definition $h(x) = h \circ x$.

The concept of an "interconnection," defined below, is the formal notion corresponding to a diagram like Figure 2. Intuitively, an interconnection consists of an "abstraction map," which captures the relationship between a more concrete and a more abstract view of a system, and a "decomposition map," which captures the relationship between a composite system and its component modules. An abstraction map is simply a translation from the interface corresponding to the concrete view, to the interface corresponding to the abstract view. A decomposition map is a collection of translations that shows how the events for the composite system are decomposed into events for the component modules.

Definition - An interconnection is a pair $\mathfrak{I} = \langle \alpha^{\mathfrak{I}}, \langle \delta_i^{\mathfrak{I}} \rangle_{i \in I} \rangle$, where $\alpha^{\mathfrak{I}}: E^{\mathfrak{I}} \to D^{\mathfrak{I}}$ is a translation, $I$ is a finite index set, and each $\delta_i^{\mathfrak{I}}: E^{\mathfrak{I}} \to F_i^{\mathfrak{I}}$ is a translation. The interfaces $F_i^{\mathfrak{I}}$ are the component interfaces of $\mathfrak{I}$, the interface $E^{\mathfrak{I}}$ is the composite interface of $\mathfrak{I}$, and the interface $D^{\mathfrak{I}}$ is the abstract interface of $\mathfrak{I}$.
The translation $\alpha^{\mathfrak{I}}$ is the abstraction map of $\mathfrak{I}$, and the vector $\langle \delta_i^{\mathfrak{I}} \rangle_{i \in I}$ is the decomposition map of $\mathfrak{I}$.

In the sequel, underlining will be used to denote a vector of objects; thus we write $\underline{\delta}^{\mathfrak{I}}$ for the vector $\langle \delta_i^{\mathfrak{I}} \rangle_{i \in I}$.

The synchronizer implementation yields an example of an interconnection. The content of Figure 2 is formalized by the interconnection $SM' = \langle \alpha^{SM'}, \langle \delta_p^{SM'} \rangle_{p \in \mathrm{Proc}} \rangle$, where $E^{SM'}$, $\alpha^{SM'}: E^{SM'} \to E_{SM}$, and $\delta_p^{SM'}: E^{SM'} \to E_{SC}$, $p \in \mathrm{Proc}$, are defined below.

The composite interface for the synchronizer module implementation is $E^{SM'} = \{\mathrm{try}^p, \mathrm{run}^p, \mathrm{rest}^p, \mathrm{token}^p, \mathrm{request}^p : p \in \mathrm{Proc}\} \cup \{\lambda\}$.

The decomposition map $\underline{\delta}^{SM'}$ projects or decomposes each event for the composite interface into corresponding events for the synchronizer component modules. The events $\mathrm{try}^p$, $\mathrm{run}^p$, and $\mathrm{rest}^p$ in $E^{SM'}$ decompose to try, run, and rest events of the $p$th synchronizer component module. The events $\mathrm{token}^p$ and $\mathrm{request}^p$ of $E^{SM'}$ represent interaction between the $p$th synchronizer component module and its neighbors in the ring. Specifically, the event $\mathrm{token}^p$ represents the joint occurrence of a token_out event for the $p$th synchronizer component module, and a token_in event for the $(p+1)$st synchronizer component module. Similarly, the event $\mathrm{request}^p$ represents the joint occurrence of a request_out event for the $p$th synchronizer component module and a request_in event for the $(p-1)$st synchronizer component module. Formally,

$$\delta_p^{SM'}(e) = \begin{cases} \mathrm{try}, & \text{if } e = \mathrm{try}^p \\ \mathrm{run}, & \text{if } e = \mathrm{run}^p \\ \mathrm{rest}, & \text{if } e = \mathrm{rest}^p \\ \mathrm{token\_in}, & \text{if } e = \mathrm{token}^{p-1} \\ \mathrm{token\_out}, & \text{if } e = \mathrm{token}^p \\ \mathrm{request\_in}, & \text{if } e = \mathrm{request}^{p+1} \\ \mathrm{request\_out}, & \text{if } e = \mathrm{request}^p \\ \lambda, & \text{otherwise.} \end{cases}$$

The abstraction map $\alpha^{SM'}$ preserves events in which the system of synchronizer component modules interacts with the user processes, but deletes (i.e. maps to $\lambda$) events corresponding to internal interaction between synchronizer component modules. Formally,

$$\alpha^{SM'}(e) = \begin{cases} e, & \text{if } e \in \{\mathrm{try}^p, \mathrm{run}^p, \mathrm{rest}^p : p \in \mathrm{Proc}\} \\ \lambda, & \text{if } e \in \{\mathrm{token}^p, \mathrm{request}^p : p \in \mathrm{Proc}\} \cup \{\lambda\}. \end{cases}$$
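The decomposition and abstraction maps for the synchronizer can be sketched in Python as follows. This is an illustrative sketch, not the thesis's own notation: the encoding of events as `(kind, p)` tuples, the ring size `N = 3`, the arithmetic modulo `N` for ring neighbors, and the helper `lift` (which extends a translation pointwise to an observation, as in $h(x) = h \circ x$) are all assumptions made for the example.

```python
from typing import Callable, Dict, Tuple, Union

NULL = "λ"   # stand-in for the null event (assumed encoding)
N = 3        # hypothetical ring size; processes are numbered 0..N-1
Event = Union[str, Tuple[str, int]]  # NULL, or (kind, process index)


def delta(p: int, e: Event) -> str:
    """Decomposition map for component p: composite event -> component event."""
    if e == NULL:
        return NULL
    kind, q = e
    if kind in ("try", "run", "rest"):
        return kind if q == p else NULL
    if kind == "token":
        if q == p:
            return "token_out"
        return "token_in" if q == (p - 1) % N else NULL
    if kind == "request":
        if q == p:
            return "request_out"
        return "request_in" if q == (p + 1) % N else NULL
    return NULL


def alpha(e: Event) -> Event:
    """Abstraction map: keep user-visible events, delete internal ones."""
    if e != NULL and e[0] in ("try", "run", "rest"):
        return e
    return NULL


def lift(h: Callable[[Event], Event], x: Dict[float, Event]) -> Dict[float, Event]:
    """Apply a translation pointwise to an observation given as {time: event}."""
    return {t: h(e) for t, e in x.items() if h(e) != NULL}


# A composite observation: process 1 tries, process 0 passes the token, process 1 runs.
x = {1.0: ("try", 1), 2.0: ("token", 0), 3.0: ("run", 1)}
print(lift(lambda e: delta(1, e), x))
print(lift(lambda e: delta(0, e), x))
print(lift(alpha, x))
```

Note that each event keeps its time of occurrence under every map, which is the property that motivated defining observations as functions of time rather than as sequences.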
We assign intuitive significance to some of the operators on behaviors that are naturally induced by abstraction and decomposition maps. The direct image operator associated with an abstraction map takes a behavior of a system viewed at a more concrete level, and produces the corresponding behavior of that system viewed at a more abstract level.

Definition - The abstraction operator associated with a translation $\alpha: E \to D$ is the function, also denoted by $\alpha$, that maps each behavior $B \in \mathrm{Beh}(E)$ to the direct image $\alpha(B) \in \mathrm{Beh}(D)$. We refer to the behavior $\alpha(B)$ as the abstraction of $B$ under $\alpha$.

The inverse image operator induced by a decomposition map models the operation of composing a collection of component module behaviors to produce the corresponding behavior of the composite system. Intuitively, if $S$ is a system consisting of component modules [...] $\langle B_i \rangle_{i \in I}$, where $B_i \in \mathrm{Beh}(F_i)$ for each $i \in I$, to a behavior $\underline{\delta}^{-1}(\underline{B}) \in \mathrm{Beh}(E)$, under the definition:

$$\underline{\delta}^{-1}(\underline{B}) = \{x \in \mathrm{Obs}(E) : \delta_i(x) \in B_i \text{ for all } i \in I\}.$$

Thus, the set $\underline{\delta}^{-1}(\underline{B})$ contains an observation $x \in \mathrm{Obs}(E)$ iff $\delta_i(x) \in B_i$ for all $i \in I$. We call this set the composition of $\underline{B}$ under $\underline{\delta}$.

2.3 Specification, Implementation, and Correctness

In practice a specification will take the form of a string of symbols in a formal specification language, since it must be possible to write down a specification. However, since this thesis is not concerned with the details of a particular formal language in which specifications are to be expressed, it is convenient to adopt a more liberal view: A specification is any mathematical object that denotes, in a well-defined way, an interface and a set of behaviors of that interface.

Definition - A specification language is a triple $\langle \mathrm{Specs}, \mathcal{E}, \mathcal{B} \rangle$, where $\mathrm{Specs}$ is a set of specifications, $\mathcal{E}$ is a mapping that assigns an interface $\mathcal{E}(S)$ to each specification $S \in \mathrm{Specs}$, and $\mathcal{B}$ is a mapping that assigns a set $\mathcal{B}(S) \subseteq \mathrm{Beh}(\mathcal{E}(S))$ to each specification $S \in \mathrm{Specs}$.
We say that $S$ is a specification of interface $\mathcal{E}(S)$, and that each $B \in \mathcal{B}(S)$ satisfies $S$.

An interconnection describes the pattern of interaction between modules in a system in analogy to the way a program scheme describes the flow of control between uninterpreted statements. It makes no sense to speak of an interconnection as "correct" or "incorrect," since an interconnection includes no information about the behaviors of the component or abstract modules. However, if we provide an interpretation for the modules by augmenting an interconnection with specifications of the abstract and component module interfaces, it does become meaningful to speak of correctness. We use the term "implementation" for an interconnection augmented with specifications.

Definition - An implementation is a tuple $\langle \mathfrak{I}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$, where $\mathfrak{I}$ is an interconnection, $S_{abs}$ is a specification of interface $D^{\mathfrak{I}}$, and $S_i$ is a specification of interface $F_i^{\mathfrak{I}}$, for each $i \in I$.

An implementation is correct if, whenever acceptable behaviors are plugged in for the component modules, then the resulting abstract module behavior is also acceptable. The composition and abstraction operators associated with the interconnection formalize the notion of "plugging in."

Definition - An implementation $\langle \mathfrak{I}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is correct if $\alpha^{\mathfrak{I}} \circ (\underline{\delta}^{\mathfrak{I}})^{-1}(\underline{B}) \in \mathcal{B}(S_{abs})$, whenever $B_i \in \mathcal{B}(S_i)$ for each $i \in I$.

3. State-Transition Specifications

In this chapter, we will investigate a particular approach, called "state-transition specification," to the derivation of module specifications. In this approach, we imagine that at any instant of time a module can be thought of as being in one of a number of conceptual states. Associated with each conceptual state is a collection of events that can occur in that state, and a description of the state change that results from the occurrence of each of those events.
Thus, a state-transition specification describes the desired functioning of a module in terms of a kind of machine that generates an observation as it executes. It is important to note that the conceptual states in a state-transition specification are merely a tool for describing the desired functioning, and need not have anything to do with the "real" state present in any particular module instance that satisfies the specification.

The properties captured by the state-transition technique discussed here are divided into two classes: "local" properties, which concern the relationship between an event and the conceptual state of the module immediately preceding and immediately following the occurrence of that event, and "global" properties, which relate events and states perhaps distant from each other in time. Local properties are of the form: "An event e can occur only if the state of the module satisfies P, and if e occurs, then the old state and new state of the module are related by the binary relation R." Examples of global properties are "eventuality" conditions of the form: "If the module is now in a state with property P, then eventually event e will occur." Local properties are specified by a machine as mentioned above. Global properties are specified by defining a set of "validity conditions" on computations of the machine. The set of computations that satisfy the validity conditions is called the set of "valid" computations.

The reason for investigating state-transition specifications is that they appear to provide a natural, straightforward strategy for turning an intuitive understanding of the desired function of a module into a formal specification. This strategy consists of the following steps:

(1) Define an appropriate set of conceptual states.
For example, in the specification of the abstract synchronizer module, a state is a vector that tells for each user process whether the synchronizer module thinks that process is trying, running, resting, or in error.

(2) Define a set of initial states, in which the module begins execution. For the synchronizer module, there is a single initial state in which all user processes are resting.

(3) Define, for each event, the conditions required on the state for the occurrence of that event to be possible, and the state changes associated with an occurrence of that event. For example, a "run" event for process p can occur only if p is trying and no other process is currently running. Occurrence of a "run" event causes the state of p to change to "running" and leaves the states of all other processes unchanged.

(4) Define the desired global properties for the module. For the synchronizer module, we wish to require that every user request eventually result in a corresponding reply, if possible.

Besides serving as a natural vehicle for formalizing specifications, the state-transition approach also provides a strategy for performing correctness proofs. The Correctness Theorem (Theorem 3.9) gives sufficient conditions for correctness that exploit the machine structure of the specifications.

This chapter is organized as follows: In Section 3.1 the notion of "subset specifications," of which state-transition specifications are an example, is introduced. In Section 3.2 the machines used in state-transition specifications are defined, and in Section 3.3 some tools for reasoning about their computations are developed. The notion of a state-transition specification is defined in Section 3.4. In Section 3.5 the Correctness Theorem, which is the main result of this chapter, is proved. Section 3.6 shows that the Correctness Theorem is a natural generalization of the "possibilities mapping" proof technique of Lynch [Lynch83] and Goree [Goree81].
Section 3.7 shows how the proof technique suggested by the Correctness Theorem can be further systematized in the case of state-transition specifications whose sets of valid computations have been defined by "rely-/guarantee-conditions."

3.1 Subset Specifications

As discussed in Chapter 2, a specification $S$ of interface $E$ defines a set $\mathcal{B}(S)$ of behaviors of interface $E$. In general, we might look for specification techniques that are capable of expressing arbitrary properties of behaviors. However, in practice it appears that the properties of behaviors we wish to express in a specification are nearly always of a special form. That is, it is nearly always the case that we wish to express universal properties of the observations in a behavior, of the form: "Every observation $x$ in $B$ has property $P$," where $P$ is a property of observations. This means that in practice it is usually not necessary to have a specification technique that is powerful enough to express arbitrary properties of behaviors. Rather, a less powerful technique, which is capable only of expressing properties of observations, suffices. The state-transition specification technique introduced in this chapter is of this less powerful variety.

Definition - A specification $S$ of interface $E$ is a subset specification if there exists a set $O(S) \subseteq \mathrm{Obs}(E)$ such that $\mathcal{B}(S) = \{B \in \mathrm{Beh}(E) : B \subseteq O(S)\}$.

For the rest of this thesis we will be concerned only with subset specifications. To see what we give up by restricting our attention to subset specifications, let us consider some examples.

Examples of properties of behaviors that can be expressed as subset specifications, that is, as universal statements about observations in a behavior $B$, are the following:

- Every observation in $B$ contains at most finitely many occurrences of non-$\lambda$ events (that is, computation always quiesces).
- In every observation in $B$, either each occurrence of a try event for process $p$ is ultimately followed by a run event for process $p$, or else there is a point in time after which some process is in the "running" state forever.

Examples of properties of behaviors that cannot be expressed as subset specifications, and hence cannot be captured by the state-transition approach discussed here, are:

- There exists an observation in $B$ that contains at most finitely many occurrences of non-$\lambda$ events (there exists a quiescing computation).

- If $x$ is an observation in $B$ and $t \in [0, \infty)$ is such that $[x](t) = u$, then there is an observation $y \in B$ and a $t' \in [0, \infty)$ such that $[y](t') = ue$ (if the module is capable of doing $u$, then it is also capable of doing $ue$).

- If $x$ is an observation in $B$, and $f$ is an order-isomorphism from $[0, \infty)$ to $[0, \infty)$, then $x \circ f$ is also an observation in $B$ (the module is asynchronous, or timing-independent).

Because the properties of behaviors defined by subset specifications are really just "lifted" properties of observations, the definition of correctness of an implementation that involves subset specifications has an equivalent statement in terms of observations.

Lemma 3.1 - Suppose that $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is an implementation, where $S_{abs}$ and each $S_i$ is a subset specification. Then $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is correct iff $\alpha_{\mathcal{J}} \circ (\Delta_{\mathcal{J}})^{-1}(\langle O(S_i) \rangle_{i \in I}) \subseteq O(S_{abs})$.

Proof - ($\Rightarrow$) Suppose $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is correct. Suppose that $x \in \mathrm{Obs}(E_{\mathcal{J}})$ is such that $\delta_i(x) \in O(S_i)$ for each $i \in I$. Then the behavior $\{x\}$ is the composition under $\Delta_{\mathcal{J}}$ of the vector of behaviors $\langle \{\delta_i(x)\} \rangle_{i \in I}$, and the behavior $\{\alpha_{\mathcal{J}}(x)\}$ is the abstraction under $\alpha_{\mathcal{J}}$ of the behavior $\{x\}$. Since the behavior $\{\delta_i(x)\}$ satisfies $S_i$ for each $i \in I$, it follows by correctness that the behavior $\{\alpha_{\mathcal{J}}(x)\}$ satisfies $S_{abs}$. Thus $\alpha_{\mathcal{J}}(x) \in O(S_{abs})$.

($\Leftarrow$) Suppose that $\alpha_{\mathcal{J}} \circ (\Delta_{\mathcal{J}})^{-1}(\langle O(S_i) \rangle_{i \in I}) \subseteq O(S_{abs})$. For each $i \in I$, suppose that $B_i$ is a behavior that satisfies $S_i$. Then $B_i \subseteq O(S_i)$. Let $B_{abs} = \alpha_{\mathcal{J}} \circ (\Delta_{\mathcal{J}})^{-1}(\langle B_i \rangle_{i \in I})$.
Then $B_{abs}$ is a subset of $O(S_{abs})$ by hypothesis, and hence $B_{abs}$ satisfies $S_{abs}$.

3.2 Machines and Computations

In this section, we define a kind of nondeterministic machine that generates an observation in each of its computations.

Definition - A nondeterministic event machine (or just "machine" for short) $M$ consists of:

- An interface $E_M$.
- A set $Q_M$ of states.
- A nonempty set $\mathrm{Init}_M \subseteq Q_M$ of initial states.
- A relation $\mathrm{Trans}_M \subseteq \mathrm{Steps}(E_M, Q_M) = Q_M \times E_M \times Q_M$, called the state-transition relation, such that for all $q \in Q_M$, the null step $\langle q, \lambda, q \rangle \in \mathrm{Trans}_M$.

If $E_M = E$, then we say that $M$ is a machine of interface $E$.

The state-transition relation $\mathrm{Trans}_M$ of a machine $M$ has a natural extension $\mathrm{Trans}_M^*$ that applies to strings of events, rather than just single events. Formally, define $\mathrm{Trans}_M^* \subseteq Q_M \times E_M^* \times Q_M$ to be the least relation containing $\mathrm{Trans}_M$, and having the following closure property: If $\langle q, u, r \rangle \in \mathrm{Trans}_M^*$ and $\langle r, v, s \rangle \in \mathrm{Trans}_M^*$, then $\langle q, uv, s \rangle \in \mathrm{Trans}_M^*$. (Recall from Section 2.1 that we identify the null event $\lambda_E$ with the empty string.)

Definition - A state $q \in Q_M$ is reachable by $M$ if there exists a state $q_0 \in \mathrm{Init}_M$ and a string $u \in E_M^*$ such that $\langle q_0, u, q \rangle \in \mathrm{Trans}_M^*$. Suppose that $R \subseteq Q_M$. Then $R$ is inductive for $M$ if

(1) $\mathrm{Init}_M \subseteq R$.
(2) For all $\langle q, e, r \rangle \in \mathrm{Trans}_M$, if $q \in R$ then $r \in R$.

We say that $R$ is invariant for $M$ if it contains all reachable states of $M$.

The following extremely important induction principle is a standard technique (see, e.g., [Keller76]) for proving properties of reachable states.

Lemma 3.2 (Induction Principle) - Suppose $M$ is a machine, and that $R \subseteq Q_M$. If $R$ is inductive for $M$ then $R$ is invariant for $M$.

Proof - Straightforward.

Ordinarily, a computation of a machine might be defined to be a pair consisting of a state sequence $q_0, q_1, \ldots$, and an event sequence $e_0, e_1, \ldots$, such that each step $\langle q_k, e_k, q_{k+1} \rangle$ satisfies the state-transition relation.
Intuitively, $q_k$ and $q_{k+1}$ represent the states "just before" and "just after" the occurrence of the event $e_k$, respectively. To define a computation in which the notion of an event sequence has been replaced by that of an observation, we generalize the notion of a state sequence to that of a "state function," which assigns a state to each nonnegative real number, in such a way that the notion of state "just before" and "just after" each point $t \in [0, \infty)$ is meaningful.

Definition - A state function over a set of states $Q$ is a function $f: [0, \infty) \to Q$ such that for all $t \in [0, \infty)$, there exists $\varepsilon_t > 0$ such that $f$ is constant on the intervals $[t - \varepsilon_t, t] \cap [0, \infty)$ and $(t, t + \varepsilon_t]$.

We write $f(t+)$ as an abbreviation for the constant value of $f$ on the interval $(t, t + \varepsilon_t]$, which intuitively represents the state "just after" time $t$. The state at and also "just before" time $t$ is represented by the value $f(t)$.

Definition - A history over an interface $E$ and state set $Q$ is a pair $X = \langle \mathrm{Obs}_X, \mathrm{State}_X \rangle$, where $\mathrm{Obs}_X$ is an observation over $E$, and $\mathrm{State}_X$ is a state function over $Q$. Let $\mathrm{Hist}(E, Q)$ denote the set of all histories over interface $E$ and state set $Q$.

If $X \in \mathrm{Hist}(E, Q)$ and $t \in [0, \infty)$, then define the step occurring at time $t$ in $X$ by:

$$\mathrm{Step}_X(t) = \langle \mathrm{State}_X(t), \mathrm{Obs}_X(t), \mathrm{State}_X(t+) \rangle.$$

The generalization of the ordinary definition of a computation is now straightforward.

Definition - A computation of a machine $M$ is a history $X \in \mathrm{Hist}(E_M, Q_M)$ such that

(1) $\mathrm{State}_X(0) \in \mathrm{Init}_M$.
(2) $\mathrm{Step}_X(t) \in \mathrm{Trans}_M$ for all $t \in [0, \infty)$.

Let $\mathrm{Comp}(M)$ denote the set of all computations of $M$.

If $V$ is a set of computations of $M$, then define $\mathrm{Obs}(V)$, the set of all observations generated by $V$, by $\mathrm{Obs}(V) = \{\mathrm{Obs}_X : X \in V\}$.

3.3 Properties of Histories

The purpose of this section is to develop some machinery for passing back and forth between histories and "history skeletons," which are sequences of steps plus timing information. Each history skeleton naturally defines a unique history.
Conversely, given a history $X$ we can extract (though not in a unique way) a history skeleton by restricting $\mathrm{Step}_X$ to a suitable subset $T$ of $[0, \infty)$. Whereas histories have convenient behavior under projection, history skeletons are more useful for performing computational induction arguments.

Definition - A skeletal sequence is a monotone increasing sequence $t_0 < t_1 < \cdots$ of elements of $[0, \infty)$, such that $t_0 = 0$ and $t_k \to \infty$ as $k \to \infty$. A skeletal sequence $T =$ [...]

[...] $\langle X_i \rangle_{i \in I}$ is a finite collection of histories. Then there is a skeletal sequence $T$ that spans all the $X_i$.

Proof - For each $i \in I$, let $T_i$ be a skeletal sequence that spans $X_i$, and define $T = \bigcup_{i \in I} T_i$. The finiteness of $I$ implies that $T$ has order type $\omega$, and is hence a skeletal sequence. It is obvious that $T$ spans each $X_i$.

Definition - A history skeleton over an interface $E$ and a state set $Q$ is a function $f: T \to \mathrm{Steps}(E, Q)$, where $T =$ [...] for each $k \in \mathbb{N}$, then $r_k = q_{k+1}$ for all $k \in \mathbb{N}$. The history skeleton $f$ spans a history $X$ if $T$ spans $X$ and $f$ is the restriction of $\mathrm{Step}_X$ to $T$.

Lemma 3.5 - Suppose that $f$ is a history skeleton over $E$ and $Q$. Then there is a unique history $X$ over $E$ and $Q$ such that $f$ spans $X$.

Proof - Suppose $f: T \to \mathrm{Steps}(E, Q)$, where $T = \langle t_k \rangle_{k \in \mathbb{N}}$. Suppose $f(t_k) = \langle q_k, e_k, r_k \rangle$. The requirement that $f$ spans $X$ defines $X$ uniquely:

$$\mathrm{Obs}_X(t) = \begin{cases} e_k, & \text{if } t = t_k \\ \lambda, & \text{otherwise} \end{cases} \qquad \mathrm{State}_X(t) = \begin{cases} q_0, & \text{if } t = 0 \\ q_{k+1}, & \text{if } t \in (t_k, t_{k+1}] \end{cases}$$

It is easy to see that $X$ is a history.

Lemma 3.6 - Suppose $X$ is a history over $E$ and $Q$. If $T =$ [...]. If $f$ is a history skeleton, then $f$ spans $X$ by definition. To see that $f$ is a history skeleton, we must show that $r_k = q_{k+1}$ for all $k \in \mathbb{N}$. Fix $k \in \mathbb{N}$. By definition of a state function, we can select $\varepsilon > 0$ such that $\mathrm{State}_X$ is constant on the interval $(t_k, t_k + \varepsilon]$. Then $r_k = \mathrm{State}_X(t_k + \varepsilon)$. Since $\mathrm{State}_X$ is constant on the interval $(t_k, t_{k+1}]$ by the fact that $T$ is a skeletal sequence of $X$, it follows that $r_k = q_{k+1}$.
The following consequence of Lemma 3.3 and Lemma 3.6 says that every state appearing in a computation is reachable.

Corollary 3.7 - Suppose $X$ is a computation for a machine $M$. Then $\mathrm{State}_X$ [...]

[...] $\langle M_S, V_S \rangle$, where $M_S$ is a machine of interface $E$ and $V_S$ is a set of computations of $M_S$, which we call the set of valid computations.

If $S$ is a state-transition specification of interface $E$, then the set of behaviors that satisfy $S$ is defined as follows:

$$\mathcal{B}(S) = \{B \in \mathrm{Beh}(E) : B \subseteq \mathrm{Obs}(V_S)\}$$

It is clear from this definition that state-transition specifications are subset specifications.

As a concrete example of a state-transition specification, consider the specification for the synchronizer module. The interface for the synchronizer module is defined by:

$$E_{SM} = \{\lambda\} \cup \{\mathit{try}_p, \mathit{run}_p, \mathit{rest}_p : p \in \mathit{Proc}\}.$$

The state set $Q_{SM}$ for the synchronizer module specification is defined by

$$Q_{SM} = \prod_{p \in \mathit{Proc}} \{\text{trying}, \text{running}, \text{resting}, \text{error}\}.$$

Thus each element of the state set $Q_{SM}$ is a vector that tells, for each process $p \in \mathit{Proc}$, what the synchronizer module thinks that process is currently doing. If $q \in Q_{SM}$ and $p \in \mathit{Proc}$, then let $q(p)$ denote the component of $q$ corresponding to process $p$. If $v \in \{\text{trying}, \text{running}, \text{resting}, \text{error}\}$, then let $q[v/p]$ denote the state $r \in Q_{SM}$ that is identical to $q$ except that $r(p) = v$.

Next, we define the initial state set $\mathrm{Init}_{SM}$ and state-transition relation $\mathrm{Trans}_{SM}$ for the synchronizer specification. The initial state set $\mathrm{Init}_{SM}$ consists of the single state $q$ that assigns the value "resting" to each $p \in \mathit{Proc}$. The state-transition relation $\mathrm{Trans}_{SM}$ contains a step $\langle q, e, r \rangle$ iff either $e = \lambda$ and $q = r$, or one of the conditions (try), (run), or (rest) below is satisfied for some $p \in \mathit{Proc}$:

(try) $e = \mathit{try}_p$, and either $q(p) = \text{resting}$ and $r = q[\text{trying}/p]$, or $q(p) \neq \text{resting}$ and $r = q[\text{error}/p]$.

(run) $e = \mathit{run}_p$, $q(p) = \text{trying}$, $q(p') \neq \text{running}$ for all $p' \in \mathit{Proc} - \{p\}$, and $r = q[\text{running}/p]$.

(rest) $e = \mathit{rest}_p$, and either $q(p) = \text{running}$ and $r = q[\text{resting}/p]$, or $q(p) \neq \text{running}$ and $r = q[\text{error}/p]$.
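The rules (try), (run), and (rest) can be exercised on a small instance. The following Python fragment is only an illustrative sketch, not part of the specification itself: the names `initial_state`, `substitute`, and `successors`, the encoding of events as `(kind, process)` pairs, and the use of `None` for the null event are all assumptions made for the example.

```python
# Illustrative sketch of the synchronizer machine; every name here is hypothetical.
LAMBDA = None  # stands for the null event

def initial_state(procs):
    """The single initial state: every process is resting."""
    return {p: "resting" for p in procs}

def substitute(q, v, p):
    """q[v/p]: the state identical to q except that process p has status v."""
    r = dict(q)
    r[p] = v
    return r

def successors(q, event):
    """Return the list of states r such that <q, event, r> is a step of the machine."""
    if event is LAMBDA:
        return [dict(q)]                      # null step: e = lambda and q = r
    kind, p = event
    if kind == "try":
        target = "trying" if q[p] == "resting" else "error"
        return [substitute(q, target, p)]
    if kind == "run":
        no_other_running = all(q[o] != "running" for o in q if o != p)
        if q[p] == "trying" and no_other_running:
            return [substitute(q, "running", p)]
        return []                             # run_p is not enabled in q
    if kind == "rest":
        target = "resting" if q[p] == "running" else "error"
        return [substitute(q, target, p)]
    raise ValueError(f"unknown event kind: {kind!r}")

# Process "a" tries and runs; process "b" tries but cannot run while "a" is running.
q0 = initial_state(["a", "b"])
q1 = successors(q0, ("try", "a"))[0]
q2 = successors(q1, ("run", "a"))[0]
q3 = successors(q2, ("try", "b"))[0]
blocked = successors(q3, ("run", "b"))
```

In this encoding a "run" event that is not enabled simply has no successor state, which is how the mutual exclusion property shows up: `blocked` is empty as long as process "a" remains in the running state.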
We have defined the machine $M_{SM} = \langle E_{SM}, Q_{SM}, \mathrm{Init}_{SM}, \mathrm{Trans}_{SM} \rangle$ for the synchronizer module specification. To complete the state-transition specification of the synchronizer module, we must define the set $V_{SM}$ of valid computations of $M_{SM}$. The intuitive property we wish to capture by this definition is that the synchronizer must eventually grant all requests, if possible. The qualification "if possible" is required since if one user process remains in the "running" state forever, then it will be impossible for the synchronizer module to grant any further requests without violating the mutual exclusion property. We can informally state the defining property of $V_{SM}$ as follows:

"If, for all user processes p, every instant of time at which p is running is eventually followed by an instant of time at which p is not running, then, for all p, every instant of time at which p is trying is eventually followed by an instant of time at which p is running."

The validity condition for the synchronizer module is relatively simple, but already the locutions used to precisely define this condition are somewhat awkward. To deal with more complex specifications, we require a more compact notation that can be systematically applied, as opposed to the ad hoc approach taken above. Such a notation is developed in the next chapter, where the constructs of temporal logic are used to express properties of histories.

3.5 The Correctness Theorem

In this section we consider the problem of how to prove the correctness of an implementation with respect to state-transition specifications. The fundamental result of this section is the Correctness Theorem. This theorem shows how the correctness of an implementation follows from certain properties of a composite machine, which is a kind of product of the machines for the component module specifications and the machine for the abstract module specification.
Associated with this product construction are projection maps that take each computation for the composite machine to a corresponding computation for the abstract module machine and for each component module machine.

The Correctness Theorem states that, for an implementation to be shown correct, it suffices to show that two conditions hold for the composite machine. We call these conditions the "maximality" condition and the "validity" condition. The maximality condition concerns the relationship between the state-transition relations of the component module machines and the state-transition relation of the abstract module machine. The validity condition concerns the relationship between the set of valid computations for the component modules and the set of valid computations for the abstract module.

If the inclusion of the machine from the abstract module specification as a part of the composite machine seems somewhat strange, consider the following analogy: In proofs of concurrent program correctness using Hoare-like deductive systems [Apt81, Owicki76], it is well known that it is sometimes necessary to introduce "ghost variables," which have no effect on the execution of the program, but merely serve to capture information about the state of program execution not reflected in the values of the program variables. The abstract module part of the composite machine serves the same function as ghost variables: namely, to capture information about the history of system execution possibly not reflected in the states of the component module machines.

The proof technique suggested by the Correctness Theorem seems closely related to the "data refinement proofs" of [Jones81]. Jones shows how proofs of correctness of implementations of data abstractions can be performed via "representation relations," which relate the states of abstract data objects to states of their concrete representations.
Representation relations capture the same information as the "implementation invariants" defined below, and the "possibilities mappings" of Lynch [Lynch83] and Goree [Goree81] (see Section 3.6).

We now define precisely the notion of the composite machine for an implementation. Suppose $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is an implementation, where $S_{abs} = \langle M_{abs}, V_{abs} \rangle$ and $S_i = \langle M_i, V_i \rangle$, for each $i \in I$, are state-transition specifications.

Definition - The composite machine $M$ for the implementation $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is defined as follows:

$$E_M = E_{\mathcal{J}}, \qquad Q_M = Q_{M_{abs}} \times \prod_{i \in I} Q_{M_i}.$$

Let $\pi_{abs}$ and $\pi_i$ be the canonical projection maps from the cartesian product $Q_M$ onto the factors $Q_{M_{abs}}$ and $Q_{M_i}$, for each $i \in I$.

$$\mathrm{Init}_M = \mathrm{Init}_{M_{abs}} \times \prod_{i \in I} \mathrm{Init}_{M_i}.$$

$$\mathrm{Trans}_M = \{\langle q, e, r \rangle \in \mathrm{Steps}(E_M, Q_M) : \langle \pi_{abs}(q), \alpha(e), \pi_{abs}(r) \rangle \in \mathrm{Trans}_{M_{abs}} \text{ and } \langle \pi_i(q), \delta_i(e), \pi_i(r) \rangle \in \mathrm{Trans}_{M_i} \text{ for all } i \in I\}.$$

Suppose that $X \in \mathrm{Hist}(E_M, Q_M)$. Then associated with $X$ is its canonical projection [...] to be correct. Intuitively, the maximality condition states that the abstract machine can perform any event that can be performed by the system of component module machines. The validity condition states that a computation that is valid for each of the component modules is also valid for the abstract module.

Definition - The maximality condition holds for the implementation $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ if for all states $q$ reachable for the composite machine $M$, and all $e \in E_{\mathcal{J}}$, if $\delta_i(e)$ is enabled for $M_i$ in state $\pi_i(q)$ for each $i \in I$, then $\alpha(e)$ is enabled for $M_{abs}$ in state $\pi_{abs}(q)$.

Definition - An implementation invariant for the implementation $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is a set $\mathit{Inv} \subseteq Q_M$ such that $\mathit{Inv}$ is inductive for the composite machine $M$.

Note that an implementation invariant is indeed invariant for $M$ by the Induction Principle (Lemma 3.2).
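For a machine with finitely many states, the relationship between inductive sets and reachable states can be checked directly. The Python sketch below is illustrative only: the helper names `reachable` and `is_inductive`, and the representation of a transition relation as a set of `(state, event, state)` triples, are assumptions made for the example and do not come from the thesis.

```python
def reachable(init, trans):
    """All states reachable from `init` by following steps (q, e, r) in `trans`."""
    seen = set(init)
    frontier = list(init)
    while frontier:
        q = frontier.pop()
        for (src, _event, dst) in trans:
            if src == q and dst not in seen:
                seen.add(dst)
                frontier.append(dst)
    return seen

def is_inductive(R, init, trans):
    """R contains every initial state and is closed under every step that starts in R."""
    return set(init) <= R and all(r in R for (q, _e, r) in trans if q in R)

# A toy machine: states 2 and 3 cannot be reached from the initial state 0.
init = {0}
trans = {(0, "a", 1), (1, "b", 0), (2, "c", 3)}

reach = reachable(init, trans)
small = {0, 1}       # inductive, hence contains every reachable state
large = {0, 1, 2}    # contains every reachable state, yet is not inductive
```

The set `large` shows that the converse of the Induction Principle fails: a set can contain all reachable states without being closed under the transition relation, which is why an implementation invariant is required to be inductive rather than merely invariant.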
Since an implementation invariant contains all reachable states of the composite machine, it is sufficient to use "$q \in \mathit{Inv}$, where $\mathit{Inv}$ is an implementation invariant," in place of "$q$ reachable for the composite machine," in proving that the maximality condition holds.

Definition - The validity condition holds for the implementation $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ if: Whenever $X$ is a computation for the composite machine $M$ with the property that $X^{(i)} \in V_i$ for each $i \in I$, then $X^{(abs)} \in V_{abs}$ as well.

We now come to the main technical lemma (Lemma 3.8 below) used to prove the Correctness Theorem. The intuitive content of this lemma is as follows: Suppose we are given a collection $\langle X_i \rangle_{i \in I}$ of computations for the component module machines, which are "coherent" in the sense that there is a single observation $x \in \mathrm{Obs}(E)$ such that each $\mathrm{Obs}_{X_i}$ is the image of $x$ under the mapping $\delta_i$. The vector $\langle X_i \rangle_{i \in I}$ of computations can be thought of as a computation of the system of machines, obtained by juxtaposing the machines for the component module specifications, and "interconnecting" their events as specified by the decomposition map $\Delta$. Lemma 3.8 asserts that, if the maximality condition holds, then it is possible to construct a computation $X$ for the composite machine $M$, such that $\mathrm{Obs}_X = x$, and furthermore, such that the projections $X^{(i)}$ of the computation $X$ are the given original computations $X_i$. Since [...]

[...] be an implementation, where $S_{abs} =$ [...]. Suppose that $x \in \mathrm{Obs}(E_{\mathcal{J}})$, and that $X_i$ is a computation of $M_i$ for each $i \in I$, such that the collection $\langle X_i \rangle_{i \in I}$ is $x$-coherent. Then there exists a computation $X$ of the composite machine $M$ such that $\mathrm{Obs}_X = x$ and such that [...] is a history skeleton for $M$. By Lemma 3.5 there is a unique history $X$ for $M$ such that $f$ spans $X$. It is easy to see that $X$ is a computation of $M$ with $\mathrm{Obs}_X = x$ and $X^{(i)} = X_i$ for each $i \in I$.

The $q_k$ are constructed by induction on $k$.
At the $k$th stage of the construction ($k \geq 0$), we assume that $q_k$ has been constructed so that $q_k$ is reachable and $\pi_i(q_k) = \mathrm{State}_{X_i}(t_k)$ for all $i \in I$. We construct $q_{k+1}$ so that $\pi_i(q_{k+1}) = \mathrm{State}_{X_i}(t_{k+1})$ for all $i \in I$ and so that $\langle q_k, e_k, q_{k+1} \rangle \in \mathrm{Trans}_M$. It follows by definition of reachability that $q_{k+1}$ is reachable.

Basis: Let $q_0$ be an arbitrary element of $\{q \in \mathrm{Init}_M : \pi_i(q) = \mathrm{State}_{X_i}(0) \text{ for all } i \in I\}$. Note that this set is nonempty since it is a cartesian product of nonempty sets. Clearly $q_0 \in \mathit{Inv}$ and $\pi_i(q_0) = \mathrm{State}_{X_i}(0)$ for all $i \in I$.

Induction: Suppose, for some $k \in \mathbb{N}$, that $q_k$ has been defined so that $q_k \in \mathit{Inv}$ and $\pi_i(q_k) = \mathrm{State}_{X_i}(t_k)$ for all $i \in I$. Since $X_i$ is a computation for $M_i$, for each $i \in I$, we know that $\delta_i(e_k)$ is enabled for $M_i$ in state $\pi_i(q_k)$, for each $i \in I$. Since $q_k$ is reachable for $M$, the maximality condition implies that $\alpha(e_k)$ is enabled for $M_{abs}$ in state $\pi_{abs}(q_k)$. Hence $\{q \in Q_M : \langle q_k, e_k, q \rangle \in \mathrm{Trans}_M$ and $\pi_i(q) = \mathrm{State}_{X_i}$ [...]

[...] $\langle S_i \rangle_{i \in I} \rangle$ is an implementation, where $S_{abs} = \langle M_{abs}, V_{abs} \rangle$ and $S_i = \langle M_i, V_i \rangle$ for each $i \in I$ are state-transition specifications. Suppose that the maximality and validity conditions hold. Let $M$ be the composite machine. Suppose that $x \in \mathrm{Obs}(E)$ is such that $\delta_i(x) \in O(S_i)$ for all $i \in I$. By Lemma 3.1, it suffices to show that $\alpha(x) \in O(S_{abs})$. Since $\delta_i(x) \in O(S_i)$ for each $i \in I$, we know that for each $i \in I$ there is a computation $X_i \in V_i$, such that $\mathrm{Obs}_{X_i} = \delta_i(x)$. Since the collection $\langle X_i \rangle_{i \in I}$ is $x$-coherent, by Lemma 3.8 there exists a computation $X$ for the composite machine $M$, such that $\mathrm{Obs}_X = x$ and such that [...] $\in O(S_{abs})$, as required.

3.6 Possibilities Mappings

In this section we show that the Correctness Theorem is a natural generalization of the "possibilities mapping" proof technique proposed by Lynch [Lynch83] and Goree [Goree81].
Lynch and Goree define a possibilities mapping to be a function that assigns a set of abstract module machine states to each vector of states for the component module machines, in such a way that the initial state set and state-transition relation are preserved. The fact that each vector of component module states is mapped to a set of abstract states, rather than to a single abstract state, means that possibilities mappings are a generalization of the usual notions of simulation or machine homomorphism. Intuitively, the value of the mapping on a vector of component states is the set of "possible" abstract states that correspond to the given component states; hence the name "possibilities mapping." Lynch and Goree's proof technique can be stated as follows: "If there exists a possibilities mapping for an implementation, then the implementation is correct."

Interpreted in the framework of this thesis, Lynch and Goree's technique applies only to implementations that involve state-transition specifications for which $V = \mathrm{Comp}(M)$. For such implementations, the validity condition required by the Correctness Theorem is vacuous. Theorem 3.10 below shows that the existence of a possibilities mapping is equivalent to the maximality condition required by the Correctness Theorem, and thus the Correctness Theorem includes Lynch and Goree's proof technique as a special case.

To define the notion of a possibilities mapping, suppose $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is an implementation. Suppose $S_{abs} = \langle M_{abs}, V_{abs} \rangle$ and $S_i = \langle M_i, V_i \rangle$, for each $i \in I$. Let $M$ be the composite machine.

Definition - A possibilities mapping for the implementation $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is a function $f: \prod_{i \in I} Q_{M_i} \to \mathcal{P}(Q_{M_{abs}})$, with the following properties:

(1) $\mathrm{Init}_{M_{abs}} \subseteq f(\langle q_i \rangle_{i \in I})$ whenever $q_i \in \mathrm{Init}_{M_i}$ for all $i \in I$.

(2) For all $q \in Q_M$, if $\pi_{abs}(q) \in f(\langle \pi_i(q) \rangle_{i \in I})$, then:

(a) Whenever $r \in Q_M$ and $e \in E_M$ are such that $\langle q, e, r \rangle \in \mathrm{Trans}_M$, then $\pi_{abs}(r) \in f(\langle \pi_i(r) \rangle_{i \in I})$.
(b) For all $e \in E_M$, if $\delta_i(e)$ is enabled in state $\pi_i(q)$ for each $i \in I$, then $\alpha(e)$ is enabled for $M_{abs}$ in state $\pi_{abs}(q)$.

Theorem 3.10 - Suppose that $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is an implementation, where $S_{abs}$ and $S_i$ for each $i \in I$ are state-transition specifications. Then the following are equivalent:

(1) There exists a possibilities mapping for $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$.
(2) The maximality condition holds for $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$.

Proof - Suppose that $S_{abs} = \langle M_{abs}, V_{abs} \rangle$, and that $S_i = \langle M_i, V_i \rangle$, for each $i \in I$. Let $M$ be the composite machine for the implementation $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$.

(1) $\Rightarrow$ (2): Suppose that $f$ is a possibilities mapping for $\langle \mathcal{J}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$. Define $\mathit{Inv} = \{q \in Q_M : \pi_{abs}(q) \in f(\langle \pi_i(q) \rangle_{i \in I})\}$. Condition (1) in the definition of a possibilities mapping implies that $\mathrm{Init}_M \subseteq \mathit{Inv}$. Condition (2)(a) in the definition of a possibilities mapping implies that $\mathit{Inv}$ is inductive, and hence by Lemma 3.2 contains all states reachable for $M$. The maximality condition now follows from condition (2)(b) in the definition of a possibilities mapping.

(2) $\Rightarrow$ (1): Conversely, suppose that the maximality condition holds. Define $f: \prod_{i \in I} Q_{M_i} \to \mathcal{P}(Q_{M_{abs}})$ as follows: $f(\langle q_i \rangle_{i \in I})$ is the set of all $q_{abs} \in Q_{M_{abs}}$ such that there exists a reachable state $q$ for $M$ with $\pi_{abs}(q) = q_{abs}$ and $\pi_i(q) = q_i$ for all $i \in I$. We claim that $f$ is a possibilities mapping. Condition (1) in the definition of a possibilities mapping holds, since given $\langle q_i \rangle_{i \in I} \in \prod_{i \in I} \mathrm{Init}_{M_i}$, every $q_{abs} \in \mathrm{Init}_{M_{abs}}$ yields a state $\langle q_{abs}, \langle q_i \rangle_{i \in I} \rangle$ that is in $\mathrm{Init}_M$ and hence is reachable for $M$. To show that condition (2) holds, suppose that $q$ is a state of $M$ such that $\pi_{abs}(q) \in f(\langle \pi_i(q) \rangle_{i \in I})$. Then $q$ is reachable for $M$ by definition of $f$. To see that (2)(a) holds, note that if $\langle q, e, r \rangle \in \mathrm{Trans}_M$, then $r$ is reachable by definition of reachability, and hence $\pi_{abs}(r) \in f(\langle \pi_i(r) \rangle_{i \in I})$. The maximality condition implies that condition (2)(b) holds.

3.7 Rely-/Guarantee-Conditions

In this section we will see how state-transition specifications whose sets of valid computations are defined by rely-/guarantee-conditions can be used to perform the validity part of a proof of correctness. The principle of rely-/guarantee-conditions states that the set of valid computations $V$ in a state-transition specification $S = \langle M, V \rangle$ should be defined in the form: "Rely implies Guar," where Rely expresses the properties that the module being specified relies on its environment to provide, and Guar expresses the properties that the module guarantees to provide in return.

For the synchronizer module, we wish the validity conditions to capture the idea that every user's request should eventually result in a response, if possible. The tricky part is the precise formulation of the "if possible" condition. Clearly if some user goes into the running state and remains in that state forever, then it will never be possible to allow any other user in the trying state to go to the running state without violating the mutual exclusion property. This condition can be stated in rely-/guarantee-condition form as follows: "If every user process obeys the requirement that, once in the running state, it will eventually leave the running state, then the synchronizer module guarantees that every user in the trying state will eventually leave the trying state (and hence advance to the running state)."

We have two results, Lemma 3.11 and Lemma 3.12 below, that describe techniques for using rely-/guarantee-condition specifications in proofs of correctness. In both of these techniques, we are required to prove:

(*) Each component module's rely-condition is implied by the conjunction of the abstract module's rely-condition and the guarantee-conditions for some subset of the component modules.
Although the exact form taken by condition (*) is different for the two techniques, a proof by either of the techniques is simplest when the rely- and guarantee-conditions for the component modules are chosen so that the truth of condition (*) is obvious. Thus, rely- and guarantee-conditions serve to "cut" the interdependence of modules on each other, analogously to the way in which a loop invariant cuts the dependence of one iteration of a loop on the previous iteration. This observation is strong motivation for the suggestion that module specifications ought not to be derived in isolation, but rather with a proof of correctness in mind in which those specifications are used.

A correctness proof that makes use of Lemma 3.11 or Lemma 3.12 is rather different from one in which eventuality conditions (such as termination) are verified by the well-founded set techniques of [Floyd67, Keller76] and others. Proofs by the latter techniques tend to take the form of reasoning about the structure of a computation, whereas proofs by Lemma 3.11 and Lemma 3.12 tend to be arguments based on the communication structure of the modules in the system. Experience gained from the examples presented in this thesis suggests that arguments based on communication structure are simpler and more natural.

The use of rely- and guarantee-conditions has been proposed for safety specifications in [Jones83]. Independently of this thesis, Barringer and Kuiper [Barringer83] have proposed the use of liveness specifications that are partitioned into an "environment part," which captures assumptions made about the environment, and a "component part," which captures commitments made by the module being specified. Jones, as well as Barringer and Kuiper, exploit the rely-/guarantee-condition structure of specifications by defining inference rules for process composition that seem closely related to Lemma 3.11.
Barringer and Kuiper's environment/component division seems essentially the same as the rely/guarantee division used in this thesis, except that Barringer and Kuiper apply the environment/component division to state-transition properties, as well as liveness properties.

Misra and Chandy [Misra81] have also used a kind of rely/guarantee distinction to develop proof techniques for safety properties. In that paper, a process h is specified by an assertion of the form $r \mid h \mid s$, where r and s are predicates on finite sequences (called traces) of communication events. Such an assertion is interpreted as: "The predicate s holds of the empty trace, and for all traces t that can be produced by process h, if r holds for all proper prefixes of t, then s holds for all prefixes (both proper and improper) of t." The predicates r and s can be thought of as roughly analogous to rely- and guarantee-conditions, respectively, although the former are properties of finite prefixes of traces rather than properties of infinite computations.

Misra and Chandy's proof technique is a "Theorem of Hierarchy," which gives conditions under which specifications of a collection of components can be used to infer a specification of the network formed by interconnecting the components. Their proof technique can be stated as follows: To show that the specification $R_0 \mid H \mid S_0$ for the network H is a consequence of the specifications $r_i \mid h_i \mid s_i$ ($i \in I$) for the components, it suffices to show that:

(1) S implies $S_0$,
(2) $R_0$ and S implies R,

where R and S denote the conjunctions of the $r_i$ and $s_i$, respectively. These conditions are syntactically similar to the conditions (1) and (2) of Lemma 3.11, although their meaning is quite different. The proof of Misra and Chandy's Theorem of Hierarchy is by induction on computation prefixes, whereas the proof of Lemma 3.11 is by structural induction using a well-founded dependency relation.
In [Misra82], the techniques of [Misra81] are extended to encompass a weak form of liveness specification in which an additional predicate q is used to state conditions under which a process trace is guaranteed to be extended. The Theorem of Hierarchy is augmented with additional conditions to permit its application to these more general specifications. The additional conditions do not appear to relate in a simple way to any conditions used in this thesis.

To state Lemma 3.11 and Lemma 3.12, the following notation is convenient: If R and G are subsets of a universe U, then define $R \to_U G$ (read "R implies G in U") to be the subset $(U - R) \cup G$ of U. In applications of Lemma 3.11 and Lemma 3.12, the set U will be the set Comp(M) of computations of a machine M, and R and G will be the sets of computations of M that satisfy rely-conditions and guarantee-conditions, respectively.

Lemma 3.11 below says that to prove that the validity condition holds, it suffices to prove:

(1) The abstract module's guarantee-condition is implied by the conjunction of the guarantee-conditions for the component modules.
(2) There exists a well-founded partial ordering (a "depends on" relation) of the component modules in the system, such that each component module's rely-condition is implied by the conjunction of the abstract module's rely-condition and the guarantee-conditions for the modules on which the component depends.

Lemma 3.11 (Rely/Guarantee Technique I) - Suppose U is a set and that $R_{abs}$, $G_{abs}$, and $R_i$, $G_i$ for each $i \in I$ are subsets of U. Suppose $V_{abs} = R_{abs} \to_U G_{abs}$ and $V_i = R_i \to_U G_i$ for each $i \in I$. Suppose

(1) $\bigcap_{i \in I} G_i \subseteq G_{abs}$,
(2) There exists a well-founded partial order < on I such that for all $i \in I$, $R_{abs} \cap (\bigcap_{j < i} G_j) \subseteq R_i$.

[...] $i_0 > i_1 > \cdots$, in contradiction with the well-foundedness of <. ∎

An example of the use of Lemma 3.11 can be found in the proof of correctness of the transmission module implementation in Appendix II.
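The set notation and the hypotheses of Lemma 3.11 can be checked mechanically on a small finite universe. The following Python sketch is only an illustration under stated assumptions: the universe of boolean triples, the particular rely and guarantee sets, and the helper names `where`, `meet`, and `implies_in` are invented here and do not come from the thesis. The conclusion it tests, that the intersection of the component validity sets is contained in the abstract validity set, is assumed by analogy with the statement of Lemma 3.12.

```python
from itertools import product

# Universe U: all triples (a, b, c) of booleans. The three coordinates are
# hypothetical properties chosen only for illustration.
U = frozenset(product([False, True], repeat=3))


def where(pred):
    """Subset of U on which the predicate holds."""
    return frozenset(x for x in U if pred(*x))


def implies_in(universe, rely, guar):
    """R ->_U G, defined in the text as (U - R) union G."""
    return (universe - rely) | guar


def meet(sets, universe):
    """Intersection of a family of subsets; the empty family yields U."""
    result = universe
    for s in sets:
        result &= s
    return result


# Abstract module: relies on a, guarantees b and c.
R_abs = where(lambda a, b, c: a)
G_abs = where(lambda a, b, c: b and c)

# Component 0 relies on a and guarantees b; component 1 relies on a and b
# (so it depends on component 0) and guarantees c.
R = {0: where(lambda a, b, c: a), 1: where(lambda a, b, c: a and b)}
G = {0: where(lambda a, b, c: b), 1: where(lambda a, b, c: c)}
below = {0: set(), 1: {0}}  # j in below[i] means j < i in the "depends on" order

# Hypothesis (1): the component guarantees together imply the abstract guarantee.
hyp1 = meet(G.values(), U) <= G_abs
# Hypothesis (2): each rely-condition follows from R_abs and the guarantees below it.
hyp2 = all(R_abs & meet((G[j] for j in below[i]), U) <= R[i] for i in R)

V_abs = implies_in(U, R_abs, G_abs)
V = {i: implies_in(U, R[i], G[i]) for i in R}
conclusion = meet(V.values(), U) <= V_abs

print(hyp1, hyp2, conclusion)
```

On a finite index set every strict partial order is well-founded, so the dictionary `below` is enough to play the role of the "depends on" relation in this toy setting.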
The existence of the "depends on" relation required to satisfy hypothesis (2) of Lemma 3.11 is a rather stringent condition. In some cases, for example the synchronizer implementation, all of the component modules in the system are symmetric in their relationship to each other, and it is hard to see how a suitable dependency relation might be found. Lemma 3.12 below shows that an alternative "acyclicity" condition can be used, in case the component module rely- and guarantee-conditions can be factored in a certain way. Specifically, Lemma 3.12 assumes that the rely-condition for module j can be expressed as the conjunction of what module j relies on the external environment and on each component module i to provide, and that the guarantee-condition for module i can be expressed as the conjunction of what module i guarantees to the external environment and to each component module j.

In Lemma 3.12 below, one should think of $R_{abs}$, $G_{abs}$ as the rely- and guarantee-conditions for the abstract module, and of $R_i$, $G_i$ as the rely- and guarantee-conditions for component module i. The hypotheses of Lemma 3.12 require us to find $\{RG_{ij} : i, j \in I + \{abs\}\}$. (RG stands for "rely/guarantee.") Intuitively, if $i, j \in I$, then $RG_{ij}$ expresses what module i guarantees to module j, and also what module j relies on module i to provide. $RG_{abs,j}$ expresses what the external environment of the entire system guarantees to component module j, and also what module j relies on the external environment to provide. $RG_{i,abs}$ expresses what component module i guarantees to the external environment, and also what the external environment relies on module i to provide.
Conditions (1)(a) and (1)(b) in Lemma 3.12 state, intuitively, that the abstract module's rely-condition implies what each of the component modules relies on the external environment to provide, and that the abstract module's guarantee-condition is implied by the conjunction of what each of the component modules guarantees to provide to the external environment. Condition (2)(a) states that each component module's rely-condition is implied by the conjunction of what that component relies on the external environment to provide and what that component relies on the component modules in the system to provide. Condition (2)(b) states that the guarantee-condition for component module i implies what module i guarantees to the external environment and what module i guarantees to each of the component modules in the system. Condition (3) in Lemma 3.12 is an acyclicity condition, which states that there can be no unbroken cyclic dependency between component modules.

If I is a set, then define a cycle of I to be a nonempty subset of $I \times I$ of the form $\{\langle i_0, i_1 \rangle, \langle i_1, i_2 \rangle, \ldots, \langle i_{n-1}, i_n \rangle\}$, such that $i_n = i_0$.

Lemma 3.12 (Rely/Guarantee Technique II) - Let I be a finite index set. Suppose that U is a set and that $R_{abs}$, $G_{abs}$, and $R_i$, $G_i$ for each $i \in I$ are subsets of U. Suppose $V_{abs} = R_{abs} \to_U G_{abs}$ and $V_i = R_i \to_U G_i$ for each $i \in I$. If there exists, for each $i, j \in I + \{abs\}$, a set $RG_{ij} \subseteq U$ such that (1)-(3) below hold, then $\bigcap_{i \in I} V_i \subseteq V_{abs}$.

(1)(a) $R_{abs} \subseteq \bigcap_{j \in I} RG_{abs,j}$,
(1)(b) $\bigcap_{i \in I} RG_{i,abs} \subseteq G_{abs}$,
(2)(a) $\bigcap_{i \in I + \{abs\}} RG_{ij} \subseteq R_j$, for all $j \in I$,
(2)(b) $G_i \subseteq \bigcap_{j \in I + \{abs\}} RG_{ij}$, for all $i \in I$,
(3) Whenever $\{\langle i_0, i_1 \rangle, \langle i_1, i_2 \rangle, \ldots, \langle i_{n-1}, i_n \rangle\}$ is a cycle of I, then $U = \bigcup_{k=0}^{n-1} RG_{i_k, i_{k+1}}$.

Proof - Suppose (1)-(3). Suppose further, to obtain a contradiction, that there exists $X \in U \cap R_{abs} \cap (\bigcap_{i \in I} V_i)$ such that $X \notin G_{abs}$. We perform an inductive construction to obtain a cycle $\{\langle i_{n+1}, i_n \rangle, \ldots, \langle i_{m+1}, i_m \rangle\}$ of I such that $X \notin \bigcup_{k=m}^{n} RG_{i_{k+1}, i_k}$. This contradicts hypothesis (3).
As the induction hypothesis at stage k of the construction, we assume that $i_1, i_2, \ldots, i_k$ have been constructed so that $X \notin R_{i_k}$, and that $X \notin \bigcup_{j=1}^{k-1} RG_{i_{j+1}, i_j}$.

Basis: From (1)(a) and the assumption that $X \in R_{abs}$, we know that $X \in RG_{abs,j}$ for all $j \in I$. Since $X \notin G_{abs}$, by (1)(b) we know that $X \notin RG_{i_1,abs}$ for some $i_1 \in I$. By (2)(b) we know that $X \notin G_{i_1}$, and from the assumption that $X \in V_{i_1}$, we conclude $X \notin R_{i_1}$.

Induction: Assume the induction hypothesis holds for some $k \geq 1$. By (2)(a) we know that $X \notin RG_{i_{k+1}, i_k}$ for some $i_{k+1} \in I$ (the index cannot be abs, since $X \in RG_{abs,j}$ for all $j \in I$). If $i_{k+1} = i_m$ for some m with $1 \leq m \leq k$, then we have obtained the desired cycle and the construction terminates. Otherwise, by (2)(b) we know that $X \notin G_{i_{k+1}}$, and from the assumption that $X \in V_{i_{k+1}}$, we conclude that $X \notin R_{i_{k+1}}$. This establishes the induction hypothesis for k + 1.

Since the set I is finite by hypothesis, we cannot extend the sequence $i_1, i_2, \ldots, i_k$ indefinitely without creating a cycle. ∎

Examples of the use of Lemma 3.12 can be found in the proof of correctness of the synchronizer implementation in Chapter 4, and in the proof of correctness of the resource manager implementation in Appendix II.

4. The Synchronizer Implementation

In this chapter, the theory developed in Chapter 3 is applied to obtain complete specifications and a proof of correctness for the synchronizer example. Section 4.1 introduces the notation used in the chapter. In Section 4.2 we review the synchronizer module specification which has already been developed, and show how the set of valid computations for this specification can be given a concise definition using the language of temporal logic. In Section 4.3, the synchronizer component module specification is presented, and in Section 4.3.1 the definition of the synchronizer module implementation is reviewed. In Section 4.4, the Correctness Theorem is used to prove the correctness of the synchronizer implementation.
4.1 Notation

This section introduces the notation we will use to express state-transition specifications, and in particular, the temporal logic notation we use to define the sets of valid computations. We use this notation in this chapter in a highly informal fashion, and do not concern ourselves with precise syntax and semantics. The reader who is interested in a careful treatment of the notation we use is referred to Appendix I.

To define a state-transition specification S, we first define the interface $E_S$ and state set $Q_S$ of the machine $M_S$. As discussed in detail in Appendix I, we regard these two sets as two distinguished sorts Events and States in a many-sorted algebra $A_S$. We associate a first-order language L(S) with the algebra $A_S$ in the usual way. The language L(S) is used to define the initial state set $Init_S$ and the state-transition relation $Trans_S$ of the machine $M_S$. In this chapter, we often use constructions that are not part of a first-order language. Appendix I shows how the use of these constructions can be justified.

From the first-order language L(S), we obtain a temporal language T(S) by augmenting L(S) with the temporal operators □ (read "henceforth") and ◊ (read "eventually"), which are applied to formulas to obtain new formulas. In addition, three new atomic terms are added to the language: Now and After, which behave syntactically like constant symbols of sort States, and Occurs, which behaves like a constant symbol of sort Events. The meanings of the symbols Now, Occurs, and After depend upon the particular instant of time under consideration, and thus are altered by the action of the temporal operators □ and ◊ in a way that is detailed below. Intuitively, if the particular instant of time under consideration is t, then Occurs denotes the event that occurs at time t, Now denotes the state at time t, and After denotes the state "just after" time t.
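The intuitive reading of □ and ◊ can be tried out on a simplified model. The Python sketch below is not the semantics used in the thesis, which is defined over histories indexed by the real interval [0, ∞); it assumes discrete time and an ultimately periodic ("lasso") sequence of states, so that "all future instants" is a finite set of positions. The class `Lasso` and the combinators `atom`, `implies`, `henceforth`, and `eventually` are hypothetical names introduced for this illustration.

```python
class Lasso:
    """An infinite state sequence: `prefix` followed by `loop` repeated forever."""

    def __init__(self, prefix, loop):
        if not loop:
            raise ValueError("loop must be nonempty")
        self.items = list(prefix) + list(loop)
        self.loop_start = len(prefix)

    def future(self, i):
        """Positions occurring at or after position i in the unrolled sequence."""
        # From inside the loop, every loop position recurs; from the prefix,
        # everything from i onward (including the whole loop) lies ahead.
        return range(min(i, self.loop_start), len(self.items))


# A formula is a function (sequence, position) -> bool.
def atom(pred):
    return lambda w, i: pred(w.items[i])


def implies(f, g):
    return lambda w, i: (not f(w, i)) or g(w, i)


def henceforth(f):      # the box operator
    return lambda w, i: all(f(w, j) for j in w.future(i))


def eventually(f):      # the diamond operator
    return lambda w, i: any(f(w, j) for j in w.future(i))


running = atom(lambda s: s == "running")
not_running = atom(lambda s: s != "running")

# "Whenever the user is running, it eventually stops running."
leaves_running = henceforth(implies(running, eventually(not_running)))

polite = Lasso(["resting", "trying"], ["running", "resting"])
greedy = Lasso(["resting", "trying"], ["running"])   # runs forever

print(leaves_running(polite, 0))
print(leaves_running(greedy, 0))
```

The formula `leaves_running` has the same shape as the rely-condition used for the synchronizer module later in this chapter; the first sequence satisfies it and the second, in which the user stays in the running state forever, does not.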
The semantics of the temporal language associated with a specification S are captured by the binary relation ⊨ (read "satisfies"), which tells when a formula of the temporal language is satisfied by a particular history over $E_M$ and $Q_M$. To assert that the history X satisfies a particular temporal formula φ, we write $X \models \varphi$. Satisfaction is defined informally as follows: If φ is a formula that contains no occurrences of temporal operators, then $X \models \varphi$ iff φ holds in the usual sense of first-order logic, with the symbols Now, Occurs, and After interpreted as $State_X(0)$, $Obs_X(0)$, and $State_X(0^{+})$, respectively. If φ is a formula of the form □ψ, then $X \models \varphi$ iff $suffix_t(X) \models \psi$ for all $t \in [0, \infty)$. If φ is of the form ◊ψ, then $X \models \varphi$ iff $suffix_t(X) \models \psi$ for some $t \in [0, \infty)$. Note that the semantics we use are essentially the "linear time" semantics of [Lamport80], and hence the ◊ operator is equivalent to the compound operator ¬□¬.

We say that a formula φ is a consequence of a set of formulas Ψ, written $\Psi \models \varphi$, if $X \models \varphi$ whenever $X \models \psi$ for all $\psi \in \Psi$. A formula φ is valid, written $\models \varphi$, if it is a consequence of the null set of formulas.

The temporal language T(S) of a specification S contains an important sentence to which we shall refer extensively. This is the sentence

$Comp_S = Init_S(Now) \wedge \Box\, Trans_S(Now, Occurs, After)$.

Intuitively, $X \models Comp_S$ iff X is a computation for the machine $M_S$.

4.2 Specification of the Synchronizer Module

In this section, we review the state-transition specification $S^{SM}$ of the synchronizer module, which has already been developed in Chapter 3. Let Proc be a finite set of user processes.

Interface:

$E_{SM} = \{\lambda\} \cup \{try_p, run_p, rest_p : p \in Proc\}$.

In anticipation of Chapter 5, we classify each event in the synchronizer module interface as either an input event, an output event, or both (the null event λ is the only event that is both an input and an output event).
$In_{SM} = \{\lambda\} \cup \{try_p, rest_p : p \in Proc\}$
$Out_{SM} = \{\lambda\} \cup \{run_p : p \in Proc\}$.

Although our theoretical framework so far draws no formal distinction between input and output, in Chapter 5 such a distinction is introduced to obtain a useful test for consistency of liveness specifications. Input events should be thought of intuitively as stimuli that are applied to a module by its environment, and output events as responses applied by a module to its environment. A module does not have the capability of regulating the application of input stimuli to it.

Machine: The state set for the synchronizer module machine is defined by

$Q_{SM} = \prod_{p \in Proc} \{trying, running, resting, error\}$.

To ease later discussion, let us say that process p is resting (resp. trying, running, in error) in state q if q(p) = resting (resp. trying, running, error). The set of initial states for the synchronizer module machine is defined by

$Init_{SM} = \{q \in Q_{SM} : q(p) = resting \text{ for all } p \in Proc\}$.

A step $\langle q, e, r \rangle$ is in the state-transition relation $Trans_{SM}$ for the synchronizer module machine iff either $e = \lambda$ and $q = r$, or one of the conditions (try), (run), or (rest) below is satisfied for some $p \in Proc$.

A try event for process p can occur at any time. If process p was previously resting then it advances to the trying state, otherwise to the error state. The states of all other processes are unaffected.

(try) $e = try_p$, and either $q(p) = resting$ and $r = q[trying/p]$, or $q(p) \neq resting$ and $r = q[error/p]$.

A run event for process p can occur only if process p is trying, and no other processes are currently running. Process p advances to the running state, and the states of all other processes are unaffected.

(run) $e = run_p$, $q(p) = trying$, $q(p') \neq running$ for all $p' \in Proc - \{p\}$, and $r = q[running/p]$.

A rest event for process p can occur at any time. If process p was previously running, then it advances to the resting state, otherwise to the error state. The states of all other processes are unaffected.
(rest) $e = rest_p$, and either $q(p) = running$ and $r = q[resting/p]$, or $q(p) \neq running$ and $r = q[error/p]$.

Validity Conditions: We wish the validity condition for the synchronizer module to capture the idea that every user's request should eventually result in a response, if possible. This condition can be stated in rely-/guarantee-condition form as follows: "If every user process obeys the requirement that, once in the running state, it will eventually leave the running state, then the synchronizer module guarantees that every user in the trying state will eventually leave the trying state (and hence advance to the running state)." We can express this condition concisely as a temporal sentence.

$Valid_{SM} = Rely_{SM} \to Guar_{SM}$, where

$Rely_{SM} = \Box(\forall p \in Proc)(Now(p) = running \to \Diamond(Now(p) \neq running))$
$Guar_{SM} = \Box(\forall p \in Proc)(Now(p) = trying \to \Diamond(Now(p) \neq trying))$.

4.3 Specification of the Synchronizer Component Module

A synchronizer component module communicates with an associated user process via the try, run, and rest events, with its neighboring synchronizer component module in the clockwise direction via token_out and request_in events, and with its neighboring synchronizer component module in the counterclockwise direction via token_in and request_out events. The conceptual state of the module contains a count of the number of tokens the module possesses, plus information concerning the state of the associated user process. The synchronizer component module can allow the user process to enter the running state only if it possesses a token, and must retain a token throughout the entire period during which the user is in the running state. We would like the synchronizer component module to be "fair" in the sense that it eventually grants each user request, if possible, and eventually responds to each request for the token by its clockwise neighbor in the ring, if possible.
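Before the component module is specified, the synchronizer module machine of Section 4.2 can be sketched in executable form for comparison. The Python below is a minimal illustration, not part of the thesis: it fixes Proc to three processes, represents a state as a tuple of user states, encodes the null event as `None` and every other event as a hypothetical pair `(kind, p)`, and returns `None` when an event is not enabled. The function names are assumptions made for this sketch.

```python
PROCS = range(3)  # Proc, fixed to three user processes for illustration


def initial_state():
    """The single initial state: every process is resting."""
    return tuple("resting" for _ in PROCS)


def substitute(q, p, value):
    """q[value/p]: the state q with the entry for process p replaced."""
    return q[:p] + (value,) + q[p + 1:]


def step(q, event):
    """Return r such that <q, event, r> is a step, or None if none exists."""
    if event is None:                      # null event: state unchanged
        return q
    kind, p = event
    if kind == "try":                      # always enabled
        return substitute(q, p, "trying" if q[p] == "resting" else "error")
    if kind == "rest":                     # always enabled
        return substitute(q, p, "resting" if q[p] == "running" else "error")
    if kind == "run":                      # needs p trying and nobody else running
        nobody_else_running = all(q[o] != "running" for o in PROCS if o != p)
        if q[p] == "trying" and nobody_else_running:
            return substitute(q, p, "running")
        return None
    raise ValueError(f"unknown event kind: {kind!r}")


q = initial_state()
q = step(q, ("try", 0))
q = step(q, ("try", 1))
q = step(q, ("run", 0))
print(q)                      # process 0 running, process 1 still trying
print(step(q, ("run", 1)))    # not enabled while process 0 is running
```

Only the run event can be disabled, which matches the remark that a module cannot regulate the input events (try and rest) applied to it; an input arriving at the wrong moment moves the process to the error state instead of being refused.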
The specification of the synchronizer component module is parameterized by the number of tokens it possesses in the initial state. Thus the specification presented below actually is a specification schema that represents a family $\{SC^k : k \in \mathbb{N}\}$ of related specifications, where $SC^k$ is the specification for the synchronizer component module with k tokens in the initial state. The only place the initial number of tokens appears in the specification is in the definition of the initial state set.

Interface: The first task in the construction of the synchronizer component module specification is the description of its interface.

$E_{SC} = \{\lambda, try, run, rest, token\_in, token\_out, request\_in, request\_out\}$.

The sets of input and output events are defined by:

$In_{SC} = \{\lambda, try, rest, token\_in, request\_in\}$
$Out_{SC} = \{\lambda, run, token\_out, request\_out\}$.

Machine: A state for the synchronizer component module contains a "token" component, whose value represents the number of tokens the module possesses, and a "ustate" component, which tells what state the synchronizer component module thinks the user process is in.

$Q_{SC} = token: \mathbb{N} \times ustate: \{trying, running, resting, error\}$.

The "tags" token and ustate are used as selectors; if $q \in Q_{SC}$, then q(token) denotes the token component of q and q(ustate) denotes the ustate component.

In an initial state the synchronizer component module $SC^k$ has k tokens and the user process is resting.

$Init_{SC^k} = \{q \in Q_{SC} : q(token) = k \wedge q(ustate) = resting\}$.

A step $\langle q, e, r \rangle$ is in the state-transition relation $Trans_{SC}$ for the synchronizer component module machine iff either $e = \lambda$ and $q = r$, or one of the conditions (try), (run), (rest), (token_in), (token_out), (request_in), or (request_out) below is satisfied:

A try event can occur at any time. If the user process was previously resting, then it advances to the trying state, otherwise to the error state.
(try) $e = try$ and either $q(ustate) = resting$ and $r = q[trying/ustate]$, or $q(ustate) \neq resting$ and $r = q[error/ustate]$.

A run event can occur only if the user process is trying and the synchronizer component module currently possesses a token. The user process advances to the running state.

(run) $e = run$, $q(ustate) = trying$, $q(token) \neq 0$, and $r = q[running/ustate]$.

A rest event can occur at any time. If the user process was previously running, then it advances to the resting state, otherwise to the error state.

(rest) $e = rest$ and either $q(ustate) = running$ and $r = q[resting/ustate]$, or $q(ustate) \neq running$ and $r = q[error/ustate]$.

A token_in event can occur at any time, and causes the number of tokens possessed by the synchronizer component module to be increased by one.

(token_in) $e = token\_in$ and $r = q[q(token) + 1/token]$.

A token_out event can occur only if the user process is currently not running, and the synchronizer component module possesses at least one token. The number of tokens possessed is decremented.

(token_out) $e = token\_out$, $q(ustate) \neq running$, $q(token) \neq 0$, and $r = q[q(token) - 1/token]$.

A request_in event can occur at any time, and has no direct effect on the state. The way in which a request_in event induces the synchronizer component module to eventually respond with a token_out event is captured by the validity conditions.

(request_in) $e = request\_in$ and $r = q$.

A request_out event can occur only if the synchronizer component module currently does not possess a token. Occurrence of such an event has no effect on the state.

(request_out) $e = request\_out$, $q(token) = 0$, and $r = q$.

Validity Conditions: We would like the synchronizer component module validity conditions to capture the following two ideas:

(1) A synchronizer component module always eventually satisfies a user's request, if possible.
(2) A synchronizer component module always responds to requests for the token issued by its clockwise neighbor, if possible.
We can state this in rely-/guarantee-condition form as follows: If all requests issued by the synchronizer component module to its counterclockwise neighbor are eventually granted, and the user process never remains forever in the running state, then all user requests and all requests for the token from the clockwise neighbor will eventually be granted. Formally,

$Valid_{SC} = Rely_{SC} \to Guar_{SC}$, where

$Rely_{SC} = \Box(Now(ustate) = running \to \Diamond(Now(ustate) \neq running)) \wedge \Box(Occurs = request\_out \to \Diamond(Now(token) \neq 0))$
$Guar_{SC} = \Box(Now(ustate) = trying \to \Diamond(Now(ustate) \neq trying)) \wedge \Box(Occurs = request\_in \to \Diamond(Occurs = token\_out))$

4.3.1 The Synchronizer Implementation

To be able to describe and reason about the synchronizer implementation we must formalize the idea that the set Proc is a "ring-structured set of processes with a distinguished process." We assume that the set Proc is the set of integers modulo N for some N, and that zero is a distinguished process, which will be the process that initially possesses the token.

We first define the synchronizer interconnection, which comprises the composite interface $E_{SMI}$, the abstraction map $\alpha^{SMI}$, and the decomposition maps $\{\delta^{SMI}_p\}_{p \in Proc}$. The abstract interface is the synchronizer module interface $E_{SM}$, and the pth component interface is the synchronizer component module interface $E_{SC}$. The composite interface for the synchronizer module implementation is defined by:

$E_{SMI} = \{\lambda\} + \{try_p, run_p, rest_p, token_p, request_p : p \in Proc\}$.
$In_{SMI} = \{\lambda\} + \{try_p, rest_p : p \in Proc\}$
$Out_{SMI} = \{\lambda\} + \{run_p, token_p, request_p : p \in Proc\}$.

The $try_p$, $run_p$, and $rest_p$ events in the composite interface correspond under the decomposition map to try, run, and rest events for synchronizer component module p, and under the abstraction map to $try_p$, $run_p$, and $rest_p$ events for the synchronizer module. A $token_p$ event represents the transmission of a token from synchronizer component module p to synchronizer component module p + 1 (i.e.
in the clockwise direction around the ring), and a $request_p$ event represents the transmission of a request from synchronizer component module p to synchronizer component module p - 1 (i.e. in the counterclockwise direction). We capture this information formally by defining the abstraction map $\alpha^{SMI}$ and decomposition map $\delta^{SMI}$.

$$\alpha^{SMI}(e) = \begin{cases} e, & \text{if } e \in \{try_p, run_p, rest_p : p \in Proc\}, \\ \lambda, & \text{if } e \in \{token_p, request_p : p \in Proc\} \cup \{\lambda\}. \end{cases}$$

$$\delta^{SMI}_p(e) = \begin{cases} try, & \text{if } e = try_p \\ run, & \text{if } e = run_p \\ rest, & \text{if } e = rest_p \\ token\_in, & \text{if } e = token_{p-1} \\ token\_out, & \text{if } e = token_p \\ request\_in, & \text{if } e = request_{p+1} \\ request\_out, & \text{if } e = request_p \\ \lambda, & \text{otherwise.} \end{cases}$$

To complete the description of the synchronizer implementation, we must define the specifications $S^{SMI}_{abs}$ and $S^{SMI}_p$ for each $p \in Proc$. The specification $S^{SMI}_{abs}$ is the synchronizer module specification $S^{SM}$. The specification $S^{SMI}_{zero}$ is the specification $SC^1$ of the synchronizer component module with one initial token, and for all $p \in Proc - \{zero\}$, $S^{SMI}_p$ is the specification $SC^0$ of the synchronizer component module with no initial tokens.

4.4 Correctness of the Synchronizer Implementation

In this section, we use the techniques of Chapter 3 to show the correctness of the synchronizer implementation. Most of the proof consists of straightforward case analyses. The interesting content of the proof is contained in the use of Lemma 3.12 to prove that the validity condition holds.

4.4.1 Implementation Invariant

To prove the correctness of the synchronizer module implementation, we first need to find an implementation invariant that provides enough information about the reachable states of the composite machine so that we can prove the maximality condition. The implementation invariant will also be useful in the proof that the validity condition holds, and so in this section we define an implementation invariant that is strong enough for both the maximality and validity proofs.
For a set Inv to be an implementation invariant for an implementation means that it is inductive for the composite machine for the implementation. Formally, if M is the composite machine and E the composite interface, we must show:

(Basis) $(\forall q \in Q_M)(q \in Init_M \to q \in Inv)$
(Induction) $(\forall q, r \in Q_M, e \in E)(\langle q, e, r \rangle \in Trans_M \to (q \in Inv \to r \in Inv))$.

It is generally convenient to define an implementation invariant Inv by a predicate $Inv(q) = Rep(q) \wedge Abs(q)$, where Rep is called the representation invariant and Abs is called the abstraction relation. A representation invariant describes a relationship that must hold at all times between the states of component modules in an implementation. Representation invariants serve roughly the same purpose here as what is called the "data type invariant" in the literature on abstract data types [e.g. Jones81, Jones83]. An abstraction relation describes the correspondence between the states of the component modules and the state of the abstract module. The abstraction relation plays the same role here as the "retrieve functions" of [Jones81], and the "representation functions" of [Hoare72].

The implementation invariant $Inv_{SMI}$ for the synchronizer implementation is defined as follows:

$Inv_{SMI}(q) = Rep_{SMI}(q) \wedge Abs_{SMI}(q)$.

The abstraction relation $Abs_{SMI}$ holds of state q iff in state q, the abstract synchronizer module's view of the state of the pth user process is identical to the pth synchronizer component module's view, for each p in Proc. Stated another way, the abstract synchronizer module state corresponding to a given collection of synchronizer component module states is obtained by throwing away all information, except for the ustate component, in the states of the component modules. Formally,

$Abs_{SMI}(q) = \bigwedge_{p \in Proc}$ [...] for all $p \in Proc - \{zero\}$.

It is easily checked that these three conditions imply that $Abs_{SMI}(q)$, Mutex(q), and Token(q) all hold. We conclude that $Inv_{SMI}(q)$ holds for all $q \in Init_{SMI}$, as required.
Induction: We must show that for all $\langle q, e, r \rangle \in Trans_{SMI}$, if $Inv_{SMI}(q)$ holds then $Inv_{SMI}(r)$ does, too. Suppose that $\langle q, e, r \rangle \in Trans_{SMI}$ and $Inv_{SMI}(q)$ holds. First of all, note that if $e = \lambda$, then $q = r$ and hence $Inv_{SMI}(r)$ follows trivially from $Inv_{SMI}(q)$. We therefore assume in what follows that $e \neq \lambda$. We consider separately the proofs of $Abs_{SMI}(r)$, Mutex(r), and Token(r).

To prove that $Abs_{SMI}(r)$ holds, there are two cases: (1) $e \in \{token_p, request_p : p \in Proc\}$; and (2) $e \in \{try_p, run_p, rest_p : p \in Proc\}$. Case (1) is disposed of quickly by noting that if $e = token_p$ or $e = request_p$ for some $p \in Proc$, then $r_{abs}(p') = q_{abs}(p')$ and $r_{p'}(ustate) = q_{p'}(ustate)$ for all $p' \in Proc$. Thus in this case $Abs_{SMI}(r)$ follows directly from $Abs_{SMI}(q)$. Case (2) is handled by a straightforward enumeration of the cases $e = try_p$, $e = run_p$, $e = rest_p$, and verifying that in each case, the occurrence of e results in identical values for $r_{abs}(p')$ and $r_{p'}(ustate)$, for each $p' \in Proc$.

We now consider the proof that Mutex(r) holds. Suppose not; then it must be the case that $r_p(ustate) = running$ and $r_p(token) = 0$ for some $p \in Proc$. By a case analysis on e it is straightforward to check that the only way this can happen is if either $q_p(ustate) = running$ and $e = token_p$, or $q_p(token) = 0$ and $e = run_p$. Examination of the specification of the synchronizer component module shows that it is impossible for a $token_p$ event to occur if $q_p(ustate) = running$, and also for a $run_p$ event to occur if $q_p(token) = 0$.

Finally, we wish to show that Token(r) holds. A case analysis on e shows that the only events that affect the number of tokens in the system are those of the form $token_p$ for some $p \in Proc$. Examination of the specifications shows that, when such an event occurs, $r_p(token) = q_p(token) - 1$, $r_{p+1}(token) = q_{p+1}(token) + 1$, and $r_{p'}(token) = q_{p'}(token)$ for all $p' \in Proc - \{p, p + 1\}$. Thus $\sum_{p' \in Proc} r_{p'}(token) = \sum_{p' \in Proc} q_{p'}(token)$, and hence Token(r) holds.
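The inductive argument above can be complemented by an exhaustive check on a small ring. The Python sketch below is a hedged illustration rather than the thesis's construction: it models only the component modules (the abstract module's state is left out), fixes N = 3, and encodes composite events as hypothetical pairs `(kind, p)`. It reads Mutex as "no component has its user running while holding zero tokens" and Token as "the tokens in the ring sum to one", which is how the two properties are used in the proofs of this section; the names `component_step`, `decompose`, `composite_step`, and `reachable` are assumptions made for the example.

```python
from collections import deque

N = 3  # ring size, an arbitrary small choice


def component_step(state, event):
    """Successor of a component state (token, ustate), or None if not enabled."""
    token, ustate = state
    if event is None:                                   # null event
        return state
    if event == "try":
        return (token, "trying" if ustate == "resting" else "error")
    if event == "rest":
        return (token, "resting" if ustate == "running" else "error")
    if event == "run":
        return (token, "running") if ustate == "trying" and token != 0 else None
    if event == "token_in":
        return (token + 1, ustate)
    if event == "token_out":
        return (token - 1, ustate) if ustate != "running" and token != 0 else None
    if event == "request_in":
        return state
    if event == "request_out":
        return state if token == 0 else None
    raise ValueError(f"unknown component event: {event!r}")


def decompose(p, event):
    """Decomposition map: how component p sees the composite event (kind, i)."""
    kind, i = event
    if kind in ("try", "run", "rest"):
        return kind if i == p else None
    if kind == "token":        # token_i travels from module i to module i + 1
        return "token_out" if i == p else "token_in" if i == (p - 1) % N else None
    if kind == "request":      # request_i travels from module i to module i - 1
        return "request_out" if i == p else "request_in" if i == (p + 1) % N else None
    raise ValueError(f"unknown composite event: {event!r}")


def composite_step(q, event):
    """All components move together; the step exists only if each one is enabled."""
    successors = [component_step(q[p], decompose(p, event)) for p in range(N)]
    return None if None in successors else tuple(successors)


EVENTS = [(kind, p) for kind in ("try", "run", "rest", "token", "request")
          for p in range(N)]
INITIAL = tuple((1 if p == 0 else 0, "resting") for p in range(N))


def reachable():
    """Breadth-first exploration of the states reachable from INITIAL."""
    seen, frontier = {INITIAL}, deque([INITIAL])
    while frontier:
        q = frontier.popleft()
        for event in EVENTS:
            r = composite_step(q, event)
            if r is not None and r not in seen:
                seen.add(r)
                frontier.append(r)
    return seen


def mutex(q):
    return all(not (ustate == "running" and token == 0) for token, ustate in q)


def token_count(q):
    return sum(token for token, _ in q) == 1


states = reachable()
print(all(mutex(q) and token_count(q) for q in states))
```

The search terminates because token events conserve the total number of tokens, so only finitely many states are reachable. Together the two properties imply that at most one user is running in any reachable state, which is the mutual exclusion requirement of the abstract module.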
4.4.2 Proof of Maximality

We must show that for all $q \in Q_{SMI}$ and $e \in E_{SMI}$, if $Inv_{SMI}(q)$ holds and $\delta^{SMI}_p(e)$ is enabled in state $q_p$ for all $p \in Proc$, then $\alpha^{SMI}(e)$ is enabled in state $q_{abs}$.

Suppose $Inv_{SMI}(q)$ holds and that $\delta^{SMI}_p(e)$ is enabled in state $q_p$ for all $p \in Proc$. There are two cases: (1) $e = run_p$ for some $p \in Proc$; and (2) e is not of this form. Examination of the synchronizer module specifications shows that case (2) is trivial, since $\alpha^{SMI}(e)$ is enabled in any state unless $e = run_p$ for some $p \in Proc$.

Now consider case (1). Since $\delta^{SMI}_p(e)$ is enabled in state $q_p$, from the synchronizer component module specification we know that

(A) $q_p(ustate) = trying$ and $q_p(token) \neq 0$.

The assumption that $Inv_{SMI}(q)$ holds implies that Token(q), Mutex(q), and $Abs_{SMI}(q)$ all hold. From (A) and $Abs_{SMI}(q)$ we infer that $q_{abs}(p) = trying$. From (A) and Token(q) we know that $q_{p'}(token) = 0$ for all $p' \in Proc - \{p\}$. From this and Mutex(q) we infer that $q_{p'}(ustate) \neq running$ for all $p' \in Proc - \{p\}$. From this and $Abs_{SMI}(q)$, we conclude that $q_{abs}(p') \neq running$ for all $p' \in Proc - \{p\}$. We have shown that

(B) $q_{abs}(p) = trying \wedge (\bigwedge_{p' \in Proc - \{p\}} q_{abs}(p') \neq running)$

holds. Examination of the synchronizer module specifications shows that (B) implies that $\alpha^{SMI}(e)$ is enabled in state $q_{abs}$, as desired.

4.4.3 Proof of Validity

To express the proof that the validity condition holds for the synchronizer implementation, we associate a temporal language $T(S^{SMI})$ with the composite specification $S^{SMI}$ in the same way as temporal languages were associated with the synchronizer module and synchronizer component module specifications. In addition, we must have some way of taking the temporal sentences, each expressed in its own temporal language, that define the sets of valid computations for the synchronizer module and synchronizer component module specifications, and "lifting" them to the common language $T(S^{SMI})$. This can be accomplished by a simple syntactic translation, which we now define.
To each formula φ of $T(S^{SM})$ we associate a corresponding "lifted" version $[[\varphi]]_{abs}$ of $T(S^{SMI})$, by replacing each occurrence of the symbol Now by the term $Now_{abs}$, each occurrence of After by the term $After_{abs}$, and each occurrence of Occurs by the term $\alpha^{SMI}(Occurs)$. Similarly, to each formula φ of $T(S^{SC})$ and each $p \in Proc$, we associate a corresponding formula $[[\varphi]]_p \in T(S^{SMI})$, by replacing each occurrence of Now by $Now_p$, each occurrence of After by $After_p$, and each occurrence of Occurs by $\delta^{SMI}_p(Occurs)$.

The precise relationship between a formula and its lifted version is captured by Lemma 1.2 in Appendix I. Informally, if $\varphi \in T(S^{SM})$, then a history X for the composite machine $M_{SMI}$ satisfies the formula $[[\varphi]]_{abs} \in T(S^{SMI})$ iff the canonical projection [...] $\{\langle i_0, i_1 \rangle, \ldots, \langle i_{n-1}, i_n \rangle\}$ is a cycle of Proc, then

$Comp_{SMI} \models \bigvee_{k=0}^{n-1} RG_{i_k, i_{k+1}}$.

The sentences $RG_{ij}$ express what is relied/guaranteed between each pair of synchronizer component modules or between a synchronizer component module and the external environment of the entire system. The synchronizer component module specifications have been chosen in such a way that the sentences $RG_{ij}$ can be obtained simply by "lifting" the synchronizer component module rely-/guarantee-conditions to the temporal language of the composite machine. The formal definitions are as follows:

For all $i, j \in Proc$,
$RG_{i,abs} = \Box(Now_i(ustate) = trying \to \Diamond(Now_i(ustate) \neq trying))$
$RG_{abs,j} = \Box(Now_j(ustate) = running \to \Diamond(Now_j(ustate) \neq running))$

For all $i \in Proc$,
$RG_{i-1,i} = \Box(\delta^{SMI}_i(Occurs) = request\_out \to \Diamond(Now_i(token) \neq 0))$

For all $i, j \in Proc$ such that $i + 1 \neq j$,
$RG_{ij} = true$

Next, we verify (SMI1)-(SMI3). Assume $Comp_{SMI}$ throughout the remainder of the proof. The interesting intuitive content of the validity proof is contained in the proof that (SMI3) holds. The remaining cases are practically automatic.
Intuitively, hypothesis (SMI1)(a) says that the abstract module rely-condition implies what each component module relies on the external environment to provide. Hypothesis (SMI1)(b) says that the conjunction of what each component module guarantees to the external environment implies the abstract module guarantee-condition. Formally, we must show:

(SMI1)(a) $[\![Rely_{SM}]\!]_{abs} \rightarrow \bigwedge_{p \in Proc} RG_{abs,p}$

(SMI1)(b) $\bigwedge_{p \in Proc} RG_{p,abs} \rightarrow [\![Guar_{SM}]\!]_{abs}$

We show (SMI1)(a); condition (SMI1)(b) is equally straightforward. From the synchronizer module specifications, we know that

$[\![Rely_{SM}]\!]_{abs} = \Box \bigwedge_{p \in Proc} (Now_{abs}(p) = running \rightarrow \Diamond(Now_{abs}(p) \neq running))$

Suppose that $[\![Rely_{SM}]\!]_{abs}$ holds. By the invariance of the abstraction relation $Abs_{SMI}$, we infer

$\Box \bigwedge_{p \in Proc} (Now_p(ustate) = running \rightarrow \Diamond(Now_p(ustate) \neq running))$.

Interchanging the $\Box$ and the conjunction yields $\bigwedge_{p \in Proc} RG_{abs,p}$, as desired.

Intuitively, hypothesis (SMI2)(a) says that each component module's rely-condition is implied by the conjunction of what is guaranteed to it by the external environment and by each other component module. Hypothesis (SMI2)(b) says that each component module's guarantee-condition implies what the external environment and each other component module rely on it to provide. Formally, we must show:

(SMI2)(a) $\bigwedge_{i \in Proc + \{abs\}} RG_{i,p} \rightarrow [\![Rely_{SC}]\!]_p$, for all $p \in Proc$

(SMI2)(b) $[\![Guar_{SC}]\!]_p \rightarrow \bigwedge_{j \in Proc + \{abs\}} RG_{p,j}$, for all $p \in Proc$.

Showing condition (SMI2)(a) is completely straightforward. Let $p \in Proc$ be fixed. It suffices to show that $RG_{abs,p} \wedge RG_{p-1,p} \rightarrow [\![Rely_{SC}]\!]_p$. By definition

$RG_{abs,p} = \Box(Now_p(ustate) = running \rightarrow \Diamond(Now_p(ustate) \neq running))$

$RG_{p-1,p} = \Box(\delta^{SMI}_p(Occurs) = request\_out \rightarrow \Diamond(Now_p(token) \neq 0))$.

The conjunction of these two sentences is easily seen to be equivalent to $[\![Rely_{SC}]\!]_p$ by inspection of the synchronizer component module specifications.
Showing (SMI2)(b) is not completely trivial, because what component module $p$ guarantees to module $p + 1$ is not exactly what module $p + 1$ relies on module $p$ to provide. Specifically, module $p$ guarantees always to eventually send the token in response to a request from module $p + 1$. However, module $p + 1$ relies not on the eventual occurrence of a token_in event, but rather on the eventual setting of the token component of its state to a nonzero value. The nontrivial portion of the proof is to use the state-transition relation for module $p + 1$ to show that occurrence of a token_in event for that module implies the eventual setting of the token component of its state to a nonzero value.

Formally, to prove (SMI2)(b) it suffices to show that $[\![Guar_{SC}]\!]_p \rightarrow RG_{p,abs} \wedge RG_{p,p+1}$, since $RG_{p,j} = true$ by definition unless $j = abs$ or $j = p + 1$. By definition

$RG_{p,abs} = \Box(Now_p(ustate) = trying \rightarrow \Diamond(Now_p(ustate) \neq trying))$

$RG_{p,p+1} = \Box(\delta^{SMI}_{p+1}(Occurs) = request\_out \rightarrow \Diamond(Now_{p+1}(token) \neq 0))$.

By inspection of the synchronizer component module specifications, we have

$[\![Guar_{SC}]\!]_p = \Box(Now_p(ustate) = trying \rightarrow \Diamond(Now_p(ustate) \neq trying)) \wedge \Box(\delta^{SMI}_p(Occurs) = request\_in \rightarrow \Diamond(\delta^{SMI}_p(Occurs) = token\_out))$.

Assume $[\![Guar_{SC}]\!]_p$. Then $RG_{p,abs}$ follows immediately from the first conjunct of $[\![Guar_{SC}]\!]_p$. To show that the second conjunct of $[\![Guar_{SC}]\!]_p$ implies $RG_{p,p+1}$, note that the definition of the state-transition relation for synchronizer component module $p + 1$ implies that

$\Box(\delta^{SMI}_{p+1}(Occurs) = token\_in \rightarrow \Diamond(Now_{p+1}(token) \neq 0))$.

From the second conjunct of $[\![Guar_{SC}]\!]_p$, using the definition of the decomposition map $\delta_{SMI}$, we obtain

$\Box(\delta^{SMI}_{p+1}(Occurs) = request\_out \rightarrow \Diamond(\delta^{SMI}_{p+1}(Occurs) = token\_in))$.

Combining the preceding two sentences and applying temporal reasoning shows $RG_{p,p+1}$, as desired.

The most interesting part of the proof is the proof that (SMI3) holds. To show (SMI3), we must show that $Comp_{SMI} \models \bigvee_{k=0}^{n-1} RG_{i_k, i_{k+1}}$ holds for every cycle {, ... 1 , , ... 
, } that traverses the entire ring in the clockwise direction, since every other cycle from $Proc$ contains a link for which $RG_{i,j} = true$ by definition.

Suppose, to obtain a contradiction, that (SMI3) fails for this cycle. Then for all $p \in Proc$, the sentence $RG_{p-1,p}$ does not hold. This means that

$\bigwedge_{p \in Proc} \Diamond(\delta^{SMI}_p(Occurs) = request\_out \wedge \Box(Now_p(token) = 0))$.

That is, for each $p \in Proc$, eventually a point is reached at which synchronizer component module $p$ issues a request_out event, but never has the token after that point. This implies that

$\bigwedge_{p \in Proc} \Diamond\Box(Now_p(token) = 0)$.

Since $Proc$ is a finite set, it is valid to interchange the conjunction and the $\Diamond$ operator in the preceding formula, concluding that

$\Diamond \bigwedge_{p \in Proc} \Box(Now_p(token) = 0)$.

This asserts that there is some point after which no synchronizer component module ever possesses a token. This contradicts the invariance of $Token$, which states that the total number of tokens in the system is always precisely one. ∎

5. Consistency of Specifications

In Chapter 3 it is suggested that module specifications ought to be expressed in rely-/guarantee-condition form, and that the rely- and guarantee-conditions for the component modules in a system ought to be selected so that each component module guarantees precisely the conditions relied upon by its neighbors in the system. In Chapter 4 the synchronizer example illustrates how adherence to this principle can result in a simple proof of the validity condition required by the Correctness Theorem. In practice, there seems to be considerable flexibility in the choice of rely- and guarantee-conditions. Often significant simplifications in a correctness proof can be effected simply by adjusting the component module specifications.
The apparent flexibility in the choice of rely- and guarantee-conditions in specifications raises the following somewhat disturbing question: What is to prevent us from writing component module specifications with extremely weak rely-conditions (e.g. true), and ridiculously strong guarantee-conditions (e.g. false), in order to simplify the proof of correctness? An implementation whose component module validity conditions are all of the form "true → false" makes the validity part of a correctness proof extremely simple, but also vacuous. We can also consider more subtle, but still problematic specifications in which a module "guarantees" the application of some input to it, something that seems to contradict our intuitive notion of what it means to be an input.

Since a specification of the form "true → false," or a specification that guarantees the application of input, ought to be regarded as meaningless, we should have some way of distinguishing these specifications from others that are meaningful. The theory we have set up so far provides no formal criteria for making such a distinction. What we require is a suitable notion of consistency of specifications, with respect to which obviously unrealizable specifications such as "true → false" are inconsistent, and apparently reasonable specifications, such as the synchronizer component module specification, are consistent.

In mathematical logic, a theory is consistent iff it has a model. Since the "models" of specifications are behaviors, it seems reasonable to define a specification to be consistent iff there is a behavior that satisfies it. If we take the term "behavior" in this definition to mean "arbitrary behavior," though, we do not obtain a stringent enough notion of consistency. For example, every subset specification is consistent in this sense, since the empty behavior $\emptyset$ satisfies every subset specification.
To obtain more stringent notions of consistency, we must restrict our interpretation of the term "behavior" to mean "realizable" or "computable" behavior. In this chapter, we examine a notion of consistency based on a model of concurrent computation called "I/O-systems." An I/O-system models a collection of concurrent processes that interact through coupled events. By viewing I/O-systems at various levels of abstraction we obtain the "I/O-behaviors," which we take as our class of computable behaviors. A specification is defined to be "I/O-consistent" iff there exists an I/O-behavior that satisfies it. The notion of I/O-consistency seems to be quite useful for distinguishing between meaningful and meaningless eventuality specifications. We develop a technique for proving state-transition specifications to be I/O-consistent and apply this technique to show the I/O-consistency of the synchronizer component module specification.

5.1 I/O-Systems

This section defines a model of asynchronous concurrent computation called "I/O-systems." An I/O-system is a system of nondeterministic processes that interact through coupled events. The nonnull events in which each process can participate are partitioned into "input events" and "output events." An input event for a process represents the stimulation of the process by its environment, and an output event for a process corresponds to the process responding to its environment. A process can choose whether or not it will produce output, but does not have the ability to control the application of input to itself. If a process wishes to produce output, then it cannot be prevented from doing so, although a process has no control over precisely when the output will be produced.

The coupling of the processes in an I/O-system is described by a "system interface," the elements of which are "system events."
Each system event is a vector with one component for each process in the system, and represents a possible simultaneous occurrence in the computation of the system. No system event contains more than one component output event, modeling the idea that at most one process can produce an output at any instant of time.

To describe the execution of an I/O-system, it is helpful to imagine the existence of a "scheduler," which controls the path of execution of the system. For each step of the system, the scheduler chooses a system event from the system interface. All processes then simultaneously take steps corresponding to the chosen system event. By the constraint that there is at most one output component of each system event, at most one process produces an output event in each step, and the other processes perform input steps or null steps. We are only interested in computations of an I/O-system that are "fair" in the sense that the scheduler selects each process to perform output steps often enough.

We now give a formal definition of I/O-systems. We first define the notion of an I/O-interface, which is an interface whose non-$\lambda$ events are partitioned into input events and output events.

Definition - An I/O-interface is an interface $E$ together with sets $In_E$ and $Out_E$, where $In_E \subseteq E$ is a set of input events and $Out_E \subseteq E$ is a set of output events, such that the sets $In_E$, $Out_E$, $\{\lambda_E\}$ partition $E$. ∎

We next define the "asynchronous product" of a collection of I/O-interfaces. Intuitively, the asynchronous product $\bigotimes_{i \in I} F_i$ of the collection $\langle F_i \rangle_{i \in I}$ of I/O-interfaces is the interface $F$ defined as follows:

$F = \{f \in \prod_{i \in I} F_i : \text{at most one } f_i \text{ is an output event}\}$

$\lambda_F = \langle \lambda_{F_i} \rangle_{i \in I}$

$In_F = \{f \in F : f \neq \lambda_F \text{ and no } f_i \text{ is an output event}\}$

$Out_F = \{f \in F : \text{exactly one } f_i \text{ is an output event}\}$.
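The asynchronous product defined above can be enumerated directly when the component interfaces are finite. The following is a minimal sketch under stated assumptions: the class `IOInterface`, the function `async_product`, and the string `"lam"` used for every null event are hypothetical names chosen for this illustration and do not appear in the thesis.

```python
from itertools import product

LAM = "lam"  # assumed stand-in for every null event (lambda)


class IOInterface:
    """A finite I/O-interface: inputs, outputs and the null event partition the event set."""

    def __init__(self, inputs, outputs):
        self.inputs = frozenset(inputs)
        self.outputs = frozenset(outputs)
        # Inputs, outputs and {LAM} must be pairwise disjoint.
        assert not (self.inputs & self.outputs)
        assert LAM not in (self.inputs | self.outputs)

    @property
    def events(self):
        return self.inputs | self.outputs | {LAM}


def async_product(components):
    """Return (F, lam_F, In_F, Out_F) for a list of IOInterface objects."""

    def output_count(vector):
        # Number of components of the vector that are output events.
        return sum(1 for f_i, c in zip(vector, components) if f_i in c.outputs)

    all_vectors = product(*(sorted(c.events) for c in components))
    F = {v for v in all_vectors if output_count(v) <= 1}
    lam_F = tuple(LAM for _ in components)
    out_F = {v for v in F if output_count(v) == 1}
    in_F = {v for v in F if v != lam_F and output_count(v) == 0}
    return F, lam_F, in_F, out_F


# Process one can emit "out"; process two can receive "in".
F1 = IOInterface(inputs=[], outputs=["out"])
F2 = IOInterface(inputs=["in"], outputs=[])
F, lam_F, in_F, out_F = async_product([F1, F2])
```

In this two-process instance the product has four vectors: `in_F` contains only `("lam", "in")`, and `out_F` contains the two vectors in which process one performs `out`.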
The maps $\pi_i : \bigotimes_{i \in I} F_i \rightarrow F_i$, for $i \in I$, that take a vector $f$ to its $i$th component, are called the canonical projections associated with $\bigotimes_{i \in I} F_i$. ∎

In general, a system interface will not be the entire asynchronous product of the process interfaces, but rather only a sub-interface of the asynchronous product. The reason for using a sub-interface of the asynchronous product as the system interface is to capture possible coupling of events between processes. One kind of coupling that can be modeled in this way is the identification of events of distinct processes. For example, if the output event out for process one is to be identified with the input event in for process two, then we would include in the system interface the vector $\langle out, in \rangle$, in which process one performs an out event at the same time as process two performs an in event, but we would exclude from the system interface the event $\langle \lambda, in \rangle$, in which process two performs an in event while process one does nothing. Other kinds of coupling can also be modeled. For example, if the input event in for process two always occurs along with an output event out for process one, but the event out for process one need not occur along with an in event for process two, then the system interface would include the events $\langle out, in \rangle$ and $\langle out, \lambda \rangle$, but would exclude the event $\langle \lambda, in \rangle$.

Definition - A system interface for a collection $\langle F_i \rangle_{i \in I}$ of I/O-interfaces is an I/O-interface $E \subseteq \bigotimes_{i \in I} F_i$ such that

(1) The inclusion map $\gamma : E \rightarrow \bigotimes_{i \in I} F_i$ is an embedding.

(2) Each map $\pi_i \circ \gamma$ is onto $F_i$, where $\langle \pi_i \rangle_{i \in I}$ are the canonical projections associated with $\bigotimes_{i \in I} F_i$.

The collection of maps $\langle \pi_i \circ \gamma \rangle_{i \in I}$ is called the canonical decomposition map associated with $E$. ∎

Each process in an I/O-system is represented by an "I/O-machine," which is a machine that cannot prevent the occurrence of input events. The I/O-machines in an I/O-system are required to be "explicit" in the sense that each nonnull step results in the occurrence of some non-$\lambda$ step.
This assumption is justified because we think of an I/O-system as being a detailed, low-level model, in which all steps taken by processes result in explicit observable events. Later we will apply abstraction maps to the behaviors of I/O-systems to obtain less detailed, higher-level views of system behavior, in which steps can be taken that do not result in observable events.

Definition - An I/O-machine of I/O-interface $E$ is a machine $M$ of interface $E$ that is input-cooperative in the following sense: For all $q \in Q_M$ and $e \in In_E$, there exists $r \in Q_M$ such that $\langle q, e, r \rangle \in Trans_M$. An I/O-machine $M$ of interface $E$ is explicit if every step $\langle q, \lambda_E, r \rangle \in Trans_M$ has $r = q$. ∎

Definition - An I/O-system is a tuple $\mathcal{S} = \langle E, \langle M_i \rangle_{i \in I} \rangle$, where $I$ is a finite, nonempty set of process indices, $E \subseteq \bigotimes_{i \in I} F_i$ is a system interface, and each $M_i$ is an explicit I/O-machine of interface $F_i$. ∎

We associate with an I/O-system $\mathcal{S} = \langle E, \langle M_i \rangle_{i \in I} \rangle$ a system machine $M$ defined as follows:

$E_M = E$

$Q_M = \prod_{i \in I} Q_{M_i}$

$Init_M = \prod_{i \in I} Init_{M_i}$

$Trans_M = \{\langle \underline{q}, e, \underline{r} \rangle : \langle q_i, \delta_i(e), r_i \rangle \in Trans_{M_i} \text{ for all } i \in I\}$,

where $\langle \delta_i \rangle_{i \in I}$ is the canonical decomposition map associated with $E$. ∎

Definition - A computation for an I/O-system is just a computation for its system machine. ∎

A computation $X$ for an I/O-system projects to computations $X_i$ of the process machines $M_i$.

Suppose $\mathcal{S} = \langle E, \langle M_i \rangle_{i \in I} \rangle$ is an I/O-system and $M$ is the system machine. Suppose $\underline{q} \in Q_M$ is a system state. We say that process $i$ runs in a step $\langle \underline{q}, e, \underline{r} \rangle \in Trans_M$ if $\delta_i(e)$ is an output event for process $i$. We say that process $i$ is enabled in system state $\underline{q}$ if there is a step $\langle \underline{q}, e, \underline{r} \rangle \in Trans_M$ in which process $i$ runs. Suppose $X$ is a computation for $M$. Process $i$ is repeatedly enabled in $X$ if for all $t \in [0, \infty)$ there exists $t' \in [t, \infty)$ such that process $i$ is enabled in $State_X(t')$. Process $i$ repeatedly runs in $X$ if for all $t \in [0, \infty)$ there exists $t' \in [t, \infty)$ such that process $i$ runs in $Step_X(t')$.

Definition - A computation $X$ for an I/O-system is fair if for each process $i$ in the system, if process $i$ is repeatedly enabled in $X$, then process $i$ repeatedly runs in $X$.
∎

5.2 I/O-Behaviors and I/O-Consistency

Each computation of an I/O-system produces an observation over the system interface. We call the set of all observations that are produced in fair computations of an I/O-system the "primitive behavior" of the system. This behavior is called "primitive" because it contains complete detail about the events that occur during a computation of the system.

Definition - The primitive behavior $PBeh(\mathcal{S})$ of a system of I/O-processes $\mathcal{S}$ is the set of all $Obs_X$ where $X$ is a fair computation for $\mathcal{S}$. ∎

By applying abstraction maps to the primitive behaviors, we obtain additional (nonprimitive) behaviors. We call any behavior that is the abstraction of a primitive behavior an "I/O-behavior." An abstraction map can suppress information in a behavior by mapping two distinct events of the same type (either input or output) to the same event, or by mapping an output event to $\lambda$. To ensure that an abstraction map faithfully preserves the input/output structure of a behavior, we require that an abstraction map never map an input event to $\lambda$, and never map an input event and an output event to the same event. Furthermore, we require that each abstract input event be the image of some concrete input event.

Definition - An I/O-abstraction map from the I/O-interface $E$ to the I/O-interface $D$ is a translation $\alpha : E \rightarrow D$ with the following properties:

(1) $\alpha(Out_E) \subseteq Out_D \cup \{\lambda_D\}$. ($\alpha$ preserves outputs)

(2) $\alpha(In_E) \subseteq In_D$. ($\alpha$ strictly preserves inputs)

(3) $\alpha$ is onto $In_D$. ∎

Definition - A behavior $B \in Beh(D)$ is an I/O-behavior of interface $D$ iff there exists a system $\mathcal{S}$ of I/O-processes with system interface $E$ and an I/O-abstraction map $\alpha : E \rightarrow D$ such that $B = \alpha(PBeh(\mathcal{S}))$. ∎

The following result shows that the class of I/O-behaviors is a kind of completion under I/O-abstraction of the class of primitive behaviors.

Theorem 5.1 - The class of I/O-behaviors contains all primitive behaviors and is closed under I/O-abstraction operators.
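For finite interfaces, the three conditions in the definition of an I/O-abstraction map given above can be checked mechanically. The sketch below is illustrative only: the function name `is_io_abstraction_map`, the event names, and the use of the string `"lam"` for both null events are assumptions of this example, as is the reading that a translation sends the null event to the null event.

```python
LAM = "lam"  # assumed stand-in for the null events of both interfaces


def is_io_abstraction_map(alpha, in_e, out_e, in_d, out_d):
    """Check the three conditions on a finite map alpha (a dict) from E to D."""
    in_e, out_e, in_d, out_d = map(set, (in_e, out_e, in_d, out_d))
    # Assumption: a translation sends the null event to the null event.
    if alpha.get(LAM) != LAM:
        return False
    # (1) alpha preserves outputs: outputs go to outputs or to the null event.
    preserves_outputs = all(alpha[e] in out_d | {LAM} for e in out_e)
    # (2) alpha strictly preserves inputs: inputs go to inputs only.
    strictly_preserves_inputs = all(alpha[e] in in_d for e in in_e)
    # (3) alpha is onto In_D: every abstract input has a concrete input preimage.
    onto_inputs = in_d <= {alpha[e] for e in in_e}
    return preserves_outputs and strictly_preserves_inputs and onto_inputs


# Hypothetical concrete interface E and abstract interface D.
IN_E, OUT_E = {"press_left", "press_right"}, {"flash", "tick"}
IN_D, OUT_D = {"button"}, {"flash"}

# Merges the two inputs and hides the output "tick".
good = {LAM: LAM, "press_left": "button", "press_right": "button",
        "flash": "flash", "tick": LAM}
# Hiding an input violates condition (2).
hides_input = dict(good, press_left=LAM)
# Mapping an input onto an output also violates condition (2).
input_to_output = dict(good, press_right="flash")
```

Of the three maps, only `good` passes the check; it merges two concrete inputs and suppresses one concrete output, which are exactly the kinds of information loss the definition permits.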
Proof - Obvious from the definition of an I/O-behavior and the facts:

(1) Identity translations are I/O-abstraction maps.

(2) If $\alpha : F \rightarrow E$ and $\beta : E \rightarrow D$ are I/O-abstraction maps, then $\beta \circ \alpha$ is an I/O-abstraction map. ∎

By taking the I/O-behaviors as our class of realizable or computable behaviors, we obtain the notion of "I/O-consistency" of specifications.

Definition - A specification $S$ of I/O-interface $D$ is I/O-consistent if there exists an I/O-behavior $B$ of interface $D$ such that $B$ satisfies $S$. ∎

5.3 Machine Characterization of I/O-Behaviors

To obtain techniques for proving the I/O-consistency of state-transition specifications, it is convenient to have a direct characterization, not involving I/O-abstraction maps, of the I/O-behaviors of interface $E$. Such a characterization is provided by Theorem 5.4 below. Theorem 5.4 states that the I/O-behaviors are exactly the sets of observations produced by "productive step machines," which are I/O-machines plus some scheduling information.

Definition - A productive step set for an I/O-machine $M$ of interface $E$ is a set $Prod \subseteq Trans_M \cap Steps(Out_E \cup \{\lambda_E\}, Q_M)$ that contains no null steps. ∎

Definition - A productive step machine (PS-machine) of I/O-interface $E$ is a tuple $\langle M, \langle Prod_i \rangle_{i \in I} \rangle$, where $M$ is an I/O-machine of interface $E$ and $\langle Prod_i \rangle_{i \in I}$ is a finite, nonempty collection of productive step sets for $M$, such that $\bigcup_{i \in I} Prod_i$ equals the set of all nonnull steps $\langle q, e, r \rangle \in Trans_M \cap Steps(Out_E \cup \{\lambda_E\}, Q_M)$. ∎

Suppose that $\langle M, \langle Prod_i \rangle_{i \in I} \rangle$ is a PS-machine. The notions of the productive step set $Prod_i$ being enabled in a state of $M$ and running in a step of $M$ are defined in the obvious way. A computation $X$ for $M$ is fair if for each $i \in I$, if $Prod_i$ is repeatedly enabled in $X$ then $Prod_i$ repeatedly runs in $X$. Define the behavior $Beh(M, \langle Prod_i \rangle_{i \in I})$ of the PS-machine $\langle M, \langle Prod_i \rangle_{i \in I} \rangle$ to be the set of all $Obs_X$ where $X$ is a fair computation of $M$.

The following lemma states that every PS-machine has the same behavior as a PS-machine whose productive step sets are pairwise disjoint.
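The productive-step conditions and the fairness requirement for PS-machines described above can be made concrete for a small finite machine. The sketch below makes several assumptions that are not in the thesis: steps are triples `(q, e, r)`, a null step is taken to be a `"lam"` step that leaves the state unchanged, and a computation is approximated by a discrete cycle of steps repeated forever, whereas the thesis works with histories over $[0, \infty)$. All function names are hypothetical. The machine used is the two-state button/flash machine of Section 5.4, Example 1.

```python
LAM = "lam"  # assumed stand-in for the null event


def is_ps_machine(trans, outputs, prod_sets):
    """Check only the productive-step conditions (not input-cooperativeness)."""

    def is_null(step):
        q, e, r = step
        # Assumption: a null step is a lambda step that leaves the state unchanged.
        return e == LAM and q == r

    # All nonnull output-or-lambda steps of the machine.
    candidates = {s for s in trans
                  if s[1] in set(outputs) | {LAM} and not is_null(s)}
    covered = set().union(*prod_sets)
    # Equality gives both "no null or input steps inside" and "covers everything".
    return len(prod_sets) > 0 and covered == candidates


def enabled(prod, state):
    """A productive step set is enabled in a state if it has a step from it."""
    return any(q == state for (q, _e, _r) in prod)


def lasso_is_fair(prod_sets, cycle):
    """Fairness of the computation that repeats `cycle` forever."""
    states_on_cycle = {q for (q, _e, _r) in cycle}
    for prod in prod_sets:
        repeatedly_enabled = any(enabled(prod, q) for q in states_on_cycle)
        repeatedly_runs = any(step in prod for step in cycle)
        if repeatedly_enabled and not repeatedly_runs:
            return False
    return True


# Button/flash machine: the state is a flag set by "button" and reset by "flash".
STATES = [True, False]
TRANS = (
    {(q, "button", True) for q in STATES}
    | {(True, "flash", False)}
    | {(q, LAM, q) for q in STATES}
)
PROD = {(True, "flash", False)}
```

Under these assumptions, alternating `button` and `flash` forever is fair, idling forever with the flag set is not, and idling forever with the flag clear is fair because the productive step set is never enabled.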
Lemma 5.2 - If ,E,> is a PS-machine of interface E, then there exists a PS-machine ,E,> of interface E such that the collection ,E, is 1 1 pairwise disjoint and such that Beh(M ', ,E,) = Beh(M, ,E ). 1 Proof - The idea of the proof is to include a dummy "tag" component in the state of M ', so that steps in Prod,' write ; into the tag component. This ensures disjointness, since if i * ;, then steps in Prod, and Prodi write different values into the tag component. Fnrrn~lly, define QM, = QM XI lnitM, = lnitM X I TransM. = {«q, k>, e, E TransM (2) If (q, e, r> E u,E, Prod,, then E Prodm. (3) If ( u,E, Prod,, then m = k. Prod,' = {«q, k>, e, E TransM .: E Prod,} It is straightforward to check that M' is an 1/0-machine of interface E and that the collection ,E, is pairwise disjoint. To show that ,€,> is a PS-machine, we must show that the Prod,' cover the non null output or A-steps in TransM ,. If «q, k>, e, is a nonnull output or A-step in TransM. If m -:t:- k, then E u,E, Prod, by part (3) of the definition of TransM. and hence E Prodm by part (2) of the definition of TransM .. By definition of Prodm ', we have that «q, k>, e, is a nonnull output or A-step in TransM' then E u,E, Prod, because the Prod, cover the nonnull output or A-steps in TransM. By part (2) of the definition of TransM ., we know that E Prodm' and . 90. hence «q, k>, e, ,E,). Each computation X' of ,E,> defines a computation X of ;E,>, which we obtain simply by deleting the tag information from X. Suppose X ' is fair and that Prod; is repeatedly enabled in X. It is easy to see from the definition of Prod;' that if Prod; is enabled at time tin X, then Prod;' is enabled at time tin X '. Hence Prod ' is repeatedly enabled in X ', and thus repeatedly runs in X' by the assumption 1 that X' is fair. If Prod;' runs at time t in X ', then by definition of Prod;' it follows that Prod runs at time tin X, so that Xis fair. 
1 ' Case Beh(M, /E/) {;; Beh{M ', ,Et): Given a fair computation X of ,E,>, we wish to construct a fair 1 computation X 'of ,E,> that generates the same observation. We construct X ' from X simply by filling in appropriate tag information to match the occurrence of productive steps in X, however we must do this in such a way that X ' is fair. To construct X ', let f: T - Steps{E, QM) be a history skeleton that spans X, where T = icE)(' Suppose Stepx(t ) = for each k € X By a straightforward 11 11 11 11 1 inductive construction involving fair scheduling of the elements of I, we can obtain a sequence 11Exof elements of I such that «q., m11>, e., € Prod, for infinitely many k € J(, then m• = I tor 11 11 infinitely many k E X The history skeleton f' that maps t to the step «q , m >, e , 11 11 11 11 ,E,> of interface E, and an 1/0-abstraction map a: E - D, there exists a PS-machine ~M ', ,E,> of interface D such that Beh(M ', ;E,) = a(Beh(M, ,e~). . Proof - The basic idea of the proof is simple: M' and the Prod.' are defined by taking I the images of Mand the Prod under a. There is one problem with the straightforward 1 · 91 • execution of this idea: the Prod.' might contain null steps. We solve this problem by I introducing into the state of M ' an "idling counter," which is a boolean component whose only purpo~ is to change state upon execution of productive steps. Formally, define ;E,> as follows: QM, = QM X {0, 1} lnitM. = lnitM X {O, 1} TransM. = {«q, b>, a{e), E TransM (2) If E U;E:, Prod,, then c = 1 -b, otherwise c = b. Prod;' = {«q, b>, a(e), > E TransM .: E Prod,}. We claim that M ' is an 1/0-machine. It is clear that lnitM. is nonempty. Part (2) of the definition of TransM, does not prevent Trans,.,, from containing all null steps, since no such step can be in u,E:, Prod,. Thus M' is a machine. To show that M' is input-cooperative, suppose E QM. and d E ln . Since a is onto ln and preserves 0 0 outputs, there exists e E lnE with a(e) = d. 
By the input-cooperative property of M, there exists r with E TransM. Since ( u,E:, Prod, by the fact that e is an input event, it follows that «q, b>, d, ,E,> is a PS-machine. By definition Prod, ' ~ TransM. for all i E /. Since each step in Prod, is an output or A-step and a preserves outputs, it follows that each step in Prod,' is an output or A-step. Each Prod,' contains no null steps because the idling counter is complemented in each step in Prod,'. To see that every output or A-step in TransM, is in some Prod, ', note that because a strictly preserves inputs, each output or A-step in Trans...,, cannot be the image of an input step in TransM' and therefore must be the image of an output or A-step in TransM. Since the Prod, cover all output or A-steps of TransM, it follows that the Prod,' must cover all output or A-steps of TransM '. We claim that Beh(M ', ,E:,) = a(Beh(M, € )). 1 1 1 Each computation x of M maps in an obvious way (by taking the image of the observation part under a, and deleting the idling counter from the state part) to a computation X' of M ', such that Obsx. == a(Obsx>· It suffices to show that if X is fair, . 92. then so is X '. Suppose that Xis fair. Fix; E /, and suppose that Prod ' is repeatedly 1 enabled in X '. We claim that Prod, ' repeatedly runs in X '. By definition, Prod, ' is enabled in state q iff Prod is enabled in state q. It follows that Prod is repeatedly 1 1 enabled in X, and hence by fairness of X, that Prod, repeatedly runs in X. By definition of Prod,', if Stepx(t) E Prod,, then Stepx ,(t) E Prod,'. Thus Prod 'repeatedly runs in X '. 1 Case Be~(M ', ,E,) ~ a(Beh(M, ,E ), and let X' be a fair computation of M' in 1 1 which the observation x ' is generated. We will construct a fair computation X of M, such that a(Obsx> = x '. The idea is simply to choose inverse images under a of the steps in X ', however this must be done carefully to ensure fairness. Let T = ' d1t, , dlt, € TransM. 
Because a might map two different e's to the same d, we 11 can't necessarily select the e in such a way that for each;€/, the step € 11 11 11 11 Prod, iff «q , b ?, dlt, , dlt, E Prod, for infinitely 11 11 11 manyk. The function f that takes tit to the step is a history skeleton over EM and QM. By Lemma 3.5 there is a unique history X such that f spans X. It is easily verified that Xis a computation of M, with a(Obsx> = x '. To show fairness, fix; E / and suppose that Prod, is repeatedly enabled in X. We claim that Prod repeatedly runs in X. From the 1 definition of Prod ' we know that Prod ' is repeatedly enabled in X '. By the fairness of 1 1 X' we know that Prod ' repeatedly runs in X '. This implies that E Prod,' for 1 infinitely many k, and hence by construction that €Prod, for infinitely many k. It follows that Prod repeatedly runs in X. I 1 The following theorem is our desired characterization of the 1/0-behaviors: a behavior is an 1/0-behavior iff it is the behavior of a PS-machine. Theorem 5.4 - Suppose D is an 1/0-interface. Then a behavior 8 E Beh(D) is an -93- 1/0-behavior of interface D iff B = Beh(M, ;E,> of interface D. Proof - = > Since the class of behaviors of PS-machines is closed under 1/0-abstraction by Theorem 5.1, it suffices to show that every primitive behavior B is the behavior of a PS-machine. Suppose B = PBeh(:f), where '! = is an 1/0-system. We associate a PS-machine ;e:,> with 'J as follows: The machine M is the system machine for '!. The set Prod; is the subset of TransM in which process i runs. Since a step in which process; runs is always an output step, it is clear that Prod, is a productive step set for M. Since every nonnull output or A-step in TransM is in fact an output step for some process i E /, and hence is in Prod;, it follows that the Prod, sets cover the nonnull output or A-steps in TransM. 
It is obvious that the set of fair computations of the system 'J is exactly the set of fair computations of the PS-machine ,€,>' and thus PBeh(:f) = Beh(M, 1€1). < = Suppose that ,€,> is a PS-machine of interface D. We construct an 1/0-system '! = ;E,> and an 1/0-abstraction map a: E - D, such that Beh(M, ,€,) = a(PBeh(:f)). Without loss of generality we make the following three assumptions about ,€,>: (1) The set lnitM of initial states for M contains exactly one state q • 0 (2) For all q € QM and all e E ln0, there is a unique, E QM such that E TransM. (3) Prod, n Prodi = 0 for;~;. A PS-machine ,e:,> that does not have these three properties can easily be transformed, without changing its behavior, into a PS-machine ;€,> that does have these properties. We first obtain properties (1) and (2) by buffering input events in an input queue in the order that they occur so that the change of state associated with an input event is just to append the event to the end of the input queue. All nondeterministic choice, including the choice between multiple initial states, is absorbed into the output steps. -94- Formally, we transform the PS-machine ;e,> into a PS-machine ;e,> by defining StateM .. to consist of all pairs , where q is either an element of StateM. 9r the distinguished symbol .1., and u E Inf•. The single initial state of M" is the state <.1., A>. The transition relation TransM .. consists of all steps «q, u>, e, <,, v> > such that - If e E Inf then r = q and v ::: ue. - If e E Outf U {>\f}, then v = A, and either (a) is in TransM.•• or (b) q = .1. and E Trans,...• for some s in I nit,_. .. The set Prod;" consists of all steps «q, u>, e, E Prod;', and in addition, either q -= .1. and the step is in TransM .•, or q = .1. and for some q 'E lnitM, the step is in Trans,..,•. Once ;e,> with properties (1) and (2) is obtained, it can be transformed into with all three properties by an application of Lemma 5.2. We now proceed to the construction of '!. 
The idea is as follows: The system $\mathcal{S}$ will contain one process for each $i \in I$. The processes in $\mathcal{S}$ perform a lock-step simulation of the machine $M$. The interface for each of the processes in the system $\mathcal{S}$ consists of the null event, the input events of $D$, and the set of all productive steps for $M$. The input events for process $i$ will be the input events of $D$ and the steps in $\bigcup_{j \in I - \{i\}} Prod_j$. The output events for process $i$ will be the steps in $Prod_i$. Each process keeps track of the current simulated state of $M$, and permits an output event to occur only if the event corresponds to a step of $M$ from the current simulated state of $M$. To ensure that the input-cooperative property holds, process $i$ imposes no requirements on the state from which a step in $Prod_j$ can occur, if $j \neq i$.

Formally, define the I/O-interfaces $F_i$ as follows:

$F_i = \{\lambda_{F_i}\} + In_D + \bigcup_{j \in I} Prod_j$

$In_{F_i} = In_D + \bigcup_{j \in I - \{i\}} Prod_j$

$Out_{F_i} = Prod_i$

Define the system interface $E \subseteq F = \bigotimes_{i \in I} F_i$ as follows:

$E = \{f \in F : f_i = f_j \text{ for all } i, j \in I\} \cup \{\lambda_F\}$

$\lambda_E = \lambda_F$

$In_E = E \cap In_F$

$Out_E = E \cap Out_F$

It is easy to see that the inclusion map $\gamma : E \rightarrow F$ is an embedding. For each $f \in In_D$, the identically $f$ vector $\langle f \rangle_{i \in I}$ is in $E$. By the assumption that the $Prod_i$ are pairwise disjoint, it follows that the identically $f$ vector $\langle f \rangle_{i \in I}$ is in $E$ for each $f \in Out_{F_i}$ as well. This shows that $\pi_i \circ \gamma$ is onto $F_i$, where the $\pi_i$ are the canonical projections associated with $F$.

Define $\alpha : E \rightarrow D$ to be the translation that behaves in the following way on the identically $f$ vector $\langle f \rangle_{i \in I} \in E$:

- If $f \in In_D$, then $\alpha(\langle f \rangle_{i \in I}) = f$.

- If $f = \langle q, d, r \rangle \in Prod_i$ for some $i \in I$, then $\alpha(\langle f \rangle_{i \in I}) = d$.

We claim that $\alpha$ is an I/O-abstraction map. It is clear that $\alpha$ is onto $In_D$. The map $\alpha$ preserves outputs because if the identically $f$ vector is an output event of $E$, then $f \in Prod_i$ for some $i$ and hence $f$ is an output or $\lambda$-step. To show that $\alpha$ strictly preserves inputs, suppose the identically $f$ vector $\langle f \rangle_{i \in I}$ is an input event of $E$.
Then f E ln0, so a(,E,) = f € lno· The machines M are defined as follows: 1 EM = F1 I • QM = QM I lnitM = lnitM = {q } 0 I TransM = {: one of (1)-(4) below holds} I (1) f = AF and, = q, I (2) f E ln and E TransM. 0 (3) f = EProd forsomeJ;ti. 1 (4) f = € Prod, Obviously M is a machine and every step E TransM has r = q. To see that M, 1 I I is input-cooperative, suppose q E QM, and f E lnF,' Then either f E ln or f E Prod; for 0 some/ ;t i. If f E ln then f is enabled in state q by part (2) of the definition .of TransM. 0 I because Mis input-cooperative. If f = E Prod; for some j ;ti, then f is enabled in state q by part (3) of the definition of TransM. I A straightforward induction establishes that if a is a reachable state of the system 'J, then q = qi tor all i, / E /. This argument uses the assumed uniqueness of the initial 1 state of M, plus the assumption that a state q and an event e E lnE uniquely determine a stater such that E TransM. Intuitively, since the processes in 'J do not interact . 96. with each other during input steps, the uniqueness assumptions are needed to ensure that all processes reach the same new state in each such step. There is an obvious correspondence between the steps of the machine Mand the steps of the system 'J. Specifically, each steps = of M determines a steps' = of 'J under the definitions: • g is the identically q vector • !. is the identically r vector - e = AE' if s is null = ;E1' if d E lno = ;Ei•· ifs is a nonnull output or A-Step. It easy to see that a step s of M is enabled in state q of M iff the corresponding step s ' is enabled for the system '! in state ,E.r The correspondence between the steps of M and the steps of 'J therefore defines a bijection between the set of computations of M and the set of computations of 'J, such that if X ' is a computation of 'J and X is the corresponding computation of M, then Obsx = a(Obsx .). 
Furthermore, a step s of M is in Prod_i iff process i runs in the corresponding step s' of 𝒥, so that fairness is preserved in both directions of this correspondence. It follows that α(PBeh(𝒥)) = Beh(M, ⟨Prod_i⟩_{i∈I}). ∎

The following two properties of I/O-behaviors are easily derived from the PS-machine characterization.

Corollary 5.5 - If B is an I/O-behavior of interface E, then B ≠ ∅.

Proof - Suppose B = Beh(M, ⟨Prod_i⟩_{i∈I}). It suffices to show that there is a fair computation of M. We construct a sequence q_0, q_1, ... of states of M, and a sequence e_0, e_1, ... of events of E, such that the following properties hold:

(1) ⟨q_k, e_k, q_{k+1}⟩ ∈ Trans_M for all k ∈ N,
(2) For each i ∈ I, either Prod_i is enabled in only finitely many of the q_k, or else the step ⟨q_k, e_k, q_{k+1}⟩ is in Prod_i for infinitely many k.

Letting t_k = k for each natural number k and applying Lemma 3.5 yields a fair computation of M.

To construct the q_k and e_k, first let q_0 ∈ Init_M be chosen arbitrarily. We maintain a running assignment of priorities to the elements of I so that at each stage of the construction i is more urgent than j iff a step in Prod_i has been chosen less recently than a step in Prod_j. At stage k, where k ≥ 0, we choose e_k and q_{k+1} so that ... and q_{k+1} = q_k. ∎

A behavior B is asynchronous if whenever x ∈ B and f: [0, ∞) → [0, ∞) is an order-isomorphism, then x ∘ f ∈ B.

Corollary 5.6 - I/O-behaviors are asynchronous.

Proof - Straightforward from the observation that if X is a fair computation of a PS-machine ⟨M, ⟨Prod_i⟩_{i∈I}⟩ and f: [0, ∞) → [0, ∞) is an order-isomorphism, then X ∘ f is also a fair computation of ⟨M, ⟨Prod_i⟩_{i∈I}⟩. ∎

5.4 Examples of I/O-Behaviors

In this section we give two examples of I/O-behaviors and an example of a behavior that is not an I/O-behavior.
Example 1: An I/O-Behavior

As an example of how I/O-behaviors can be used to model a system capable of satisfying eventuality requirements, imagine that we wish to model the behavior of a "black box" to which input stimuli can be applied by pressing a single button, and from which output can be observed by flashes of a single light bulb. The black box has the property that every press of the button is later followed by a flash of the light bulb, and no flashes of the bulb occur unless the button has been pressed at least once since the time of the most recent previous flash.

The interface of such a black box is the I/O-interface E with E = {λ, button, flash}, In_E = {button} and Out_E = {flash}. The behavior of the black box is defined by a PS-machine M of interface E. Intuitively, a push of the button sets a flag in the state of M to true. A flash of the light can occur only when the flag is true, and causes the flag to be reset to false. There is one productive step set Prod, which contains exactly those steps in which flashes occur.

Formally,

    E_M = E
    Q_M = {true, false}
    Init_M = {false}
    Trans_M = {⟨q, button, r⟩ : r = true} ∪ {⟨true, flash, false⟩} ∪ {⟨q, λ, q⟩ : q ∈ Q_M}
    Prod = {⟨q, flash, r⟩ ∈ Trans_M}

That ⟨M, Prod⟩ is a PS-machine of interface E is easily checked. Let B = Beh(M, Prod), so that B is an I/O-behavior. Through analysis of the fair computations of M it can be shown that an observation x ∈ Obs(E) is in B iff there is a surjective total function

    f: {t ∈ [0, ∞) : x(t) = button} → {t' ∈ [0, ∞) : x(t') = flash}

such that for all t ∈ [0, ∞) with x(t) = button, f(t) is the least t' ∈ (t, ∞) such that x(t') = flash. That is, an observation x is in B provided that in x, every push of the button "causes" a future flash of the light, and every flash of the light is caused by some collection of recent past pushes of the button.

Example 2: Two Productive Step Sets

We can give an example of an I/O-behavior that is not the behavior of a PS-machine with one productive step set.
Let the interface E be defined by: E = {λ, button, flash1, flash2}, where In_E = {button} and Out_E = {flash1, flash2}. Let B be the set of all x ∈ Obs(E) such that the following properties hold:

(1) Occurrences of flash1 appear only between the 2k-th and (2k+1)-st occurrences of button, where k ∈ N.
(2) Occurrences of flash2 appear only between the (2k+1)-st and 2(k+1)-st occurrences of button.
(3) x contains infinitely many occurrences either of flash1 or of flash2.
(4) If x contains infinitely many occurrences of button, then it contains infinitely many occurrences of both flash1 and flash2.

It is straightforward to show that B is the behavior of a PS-machine of interface E with two productive step sets, one that governs the occurrence of flash1 events and one that governs the occurrence of flash2 events.

Suppose B is the behavior of a PS-machine ⟨M, Prod⟩ with one productive step set. Construct a computation of M by repeating the following procedure: Run M until a flash1 event is produced, then run M for two steps containing a button input. It is always possible to obtain the flash1 events in this construction, since otherwise we could construct a fair computation in which only finitely many flash1 events and no flash2 events occur. It is always possible to run the button events by the input-cooperative property of M. The above construction yields a computation X of M that must be fair, since it contains infinitely many steps in which the output event flash1 occurs, and which must be in the single productive step set Prod because Prod contains all output steps of M. However, X generates an observation in which infinitely many button events occur, but no flash2 events occur. This contradicts property (4).

Example 3: A Non-I/O-Behavior

We can also give an example of a set that is demonstrably not an I/O-behavior. Define the I/O-interface E as follows:

    E = {λ, button, flash}
    In_E = {button}
    Out_E = {flash}.
Let the behavior B ∈ Beh(E) be the set of all x ∈ Obs(E) such that x contains an infinite number of occurrences of flash, and such that either the number of occurrences of button in x is finite or

    (#flashes in x on the interval [0, t)) / (#buttons in x on the interval [0, t)) → 0 as t → ∞.

We argue that B is not an I/O-behavior of interface E. Suppose ⟨M, ⟨Prod_i⟩_{i∈I}⟩ is a PS-machine of interface E, whose behavior is B. Construct a computation X for M by repeating the following procedure: Run M without input until a flash event is produced, then run M for one step with a button input. We can run M until a flash event is produced by always trying to take steps in which flash events are produced, if possible, otherwise taking some other productive step. During this construction, we make sure to use a fair scheduling algorithm to determine which of the Prod_i should be executed at each step. We can never reach a state in which no productive steps are enabled, otherwise we could construct a fair computation in which only finitely many flash events are produced. We can run M at any time with a button input by the input-cooperative property of M. The above construction yields a computation X of M that must be fair, since the fair scheduling of the Prod_i ensures that every repeatedly enabled Prod_i will be repeatedly run. However, computation X generates an observation x that contains infinitely many occurrences of button events, and in which the ratio of the density of flash events to button events approaches one in the limit, rather than zero. This contradicts the assumption that Beh(M, ⟨Prod_i⟩_{i∈I}) = B.

5.4.1 Proving I/O-Consistency

From the PS-machine characterization of the I/O-behaviors we obtain the following test for I/O-consistency of subset specifications.

Theorem 5.7 - Suppose that S is a subset specification of I/O-interface E. Then S is I/O-consistent iff there exists a PS-machine ⟨M, ⟨Prod_i⟩_{i∈I}⟩ of interface E such that Beh(M, ⟨Prod_i⟩_{i∈I}) ⊆ O(S).

Proof - Obvious. ∎
If S = ⟨M, V⟩ is a state-transition specification, then to show the I/O-consistency of S, it suffices to define a collection of productive step sets for M, such that every fair computation of M is in the set V of valid computations.

Corollary 5.8 - Suppose that S = ⟨M, V⟩ is a state-transition specification of I/O-interface E. Suppose that ⟨M, ⟨Prod_i⟩_{i∈I}⟩ is a PS-machine of interface E. If every fair computation of M is in V, then S is I/O-consistent.

Proof - Since ⟨M, ⟨Prod_i⟩_{i∈I}⟩ is a PS-machine of interface E, it follows that Beh(M, ⟨Prod_i⟩_{i∈I}) is an I/O-behavior of interface E. Since every fair computation of M is in V, we know that Beh(M, ⟨Prod_i⟩_{i∈I}) ⊆ O(S). By Theorem 5.7, S is I/O-consistent. ∎

To illustrate the use of this result, we apply it to a simple example specification: A neuron is a module with a single input event in, and a single output event out. The state set for the neuron is the set {ff, tt}. At any instant of time, if the state of the neuron is tt, then the neuron is said to be excited, otherwise the neuron is said to be inhibited. Initially the neuron is excited. An in event can occur at any time, and causes the neuron to become inhibited. If the neuron is excited, then it can fire, producing an out event, and then becoming inhibited. The neuron should satisfy the condition, "If the neuron becomes excited and remains that way, then eventually it will fire."

The neuron module description can be formalized as a state-transition specification.

    E_NEU = {λ, in, out}
    In_NEU = {in}
    Out_NEU = {out}
    Q_NEU = {ff, tt}
    Init_NEU = {tt}

A step ⟨q, e, r⟩ ∈ Trans_NEU iff either e = λ and r = q or one of the conditions (in), (out) below holds:

    (in) e = in and r = ff
    (out) e = out and q = tt

The neuron module validity condition is defined by:

    Valid_NEU ≡ □(□(Now = tt) → ◊(Occurs = out)).

To show the I/O-consistency of the neuron specification, we define a single productive step set Prod_NEU as follows: ⟨q, e, r⟩ ∈ Prod_NEU iff e = out, q = tt, and r = ff.
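The neuron machine is small enough to enumerate, so the PS-machine conditions can be checked mechanically. The following is a minimal sketch, not part of the thesis. It assumes that an out step always leaves the neuron inhibited (following the prose description), that a null step is a λ-step that leaves the state unchanged, and that the productive step sets must cover every nonnull step whose event is not an input. The function names are hypothetical.

```python
from itertools import product

LAMBDA = "lambda"                      # stands for the null event
EVENTS = (LAMBDA, "in", "out")
INPUTS = ("in",)
STATES = ("ff", "tt")                  # ff = inhibited, tt = excited

def trans(q, e, r):
    """Transition relation of the neuron machine."""
    if e == LAMBDA:
        return r == q                  # null steps leave the state unchanged
    if e == "in":
        return r == "ff"               # an input always inhibits the neuron
    if e == "out":
        # Assumption: firing needs an excited neuron and leaves it inhibited.
        return q == "tt" and r == "ff"
    return False

def prod(q, e, r):
    """The single productive step set Prod_NEU."""
    return e == "out" and q == "tt" and r == "ff"

# All steps of the machine, as (state, event, next state) triples.
STEPS = [s for s in product(STATES, EVENTS, STATES) if trans(*s)]

def input_cooperative():
    """Every input event is enabled in every state."""
    return all(any(trans(q, e, r) for r in STATES)
               for q in STATES for e in INPUTS)

def is_null(step):
    q, e, r = step
    return e == LAMBDA and q == r

def prod_covers_nonnull_output_steps():
    """Every nonnull output or lambda step lies in the productive set."""
    return all(prod(*s) for s in STEPS
               if s[1] not in INPUTS and not is_null(s))

def enabled(q):
    """Enabled_NEU(q): some step of Prod_NEU can be taken from state q."""
    return any(prod(q, e, r) for e in EVENTS for r in STATES)
```

Under these assumptions the enumeration yields five steps, and Prod_NEU is enabled exactly in the excited state.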
It is clear by inspection that M_NEU is input-cooperative, and that Prod_NEU is a productive step set for M_NEU. To show the I/O-consistency of the neuron specification, we must show that every fair computation of M_NEU is valid. That is,

    Comp_NEU ∧ Fair_NEU ⊨ Valid_NEU,

where

    Fair_NEU ≡ □◊Enabled_NEU(Now) → □◊Prod_NEU(Now, Occurs, After)
    Enabled_NEU(q) ≡ (∃e ∈ E_NEU, r ∈ Q_NEU) Prod_NEU(q, e, r).

We claim the stronger property Fair_NEU ⊨ Valid_NEU. To show this, we use the neuron module specification and the definition of Prod_NEU to expand the term Fair_NEU. From the definition of Prod_NEU we obtain

    Enabled_NEU(q) ≡ q = tt,

and hence that

    Fair_NEU ≡ □◊(Now = tt) → □◊(Now = tt ∧ Occurs = out ∧ After = ff).

By straightforward temporal and propositional reasoning it is now easy to see that

    □◊(Now = tt) → □◊(Now = tt ∧ Occurs = out ∧ After = ff) ⊨ □(□(Now = tt) → ◊(Occurs = out)).

That is, if we suppose that (1) whenever the state repeatedly takes on the value one then it is also repeatedly the case that an out event occurs (which takes the state from one to zero), then we are entitled to conclude that (2) whenever the state is persistently one after some instant, then there is a later instant at which an out event occurs.

We can use the PS-machine characterization of the I/O-behaviors to show the I/O-inconsistency of a slightly stronger version of the neuron specification, obtained by using the stronger validity condition

    Valid'_NEU ≡ □(Now = tt → ◊(Occurs = out)).

This condition states that if the neuron is ever excited for a single instant, then it must eventually fire. Suppose there is a PS-machine ⟨M, ⟨Prod_i⟩_{i∈I}⟩ of interface E_NEU such that Beh(M, ⟨Prod_i⟩_{i∈I}) satisfies the strong neuron specification. Construct a computation of M as follows: Run M for one step with input in, and then repeatedly run productive steps of M if possible, otherwise null steps, being sure to schedule the occurrences of Prod_i fairly. The result is a fair computation X of M.
Since the observation x = Obs_X satisfies the strong neuron specification, there must exist a valid computation X' of M_NEU such that Obs_{X'} = x. In X', the neuron module is excited at time 0, an in event occurs at time 0, and no input events occur after time 0. Consequently, the neuron module is inhibited after time 0, and thus no out events can appear in x because X' is a computation of M_NEU. Thus, in the computation X' of M_NEU, the neuron module is excited at time 0 but no out events subsequently occur. This means that the computation X' of M_NEU fails to satisfy the validity condition Valid'_NEU, a contradiction. We conclude that the PS-machine ⟨M, ⟨Prod_i⟩_{i∈I}⟩ cannot exist and the strong neuron specification is I/O-inconsistent.

5.4.2 I/O-Consistency of the Specification SC

As an extended example of an I/O-consistency proof, we prove the I/O-consistency of the synchronizer component module specification. For the productive step sets, we use the sets Prod_run, Prod_token_out, and Prod_request_out, defined as follows:

    Prod_run(q, e, r) ≡ e = run ∧ Trans_SC(q, e, r)
    Prod_token_out(q, e, r) ≡ e = token_out ∧ Trans_SC(q, e, r)
    Prod_request_out(q, e, r) ≡ e = request_out ∧ Trans_SC(q, e, r).

It is easily checked that ⟨M_SC, ⟨Prod_run, Prod_token_out, Prod_request_out⟩⟩ is a PS-machine of interface E_SC. We must show that each fair computation is valid; that is,

    Comp_SC ∧ Fair_run ∧ Fair_token_out ∧ Fair_request_out ⊨ Valid_SC,

where

    Fair_run ≡ □◊Enabled_run(Now) → □◊Prod_run(Now, Occurs, After)
    Fair_token_out ≡ □◊Enabled_token_out(Now) → □◊Prod_token_out(Now, Occurs, After)
    Fair_request_out ≡ □◊Enabled_request_out(Now) → □◊Prod_request_out(Now, Occurs, After)

and each Enabled_i(q), where i ∈ {run, token_out, request_out}, is a formula that expresses the conditions under which Prod_i is enabled in state q.
Using the definitions of the Prod_i given above, we derive the following expressions for Enabled_run, Enabled_token_out, and Enabled_request_out:

    Enabled_run(q) ≡ q(ustate) = trying ∧ q(token) ≠ 0
    Enabled_token_out(q) ≡ q(ustate) ≠ running ∧ q(token) ≠ 0
    Enabled_request_out(q) ≡ q(token) = 0

To show Comp_SC ∧ Fair_run ∧ Fair_token_out ∧ Fair_request_out ⊨ Valid_SC, we assume Comp_SC, Rely_SC, ¬Guar_SC, Fair_run, Fair_token_out, and Fair_request_out, and derive a contradiction. That is, we consider a fair computation in which the synchronizer component module rely-conditions are satisfied, but in which the guarantee-conditions are not satisfied.

If ¬Guar_SC holds, then either

    (A) ¬□(Now(ustate) = trying → ◊(Now(ustate) ≠ trying))

or

    (B) ¬□(Occurs = request_in → ◊(Occurs = token_out)).

Thus the proof can be split into two cases, one headed by assumption (A), and the other by assumption (B).

Case (A): Suppose that (A) holds. Then by temporal reasoning, we have

    ◊(Now(ustate) = trying ∧ □(Now(ustate) = trying))
    (*) ◊□(Now(ustate) = trying)

That is, it is persistently the case that the user process is trying. By definition of Trans_SC, the following is valid:

    Comp_SC ⊨ □(Occurs = run → After(ustate) ≠ trying)

and thus, using the temporal tautology ⊨ □(φ(After) → ◊φ(Now)), that

    Comp_SC ⊨ ◊□(Now(ustate) = trying) → ◊□(Occurs ≠ run).

Intuitively, since occurrence of run results in the user process leaving the trying state, if the user process is persistently trying, then it must be the case that a run event persistently does not occur. Applying this to formula (*) yields

    ◊□(Occurs ≠ run ∧ Now(ustate) = trying)

That is, it is persistently the case that the user process is trying but a run event never occurs. Using the definition of Prod_run, we conclude

    ◊□(¬Prod_run(Now, Occurs, After) ∧ Now(ustate) = trying).

By applying the hypothesis that Fair_run holds, we obtain

    ◊□(¬Enabled_run(Now) ∧ Now(ustate) = trying).
Using the expression for Enabled_run obtained above, we have

    ◊□(Now(token) = 0 ∧ Now(ustate) = trying).

That is, it is persistently the case that the synchronizer component module possesses no tokens, and the user process is trying. Using the hypothesis that Fair_request_out holds, we obtain

    □◊(Occurs = request_out) ∧ ◊□(Now(token) = 0).

That is, it is repeatedly the case that request_out occurs, but persistently the case that the synchronizer component module possesses no tokens. Applying the hypothesis that Rely_SC holds, we conclude

    □◊(Now(token) ≠ 0) ∧ ◊□(Now(token) = 0).

That is, it is repeatedly the case that the synchronizer module possesses a token, but persistently the case that the synchronizer module possesses no tokens. This is a contradiction, and we conclude that case (A) is impossible.

Case (B): Suppose that (B) holds. Then by temporal reasoning, we have

    ◊(Occurs = request_in ∧ □(Occurs ≠ token_out)).

That is, eventually there is a point at which a request for the token is received, but no token is ever sent in response. Using the definition of Prod_token_out, and temporal reasoning, we obtain

    ◊□¬Prod_token_out(Now, Occurs, After).

Applying the hypothesis that Fair_token_out holds, we have

    ◊□¬Enabled_token_out(Now).

That is, it is persistently the case that a token_out event is not enabled. Using the expression for Enabled_token_out obtained above yields

    (**) ◊□(Now(ustate) = running ∨ Now(token) = 0).

Thus, it is persistently the case that either the user process is running or the synchronizer component module possesses no token. We now use the temporal tautology

    ⊨ ◊□(φ ∨ ψ) → (□◊φ ∨ ◊□ψ).

Intuitively, this says that if it is persistently the case that φ ∨ ψ holds, then either φ holds repeatedly, or else ψ holds persistently. Application of this tautology to (**) gives

    □◊(Now(ustate) = running) ∨ ◊□(Now(token) = 0).
That is, either the user process is repeatedly running, or the synchronizer component module persistently has no token. We now split the proof into two subcases, depending upon whether

    (B1) □◊(Now(ustate) = running)

or

    (B2) ◊□(Now(token) = 0)

holds.

Subcase (B1): Suppose that (B1) holds. Application of the hypothesis that Rely_SC holds gives

    □◊(Now(ustate) = running) ∧ □◊(Now(ustate) ≠ running).

Next, we use the temporal tautology

    ⊨ (□◊φ(Now) ∧ □◊¬φ(Now)) → □◊(φ(Now) ∧ ¬φ(After)).

Intuitively, if it is repeatedly the case that φ holds of the current state, and it is repeatedly the case that ¬φ holds of the current state, then it must repeatedly be the case that a point is reached where φ holds of the current state and ¬φ holds of the "next" state. Application of this tautology in the present situation gives

    □◊(Now(ustate) = running ∧ After(ustate) ≠ running).

In addition, we need the following invariance property:

    Comp_SC ⊨ □(Now(ustate) = running → Now(token) ≠ 0).

The validity of this sentence can easily be shown by Corollary 3.7, and the details are omitted. Using this, plus the fact that

    ⊨ (∀q, r ∈ State, e ∈ Event)((Trans_SC(q, e, r) ∧ q(ustate) = running ∧ r(ustate) ≠ running) → r(token) = q(token)),

which is verified by case analysis on e, we obtain

    □◊(After(ustate) ≠ running ∧ After(token) ≠ 0).

Let us examine the intuitive content of the preceding steps. If the user process is running in the current state and not running in the "next" state, then the following must be true: Since the synchronizer component module must possess a token whenever the user process is running, and no event that takes the user process out of the running state can affect the number of tokens possessed, it must be the case that the synchronizer component module possesses a token in the next state as well.
By another use of the temporal tautology ⊨ □(φ(After) → ◊φ(Now)), we obtain

    □◊(Now(ustate) ≠ running ∧ Now(token) ≠ 0),

which is a contradiction with formula (**). We conclude that subcase (B1) is impossible.

Subcase (B2): Suppose that (B2) holds, that is, ◊□(Now(token) = 0). Then by definition of Enabled_request_out we have ◊□Enabled_request_out(Now), and thus by the hypothesis that Fair_request_out holds, we infer

    □◊(Occurs = request_out).

That is, it is repeatedly the case that request_out events occur. By the hypothesis that Rely_SC holds, we conclude

    □◊(Now(token) ≠ 0),

a contradiction with (B2). We conclude that subcase (B2) is impossible, and hence that case (B) is impossible.

Since both cases (A) and (B) have been shown to be impossible, we conclude that the original hypotheses are contradictory, and thus the synchronizer component module specification is I/O-consistent.

5.5 Composition of I/O-Behaviors

We have previously shown that the class of I/O-behaviors is closed under the abstraction operators associated with the I/O-abstraction maps. In this section, we define the class of "I/O-decomposition maps," and show that the class of I/O-behaviors is also closed under the composition operators associated with these maps.

5.5.1 I/O-Decomposition Maps

When we defined the notion of a system interface above, we noted that there is a canonical decomposition map (and hence a composition operator) associated with each system interface. We would now like to extend the notion of composition associated with system interfaces so that we can view behaviors of non-system interfaces as a composition of component behaviors. The most natural way to do this is to require that the domain of a decomposition map be a system interface only up to isomorphism.

Definition - An isomorphism from the I/O-interface E to the I/O-interface D is a bijective translation γ: E → D such that γ and γ^{-1} are embeddings. ∎
Definition - An I/O-decomposition map from the I/O-interface E to the collection of I/O-interfaces ⟨F_i⟩_{i∈I} is a vector Δ = ⟨δ_i⟩_{i∈I} of translations, where δ_i: E → F_i, with the following property: There exists a system interface E' ⊆ ⊗_{i∈I} F_i and an isomorphism γ: E → E', such that δ_i = δ_i' ∘ γ for all i ∈ I, where ⟨δ_i'⟩_{i∈I} is the canonical decomposition map associated with E'. ∎

From this definition, we can immediately derive a number of properties of the I/O-decomposition maps.

Lemma 5.9 - If Δ = ⟨δ_i⟩_{i∈I} is an I/O-decomposition map from E to ⟨F_i⟩_{i∈I}, then

(1) e ≠ e' implies δ_i(e) ≠ δ_i(e') for some i ∈ I. (Δ is injective)
(2) δ_i(In_E) ⊆ In_{F_i} ∪ {λ_{F_i}} for all i ∈ I. (Δ preserves inputs)
(3) If e ∈ Out_E then δ_i(e) ∈ Out_{F_i} for some i ∈ I. (Δ strictly preserves outputs)
(4) δ_i^{-1}(Out_{F_i}) ∩ δ_j^{-1}(Out_{F_j}) = ∅ whenever i ≠ j. (Compatible Coupling Property)
(5) δ_i is onto F_i for all i ∈ I.

Proof - Straightforward. ∎

5.5.2 Closure Proof

We can now prove that the class of I/O-behaviors is closed under the composition operators associated with I/O-decomposition maps.

Theorem 5.10 - Suppose ... ⟨Prod_{i,a}⟩_{a∈A_i}⟩ is a PS-machine of interface F_i, such that Beh(M_i, ⟨Prod_{i,a}⟩_{a∈A_i}) = B_i. We construct a PS-machine ⟨M, ⟨Prod_{i,a}'⟩_{i∈I, a∈A_i}⟩ of interface E such that Beh(M, ... such that ⟨q̲, λ_E, q̲⟩ ∈ Trans_M for all q̲ ∈ Q_M. Thus M is a machine.

To show that M is an I/O-machine, we must show that it is input-cooperative. Suppose q̲ ∈ Q_M and e ∈ In_E. Since Δ preserves inputs, it follows that δ_i(e) ∈ In_{F_i} ∪ {λ_{F_i}} for each i ∈ I. Since each M_i is input-cooperative, for each i ∈ I we can get r_i such that ⟨q_i, δ_i(e), r_i⟩ ∈ Trans_{M_i}.

For each i ∈ I, and a ∈ A_i, define

    Prod_{i,a}' = {⟨q̲, e, r̲⟩ ∈ Trans_M : ⟨q_i, δ_i(e), r_i⟩ ∈ Prod_{i,a} and e ∉ In_E}.

It is clear that each Prod_{i,a}' is a productive step set for M. To show that ⟨M, ⟨Prod_{i,a}'⟩_{i∈I, a∈A_i}⟩ is a PS-machine, we must show that the sets Prod_{i,a}' cover the set of nonnull output or λ-steps in Trans_M. Suppose ⟨q̲, e, r̲⟩ is such a step. Then ⟨q_i, δ_i(e), r_i⟩ is a nonnull output or λ-step for M_i for some i ∈ I, by the fact that Δ strictly preserves outputs.
Since the collection ⟨Prod_{i,a}⟩_{a∈A_i} covers the nonnull output or λ-steps for M_i, we know that ⟨q_i, δ_i(e), r_i⟩ ∈ Prod_{i,a} for some a ∈ A_i. Hence ⟨q̲, e, r̲⟩ ∈ Prod_{i,a}'.

We claim that Beh(M, ⟨Prod_{i,a}'⟩_{i∈I, a∈A_i}) = Δ^{-1}(⟨B_i⟩_{i∈I}).

Case Beh(M, ⟨Prod_{i,a}'⟩_{i∈I, a∈A_i}) ⊆ Δ^{-1}(⟨B_i⟩_{i∈I}):

Each computation X of M maps in an obvious way (by taking the image of the observation part under δ_i, and the canonical projection of the state part) to a computation X_i of M_i, for each i ∈ I. Suppose that X is fair. Let i ∈ I and a ∈ A_i be fixed. We show that if Prod_{i,a} is repeatedly enabled in X_i, then Prod_{i,a} repeatedly runs in X_i. Suppose Prod_{i,a} is repeatedly enabled in X_i.

We first show that, given q̲ ∈ State_M, if Prod_{i,a} is enabled in state q_i, then Prod_{i,a}' is enabled in state q̲. If Prod_{i,a} is enabled in state q_i, then there exist f_i ∈ Out_{F_i} ∪ {λ_{F_i}} and r_i ∈ Q_{M_i} such that ⟨q_i, f_i, r_i⟩ ∈ Prod_{i,a}. Since δ_i is onto Out_{F_i}, we can get e ∈ Out_E ∪ {λ_E} with δ_i(e) = f_i. By the compatible coupling property of Δ, we know that δ_j(e) ∈ In_{F_j} ∪ {λ_{F_j}} for all j ∈ I − {i}. For each j ∈ I − {i}, by the input-cooperative property of M_j, we can get r_j such that ⟨q_j, δ_j(e), r_j⟩ ∈ Trans_{M_j}. It follows that ⟨q̲, e, r̲⟩ ∈ Prod_{i,a}', and thus Prod_{i,a}' is enabled in state q̲.

Since Prod_{i,a} is repeatedly enabled in X_i by hypothesis, and Prod_{i,a} enabled in state q_i implies Prod_{i,a}' enabled in state q̲, we know that Prod_{i,a}' is repeatedly enabled in X. By the fairness of X, it follows that Prod_{i,a}' repeatedly runs in X. By definition of Prod_{i,a}', if Step_X(t) ∈ Prod_{i,a}', then Step_{X_i}(t) ∈ Prod_{i,a}. This implies that Prod_{i,a} repeatedly runs in X_i.

Case Δ^{-1}(⟨B_i⟩_{i∈I}) ⊆ Beh(M, ⟨Prod_{i,a}'⟩_{i∈I, a∈A_i}):

Suppose that δ_i(x) ∈ Beh(M_i, ⟨Prod_{i,a}⟩_{a∈A_i}), for all i ∈ I. For each i ∈ I, let X_i be a fair computation of M_i in which the observation δ_i(x) is generated. We construct a fair computation X for M, such that Obs_X = x.

Without loss of generality we assume that the X_i have the following property: For all t ∈ [0, ∞), if a productive λ-step runs at time t in X_i for some i ∈ I, then the step that
runs at time t in X_j is null for all j ∈ I − {i}. If we are given a collection ⟨X_i'⟩_{i∈I} for which this property does not hold, then it is a simple matter to construct order-isomorphisms f_i: [0, ∞) → [0, ∞) such that if X_i = X_i' ∘ f_i, then Obs_{X_i} = Obs_{X_i'} for all i and the desired property holds for the collection ⟨X_i⟩_{i∈I}. Since the property of being a fair computation is preserved under stretching of [0, ∞) by an order-isomorphism, it follows that each X_i is a fair computation for M_i.

We now define X by letting Obs_X = x and State_X(t) = ⟨State_{X_i}(t)⟩_{i∈I}. It is easy to see that X is a computation for M.

To show that X is fair, suppose that Prod_{i,a}' is repeatedly enabled in X. Since Prod_{i,a}' enabled in state q̲ implies Prod_{i,a} enabled in state q_i, it follows that Prod_{i,a} is repeatedly enabled in X_i. Since X_i is fair, we know that Prod_{i,a} repeatedly runs in X_i. We claim that if Step_{X_i}(t) ∈ Prod_{i,a} then Step_X(t) ∈ Prod_{i,a}' as well, and hence Prod_{i,a}' repeatedly runs in X.

By definition of Prod_{i,a}', if Step_{X_i}(t) ∈ Prod_{i,a}, then Step_X(t) ∈ Prod_{i,a}', except in case Obs_X(t) ∈ In_E. But if Obs_X(t) ∈ In_E, then the fact that Δ preserves inputs and Prod_{i,a} contains only output and λ-steps implies that Step_{X_i}(t) = ⟨q_i, λ_{F_i}, r_i⟩ ∈ Prod_{i,a}. Since Δ is injective and preserves inputs, it must be the case that Obs_{X_j}(t) ∈ In_{F_j} for some j ∈ I − {i}, and hence Step_{X_j}(t) is nonnull. This contradicts our assumption that if a productive λ-step runs at time t in X_i, then the step occurring at time t in X_j is null for all j ∈ I − {i}. ∎

5.6 Alternative Classes of Computable Behaviors

The class of I/O-behaviors is by no means the only class of "computable" behaviors that it is interesting to consider. By replacing the fairness requirement for computations of I/O-systems with that of "weak fairness," in which a process is required to repeatedly run only if it is persistently enabled, rather than repeatedly enabled, we obtain the class of weak I/O-behaviors (WI/O-behaviors).
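The difference between the two fairness requirements can be seen on a single process whose schedule eventually repeats. The following is a minimal sketch, not taken from the thesis. It assumes a discrete, ultimately periodic schedule in place of the continuous time domain [0, ∞), so that only the repeating part matters: "repeatedly" means somewhere in the loop and "persistently" means everywhere in the loop. The function names and sample schedules are hypothetical.

```python
def strongly_fair(loop):
    """Fairness as used for I/O-systems: repeatedly enabled implies repeatedly run.

    `loop` is a non-empty list of (enabled, runs) pairs that repeats forever.
    """
    enabled_repeatedly = any(en for en, _ in loop)
    runs_repeatedly = any(run for _, run in loop)
    return (not enabled_repeatedly) or runs_repeatedly

def weakly_fair(loop):
    """Weak fairness: persistently enabled implies repeatedly run."""
    enabled_persistently = all(en for en, _ in loop)
    runs_repeatedly = any(run for _, run in loop)
    return (not enabled_persistently) or runs_repeatedly

# The process is enabled at every other step and never runs.
flicker = [(True, False), (False, False)]
# The process is enabled at every step and never runs.
starved = [(True, False), (True, False)]
# The process is enabled at every step and runs once per loop.
served = [(True, False), (True, True)]
```

In this sketch the schedule `flicker` is weakly fair but not fair in the stronger sense, which is the situation that separates the two notions.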
It can be shown that every WI/O-behavior is an I/O-behavior, but not every I/O-behavior is a WI/O-behavior. The notion of WI/O-consistency is therefore strictly more stringent than I/O-consistency.

Besides the fairness assumption, the definition of the class of I/O-behaviors embodies several other choices that might have been made differently:

(1) (Asynchrony) - The I/O-systems model is an asynchronous model of computation. We might have chosen a timing-dependent model of computation instead.

(2) (Input/Output Structure) - Instead of focusing on interfaces with input/output structure, we might have chosen additional or alternative structure, such as interfaces in which events include information about the physical location at which they occur.

(3) (Simultaneity) - The definition of an I/O-system permits at most one process to perform an output at any instant of time. We might imagine a more general model in which any number of processes can perform an output at once.

An interesting avenue for future research is to try to discover additional classes of behaviors and associated notions of consistency by modifying one or more of the above assumptions.

6. A Completeness Result

A reasonable question to ask about the sufficient correctness conditions required by the Correctness Theorem is whether these conditions are also necessary. That is, is it the case that the maximality and validity conditions hold for every correct implementation involving state-transition specifications? In this chapter we show that in general the maximality and validity conditions need not hold for every correct implementation. However, it is possible to impose some well-formedness conditions on the state-transition specifications involved in the implementation, which are sufficient to ensure that correctness implies maximality and validity. The Completeness Theorem (Theorem 6.4) is the formal statement of this result.
Although Theorem 6.4 is probably not the strongest result of this kind it is possible to prove, it nevertheless sheds some light on the limitations of the Correctness Theorem, and serves to motivate some well-formedness properties of state-transition specifications.

6.1 Specification Domains

The statement and proof of Theorem 6.4 depend crucially on the existence of a collection of interfaces, behaviors, abstraction maps, and decomposition maps with closure properties like those of the I/O-interfaces, I/O-behaviors, I/O-abstraction maps, and I/O-decomposition maps defined in Chapter 5. The definition of a "specification domain" below summarizes these properties, which seem like fundamental properties that are likely to be shared by other interesting models.

Informally, a specification domain 𝒟 contains four pieces of data: the "𝒟-interfaces," the "𝒟-behaviors," the "𝒟-abstraction maps," and the "𝒟-decomposition maps." The 𝒟-interfaces are interfaces with structure particular to the domain 𝒟. For example, the I/O-interfaces are those whose non-λ events are partitioned into input and output events. For each 𝒟-interface E, the 𝒟-behaviors of interface E represent a class of "realizable" or "computable" behaviors of interface E. Just as the definition of I/O-behavior depends upon the input/output structure of an I/O-interface, whether or not a behavior of 𝒟-interface E is a 𝒟-behavior of interface E will depend, in general, on the particular structure of the interface E. The 𝒟-abstraction and 𝒟-decomposition maps represent meaningful ways to abstract and decompose systems modeled by 𝒟-behaviors. In general, these maps will have certain preservation properties with respect to the particular structure of the interfaces, just as the I/O-abstraction and I/O-decomposition maps preserve input/output structure in various ways.
The definition of a specification domain requires that the class of 𝒟-behaviors be closed under the abstraction and composition operators associated with the 𝒟-abstraction and 𝒟-decomposition maps.

In addition to the properties of closure under abstraction and composition discussed above, we require a third regularity property of the 𝒟-behaviors. This property, called "nondegeneracy," rules out the empty behavior as a 𝒟-behavior of any interface. Intuitively, the empty behavior does not model any real system, since it is always possible to obtain an observation of a real system, even if that observation is only the null observation λ.

Definition - A specification domain 𝒟 consists of the following:

- A class Interfaces_𝒟 of interfaces, called the 𝒟-interfaces.
- For each pair E, D ∈ Interfaces_𝒟, a set AbsMaps_𝒟(E, D) of translations from E to D, called the set of 𝒟-abstraction maps from E to D.
- For each pair E, F̲ = ⟨F_i⟩_{i∈I}, where I is a finite index set and E and each F_i are elements of Interfaces_𝒟, a set DecMaps_𝒟(E, F̲) called the set of 𝒟-decomposition maps from E to F̲. Each element of DecMaps_𝒟(E, F̲) is a vector ⟨δ_i⟩_{i∈I}, where δ_i is a translation from E to F_i.
- For each interface E ∈ Interfaces_𝒟, a set Behaviors_𝒟(E) of behaviors of interface E, called the set of 𝒟-behaviors of interface E.

In addition, 𝒟 is required to have the following properties:

(1) (Nondegeneracy) - For all 𝒟-interfaces E, the empty behavior ∅ is not in Behaviors_𝒟(E).
(2) (Abstraction Closure) - For all 𝒟-interfaces E, D, if α ∈ AbsMaps_𝒟(E, D) and B ∈ Behaviors_𝒟(E), then α(B) ∈ Behaviors_𝒟(D).
(3) (Composition Closure) - For all 𝒟-interfaces E, ⟨F_i⟩_{i∈I}, if Δ ∈ DecMaps_𝒟(E, F̲) and B̲ = ⟨B_i⟩_{i∈I} is such that B_i ∈ Behaviors_𝒟(F_i) for each i ∈ I, then Δ^{-1}(B̲) ∈ Behaviors_𝒟(E). ∎
A rather simple example of a specification domain is the domain "CSP," where we define every interface to be a CSP-interface, every translation $\alpha$ to be a CSP-abstraction map, every finite vector $\Delta$ of translations with a common domain to be a CSP-decomposition map, and define the CSP-behaviors of interface $E$ to be exactly those behaviors of interface $E$ that are nonempty, asynchronous, and truncation-closed.1 We call this specification domain CSP because it is closely related to the "trace model" for CSP defined in [Hoare81b]. In that paper, process behaviors are modeled by nonempty, prefix-closed subsets of $E^*$, where $E$ is an alphabet of process events. To each nonempty, prefix-closed subset of $E^*$ there naturally corresponds a nonempty, asynchronous, and truncation-closed behavior of interface $E$. Thus, for each of Hoare's processes, there is a CSP-behavior that contains the same information. Hoare defines operations of parallel composition, concealment, and alphabet transformation on processes. Under the natural correspondence described above, Hoare's concealment and alphabet transformation operations are special cases of the CSP-abstraction operators defined here, and Hoare's parallel composition operation is a special case of the CSP-composition operators defined here. Since no truncation-closed behavior can satisfy a specification with nontrivial eventuality properties, the specification domain CSP is not particularly useful for the analysis of such specifications.

As a consequence of Theorem 5.1, Corollary 5.5, and Theorem 5.10, the I/O-interfaces, I/O-behaviors, I/O-abstraction maps, and I/O-decomposition maps also define a specification domain, which we call "I/O."

We can generalize the definition of I/O-consistency to an arbitrary specification domain $\mathcal{D}$.

Definition - A specification $S$ of $\mathcal{D}$-interface $E$ is $\mathcal{D}$-consistent if $\mathcal{S}(S) \cap \mathrm{Behaviors}_{\mathcal{D}}(E) \neq \emptyset$.
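The closure properties required of a specification domain can be illustrated in the discrete trace model to which the CSP domain is related. The following is only a sketch under simplifying assumptions: traces are finite Python tuples rather than timed observations, composition is enumerated up to a bounded length, and all function names (`prefix_closure`, `is_process`, `conceal`, `compose`) are hypothetical and do not come from the thesis or from [Hoare81b].

```python
from itertools import product


def prefixes(trace):
    """All prefixes of a trace (a tuple of events), including the empty trace."""
    return {trace[:k] for k in range(len(trace) + 1)}


def prefix_closure(traces):
    """Smallest prefix-closed set of traces containing the given traces."""
    closed = set()
    for t in traces:
        closed |= prefixes(t)
    return closed


def is_process(traces):
    """Toy analogue of a behavior in the domain: nonempty and prefix-closed."""
    return bool(traces) and all(prefixes(t) <= traces for t in traces)


def conceal(traces, hidden):
    """Toy abstraction operator: erase the hidden events from every trace."""
    return {tuple(e for e in t if e not in hidden) for t in traces}


def compose(components, max_len):
    """Toy composition operator, bounded to traces of length <= max_len.

    components is a list of (alphabet, traces) pairs. A composite trace is
    accepted when its projection onto each component alphabet is a trace of
    that component.
    """
    alphabet = sorted(set().union(*(a for a, _ in components)))
    result = set()
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            if all(tuple(e for e in t if e in a) in p for a, p in components):
                result.add(t)
    return result
```

In this toy setting, concealing events in a nonempty prefix-closed set, or composing such sets, again yields a nonempty prefix-closed set, which mirrors the nondegeneracy, abstraction closure, and composition closure properties.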
We define relativized notions of interconnection, implementation, and correctness with respect to a specification domain $\mathcal{D}$ as follows: An interconnection $\mathcal{I}$ is a $\mathcal{D}$-interconnection if the interfaces $D'$, $E'$, and $F'_i$ for each $i \in I$ are $\mathcal{D}$-interfaces, the abstraction map $\alpha'$ is a $\mathcal{D}$-abstraction map from $E'$ to $D'$, and the decomposition map $\Delta'$ is a $\mathcal{D}$-decomposition map from $E'$ to $\mathbf{F}'$. An implementation $\langle \mathcal{I}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is a $\mathcal{D}$-implementation if $\mathcal{I}$ is a $\mathcal{D}$-interconnection. We say that the $\mathcal{D}$-implementation $\langle \mathcal{I}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is $\mathcal{D}$-correct if $\alpha' \circ (\Delta')^{-1}(\langle B_i \rangle_{i \in I}) \in \mathcal{S}(S_{abs})$ whenever $B_i \in \mathcal{S}(S_i) \cap \mathrm{Behaviors}_{\mathcal{D}}(F'_i)$ for each $i \in I$.

1. If $x$ is an observation and $t \in [0, \infty)$, then the $t$-truncation of $x$ is the observation $x'$ such that $x'(t') = x(t')$ for all $t' \in [0, t)$, and $x'(t') = \lambda$ for all $t' \in [t, \infty)$. A behavior $B$ is truncation-closed if whenever $x \in B$ and $t \in [0, \infty)$, then the $t$-truncation of $x$ is also in $B$.

Every $\mathcal{D}$-implementation that is correct in the sense of Chapter 2 is also $\mathcal{D}$-correct, and thus the Correctness Theorem can be used to prove $\mathcal{D}$-correctness. However, in general there will be $\mathcal{D}$-correct implementations that are not correct in the sense of Chapter 2.

Lemma 6.1 - If a $\mathcal{D}$-implementation is correct, then it is $\mathcal{D}$-correct.

Proof - Suppose $\langle \mathcal{I}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is a correct $\mathcal{D}$-implementation. For each $i \in I$, let $B_i$ be an arbitrary $\mathcal{D}$-behavior of interface $F'_i$ that satisfies $S_i$. Let $B_{abs} = \alpha' \circ (\Delta')^{-1}(\langle B_i \rangle_{i \in I})$. Then since $\mathcal{D}$ is closed under abstraction and composition, it follows that $B_{abs}$ is a $\mathcal{D}$-behavior of interface $D'$. By the assumption of correctness, $B_{abs}$ satisfies $S_{abs}$. Since the $B_i$ were arbitrary, it follows that $\langle \mathcal{I}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is $\mathcal{D}$-correct.

We next define the notion of an "evolutionary" specification domain.
Intuitively, if an evolutionary specification domain $\mathcal{D}$ contains a behavior $B$ that models what a system $S$ can do starting from time 0, and if we observe $S$ produce a certain prefix of an observation over the interval $[0, t)$, then $\mathcal{D}$ will also contain a "future" behavior $B'$, which models what $S$ is capable of doing, starting from time $t$. Probably any reasonable specification domain will be evolutionary (as is the specification domain I/O), although this property does not seem quite fundamental enough to be included as part of the definition of a specification domain.

To define the evolutionary property precisely, we require some additional notation. If $x$ and $y$ are observations and $a \in [0, \infty)$, then we write $x =_a y$ if $x(t) = y(t)$ for all $t \in [0, a)$. If $B$ is a behavior of interface $E$, $x \in \mathrm{Obs}(E)$ is an observation, and $t \in [0, \infty)$, then define the future of $B$ with respect to $x$ and $t$ as follows:

$\mathrm{future}_{x,t}(B) = \{\mathrm{suffix}_t(y) : y \in B,\ y =_t x\}$.

Intuitively, if a behavior $B$ models what a system can do if we begin watching at time $t = 0$, then $\mathrm{future}_{x,t}(B)$ models what the system can do after we have already observed the initial segment of $x$ on the interval $[0, t)$.

Definition - A specification domain $\mathcal{D}$ is evolutionary if, whenever $B$ is a $\mathcal{D}$-behavior of $\mathcal{D}$-interface $E$, $x \in B$, and $t \in [0, \infty)$, then $\mathrm{future}_{x,t}(B)$ is also a $\mathcal{D}$-behavior of $\mathcal{D}$-interface $E$.

For the remainder of this chapter, we assume that an evolutionary specification domain $\mathcal{D}$ (such as the domain CSP or I/O) has been fixed.

6.2 Locally $\mathcal{D}$-Consistent Subset Specifications

This section introduces the notion of a "locally $\mathcal{D}$-consistent" subset specification, and obtains some properties of such specifications that will be used in the proof of Theorem 6.4. Intuitively, local $\mathcal{D}$-consistency of a subset specification $S$ means that $O(S)$ contains no isolated observations that cannot be realized in some $\mathcal{D}$-behavior satisfying $S$.
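Definition - A subset specification $S$ of $\mathcal{D}$-interface $E$ is locally $\mathcal{D}$-consistent if for all $x \in O(S)$ there exists a $\mathcal{D}$-behavior $B$ of interface $E$ such that $x \in B \subseteq O(S)$.

Note that if $S$ is locally $\mathcal{D}$-consistent, and in addition $O(S) \neq \emptyset$, then $S$ is $\mathcal{D}$-consistent.

Lemma 6.2 below states that if the component module specifications in a $\mathcal{D}$-implementation are locally $\mathcal{D}$-consistent, then the necessary and sufficient conditions for correctness provided by Lemma 3.1 for implementations involving subset specifications are also necessary and sufficient for $\mathcal{D}$-correctness.

Lemma 6.2 - Suppose $\langle \mathcal{I}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is a $\mathcal{D}$-implementation, where $S_{abs}$ and each $S_i$ are subset specifications. Suppose that each $S_i$ is locally $\mathcal{D}$-consistent. Then $\langle \mathcal{I}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is $\mathcal{D}$-correct iff $\alpha' \circ (\Delta')^{-1}(\langle O(S_i) \rangle_{i \in I}) \subseteq O(S_{abs})$.

Proof - The "if" direction follows directly from Lemma 3.1 and Lemma 6.1, and actually does not require the assumption of local $\mathcal{D}$-consistency. To show the "only if" direction, suppose $\langle \mathcal{I}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is $\mathcal{D}$-correct. It suffices to show that if $x \in \mathrm{Obs}(E')$ is such that $\delta'_i(x) \in O(S_i)$ for each $i \in I$, then $\alpha'(x) \in O(S_{abs})$. Because each $S_i$ is assumed locally $\mathcal{D}$-consistent, given $x \in \mathrm{Obs}(E')$ such that $\delta'_i(x) \in O(S_i)$ for each $i \in I$, then for each $i \in I$ there exists a $\mathcal{D}$-behavior $B_i$ of interface $F'_i$ such that $\delta'_i(x) \in B_i \subseteq O(S_i)$. Thus $\alpha'(x) \in B_{abs}$, where $B_{abs} = \alpha' \circ (\Delta')^{-1}(\langle B_i \rangle_{i \in I})$. By $\mathcal{D}$-correctness, $B_{abs} \subseteq O(S_{abs})$, and hence $\alpha'(x) \in O(S_{abs})$.

The proof of Theorem 6.4 requires Lemma 6.3 below, which expresses a special property of locally $\mathcal{D}$-consistent subset specifications in an evolutionary specification domain $\mathcal{D}$.

Lemma 6.3 - Suppose that $\mathcal{D}$ is an evolutionary specification domain, and that $S$ is a locally $\mathcal{D}$-consistent subset specification of $\mathcal{D}$-interface $E$. Then $\mathrm{future}_{x,t}(O(S))$ contains a $\mathcal{D}$-behavior of $\mathcal{D}$-interface $E$ whenever $x \in O(S)$ and $t \in [0, \infty)$.

Proof - The local $\mathcal{D}$-consistency of $S$ means that, given $x \in O(S)$, there exists a $\mathcal{D}$-behavior $B$ of interface $E$ such that $x \in B \subseteq O(S)$. Since $\mathcal{D}$ is evolutionary, it follows that $\mathrm{future}_{x,t}(B)$ is a $\mathcal{D}$-behavior contained in $\mathrm{future}_{x,t}(O(S))$.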
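The future operator and the argument of Lemma 6.3 can be illustrated in a discrete analogue. The sketch below makes several assumptions that are not in the thesis: observations are finite tuples of events, the time $t$ is an integer index, the toy domain takes its behaviors to be the nonempty prefix-closed sets of tuples, and all function names are hypothetical. Under these assumptions, a set of observations is locally consistent exactly when it is prefix-closed, since the smallest toy behavior containing $x$ is the set of prefixes of $x$.

```python
def prefixes(trace):
    """All prefixes of a finite trace, including the empty trace."""
    return {trace[:k] for k in range(len(trace) + 1)}


def future(x, t, behavior):
    """Discrete analogue of future_{x,t}(B).

    Keeps the suffix after position t of every y in the behavior that
    agrees with x on the first t positions.
    """
    return {y[t:] for y in behavior if y[:t] == x[:t]}


def is_toy_behavior(traces):
    """Behaviors of the toy domain: nonempty, prefix-closed sets of traces."""
    return bool(traces) and all(prefixes(y) <= traces for y in traces)


def is_locally_consistent(obs_set):
    """Every x in obs_set lies in some toy behavior contained in obs_set.

    In the toy domain this holds exactly when prefixes(x) is a subset of
    obs_set for every x, because prefixes(x) is the smallest toy behavior
    that contains x.
    """
    return all(prefixes(x) <= obs_set for x in obs_set)


def witness_for_lemma_6_3(x, t, obs_set):
    """A toy behavior inside future(x, t, obs_set), as in Lemma 6.3.

    Assumes obs_set is locally consistent and x is in obs_set.
    """
    return future(x, t, prefixes(x))
```

For a locally consistent set `O` and any `x` in `O`, the set returned by `witness_for_lemma_6_3(x, t, O)` is a toy behavior contained in `future(x, t, O)`, which is the discrete counterpart of the statement of Lemma 6.3.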
6.3 Well-Formedness Properties of Specifications

This section defines three properties of state-transition specifications, which are used in the statement of Theorem 6.4. These properties are: regularity, quasi-determinacy, and orthogonality. The original motivation for these definitions was technical, in the sense that they were sufficient to permit the proof of Theorem 6.4 to go through. However, it was surprising to find that these properties could be thought of as well-formedness properties that should be satisfied by "good" state-transition specifications. In a regular state-transition specification, whether or not a computation is valid depends only upon the observation that is produced, and not upon the particular choice of states. In a quasi-determinate specification, the fact that the state-transition relation permits choices between states is inessential, since a choice of state made at time $t$ can have no effect on the portion of the observation produced subsequent to time $t$. Orthogonality is related to the correct partitioning of "local" and "global" properties between the state-transition relation and the validity conditions of a specification.

We first consider regularity. Intuitively, the requirement of regularity amounts to the assumption that whether a computation is valid does not depend upon the states appearing in the computation, but rather only upon the observation produced.

Definition - A state-transition specification $S = \langle M, V \rangle$ is regular if, for all computations $X$ and $Y$ of $M$, if $\mathrm{Obs}_X = \mathrm{Obs}_Y$, then $X \in V$ iff $Y \in V$.

To motivate the somewhat technical definition of quasi-determinacy, it is convenient to first examine the stronger, but more simply defined, notion of "determinacy."

Definition - A machine $M$ is determinate if $\mathrm{Init}_M$ is a singleton set, and for all $q \in Q_M$ and all $e \in E_M$, there is at most one $r \in Q_M$ such that $\langle q, e, r \rangle \in \mathrm{Trans}_M$. A state-transition specification $S = \langle M, V \rangle$ is determinate if $M$ is determinate.
A determinate specification is automatically regular, since a determinate specification can have at most one computation that produces a given observation. The importance of the determinacy property is that each observation generated by a determinate machine is produced in exactly one computation of that machine. Thus, if $S = \langle M, V \rangle$ is a determinate specification, $x \in O(S)$, and $X$ is a computation of $M$ with $\mathrm{Obs}_X = x$, then it is automatically the case that $X \in V$, since no other computation of $M$ can produce the observation $x$.

To show that the maximality condition holds for a correct implementation, it appears to be necessary to assume that some property similar to determinacy holds for the abstract module specification. To see why, consider the following example: We are attempting to implement an abstract module whose function is to produce a finite number of occurrences of a single event $e$. (Think of a "black box" with a single light bulb on top, and let $e$ be an event corresponding to a flash of the light bulb.) This module can be specified in two different ways:

(Determinate): The state set of the specification consists of a single state •. The event $e$ is enabled in state •, and obviously cannot produce any state change. The constraint that $e$ should appear only finitely many times is captured by the validity condition.

(Indeterminate): The state set for the specification is the set of natural numbers. Every state is an initial state. The event $e$ is enabled in state $k$ iff $k \neq 0$, and the occurrence of $e$ causes the state to be decremented. Every computation is valid. In this specification, the requirement that $e$ occurs only finitely often is captured by the indeterminate choice of initial state.

Let $S_{det}$ be the determinate specification and let $S_{ind}$ be the indeterminate specification. Clearly $O(S_{det}) = O(S_{ind})$.
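The two specifications just described can be compared mechanically on a bounded fragment. The sketch below rests on assumptions that are not part of the thesis: timing is ignored, so an observation is represented only by its sequence of events; the natural-number state set is truncated to 0..BOUND; and the validity condition of the determinate specification (finitely many occurrences of e) is not modeled, since every enumerated sequence is finite. The representation of a machine as a set of initial states plus a dictionary of transitions, and all names, are hypothetical.

```python
def traces(init, trans, max_len):
    """Event sequences of length <= max_len that a machine can generate.

    init is a set of initial states; trans maps (state, event) to the set
    of successor states. Timing is ignored in this sketch.
    """
    result = {()}
    frontier = {((), q) for q in init}
    for _ in range(max_len):
        frontier = {(t + (e,), r)
                    for (t, q) in frontier
                    for (p, e), succs in trans.items() if p == q
                    for r in succs}
        result |= {t for t, _ in frontier}
    return result


def is_determinate(init, trans):
    """One initial state and at most one successor per (state, event) pair."""
    return len(init) == 1 and all(len(s) <= 1 for s in trans.values())


BOUND = 4  # assumed cutoff for the natural-number state set

# Determinate machine: a single state "*" in which e is always enabled.
det_init = {"*"}
det_trans = {("*", "e"): {"*"}}

# Indeterminate machine: every k in 0..BOUND is an initial state,
# e is enabled in state k iff k != 0, and e decrements the state.
ind_init = set(range(BOUND + 1))
ind_trans = {(k, "e"): {k - 1} for k in range(1, BOUND + 1)}

same_observations = (traces(det_init, det_trans, BOUND)
                     == traces(ind_init, ind_trans, BOUND))
```

Within the bound, both machines generate the same event sequences, yet only the first is determinate, and the event e has no transition from state 0 of the second machine.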
Now consider an interconnection $\mathcal{I}$ in which the abstract interface $D'$, the single component interface $F'$, and the composite interface $E'$ are all the same interface $\{\lambda, e\}$, and the abstraction map $\alpha'$ and the decomposition map $\delta'$ are the identity translation. Clearly both of the implementations $\langle \mathcal{I}, S_{det}, S_{ind} \rangle$ and $\langle \mathcal{I}, S_{ind}, S_{det} \rangle$ are correct. However, the maximality condition does not hold for the implementation $\langle \mathcal{I}, S_{ind}, S_{det} \rangle$. To see this, note that any pair consisting of a natural number $k$ and the state • is an initial state for the composite machine, and is hence reachable for that machine. Furthermore, the event $e$ is always enabled for the component machine. For maximality to hold, it would have to be the case that $e$ is enabled for the abstract machine no matter what the value of $k$ is. But $e$ is not enabled for the abstract machine if $k = 0$.

In certain situations, for example the transmission line module specification in Appendix II, the use of indeterminate specifications is quite natural. However, the preceding example shows that unless we are careful, it may not be possible to use the Correctness Theorem to prove the correctness of implementations when such a specification is used as the specification for the abstract module.

The proof of the Completeness Theorem actually does not require that the abstract module specification $S_{abs}$ be determinate, but rather the somewhat weaker assumption that $S_{abs}$ be regular and "quasi-determinate." Intuitively, the set of future observations that can be produced by a quasi-determinate machine is independent of the choice of states made on the initial segment $[0, t]$.

To define quasi-determinacy precisely, we extend to histories the $=_a$ notation defined above for observations. If X and Y are histories, then we write X = a Y if Obsx(t) = a Obsy(t) and Statex is quasi-determinate if M 2 is quasi-determinate.

If a state-transition specification is determinate, then it can also be seen to be quasi-determinate by choosing $Z = Y$ in the above definition.
Determinacy implies that State can be defined in exactly one way on the interval $[0, t]$, thus showing that $X =_t Z$.

We next consider orthogonality. Intuitively, in an orthogonal specification, every computation agrees for an arbitrarily long time with a valid computation. Orthogonality is related to the correct partitioning of "local" and "global" properties between the state-transition relation and the validity conditions of a specification. Roughly, orthogonality means that the validity conditions contain no information that could have been expressed by strengthening the machine part of the specification.

Definition - A state-transition specification $S = \langle M, V \rangle$ is orthogonal if for all computations $X$ of $M$ and all $t \in [0, \infty)$, there exists $Y \in V$ such that $X =_t Y$.

6.4 The Completeness Theorem

We can now state and prove the Completeness Theorem.

Theorem 6.4 (Completeness Theorem) - Let $\mathcal{D}$ be an evolutionary specification domain. Suppose that $\langle \mathcal{I}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is a $\mathcal{D}$-implementation, where $S_{abs}$ and $S_i$ for each $i \in I$ are state-transition specifications. Suppose that $S_{abs}$ is regular and quasi-determinate, and that $S_i$ is orthogonal and locally $\mathcal{D}$-consistent for each $i \in I$. If $\langle \mathcal{I}, S_{abs}, \langle S_i \rangle_{i \in I} \rangle$ is $\mathcal{D}$-correct then the maximality and validity conditions hold.

Proof - Suppose that $S_{abs} = \langle M_{abs}, V_{abs} \rangle$, and that $S_i = \langle M_i, V_i \rangle$, for each $i \in I$. Let $M$ be the composite machine. Note that the assumption that each $S_i$ is locally $\mathcal{D}$-consistent, together with the assumption of $\mathcal{D}$-correctness, implies, by Lemma 6.2, that $\alpha' \circ (\Delta')^{-1}(\langle O(S_i) \rangle_{i \in I}) \subseteq O(S_{abs})$.

(Validity): To see that the validity condition holds, suppose that $X$ is a computation for $M$, such that $X^{(i)} \in V_i$ for each $i \in I$. Then $\delta'_i(\mathrm{Obs}_X) \in O(S_i)$ for each $i \in I$. It therefore follows, by the previous paragraph, that $\alpha'(\mathrm{Obs}_X) \in O(S_{abs})$.
This means that there exists a computation $X_{abs}$ of $M_{abs}$ such that $X_{abs} \in V_{abs}$ and $\mathrm{Obs}_{X_{abs}} = \alpha'(\mathrm{Obs}_X)$. Since $S_{abs}$ is assumed regular, and $\mathrm{Obs}_{X^{(abs)}} = \mathrm{Obs}_{X_{abs}}$, it follows that the computation x in a similar way -- this is what we are trying to show.

We next use the orthogonality assumption on the $S_i$ to obtain, for each $i \in I$, a valid computation $Y_i$ that "looks like" $X_i$ on the initial segment $[0, n+1)$. Each $Y_i$ produces an observation $y_i \in O(S_i)$ that looks like $\delta'_i(x)$ on the interval $[0, n+1)$. We do not know that there is a single observation $y$ such that $y_i = \delta'_i(y)$ for all $i \in I$. However, we can use Lemma 6.3, plus the composition closure property of the specification domain $\mathcal{D}$, to obtain an observation $z$ such that, for all $i \in I$, $\delta'_i(z) \in O(S_i)$ and $\delta'_i(z)$ looks like $y_i$ on the interval $[0, n+1)$. $\mathcal{D}$-correctness implies that $\alpha'(z) \in O(S_{abs})$.

Since $\alpha'(z) \in O(S_{abs})$, we can obtain a computation $Z_{abs}$ for $M_{abs}$ such that $\mathrm{Obs}_{Z_{abs}} = \alpha'(z)$. Now, event $\alpha'(e)$ occurs at time $n$ in $Z_{abs}$. If we knew that $\mathrm{State}_{Z_{abs}}(n) = \pi_{abs}(q)$, then this would show that $\alpha'(e)$ is enabled for $M_{abs}$ in state $\pi_{abs}(q)$. Although it need not be the case that $\mathrm{State}_{Z_{abs}}(n) = \pi_{abs}(q)$, the quasi-determinacy of $S_{abs}$ lets us replace the $[0, n]$ segment of $\mathrm{State}_{Z_{abs}}$ with the corresponding segment of Statex€ TransM for all k € {O, 1, ... , n-1 }.

Let f: Jf - Steps(EM ' QM) be defined by: f(k) = , if 0 < k < n = , otherwise. Then $f$ is a history skeleton and by Lemma 3.5 there is a unique history $X$ such that $f$ spans $X$. Let $x = \mathrm{Obs}_X$. It is easy to see that $X$ is a computation of $M$.

Since $\delta'_i(e)$ is enabled for $M_i$ in state $\pi_i(q)$, for each $i \in I$ we can choose $r_i \in Q_{M_i}$ such that $\langle \pi_i(q), \delta'_i(e), r_i \rangle \in \mathrm{Trans}_{M_i}$. It follows that for each $i \in I$, the history $X_i$, where $X_i =_n X^{(i)}$ and

$\mathrm{Obs}_{X_i}(t) = \delta'_i(e)$, for $t = n$
$\mathrm{Obs}_{X_i}(t) = \lambda$, for $t \in (n, \infty)$
$\mathrm{State}_{X_i}(t) = r_i$, for $t \in [n, \infty)$

is a computation for $M_i$. Let $x_i = \mathrm{Obs}_{X_i}$ for each $i \in I$.
By the assumption that each $S_i$ is orthogonal, we can obtain computations $Y_i \in V_i$ such that $Y_i =_{n+1} X_i$. Let $y_i = \mathrm{Obs}_{Y_i}$. Then $y_i \in O(S_i)$ and $y_i =_{n+1} x_i$.

Since each $S_i$ is locally $\mathcal{D}$-consistent, and the specification domain $\mathcal{D}$ is evolutionary, we can apply Lemma 6.3 to show that $\mathrm{future}_{y_i, n+1}(O(S_i))$ contains a $\mathcal{D}$-behavior $B_i$ of interface $F'_i$ for each $i \in I$. Let $B = (\Delta')^{-1}(\langle B_i \rangle_{i \in I})$. Then since the specification domain $\mathcal{D}$ is closed under composition, it follows that $B$ is a $\mathcal{D}$-behavior of interface $E'$. Since $\mathcal{D}$-behaviors are nonempty by the nondegeneracy property of $\mathcal{D}$, it follows that we can choose an element $z'$ of $B$. Let $z$ be the observation defined by the properties:

$z =_n x$
$z(n) = e$
$z(t) = \lambda$, for $t \in (n, n+1)$
$\mathrm{suffix}_{n+1}(z) = z'$.

Then by construction, $\delta'_i(z) \in O(S_i)$ for each $i \in I$. As shown in the first paragraph of the proof, it follows by $\mathcal{D}$-correctness that $\alpha'(z) \in O(S_{abs})$.

Since $\alpha'(z) \in O(S_{abs})$, there exists a computation $Z_{abs}$ for $M_{abs}$ with $\mathrm{Obs}_{Z_{abs}} = \alpha'(z)$. By construction, $\mathrm{Obs}_{Z_{abs}} = \alpha'(z) =_n \alpha'(x) = \mathrm{Obs}_{X^{(abs)}}$. By the quasi-determinacy of $S_{abs}$, there is a computation $Z'_{abs}$ of $M_{abs}$ such that $Z'_{abs} =_n X^{(abs)}$ and $\mathrm{Obs}_{Z'_{abs}} = \mathrm{Obs}_{Z_{abs}}$. Since $Z'_{abs} =_n X^{(abs)}$, we know that $\mathrm{State}_{Z'_{abs}}(n) = \pi_{abs}(q)$. Since $\alpha'(e)$ occurs at time $n$ in $Z'_{abs}$, it follows that $\alpha'(e)$ is enabled for $M_{abs}$ in state $\pi_{abs}(q)$, as desired.

7. Conclusion

7.1 Summary

The important accomplishments of this thesis are the following:

1. Formal Framework - A major accomplishment of this thesis is that it sets up a formal framework within which it is possible to formulate precisely a large number of interesting and important questions about specifications and correctness proofs, and to obtain rigorous answers to these questions. The framework includes the notions of interface, observation, behavior, composition, and abstraction as primitive. These primitive notions are used to give precise, language-independent definitions of the notions of implementation, correctness, and consistency.

2.
State-Transition Specifications - The thesis shows how module behaviors can be conveniently and naturally described in terms of a machine that generates an observation as it executes, plus some validity conditions on the computations of that machine. Specifications stated in such a form lend themselves to a systematic method for performing correctness proofs.

3. Rely- and Guarantee-Conditions - The concept of rely- and guarantee-conditions is shown to be useful for organizing eventuality specifications and proofs of correctness involving such specifications. The use of rely- and guarantee-conditions seems to result in simple proofs based on the communication structure of a system, rather than in proofs based on the structure of computations.

4. Consistency of Specifications - The I/O-behavior model provides an interesting and useful notion of consistency for eventuality specifications. The thesis obtains a technique for proving the I/O-consistency of state-transition specifications.

The investigation of state-transition specification performed in this thesis has resulted in some practical insights that can be tentatively expressed in the form of the following procedure for refinement of an abstract module into a system of component modules:

(1) Determine the interconnection of component modules that will be used to implement the abstract module.
(2) Identify the implementation invariant and the rely-/guarantee-conditions required for the proof of correctness.
(3) "Localize" the rely-/guarantee-conditions to each component module. Introduce sufficient information into the component module states to permit the localized conditions to be conveniently expressed.
(4) Define the state-transition relations for each component module.
(5) Check the completed component module specifications by proving their consistency.
(6) Use the component module specifications to perform a complete proof of correctness for the implementation.
The resource manager example in Appendix II illustrates the use of this procedure.

It has unfortunately been possible in this thesis to investigate only a tiny fraction of the questions that could conceivably be formulated using the framework developed here. The remainder of this chapter lists a number of questions that have not been addressed, but should be. Hopefully the answers to these questions can provide further practical insights into the problem of design, and ultimately contribute to more useful and reliable distributed/concurrent systems.

7.2 Ideas for Future Work

The basic framework set up in this thesis can serve as a starting point for a number of interesting extensions. The discussion below is concerned with the following broad possibilities for investigation:

(1) Specification Domains
(2) Semantic Properties of State-Transition Specifications
(3) Organizing Principles for Specifications and Proofs
(4) Formal Specification and Proof
(5) Non-State-Transition Specifications

7.2.1 Specification Domains

The concept of a specification domain appears to offer considerable possibilities for theoretical investigation. There are two broad directions for future investigation of specification domains. The first direction is concerned with developing the general theory of specification domains, and relating this theory to domain theory as used in programming language semantics. The second direction is to construct additional example specification domains that model systems with interesting properties.

Plausible steps toward a general theory of specification domains might include the following:

(1) The definition of a specification domain should be generalized so that the particular structure of an observation is not specified. The assumption of the particular structure of translations between interfaces would also have to be removed.
A reasonable approach might be to assume that the interfaces and translations comprise the objects and morphisms of a category. The relationship between interfaces and observations would take the form of a functor defined on the category of interfaces, which maps each interface $E$ to the set $\mathrm{Obs}(E)$ of observations over $E$, and which maps each translation $\alpha: E \to F$ to a function on observations.

(2) The notion of a behavior should be generalized so that a behavior determines, but is not identified with, a set of observations. This would permit behaviors such as the "futures processes" of [Rounds81] to be used, as well as correspondingly more general abstraction and composition operations. There still should be some constraints on the effect of abstraction and composition operations with respect to the set of observations determined by a behavior. It is not immediately obvious what those constraints should be.

(3) An attempt should be made to identify the correct set of regularity assumptions for abstraction and composition operations. The results of Chapter 6 required no such assumptions; however, it seems reasonable that the classes of abstraction and decomposition maps ought to be closed under function composition and include the identity translations. The I/O-abstraction maps and I/O-decomposition maps certainly have these properties.

(4) The specification domain I/O seems to provide motivation for a kind of duality between abstraction and decomposition, in the sense that abstraction and decomposition maps seem to have complementary preservation properties with respect to input and output. It would be very interesting if abstraction and decomposition maps could be unified, so that they are just dual instances of a single underlying notion of translation, or "interface morphism." One way this might be accomplished is by assuming the existence of a kind of conjugation operation on interfaces.
Intuitively, the conjugate of a module interface would be the interface of the module's environment. The duality between abstraction and decomposition might then be captured by stating that a decomposition map is an abstraction map defined on conjugate interfaces.

To help motivate the correct general definitions, further specific examples of specification domains should be constructed and studied. Ideas for constructing further examples of specification domains might be as follows:

(1) Different notions of observation might be used to construct a number of interesting specification domains. One example is to replace the assumption that observations contain only finitely many events in any finite interval with some less restrictive topological assumption, and to attempt to construct corresponding classes of behaviors. If the machine approach to defining behaviors is to be used, then there is the problem of how to define a machine that permits infinitely many events to occur in a finite interval. Examples of such "machines" already appear in the theory of dynamical systems. For example, if one is willing to assume that an observation is a continuous, differentiable function on $[0, \infty)$, then the correct notion of machine is that of a differential equation.

(2) Different special assumptions on behaviors can be made to model systems with particular properties. For example, it would be interesting to find a class of behaviors that includes non-asynchronous behaviors, corresponding to sets of observations that are not necessarily closed under stretching of the time axis. These behaviors would model timing-dependent systems. If observations contain space coordinates, in addition to time coordinates, then it might be possible to construct a class of behaviors with the property that information doesn't travel "too quickly" from one place to another.
This specification domain could be used to investigate the problem of what can be observed by one module about the operation of another in a distributed system. Another idea might be to try to characterize a class of "atomic" behaviors, like the atomic data types of [Weihl84]. The observations in these behaviors would have certain serializability properties.

(3) An attempt should be made to deal correctly with simultaneity. It should be possible to do this within the specification domain framework as follows: Introduce additional structure on interfaces to model the intuitive idea that some events represent the simultaneous occurrence of more primitive events. For example, it might be assumed that the events in an interface form a complete lower semilattice with $\lambda$ at the bottom, and with the semilattice operation $\sqcup$ representing the operation of "simultaneous occurrence." The main problem with this approach is how to introduce the notions of input and output so that an assignment of behaviors that is nondegenerate and closed under composition can be defined.

7.2.2 Semantic Properties of State-Transition Specifications

In Chapter 6 three semantic properties of state-transition specifications were identified (determinacy, regularity, orthogonality) and it was suggested that these might be properties characteristic of "well-formed" specifications. The idea of finding semantic well-formedness properties of specifications also appears in [Jones81], where the notion of an "unbiased" specification is discussed. It is interesting and useful to try to identify such properties, since they can possibly serve as guidelines in the design process. An important extension to this thesis would be to examine more closely the properties identified in Chapter 6, to develop techniques for proving that specifications have these properties, and to try to develop additional well-formedness properties.
7.2.3 Organizing Principles for Specifications and Proofs

The development of organizing principles for specifications and proofs appears to be a promising area of investigation. The rely- and guarantee-condition approach to writing specifications and performing correctness proofs is an example of the kind of results one might try to obtain. The way to proceed in this area is to perform example specifications and correctness proofs, and then try to abstract from these examples something in the way of general methods that would be applicable to other examples. This is difficult, because the examples take a long time to do, and it is hard to abstract general methods from a few examples.

Rely-/Guarantee-Conditions: Rely- and guarantee-conditions were used in this thesis in the statement of the validity condition portion of a specification only. This is in contrast to the work of other researchers, for example [Jones81], in which rely- and guarantee-conditions can be used for state-transition properties only. For the examples in this thesis it did not seem particularly helpful to use rely- and guarantee-conditions for the state-transition portion of a specification. One possible exception might be the synchronizer and synchronizer component module specifications, in which the use of rely- and guarantee-conditions in the state-transition part of the specification might obviate the need for an error state.

Determinate vs. Indeterminate Specifications: Both determinate and indeterminate specifications seem to be useful. From a strictly theoretical standpoint, determinate specifications are more convenient to work with than indeterminate specifications. From a practical point of view, though, there are cases (such as the transmission line specification of Appendix II) in which the use of indeterminate specifications is quite natural, and in which an equivalent determinate specification would have to be stated in a much more convoluted fashion.
Perhaps a result could be proved which shows that determinate and indeterminate specifications are equivalent in expressive power, in the sense that every indeterminate specification could be stated equivalently as a determinate specification. Such a result would permit the theory of specification to deal only with the more convenient determinate specifications, while permitting indeterminate specifications to be used in examples where they seem natural. Parallel Specifications: In ce,,1ain exarnpt~, though .-,ui i,1 a.-,y of the ones con&id~,~ in this thesis, it i::1 convenient to describe the desired functioning of a module in terms of a collection of loosely interacting concurrent processes. This process structure is a logical one used for descriptive purposes only, and may or may not bear any relation to the structure of an implementation of the module. It would be nice to be able to write specifications that reflect such a logical decomposition. State-transition specifications as described in this thesis are an inherently sequential form of description, since they include only a single machine. Perhaps the state-transition technique could be extended by permitting specifications to include a collection of machines that execute in parallel, and whose state sets are mostly independent of each other. To perform correctness proofs with this kind of specification would require a modified version of the Correctness Theorem. Differential vs. Integral Form: There is a certain amount of flexibility in whether state-transition properties are expressed in "differential," state-transition form, or in "integral," Invariant form. In general, given a statement of the invariant form, "for all reachable states q, property P(q) holds," an equivalent expression in state-transition form can be obtained by a simple syntactic transformation analogous to differentiation, e.g. 
"Property P holds of all initial states, and a state transition from q to r can occur only if P(q) implies P(r)." There is apparently no general method for "integration," that is, for obtaining equivalent statements in invariant form, given a statement in state-transition form.

In this thesis, the policy was adopted that all local properties would be expressed in state-transition form, rather than in invariant form. One reason for this is that, in general, invariants for the composite machine for an implementation cannot be proved directly from invariants for the component machines. Rather, it is necessary first to "differentiate" the invariants for the components, to obtain corresponding preconditions for event occurrences, and then to use these preconditions in an inductive proof of the desired invariant for the composite machine.

In certain circumstances, though, it seems natural to express specifications in invariant, rather than state-transition, form. For example, in the synchronizer module specification it is perhaps more natural to state explicitly that "at most one user process can be running at any instant," rather than to take the more indirect approach used here, where we use the precondition "a run event can occur only if there are no users currently running." Further investigation into the relationship between state-transition and invariant specifications seems needed.

7.2.4 Formal Specification and Proof

For the specification and proof techniques developed in this thesis to be useful for practical examples, the development of mechanical aids for manipulating specifications and assisting in correctness proofs is essential. Appendix I takes the first steps toward this goal by showing how all of the proof techniques developed in this thesis can be formalized within an appropriate temporal language.
Further steps should be taken along the following lines:

(1) A practical method should be devised for describing heterogeneous algebras and for associating with each description a reasonably powerful, sound deductive system for deducing properties of the described algebra. In spite of the large amount of work that has been done in this area (specification of abstract data types), a completely satisfactory method is still lacking.

(2) Tools are needed for enumeration and checking of cases in inductive proofs of invariance. In the correctness proofs performed in this thesis, once the implementation invariant is devised, the proof that it is inductive is a tedious case analysis that ought to be easily mechanizable.

(3) Mechanical aids for checking proofs in temporal logic are needed. Such a proof checker would probably not be capable of performing complete proofs by itself, but rather would serve to fill in intermediate steps in a proof generated by a human verifier.

7.2.5 Non-State-Transition Specifications

It would be interesting to use the framework of definitions set up in Chapter 2 to investigate specification languages not based on the state-transition approach. One obvious example is to investigate specification languages based on some kind of generalized regular expression. Preliminary experience with this kind of specification seems to indicate that the regular expression approach produces shorter specifications for trivial examples, but that for more complex examples it is much more difficult to express the desired properties. Interesting questions are what sort of deductive system, if any, could be used to derive consequences from specifications stated in regular expression form, and what form the Correctness Theorem would take for such specifications.

References

[Abrial80] Abrial, J.R., "The Specification Language Z: Syntax and Semantics," Programming Research Group, Oxford University, 1980.
[Apt81] Apt, K., "Ten Years of Hoare's Logic: A Survey - Part I," TOPLAS 3, 4(1981), pp. 431-483.

[Barringer83] Barringer, H., Kuiper, R., "A Temporal Logic Specification Method Supporting Hierarchical Development," Manuscript, University of Manchester Department of Computer Science, November, 1983.

[Bartlett69] Bartlett, K.A., et al., "Note on Reliable Full Duplex Transmission on Half Duplex Links," CACM 12, 5(May 1969), pp. 260-261.

[Berzins79] Berzins, V.A., "Abstract Model Specifications for Data Abstractions," MIT/LCS/TR-221, 1979.

[Bochmann78] Bochmann, G.V., "Finite State Description of Communication Protocols," Computer Networks 2(1978), pp. 361-372.

[Brock81] Brock, J.D., Ackermann, W.B., "Scenarios: A Model of Non-determinate Computation," Proc. Peniscola Colloquium, Springer LNCS 107, 1981.

[Brock83] Brock, J.D., "A Formal Model of Non-Determinate Dataflow Computation," MIT/LCS/TR-309.

[Chen81] Chen, B., Yeh, R.T., "Event Based Behavior Specification of Distributed Systems," IEEE Symposium on Reliability in Distributed Software and Database Systems, July, 1981.

[Chen82] Chen, B., "Event-Based Specification and Verification of Distributed Systems," PhD Dissertation, University of Maryland, 1982.

[Clinger81] Clinger, W., "Foundations of Actor Semantics," MIT/AI/TR-633, May, 1981.

[Dijkstra76] Dijkstra, E.W., A Discipline of Programming, Prentice Hall, 1976.

[DiVito82] DiVito, B.L., "Verification of Communications Protocols and Abstract Process Models," Institute for Computing Science TR-25, University of Texas at Austin, 1982.

[Fischer83] Fischer, M.J., Griffeth, N.D., Guibas, L.J., Lynch, N.A., "Probabilistic Analysis of a Network Resource Allocation Algorithm," submitted for publication.

[Floyd67] Floyd, R.W., "Assigning Meanings to Programs," in Mathematical Aspects of Computer Science, American Math. Soc., 1967.

[Francez79] Francez, N., et al., "Semantics of Nondeterminism, Concurrency, and Communication," JCSS 19(1979), pp. 290-308.
[Goguen78] Goguen, J.A., Thatcher, J.W., Wagner, E.G., "Initial Algebra Approach to the Specification, Correctness, and Implementation of Abstract Data Types," in Current Trends in Programming Methodology, Vol. IV: Data Structuring, R.T. Yeh, ed., Prentice-Hall, 1978.

[Good79] Good, D.I., Cohen, R.M., Keeton-Williams, J., "Principles of Proving Concurrent Programs in GYPSY," 6th POPL, 1979.

[Good82] Good, D.I., "The Proof of a Distributed System in GYPSY," Technical Report 30, University of Texas at Austin, September 1982.

[Gordon79] Gordon, M.J.C., The Denotational Description of Programming Languages, Springer-Verlag, 1979.

[Goree81] Goree, J.A., "Internal Consistency of a Distributed Transaction System with Orphan Detection," MIT/LCS/TR-286, Mass. Institute of Technology, 1981.

[Greif75] Greif, I., "Semantics of Communicating Parallel Processes," MIT/LCS/TR-154, September, 1975.

[Guttag78] Guttag, J.V., Horowitz, E., Musser, D.R., "Abstract Data Types and Software Validation," CACM 21, 12(Dec. 1978), pp. 1048-1064.

[Guttag80] Guttag, J., Horning, J., "Formal Specification as a Design Tool," 7th POPL, 1980, pp. 251-261.

[Hailpern80] Hailpern, B.T., Owicki, S.S., "Verifying Network Protocols Using Temporal Logic," Technical Report No. 192, Computer Systems Laboratory, Stanford University, June, 1980.

[Hailpern81] Hailpern, B.T., Owicki, S.S., "Modular Verification of Computer Communication Protocols," IBM Research Report RC8726, March, 1981.

[Harel78] Harel, D., "Logics of Programs: Axiomatics and Descriptive Power," MIT/LCS/TR-200, May, 1978.

[Hewitt77] Hewitt, C., Baker, H., "Laws for Communicating Parallel Processes," IFIP 77, Toronto, August, 1977.

[Hoare69] Hoare, C.A.R., "An Axiomatic Basis for Computer Programming," CACM, Vol. 12, October, 1969.

[Hoare72] Hoare, C.A.R., "Proof of Correctness of Data Representations," Acta Informatica 1, 4(1972), pp. 271-281.

[Hoare78] Hoare, C.A.R., "Communicating Sequential Processes," CACM, Vol. 21, August, 1978.
[Hoare81a] Hoare, C.A.R., Brookes, S.D., Roscoe, A.W., "A Theory of Communicating Sequential Processes," Technical Monograph PRG-22, Oxford University Computing Laboratory, May, 1981.

[Hoare81b] Hoare, C.A.R., "A Model for Communicating Sequential Processes," Technical Monograph PRG-22, Oxford University Computing Laboratory, June, 1981.

[Jones81] Jones, C.B., "Development Methods for Computer Programs Including a Notion of Interference," Wolfson College, June, 1981.

[Jones83] Jones, C.B., "Specification and Design of (Parallel) Programs," IFIP 83.

[Kahn74] Kahn, G., "The Semantics of a Simple Language for Parallel Processing," IFIP 74, pp. 471-475.

[Kahn77] Kahn, G., MacQueen, D.B., "Coroutines and Networks of Parallel Processes," IFIP 77, pp. 993-998.

[Kapur80] Kapur, D., "Towards a Theory for Abstract Data Types," MIT/LCS/TR-237, May, 1980.

[Keller76] Keller, R.M., "Formal Verification of Parallel Programs," CACM 19, 7(July 1976), pp. 371-384.

[Lamport80] Lamport, L., "'Sometime' is Sometimes 'Not Never': On the Temporal Logic of Programs," ACM POPL 1980.

[Lamport83] Lamport, L., "Specifying Concurrent Program Modules," TOPLAS, 1983.

[Lansky83] Lansky, A.L., Owicki, S., "GEM: A Tool for the Description of Concurrency Primitives and Verification of Concurrent Programs," PODC 83.

[Liskov79] Liskov, B.H., "Modular Program Construction Using Abstractions," MIT Computation Structures Group Memo 184, September, 1979.

[Lynch81] Lynch, N.A., Fischer, M.J., "On Describing the Behavior and Implementation of Distributed Systems," Theoretical Computer Science 13(1981), pp. 17-43.

[Lynch83] Lynch, N.A., "Concurrency Control for Resilient Nested Transactions," ACM SIGACT-SIGMOD Symposium on Principles of Database Systems, Atlanta, March, 1983.

[Milner80] Milner, R., A Calculus of Communicating Systems, Springer Lecture Notes in Computer Science 92, 1980.

[Misra81] Misra, J., Chandy, K.M., "Proofs of Networks of Processes," IEEE TOSE, Vol. SE-7, No.
4, July 1981.

[Misra82] Misra, J., Chandy, K.M., Smith, T., "Proving Safety and Liveness of Communicating Processes with Examples," ACM PODC 1982.

[Owicki76] Owicki, S., Gries, D., "Verifying Properties of Parallel Programs: An Axiomatic Approach," CACM 19, 5(1976).

[Parnas72] Parnas, D.L., "A Technique for Software Module Specification with Examples," CACM 15, 5(May, 1972), pp. 330-336.

[Pneuli77] Pnueli, A., "The Temporal Logic of Programs," FOCS 1977.

[Pratt82] Pratt, V.R., "On the Composition of Processes," ACM POPL 1982.

[Rounds81] Rounds, W.C., Brookes, S.D., "Possible Futures, Acceptances, Refusals, and Communicating Processes," FOCS 1981.

[Schwabe81a] Schwabe, D., "Formal Techniques for the Specification and Verification of Protocols," Report No. CSD-810401, UCLA Computer Science Department, April, 1981.

[Schwabe81b] Schwabe, D., "Formal Specification and Verification of a Connection-Establishment Protocol," USC-ISI Tech. Rpt. ISI/RR-81-91, April, 1981.

[Schwartz81] Schwartz, R.L., Melliar-Smith, P.M., "Temporal Logic Specification of Distributed Systems," Second International Conference on Distributed Systems, INRIA, April, 1981.

[Sunshine78] Sunshine, C.A., "Survey of Protocol Definition and Verification Techniques," Computer Networks 2(1978), pp. 346-350.

[Weihl84] Weihl, W.E., "Specification and Implementation of Atomic Data Types," PhD Thesis, MIT, March, 1984.

[Wing83] Wing, J.M., "A Two-Tiered Approach to Specifying Programs," MIT/LCS/TR-299, 1983.

[Wirth71] Wirth, N., "Program Development by Stepwise Refinement," CACM 14, 4(April 1971), pp. 221-227.

[Wolper82] Wolper, P., "Specification and Synthesis of Communicating Processes Using an Extended Temporal Logic," ACM POPL 1982.

[Yonezawa77] Yonezawa, A., "Specification and Verification Techniques for Parallel Programs Based on Message Passing Semantics," MIT/LCS/TR-191, December, 1977.
Appendix I - Formal Specification and Proof

The purpose of this appendix is to outline the way in which the informal state-transition specification and proof techniques used in this thesis can be formalized, perhaps to permit mechanically-assisted specification and verification. The major new concepts introduced to permit this formalization are those of an "event/state algebra" and an "implementation algebra." An event/state algebra is a heterogeneous algebra that embeds the machine part of a state-transition specification. An "implementation algebra" is a special kind of event/state algebra, which embeds the composite machine for an implementation, and which contains among its operations the abstraction and decomposition map for the implementation.

The utility of event/state algebras and implementation algebras derives from the fact that associated with each event/state algebra A (and hence each implementation algebra as well) is a temporal logic language $\mathcal{T}(A)$, within which can be expressed properties of the computations of the embedded machine. Each of the proof techniques presented in this thesis has the property that its hypotheses can be formalized in terms of the validity of verification conditions, which are sentences expressed in the temporal language of an appropriate event/state algebra. The problem of formalizing proofs that use the techniques of this thesis is thereby reduced to the following two problems:

(1) Find a convenient method for describing event/state algebras.

(2) Find a general method whereby the description of an event/state algebra A can be used to obtain a formal deductive system for deriving a large number of true statements about A, where these statements are expressed in the temporal language $\mathcal{T}(A)$.

In this appendix, the following tasks are accomplished:

(1) The notions of event/state algebra and implementation algebra are defined.
(2) Precise semantics are given for the temporal language $\mathcal{T}(A)$ associated with an event/state algebra A.

(3) An approach, based on set theory, for describing event/state algebras is sketched. It is indicated how, from the description of an event/state algebra A, an A-sound deductive system for the language $\mathcal{T}(A)$ might be obtained.

(4) It is shown how the various proof techniques presented in this thesis can be formalized in the language $\mathcal{T}(A)$ for an appropriate A.

1.3 Event/State Algebras

Definition - An event/state algebra A is a heterogeneous algebra whose signature is of the form $\langle Events_A, States_A, Init_A, Trans_A, \lambda_A, \ldots \rangle$, where $\lambda_A$ is a distinguished constant of sort $Events_A$ so that $\langle Events_A, \lambda_A \rangle$ is an interface, and $\langle States_A, Init_A, Trans_A \rangle$ is a machine, which we call the embedded machine and which we denote by $Mach_A$. ∎

When there is only one event/state algebra under consideration, we will omit the identifying subscripts. The ellipsis in the signature of A indicates that A is permitted to contain additional sorts, relations, and functions besides those explicitly listed. The reason for permitting A to contain these additional sorts, relations, and functions is to provide a mechanism by which the temporal language $\mathcal{T}(A)$ can be made as expressive as desired.

We now define precisely the syntax and semantics of the temporal language $\mathcal{T}(A)$ of an event/state algebra A. Let $\Sigma_A$ be the signature of A. The signature $\Sigma_A$ is required to contain distinguished sorts Events and States. In addition, we assume that corresponding to each sort of $\Sigma_A$ is a countably infinite collection of variables which we use to range over values of that sort. The language $\mathcal{T}(A)$ contains syntactic categories of "terms," "atomic formulas," and "formulas," which are defined by induction as follows:

Terms:

(1) The distinguished symbols Now and After are terms of sort States.

(2) The distinguished symbol Occurs is a term of sort Events.

(3) If v is a variable of sort S, then v is a term of sort S.

(4) If $t_1, \ldots, t_n$ are terms of sorts $S_1, \ldots,$
$S_n$, respectively, and f is an n-ary function symbol of type $S_1 \times \cdots \times S_n \to S$, then $f(t_1 \ldots t_n)$ is a term of sort S.

Atomic Formulas: If $t_1, \ldots, t_n$ are terms of sorts $S_1, \ldots, S_n$, respectively, and R is an n-ary relation symbol of type $S_1 \times \cdots \times S_n$, then $R(t_1 \ldots t_n)$ is an atomic formula.

Formulas:

(1) An atomic formula is a formula.

(2) If $\varphi$ and $\psi$ are formulas, and v is a variable of sort S, then $\neg\varphi$, $\varphi \vee \psi$, and $(\exists v \in S)\varphi$ are formulas.

(3) If $\varphi$ is a formula, then $\Box\varphi$ is a formula.

The sets of terms, atomic formulas, and formulas of $\mathcal{T}(A)$ are the least sets with the properties listed. The first-order language $\mathcal{L}(A)$ is the sublanguage of $\mathcal{T}(A)$ obtained by omitting formation rules (1) and (2) under "Terms," and formation rule (3) under "Formulas."

We treat the additional logical connectives $\wedge$, $\to$, $\leftrightarrow$, $\forall$ as abbreviations in the usual way. In addition, the temporal operator $\Diamond$ is regarded as an abbreviation for $\neg\Box\neg$.

We use the notation $t(v_1 \ldots v_n)$ to denote a term t whose variables are a subset of the set $\{v_1, \ldots, v_n\}$, and the notation $\varphi(v_1 \ldots v_n)$ to denote a formula whose free variables are a subset of $\{v_1, \ldots, v_n\}$. The notations $t(t_1/v_1 \ldots t_n/v_n)$ and $\varphi(t_1/v_1 \ldots t_n/v_n)$ denote the result of substituting the terms $t_1, \ldots, t_n$ for free occurrences of the variables $v_1, \ldots, v_n$ in t and $\varphi$, respectively.

Next, we define the semantics of $\mathcal{T}(A)$. If S is a symbol (sort name, function symbol, or relation symbol) in the signature of A, then we use $S_A$ to represent the denotation (set, function, or relation) assigned by A to the symbol S. Define an interpretation for a sequence $v_1, \ldots, v_n$ of variables of sorts $S_1, \ldots, S_n$, respectively, to be a sequence $a_1, \ldots, a_n$ of elements of A, where each $a_k$ is of sort $S_k$. The semantics of $\mathcal{T}(A)$ are defined in two parts. First, given an interpretation $a_1, \ldots, a_n$ for the free variables $v_1, \ldots, v_n$, a term $t(v_1 \ldots v_n)$ of sort S denotes a function $t[a_1/v_1 \ldots$
$a_n/v_n]$ from $Steps(Events_A, States_A)$ to $S_A$, whose value on the step $s = \langle q, e, r \rangle$ is defined as follows:

(1) If t is Now, then $t[a_1/v_1 \ldots a_n/v_n](s) = q$. If t is After, then $t[a_1/v_1 \ldots a_n/v_n](s) = r$.

(2) If t is Occurs, then $t[a_1/v_1 \ldots a_n/v_n](s) = e$.

(3) If t is the variable $v_k$, then $t[a_1/v_1 \ldots a_n/v_n](s) = a_k$.

(4) If t is $f(t_1 \ldots t_m)$, then $t[a_1/v_1 \ldots a_n/v_n](s) = f_A(b_1 \ldots b_m)$, where $b_k = t_k[a_1/v_1 \ldots a_n/v_n](s)$ for each k.

The second part of the definition of the semantics of $\mathcal{T}(A)$ is concerned with when a formula $\varphi(v_1 \ldots v_n)$ is satisfied by a history $X \in Hist(Events_A, States_A)$ and an interpretation $a_1, \ldots, a_n$ for $v_1, \ldots, v_n$. We abbreviate this as $X \models_A \varphi[a_1/v_1 \ldots a_n/v_n]$, or, when the algebra A is clear from the context, as simply $X \models \varphi[a_1/v_1 \ldots a_n/v_n]$.

Atomic Formulas: If $\varphi$ is the atomic formula $R(t_1 \ldots t_m)$, where the free variables of each $t_k$ are in the set $\{v_1, \ldots, v_n\}$, then $X \models \varphi[a_1/v_1 \ldots a_n/v_n]$ iff $\langle b_1, \ldots, b_m \rangle \in R_A$, where $b_k = t_k[a_1/v_1 \ldots a_n/v_n](Step_X(0))$ for each k.

Formulas: If $\varphi$ is a formula, but not an atomic formula, then

(1) If $\varphi$ is $\neg\psi$, $\psi \vee \chi$, or $(\exists v \in S)\psi$, then satisfaction for $\varphi$ is defined by induction in the usual way.

(2) If $\varphi$ is $\Box\psi$, then $X \models \varphi[a_1/v_1 \ldots a_n/v_n]$ iff $suffix_t(X) \models \psi[a_1/v_1 \ldots a_n/v_n]$ for all $t \in [0, \infty)$.

Suppose that $\varphi(v_1 \ldots v_n)$ is a formula of $\mathcal{T}(A)$ and that $\Psi$ is a set of formulas of $\mathcal{T}(A)$, the free variables of which are a subset of $\{v_1, \ldots, v_n\}$. We say that $\varphi$ is a consequence of $\Psi$ in A, written $\Psi \models_A \varphi$, if whenever $a_1, \ldots, a_n$ is an interpretation for the variables $v_1, \ldots, v_n$, and $X \in Hist(Events_A, States_A)$ is such that $X \models \psi[a_1/v_1 \ldots a_n/v_n]$ for all $\psi$ in $\Psi$, then $X \models \varphi[a_1/v_1 \ldots a_n/v_n]$ as well. The formula $\varphi$ is said to be valid in A, abbreviated $\models_A \varphi$, if $\varphi$ is a consequence in A of the null set of formulas. A sentence of $\mathcal{T}(A)$ is a formula of $\mathcal{T}(A)$ that has no free variables.
If $\varphi$ is a sentence and $\psi$ is a formula, then it is easily verified that $\varphi \models_A \psi$ iff $\models_A \varphi \to \psi$.

The following result makes explicit the relationship between the preceding definitions and the usual semantics of first-order logic.

Lemma 1.1 - Suppose that $\varphi(v_1 \ldots v_n)$ is a formula of $\mathcal{T}(A)$, containing no occurrences of $\Box$. Suppose that A is an event/state algebra, that $X \in Hist(Events_A, States_A)$, and that $a_1, \ldots, a_n$ is an interpretation for the variables $v_1, \ldots, v_n$. Suppose $X(0) = \langle q, e, r \rangle$. Then $X \models_A \varphi[a_1/v_1 \ldots a_n/v_n]$ iff $\models_A \varphi[a_1/v_1 \ldots a_n/v_n, q/Now, e/Occurs, r/After]$, where the latter is defined in the usual sense of first-order logic.

Proof - Straightforward. ∎

We recall here the definition, given in Chapter 4, of the sentence Comp of $\mathcal{T}(A)$:

$Comp = Init(Now) \wedge \Box\, Trans(Now, Occurs, After)$

Intuitively, $X \models Comp$ iff X is a computation for the embedded machine $Mach_A$.

We conclude this section with the following definition: Suppose that A is an event/state algebra, and Valid is a sentence of $\mathcal{T}(A)$. Then the state-transition specification defined by the pair $\langle A, Valid \rangle$ is the state-transition specification $S = \langle M, V \rangle$, where $M = Mach_A$, and $V = \{X \in Hist(Events_A, States_A) : X \models Comp \wedge Valid\}$.

1.4 Description of Event/State Algebras

In this section, we consider the problem of describing event/state algebras in such a way that a sound deductive system for $\mathcal{T}(A)$ can be obtained from a description of the event/state algebra A. It should be noted that this problem has already received a good deal of attention in the research literature under the heading of "Specification of Abstract Data Types." In spite of the effort that has been expended on this problem, there still does not seem to be an available description method that is convenient for the purposes of this thesis. It is to be hoped that this situation will be rectified in the near future.
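Before turning to description methods, the semantics given in the preceding section can be made concrete with a small executable sketch. It is only an approximation under stated assumptions: a history is modeled as a finite, non-empty, discrete list of steps (q, e, r), whereas the histories of the thesis are infinite objects, and the two-state machine used for illustration (states "idle" and "busy", events "start", "stop", and a null event "lam") is hypothetical and does not appear in the thesis.

```python
from typing import Callable, List, Tuple

# ASSUMPTION: a finite, non-empty list of steps stands in for a history.
Step = Tuple[str, str, str]          # (Now, Occurs, After) = (q, e, r)
History = List[Step]
Formula = Callable[[History], bool]


def atom(pred: Callable[[str, str, str], bool]) -> Formula:
    """Atomic formula: evaluated on the first step of the history."""
    return lambda X: pred(*X[0])


def neg(phi: Formula) -> Formula:
    return lambda X: not phi(X)


def conj(phi: Formula, psi: Formula) -> Formula:
    return lambda X: phi(X) and psi(X)


def box(phi: Formula) -> Formula:
    """Box: phi holds on every non-empty suffix of the history."""
    return lambda X: all(phi(X[t:]) for t in range(len(X)))


def diamond(phi: Formula) -> Formula:
    """Diamond is the abbreviation not-Box-not."""
    return neg(box(neg(phi)))


# Hypothetical embedded machine, for illustration only.
def init(q: str) -> bool:
    return q == "idle"


def trans(q: str, e: str, r: str) -> bool:
    return (q, e, r) in {("idle", "start", "busy"), ("busy", "stop", "idle")} \
        or (e == "lam" and q == r)


# Comp = Init(Now) and Box Trans(Now, Occurs, After)
comp = conj(atom(lambda q, e, r: init(q)), box(atom(trans)))

good = [("idle", "start", "busy"), ("busy", "stop", "idle"), ("idle", "lam", "idle")]
bad = [("idle", "start", "busy"), ("busy", "start", "busy")]

print(comp(good), comp(bad))
print(diamond(atom(lambda q, e, r: e == "stop"))(good))
```

The sketch does not check that the After state of one step equals the Now state of the next; that property is assumed to be part of what it means for the list to be a history.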
The description technique we use here can be summarized as follows: We assume fixed in advance a standard "primitive" or "core" algebra with a sufficiently expressive first-order theory. Let C be the core algebra, and let T be its complete first-order theory, expressed in the language $\mathcal{L}(C)$. An event/state algebra is described by writing a collection of first-order axioms U in an extension $\mathcal{L}$ of $\mathcal{L}(C)$, that define an extension by definition of T. Such a collection of axioms defines a unique extension of the core algebra C to a model A of $T \cup U$. We wish to obtain an A-sound deductive system for the language $\mathcal{L}(A)$ ($= \mathcal{L}$).

Since we wish our description method to be powerful enough to describe algebras such as $\langle \mathbb{N}, 0, 1, +, \cdot \rangle$, which cannot be completely axiomatized, it seems unreasonable to expect the core theory T to be axiomatizable. If we fix in advance a deductive system that axiomatizes a usefully large fragment of T, though, then by augmenting this deductive system with the defining axioms U, we can hope to obtain an axiomatization of a usefully large fragment of the complete first-order theory of A.

In this thesis, we assume as the core theory some suitable variant of the theory of sets. Set theory is highly expressive, and this makes it easy to describe desired event/state algebras. However, if machine-assisted verification is a goal, then set theory might not be the most appropriate: it seems quite possible that some less expressive core theory would be more amenable to mechanization.

We next consider the problem of deduction in $\mathcal{T}(A)$. Given an event/state algebra description, which, as discussed above, we regard as denoting an extension by definition of an underlying set theory, we wish to be able to deduce a large class of A-valid formulas of $\mathcal{T}(A)$. Suppose we could somehow transform an arbitrary sentence $\varphi$ of $\mathcal{T}(A)$ into a sentence $\varphi'$ of $\mathcal{L}(A)$ such that $\models_A \varphi \leftrightarrow \varphi'$.
In other words, suppose that we could axiomatize the temporal operator $\Box$ and special symbols Now, Occurs, and After, in terms of the set-theoretic notions of $\mathcal{L}$. Then the problem of showing $\models_A \varphi$, where $\varphi \in \mathcal{T}(A)$, would be reduced to the problem of showing that $T \models \varphi'$, where $\varphi' \in \mathcal{L}(A)$ is the transformed version of $\varphi$.

It seems likely that the reduction described in the preceding paragraph can actually be carried out, since the idea seems essentially the same as that used in the proofs [Harel78] of the "arithmetic completeness" of deductive systems for dynamic logics. Assuming that this idea works for the temporal logics $\mathcal{T}(A)$, this would give us a way of deducing all valid formulas of $\mathcal{T}(A)$, assuming we have available the complete theory of some model of set theory. Although we can never obtain a complete axiomatization of set theory, it seems likely that any of the usual collections of axioms for set theory would provide us with a deductive system for $\mathcal{T}(A)$ that is powerful enough to be useful in practice.

In practice, to write down explicitly the collection of defining axioms that describe an event/state algebra A is cumbersome. It is convenient to introduce some notation for common constructions. We do this with the understanding that descriptions expressed in this notation stand for collections of first-order defining axioms.

In general, the description of an event/state algebra can be divided into two parts: one, the definition of new sorts, and two, the definition of new function and relation symbols. We define new sorts by a set of defining equations that define the new sorts in terms of more primitive components. These equations take the form $S = g(S_1, \ldots, S_n)$, where S is the new sort being defined, $S_1, \ldots, S_n$ are the names of previously-defined sorts, and g is an expression within which various set-theoretic constructions can appear.
These defining equations are analogous to the domain equations used in the denotational definition of the semantics of a programming language [Gordon79]; however, to ensure that a set of equations can be regarded as denoting a collection of defining axioms, we do not permit here the use of recursive equations. The set-theoretic constructions (cartesian product, disjoint union, etc.) that appear on the right-hand sides of the defining equations introduce implicitly various "built-in" functions and relations (projection, injection, etc.). The constructions we use, and their associated built-in functions and relations, are listed below.

Once the equations that define the new sorts have been given, we can use these sorts and their built-in functions and relations to define additional functions and relations, in particular the initial state and state-transition relations for the embedded machine. These additional functions and relations are defined by writing defining axioms in the usual way.

1.4.1 Set-Theoretic Constructions Used in Defining Equations

1. (Enumeration) - The expression $\{a_1, \ldots, a_n\}$ denotes the n-element set whose elements are the constants $a_1, \ldots, a_n$.

2. (Disjoint Union) - If A and B are sets, then the expression $[t_A : A + t_B : B]$ denotes the disjoint union D of the sets A and B. The tags $t_A$ and $t_B$ are used to denote the injection operations associated with the disjoint union. That is, if $a \in A$ and $b \in B$, then $t_A{:}a$ denotes the image of a, and $t_B{:}b$ the image of b, in D.

3. (Cartesian Product) - The expression $[t_A : A \times t_B : B]$ denotes the cartesian product C of the sets A and B. Associated with an element c of C are its projections $c(t_A)$ and $c(t_B)$ onto the sets A and B, respectively. Given $a \in A$ and $b \in B$, the expression $\langle a, b \rangle$ denotes the ordered pair with components a and b. If $c \in C$ and $a \in A$, then the notation $c[a/t_A]$ denotes the element c' of C which is identical to c except that its $t_A$ component has the value a.
To reduce clutter in expressions, tags will be omitted from both the disjoint union and cartesian product constructions when this is unlikely to cause confusion as to the intended meaning.

4. (Function Space) - If A and B are sets, then the notation $[A \to B]$ denotes the set of all functions with domain A and range B. We use the usual notation f(a) for the application of f to the argument a, and the notation f[b/a] for the function that is identical to f except that it has value b for argument a.

5. (Finite Powerset) - The notation Set[A] denotes the set of all finite subsets of the set A. If $s \in Set[A]$ and $a \in A$, then the expression $a \in s$ is true iff a is an element of the set s. The expression $|s|$ denotes the cardinality of the set s. We also use the usual operations $\cup$, $\cap$, and $-$ on Set[A]. The notation MSet[A] denotes the set of all finite multisets of elements of A. We use the same notation for operations on multisets as for sets; however, the meaning appropriate for multisets is assumed in this case.

6. (Finite Sequences) - The notation Seq[A] denotes the set of all finite sequences (i.e. strings) of elements of A. If $u, v \in Seq[A]$, then $|u|$ denotes the length of u, $uv$ and $u \cdot v$ denote the concatenation of u and v, and if $n \in Nat$, then u(n) denotes the (n+1)st element of u.

1.4.2 Definition of the State-Transition Relation

Manipulation of the state-transition relation is sometimes more convenient if its defining axioms are factored into a collection of pairs, each of which consists of a precondition and a next-state predicate. The precondition defines the class of events to which the pair applies, and defines conditions on the current state that must be satisfied before an event in that class can occur. The next-state predicate determines the relation that must hold between the current state and the new state that results from an occurrence of such an event.
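The factoring into precondition/next-state pairs can be illustrated with a minimal sketch. Everything named here is a hypothetical illustration rather than a specification from the thesis: the machine is a bounded FIFO buffer of capacity 2, with events ("put", x) and ("get", x) and states that are tuples of integers. The sketch adopts the simplest reading, in which a step is allowed iff some pair has its precondition satisfied by the current state and event and its next-state predicate satisfied by the step; it does not model auxiliary free variables shared between the two predicates, nor any special treatment of null events.

```python
from typing import Callable, List, NamedTuple, Tuple

State = Tuple[int, ...]      # hypothetical: buffer contents, oldest first
Event = Tuple[str, int]      # hypothetical: ("put", x) or ("get", x)


class Pair(NamedTuple):
    pre: Callable[[State, Event], bool]          # precondition on (q, e)
    nxt: Callable[[State, Event, State], bool]   # next-state predicate on (q, e, r)


CAPACITY = 2  # assumed bound, for illustration only

PAIRS: List[Pair] = [
    # put(x): applies to "put" events; enabled only when the buffer has room.
    Pair(pre=lambda q, e: e[0] == "put" and len(q) < CAPACITY,
         nxt=lambda q, e, r: r == q + (e[1],)),
    # get(x): applies to "get" events; enabled only when x is at the head.
    Pair(pre=lambda q, e: e[0] == "get" and len(q) > 0 and q[0] == e[1],
         nxt=lambda q, e, r: r == q[1:]),
]


def trans(q: State, e: Event, r: State) -> bool:
    """A step is allowed iff some pair's precondition and next-state predicate both hold."""
    return any(p.pre(q, e) and p.nxt(q, e, r) for p in PAIRS)


print(trans((), ("put", 5), (5,)))
print(trans((5, 7), ("put", 1), (5, 7, 1)))
print(trans((5, 7), ("get", 5), (7,)))
```

Keeping the precondition separate from the next-state predicate makes it possible to ask whether an event is enabled in a state without constructing a successor state, which is the form in which enabledness is used in maximality arguments.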
Although the basic idea of precondition/next-state predicate pairs is fairly simple, some subtleties arise in actual use, especially associated with the interpretation of free variables common to the two predicates. This problem is similar to that which arises in the interpretation of free variables in the pre- and post-conditions used to specify sequential programs. We must therefore be somewhat more careful about the precise form and meaning of the pairs. A pair takes the form: , where e is a variable of sort Events, q and rare variables of sort States, and i. is a vector of free variables of sorts S, where S can be chosen arbitrarily for each pair. A finite collection , ... , , where the l. and Nex1c,(q, r, l J E , = q. What the above definition says is that a step satisfies the state transition relation Trans iff there exists a pair be a finite-length vector of event/state 1 algebras (corresponding to the component modules). Definition - An implementation algebra for Aabs and A is an event state algebra A with the following properties: (1) A e,nb~s A abs anu 1::acii A , with i ~ ; ::; n. For each ~• i u, oper aiion S of 1 A bs (resp. A;}, we write sabs (resp. S') for the corresponding sort or operation of A. 8 (2) A contains distinguished functions a: Events - Events8t>s 81: Events - Events', for 1 < ; < n .,, abs: States - States•bs ",: States - States', for 1 :S; < n, such that: , A = is an interconnection, called the embedded interconnection, MachA is the composite machine for 'A' MachAabs and ~. . 1; and "abs and the", are the canonical projections from the cartesian product States to the factors States8bs and States1, respectively. I 1. A function, mapping each sort, function symbol, and relation symbol of the signature of 8 to a corresponding sort, function, symbol, or relation symbol of the signature of A, that preserves relevant structure such as the -arity of the symbols. 
Since an implementation algebra is a particular kind of event/state algebra, it has an associated temporal language. Furthermore, the temporal language $\mathcal{T}(A)$ associated with an implementation algebra A contains the temporal languages $\mathcal{T}(A_{abs})$ and each $\mathcal{T}(A_i)$ as sublanguages. This property is what makes an implementation algebra useful for expressing correctness conditions.

The description of an implementation algebra is performed in the same way as for ordinary event/state algebras. The meanings of many of the symbols are fixed by the definition of an implementation algebra, and in practice it is convenient to omit their defining axioms. For example, the definition of the sort States is fixed by the requirement that it be the cartesian product of the sorts $States^k$:

$States = [\pi_{abs} : States^{abs} \times \pi_1 : States^1 \times \cdots \times \pi_n : States^n]$.

Other examples of symbols whose meanings are fixed by the definition of an implementation algebra are the initial state relation Init and the state-transition relation Trans for the composite machine. Definitions must always be explicitly given for the sort Events, the abstraction map $\alpha$, and the components $\delta_i$ of the decomposition map.

1.6 Proof Techniques

1.6.1 Formal Correctness Theorem

In this section we reduce the problem of proving the correctness of an implementation to the problem of showing the validity of a set of verification conditions, which are expressed in the temporal language associated with the implementation algebra.

There are three verification conditions in the technique introduced here. The "invariance" verification condition expresses that the predicate Inv is an implementation invariant. The "maximality" verification condition is a straightforward formalization of the maximality condition required by the Correctness Theorem, except that the phrase "q is reachable for the composite machine" is replaced by "Inv(q) holds."
The "validity" verification condition is the formalization of the validity condition required by the Correctness Theorem. Recall that the validity condition required by the Correctness Theorem states that, if X is a computation for the composite machine that projects, under the canonical projections associated with the composite machine, to a valid computation for each component machine, then X projects to a valid computation for the abstract machine as well. This condition cannot be formalized directly as a sentence in the temporal language of the implementation algebra, since that language has no constructs for dealing directly with histories and functions on histories. However, the language does contain the function symbols $\alpha$, $\langle \delta_i \rangle_{i \in I}$, $\pi_{abs}$, and $\langle \pi_i \rangle_{i \in I}$, which denote the abstraction map, the components of the decomposition map, and the canonical projections on the state set, respectively.

To formalize the validity verification condition, we need some way of taking the sentences that express the conditions required for a computation of the abstract machine or a component machine to be valid, and "lifting" these sentences to sentences that express the corresponding properties on computations of the composite machine. In Chapter 4 we defined a syntactic translation that accomplished this lifting in the case of the synchronizer implementation. We now define this translation in general, and state a lemma that summarizes its useful properties.

Suppose that $A$ is an implementation algebra for $A_{abs}$ and $\langle A_i \rangle_{i \in I}$. Given a formula $\varphi$ of $\mathcal{T}(A_{abs})$, define $[\![\varphi]\!]_{abs}$ to be the formula of $\mathcal{T}(A)$ obtained by replacing each occurrence of the symbol Now by the term $\pi_{abs}(\mathit{Now})$, each occurrence of After by the term $\pi_{abs}(\mathit{After})$, and each occurrence of Occurs by the term $\alpha(\mathit{Occurs})$.
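The translation just defined is a purely syntactic substitution, which a small sketch can make concrete. The representation below is an assumption made only for illustration and is not the thesis's notation: formulas are nested Python tuples, bare strings are symbols, and the names `lift`, `lift_abs`, `"pi_abs"`, and `"alpha"` are hypothetical stand-ins for the translation, the canonical projection, and the abstraction map.

```python
# Illustrative sketch only: formulas are nested tuples, symbols are strings.
# "pi_abs" and "alpha" are placeholder names for the canonical projection
# and the abstraction map; they are not part of any real library.

def lift(formula, state_proj, event_map):
    """Wrap Now/After in state_proj and Occurs in event_map, recursively."""
    if isinstance(formula, str):
        if formula in ("Now", "After"):
            return (state_proj, formula)
        if formula == "Occurs":
            return (event_map, formula)
        return formula  # any other symbol is left unchanged
    return tuple(lift(part, state_proj, event_map) for part in formula)


def lift_abs(formula):
    """Translate a formula of the abstract language into the composite one."""
    return lift(formula, "pi_abs", "alpha")


# Example: "Occurs = e implies x in Now" written as a nested tuple.
example = ("implies", ("eq", "Occurs", "e"), ("in", "x", "Now"))
lifted = lift_abs(example)
```

Because the substitution touches only the three distinguished symbols, the logical structure of the formula (connectives, quantifiers, temporal operators) is carried over unchanged.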
Similarly, for each i E /, given a formula cp of ~A,), define (cp); be the formula of ~A) obtained by replacing each occurrence of Now by ";(Now), each occurrence of After by ,,,(After), and each occurrence of Occurs by 8(Occurs). 1 The precise relationship between a formula and its translation is captured by Lemma 1.2 below. An analogous result is stated in [Wolper82], where process of "lifting" specifications of processes to obtain specifications of a system of processes is catled "relativization." Lemma 1.2 (Translation Lemma) - Suppose that A is an implementation algebra for A bs 8 and ,E.r Suppose that cp(v ••• vm > is a formula of '!T(Aabs) (resp. ~A;), for some i E /), 0 that a , ••• , am is an interpretation of the variables v , ... , vm • and that X is a history over 0 0 Events A and StatesA. Then X l==A lff>Babs[aofv0 ••• amlvm] iff x,Er Suppose that Valid bs is a sentence of S{A b ), and for each 8 8 9 i, Valid, is a sentence of S{A ). Let Sabs be the state-transition specification defined by 1 the pair , and for each i, let S be the state-transition specification defined 1 by the pair is correct. (Invariance): (Basis) I= (VqEStates)(lnit(q) - lnv(q)) (Induction) I= (Vq,rEStates, eEEvents)(Trans(q, e, r) - (lnv(q) - lnv(r))) (Maximality): I= (VqEStates, eEEvents)((lnv(q) " A,€, Enabled (q, e)) - Enabled bs(q, e)). 1 8 (Validity): -149- where Enabledabs(q, e) = (3rEStates)Trans8 5b (q bs' a(e), ra > 8 Enabled;(q, e) = (3rEStates)Trans1(q , a,(e), r). 1 Proof - The basis part of the invariance verification condition states that Inv is true for all initial states, and the induction part of the invariance verification condition states that Inv is preserved under state transitions, and hence the truth of these two conditions implies that Inv is inductive. 
From the definition of the predicates Enabledabs and Enabled,, we know that Enabled bs(q, e) is true of a state q and event e iff a(e) is enabled for Mach A in state 8 abs qabs' and similarly, Enabled (q, e) is true iff a,(e) is enabled for MachA, in state q,. The 1 maximality verification condition therefore says that whenever q is a state such that lnv(q) holds, and a,(e) is enabled for Mach A. in state q for each; with 1 <; < n, then a(e) . I 1 is enabled for MachA in state qabs. This implies the maximality condition required by abs the Correctnes.q Theorem. By the Translation Lemma, we know that ffValid tJabs is satisfied by a computation 8 X of MachA iff Validabs is satisfied by the computation x. I 1.7 Rely-/Guarantee-Condition Proof Techniques In this section we give the formalized versions of the rely-/guarantee-condition proof techniques stated in Chapter 3. The first result formalizes Lemma 3. 11. Corollary 1.4 (Formal Rely/Guarantee Technique I) - Suppose that A is an implementation algebra for Aabs and ;E.r Suppose that Validabs = Relyabs - Guarabs · 150 - is a sentence of 9{A abs ) and that Valid., = Rely - Guar., for each i E I is a sentence of 1 9{A;). Suppose that (1) Comp I= (,";E, ffGuar1Il;) - ffGuarabsBabs' and (2) There exists a well-founded partial order < on I such that for all ; € I, Comp I= llRelyabsDabs /\ (/\;<; ffGuariDi) - (Rely,);. Then Comp I= (/\E, nvalid,D,) - nvalidabsDabs' Proof - Straightforward from Lemma 3.11. I The next result formalizes Lemma 3.12. Corollary 1.5 (Formal Rely/Guarantee Technique II) - Suppose that A is an implementation algebra for Aab& and K,. Suppose that Validabs = Relyabs - Guarabs is a sentence of ':J(Aabs) and that Valid, = Rely - Guar, for each ; EI is a sentence of 1 ~A,). Suppose that for each I,;€ I U {abs}, we have determined a sentence RG j of 1 ~A), such that properties (1 )-(3) below hold. 
(1 )(a) Comp I= (Rely absDabs - A;f.1 RGabsJ (b) Comp I= /\1f.1 RG,,abs - ff Guara i.lam (2){a) Comp I= RGabsj /\ A,E, + {abs} RG1J - fRely;Jr for all j € / (b) Comp I== IG uar;J1 - RG1,abs /\ A1f., + {abs} ~G,1, for all i € / (3) (Acyclicity) - Whenever {, , ... , , where the sets of inputs and outputs of E are defined by the unary relations In and Out of type Events in A. Suppose that the event/state algebra A includes among its operations the finite collection of relations ,f.,• where Prod, is of type States X Events x States. If the following sentences of «:J(A) are valid, then S is . · 151 · 9 -consistent. 2 (1) t= A,E, (Vq,rEStates, eEEvents)(Prod,(q, e, r) - Trans(q, e, r)) (2) t= (VqEStates, eEEvents)(ln(e) - (3rEStates)Trans(q, e, r)) (3) t= {Vq,rEStates, eEEvents)(Trans(q, e, r) A {OUt(e) v e = A) - v,E, Prod,(q, e, ,)) (4). Comp t= (A,E, Fair,) - Valid, where Fair, = O◊Enabled1{Now) - D◊Prod1(Now, Occurs, After). Enabled (q) = (3rEStates, eEEvents) Prod (q, e, r) 1 1 Proof - Straightforward from Corollary 5.8. Hypothesis (1) says that the Prod, are subsets of Trans. Hypotheses (2) states that Mach A is input-cooperative. Hypothesis (3) states that the Prod cover the set of nonnulf output or A-steps in Trans. Hypothesis (4) 11 formalizes the requirement that every fair computation of Mach A is valid. I -152 • Appendix II - Additional Examples In this appendix the specification and- verification techniques introduced in the thesis will be further illustrated through two additional examples. The first example concerns the specification and implementation of a resource manager module whose function is to allocate resources in response to requests from user processes. The resource· manager is implemented in a highly distributed fashion by a tree-structured system of local resource manager modules that communicate with each other to determine where resources should be sent. 
In the second example, a reliable message transmission service is specified, and an implementation by an unreliable message transmission substrate is given. Reliability is achieved through the use of a fault-tolerant protocol: the alternating bit protocol [Bartlett69]. The alternating bit protocol example has been examined by several other researchers [Chen82, Hailpern80, Lamport83, Schwartz81], and has become something of a standard for evaluating specification and verification techniques for concurrent systems.

The major purpose of the additional examples given here is to lend support to the following assertion: Essentially the same techniques as were used to obtain specifications and a correctness proof for the synchronizer implementation can be applied in a reasonably systematic way to achieve similar results on other nontrivial examples. Thus, the ideas of state-transition specification, rely- and guarantee-conditions, and the proof technique embodied in the Correctness Theorem are not ad hoc concepts useful for a single example, but serve as generally applicable guiding principles.

A second point illustrated by the examples of this chapter is that more elegant specifications can result if one first imagines the structure of a proof of correctness in which the specifications will be used, and then derives the module specifications in an attempt to satisfy the requirements imposed by the proof structure. The difference between specifications obtained via this approach and those resulting from the "specify first, prove later" approach can be seen by comparing the validity conditions given here for the send and receive protocol modules with the liveness properties given by Lamport [Lamport83] for these modules. The specifications and proof given below are to a large extent independent of the precise assumptions on the behavior of the unreliable transmission medium. Lamport's presentation does not make this independence quite so explicit.
The observation that a proof of correctness can be used to derive component module specifications suggests the following general method for designing a correct implementation of a given abstract module:

(1) Decide on the communication structure of the system of component modules (e.g. tree or ring structure).

(2) For each pair of component modules that can possibly communicate, express informally the properties that each relies on the other to provide and the properties that each guarantees to the other. These rely- and guarantee-conditions will serve to "cut" the interdependence of the component modules in a fashion similar to the way in which a loop invariant cuts the dependence of one iteration on preceding and succeeding iterations.

(3) Select event and state sets for the component modules in such a way that the temporal language of the resulting implementation algebra is powerful enough to formally express the informally stated rely- and guarantee-conditions.

(4) "Localize" the rely- and guarantee-conditions so that they are expressed in the temporal language of each component module event/state algebra. The rely- and guarantee-conditions of a resulting component module specification will be the conjunction of the localized rely- and guarantee-conditions, respectively.

The examples in this appendix will be presented using the notation of Appendix I.

II.9 A Distributed Resource Management Algorithm

In this section, we consider the specification and implementation of a resource manager module RM, whose function is to allocate resources to a set of clients in response to requests from those clients. We will see how the resource manager can be implemented by a tree-structured network of local resource manager (LRM) modules, each of which communicates with a single client. Initially each local resource manager starts out with some subset of the resources. As client requests arrive and are filled at a particular site, though, the locally available set of resources might be exhausted.
An LRM that is deficient in resources must then attempt to obtain additional resources from other sites. The interesting part of the implementation is concerned with how the local resource managers communicate with each other to determine where the resources should be sent. The strategy by which this is accomplished is essentially the "DYNAMIC-MATCH" strategy of [Fischer83], although this strategy is explained here in a slightly different and hopefully simpler way than in that paper.

The resource manager example is presented here as a nontrivial exercise in the use of rely-/guarantee-conditions and an associated correctness argument as a basis for the derivation of specifications for the local resource manager modules. The use of rely-/guarantee-conditions as a guiding principle permits us to derive, in a reasonably systematic fashion, essentially the same specification for the local resource manager module as the node algorithm presented in [Fischer83]. The primary difference between the specification derived here and the algorithm of [Fischer83] is that we are not concerned here with the way in which an LRM resolves choices as to the pattern in which excess requests are forwarded to its neighbors. In [Fischer83], it is assumed that choices are resolved according to a specific probability distribution, and a large portion of the paper is concerned with probabilistic analysis of the consequences of this assumption. Here we concern ourselves only with showing that every request from a client is eventually satisfied, if possible. The argument provided in [Fischer83] for this basic correctness property is more of a proof sketch than a proof, and is somewhat unsatisfactory for this reason.
II.9.1 Specification of the Resource Manager Module

The function of the resource manager module RM can be described as follows: Let Clients be a set that contains the names of the clients with which the resource manager communicates, and let Resources be a set that contains the names of the resources to be managed. A client c requests a resource from the resource manager by issuing a request event request:c. The resource manager allocates a resource r to client c by issuing a reply event reply:⟨c, r⟩. In this example, a resource that has been allocated to a client is never returned to the resource manager.

The state of the resource manager can be thought of as consisting of a pair ⟨pending, free⟩, where pending is a multiset of clients that represents the collection of unfilled requests and free is the set of available resources. The pending component is a multiset since we permit more than one request from a single client to be outstanding at one time. Receipt of a request from client c by the resource manager causes an instance of c to be added to the pending multiset. The event reply:⟨c, r⟩ can occur only if the client c is in the pending multiset and the resource r is in the free set. Occurrence of this event causes an instance of c to be removed from the pending multiset and the resource r to be removed from the free set.

It is clear from this description that no resource is allocated more than once and no more than one resource is allocated in response to each request. In addition, we would like the resource manager to respond eventually to every request, as long as the set of free resources has not been exhausted.

To derive a more precise specification from the preceding informal description, we begin by defining the resource manager event/state algebra.
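As a reading aid, the informal description above can be mirrored by a small executable sketch. This is only an illustration under stated assumptions: the class name `ResourceManager` and its method names are hypothetical, the state is mutated in place rather than given as a relation between states, and only the safety part of the description is modelled (the eventual-response requirement is not).

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ResourceManager:
    """Hypothetical model of the RM state: a free set and a pending multiset."""
    free: set
    pending: Counter = field(default_factory=Counter)

    def request(self, client):
        # A request can always occur; it adds one instance of the client.
        self.pending[client] += 1

    def reply_enabled(self, client, resource):
        # A reply needs an outstanding request and a free resource.
        return self.pending[client] > 0 and resource in self.free

    def reply(self, client, resource):
        if not self.reply_enabled(client, resource):
            raise ValueError("reply is not enabled in this state")
        self.pending[client] -= 1    # remove one instance of the client
        self.free.remove(resource)   # the resource is never returned


rm = ResourceManager(free={"r1", "r2"})
rm.request("a")
rm.request("a")      # two outstanding requests from the same client
rm.reply("a", "r1")
```

Since `reply` removes the resource from `free` and nothing adds it back, the sketch exhibits the two safety properties noted in the text: no resource is allocated twice, and each reply consumes exactly one pending request.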
Our description has the following as parameters:

Clients: a finite set of clients
Resources: a finite set of resources

The interface of the resource manager is defined as follows:

$\mathit{Events}_{RM} = \{\lambda\} + [\mathit{request}: \mathit{Clients} + \mathit{reply}: (\mathit{Clients} \times \mathit{Resources})]$
$\mathit{In}_{RM} = \{\lambda\} + [\mathit{request}: \mathit{Clients}]$
$\mathit{Out}_{RM} = \{\lambda\} + [\mathit{reply}: (\mathit{Clients} \times \mathit{Resources})]$

The state set for the resource manager is defined by:

$\mathit{States}_{RM} = [\mathit{free}: \mathit{Set}[\mathit{Resources}] \times \mathit{pending}: \mathit{MSet}[\mathit{Clients}]]$.

In an initial state, the multiset of pending requests is empty, and all resources are free.

$\mathit{Init}_{RM}(q) \equiv q(\mathit{free}) = \mathit{Resources} \wedge q(\mathit{pending}) = \emptyset$.

The state-transition relation $\mathit{Trans}_{RM}$ is defined by precondition/next-state predicate pairs as follows:

A request event for client c can occur at any time, and causes c to be added to the pending multiset.

(request)
$\mathit{Pre}_{request}(q, e, c) \equiv e = \mathit{request}{:}c$
$\mathit{Next}_{request}(q, r, c) \equiv r = q[(q(\mathit{pending}) \cup \{c\})/\mathit{pending}]$

A reply event with resource res for client c can occur only if res is in the free set and c is in the pending multiset. It causes res to be removed from the free set and an instance of c to be removed from the pending multiset.

(reply)
$\mathit{Pre}_{reply}(q, e, c, res) \equiv e = \mathit{reply}{:}\langle c, res \rangle \wedge c \in q(\mathit{pending}) \wedge res \in q(\mathit{free})$
$\mathit{Next}_{reply}(q, r, c, res) \equiv r = q[(q(\mathit{pending}) - \{c\})/\mathit{pending},\ (q(\mathit{free}) - \{res\})/\mathit{free}]$

The validity conditions for the resource manager module can be stated in rely-/guarantee-condition form as follows:

$\mathit{Valid}_{RM} \equiv \mathit{Rely}_{RM} \rightarrow \mathit{Guar}_{RM}$, where
$\mathit{Rely}_{RM} \equiv \Box(|\mathit{Now}(\mathit{free})| \geq |\mathit{Now}(\mathit{pending})|)$
$\mathit{Guar}_{RM} \equiv \Box(\forall c \in \mathit{Clients})(c \in \mathit{Now}(\mathit{pending}) \rightarrow \Diamond(\exists r \in \mathit{Resources})(\mathit{Occurs} = \mathit{reply}{:}\langle c, r \rangle))$.

Thus, if the number of outstanding requests never exceeds the number of available resources, then the resource manager module guarantees that every request will eventually receive a reply.

II.9.2 Implementation of the Resource Manager

Our plan is to implement the resource manager module by a tree-structured network of local resource manager modules as depicted in Figure 3.
Each local resource manager is responsible for filling requests originating from a single client. If the set of resources locally available is exhausted, then the LRM must try to obtain additional resources from elsewhere in the system. If an LRM has a surplus of resources, then it must be willing to give out resources to other LRM's whose resources have already been allocated.

To guide us in our derivation of the components that will be needed as part of an LRM state, let us first obtain a rough statement of the validity conditions that an LRM is to satisfy. We organize these conditions into properties the LRM relies on its environment to provide, and properties that an LRM guarantees to its environment in return.

An LRM relies on:
(1) No special properties on the part of the client.
(2) The eventual elimination of resource debts owed to the LRM by its parent.
(3) The eventual elimination of resource debts owed to the LRM by each of its children.

In return for these properties, an LRM guarantees that:
(1) Every client request eventually receives a reply.
(2) Resource debts owed by the LRM to its parent will eventually be eliminated.
(3) Resource debts owed by the LRM to each of its children will eventually be eliminated.

Fig. 3. Resource Manager Implementation (clients attached to a tree of LRM modules, which together make up the Resource Manager Module)

To obtain formal statements of the preceding conditions, we must first obtain a precise definition of the notion of an LRM having a "resource debt" to one of its neighbors, and we must describe the mechanics of how such debts are incurred and eliminated. The introduction of the various components of the LRM state below can be viewed as providing us with enough expressive power in the language $\mathcal{T}(A_{LRM})$ of the LRM event/state algebra to permit the formalization of the undefined quantities in the above statement of the LRM validity conditions.
A significant feature of the validity conditions stated above is the complementary form of the rely- and guarantee-conditions. The conditions above have been selected in such a way that ultimately, in the resource manager implementation, the conditions relied upon by an LRM i from its neighboring LRM j will be precisely the conditions that LRM j guarantees to provide to LRM i. This symmetric statement of the validity conditions will be seen below to result in a rather simple and pleasant proof of correctness.

With the above validity conditions in mind, we now attempt to identify the various events of the LRM interface and the components of the LRM state. We can identify immediately several kinds of events that must be in the interface of the LRM. Communication with the client requires the existence of a request event request, which represents the receipt of a request from the client, and a reply event of the form reply:r, in which resource r is allocated to the client in response to a prior request. Furthermore, the interface of an LRM must contain events corresponding to the transfer of resources between an LRM and its neighbors in the system. Let Resources be the set of names of all the resources that the LRM might be called upon to handle. For each r ∈ Resources, the LRM interface includes the event parent_in:r, which represents the receipt of resource r from an LRM's parent in the tree, and parent_out:r, which represents the delivery of resource r by an LRM to its parent. Let Children be a set of names used to index the children of the LRM. For each c ∈ Children and r ∈ Resources the interface of the LRM includes the event child_out:⟨c, r⟩, which represents the transfer of resource r from the LRM to child c, and the event child_in:⟨c, r⟩, which represents the receipt of resource r by the LRM from child c.
To describe the conditions under which transmission of resources between LRM's and between a client and an LRM is permitted, we include in the state of each LRM a set free, which represents the resources locally available at the LRM, and a nonnegative integer pending, which counts the number of unfilled requests that originated at the client associated with the LRM. A request event causes pending to be incremented. A reply:r event can occur only if pending is nonzero and r ∈ free, and causes pending to be decremented and r to be removed from free. The resource transmission events parent_in:r and child_in:⟨c, r⟩ cause r to be added to the set free. The events parent_out:r and child_out:⟨c, r⟩ can occur only if r ∈ free, and cause r to be removed from free.

We have thus settled the issue of how and when requests and replies are transmitted between an LRM and its client, and how resources are shuttled between LRM's. However, we have not yet determined how and when an LRM should request resources from one of its neighbors, or when an LRM should issue resources to a neighboring LRM.

To describe the conditions governing the transmission of resources between LRM's, we introduce a few more components into the state of an LRM. The state of each LRM contains a component p_balance, and a component c_balance:c for each child c. The component p_balance represents the instantaneous "balance of payments" between the LRM and its parent, and c_balance:c represents a similar balance of payments between the LRM and child c. A positive balance represents a number of resources owed to the LRM by its neighbor, and a negative balance represents a number of resources owed by the LRM to its neighbor. These balances will be maintained so that the following relation is invariant: If p is an LRM with child c, then the c_balance:c component of the state of LRM p is always the negative of the p_balance component of the state of LRM c.
This reflects the idea that resources owed by p to c can be viewed as a debit from the point of view of p, or as a credit from the point of view of c. These balances will be updated appropriately as requests are forwarded, and as resources travel between LRM's in payment of debts. An LRM will transmit resources to its neighbor in an attempt to reduce its indebtedness.

To represent the forwarding of requests between LRM's we introduce the following additional kinds of events into the LRM interface: A forward_in event represents the receipt by the LRM of a forwarded request from its parent. Similarly, a forward_out:c event corresponds to the forwarding of a request by the LRM to child c. The event reject_out represents the forwarding of a request by the LRM to its parent, and the event reject_in:c represents the receipt of a forwarded request by the LRM from child c. We use the terminology reject for the forwarding of requests upward in the tree to emphasize the asymmetry inherent in the parent/child relationship.

In determining the conditions under which forwarding and rejection events should be permitted to occur, we must attempt to avoid the following two bad situations: (1) We must avoid the deadlock situation in which two LRM's are stubbornly requesting resources from each other, while each of their resource requirements could be fulfilled by resources from elsewhere in the system. (2) We must avoid the "livelock" situation in which a request is continually shuttled back and forth in the system without ever reaching an LRM with available resources.

Our proposal for resolving these difficulties is to have each LRM keep estimates of the number of surplus resources available in the subtree headed by each of its children. These estimates are to be optimistic in the sense that the estimate held by an LRM for child c is at all times an upper bound on the number of surplus resources actually available in the subtree headed by c.
Situation (1) is avoided by having an LRM request resources from its parent only in the case that it has no resources locally available and there are no surplus resources left in any of the subtrees headed by its children. Situation (2) is avoided by requiring that an LRM only send a request to a child c if it estimates that there is a surplus of resources in the subtree headed by c. The effect of these two requirements is to ensure that the following invariant holds: If an LRM p owes resources to its child LRM c, then the number of resources owed by p to c is a lower bound on the instantaneous amount by which pending requests exceed available resources in the subtree headed by c. Thus p never owes more resources to c than are actually required by c's subtree.

The balances of payments between an LRM and each of its neighbors can be combined with the number of pending requests and locally available resources to produce a quantity PBalance, which represents the projected net number of resources (positive = surplus, negative = deficit) that would be left at the LRM after all debts are paid. The quantity PBalance, defined formally below, is informally the number of free resources, plus the net number of resources owed to the LRM by its neighbors, minus the number of pending requests. The forwarding and rejection of requests by an LRM to its neighbors is done with the goal of "getting in the black"; that is, reducing the projected deficit.

The remaining components we need as part of the LRM state are the following: For each child c, the state of an LRM contains a component c_estim(c), which is an integer that represents the optimistic estimate made by the LRM of the projected number of resources that would be available in the subtree headed by child c, once all debts have been paid. If c is an LRM whose parent is p, then the state of c also contains a component p_estim, which is a local copy of the c_estim(c) component of the state of LRM p.
Thus, not only does an LRM keep estimates of the projected number of resources remaining in the subtrees headed by each of its children, but it also keeps track of what its parent must currently estimate as the projected number of resources remaining in the subtree headed by the LRM. We permit p_estim and c_estim(c) to take on arbitrary integer values, although it can be shown that if an LRM is used only in a system of other LRM's in the way we envision, then p_estim and c_estim(c) are invariantly nonnegative.

The important points of the preceding discussion of the LRM events and states can be summarized as follows:

(1) The LRM interface contains events corresponding to requests from and replies to the client, transferring of resources from/to its neighbors, and forwarding and rejection of requests.

(2) An LRM state contains a set free of locally available resources and a count pending of outstanding requests from the client, to ensure that every request receives a response and that no resource is allocated more than once.

(3) An LRM state contains a record of its "balance of payments" with each of its neighbors. Transfer of resources and requests between LRM's is performed to reduce indebtedness. If p and c are neighboring LRM's, then the balance kept by p for c is the negative of the balance kept by c for p.

(4) An LRM state contains an estimate of the projected net number of resources that would remain, once all debts have been paid, in the subtrees headed by each of its children. This information is used to control the forwarding and rejection of requests. If p is the parent of c, then c maintains a local copy of p's estimate of the projected number of resources remaining in the subtree headed by c.

II.9.3 Local Resource Manager Specification

From the informal discussion of the preceding section, we can derive a precise local resource manager specification.
In the informal discussion above, we made no distinction between the root LRM and the other LRM's in the system. Although similar in many respects, the precise specifications of these two kinds of LRM's will be slightly different since a root LRM has no parent. To avoid redundancy, the specifications of the two kinds of LRM will be presented simultaneously, with differences pointed out along the way.

The parameters of the LRM are the following:

Children: a finite set of children
Resources: a finite set of resources
IResources: the subset of Resources held initially by the LRM
$\{\mathit{estim}_c : c \in \mathit{Children}\}$: initial estimates of the number of resources in the subtrees headed by each of the children.

The set Children is a set of names used to identify the children of the LRM. The set Resources is a set of names for all of the resources that the LRM might have to deal with. This set includes the names of all resources initially held by the LRM, as well as all resources that might be transmitted to the LRM at some later instant by its neighbors. The set IResources is a subset of Resources that represents the set of resources initially available at the LRM. For each c ∈ Children, the parameter $\mathit{estim}_c$ is a nonnegative number which the LRM uses as its initial estimate of the projected number of resources remaining in the subtree headed by child c. Since there will be no debts in an initial state, correct use of an LRM requires that each $\mathit{estim}_c$ equal the actual number of resources initially available in the subtree headed by child c.
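The requirement on the initial estimates can be illustrated with a short sketch that computes them for a given tree. Everything named here is an assumption made for illustration: the tree is given as a dictionary `children` from node names to lists of child names, `initial` maps each node to its IResources set, and the function names are hypothetical.

```python
def subtree_resources(node, children, initial):
    """Count the resources initially held anywhere in the subtree at `node`."""
    return len(initial[node]) + sum(
        subtree_resources(c, children, initial) for c in children.get(node, ())
    )


def initial_estimates(node, children, initial):
    """Return the estimate for each child c of `node` as a dictionary."""
    return {
        c: subtree_resources(c, children, initial)
        for c in children.get(node, ())
    }


# A small hypothetical tree: root has children a and b, and a has child a1.
children = {"root": ["a", "b"], "a": ["a1"]}
initial = {"root": {"r0"}, "a": set(), "a1": {"r1", "r2"}, "b": {"r3"}}

root_estimates = initial_estimates("root", children, initial)
```

In this tree the subtree headed by a holds two resources (both at a1) and the subtree headed by b holds one, so the root's estimates for a and b are 2 and 1 respectively.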
The interface of a node LRM is defined as follows:

$\mathit{Events}_{NLRM} = \{\lambda\} + [\mathit{CEvent} + \mathit{SEvent}]$
$\mathit{In}_{NLRM} = \{\lambda\} + [\mathit{CIEvent} + \mathit{SIEvent}]$
$\mathit{Out}_{NLRM} = \{\lambda\} + [\mathit{COEvent} + \mathit{SOEvent}]$,

where

$\mathit{CEvent} = \mathit{CIEvent} + \mathit{COEvent}$
$\mathit{SEvent} = \mathit{SIEvent} + \mathit{SOEvent}$

and

$\mathit{CIEvent} = [\mathit{request}]$
$\mathit{COEvent} = [\mathit{reply}: \mathit{Resources}]$
$\mathit{SIEvent} = [\mathit{reject\_in}: \mathit{Children} + \mathit{forward\_in} + \mathit{parent\_in}: \mathit{Resources} + \mathit{child\_in}: [\mathit{Children} \times \mathit{Resources}]]$
$\mathit{SOEvent} = [\mathit{reject\_out} + \mathit{forward\_out}: \mathit{Children} + \mathit{parent\_out}: \mathit{Resources} + \mathit{child\_out}: [\mathit{Children} \times \mathit{Resources}]]$

The events listed above have the following intuitive meanings: Client events are those in which the LRM communicates with the client, whereas system events are those in which the LRM communicates with other LRM's. The client events are classified into request events, in which a request is received from the client, and reply events, in which a resource is sent to the client in response to a prior request. The system events are classified into: forwarding events (forward_out, forward_in), in which a request is forwarded from an LRM to one of its children; rejection events (reject_out, reject_in), in which a request is rejected from an LRM to its parent; and resource transfer events (parent_out, parent_in, child_out, child_in), in which a resource is transferred from an LRM to one of its neighbors. The "_in" and "_out" suffixes denote the direction in which resources or requests flow; thus, forward_out:c is the event in which a request is forwarded from an LRM to child c, whereas forward_in is the event in which a forwarded request is received by an LRM from its parent.

The interface $\mathit{Events}_{RLRM}$ of a root LRM is obtained by omitting the forward_in, parent_out, reject_out, and parent_in events.

The state set for both a node and a root LRM is defined as follows:

$\mathit{States}_{LRM} = [\mathit{free}: \mathit{Set}[\mathit{Resources}],\ \mathit{pending}: \mathit{Nat},\ \mathit{p\_balance}: \mathit{Int},\ \mathit{c\_balance}: [\mathit{Children} \rightarrow \mathit{Int}],\ \mathit{p\_estim}: \mathit{Int},\ \mathit{c\_estim}: [\mathit{Children} \rightarrow \mathit{Int}]]$.

The set free is the set of resources currently available at the LRM.
The number pending is a counter that records the number of outstanding requests. The quantity p_balance records the net number of resources that the LRM either is promised by its parent, or promises to send to its parent. If p_balance > 0, then the LRM is promised resources by its parent; if p_balance < 0, then the LRM promises to send resources to its parent. The mapping c_balance records similar information for each of the children. The mapping c_estim records the estimate of the projected number of remaining resources in the subtree headed by each child. The quantity p_estim is the LRM's local copy of its parent's estimate for the subtree headed by the LRM, as discussed above.

The initial state relation for the LRM is defined below. Recall that we view a finite multiset over a given universe as a function that assigns a finite multiplicity to each element of the universe. Lambda-notation has been used below as a shorthand for denoting particular multisets.

$\mathit{Init}_{LRM}(q) \equiv q = \langle \mathit{free}: \mathit{IResources},\ \mathit{pending}: 0,\ \mathit{p\_balance}: 0,\ \mathit{c\_balance}: (\lambda c \in \mathit{Children})(0),\ \mathit{p\_estim}: |\mathit{IResources}| + \sum_{c \in \mathit{Children}} \mathit{estim}_c,\ \mathit{c\_estim}: (\lambda c \in \mathit{Children})(\mathit{estim}_c) \rangle$

Thus, in the initial state, all resources in IResources are free, no requests are pending, no resources are promised by or promised to any of the neighbors, and the estimated surplus of resources in the subtree headed by the LRM is the number of free resources initially at the LRM, plus the sum of all the initial estimates for the subtrees headed by each of the children of the LRM.

We can now give the formal definition of the quantity PBalance discussed above.

$\mathit{PBalance}(q) = |q(\mathit{free})| - q(\mathit{pending}) + q(\mathit{p\_balance}) + \sum_{c \in \mathit{Children}} q(\mathit{c\_balance})(c)$.

As discussed above, given a state q, PBalance(q) represents the net number of resources (positive = surplus, negative = deficit) that would be left at the LRM after all debts are paid.

The state-transition relation $\mathit{Trans}_{NLRM}$ for a node LRM is defined as follows:

An incoming request from a client gets recorded as pending.
    (request)
    Pre_request(q, e) ≡ e = request
    Next_request(q, r) ≡ r = q[(q(pending) + 1)/pending]

A resource res can be sent to the client if there is at least one pending request, and res is in the set of free resources. The resource res is removed from the set of free resources, and the number of pending requests is decremented.

    (reply)
    Pre_reply(q, e, res) ≡ e = reply:res ∧ res ∈ q(free) ∧ q(pending) > 0
    Next_reply(q, r, res) ≡ r = q[(q(free) - {res})/free, (q(pending) - 1)/pending]

Receipt of a forwarded request from the parent means that the LRM promises to send one more resource to the parent, and consequently, that the LRM estimates a surplus of one fewer in its own subtree.

    (forward_in)
    Pre_forward_in(q, e) ≡ e = forward_in
    Next_forward_in(q, r) ≡ r = q[(q(p_balance) - 1)/p_balance, (q(p_estim) - 1)/p_estim]

A request can be forwarded to child c only if the LRM currently is "in the red" and estimates a surplus of resources in the subtree headed by child c. As a result of forwarding the request, the number of resources promised by child c is incremented, and the estimated number of resources in the subtree headed by c must be decremented.

    (forward_out)
    Pre_forward_out(q, e, c) ≡ e = forward_out:c ∧ PBalance(q) < 0 ∧ q(c_estim)(c) > 0
    Next_forward_out(q, r, c) ≡ r = q[(q(c_balance)(c) + 1)/c_balance(c), (q(c_estim)(c) - 1)/c_estim(c)]

Receipt of a rejected request from child c means that child c promises to send one fewer resource (or requires one more resource) than it did before, and thus the quantity c_balance(c) must be decremented. In addition, the fact that a request has been rejected by c means that the resources in the subtree headed by c have been exhausted, and thus c_estim(c) should be set to zero.
    (reject_in)
    Pre_reject_in(q, e, c) ≡ e = reject_in:c
    Next_reject_in(q, r, c) ≡ r = q[(q(c_balance)(c) - 1)/c_balance(c), 0/c_estim(c)]

A request can be rejected to the parent only if the LRM is "in the red" and there is no projected surplus in any of the subtrees headed by children of the LRM. By rejecting a request, the LRM promises one fewer resource to its parent, and hence reduces its projected deficit. In addition, p_estim must be zeroed to maintain the invariant equality between p_estim and the corresponding c_estim component of the parent LRM.

    (reject_out)
    Pre_reject_out(q, e) ≡ e = reject_out ∧ PBalance(q) < 0 ∧ (∀c∈Children)(q(c_estim)(c) ≤ 0)
    Next_reject_out(q, r) ≡ r = q[(q(p_balance) + 1)/p_balance, 0/p_estim]

The various resource transfer events occur when an LRM owes a debt and has an available resource. Their effect is to cancel out some of the debt.

    (parent_in)
    Pre_parent_in(q, e, res) ≡ e = parent_in:res
    Next_parent_in(q, r, res) ≡ r = q[(q(free) ∪ {res})/free, (q(p_balance) - 1)/p_balance]

    (parent_out)
    Pre_parent_out(q, e, res) ≡ e = parent_out:res ∧ res ∈ q(free) ∧ q(p_balance) < 0
    Next_parent_out(q, r, res) ≡ r = q[(q(free) - {res})/free, (q(p_balance) + 1)/p_balance]

    (child_in)
    Pre_child_in(q, e, c, res) ≡ e = child_in:⟨c, res⟩
    Next_child_in(q, r, c, res) ≡ r = q[(q(free) ∪ {res})/free, (q(c_balance)(c) - 1)/c_balance(c)]

    (child_out)
    Pre_child_out(q, e, c, res) ≡ e = child_out:⟨c, res⟩ ∧ res ∈ q(free) ∧ q(c_balance)(c) < 0
    Next_child_out(q, r, c, res) ≡ r = q[(q(free) - {res})/free, (q(c_balance)(c) + 1)/c_balance(c)]

The definition of the state-transition relation Trans^RLRM for a root LRM is obtained by deleting the pairs above for the forward_in, parent_out, and parent_in events, and replacing the pair for reject_out events by the following pair for λ-events:

    (λ)
    Pre_λ(q, e) ≡ e = λ ∧ PBalance(q) < 0 ∧ (∀c∈Children)(q(c_estim)(c) ≤ 0)
    Next_λ(q, r) ≡ r = q[(q(p_balance) + 1)/p_balance, 0/p_estim]

The λ-transitions permitted by this pair are necessary for the consistency of the root LRM specification: if the reject_out pair were simply deleted as were the forward_in, parent_out, and parent_in pairs, then there would be no way for a root LRM to change the value of p_balance and the rely-condition Rely_external^RLRM defined below would be vacuous.

To complete the specification of the local resource manager, it remains to define the validity conditions. As outlined in the informal discussion above, the validity conditions for the node and root LRM's can be expressed in rely-/guarantee-condition form as follows:

    Valid^NLRM ≡ Rely^NLRM → Guar^NLRM
    Valid^RLRM ≡ Rely^RLRM → Guar^RLRM.

As was done in the informal discussion, it is convenient to factor the rely- and guarantee-conditions into what the LRM relies on each of its neighbors and the external environment to provide, and what the LRM guarantees in turn to each of its neighbors and the external environment. The rely- and guarantee-conditions for the node LRM are defined by

    Rely^NLRM ≡ Rely_parent^NLRM ∧ (∀c∈Children)Rely_child^LRM(c)
    Guar^NLRM ≡ Guar_client^LRM ∧ Guar_parent^NLRM ∧ (∀c∈Children)Guar_child^LRM(c).

The rely- and guarantee-conditions for the root LRM are defined by

    Rely^RLRM ≡ Rely_external^RLRM ∧ (∀c∈Children)Rely_child^LRM(c)
    Guar^RLRM ≡ Guar_client^LRM ∧ (∀c∈Children)Guar_child^LRM(c).

A node LRM relies on the eventual payment of debts owed to the LRM by its parent.

    Rely_parent^NLRM ≡ □(□(Now(p_balance) > 0) → ◊(∃r∈Resources)(Occurs = parent_in:r))

Although a root LRM has no parent, the intuitive significance of a positive value for p_balance in the case of a root LRM is that the total number of requests in the entire tree exceeds the total number of available resources.
Since we cannot expect a system of LRM's to eventually satisfy all requests under such circumstances, a root LRM relies on the external environment to ensure that p_balance is invariantly nonpositive.

    Rely_external^RLRM ≡ □(Now(p_balance) ≤ 0)

Both kinds of LRM rely on each of their children to eventually eliminate debts owed to the LRM, either by the transmission of resources, or by the rejection of requests.

    Rely_child^LRM(c) ≡ □(□(Now(c_balance)(c) > 0) → ◊((∃r∈Resources)(Occurs = child_in:⟨c, r⟩) ∨ (Occurs = reject_in:c)))

A node or root LRM guarantees to its client that pending requests will eventually receive a reply.

    Guar_client^LRM ≡ □(Now(pending) > 0 → ◊(∃r∈Resources)(Occurs = reply:r))

A node LRM guarantees eventually to eliminate debts owed to its parent, either by actual transmission of resources, or by rejecting requests.

    Guar_parent^NLRM ≡ □(□(Now(p_balance) < 0) → ◊((∃r∈Resources)(Occurs = parent_out:r) ∨ (Occurs = reject_out)))

Both kinds of LRM guarantee eventually to pay debts owed to their children.

    Guar_child^LRM(c) ≡ □(□(Now(c_balance)(c) < 0) → ◊(∃r∈Resources)(Occurs = child_out:⟨c, r⟩))

In devising the validity conditions for the local resource manager module, it was necessary to choose between two possible forms in which to state the rely- and guarantee-conditions. Since we are often faced with such choices in practice, it is useful to examine the motivation for the particular choice made here. As an example, consider the definition of Guar_parent^NLRM, which was stated above in the form

    (1) Guar_parent^NLRM ≡ □(□(Now(p_balance) < 0) → ◊((∃r∈Resources)(Occurs = parent_out:r) ∨ (Occurs = reject_out)))

This guarantee-condition states that either a parent_out or a reject_out will occur if the condition p_balance < 0 holds persistently (i.e. forever after some point).
We might also have chosen the apparently stronger alternative form

    (2) Guar_parent^NLRM ≡ □(Now(p_balance) < 0 → ◊((∃r∈Resources)(Occurs = parent_out:r) ∨ (Occurs = reject_out))),

which requires the occurrence of a parent_out or reject_out event in the case that the condition p_balance < 0 occurs at a single instant. In fact, we claim these two sentences are equivalent in the context of the LRM specification. More precisely, we claim Comp^LRM ⊨ (1) ↔ (2). Clearly (2) implies (1) by temporal reasoning alone. To see that Comp^LRM ⊨ ¬(2) implies ¬(1), suppose Comp^LRM and ¬(2). Then

    (*) ◊(Now(p_balance) < 0 ∧ □((∀r∈Resources)(Occurs ≠ parent_out:r) ∧ (Occurs ≠ reject_out))).

That is, eventually there is a point at which p_balance < 0 holds, but after which no parent_out or reject_out events ever occur. Inspection of the state-transition relation for the LRM shows that the only events that can cause p_balance to be increased are parent_out and reject_out events. This means that, if no parent_out or reject_out events occur, then p_balance < 0, once established, holds forever. Applying this result to (*) shows that

    (**) ◊(□(Now(p_balance) < 0) ∧ □((∀r∈Resources)(Occurs ≠ parent_out:r) ∧ (Occurs ≠ reject_out))).

But (**) is precisely the negation of (1) above, and thus (1) and (2) are equivalent.

In this example, where form (1) and form (2) are equivalent, we chose form (1) over form (2) because form (1) is more convenient for the proof of correctness. Once we have chosen form (1) for the guarantee-condition Guar_parent^NLRM, we must use the same form for the complementary rely-condition Rely_child^LRM(c). Similar arguments apply to Guar_child^LRM(c) and Rely_parent^NLRM.

11.9.4 The Resource Manager Implementation Algebra

In this section we define the resource manager implementation algebra A^RMI.
Let the following be given as parameters:

    Clients: a finite set of clients.
    root: a distinguished element of Clients.
    Children: Clients → Set[Clients] maps each client to a set of children.
    Resources: a finite set of resources.
    {Resources_c: c ∈ Clients}: the initial partitioning of Resources.

We require that the structure determined by Clients, root, and Children be a rooted tree. Let parent: (Clients - {root}) → Clients be the function that maps each c ∈ Clients - {root} to its parent. Define the function PDesc: Clients → Set[Clients], which takes an element c of Clients to the set of all proper descendants of c, in terms of the function Children in the obvious way. Define Desc(c) = {c} ∪ PDesc(c) for all c ∈ Clients.

The set Clients will be the index set for the interconnection; that is, there will be one LRM corresponding to each element of Clients. Define the embedded algebras A_abs and {A_p: p ∈ Clients} as follows:

    A_abs: is the resource manager event/state algebra A^RM, with parameters Clients, Resources instantiated as Clients, Resources, respectively.

    A_root: is the local resource manager event/state algebra A^LRM, with parameters Resources, IResources, Children, {estim_c: c ∈ Children(root)} instantiated as Resources, Resources_root, Children(root), {Σ_{d∈Desc(c)} |Resources_d| : c ∈ Children(root)}, respectively.

    A_p: where p ∈ Clients - {root}, is the local resource manager event/state algebra A^LRM, with parameters Resources, IResources, Children, {estim_c: c ∈ Children(p)} instantiated as Resources, Resources_p, Children(p), {Σ_{d∈Desc(c)} |Resources_d| : c ∈ Children(p)}, respectively.

Let the composite interface for the resource manager interconnection be defined as follows:

    Events^RMI = {λ} + [request: Clients + reply: (Clients × Resources) + forward: (Clients - {root}) + reject: Clients + down: ((Clients - {root}) × Resources) + up: ((Clients - {root}) × Resources)]
    In^RMI = {λ} + [request: Clients]
    Out^RMI = {λ} + (Events^RMI - In^RMI).
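The parameter instantiation above leaves PDesc to be defined "in the obvious way" and sets each initial estimate estim_c to the number of resources initially held in Desc(c). A minimal Python sketch of that computation follows. The function names, the dictionary encoding of Children and of the partition {Resources_c}, and the example tree are all assumptions made for illustration; they are not part of the specification.

```python
from typing import Dict, Set

def proper_descendants(children: Dict[str, Set[str]], c: str) -> Set[str]:
    """PDesc(c): every client strictly below c in the tree given by `children`."""
    result: Set[str] = set()
    stack = list(children.get(c, ()))
    while stack:
        d = stack.pop()
        if d not in result:
            result.add(d)
            stack.extend(children.get(d, ()))
    return result

def descendants(children: Dict[str, Set[str]], c: str) -> Set[str]:
    """Desc(c) = {c} united with PDesc(c)."""
    return {c} | proper_descendants(children, c)

def initial_estimates(children: Dict[str, Set[str]],
                      resources: Dict[str, Set[str]],
                      p: str) -> Dict[str, int]:
    """estim_c for each child c of p: resources initially located in Desc(c)."""
    return {
        c: sum(len(resources[d]) for d in descendants(children, c))
        for c in children.get(p, ())
    }

# Hypothetical example: root has children a and b; a has child c.
children = {"root": {"a", "b"}, "a": {"c"}, "b": set(), "c": set()}
resources = {"root": {"r1"}, "a": {"r2", "r3"}, "b": set(), "c": {"r4"}}
root_estimates = initial_estimates(children, resources, "root")
```

In this example the root's initial estimate for child a counts the resources at both a and c, while its estimate for the leaf b is zero.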
Intuitively, the event request:p corresponds to the receipt of a request by LRM p from its client, and reply:⟨p, r⟩ corresponds to the allocation of resource r by LRM p to its client. The event forward:p represents the simultaneous occurrence of a forward_in event for LRM p, and a forward_out:p event for LRM parent(p). The event reject:p represents the simultaneous occurrence of a reject_out event for LRM p and a reject_in:p event for LRM parent(p). The event down:⟨p, r⟩ represents the simultaneous occurrence of a parent_in:r event for LRM p and a child_out:⟨p, r⟩ event for LRM parent(p). Finally, the event up:⟨p, r⟩ represents the simultaneous occurrence of a parent_out:r event for LRM p and a child_in:⟨p, r⟩ event for LRM parent(p). Formally, these relationships are captured by the following definitions of the abstraction map α^RMI, and the decomposition map δ^RMI = ⟨δ_p^RMI⟩_{p∈Clients}:

    α^RMI(e) = request:c        if e = request:c
             = reply:⟨c, r⟩     if e = reply:⟨c, r⟩
             = λ                otherwise.

    δ_p^RMI(e) = request           if e = request:p
               = reply:r           if e = reply:⟨p, r⟩
               = forward_in        if e = forward:p
               = forward_out:c     if e = forward:c and p = parent(c)
               = reject_out        if e = reject:p
               = reject_in:c       if e = reject:c and p = parent(c)
               = child_in:⟨c, r⟩   if e = up:⟨c, r⟩ and p = parent(c)
               = child_out:⟨c, r⟩  if e = down:⟨c, r⟩ and p = parent(c)
               = parent_out:r      if e = up:⟨p, r⟩
               = parent_in:r       if e = down:⟨p, r⟩
               = λ                 otherwise.

11.9.5 Proof of Correctness

In this section we prove the correctness of the implementation ⟨A^RMI, S_abs, ⟨S_c⟩_{c∈Clients}⟩, where S_abs is the resource manager specification, S_root is the root LRM specification, and S_c for c ∈ Clients - {root} is the node LRM specification.

Implementation Invariant: As usual, we factor the implementation invariant Inv^RMI(q) for the resource manager implementation into an abstraction relation Abs^RMI(q) and a representation invariant Rep^RMI(q).
The abstraction relation simply states that the set of free resources for the abstract resource manager module is just the union of the sets of free resources for each of the component LRM's, and that the multiset of pending requests for the abstract RM assigns to each client a multiplicity equal to the value of the state variable pending for the corresponding LRM.

    Abs^RMI(q) ≡ q_abs(free) = ⋃_{c∈Clients} q_c(free) ∧ q_abs(pending) = (λc∈Clients)(q_c(pending))

It is convenient to factor the representation invariant into several conjuncts:

    Rep^RMI(q) ≡ Disjoint(q) ∧ Neighbor(q) ∧ Owed(q) ∧ Optim(q).

The conjunct Disjoint(q) states that the sets of free resources possessed by two distinct LRM's are disjoint.

    Disjoint(q) ≡ ⋀_{c,c'∈Clients} …

… up:⟨c, res⟩, and down:⟨c, res⟩. In case e = reply:⟨c, res⟩, we have that r_c(free) = q_c(free) - {res} and r_c'(free) = q_c'(free) for all c' ∈ Clients with c' ≠ c. In case e = up:⟨c, res⟩, and c ∈ Children(p), we have that r_p(free) = q_p(free) ∪ {res}, r_c(free) = q_c(free) - {res}, and r_c'(free) = q_c'(free) for all c' ∈ Clients - {c, p}. In case e = down:⟨c, res⟩, and c ∈ Children(p), we have that r_p(free) = q_p(free) - {res}, r_c(free) = q_c(free) ∪ {res}, and r_c'(free) = q_c'(free) for all c' ∈ Clients - {c, p}. In each of these cases it is easily checked that Disjoint(r) holds.

Neighbor(r): Note that the predicate Neighbor(r) depends upon the values of r_c(p_balance), r_c(c_balance), r_c(p_estim), and r_c(c_estim), for each c ∈ Clients. Enumeration of cases shows that the only events that affect the values of these components of the state are the events reject:c, forward:c, down:⟨c, res⟩, and up:⟨c, res⟩. However, examination of the definition of the LRM state-transition relation and the definition of the decomposition map δ^RMI shows that each change in the state of a participant in one of these events is accompanied by a compensating change in the state of the other participant.
For example, if c ∈ Children(p), then occurrence of an event of the form reject:c makes r_c(p_balance) = q_c(p_balance) + 1, but also makes r_p(c_balance)(c) = q_p(c_balance)(c) - 1. Thus the predicate Neighbor is preserved.

Owed(r): Assuming that Owed(q) holds, the only way for Owed(r) to be false is for an event e to occur that increments q_p(p_balance) when it is zero, or increments PBalance(q_p) + Σ_{c∈Children(p)} q_p(c_estim)(c) when it is zero. The only events that might have this property are e = reject:p, and e = up:⟨p, res⟩. In case e = reject:p, PBalance(q_p) + Σ_{c∈Children(p)} q_p(c_estim)(c) is incremented, but the precondition for this event requires that this quantity be less than zero, so Owed(r) holds. In case e = up:⟨p, res⟩, the quantity q_p(p_balance) is incremented, but the precondition for e requires that this quantity be strictly negative, and hence Owed(r) holds.

Optim(r): Assuming that Optim(q) holds, the only way for Optim(r) to be false is for the quantity q_c(p_estim) to be decreased below the quantity PBalance(q_c) + Σ_{d∈Children(c)} q_c(c_estim)(d), or for the latter quantity to be increased above the former. The only events that could possibly have this effect are forward:c and reject:c. If e = forward:c, then q_c(p_estim) is decremented, but so is PBalance(q_c) + Σ_{d∈Children(c)} q_c(c_estim)(d). If e = reject:c, then PBalance(q_c) + Σ_{d∈Children(c)} q_c(c_estim)(d) is incremented and q_c(p_estim) is set to zero. However, the precondition for e requires that the former quantity be negative. This fact implies that PBalance(r_c) + Σ_{d∈Children(c)} r_c(c_estim)(d) ≤ 0 and r_c(p_estim) = 0, and Optim(r) holds.

From the invariance of Owed, Neighbor, and Optim, we can derive the fundamental property of estimates upon which the correctness of the resource management system crucially depends.
This property is expressed by Lemma 11.1 below, which states that if an LRM i is owed resources by its parent, then the amount it is owed by its parent is a lower bound on the total instantaneous deficit in the subtree of which i is the root. To express this result formally, we introduce the quantity IBalance(q) where q is an LRM state, defined as follows:

    IBalance(q) ≡ |q(free)| - q(pending).

Whereas the quantity PBalance(q) introduced previously represents the total projected balance of resources at an LRM, after all debts have been paid, the quantity IBalance(q) represents the total instantaneous balance of resources at an LRM, where the amount of indebtedness is not taken into account.

Lemma 11.1 - The following is invariant for the resource manager implementation:

    ⋀_{p∈Clients}(q_p(p_balance) > 0 → q_p(p_balance) ≤ -Σ_{c∈Desc(p)} IBalance(q_c))

Proof - From their definitions, it is easily seen that the quantities PBalance(q) and IBalance(q) are related by the following identity, expressed in the language L(A^LRM) of the LRM event/state algebra:

    PBalance(q) = IBalance(q) + q(p_balance) + Σ_{c∈Children} q(c_balance)(c).

From this identity, a simple induction on the height of a node i ∈ Clients in the tree shows the truth of the following identity for all i ∈ Clients:

    (1) Σ_{j∈Desc(i)} PBalance(q_j) = q_i(p_balance) + Σ_{j∈Desc(i)} IBalance(q_j).

That is, the total projected balance in the subtree of which i is the root is equal to the total instantaneous balance in that subtree, plus the net number of resources promised to be exchanged with the parent of i. The invariance of Owed(q) means that the following is invariant:

    (2) ⋀_{i∈Clients}(q_i(p_balance) > 0 → PBalance(q_i) + Σ_{j∈Children(i)} q_i(c_estim)(j) ≤ 0).

That is, if an LRM i is owed resources by its parent, then it must estimate no surplus of resources in the subtree of which i is the head, based on the estimates it has for each of its children.
The invariance of Neighbor(q) implies that the following is invariant:

    (3) ⋀_{i∈Clients}(∀j∈Children(i))(q_i(c_estim)(j) = q_j(p_estim)).

Substitution of (3) into (2) shows the invariance of

    (4) ⋀_{i∈Clients}(q_i(p_balance) > 0 → PBalance(q_i) + Σ_{j∈Children(i)} q_j(p_estim) ≤ 0).

Using the invariant Optim(q) to substitute for q_j(p_estim) in (4) shows that

    (5) ⋀_{i∈Clients}(q_i(p_balance) > 0 → PBalance(q_i) + Σ_{j∈Children(i)} (PBalance(q_j) + Σ_{k∈Children(j)} q_j(c_estim)(k)) ≤ 0)

is invariant. Repeating this argument to eliminate all occurrences of c_estim yields the invariance of

    (6) ⋀_{i∈Clients}(q_i(p_balance) > 0 → Σ_{j∈Desc(i)} PBalance(q_j) ≤ 0),

which states, intuitively, that if LRM i is owed resources by its parent, then there can be no projected surplus of resources in the subtree of which i is the root. By using (1) to eliminate PBalance in favor of IBalance in (6), we obtain the invariance of

    (7) ⋀_{i∈Clients}(q_i(p_balance) > 0 → q_i(p_balance) + Σ_{j∈Desc(i)} IBalance(q_j) ≤ 0),

which is equivalent to the desired result. ∎

Proof of Maximality

The maximality verification condition is:

    ⊨ (∀q∈States, e∈Events)(Inv^RMI(q) ∧ ⋀_{c∈Clients} Enabled_c(q, e) → Enabled_abs(q, e)).

The proof of this assertion is most easily performed by a case analysis on the event e, making use of the fact that the module specifications define the state-transition relation by precondition/next-state predicate pairs. If e = forward:c, reject:c, down:⟨c, r⟩, or up:⟨c, r⟩, then α^RMI(e) = λ, and hence Enabled_abs(q, e) = true. If e = request:c, then α^RMI(e) = request:c, and hence Enabled_abs(q, e) = true. We are left with the case e = reply:⟨c, r⟩. In this case, we obtain the following from the module specifications:

    Enabled_abs(q, e) ≡ r ∈ q_abs(free) ∧ c ∈ q_abs(pending)
    Enabled_c(q, e) ≡ r ∈ q_c(free) ∧ q_c(pending) > 0
    Enabled_p(q, e) ≡ true, if p ∈ Clients - {c}.

Assume Inv^RMI(q), and hence Abs^RMI(q), holds. Assume further that ⋀_{p∈Clients} Enabled_p(q, e) holds. From Enabled_c(q, e) we know that r ∈ q_c(free) ∧ q_c(pending) > 0 holds.
From this and Abs^RMI(q) we infer that r ∈ q_abs(free) ∧ c ∈ q_abs(pending) holds, as desired.

Proof of Validity

To prove that the validity verification condition holds for the resource manager implementation, we use Corollary 1.5. To apply Corollary 1.5, we must find, for each i, j ∈ Clients + {abs}, a sentence RG_{i,j} of L(A^RMI) such that the following hold:

    (RMI1)(a) Comp^RMI ⊨ (⟦Rely^RM⟧_abs → ⋀_{j∈Clients} RG_{abs,j})
    (RMI1)(b) Comp^RMI ⊨ (⋀_{i∈Clients} RG_{i,abs} → ⟦Guar^RM⟧_abs)
    (RMI2)(a) (root) Comp^RMI ⊨ (⋀_{i∈Clients+{abs}} RG_{i,root} → ⟦Rely^RLRM⟧_root)
              (node) Comp^RMI ⊨ ⋀_{j∈Clients-{root}} (⋀_{i∈Clients+{abs}} RG_{i,j} → ⟦Rely^NLRM⟧_j)
    (RMI2)(b) (root) Comp^RMI ⊨ (⟦Guar^RLRM⟧_root → ⋀_{j∈Clients+{abs}} RG_{root,j})
              (node) Comp^RMI ⊨ ⋀_{i∈Clients-{root}} (⟦Guar^NLRM⟧_i → ⋀_{j∈Clients+{abs}} RG_{i,j})
    (RMI3) Whenever {⟨i_0, i_1⟩, …, ⟨i_{n-1}, i_n⟩}, with i_n = i_0, is a cycle of Clients, then
           Comp^RMI ⊨ ⋁_{k=0}^{n-1} RG_{i_k, i_{k+1}}.

The sentences RG_{i,j} bear a particular relationship, formalized in Lemma 11.2 below, to the various conjuncts appearing in the local resource manager validity conditions. Lemma 11.2 states, in essence, that the local resource manager validity conditions are "localized" versions of the sentences RG_{i,j}. Part (a) of Lemma 11.2 states that the sentence RG_{i,parent(i)} captures exactly what LRM i guarantees to its parent and exactly what LRM parent(i) relies on i to provide. Part (b) states that the sentence RG_{parent(j),j} captures exactly what LRM j relies on its parent to provide, and exactly what LRM parent(j) guarantees to provide to j. Part (c) states that the sentence RG_{abs,root} captures exactly what the root LRM relies on the external environment of the system of LRM's to provide. Part (d) states that the sentence RG_{i,abs} captures exactly what LRM i guarantees to provide to the external environment of the system of LRM's.
The sentences RG_{i,j} are defined as follows:

    RG_{root,abs} ≡ □(Now_root(pending) > 0 → ◊(∃r∈Resources)(δ_root^RMI(Occurs) = reply:r))
    RG_{abs,root} ≡ □(Now_root(p_balance) ≤ 0)

For all i ∈ Clients - {root}:

    RG_{i,abs} ≡ □(Now_i(pending) > 0 → ◊(∃r∈Resources)(δ_i^RMI(Occurs) = reply:r))
    RG_{abs,i} ≡ true.

For all i ∈ Clients - {root}:

    RG_{i,parent(i)} ≡ □(□(Now_i(p_balance) < 0) → ◊((∃r∈Resources)(δ_i^RMI(Occurs) = parent_out:r) ∨ (δ_i^RMI(Occurs) = reject_out)))
    RG_{parent(i),i} ≡ □(□(Now_parent(i)(c_balance)(i) < 0) → ◊(∃r∈Resources)(δ_parent(i)^RMI(Occurs) = child_out:⟨i, r⟩))

For all i, j ∈ Clients such that neither i = parent(j) nor j = parent(i):

    RG_{i,j} ≡ true.

Lemma 11.2 - The following are valid for the resource manager implementation:

    (a) For all i ∈ Clients - {root}, Comp^RMI ⊨ RG_{i,parent(i)} ↔ ⟦Guar_parent^NLRM⟧_i ↔ ⟦Rely_child^LRM(i)⟧_parent(i)
    (b) For all i ∈ Clients - {root}, Comp^RMI ⊨ RG_{parent(i),i} ↔ ⟦Rely_parent^NLRM⟧_i ↔ ⟦Guar_child^LRM(i)⟧_parent(i)
    (c) Comp^RMI ⊨ RG_{abs,root} ↔ ⟦Rely_external^RLRM⟧_root
    (d) For all i ∈ Clients, Comp^RMI ⊨ RG_{i,abs} ↔ ⟦Guar_client^LRM⟧_i.

Proof - Straightforward, using the invariance of Neighbor and the definition of the decomposition map δ^RMI. ∎

Lemma 11.3 - Under the definitions given above for the sentences RG_{i,j}, conditions (RMI1)-(RMI3) hold for the resource manager implementation.

Proof - Assume Comp^RMI. To prove that (RMI1)(a) holds, we must show

    ⟦Rely^RM⟧_abs → ⋀_{j∈Clients} RG_{abs,j}.

Suppose that ⟦Rely^RM⟧_abs holds. It suffices to prove that RG_{abs,root} holds, since RG_{abs,j} = true for all j ∈ Clients - {root}. ⟦Rely^RM⟧_abs is defined by:

    ⟦Rely^RM⟧_abs ≡ □(|Now_abs(free)| > |Now_abs(pending)|).

Using this and the invariance of the predicate Abs^RMI, we infer the truth of

    □(Σ_{i∈Clients} (|Now_i(free)| - Now_i(pending)) > 0),

which is equivalent to

    (A) □(Σ_{i∈Clients} IBalance(Now_i) > 0).

From Lemma 11.1 and the fact that Desc(root) = Clients, we infer that

    □(Now_root(p_balance) > 0 → Now_root(p_balance) ≤ -Σ_{i∈Clients} IBalance(Now_i)).
From this and (A), we conclude that □(Now_root(p_balance) ≤ 0), which is precisely the statement that RG_{abs,root} holds.

To prove (RMI1)(b), we must show

    ⋀_{i∈Clients} RG_{i,abs} → ⟦Guar^RM⟧_abs.

Suppose that ⋀_{i∈Clients} RG_{i,abs} holds. From the definition of RG_{i,abs} we know that

    ⋀_{i∈Clients} □(Now_i(pending) > 0 → ◊(∃r∈Resources)(δ_i^RMI(Occurs) = reply:r))

holds. From the invariance of Abs^RMI and the definition of the abstraction map α^RMI we infer that

    ⋀_{i∈Clients} □(i ∈ Now_abs(pending) → ◊(∃r∈Resources)(α^RMI(Occurs) = reply:⟨i, r⟩))

holds. This is precisely the statement that ⟦Guar^RM⟧_abs holds.

We next prove (RMI2)(a). In case (root), we must show

    (root) ⋀_{i∈Clients+{abs}} RG_{i,root} → ⟦Rely^RLRM⟧_root.

From the root LRM specifications we know that

    ⟦Rely_external^RLRM⟧_root ∧ ⋀_{i∈Children(root)} ⟦Rely_child^LRM(i)⟧_root ↔ ⟦Rely^RLRM⟧_root.

Using Lemma 11.2 (a) and (c) we infer that

    RG_{abs,root} ∧ ⋀_{i∈Children(root)} RG_{i,root} ↔ ⟦Rely^RLRM⟧_root,

which implies formula (root). In case (node) we must show that for all j ∈ Clients - {root},

    (node) ⋀_{i∈Clients+{abs}} RG_{i,j} → ⟦Rely^NLRM⟧_j.

Fix j to be an arbitrary element of Clients - {root}. From the node LRM specifications we know that

    ⟦Rely_parent^NLRM⟧_j ∧ ⋀_{i∈Children(j)} ⟦Rely_child^LRM(i)⟧_j ↔ ⟦Rely^NLRM⟧_j.

Using Lemma 11.2 (a) and (b) we infer that

    RG_{parent(j),j} ∧ ⋀_{i∈Children(j)} RG_{i,j} ↔ ⟦Rely^NLRM⟧_j,

which implies formula (node).

We next show (RMI2)(b). In case (root) we must show

    (root) ⟦Guar^RLRM⟧_root → ⋀_{j∈Clients+{abs}} RG_{root,j}.

From the root LRM specifications we have

    ⟦Guar^RLRM⟧_root ↔ ⟦Guar_client^LRM⟧_root ∧ ⋀_{j∈Children(root)} ⟦Guar_child^LRM(j)⟧_root.

Using Lemma 11.2 (b) and (d), we infer

    ⟦Guar^RLRM⟧_root ↔ RG_{root,abs} ∧ ⋀_{j∈Children(root)} RG_{root,j}.

This implies formula (root), since RG_{root,j} = true unless j = abs or j ∈ Children(root). In case (node) we must show that, for all i ∈ Clients - {root}:

    (node) ⟦Guar^NLRM⟧_i → ⋀_{j∈Clients+{abs}} RG_{i,j}.

Let i be an arbitrary element of Clients - {root}, and assume ⟦Guar^NLRM⟧_i.
From the node LRM specifications we have

    ⟦Guar^NLRM⟧_i ↔ ⟦Guar_client^LRM⟧_i ∧ ⟦Guar_parent^NLRM⟧_i ∧ ⋀_{j∈Children(i)} ⟦Guar_child^LRM(j)⟧_i.

Using Lemma 11.2 (a), (b), and (d), we infer

    ⟦Guar^NLRM⟧_i ↔ RG_{i,abs} ∧ RG_{i,parent(i)} ∧ (∀j∈Children(i)) RG_{i,j}.

This implies formula (node), since RG_{i,j} = true unless j = abs, j = parent(i), or j ∈ Children(i).

To prove (RMI3) it suffices to show that RG_{i,parent(i)} ∨ RG_{parent(i),i} holds for all i ∈ Clients - {root}. This is because every cycle {⟨i_0, i_1⟩, …, ⟨i_{n-1}, i_n⟩} of Clients either contains a link ⟨i_k, i_{k+1}⟩ for which RG_{i_k,i_{k+1}} = true by definition, or else contains both links ⟨i, parent(i)⟩ and ⟨parent(i), i⟩ for some i ∈ Clients - {root}. To show that RG_{i,parent(i)} ∨ RG_{parent(i),i} holds for all i ∈ Clients - {root}, let i be arbitrarily fixed, and suppose that ¬RG_{i,parent(i)} holds, to show that RG_{parent(i),i} holds. By definition of RG_{i,parent(i)} we know that ◊□(Now_i(p_balance) < 0) holds. By the invariance of Neighbor, we infer that ◊□(Now_parent(i)(c_balance)(i) > 0) holds. This implies that RG_{parent(i),i} holds. ∎

11.10 A Message Transmission System

In this section we consider the specification and implementation of a message transmission module TM, whose function is to reliably deliver messages input by one user, called the sender, to another user, called the receiver. Messages should be delivered in the order in which they are sent, and should not be subject to loss or duplication. The message transmission module therefore behaves as a FIFO buffer between the sender and the receiver. The interesting part of this example is how the reliable FIFO buffer behavior of the message transmission module is implemented by a transmission module implementation TMI, which consists, in part, of unreliable transmission line components. This is accomplished through the use of a send protocol module and a receive protocol module, which together implement the alternating bit protocol [Bartlett69].
The alternating bit protocol is a standard example for which correctness proofs in varying styles have been given by other researchers. Most analyses treat only safety properties; however, the proofs given by Hailpern and Owicki [Hailpern80] and Lamport [Lamport83] treat liveness properties of the protocol in addition to safety properties.

The major deficiency of Hailpern and Owicki's treatment is the unstructured and apparently ad hoc nature of the specifications and the correctness proof. It is difficult to discern from their work very much in the way of a general method (with the exception of their use of history variables, which can be seen as a special case of the state-transition approach espoused here) likely to be applicable to other examples. In contrast, the specifications and correctness proof given below are an instance of a general strategy, which is embodied in the state-transition approach to specification, the use of rely- and guarantee-conditions, and the Correctness Theorem.

Of the extant proofs of the correctness of the alternating bit protocol, that of Lamport [Lamport83] is perhaps the most similar to the one given here. The modules are specified in a state-transition style quite like that proposed here. It is possible to identify portions of Lamport's proof that correspond to the proof of invariance of the abstraction relation and implementation invariant given below. The major difference between Lamport's proof and the one given here is in the statement and proof of the liveness (i.e. validity) properties. Lamport's liveness specifications for the send protocol module take the form: "If the send protocol module has an unprocessed message, then it will eventually give a packet containing that message to the unreliable transmission medium for transmission to the receive protocol module;" "If a correct acknowledgement is received by the send protocol module, then eventually the protocol will progress to the next unprocessed message;" etc.
These are "low-level" specifications that can be thought of as essentially a set of assertions that might appear in an assertional proof that a particular program satisfies the specification. In contrast, the specifications given here are of the form: "If the send protocol module can rely on the fact that sufficiently many transmissions of packet p will eventually result in the receipt of an acknowledgement for packet p, then it guarantees eventually to process every message given to it as input." This is a "higher-level" specification that states what the send protocol module accomplishes without detailing a chain of intermediate steps by which it is accomplished.

A feature that distinguishes the proof presented here from previous proofs is that the proof here is to a great extent independent of the precise assumptions on the reliability of the transmission line components. The specifications of the transmission line module are expressed in the form: "If a message m is transmitted according to certain conditions Xmit(m), then eventually m will be delivered according to conditions Dlvr(m)." For concreteness, we use "m is transmitted repeatedly, without intervening transmission of any different message m'" for Xmit(m), and a symmetric condition for Dlvr(m). However, Xmit(m) and Dlvr(m) can easily be replaced with alternative conditions without change to the proof structure.

11.10.1 Specification of the Message Transmission Module

The interface of the message transmission module TM consists of two kinds of events: those of the form TM_in:m, in which message m is presented to the transmission module by the sender, and events of the form TM_out:m, in which message m is delivered by the transmission module to the receiver. We wish to state that the transmission module delivers messages in FIFO order without loss or duplication.
We can think of the state of the transmission module as a sequence of the messages input by the sender but not yet delivered to the receiver. Equivalently, and for our purposes more conveniently, the state of the transmission module can be thought of as a pair of sequences of messages, where inq represents the entire history of messages that have ever been input to the transmission module by the sender, and outq represents the entire history of messages that have ever been output to the receiver by the transmission module. The sequence of messages sent but not yet delivered by the transmission module is represented, in this alternative state set, by the sequence inq-outq.

Based on this selection of state set, let us now derive a precise specification of the transmission module. Let Values be a finite set of message values, given as a parameter. The interface of the transmission module is defined as follows:

$$Events^{TM} = \{\lambda\} + [TM\_in: Values + TM\_out: Values]$$
$$In^{TM} = \{\lambda\} + [TM\_in: Values]$$
$$Out^{TM} = \{\lambda\} + [TM\_out: Values]$$

The state set for the transmission module is defined by:

$$States^{TM} = [inq: Seq[Values] \times outq: Seq[Values]]$$

If $q \in States^{TM}$, then we write $q(inq{-}outq)$ as an abbreviation for $q(inq) - q(outq)$.

In an initial state for the transmission module, the queue is empty.

$$Init^{TM}(q) \equiv q(inq) = q(outq)$$

The state transition relation $Trans^{TM}$ is defined by precondition/next-state predicate pairs as follows:

An input event with message m can occur at any time, and causes message m to be appended to the end of inq.

(TM_in)
$$Pre_{TM\_in}(q, e, m) \equiv e = TM\_in{:}m$$
$$Next_{TM\_in}(q, r, m) \equiv r = q[\,q(inq) \cdot m \,/\, inq\,]$$

An output event with message m can occur only if there is a message that has been sent but not yet delivered, and m is the first such message. The effect is to append m to the end of outq.

(TM_out)
$$Pre_{TM\_out}(q, e, m) \equiv e = TM\_out{:}m \wedge q(outq) < q(inq) \wedge m = (q(inq{-}outq))(0)$$
$$Next_{TM\_out}(q, r, m) \equiv r = q[\,q(outq) \cdot m \,/\, outq\,]$$
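As a rough illustration of the state-transition relation just defined, the following Python sketch models a state as a pair of tuples and implements the two precondition/next-state pairs. The names `TMState`, `pending`, `is_proper_prefix`, `next_tm_in`, `pre_tm_out`, and `next_tm_out` are hypothetical and do not appear in the specification; message values are taken to be strings purely for the sake of the example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TMState:
    """Hypothetical encoding of a state in States^TM as two histories."""
    inq: Tuple[str, ...] = ()   # every message ever input by the sender
    outq: Tuple[str, ...] = ()  # every message ever output to the receiver

    def pending(self) -> Tuple[str, ...]:
        """inq - outq: messages input but not yet delivered."""
        return self.inq[len(self.outq):]


def is_proper_prefix(s: tuple, t: tuple) -> bool:
    """True iff s is a proper prefix of t (written s < t in the text)."""
    return len(s) < len(t) and t[:len(s)] == s


def next_tm_in(q: TMState, m: str) -> TMState:
    """(TM_in): always enabled; appends m to the end of inq."""
    return TMState(q.inq + (m,), q.outq)


def pre_tm_out(q: TMState, m: str) -> bool:
    """(TM_out) precondition: outq < inq and m is the first pending message."""
    return is_proper_prefix(q.outq, q.inq) and q.pending()[0] == m


def next_tm_out(q: TMState, m: str) -> TMState:
    """(TM_out) next state: appends m to the end of outq."""
    if not pre_tm_out(q, m):
        raise ValueError("TM_out:%s is not enabled" % m)
    return TMState(q.inq, q.outq + (m,))


q = TMState()
for msg in ("a", "b"):
    q = next_tm_in(q, msg)
print(pre_tm_out(q, "b"))      # "b" is not the first pending message
q = next_tm_out(q, "a")
print(q.outq, q.pending())
```

Because outputs are enabled only for the first element of inq-outq, every reachable state has outq as a prefix of inq, which is the FIFO, no-loss, no-duplication safety property in this encoding.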
We wish the validity condition for the transmission module to capture the requirement that every message sent is eventually delivered. This is captured by the definition below, which states that, given any prefix s of inq, there is eventually a later time at which s is also a prefix of outq.

$$Valid^{TM} \equiv Rely^{TM} \to Guar^{TM}$$

where

$$Rely^{TM} \equiv true$$
$$Guar^{TM} \equiv \Box(\forall s \in Seq[Values])(s \le Now(inq) \to \Diamond(s \le Now(outq)))$$

11.10.2 Implementation of the Transmission Module

Figure 4 shows the implementation of the transmission module by a send protocol module SP, a receive protocol module RP, a sender-to-receiver transmission line module SRTL, and a receiver-to-sender transmission line module RSTL.

Messages received from the sender by the send protocol module (an "in" event) are placed in a queue for transmission to the receive protocol module. When this queue is nonempty, a packet consisting of the first message in the queue and a current boolean sequence number is transmitted (via a "pkt_out" event) by the send protocol module SP over the transmission line SRTL. In contrast to the reliable transmission module specified in the preceding section, the transmission line module is inherently unreliable, and might lose or duplicate messages. We require, however, that the transmission line not reorder messages. Since messages might be lost, in general it will be necessary for the send protocol module to transmit the same packet a number of times before it is delivered to the receive protocol module. Thus the send protocol module continues to send the packet until an acknowledgement for the sequence number it contains is received (an "ack_in" event) over the transmission line module RSTL. Receipt of a correct acknowledgement by the send protocol module causes the first message to be removed from its queue. In addition, the send protocol module complements its sequence number.
When a packet arrives at the receive protocol module (via a "pkt_in" event), it is checked to see if its sequence number is current. If the sequence number is current, then the message is extracted and placed in a queue of messages to be delivered to the receiver. Also, the sequence number expected by the receive protocol module is complemented. The receive protocol module ignores packets that do not contain the current sequence number. The receive protocol module transmits acknowledgements for the most recently received packet over the transmission line module RSTL (via "ack_out" events). Whenever the queue of messages to be delivered to the receiver is nonempty, a message can be removed and sent to the receiver (via "out" events).

11.10.3 Specification of the Transmission Line Module

The interface of the transmission line module contains events of the form TL_in:m, which correspond to the presentation of message m for transmission, and of the form TL_out:m, which correspond to the delivery of message m to its destination. Thus, the interface of the transmission line module TL is isomorphic to that of the message transmission module. The difference between the two modules lies in the fact that, whereas the transmission module guarantees to deliver each message exactly once, the transmission line module is permitted to lose or duplicate messages any number of times. We require, however, that the transmission line module not reorder messages. Also, we require that repeated input of messages to the transmission line module will eventually cause messages to be delivered.

[Fig. 4. Transmission Module Implementation]

We will use the same state set for the transmission line specification as was used for the transmission module specification. However, the intuitive meanings of the components inq and outq of the state are significantly changed, as are the state-transition relation and validity condition.
For the transmission line module, the sequence inq represents a sequence of messages, each of which is destined to be delivered at least once. However, each message in inq might be delivered more than once. The sequence outq represents the messages in inq, each of which has already had all its copies delivered, and will therefore never be delivered again.

The state transition relation is modified to permit message loss and duplication as follows: The possibility of message loss is captured by the fact that input events are permitted either to produce no state change (corresponding to the loss of the associated message) or to append the message exactly once to the end of inq (indicating that the message is destined to be delivered eventually at least once). The possibility of message duplication is captured by the fact that output events are permitted either to produce no state change (corresponding to the duplication of the message just delivered) or to add the message to outq (corresponding to the delivery of the final copy of the message).

Note that the preceding description is only one of many possible ways of presenting the same transmission line specification. For example, we could have captured the possibility of message loss or duplication by stating that the occurrence of a TL_in:m event causes the message m to be appended k times to inq, where k is a nondeterministically chosen natural number. Occurrence of a TL_out:m event would then be possible only if m is the first element of inq not also in outq, and would cause m to be appended precisely once to outq.

The transmission line specification is an example of an indeterminate specification (see Section 6.2), which means that a single observation can be produced in more than one computation. Although we could give a determinate transmission line specification equivalent to the indeterminate version used here, the use of an indeterminate specification seems more natural.
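The informal description above can be made concrete with a small Python sketch in which each event maps a state to the set of its possible successor states. This is only an illustration under our own assumptions: the names `tl_in_successors`, `tl_out_successors`, and `states_after` are hypothetical, states are encoded as pairs of tuples, and message values are strings.

```python
from typing import Iterable, Set, Tuple

State = Tuple[Tuple[str, ...], Tuple[str, ...]]  # hypothetical encoding: (inq, outq)


def tl_in_successors(q: State, m: str) -> Set[State]:
    """Input is always enabled: either m is lost (no change) or m joins inq."""
    inq, outq = q
    return {q, (inq + (m,), outq)}


def tl_out_successors(q: State, m: str) -> Set[State]:
    """Output is enabled only if m is the first element of inq - outq.

    Either this is a duplicate copy (no change) or the final copy of m
    (m is appended to outq).
    """
    inq, outq = q
    pending = inq[len(outq):]
    enabled = inq[:len(outq)] == outq and bool(pending) and pending[0] == m
    if not enabled:
        return set()
    return {q, (inq, outq + (m,))}


def states_after(events: Iterable[Tuple[str, str]]) -> Set[State]:
    """All states that some computation can reach while producing `events`."""
    current: Set[State] = {((), ())}
    for kind, m in events:
        step = tl_in_successors if kind == "in" else tl_out_successors
        current = {r for q in current for r in step(q, m)}
    return current


# One input of "a" followed by two deliveries of "a" (duplication) is possible,
# and more than one computation produces this same observation.
print(states_after([("in", "a"), ("out", "a"), ("out", "a")]))
# Delivering "b" and then "a" after inputs "a", "b" would reorder messages.
print(states_after([("in", "a"), ("in", "b"), ("out", "b"), ("out", "a")]))
```

An empty result means that no computation of the sketched machine produces the given event sequence. A result with more than one state reflects the indeterminacy mentioned above: the same observation arises from several computations.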
We now make the above informal specification more precise. As in the case of the transmission module specification, let Values be a finite set of message values, given as a parameter. Define the interface of the transmission line module as follows:

$$Events^{TL} = \{\lambda\} + [TL\_in: Values + TL\_out: Values]$$
$$In^{TL} = \{\lambda\} + [TL\_in: Values]$$
$$Out^{TL} = \{\lambda\} + [TL\_out: Values]$$

Define the state set of the transmission line module by:

$$States^{TL} = [inq: Seq[Values] \times outq: Seq[Values]]$$

In an initial state, the transmission line queue is empty.

$$Init^{TL}(q) \equiv q(inq) = q(outq)$$

The state-transition relation $Trans^{TL}$ is defined as follows:

An input event with message m can occur at any time, and either causes no change in state (the message is lost) or appends the message to the end of the queue (the message is destined to be delivered).

(TL_in)
$$Pre_{TL\_in}(q, e, m) \equiv e = TL\_in{:}m$$
$$Next_{TL\_in}(q, r, m) \equiv r = q \vee r = q[\,q(inq) \cdot m \,/\, inq\,]$$

An output event with message m can occur only if there is a message that has been sent but for which the last copy has not yet been delivered, and m is the first such message. The message m is either appended to outq (corresponding to the last copy of m being delivered), or there is no state change (corresponding to the duplication of m).

(TL_out)
$$Pre_{TL\_out}(q, e, m) \equiv e = TL\_out{:}m \wedge q(outq) < q(inq) \wedge m = (q(inq{-}outq))(0)$$
$$Next_{TL\_out}(q, r, m) \equiv r = q \vee r = q[\,q(outq) \cdot m \,/\, outq\,]$$

The validity condition for the transmission line module should express the requirement that, for each message m, if the transmission of m satisfies certain minimal conditions (e.g. that m is transmitted repeatedly, without intervening transmission of other messages), then the transmission line module will ensure that m will eventually be delivered according to certain conditions (e.g. m will eventually be delivered repeatedly, without intervening delivery of other messages).
Formally,

$$Valid^{TL} \equiv Rely^{TL} \to Guar^{TL}$$
$$Rely^{TL} \equiv true$$
$$Guar^{TL} \equiv \Box(\forall m \in Values)(Xmit^{TL}(m) \to \Diamond Dlvr^{TL}(m))$$

where $Xmit^{TL}(m)$ describes the conditions required on the transmission of message m and $Dlvr^{TL}(m)$ describes the corresponding conditions according to which m will be delivered. Aside from the requirement that the resulting specification be consistent, there is a reasonable amount of flexibility in the choice of the conditions $Xmit^{TL}(m)$ and $Dlvr^{TL}(m)$. We will see later that the particular choice of conditions does not significantly affect the proof of correctness of the transmission module implementation, as long as the conditions $Xmit^{TL}(m)$ and $Dlvr^{TL}(m)$ interact properly with corresponding conditions appearing in the specifications for the send and receive protocol modules. For concreteness though, we make the following definitions:

$$Xmit^{TL}(m) \equiv \Box\Diamond(Occurs = TL\_in{:}m) \wedge \Box(\forall m' \in Values)(Occurs = TL\_in{:}m' \to m' = m)$$
$$Dlvr^{TL}(m) \equiv \Box\Diamond(Occurs = TL\_out{:}m) \wedge \Box(\forall m' \in Values)(Occurs = TL\_out{:}m' \to m' = m)$$

Intuitively, the condition $Xmit^{TL}(m)$ states that the message m is transmitted repeatedly, without any transmission of other messages m'. The condition $Dlvr^{TL}(m)$ states that the message m is delivered repeatedly, without any delivery of other messages m'.

11.10.4 Specification of the Send Protocol Module

The send protocol module SP interfaces between the sender and the SRTL and RSTL transmission line modules. Its function is to implement one half of the alternating bit transmission protocol. The interface of the send protocol module consists of three kinds of events: SP_in:m, which represents the receipt of message m from the sender, SP_pkt_out:p, which represents the transmission of packet p over the unreliable transmission line SRTL, and SP_ack_in:b, which represents the receipt of an acknowledgement for sequence number b from the unreliable transmission line RSTL.
The state of the send protocol consists of three components: a sequence inq of all messages that have ever been received from the sender, a sequence outq of all messages that have been acknowledged by the receive protocol module, and a boolean component sn, which records the current sequence number. Choosing outq to be the sequence of acknowledged messages, rather than the sequence of all messages transmitted to the SRTL transmission line, allows us to obtain a simpler correctness proof than that presented by Hailpern and Owicki [Hailpern80]. In that paper, the use of the actual history of messages transmitted requires the correctness proof to define and reason about certain functions whose purpose is essentially to extract the history of acknowledged messages from the history of all transmitted messages.

Informally, the send protocol module behaves as follows: Occurrence of an SP_in:m event causes the message m to be appended to inq. When there is a message to be sent, and processing of all previous messages has been completed, the message is paired with the current sequence number to form a packet p, which is then given to the unreliable transmission line SRTL to be transmitted to the receive protocol module. The send protocol module continues to transmit the packet p until an acknowledgement for its current sequence number arrives over the unreliable transmission line RSTL. When a correct acknowledgement arrives, the message acknowledged is appended to outq, signifying that it has been successfully delivered, and the current sequence number is complemented.

More precisely, let Values be a finite set of message values, given as a parameter. The interface of the send protocol module is defined as follows:

$$Events^{SP} = \{\lambda\} + [SP\_in: Values + SP\_pkt\_out: Pkts + SP\_ack\_in: Bool]$$
$$In^{SP} = \{\lambda\} + [SP\_in: Values + SP\_ack\_in: Bool]$$
$$Out^{SP} = \{\lambda\} + [SP\_pkt\_out: Pkts]$$
$$Pkts = [msg: Values \times sn: Bool]$$

where Pkts is the set of packets.
The state set for the send protocol module is defined by:

$$States^{SP} = [inq: Seq[Values] \times outq: Seq[Values] \times sn: Bool]$$

In an initial state, the queue is empty, and the sequence number is false.

$$Init^{SP}(q) \equiv q(inq) = q(outq) \wedge q(sn) = false$$

The state-transition relation $Trans^{SP}$ is defined as follows:

An SP_in:m event can occur at any time, and causes the message m to be appended to inq.

(SP_in)
$$Pre_{SP\_in}(q, e, m) \equiv e = SP\_in{:}m$$
$$Next_{SP\_in}(q, r, m) \equiv r = q[\,q(inq) \cdot m \,/\, inq\,]$$

An SP_pkt_out:p event can occur only if there is a message that has been received from the sender but not yet successfully transmitted to the receiver, p(msg) is the first such message, and p(sn) is the current sequence number. There is no effect on the state.

(SP_pkt_out)
$$Pre_{SP\_pkt\_out}(q, e, p) \equiv e = SP\_pkt\_out{:}p \wedge q(outq) < q(inq) \wedge p(msg) = (q(inq{-}outq))(0) \wedge p(sn) = q(sn)$$
$$Next_{SP\_pkt\_out}(q, r, p) \equiv r = q$$

An SP_ack_in event for acknowledgement b can occur at any time. If b does not match the current sequence number, or if there is no message currently being transmitted, then there is no change in state. If b does match the current sequence number and there is a message currently being transmitted, then this indicates that the message has been successfully transmitted. In this case, the current message is appended to outq, and the sequence number is complemented.

(SP_ack_in)
$$Pre_{SP\_ack\_in}(q, e, b) \equiv e = SP\_ack\_in{:}b$$
$$Next_{SP\_ack\_in}(q, r, b) \equiv ((q(inq) = q(outq) \vee b \ne q(sn)) \to r = q) \wedge ((q(outq) < q(inq) \wedge b = q(sn)) \to r = q[\,\neg q(sn) / sn,\; q(outq) \cdot q(inq{-}outq)(0) \,/\, outq\,])$$

With the validity condition for the send protocol module, we would like to capture the following: If the send protocol can rely on the fact that repeated transmissions of a packet eventually result in the repeated receipt of acknowledgements for that packet, then it guarantees that every message appearing in inq will eventually also appear in outq.
This requirement is stated in rely-/guarantee-condition form as follows:

$$Valid^{SP} \equiv Rely^{SP} \to Guar^{SP}$$
$$Rely^{SP} \equiv \Box(\forall p \in Pkts)(Xmit^{SP}(p) \to \Diamond Dlvr^{SP}(p(sn)))$$
$$Guar^{SP} \equiv \Box(\forall s \in Seq[Values])(s \le Now(inq) \to \Diamond(s \le Now(outq)))$$

where the formula $Xmit^{SP}(p)$ is the formalization of the statement: "packet p is transmitted repeatedly, without any transmissions of other packets," and the formula $Dlvr^{SP}(b)$ is the formalization of "acknowledgements for sequence number b are received repeatedly, without receipt of any other acknowledgements." These formulas must be defined to be compatible (in a way that is made precise by Lemma 11.6 below) with the formulas $Xmit^{TL}(m)$ and $Dlvr^{TL}(m)$ in the transmission line specification. Thus,

$$Xmit^{SP}(p) \equiv \Box\Diamond(Occurs = SP\_pkt\_out{:}p) \wedge \Box(\forall p' \in Pkts)(Occurs = SP\_pkt\_out{:}p' \to p' = p)$$
$$Dlvr^{SP}(b) \equiv \Box\Diamond(Occurs = SP\_ack\_in{:}b) \wedge \Box(\forall b' \in Bool)(Occurs = SP\_ack\_in{:}b' \to b' = b)$$

11.10.5 Specification of the Receive Protocol Module

The receive protocol module interfaces between the SRTL and RSTL transmission lines, and implements the complementary half of the transmission protocol. It operates as follows:

The state of the receive protocol module consists of two sequences, inq and outq, of messages, and a boolean sequence number sn. The sequence inq records the history of valid messages (with duplications removed) that have been received from the unreliable transmission line SRTL. The sequence outq records the history of messages that have been delivered to the receiver. Initially the sequence number sn in the receive protocol module's state matches the sequence number in the state of the send protocol module. The receiver waits for packets to be delivered by the SRTL transmission line. If a received packet has a sequence number that does not match the current sequence number, then it is ignored. If a received packet has a sequence number that matches the current sequence number, then the message is extracted from the packet and placed at the end of inq.
In addition, the current sequence number is complemented. At any time, the receive protocol module can transmit acknowledgements for the complement of its current sequence number (i.e. for the sequence number of the last valid packet received).

As in the previous specifications, let the finite set Values be given as a parameter. Define the interface of the receive protocol module as follows:

$$Events^{RP} = \{\lambda\} + [RP\_pkt\_in: Pkts + RP\_out: Values + RP\_ack\_out: Bool]$$
$$In^{RP} = \{\lambda\} + [RP\_pkt\_in: Pkts]$$
$$Out^{RP} = \{\lambda\} + [RP\_out: Values + RP\_ack\_out: Bool]$$
$$Pkts = [msg: Values \times sn: Bool]$$

Define the state set by:

$$States^{RP} = [inq: Seq[Values] \times outq: Seq[Values] \times sn: Bool]$$

In an initial state, both queues are empty, and the sequence number is false.

$$Init^{RP}(q) \equiv q(inq) = q(outq) \wedge q(sn) = false$$

The pairs that define the state transition relation $Trans^{RP}$ are given below.

An RP_pkt_in event with packet p can occur at any time. If the sequence number in p does not match the current sequence number, then there is no effect on the state. If the sequence number in p does match the current sequence number, then the message contained in p is appended to inq, and the current sequence number is complemented.

(RP_pkt_in)
$$Pre_{RP\_pkt\_in}(q, e, p) \equiv e = RP\_pkt\_in{:}p$$
$$Next_{RP\_pkt\_in}(q, r, p) \equiv (p(sn) \ne q(sn) \to r = q) \wedge (p(sn) = q(sn) \to r = q[\,\neg q(sn) / sn,\; q(inq) \cdot p(msg) \,/\, inq\,])$$

An RP_ack_out event can occur only for the complement of the current sequence number. There is no effect on the state.

(RP_ack_out)
$$Pre_{RP\_ack\_out}(q, e) \equiv e = RP\_ack\_out{:}(\neg q(sn))$$
$$Next_{RP\_ack\_out}(q, r) \equiv r = q$$

An RP_out event with message m can occur only if there is a message in inq that has not yet appeared in outq, and m is the first such message. The effect is to append m to outq.
(RP_out)
$$Pre_{RP\_out}(q, e, m) \equiv e = RP\_out{:}m \wedge q(outq) < q(inq) \wedge m = (q(inq{-}outq))(0)$$
$$Next_{RP\_out}(q, r, m) \equiv r = q[\,q(outq) \cdot m \,/\, outq\,]$$

The validity condition for the receive protocol module should capture the following two requirements: (1) If packet p is received repeatedly, then eventually acknowledgements for the sequence number contained in that packet will be transmitted repeatedly; and (2) Every message that appears in inq will eventually appear in outq.

Formally,

$$Valid^{RP} \equiv Rely^{RP} \to Guar^{RP}$$
$$Rely^{RP} \equiv true$$
$$Guar^{RP} \equiv \Box(\forall p \in Pkts)(Dlvr^{RP}(p) \to \Diamond Xmit^{RP}(p(sn))) \wedge \Box(\forall s \in Seq[Values])(s \le Now(inq) \to \Diamond(s \le Now(outq)))$$

where, as in the previous specifications, $Dlvr^{RP}(p)$ formalizes the statement, "Packet p is received repeatedly, without any receipt of other packets" and $Xmit^{RP}(b)$ formalizes the statement, "Acknowledgement b is transmitted repeatedly, without any transmission of other acknowledgements." These formulas are defined as follows:

$$Dlvr^{RP}(p) \equiv \Box\Diamond(Occurs = RP\_pkt\_in{:}p) \wedge \Box(\forall p' \in Pkts)(Occurs = RP\_pkt\_in{:}p' \to p' = p)$$
$$Xmit^{RP}(b) \equiv \Box\Diamond(Occurs = RP\_ack\_out{:}b) \wedge \Box(\forall b' \in Bool)(Occurs = RP\_ack\_out{:}b' \to b' = b)$$

11.10.6 The Transmission Module Implementation Algebra

In this section we define the transmission module implementation algebra $A^{TMI}$. Let the finite set Msgs of message values be given as a parameter. Define Pkts = [msg: Msgs × sn: Bool]. The index set for the interconnection is the set {SP, RP, SRTL, RSTL}, corresponding to the send protocol, receive protocol, send-protocol-to-receive-protocol transmission line, and receive-protocol-to-send-protocol transmission line component modules. Define the embedded algebras $A_{abs}$, $A_{SP}$, $A_{RP}$, $A_{SRTL}$, and $A_{RSTL}$ as follows:

$A_{abs}$: is the message transmission module event/state algebra $A^{TM}$, with the parameter set Values instantiated as the set Msgs.

$A_{SP}$: is the send protocol module event/state algebra $A^{SP}$, with parameter Values instantiated as the set Msgs.
$A_{RP}$: is the receive protocol module event/state algebra $A^{RP}$, with parameter Values instantiated as the set Msgs.

$A_{SRTL}$: is the transmission line module event/state algebra $A^{TL}$, with parameter Values instantiated as the set Pkts.

$A_{RSTL}$: is the transmission line module event/state algebra $A^{TL}$, with parameter Values instantiated as the set Bool.

Let the composite interface for the transmission module interconnection be defined as follows:

$$Events^{TMI} = \{\lambda\} + [in: Msgs + out: Msgs + pkt\_out: Pkts + pkt\_in: Pkts + ack\_out: Bool + ack\_in: Bool]$$
$$In^{TMI} = \{\lambda\} + [in: Msgs]$$
$$Out^{TMI} = \{\lambda\} + (Events^{TMI} - In^{TMI})$$

Intuitively, events in:m and out:m represent, respectively, the receipt of message m from the sender and the delivery of message m to the receiver. Events pkt_out:p and pkt_in:p represent, respectively, the presentation of packet p by the send protocol module to the SRTL transmission line and the receipt of packet p by the receive protocol module from the SRTL transmission line. Events ack_out:b and ack_in:b represent, respectively, the presentation of acknowledgement b by the receive protocol module to the RSTL transmission line and the receipt of acknowledgement b by the send protocol module from the RSTL transmission line.

Define the abstraction map $\alpha^{TMI}$ and the decomposition map $\Delta^{TMI}$, with components $\delta^{TMI}_{SP}$, $\delta^{TMI}_{RP}$, $\delta^{TMI}_{SRTL}$, and $\delta^{TMI}_{RSTL}$, as follows:

$$\alpha^{TMI}(e) = \begin{cases} TM\_in{:}m & \text{if } e = in{:}m \\ TM\_out{:}m & \text{if } e = out{:}m \\ \lambda & \text{otherwise} \end{cases}$$

$$\delta^{TMI}_{SP}(e) = \begin{cases} SP\_in{:}m & \text{if } e = in{:}m \\ SP\_pkt\_out{:}p & \text{if } e = pkt\_out{:}p \\ SP\_ack\_in{:}b & \text{if } e = ack\_in{:}b \\ \lambda & \text{otherwise} \end{cases}$$

$$\delta^{TMI}_{RP}(e) = \begin{cases} RP\_out{:}m & \text{if } e = out{:}m \\ RP\_pkt\_in{:}p & \text{if } e = pkt\_in{:}p \\ RP\_ack\_out{:}b & \text{if } e = ack\_out{:}b \\ \lambda & \text{otherwise} \end{cases}$$

$$\delta^{TMI}_{SRTL}(e) = \begin{cases} TL\_in{:}p & \text{if } e = pkt\_out{:}p \\ TL\_out{:}p & \text{if } e = pkt\_in{:}p \\ \lambda & \text{otherwise} \end{cases}$$

$$\delta^{TMI}_{RSTL}(e) = \begin{cases} TL\_in{:}b & \text{if } e = ack\_out{:}b \\ TL\_out{:}b & \text{if } e = ack\_in{:}b \\ \lambda & \text{otherwise} \end{cases}$$

11.10.7 Proof of Correctness

In this section we prove the correctness of the implementation $(A^{TMI}, S_{abs}, \langle S_i \rangle_{i \in \{SP, RP, RSTL, SRTL\}})$, where $S_{abs}$ is defined by the transmission module specification, and $S_{SP}$, $S_{RP}$, $S_{RSTL}$, and $S_{SRTL}$ are defined by the send protocol, receive protocol, and transmission line specifications given above, respectively.
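Before turning to the proof, it may help to view the abstraction and decomposition maps of Section 11.10.6 as simple relabelings of composite events. The Python sketch below is an illustration only: events are encoded as (kind, value) pairs, `None` stands for the null event λ, and the names `relabel`, `alpha`, `delta`, and `project` are hypothetical.

```python
LAMBDA = None  # stands for the null event λ

# Event-kind relabelings, transcribed from the maps of Section 11.10.6.
ABSTRACTION = {"in": "TM_in", "out": "TM_out"}
DECOMPOSITION = {
    "SP":   {"in": "SP_in", "pkt_out": "SP_pkt_out", "ack_in": "SP_ack_in"},
    "RP":   {"out": "RP_out", "pkt_in": "RP_pkt_in", "ack_out": "RP_ack_out"},
    "SRTL": {"pkt_out": "TL_in", "pkt_in": "TL_out"},
    "RSTL": {"ack_out": "TL_in", "ack_in": "TL_out"},
}


def relabel(table, event):
    """Map a composite event to its image under `table`, or to λ."""
    if event is LAMBDA:
        return LAMBDA
    kind, value = event
    name = table.get(kind)
    return LAMBDA if name is None else (name, value)


def alpha(event):
    """Abstraction map: what the abstract transmission module sees."""
    return relabel(ABSTRACTION, event)


def delta(component, event):
    """Component of the decomposition map for one of SP, RP, SRTL, RSTL."""
    return relabel(DECOMPOSITION[component], event)


def project(component, events):
    """A component's view of a composite event sequence, with λ dropped."""
    images = (delta(component, e) for e in events)
    return [d for d in images if d is not LAMBDA]


# One loss-free round of the protocol for a single message "x".
run = [("in", "x"), ("pkt_out", ("x", False)), ("pkt_in", ("x", False)),
       ("out", "x"), ("ack_out", False), ("ack_in", False)]
print([alpha(e) for e in run if alpha(e) is not LAMBDA])
print(project("SRTL", run))
print(project("RSTL", run))
```

In this encoding every composite event other than λ is visible to exactly two modules: in and out events are shared by the abstract module and one protocol module, and each packet or acknowledgement event is shared by one protocol module and one transmission line.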
Invariance

The correctness of the transmission module implementation depends only on the invariance of the following:

(1) $q_{abs}(inq) = q_{SP}(inq) \wedge q_{abs}(outq) = q_{RP}(outq)$

(2) $q_{SP}(outq) \le q_{RP}(inq) \le q_{SP}(inq)$.

Condition (1) is the abstraction relation $Abs^{TMI}(q)$, and states that the abstract transmission module's inq is identical to the inq for the send protocol module, and that the abstract transmission module's outq is identical to the outq for the receive protocol module. Condition (2) is Lemma 11.4 below, and says that the receive protocol module's inq is always an extension of the send protocol module's outq and a prefix of the send protocol module's inq.

Condition (2) is not inductive as stated, and must be strengthened to permit an inductive proof of invariance. We therefore define the implementation invariant $Inv^{TMI}(q)$ by

$$Inv^{TMI}(q) \equiv Rep^{TMI}(q) \wedge Abs^{TMI}(q)$$

where $Rep^{TMI}(q)$ is the representation invariant and $Abs^{TMI}(q)$ is the abstraction relation. The abstraction relation is:

$$Abs^{TMI}(q) \equiv q_{abs}(inq) = q_{SP}(inq) \wedge q_{abs}(outq) = q_{RP}(outq)$$

The representation invariant $Rep^{TMI}(q)$ is defined as follows:

$$Rep^{TMI}(q) \equiv Queue(q) \wedge (Start(q) \vee Send(q) \vee Flip(q) \vee Ack(q))$$

where

$$Queue(q) \equiv q_{SP}(outq) \le q_{SP}(inq) \wedge q_{RP}(outq) \le q_{RP}(inq)$$

and the formal definitions of Start, Send, Flip, and Ack will be given below.

This invariant says that, at any instant of time, the histories inq and outq for the send and receive protocol modules satisfy certain prefix relationships captured by the predicate Queue. In addition, the transmission system is always in one of four kinds of states, corresponding to the four predicates Start(q), Send(q), Flip(q), and Ack(q). The situations covered by these four predicates, and how they evolve during execution, will now be described.
In a state that satisfies Start, the send and receive protocol modules have the same sequence number, the send protocol module's outq and the receive protocol module's inq are identical, and no new packets or acknowledgements are currently in transit over the transmission lines. The predicate Start is satisfied by all initial states. States satisfying Start give rise to states satisfying Send when there is an unprocessed message at the send protocol module that has been output to (but possibly lost by) the transmission line SRTL.

In a state that satisfies Send, the send and receive protocol modules have the same current sequence number, the outq of the send protocol module and the inq of the receive protocol module are identical, there is an unprocessed message at the send protocol module, there may be packets containing this message in transit over the transmission line SRTL, and there are no new acknowledgements in transit over RSTL. States satisfying Send give rise to states satisfying Flip when the first packet containing an unprocessed message arrives at the receive protocol module.

In a state that satisfies Flip, the send and receive protocol modules have complementary current sequence numbers, the inq of the receive protocol module is equal to the outq of the send protocol module with the newly arrived message appended, and all packets in transit over SRTL or acknowledgements in transit over RSTL are old in the sense that they are for a sequence number that is not the one currently expected by the send protocol module. States satisfying Flip give rise to states satisfying Ack when the first acknowledgement for the newly arrived packet is transmitted over RSTL.
In a state satisfying Ack, the send and receive protocol modules have complementary current sequence numbers, the inq of the receive protocol module is equal to the outq of the send protocol module with the still-unacknowledged message appended, all packets in transit over SRTL are old, but there may be new acknowledgements in transit over RSTL. To complete the cycle, states satisfying Ack give rise to states satisfying Start when the first new acknowledgement is received by the send protocol module.

For the formal statement of these predicates, it is convenient to define some auxiliary predicates, which describe possible states of the transmission lines SRTL and RSTL.

The predicate SRTL_old is true of a state iff all packets in the SRTL transmission line are old, in the sense that they are for the opposite sequence number than the one currently expected by the receive protocol module.

$$SRTL\_old(q) \equiv (\forall n < |q_{SRTL}(inq{-}outq)|)(q_{SRTL}(inq{-}outq)(n)(sn) \ne q_{RP}(sn))$$

Similarly, the predicate RSTL_old is true of a state iff all acknowledgements in the RSTL transmission line are old, in the sense that they are for the opposite sequence number than the one expected by the send protocol module.

$$RSTL\_old(q) \equiv (\forall n < |q_{RSTL}(inq{-}outq)|)(q_{RSTL}(inq{-}outq)(n) \ne q_{SP}(sn))$$

The predicate SRTL_new is true of a state iff the SRTL transmission line queue consists of a (possibly empty) sequence of old packets, followed by a (possibly empty) sequence of new packets, each of which contains the first unprocessed message held by the send protocol module.

$$SRTL\_new(q) \equiv (\exists m \le |q_{SRTL}(inq{-}outq)|)(\forall n < |q_{SRTL}(inq{-}outq)|)((n < m \to q_{SRTL}(inq{-}outq)(n)(sn) \ne q_{RP}(sn)) \wedge (n \ge m \to q_{SRTL}(inq{-}outq)(n) = \langle q_{SP}(inq{-}outq)(0),\, q_{SP}(sn) \rangle))$$

Similarly, the predicate RSTL_new is true of a state iff the RSTL transmission line queue currently consists of a (possibly empty) sequence of old acknowledgements, followed by a (possibly empty) sequence of new acknowledgements.
$$RSTL\_new(q) \equiv (\exists m \le |q_{RSTL}(inq{-}outq)|)(\forall n < |q_{RSTL}(inq{-}outq)|)((n < m \to q_{RSTL}(inq{-}outq)(n) \ne q_{SP}(sn)) \wedge (n \ge m \to q_{RSTL}(inq{-}outq)(n) = q_{SP}(sn)))$$

The formal definitions of the predicates Start, Send, Flip, and Ack are as follows:

$$Start(q) \equiv q_{SP}(sn) = q_{RP}(sn) \wedge q_{SP}(outq) = q_{RP}(inq) \wedge SRTL\_old(q) \wedge RSTL\_old(q)$$

$$Send(q) \equiv q_{SP}(sn) = q_{RP}(sn) \wedge q_{SP}(outq) < q_{SP}(inq) \wedge q_{SP}(outq) = q_{RP}(inq) \wedge SRTL\_new(q) \wedge RSTL\_old(q)$$

$$Flip(q) \equiv q_{SP}(sn) \ne q_{RP}(sn) \wedge q_{SP}(outq) < q_{SP}(inq) \wedge q_{SP}(outq) \cdot q_{SP}(inq{-}outq)(0) = q_{RP}(inq) \wedge SRTL\_old(q) \wedge RSTL\_old(q)$$

$$Ack(q) \equiv q_{SP}(sn) \ne q_{RP}(sn) \wedge q_{SP}(outq) < q_{SP}(inq) \wedge q_{SP}(outq) \cdot q_{SP}(inq{-}outq)(0) = q_{RP}(inq) \wedge SRTL\_old(q) \wedge RSTL\_new(q)$$

We now consider the proof that $Inv^{TMI}(q)$ is invariant.

(Basis): $\models (\forall q \in States^{TMI})(Init^{TMI}(q) \to Inv^{TMI}(q))$.

If q is an initial state then all queues are empty and the sn components of the state of both the send protocol module and the receive protocol module have value false. It is easily verified from this that $Abs^{TMI}(q) \wedge Queue(q) \wedge Start(q)$ holds.

(Induction): $\models (\forall q, r \in States^{TMI}, e \in Events^{TMI})(Trans^{TMI}(q, e, r) \to (Inv^{TMI}(q) \to Inv^{TMI}(r)))$.

Suppose that $Inv^{TMI}(q)$ holds and that $Trans^{TMI}(q, e, r)$ holds.

We first examine the problem of showing that $Abs^{TMI}(r)$ holds. $Abs^{TMI}(r)$ is easily seen to be true, since the only events that affect components of the state upon which $Abs^{TMI}$ depends are the events in:m and out:m. Comparison of the definitions of $Trans^{TM}$, $Trans^{SP}$, and $Trans^{RP}$ shows that the events in:m and out:m maintain the desired correspondence between the abstract module state and the states of the send and receive protocol modules.

To see that Queue(r) holds, note that the definitions of $Trans^{SP}$ and $Trans^{RP}$ imply that the inq and outq components of the states of the send and receive protocol modules can only change in one of the following two ways:

- A new message is appended to the end of inq.
- The first element of inq-outq is appended to the end of outq.
Neither of these two kinds of changes can cause outq not to be a prefix of inq, and thus Queue(r) must hold.

To show that $Start(r) \vee Send(r) \vee Flip(r) \vee Ack(r)$ holds, we claim that all events preserve the truth of the predicates Start, Send, Flip, and Ack, except in the following cases:

- If Start(q) is true and e = pkt_out:p, then Send(r) is true.
- If Send(q) is true and e = pkt_in:p, with $p(sn) = q_{RP}(sn)$, then Flip(r) is true.
- If Flip(q) is true and e = ack_out:b, then Ack(r) is true.
- If Ack(q) is true and e = ack_in:b, with $b = q_{SP}(sn)$, then Start(r) is true.

It is a straightforward but tedious process to verify the truth of this claim by exhaustive case analysis.

The following consequence of the invariance of $Inv^{TMI}(q)$ is the crucial fact used in the maximality and validity proofs below.

Lemma 11.4 - The following are invariant for the transmission module implementation:

(a) $q_{SP}(outq) \le q_{RP}(inq)$

(b) $q_{RP}(inq) \le q_{SP}(inq)$

Proof - The invariance of $Start(q) \vee Send(q) \vee Flip(q) \vee Ack(q)$ implies the invariance of

(1) $q_{RP}(inq) = q_{SP}(outq) \vee (q_{SP}(inq) > q_{SP}(outq) \wedge q_{RP}(inq) = q_{SP}(outq) \cdot q_{SP}(inq{-}outq)(0))$.

Suppose the first disjunct of (1) holds, that is, $q_{SP}(outq) = q_{RP}(inq)$. Then (a) is immediate. The invariance of Queue(q) implies that $q_{SP}(outq) \le q_{SP}(inq)$, thus yielding (b).

Now suppose that the second disjunct of (1) holds. It is a fact about finite sequences that if s, s' are finite sequences, and $s > s'$, then $s \ge s' \cdot m$, where $m = (s - s')(0)$. This fact permits us to conclude, from the second disjunct of (1), that

$$q_{SP}(inq) \ge q_{SP}(outq) \cdot q_{SP}(inq{-}outq)(0) = q_{RP}(inq) > q_{SP}(outq),$$

yielding (a) and (b). ∎

Maximality

The maximality verification condition is:

$$\models (\forall q \in States^{TMI}, e \in Events^{TMI})(Inv^{TMI}(q) \wedge Enabled_{SP}(q, e) \wedge Enabled_{RP}(q, e) \wedge Enabled_{SRTL}(q, e) \wedge Enabled_{RSTL}(q, e) \to Enabled_{abs}(q, e))$$

Examination of the definition of $Trans^{TM}$ shows that $Enabled_{abs}(q, e)$ is identically true unless e = out:m.
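Thus it suffices to show that, for all $q \in \mathrm{States}^{\mathrm{TMI}}$ and all $m \in \mathrm{Msgs}$, if $\mathrm{Inv}^{\mathrm{TMI}}(q)$ and $\mathrm{Enabled}_{\mathrm{RP}}(q, \text{out:}m)$ hold, then $\mathrm{Enabled}_{\mathrm{abs}}(q, \text{out:}m)$ holds as well.

Suppose now that $\mathrm{Inv}^{\mathrm{TMI}}(q)$ and $\mathrm{Enabled}_{\mathrm{RP}}(q, \text{out:}m)$ hold. It suffices to show that

(1) $q_{\mathrm{SP}}(\mathit{inq}) > q_{\mathrm{RP}}(\mathit{outq})$

holds, for then the assumption that $\mathrm{Inv}^{\mathrm{TMI}}(q)$ (and hence $\mathrm{Abs}^{\mathrm{TMI}}(q)$) holds implies that $q_{\mathrm{abs}}(\mathit{inq}) > q_{\mathrm{abs}}(\mathit{outq})$ holds, which in turn implies that $\mathrm{Enabled}_{\mathrm{abs}}(q, \text{out:}m)$ holds.

By definition of $\mathrm{Enabled}_{\mathrm{RP}}(q, \text{out:}m)$, we know that

(2) $q_{\mathrm{RP}}(\mathit{outq}) < q_{\mathrm{RP}}(\mathit{inq}) \wedge q_{\mathrm{RP}}(\mathit{inq}-\mathit{outq})(0) = m$

holds. Informally, if e = out:m, then $m$ must be the first message in the receive protocol module's $\mathit{inq}$ that has not yet been transferred to its $\mathit{outq}$. The truth of (1) follows from (2) and Lemma II.4 (b). ∎

Validity

To prove that the validity verification condition holds for the transmission module implementation, we use Corollary 1.4. We use the well-founded partial ordering $<$ on the set {SP, RP, RSTL, SRTL} that includes exactly the pairs SRTL < SP, RP < SP, and RSTL < SP. Under this ordering, hypotheses (1) and (2) of Corollary 1.4 are as follows:

(TMI1) $\mathrm{Comp}^{\mathrm{TMI}} \models \llbracket \mathrm{Guar}^{\mathrm{SP}} \rrbracket_{\mathrm{SP}} \wedge \llbracket \mathrm{Guar}^{\mathrm{RP}} \rrbracket_{\mathrm{RP}} \wedge \llbracket \mathrm{Guar}^{\mathrm{TL}} \rrbracket_{\mathrm{SRTL}} \wedge \llbracket \mathrm{Guar}^{\mathrm{TL}} \rrbracket_{\mathrm{RSTL}} \rightarrow \llbracket \mathrm{Guar}^{\mathrm{TM}} \rrbracket_{\mathrm{abs}}$

(TMI2) $\mathrm{Comp}^{\mathrm{TMI}} \models \llbracket \mathrm{Guar}^{\mathrm{TL}} \rrbracket_{\mathrm{SRTL}} \wedge \llbracket \mathrm{Guar}^{\mathrm{RP}} \rrbracket_{\mathrm{RP}} \wedge \llbracket \mathrm{Guar}^{\mathrm{TL}} \rrbracket_{\mathrm{RSTL}} \rightarrow \llbracket \mathrm{Rely}^{\mathrm{SP}} \rrbracket_{\mathrm{SP}}$.

These two conditions capture abstractly the important relationships between the validity conditions of the various modules. We now prove that (TMI1) and (TMI2) are consequences of the module specifications.

Lemma II.5 - Condition (TMI1) holds for the transmission module implementation.

Proof - Assume $\mathrm{Comp}^{\mathrm{TMI}}$, $\llbracket \mathrm{Guar}^{\mathrm{SP}} \rrbracket_{\mathrm{SP}}$, and $\llbracket \mathrm{Guar}^{\mathrm{RP}} \rrbracket_{\mathrm{RP}}$. Using the definition of $\llbracket \mathrm{Guar}^{\mathrm{SP}} \rrbracket_{\mathrm{SP}}$, we have

$$\Box(\forall s \in \mathrm{Seq}[\mathrm{Msgs}])(s \le \mathrm{Now}_{\mathrm{SP}}(\mathit{inq}) \rightarrow \Diamond(s \le \mathrm{Now}_{\mathrm{SP}}(\mathit{outq}))).$$

From this and Lemma II.4 (a), we obtain

$$\Box(\forall s \in \mathrm{Seq}[\mathrm{Msgs}])(s \le \mathrm{Now}_{\mathrm{SP}}(\mathit{inq}) \rightarrow \Diamond(s \le \mathrm{Now}_{\mathrm{RP}}(\mathit{inq}))).$$

Using the assumption $\llbracket \mathrm{Guar}^{\mathrm{RP}} \rrbracket_{\mathrm{RP}}$ gives

$$\Box(\forall s \in \mathrm{Seq}[\mathrm{Msgs}])(s \le \mathrm{Now}_{\mathrm{SP}}(\mathit{inq}) \rightarrow \Diamond(s \le \mathrm{Now}_{\mathrm{RP}}(\mathit{outq}))).$$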
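From an application of the invariance of $\mathrm{Abs}^{\mathrm{TMI}}$, we conclude

$$\Box(\forall s \in \mathrm{Seq}[\mathrm{Msgs}])(s \le \mathrm{Now}_{\mathrm{abs}}(\mathit{inq}) \rightarrow \Diamond(s \le \mathrm{Now}_{\mathrm{abs}}(\mathit{outq}))).$$ ∎

The proof of condition (TMI2) makes use of the following lemma, which expresses the principle that guided our choices for the definitions of the various Xmit and Dlvr formulas in the specifications above.

Lemma II.6 - The following hold for the transmission module implementation:

$$\models \llbracket \mathrm{Xmit}^{\mathrm{SP}}(p) \rrbracket_{\mathrm{SP}} \leftrightarrow \llbracket \mathrm{Xmit}^{\mathrm{TL}}(p) \rrbracket_{\mathrm{SRTL}}$$

$$\models \llbracket \mathrm{Dlvr}^{\mathrm{SP}}(b) \rrbracket_{\mathrm{SP}} \leftrightarrow \llbracket \mathrm{Dlvr}^{\mathrm{TL}}(b) \rrbracket_{\mathrm{RSTL}}$$

$$\models \llbracket \mathrm{Xmit}^{\mathrm{RP}}(b) \rrbracket_{\mathrm{RP}} \leftrightarrow \llbracket \mathrm{Xmit}^{\mathrm{TL}}(b) \rrbracket_{\mathrm{RSTL}}$$

$$\models \llbracket \mathrm{Dlvr}^{\mathrm{RP}}(p) \rrbracket_{\mathrm{RP}} \leftrightarrow \llbracket \mathrm{Dlvr}^{\mathrm{TL}}(p) \rrbracket_{\mathrm{SRTL}}.$$

Proof - Straightforward from the module specifications and the definition of the decomposition map for the transmission module implementation. ∎

Lemma II.7 - Condition (TMI2) holds for the transmission module implementation.

Proof - Suppose that $\mathrm{Comp}^{\mathrm{TMI}}$, $\llbracket \mathrm{Valid}^{\mathrm{TL}} \rrbracket_{\mathrm{SRTL}}$, $\llbracket \mathrm{Valid}^{\mathrm{RP}} \rrbracket_{\mathrm{RP}}$, and $\llbracket \mathrm{Valid}^{\mathrm{TL}} \rrbracket_{\mathrm{RSTL}}$ hold. Suppose, to obtain a contradiction, that $\neg\llbracket \mathrm{Rely}^{\mathrm{SP}} \rrbracket_{\mathrm{SP}}$ holds. From the definition of $\llbracket \mathrm{Rely}^{\mathrm{SP}} \rrbracket_{\mathrm{SP}}$, we know that

(1) $\Diamond(\exists p \in \mathrm{Pkts})(\llbracket \mathrm{Xmit}^{\mathrm{SP}}(p) \rrbracket_{\mathrm{SP}} \wedge \Box\neg\llbracket \mathrm{Dlvr}^{\mathrm{SP}}(p(\mathit{sn})) \rrbracket_{\mathrm{SP}})$

holds. That is, eventually a point is reached after which the packet $p$ is transmitted infinitely often, without intervening transmission of other packets, but infinitely many acknowledgements for the sequence number contained in $p$ are not received by the send protocol module. From (1) and Lemma II.6 we infer

(2) $\Diamond(\exists p \in \mathrm{Pkts})(\llbracket \mathrm{Xmit}^{\mathrm{TL}}(p) \rrbracket_{\mathrm{SRTL}} \wedge \Box\neg\llbracket \mathrm{Dlvr}^{\mathrm{SP}}(p(\mathit{sn})) \rrbracket_{\mathrm{SP}})$.

From (2) and the assumption that $\llbracket \mathrm{Valid}^{\mathrm{TL}} \rrbracket_{\mathrm{SRTL}}$ holds, we deduce

(3) $\Diamond(\exists p \in \mathrm{Pkts})(\llbracket \mathrm{Dlvr}^{\mathrm{TL}}(p) \rrbracket_{\mathrm{SRTL}} \wedge \Box\neg\llbracket \mathrm{Dlvr}^{\mathrm{SP}}(p(\mathit{sn})) \rrbracket_{\mathrm{SP}})$.

That is, packet $p$ is delivered infinitely often to the receive protocol module, without intervening delivery of other packets, but infinitely many acknowledgements for the sequence number contained in $p$ are not received by the send protocol module. From (3), another application of Lemma II.6 shows

(4) $\Diamond(\exists p \in \mathrm{Pkts})(\llbracket \mathrm{Dlvr}^{\mathrm{RP}}(p) \rrbracket_{\mathrm{RP}} \wedge \Box\neg\llbracket \mathrm{Dlvr}^{\mathrm{SP}}(p(\mathit{sn})) \rrbracket_{\mathrm{SP}})$.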
From this, an application of $\llbracket \mathrm{Valid}^{\mathrm{RP}} \rrbracket_{\mathrm{RP}}$ shows

(5) $\Diamond(\exists p \in \mathrm{Pkts})(\llbracket \mathrm{Xmit}^{\mathrm{RP}}(p(\mathit{sn})) \rrbracket_{\mathrm{RP}} \wedge \Box\neg\llbracket \mathrm{Dlvr}^{\mathrm{SP}}(p(\mathit{sn})) \rrbracket_{\mathrm{SP}})$.

That is, an acknowledgement for the sequence number contained in packet $p$ is repeatedly transmitted by the receive protocol module, but infinitely many acknowledgements for $p$ are not received by the send protocol module. Applying Lemma II.6, $\llbracket \mathrm{Valid}^{\mathrm{TL}} \rrbracket_{\mathrm{RSTL}}$, and Lemma II.6 again shows that

(6) $\Diamond(\exists p \in \mathrm{Pkts})(\llbracket \mathrm{Dlvr}^{\mathrm{SP}}(p(\mathit{sn})) \rrbracket_{\mathrm{SP}} \wedge \Box\neg\llbracket \mathrm{Dlvr}^{\mathrm{SP}}(p(\mathit{sn})) \rrbracket_{\mathrm{SP}})$.

This is a contradiction, and we conclude that (TMI2) must hold. ∎

Appendix III - Index of Definitions

abstraction map
    $\mathcal{D}$-abstraction map
    I/O-abstraction map
abstraction operator
asynchronous
asynchronous product
behavior
    $\mathcal{D}$-behavior
    I/O-behavior
    primitive behavior
canonical projection
coherent
Completeness Theorem
compatible coupling property
composite machine
composition operator
computation
    valid computation
consistent
    $\mathcal{D}$-consistent
    I/O-consistent
    locally $\mathcal{D}$-consistent
correct
    $\mathcal{D}$-correct
Correctness Theorem
cycle
decomposition map
    $\mathcal{D}$-decomposition map
    canonical decomposition map
    I/O-decomposition map
determinate
    quasi-determinate
embedding
enabled
event
    input event
    null event
    output event
event/state algebra
evolutionary
explicit
fair
future
history
history skeleton
I/O-system
implementation
    $\mathcal{D}$-implementation
implementation algebra
implementation invariant
Induction Principle
inductive
input-cooperative
interconnection
    $\mathcal{D}$-interconnection
    embedded interconnection
interface
    abstract interface
    component interface
    composite interface
    $\mathcal{D}$-interface
    I/O-interface
    system interface
initial state set
invariant
isomorphic
maximality condition
machine
    embedded machine
    I/O-machine
    PS-machine
    system machine
null step
observation
orthogonal
possibilities mapping
preserves
    strictly preserves
productive step set
reachable
regular
rely-/guarantee-conditions
repeatedly
runs
satisfies
skeletal sequence
spans
specification
    state-transition specification
    subset specification
specification domain
specification language
state function
state-transition relation
translation
Translation Lemma
truncation
truncation-closed
validity condition