Notes on event-driven analysis

This booklet is published under the terms of the licence summarized in footnote 1.

This booklet serves as a general introduction to two more substantial volumes on entity modeling and event modeling, which contain more specific analysis patterns and analysis questions. If you feel the lack of diagrams in this booklet, you will be more than compensated by the number of diagrams in later booklets.

System principles

On system granularity and the analyst

The required process hierarchy

On use cases

The transaction principle

Discrete event modeling principles

The refactoring principle

Reuse in SOA and CBD (reprise)

Componentisation and distribution

Do enterprise packages mean the end of analysis?

 

 

System principles

Software engineering is underpinned by many theories, principles and paradigms. The Agile paradigm is one. There are also: Kleene’s theorem (about data flow structures); Bohm and Jacopini’s principle (about process structures); relational theory (about data store structures and queries); type theory (about operations on types); the state machine paradigm (about events and states); the component paradigm (about encapsulating a component behind an interface); and the OO paradigm (the component paradigm plus inheritance and polymorphism). I believe that systems theory is the place to start. So this first chapter presents some basic system theory and adds some principles on top.

What is a system? A dictionary says: “A group of interacting, interrelated, or interdependent elements forming a complex whole.” Systems theory has more to say than that. It is interpreted for the purposes of this booklet as follows.

The boundary principle

A system is a transformation process. It transforms inputs into outputs. The only way the system can make an impact on its environment is via inputs and outputs. The only way a system’s function or performance can be tested is by inputs and outputs.

A system is a bounded transformation process. It can be represented in a ‘context’ or ‘black box’ diagram. You can take a black-box view of any man-made system. Outside the box boundary are actors, who supply inputs to the system and consume outputs from it.

The reason we make a system is to deliver some kind of result or output. Outputs are the yield or value of the system. They should meet goals or provide benefits. The objectives of and requirements for a system are best defined in terms of outputs of value to those in the world outside the system. Input acquisition is a cost paid to those ends.

The need for resources

A resource is something a system needs to process inputs. A resource remains available, after an input has been consumed and processed, for processing the next input. A resource changes over time, however slowly, and has to be maintained.

Inputs and resources are overlapping ideas. Generally speaking: inputs are transformed and consumed, whereas resources are used repeatedly and gradually wear out. Inputs are used up in a short cycle time, whereas resources are maintained over a longer cycle.

So, a system is a set of processes that transform inputs into outputs, using and maintaining resources to do this.

The start up problem

On system creation, resources must be input. Forgetting to establish the resources is at the root of many problems when it comes to implementation/deployment/transition/roll out of a new system.

(Note ambiguous terms: implementation can mean coding; deployment can mean installation of program code on a computing machine; transition can mean transfer of business operations from a customer to an outsourced supplier.)

The connectivity principle

Every part of a system is directly or indirectly connected to every other part. If this were not so, there would be two or more distinct systems. So, inside the boundary are interconnected subsystems.

The boundary question

Where to draw a system boundary? This is a major source of customer-supplier contention. There is no universal or absolute definition. The boundary line is a choice made by an observer, or agreed by a group of observers. It is often convenient to enclose a set of persistent resources: e.g. the wall of a shipyard, or the fences of a farm.

Systems can be nested. A systems analyst must ensure that people agree on the boundary of a system they are discussing or working on. (Systems can also be distributed and overlapping, but we’ll talk about that another time.)

The system/project boundary clash

One project can involve changes to several resource-centric systems. To mark the project scope on a diagram, highlight which inputs/outputs are new, which are changed, and therefore which subsystems must be modified as a result of new or changed inputs/outputs.

The information system principle

An information system produces outputs the human mind can process. Input and output forms include text, speech, music, pictures, moving images and radio transmissions. An information system can be manual, mechanical, electro-mechanical, or electronic. (E.g. a CD player transforms information recorded on discs to information in the form of air vibrations.)

The software system principle

All this talk of systems, but no mention of software so far. A software system is an information system, a data processing system, in which the processes are automated and run on a computer platform.

A software system principle

A software system can do nothing for a business but recognise and produce data flows; it validates input events, stores persistent entity states, and derives output from those inputs and states.

A software system is a data processing system. It is a constrained system, capable of recognising and producing only a very limited kind of input and output - a data flow.

A data flow is a data structure composed of data items. Kleene’s theorem says every I/O data flow structure can be defined as a ‘regular expression’ (a hierarchical structure of sequence, selection and iteration components). But many data flows are simply flat lists.
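To make the sequence/selection/iteration idea concrete, here is a minimal sketch in Java (an illustration only; the Order data flow and its fields are invented, not taken from any particular system). Sequence maps to the fields of a record, iteration to a list, and selection to a choice between alternatives.

// Sketch only: a hypothetical Order data flow described as a regular structure.
// Sequence = the fields of a record; iteration = a list; selection = a sealed choice.
import java.util.List;

public class DataFlowSketch {

    // Sequence: an order header followed by an iteration of order lines.
    record OrderFlow(String customerAccount, List<OrderLine> lines) {}

    // Sequence within each iterated component.
    record OrderLine(String productCode, int quantity) {}

    // Selection: the reply data flow is either an acceptance or a rejection.
    sealed interface OrderReply permits Accepted, Rejected {}
    record Accepted(String orderNumber) implements OrderReply {}
    record Rejected(String reason) implements OrderReply {}
}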

Sometimes a software system appears to produce things other than data flows. The software inside an ATM seems to produce dollar bills. In fact, it only sends a message, via an actuator, that instructs an electro-mechanical device to feed dollar bills out through the hole in the wall.

Software systems use data resources. Most substantial software systems maintain a state (aka data resource, memory, working storage, data store) that includes all variables maintained by the component. A variable is the atomic element of a state, input or output data structure.

Stateful systems: even the simplest process control system needs & maintains a few persistent variables (typically representing a fact about the state of an electro-mechanical device outside the software system). “Persistent” means the state is retained between processes.
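As a minimal illustration of the stateful case (sketched in Java purely for concreteness; the boiler device and the threshold are invented), even one persistent boolean counts as state retained between events:

// Sketch only: a tiny stateful component; boilerHot is the persistent variable
// retained between one event and the next.
public class BoilerMonitor {
    private boolean boilerHot = false;   // persistent state, however small

    // Event: a new temperature reading arrives from a sensor.
    public String onTemperatureReading(double celsius) {
        boolean wasHot = boilerHot;
        boilerHot = celsius > 90.0;      // update the state
        if (boilerHot && !wasHot) {
            return "SWITCH_OFF_HEATER";  // output message to an actuator
        }
        return "NO_ACTION";
    }
}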

Stateless systems: not all software systems maintain a persistent data resource, but designers usually call such a thing a ‘program’, ‘module’ or ‘object’, not a ‘system’.

The data integrity principle

The data integrity concern

Analysts strive to maintain data integrity by defining rules that constrain all the users and clients that can update entity state data (which means, by the way, that they posit defensive design rather than design by contract).

The boundary of a software system encapsulates a coherent state. Persistent data should have integrity. This means:

·       A fact should have the same value in all locations (e.g. customer name).

·       Data should be consistent with all invariant business rules (e.g. an order must be for a known customer).

 

Arguably, data integrity is the challenge for enterprise architects.

“If users don’t trust your data, your [application] is garbage.” (IT director of a major telecommunications company)

“Problems with data integrity at one company turned a 2 month exercise into an 18 month exercise.” (Information Week, May 19th 1997)

In a non-distributed system, the state should always appear consistent with all invariant rules. No external observer, no client, can ever find the data in an inconsistent state. The data can however be inconsistent part way through a process inside the system.

Where data is duplicated or distributed, everything gets more complicated. Our manifesto makes three points on this.

The cache concern

Analysts regard caching business data outside the enterprise’s persistent data store as a design optimisation technique and a design headache, not a primary design principle.

The data duplication concern

Analysts have to define workflows to synchronise copies of entity state that are duplicated beyond the bounds of automated transaction management.

The modeling principle

A software system models the world around it. To have a purposeful impact on the world around it, software must contain representations of that world.

Non-enterprise applications

We are talking mainly about enterprise applications. Students sometimes ask: What about embedded systems? Here, a software system is embedded in a wider electro-mechanical system. The actors outside the boundary of the software system are the sensors and actuators of the electro-mechanical system. The same principles apply.

What about process control systems? “Process” here usually means the operation of an electro-mechanical system. The same principles apply.

In fact, an enterprise application can be viewed as embedded inside a business system, as a process control system where the processes are human activities and real-world business processes. E.g. a billing system is designed to monitor and control the paying of bills by customers. The control is weak and indirect, of course.

What about closed systems? A closed system cannot be observed from outside the system. Observation requires and implies output from the system. So, a closed system is not interesting or relevant to us.

Summary

This introduction has defined fundamental concepts to do with systems in general and software systems in particular. It has introduced several principles taken for granted later, especially in the context of distributed systems thinking and component-based development.

·       The boundary principle: a system is a bounded transformation process.

·       The need for resources: a system needs & maintains persistent resources.

·       The start up challenge: on system creation, resources must be input.

·       The connectivity principle: every part of a system is directly or indirectly connected to every other part.

·       The boundary question: systems can be nested.

·       The system/project boundary clash: a project usually affects several systems.

·       The information system principle: an information system produces data output in a form we humans can process.

·       The software system principle: a software system can do nothing for us but recognise and produce data flows; it validates input, stores persistent state, and derives output from inputs and state.

·       The data integrity principle: the boundary of a software system encapsulates a coherent state.

·       The modeling principle: a software system models the world around it.

That last idea is very important in systems analysis. Models feature a little in this book, and rather more in other booklets in the series.

 

On system granularity and the analyst

Composition of smaller systems into a bigger system is the principal tool we have for creating a large system, and for hiding lower level details. The idea is appealing and useful, but also deceiving.

It is tempting to assume that the same concerns apply at each level of composition; that we should use the same devices to model a system whatever its level of granularity. Some have hoped that we can build a model that is comprehensive and internally consistent at the highest levels of composition in the same way we can at the lowest levels. This kind of thinking has encouraged people working at higher levels of composition to use ill-fitting tools and build fanciful models of limited practical application.

Level, diagram and interface; analyst role; model:

1) In a programmer's diagram, boxes may be modules and lines may be local procedure calls (1st kind of interface). Analyst role: analysts are not involved. Model: a comprehensive and coherent, even executable, model might be maintained.

2) In an application architect’s diagram, boxes may be software components on different processors and lines might be remote procedure calls (2nd kind of interface). Analyst role: analysts may define the back-end business services required by an application’s user interface, using an informal interface definition language. Model: a comprehensive and coherent model might be maintained, but it is less likely.

3) A solution architect is concerned to keep software systems loosely coupled and draws diagrams where boxes are discrete systems and lines are asynchronous data flows (3rd kind of interface). Analyst role: analysts define data structures, data models, inter-system messages and data flows (perhaps using XML?). Model: a comprehensive and coherent model is beyond us; between discrete legacy systems, full agreement about business terms is impossible.

4) A systems analyst looks outside of the software domain into the users’ domain, and considers the user interface (4th kind of interface). Analyst role: analysts define use cases and user interfaces. Model: use cases and user interfaces are modeled only in the loosest way.

5) A business analyst is concerned with how business goods and services pass between the highly varied human and mechanical processes of an enterprise (5th kind of interface). Analyst role: analysts define business processes, and the conditions that control the flow of those processes. Model: at best a narrow-view model (a business process view, a deployed technology view, whatever) and a few loose mappings between the views.

What does this mean for analysts reading this book? It helps me to say what the booklet is not about. It is not about the lowest level of composition; for discussion of software component design and the OO paradigm see “The Agile Application Architect”. It says little about the highest level of composition; for more on enterprise modeling see “The Agile Enterprise”.

This booklet is mostly about specifying an enterprise application system at the middling levels of granularity. It says little about data flows and user interfaces. The focus is on business processes, use cases, business services and data models, and on using them to capture business rules.

 

The required process hierarchy

The required process hierarchy

Analysts look to distinguish three levels of process specification: long-running business processes, shorter running use cases and atomic automated services.

Process definitions are decomposable to any number of levels you like. We look to distinguish three levels of process specification: long-running business processes, shorter running use cases and atomic business services. Enterprise application users are usually able to distinguish these levels in the course of requirements specification - provided we help them to do this.

1 Business process: a long-running end-to-end process that operates in the human activity system (though its flow may be directed by a computerised workflow system).

2 Use case (session): a shorter-running process that takes place at the human-computer interface of a data processing system, supporting a step in a business process.

3 Business service: an atomic process that is executed by a back-end component within a data processing system, serving a use case or business process. Usually a transaction.

This three-level hierarchy fits pretty well the way most enterprise applications support a business. The processes at each level have different names because they are different. I have heard people speak of business use case, system use case and service use case. But I normally reserve the term use case for the middle level, because the other levels don’t fit so well what most people describe as a use case, or document in the style of a use case.

It may be called a process hierarchy, but it is really a process network. A lower-level process can be part of, reused in, several higher-level ones. That potential for reuse is one of the reasons for separating the three levels. I discuss reuse later in <The refactoring principle>.

The three-level hierarchy gives us a reasonably natural analysis and design sequence: first business process definition, second use case definition and third business service definition. But life is never that simple and much iteration is expected during analysis and design.

Business processes in a human activity system

Consultants and business analysts often draw process flow diagrams to represent business processes. I assume you know how to draw a process flowchart. The only point I want to make here is that while the control flow of a business process is governed by business rules, these rules usually have to be coded in the heart of the enterprise application, in the automated business services that maintain the business data.

Use cases at the human-computer interface

Most systems analysts nowadays define something akin to use cases during requirements analysis. A use case:

·       is executed by an actor to support a step in a business process

·       provides information to the actor and/or captures information for future business process steps

·       is typically an OPOPOT (one-person-one-place-one-time) user-system dialogue or function – or a process to consume or produce a major data flow.

·       is documented using a use case template, by listing process steps in a "main path" and "alternative paths"

·       may be documented within a requirements catalogue (perhaps high-level only) or within a functional specification (perhaps more detailed).

·       defines the boundary of a software system; is outward facing, user-task or HCI oriented; defines the flows of control that govern what users do at the user-system interface.

For use case metrics capture and use case estimation, you have to be very clear about the granularity of a use case. It is normally shorter than an end-to-end business process. It is normally longer than an atomic business service. However, the effort to code a use case will involve the effort to code all of the business services it invokes, and this should be remembered when estimating effort.

In theory, Agilists avoid the temptation to detail use cases by writing each as a user story on a postcard. The Agile Analyst-Designer then goes on to act as a broker between domain experts, end users and developers, conveying all the remaining details verbally. However, practical experience suggests that analysts end up transcribing their user stories into Word documents and extending them with details and test scripts until they cover several pages.

Brian Nolan has observed that one of the problems people run into using OO methodologies is that people continue refining their use cases until they have decomposed the back-end into components and defined all the back-end operations. To some extent, this is because people confuse use cases with business services, and OO methodologies do little to encourage a distinction.

Use cases are better reserved for defining the dialogue at the human-computer interface. The temptation to decompose a use case to low levels of processing detail may be avoided by recognising and defining business services as distinct things.

Business services offered by back-end components

Use cases involve, or better, invoke business services. This isn’t functional decomposition so much as client-server design. You can think of use cases and business services as being arranged out-to-in rather than top-down. Usually, a business service:

·       is a service of a back-end business component

·       is invoked from a user interface or data flow consuming process

·       supports and progresses a use case

·       applies a message to stored business data.

Viewed from the top down, a business service is the bottom level of user-required process. Think of it as an atomic process for now.

Viewed from the bottom up, a business service is a service offered by a back-end component. The back-end component might be a database under our control or a database under somebody else’s control, or a 3rd party component of any kind. It might be a web service perhaps.

Some of the complexity and interest in specification and design arises from a mismatch between the top down expectation of what the business services should be (the required business services), and the reality of what one or more back end components provide (the provided business services).

For now, assume business service = unit of work = transaction, so it can be rolled back if any exception is discovered. That is important to how you specify business rules. I’ll deal with deviations from this scheme later.
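As a minimal sketch of that working assumption (in Java with plain JDBC; the table and column names are invented), the whole business service is one unit of work that either commits or rolls back:

// Sketch only: one business service treated as one unit of work / transaction.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class PlaceOrderService {

    public void placeOrder(Connection con, String customerId, String productCode, int qty)
            throws SQLException {
        con.setAutoCommit(false);               // start the unit of work
        try (PreparedStatement insert = con.prepareStatement(
                "INSERT INTO orders (customer_id, product_code, qty) VALUES (?, ?, ?)")) {
            insert.setString(1, customerId);
            insert.setString(2, productCode);
            insert.setInt(3, qty);
            insert.executeUpdate();
            con.commit();                       // the discrete event succeeds as a whole
        } catch (SQLException e) {
            con.rollback();                     // or fails completely
            throw e;                            // report the failure to the caller
        }
    }
}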

Variations on the three-level process hierarchy

The three-level process hierarchy is only a model, and it can be varied.

Sometimes the top and bottom levels are enough. A step in a business process or workflow invokes a business service directly, without the kind of human intervention you would normally document in the form of a use case.

Occasionally, the bottom level of processing is enough. We have built a back-end infrastructure system that has (in the terms of this discussion) no use cases, only business services. Yes, you could view the business services as use cases - but they are better defined using a business service template than a use case template.

Quite often, a use case invokes one business service, and that business service is not invoked by any other use case. It is then very tempting to define them together. I generally recommend you separate them, and thus separate the concerns of the user interface from the concerns of back-end processing. The obvious benefit is that you can more readily reuse the business service in another context. A less obvious benefit is that you can specify business rules where (I say) they belong, with the business service rather than with the user interface. More on that later!

Other variations are mostly to do with whether what the user perceives as an atomic process is (behind the scenes) really several processes.

Mapping required process to software layers

This table maps the three-level process hierarchy to a standard 3-layer software architecture.

Required process hierarchy --> software processing layer:

·         business process --> * (see the note below)

·         use case --> user interface layer: processing the input and output data structures, the user interface views of data

·         business service --> business services layer: processing the business rules

·         (no corresponding required process) --> data services layer: processing the database and other data sources

* Some put workflow here. But workflow means different things to different people. I am not confident it is right to place it above the user interface layer, since workflow control can be inverted in some kind of session state behind the user interface components. Wherever workflow fits, a workflow is basically a procedure; I assume you know how to draw a flowchart and your developers know how to code a procedure, and to invert it into a subroutine if need be. Workflow technologies may assist here, but they are not necessary.
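A minimal sketch of this mapping, in Java (all class and method names are invented for illustration): a use case controller in the user interface layer invokes a business service, which imposes the business rules and uses a data service underneath.

// Sketch only: the three software layers behind the required process hierarchy.
public class LayeringSketch {

    // Data services layer: processes the database and other data sources.
    interface CustomerDataService {
        boolean accountNumberInUse(String accountNumber);
        void storeCustomer(String accountNumber, String name);
    }

    // Business services layer: processes the business rules.
    static class RegisterCustomerService {
        private final CustomerDataService data;
        RegisterCustomerService(CustomerDataService data) { this.data = data; }

        public String registerCustomer(String accountNumber, String name) {
            if (data.accountNumberInUse(accountNumber)) {
                return "FAIL: account number already in use";   // business rule imposed here
            }
            data.storeCustomer(accountNumber, name);
            return "OK";
        }
    }

    // User interface layer: drives the use case, handles the I/O data structures and views.
    static class RegisterCustomerUseCase {
        private final RegisterCustomerService service;
        RegisterCustomerUseCase(RegisterCustomerService service) { this.service = service; }

        public void onSendButton(String accountNumber, String name) {
            System.out.println(service.registerCustomer(accountNumber, name));  // report outcome to the user
        }
    }
}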

Conclusion

We all rely on abstraction as a tool to specify software systems in fewer words than we need to code them. The three-level process hierarchy is a simple and basic abstraction tool. You might say a higher level process is composition of lower-level details. Certainly, a higher level process definition should suppress detail better described in lower level ones.

“Where is transaction management (roll back and error handling processes) in this model?” Effat Mohammed

A good question. To be discussed later.

 

On use cases

I have sketched out a three-level process hierarchy, not as a rule, but as a reasonable starting point for analysis and enterprise application specification. This chapter addresses questions about the middle level of the hierarchy; where use cases fit.

Can I specify using prototype UI designs rather than use cases?

Some specify user interactions with a system, and business rules applied, in a narrative supporting a prototype user interface. This specification may take the form of a Word document, perhaps containing screen images, certainly containing references to elements of the screens.

 

It would be foolish to dismiss any common approach. But, just as it is not generally considered a good idea to code business rules in user interface objects, it is not generally considered a good idea to attach specifications of business rules to user interface designs.

 

The user interface may hold copies of persistent business data. So, the UI-centric analyst is naturally tempted to specify business rules where the data appears in a user interface view. But remember that the same data may appear in other views. And whatever happens at the user interface, you’ll still need to make sure the data in the underlying database is consistent.

The cache concern

Analysts regard caching business data outside the enterprise’s persistent data store as a design optimisation technique and a design headache, not a primary design principle.

 

I can’t deal here with all possible approaches and complexities. Let me press on talking about a specification centred on use cases.

Is a use case an idea or an artifact?

“Hi Graham, I agree with the process hierarchy, and the descriptions you have given. But I would prefer the term ‘User Session’ for the 2nd level and keep ‘Use Case’ for its use in UML as a generic tool for modelling processes at any level.”

We tend, in using the three-level process hierarchy, to map methodology idea to artifact type.

UML gives us artifact types, not methodology: UML provides four artifact types to document a process: activity diagram, use case, state chart and interaction diagram. Any process definition artifact type can be used at any level of process granularity.

Use case as an artifact type: But it seems silly to use the term use case at every level of granularity. What can it mean then other than "process"? We are overloaded with terms for process. Process will do.

Use case as methodology concept: A methodology has to reduce confusions by making granularity distinctions. For most, a use case is a case of software system usage by an external actor. In OO analysis and design methods, the use case is primarily employed to capture requirements at the level of the system boundary or human-computer interface. Using the term user session for a 2nd level process might be OK if it didn't leave batch input/output processes out in the cold, as modern methods tend to.

Mapping methodology concept to artifact type: it seems natural in our method to relate the level of process granularity to the most popular artifact type for that level, thus:

·         business process --> flow chart or activity diagram

·         use cases --> use case with main path, alternative paths etc.

·         business services --> operation interface definitions.

 That is a simplification, but one that seems useful in practice.

Can I specify use cases without regard to business services?

You can define the goal of a use case, and briefly outline it, without considering when and where business services are invoked.

VERSION 1: “The use case goal is to help a salesperson place an Order for a Customer, with the option of registering a new Customer as well.”

The more you spell out the normal and alternative paths of the use case, the more you imply when and where business services are invoked. What is the normal path of this use case? Will a salesperson first check/register the Customer, then place the Order? Or first set out to place an Order and then register the Customer only if it proves necessary?

VERSION 2: “First, the salesperson enters the Order details and presses the send button. Normally, the Order is accepted. Alternatively, if system cannot find Customer (the account number is wrong or missing), then system prompts the salesperson to check/register the Customer, then return to re-enter the Order.”

Then, the more you discover and say about the business rules, the more you have to consider when the business services are invoked and how they may fail. Consider the specification below.

VERSION 3: “The use case goal is to place an Order. A precondition of this use case is that the Customer exists in the system. A post condition is that a new Order exists in the system. First, the user enters the Order details and presses the send button. Normally, the Order is accepted. Alternatively, if system cannot find Customer (the account number is wrong or missing), then system prompts the user to check/register the Customer, then return to re-enter the Order. The user may leave the dialogue at any point, returning to the main menu.”

Paraphrased from a popular book on use case definition.

Tell me, what is wrong with this specification? Don’t worry about whether the design is good or bad; consider only whether the specification is accurately expressed.

 

This specification is badly expressed. A precondition of the use case might be that the user identity and password are valid. But it is not a precondition of the use case that the Customer is registered. This rule is in fact a precondition of the business service invoked by the use case to create an Order.

 

Second, a post condition of the use case might be that the main menu is displayed once more. It is not a post condition of the use case that a new Order exists in the system. The use case may finish with a new Order, with a new Customer, or with a new Customer and a new Order, or without any update to the system at all.

 

Normally, error messages force the end user to be aware of the business rules imposed by back end business services. The user cannot ignore these rules. And the domain experts with whom you specify use cases will be concerned that you specify these rules correctly.

 

Moreover, the paths through a use case are determined by the rules imposed by back end business services. In our example, you have to specify an alternative path to cope with the possibility that the Order is rejected for lack of a Customer. In other examples, there may be several alternative use case paths, depending on the several reasons why a business service may be rolled back, each starting with a different error message.
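A minimal sketch of that point, in Java (the service, outcomes and messages are invented): the business service reports the reason it failed, and the use case selects the corresponding alternative path.

// Sketch only: alternative use case paths driven by the reasons a business service can fail.
public class OrderEntryDialogue {

    enum PlaceOrderOutcome { ACCEPTED, UNKNOWN_CUSTOMER, PRODUCT_NOT_STOCKED }

    interface PlaceOrderService {
        PlaceOrderOutcome placeOrder(String accountNumber, String productCode, int qty);
    }

    private final PlaceOrderService placeOrder;
    OrderEntryDialogue(PlaceOrderService placeOrder) { this.placeOrder = placeOrder; }

    // One step of the use case: the salesperson presses the send button.
    public String onSend(String accountNumber, String productCode, int qty) {
        switch (placeOrder.placeOrder(accountNumber, productCode, qty)) {
            case ACCEPTED:
                return "Order accepted";                                          // main path
            case UNKNOWN_CUSTOMER:
                return "Customer not found: register the customer, then re-enter the order"; // alternative path
            case PRODUCT_NOT_STOCKED:
                return "Product not stocked: amend the order line";               // alternative path
            default:
                return "Unexpected outcome";
        }
    }
}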

 

Where it is true (as is commonly said) that 70% of a system’s processing is exception handling, then it is not very helpful to specify a use case without at least mentioning the reasons why back end services may fail, and the alternative use case paths triggered by such cases.

 

Domain experts and end users are not the only people who have to be aware when a business service is invoked and the different paths that follow from its success or failure. The software designers must know this as well.

 

The transaction principle

The transaction principle

In logical specification, analysts posit that the back-end automated business service triggered by a business event can be rolled back; this is a necessary simplifying principle.

I have sketched out a three-level process hierarchy, not as a rule, but as a reasonable starting point for analysis and enterprise application specification. Where is transaction management (meaning automated roll back on exception conditions) in this hierarchy?

You do have to posit transaction management at some level of process specification. If you don't, then you are forced to specify all manner of additional complexities such as undo processes. You naturally want to posit transaction management at the level that can be managed automatically, or close to it. Certainly, you don’t want to specify roll back processing that will in the end be automated for you.

Generally speaking:

·       A long running business process cannot be automatically rolled back - since you cannot roll back the real world.

·       A shorter running use case cannot be automatically rolled back - since you cannot roll back an end user's mind, or outputs already consumed by external actors.

·       An atomic business service can be rolled back, since you can roll back data stores and outputs not yet processed by external actors.

It would be nice to assume a discrete event triggers one business service, which is one unit of work or transaction. However, nothing is ever as simple as you want. And every principle of data processing systems analysis and design is there to be broken.

Some of the complexity and interest in specification and design arises from a mismatch between the top down expectation of what the business services should be (the required business services), and the reality of what one or more back end components provide (the provided business services).

The two likely causes of a discrepancy can be described as “componentisation of processing” and “aggregation of input”, and I should say something about both.

Componentisation of processing at the back end

Sometimes the back end system is modularised so that one user-perceived discrete event has to be processed by entirely discrete business components. This may be down to physical design constraints that users (even analysts) don’t want to concern themselves with. It may be because stored data is denormalised or replicated (whether by accident or design) in two or more databases.

A colleague has supplied an example.

"Suppose a billing process invokes the Register Customer use case which invokes two business services, first Create Customer in CRM system, then Create Customer in billing system. What if the 1st business service succeeds and the 2nd fails?" Effat Mohammed

Suppose there is no user intervention between the two back-end processes. If both succeed, then you have no problem. If both fail, then you have no problem, save a little complexity in the error reporting. But if one succeeds and one fails, you have a headache.

The notion of the discrete event is that it succeeds or fails as a whole. So if one succeeds and one fails, then you have some design to do. What if the 1st subsystem succeeds in creating a customer record and the 2nd subsystem fails? Design options include:

Transaction option: Is data integrity king? If the users worry a lot about stored data discrepancies - if they perceive the two provided business services as inseparable - then you should wrap the two provided business services into one transaction to manufacture the required business service.

·         If automated transaction management is impossible, you have to design and code an undo process (Uncreate Customer) on the 1st subsystem - and invoke it before returning to the user.

·         If the business services are already transactions, you may be able to add a higher level of federated/distributed transaction control above the two existing transactions. I hear this imposes more dramatic performance overheads. And using federal transaction management may complicate the reporting and handling of exceptions. Ask an architect if you want to know more about federated/distributed transactions - my concern here is the analysis and design implications.

In many ways, the transaction option is ideal, because it saves the need to design, code and invoke undo processing. It leaves the simple use case intact.
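As a minimal sketch of the undo case above (in Java; the CRM and billing interfaces are invented), one required business service is manufactured from two provided ones, with a hand-coded undo invoked if the second fails:

// Sketch only: manufacturing one required business service from two provided ones,
// with a hand-coded undo when the platform cannot roll back across both systems.
public class RegisterCustomerAcrossSystems {

    interface CrmSystem {
        String createCustomer(String name);          // returns new CRM customer id
        void uncreateCustomer(String crmCustomerId); // hand-coded undo process
    }
    interface BillingSystem {
        void createCustomer(String crmCustomerId, String name); // may throw on failure
    }

    private final CrmSystem crm;
    private final BillingSystem billing;

    RegisterCustomerAcrossSystems(CrmSystem crm, BillingSystem billing) {
        this.crm = crm;
        this.billing = billing;
    }

    public void registerCustomer(String name) {
        String crmId = crm.createCustomer(name);     // 1st provided business service
        try {
            billing.createCustomer(crmId, name);     // 2nd provided business service
        } catch (RuntimeException e) {
            crm.uncreateCustomer(crmId);             // undo, so the event fails as a whole
            throw e;                                 // report the failure to the caller
        }
    }
}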

Workflow option: Are users relaxed about stored data discrepancies? Are they happy to perceive the two provided business services as discrete? Then the system can report that one business service has succeeded and the other has failed. The user is left to decide what if anything to do about the stored data discrepancy. You will however have to extend the use case or define alternative paths that enable the user to do what is necessary.

The fix it on the fly option: You might design and invoke a business service (or several business services) on the 2nd subsystem that will alter the database (say create some missing reference data) so it can now accept the update. The user may need a relatively complex report of what has been done to fix things up. This design option may require additional user decision making if not extra data entry, and so, again, involve extending the use case.

These three design options influence the way the use case is specified. You really should ask your domain experts which option they prefer. It is their job to tell you whether they see the processing as one event, or two distinct events that do not have to both succeed.

Whichever option you choose, the system's user interface must report to the user what has happened. You may have to give the user two error messages at once, reporting both the CRM and Billing systems have rejected the transaction for different reasons.

Aggregation of input at the front-end

Sometimes the data for several truly discrete events is submitted at once. In some designs, the user presses the send button after a large screenful of data entry, at which point several discrete business services are invoked. This may be a consequence of designing to meet non-functional requirements for performance or usability.

In on-line processing, different types of business services are often grouped in one transaction (say Create Customer and Create Order). In batch processing, different instances of the same business service are grouped in one transaction (say 100 Orders). Aggregation will reduce transaction management overheads and speed up processing, providing the success rate is high.

E.g. To reduce processing time, you may process 1,000 ticket sales (downloaded overnight from a railway platform ticket machine) in one transaction, and roll back all of them in the (hopefully rare) exception that any one fails.
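A minimal sketch of that batch case, in Java with plain JDBC (the ticket sale record and table are invented): every downloaded sale is applied within one transaction, and all are rolled back together if any one fails.

// Sketch only: aggregating many instances of the same business service into one transaction.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class TicketSalesBatch {

    record TicketSale(String machineId, String fareCode, int pence) {}

    public void applyBatch(Connection con, List<TicketSale> sales) throws SQLException {
        con.setAutoCommit(false);
        try (PreparedStatement insert = con.prepareStatement(
                "INSERT INTO ticket_sales (machine_id, fare_code, pence) VALUES (?, ?, ?)")) {
            for (TicketSale sale : sales) {
                insert.setString(1, sale.machineId());
                insert.setString(2, sale.fareCode());
                insert.setInt(3, sale.pence());
                insert.executeUpdate();
            }
            con.commit();        // all the sales succeed together...
        } catch (SQLException e) {
            con.rollback();      // ...or are rolled back together
            throw e;
        }
    }
}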

Many off-line or batch processes do not fit the conventional modern "use case" approach to system requirements very well. In truth, the typical use case template is not primarily designed to help people define a use case for batch processing of an input file, or the generation/printing of a report - you are left to your own devices there.

Batch processes do not always fit the three-level process model very well either. If a whole batch process use case is implemented as one single transaction, then it departs from the principle introduced above.

Analysis and logical design are not separable from physical design constraints

For use case specification purposes, you may idealise by positing that a required business service is a transaction. But ultimately, if the process is not automatically roll-backable, then you have to specify the undo processing needed to correct a half-completed update.

Thus the logical and physical are inevitably intertwined. You have to specify the components and processes of a system with some minimal knowledge of the target platform's transaction management capability. You should know and declare two kinds of platform-related information in a specification:

·       the units of system composition - the discrete systems across which the chosen platform can automate roll back of a process - a discrete system has a discrete structural model and often maintains a discrete data store

·       the business services provided by each discrete system - the roll-backable units of work.

Your specification must

·       recognise discrete system boundaries.

·       model discrete events as well as discrete entities.

·       work on the assumption that discrete events can be automatically rolled back.

At least, you have to do these things if you want your model to be readily transformable into a software system. You cannot hope to do forward engineering from model to code if you cannot envisage the former as a reverse-engineered abstraction of the latter.

In short, if your specification is to be useful to software specialists, then before you define a business service you will have to answer two questions:

·       Q) What persistent data does the business service need?

·       Q) Can our platform roll back the required business service across all the data stores that hold this data?

 

Discrete event modeling principles

This is a relatively academic chapter. It outlines the principles of discrete event modeling and promotes the importance of defining business services, since these are required processes just as much as business processes and use cases are.

A discrete event is sometimes called a logical transaction. For now, assume one discrete event triggers one business service, which is one unit of work or transaction. Normally, a business actor expects a discrete event to succeed if its preconditions are met or else fail completely. I’ll deal with deviations from this scheme later.

You have to define the persistent data structures maintained by discrete systems, the units of work on those data structures, and the pre and post conditions of those units of work. Why? Because:

·       You have to capture what domain experts understand of how persistent data constrains processing and is changed by processes.

·       An enterprise's data is distributed, and it is vital to define which data stores the business requires to be consistent and which data stores need not be consistent.

·       Users should understand the effects of any business service that they invoke from a system's user interface (or if they don’t understand, they trust the system has been designed with the help of domain experts who do understand).

·       If you (analyst) don’t define the business rules required and imposed by business services, then you are simply ducking your responsibility.

This applies to any kind of software system specification, whether it is a model or a narrative.

Discrete event modeling

We cannot build a software system that monitors the world on a truly continual basis. A software system can only perceive the passage of time as a series of discrete events - processed one after another. A software system recognises the passage of time by detecting an event.

An <event> triggers a service. I use these two terms almost interchangeably. An event causes a state change in the whole system or component being modeled. An event has an effect on at least one entity in the system. An event changes the state of one or more entities. An event triggers a discrete unit of work on a coherent data store.

Event start

An event is signified by either the input of a data message, or the arrival of a date/time. It does not matter to the specifier how an event is detected and enters the system. An event may be input by an external actor. An event may arrive in a message queue. Alternatively, the system may reach out into its environment to grab an event, by polling for input.

E.g. a calendar or clock may notify the system of time intervals by iteratively sending an ever-advancing date/time message to the system. Alternatively, the system may poll a calendar to see if the date has changed yet. In the second case, the inputs arrive in the form of replies. The event and the process it triggers are the same as far as business rule specification is concerned.

Polling is perhaps more common in process control systems, where one might for example poll a sensor to see if a machine has overheated, than in enterprise applications.
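A minimal sketch of the polling alternative, in Java (the names are invented): the system keeps the last date it saw and manufactures a discrete date-advance event only when the polled date has moved on.

// Sketch only: turning a continuously changing clock into discrete date-advance events by polling.
import java.time.LocalDate;

public class DatePoller {
    private LocalDate lastSeenDate;          // persistent state between polls

    public DatePoller(LocalDate startDate) {
        this.lastSeenDate = startDate;
    }

    // Called on a schedule; returns a discrete event only when the date has moved on.
    public String poll(LocalDate today) {
        if (today.isAfter(lastSeenDate)) {
            lastSeenDate = today;
            return "DATE_ADVANCED:" + today; // the event that triggers end-of-day services
        }
        return null;                         // no event this time round
    }
}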

Event conclusion

An event concludes when the system either rejects the event or the state of the system is definitively advanced. Either way, the system is free to process the next event. The event conclusion is normally marked by a reply or output message.

Every event input to a system should produce at least a success/fail response. This outcome must be recognised (now or later) by the clients outside the system. If no client cares whether an event has succeeded or failed, then the event was pointless, and need not, should not, have been input.

You might process a batch of account credit events overnight. The fact that the customer's interest in the event outcome is masked, or removed to some distance, behind a batch I/O process, is neither here nor there as far as discrete event modeling is concerned. Eventually, the customer wants to know whether their account has been credited or not. Every event failure must be notified to a client. By implication, every event success is notified to a client. Directly or indirectly, the outcome of every event is notified to a client.

The consequences of the event - the inspection by clients of state changes via enquiries and reports (e.g. bank statements) - are not considered part of the event itself.

Events are discrete

In the real world, things change continually, as far as we can tell. The body of knowledge about discrete event models applies to the models we build of things, rather than the things themselves. So if you talk about events, states and conditions in terms of your intuitive understanding of a real world process, your conversation may depart from the conventions of discrete event models.

You cannot apply discrete event model terms and concepts to a continuous process, until you have divided that process into discrete steps, each with a state before and a state after.

e.g. You cannot consider the momentary condition of planetary alignment to be a state in a system model, unless the system rests in that state between one event and the next event.

You cannot inspect the state of the system while an event is taking place, only between events. Or rather, if you do inspect the state while an event is taking place, then you will likely find that the system is in a state that is inconsistent with the rules of the system.

The date/time that an event happens can be recorded in the system only if a persistent record is maintained of the event, which is not normally the case for all events that update a business information database (though an event log may be maintained outside the system).

Event-driven design v OO responsibility-driven design

Use cases and business services are not object-oriented. They are procedures; they are event-driven processes. They are products of event-oriented analysis and design. Event-orientation has a big role to play in analysing and building implementable models.

·         Agilist: This is more analysis than design.

Yes, I am focusing on the analysis of business rules. And I am thinking of what analysts are responsible for doing rather than what developers do.

·         Agilist: Business services speak to you, but not me, and not everyone. Each person is different.

That’s partly a matter of education, but I suggest that business services speak to users because they are the processes that users must invoke, and whose success or failure users must respond to.

·         Agilist: Analysts may also define responsibilities, and these can be shared across use cases and business services.

Analysts may do this, but I’d recommend letting the object-oriented designers and developers sort it out. Designers should always strive to factor out common processes, both between levels and within a level of the three-level process hierarchy.  How required processes map to classes for object-oriented programming is a matter for software designers to decide.

I would like to draw a dividing line between what analysts must document in a specification, and what is done to transform that specification into economical OO code. I recommend analysts:

·       Focus on entities and events (business services)

·       Don’t build code-level component models

·       Use event-oriented analysis to discover responsibilities and rules

The OO paradigm contains a technique related to analysis of events and business services, that is, responsibility-driven design. A responsibility represents a package of behavior and data, which might be a single operation, but also may involve multiple operations and data items. The best-known book on responsibilities is Wirfs-Brock and McKean <http://www.amazon.com/exec/obidos/tg/detail/-/0201379430>

·         Agilist: Business services have a place in object-oriented design, but you would do better to focus on “responsibilities”, a fuzzy but very valuable notion.

My wish is to raise discrete events to a higher status in analysis and design, partly because they are firm rather than fuzzy. When I looked (only briefly) at responsibility-driven design, it felt to me like a way to anthropomorphise encapsulated things and the work they do for their clients. I am sure it does help to scope what classes do. But ultimately, in an enterprise application, what is a class responsible for? It is responsible for playing its part in the processing of the required business services – no more, no less.

·         Agilist: Your understanding of responsibilities is not the same as most people in the OO community.

You are surely right. I should say my aim is to teach analysis, and to some extent design, in a way that does not assume either procedural or object-oriented programming. Business services are requirements that have the same weight whatever the programming paradigm. We can rescope the entity-oriented classes to our hearts’ content. We cannot rescope the business services so readily, for they are the requirements; they are what clients need to be done.

·         Agilist: Getting your analysts to focus on responsibilities would make their specifications more palatable to OO developers.

Perhaps, if OO developer training remains as it is. But should analysts be influenced by programming language paradigm? Would it not be better to lift our developers above a single paradigm (in this case the OO paradigm, in other cases the relational paradigm or state machine modeling paradigm, or whatever)?

And can designers and developers really do a good job without understanding the business services? Some gurus promote component-based development in which “business components” sit on large coherent data structures, offering services. Others speak of a service-oriented architecture. Even within the OO paradigm, rather more attention is given to “control objects” nowadays. So I think the force is with those who specify in an event-oriented way.

·         Agilist: Responsibilities play an important role in switching from a functional view of a system to an OO one.

They do, though I hold no candle for functional decomposition. There is a view (expressed by Bertrand Meyer and others) that object-orientation is the opposite of functional decomposition. For me, the opposite of object-orientation is event-orientation, in which one analyses business services and factors out common processes. The two are complementary. I believe both are needed for a good understanding of systems.

·         Agilist: Arguing for an alternative technique won’t help if your audience are not familiar with the technique you propose. Technique X is not always better than technique Y for everybody in every situation. We should discuss tradeoffs, let people pick the technique(s) that seem best and then tailor it to their situation.

I can’t argue against you there. My concern is that OO-trained people don’t have the choice, because they are not taught event-oriented analysis and design.

·         Agilist: We prioritize and negotiate requirements. We also break requirements into smaller chunks, sometimes as a result of prioritization and scheduling efforts. So, the business services can change.

Yes. Let me not exaggerate the immutability of business services, but I do want to emphasise their importance in analysis and design. You cannot usually rescope the business services without consulting the users (e.g. the users really do care that the order item value is returned). Whereas you can rescope the classes and their responsibilities behind the scenes (e.g. the users do not care if the order item value is calculated by an operation in the order item entity or the product entity).

·         Agilist: A focus on responsibilities would get you out of the data mind set.

Discrete events and business service may be traditionally associated with database processing, but they are primary analysis and design artefacts in real-time process control systems also. In fact, they may prove more immutable in process control, since you cannot readily alter messages that hardware sensors understand and actuators respond to, and the behavior of the overall system is constrained by safety considerations. Looking at the case studies in OOP books, I usually find that the discrete events are more solid and indisputable than the classes. There are times when it seems perverse of OO authors to treat the entity-oriented classes as the primary design artefacts.

Conclusion

Event-oriented design has always been with us, and will remain with us. Events and services are survivor concepts. At the first level of analysis, the clients of a system require services rather than objects. Analysts have to define the services required and the events that trigger them. Objects are a structuring device for designers. Analysts do need to understand the state of the system, but a data model is perfectly adequate to define that.

Effect: (event effect) a reference and/or state change to one entity caused by an event.

Effect instance: a reference and/or state change to one entity instance caused by one event instance.

Effect type: a reference or state change (or set of mutually exclusive state changes) to one entity type caused by one event type.

Enquiry: a message that triggers a service that does not change a component’s data store; sometimes used as a synonym for the service itself.

Entity: an <entity> has attributes and relationships that are used and updated by events. An entity usually records the state of a persistent thing or process in the environment that the component is required to monitor if not control.

Entity model: an entity model is a data model. For rule specification, an entity model may be unnormalised to a degree (not fully in 1NF). It should include expressions that define how attribute values are constrained and derived; it should include as many derived attributes as prove helpful.

Event: a message that triggers a service that updates a component’s data store; very often used in this booklet as a synonym for the service itself.

Service: a process that consumes input parameters and produces a response/output (at least a success/fail message if not a more substantial output data structure). A service usually reflects a transient event in the environment that the component is required to monitor if not control. A business service usually inspects, tests or changes the state of at least one entity.
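To restate the event/enquiry distinction in interface terms, here is a minimal sketch in Java (the account services are invented): an event updates the data store and returns at least success or failure; an enquiry only reads.

// Sketch only: an event updates the data store; an enquiry only reads it.
public interface AccountServices {
    // Event / business service: changes persistent state, returns at least success/fail.
    boolean creditAccount(String accountId, long pence);

    // Enquiry: derives output from stored state without changing it.
    long balanceOf(String accountId);
}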

 

The refactoring principle

The refactoring principle

Analysts continually look for opportunities to factor out common processes, both between levels and within a level.

The simplest program refactoring technique is creating a subroutine. Agilists apply the refactoring principle to code, but it applies equally to designs and models. In fact, most design methods feature some kind of refactoring technique.

Refactor using the required process hierarchy

While identifying the required business processes, use cases and business services, an important task is to optimise the reuse of a process at one level by processes at the level above.

A higher-level process invokes one or more lower level processes; a business process step invokes one or more use cases; a use case step invokes one or more business services. Conversely a lower level process may be reused by two or more higher level processes. There are two kinds of reuse.

·         Between levels: a use case may be used in more than one business process step. Similarly, a business service may be used in more than one use case. However, project examples suggest that the majority of business services are unique to the use case that invokes them.

·         Within a level: two use cases may share a common process. Similarly, two business services may share a common process. The lowest level common process in a business service is an operation on the lowest level encapsulated entity. Sometimes, a common process is wanted on its own in another context, so it can be defined as a discrete use case or business service.

Do not confuse a shared use case that involves user interaction (typically some kind of look up enquiry) with a back-end business service.

By the way, you may be able to generalise two similar business services into one, but you have to define the two distinct requirements before you can know whether merging the requirements is possible.

Refactor where database processes share a common process

SSADM includes a formal event-oriented technique for defining reuse between business services. In this technique, the business service is called a discrete "event" which has an “effect” on each of one or more “entities”. Two discrete events can share a common process, known as a "super event". The OO concept of a responsibility is akin to an effect, or more interestingly, to a super event.

In short, you:

·       identify events.

·       identify where two or more events have the same pre and post conditions with respect to an entity (that is, the several events appear at the same point in the entity's state machine and have the same effect).

·       name the shared effect as a super event.

·       analyse to see if the super event goes on from that entity (where the events' access paths come together) to have a shared effect on one or more other entities; if so, adopt the super event name in specifying those other entities' state machines.

I don't mean to persuade you to use this exact "super event" analysis and design technique. I only want to indicate that reuse via event-oriented analysis and design has a respectable and successful history, a history many object-oriented designers are unaware of. The sketch below illustrates the idea.
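
A minimal Python sketch of the super event idea, with assumed names: Order Cancellation and Order Rejection are hypothetical events that have the same pre and post conditions with respect to an Order, so their shared effect is factored out as a super event (here called Order Closure).

class Order:
    def __init__(self, order_id: str):
        self.order_id = order_id
        self.state = "open"

    def close(self):
        # super event "Order Closure": the shared effect on the Order entity
        assert self.state == "open"                # same precondition for both events
        self.state = "closed"                      # same postcondition for both events

def cancel_order(order: Order):
    # event 1 invokes the super event
    order.close()

def reject_order(order: Order):
    # event 2 invokes the same super event
    order.close()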

Refactor data structures for performance

You may refactor a database to make it more efficient. You might add or remove an index (one automatically maintained by the database management system) without affecting any program or any test data.

Beyond this, if you change an input or output data structure, you are obliged to change the test data for the programs that read/write the data structure. This is not refactoring in the proper sense of the term.

But there is an analogy to refactoring in the database design techniques of normalisation and denormalisation. If you change a data structure in either of these ways, you must migrate the test data, but you would hope that your software layering means you do not have to change client-level programs that access the database, and do not have to change the end users' test data.
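
As a minimal illustration of that layering point (all names here are hypothetical): the stored structure is denormalised so that an order's value is stored rather than derived, yet because client-level code goes through one access function, that code need not change.

def order_value_normalised(order_row: dict) -> float:
    # normalised design: value derived from the order lines on each read
    return sum(qty * price for qty, price in order_row["lines"])

def order_value_denormalised(order_row: dict) -> float:
    # denormalised design: value stored with the order and maintained on update
    return order_row["stored_value"]

def print_invoice(order_row: dict, order_value=order_value_normalised):
    # client-level program: unchanged whichever access function sits behind order_value
    print(f"Order {order_row['id']}: {order_value(order_row):.2f}")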

Watch out for non-functionals

Note that refactoring can change the non-functional characteristics of a system, program or database. Where this is likely, you should retest to ensure the non-functional requirements are still met.

 

Reuse in SOA and CBD (reprise)

To be repeated here from <The application architect>.

 

The enterprise challenge

Our ideal, that all applications will run on top of an enterprise-wide service-oriented architecture (or an enterprise database), is a target that analysts expect to fall short of.

The here and now principle

Analysts define services and components to serve known clients, not imaginary ones, since generalisation ahead of requirements hinders progress now and often proves unwise later.

 

Componentisation and distribution

Dividing a system into business components, or building a system from pre-defined distributed components, can have a big impact on analysis and specification.

Architects define data architecture

The architect makes high-level decisions about the run-time environment, defines how many data stores there will be and indicates which data belongs where, perhaps by listing the kernel entities for each data store. An architect may have good reasons to design several smaller data stores rather than one large one:

·       data subsets must be maintained in different locations for performance reasons,

·       a data subset is to be reused in another system,

·       the legacy systems are too difficult to reengineer.

 

But without such good reasons and evidence for them, distribution should be resisted.

"Although web services are touted as reducing the need for experienced (and expensive) software engineers, this would be a false economy. Distributed systems remain complex and counter-intuitive no matter what software technology is used to build them." Computerwire Jan 2003

Distribution weakens the relationships between entities. For example, Orders and Customers might be stored in discrete data stores.

 

Distribution divides a required business service into several provided services. For example, Order Placement becomes a workflow involving services acting at different times on the Order data store and the Customer data store.

 

Distribution turns what systems analysts might want to specify as invariant conditions into transient preconditions and postconditions of distinct services. For example, the rule that an Order is related to a Customer cannot be maintained as an invariant if Orders and Customers are in discrete data stores.

 

Distribution forces business actors to engage with messages passing between subsystems. For example, the business users who enter an Order Placement event into the Orders subsystem have to react later when the Customer subsystem returns a message saying that the Customer named in the Order Placement parameters does not exist, or owes too much money for the Order to be accepted.
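
Here is a minimal Python sketch of that workflow effect, using the Orders and Customers example; the function names and stored fields are hypothetical. The "an Order is related to a valid Customer" rule cannot be checked as an invariant at order entry, so it becomes a later message that somebody must react to.

def place_order(orders_store: dict, order_id: str, customer_id: str) -> str:
    # Orders subsystem: accepts the order without being able to check the customer
    orders_store[order_id] = {"customer_id": customer_id, "state": "pending"}
    return "accepted, pending customer check"

def check_customer(customers_store: dict, customer_id: str) -> str:
    # Customer subsystem: runs later, against its own data store
    customer = customers_store.get(customer_id)
    if customer is None:
        return "fail: no such customer"
    if customer["debt"] > customer["credit_limit"]:
        return "fail: owes too much money"
    return "ok"

def on_customer_check_result(orders_store: dict, order_id: str, result: str):
    # back in the Orders subsystem, a business user or workflow engine handles the reply
    orders_store[order_id]["state"] = "open" if result == "ok" else "rejected"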

Terms and concepts

A business component is a subsystem, based on a data store, just like the whole system. It is expected that a business component has one coherent data store; transactions will guarantee data integrity within it. But you cannot assume that transactions will guarantee data integrity across two or more business components; that depends on architectural decisions and design constraints.

 

This means you cannot model business components and business rules until the data architecture decisions that determine the distribution of data between data stores have been made; those decisions have a big impact on where and how systems analysts specify business rules.

The architect plays a leading role in decisions about division and distribution, and the analyst has to deal with the consequences.


Component

a software subsystem definable by an interface and a data store.

Interface

a list of service types (describable in a service catalogue).

Data store

a coherent structure of entity types (describable in a type model or data model) that is retained between services.

Business component

a component that business actors use and relate to. It is often very large; it may be implemented as hundreds or even thousands of OOP classes.

Business actor

a client who acts in the business environment outside a business component and uses its business services.

e.g. Marriage Registrar, Salesman

Business service

a service that a business actor expects to succeed if its preconditions are met or else fail completely.

e.g. Wedding, Divorce, Order Valuation, Order Closure
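
As a minimal sketch of how these terms fit together (the component itself is hypothetical, though the examples echo those above): a business component exposes an interface of business services and encapsulates one coherent data store.

class MarriageRegistryComponent:
    # business component: an interface (the public methods below) plus a data store

    def __init__(self):
        self._store = {}          # data store: entity state retained between services

    # the interface: a catalogue of business services that a business actor
    # (e.g. a Marriage Registrar) expects to succeed or else fail completely
    def record_wedding(self, person_a: str, person_b: str) -> str:
        if self._store.get(person_a) == "married" or self._store.get(person_b) == "married":
            return "fail: precondition not met"
        self._store[person_a] = "married"
        self._store[person_b] = "married"
        return "success"

    def record_divorce(self, person_a: str, person_b: str) -> str:
        if self._store.get(person_a) != "married" or self._store.get(person_b) != "married":
            return "fail: precondition not met"
        self._store[person_a] = "divorced"
        self._store[person_b] = "divorced"
        return "success"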

Analysts follow architects

Systems analysts have to work with the results of data architecture decisions, and specify each distributed business component. To put it another way, anything that the systems analyst specifies before the data architecture decisions are made may well have to be revised afterwards, and that includes workflows and user interfaces.

Let me review some of the manifesto concerns.

The data integrity concern

Analysts strive to maintain data integrity by defining rules that constrain all the users and clients that can update entity state data (which means, by the way, that they posit defensive design rather than design by contract).
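
A minimal sketch of that defensive stance, with hypothetical names: the service checks its own preconditions and fails cleanly, rather than trusting every client to honour a contract.

def close_order(store: dict, order_id: str) -> str:
    order = store.get(order_id)
    if order is None:                      # defensive check, not an assumed client contract
        return "fail: no such order"
    if order["state"] != "open":           # defensive check on the entity's state
        return "fail: order is not open"
    order["state"] = "closed"
    return "success"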

The cache concern

Analysts regard caching business data outside the enterprise’s persistent data store as a design optimisation technique and a design headache, not a primary design principle.

The data duplication concern

Analysts have to define workflows to synchronise copies of entity state that are duplicated beyond the bounds of automated transaction management.

 

The more the architect divides and distributes data stores, the smaller the business components get, and the lower the level of processing that business actors have to engage with and systems analysts have to address. That's life. That's workflow systems for you. That's the price you pay for wanting or needing a distributed system rather than a monolithic one.

Analysts do not have to subdivide what they are given

Implicit in some component-based development methods is the notion that componentisation is a good thing. In fact, it is not necessarily good that a large component is subdivided into smaller ones, unless the component parts must be distributed or will be used on their own.

 

Given a large data store, systems analysts do not have to subdivide it into smaller business components. They need worry only about the boundary of the data store, the interface that business people are conscious of. This principle may not hold up to the most destructive of testing, but it is a good working hypothesis for the systems analyst to start with.

Programmers may componentise for software design reasons

Programmers may choose to decompose the code that operates on one data store into smaller business components, but this is a matter of program design rather than systems analysis. In practice, the real motivation is often that programmers need to divide their program code into manageable chunks.

Choose the level of granularity for specification

One person's system is another's subsystem. Components, services, and rules can be nested. Components can be composed or decomposed. Specifiers must choose the level of granularity that suits the circumstances.

 

At the bottom level of a composition hierarchy, every damn statement in a software system can be regarded as a rule, or as implementing a rule. I don’t want to get lost in the detail of how a system is coded. I am not interested in what low-level modules/classes say to each other. I don’t speak of business rules at the level of operations acting on a small object in an OO program. I have to abstract to a level of system composition that business users can and should understand.

 

Systems analysts are interested in talking to business actors and other business people, and in defining their universe of discourse in the form of business rules. So systems analysts must abstract from the detail. The abstraction tool I use is composition. I employ the concepts of component-based development (CBD).

Systems analysts should abstract by composition to the point where they can usefully and meaningfully discuss business components and business services with business actors and other business people.

Define business rules at the level of business components

Systems analysts should specify business rules at the level of composition that is a business component, at the level of granularity where the scope of the component is large enough that:

·       actors are business people, or proxies for them such as web pages, workflow engines, and batch i/o programs,

·       interface operations are what actors perceive as discrete business services,

·       the persistent data can be managed as one coherent data store.

 

Think not so much of the user interface as of the server side of a system. Depending on your software architecture, a business component might be a business services layer or a data services layer. The data store of a business component is usually large and persistent enough to require a database management system to keep it safe and in order.

 

Do enterprise packages mean the end of analysis?

This is a slightly edited version of a contribution to an international discussion group. You can find more at dm-discuss-subscribe@yahoogroups.com.

“The first time that packaged, enterprise software solutions were thrust upon the IT market, they faded away. There were two outcomes: pockets of success and enterprise-scale messes.

The pocket-sized success:

·       customer only implements one or two modules and feels successful,

·       vendor declares victorious substantiation for their enterprise solution.

The enterprise-scale mess:

·       great proclamations; great expectations,

·       pockets of success (i.e. one or two modules successfully implemented) create a market,

·       market grows from silver bullet syndrome,

·       customers slowly realize that these are complex systems as they attempt to add more and more modules,

·       the complexity of the system overwhelms the customers (i.e., more and more resources are being consumed in efforts to sustain previous success levels)

·       customer demands more flexibility; vendor provides more flexibility

·       customer realizes that the flexibility is just as complex as, if not more complex than, the system itself,

·       customer realizes that the break-even point was a delusion and senses that they have lost control,

·       to regain control, customer reorganizes and new management initiates a project to create its own systems to phase out vendor modules,

·       management proclaims, 'we'll never do that again!'

Anecdote: In the late 90s, object proponent and psychologist David A. Taylor, PhD, founded Enterprise Engines Inc., a business engineering firm that was going to use object technology to design more effective organizations. He put together a consortium of five companies (most were manufacturers) that funded the effort. Have you heard of them?

A business enterprise is an extremely complex organism. Take a course in Organizational Design if you don't believe it. A new policy has some beneficial effect here but a detrimental effect over there, which is counter-balanced by an effort that then has implications in another area, and so on, and so on. Organizational Design is full of complexity.

Managers seek ERP systems because trying to understand their entire environment is a very difficult task, one not easily done and one that they’re not usually trained for. And so, if you can tell them that you have a package that will do that, great. In most business schools, management candidates spend little time in Organizational Design studies. Here is a very simple look at what most business schools focus their students on:

The Functions of a Business:

·       Accounting and Management Decision Making

·       Financial Management

·       Marketing Management

·       Human Resource Management

·       The Strategic use of Information Technology

·       Operations Management

·       Strategic Management

In one fashion or another, this summarizes the focal points of every business school. Organizational Design is seen as a component of each, rather than a discipline encompassing each. That's not to say it isn't taught; it is. But you can get an MBA in Management. You can get an MBA in Finance. You can get an MBA in Marketing. You can get an MBA in IT. You can get an MBA in Operations Management. I don't know of any school offering an MBA in Organizational Design.

That’s a long-winded way to my point. You can build an accounting system. You can build capital management systems. You can build marketing systems. You can build HR systems. You can build manufacturing/shop floor systems. But interconnecting them has been, and still is, a Herculean task.

Shrewd marketers would have you believe that just because one system can feed trivial data to another that can use that data, you have a system that represents the behavior of the enterprise. Rational individuals are grouped together. This produces irrational behaviors that, ultimately, beget a rationalized result. Go figure.

What has happened to manufacturing planning systems? Are they being implemented any more successfully than they used to be?

Is the new culture of "data quality" helping companies come to terms with the issues?

Are the software packages any better than they used to be? Do they properly deal with the difference between the operational systems and the data warehouse? Are they flexible enough to provide the responsiveness to clients that you correctly require?”

 

 


 

Footnote 1: Creative Commons Attribution-No Derivative Works Licence 2.0

Attribution: You may copy, distribute and display this copyrighted work only if you clearly credit “Avancier Limited: http://avancier.co.uk” before the start and include this footnote at the end.

No Derivative Works: You may copy, distribute, display only complete and verbatim copies of this page, not derivative works based upon it. For more information about the licence, see  http://creativecommons.org