Configuration management challenges
This page is published under the terms of the licence
summarized in the footnote.
How to manage large and complex systems? Some managers
assume that the answer to extra size is to add more resources, and the answer
to extra complexity is to add more formal change management processes.
Sometimes those answers help, but the mathematics of size and complexity,
combined with the nature of human psychology and social interaction, means they
don’t always help. This paper sets the scene for later proposals by describing
what makes configuration management difficult.
Increasing structural complexity
Distributed management responsibility
Scaling up a social organisation
Choosing item size and identifying dependencies
Managing description as well as reality
Maintaining parallel descriptions
Maintaining traceability records
Managing many states of one system
A configuration is a set of items that are related in
one structure. Configuration management is about monitoring and controlling
changes to the items or the structure. Change management embraces both change
control (which approves change requests) and configuration management (which
makes changes). That’s probably enough terminology definition for now, but the
appendix offers definitions of these and related terms.
One of the reasons why configuration management can be
more or less challenging is that some structures are more complex than others.
This table summarises two scales on which complexity might be measured.
Variegated items | | Heterogeneous |
Uniform items | Homogenous | |
| Generic relationships | Specific relationships |
A homogenous set contains uniform items. I have a bag
of marbles. I can add or remove a marble with no concern about its effects on
other marbles in the bag. It doesn’t matter how many people want to add or remove
marbles, since the actions of one person do not affect others (unless the bag
is full or empty). If the collection grows too large, its management can be
shared.
(My farm has too many fields for me to plough? I
father two sons and divide the field management between us. Our laptop asset
register lists too many devices? We hire more support people.)
A heterogeneous set contains
variegated items, where different item types need different attention. To
manage a collection of variegated items, we usually sort them or group them,
often under a hierarchical structure. Assembling related items together makes
it easier to give those items the special attention they need.
(I need to select a marble of the right colour for my game;
I put the different coloured marbles in different coloured bags. My theatre
audience have bought different value seats, so I must code and read their
tickets and direct them to the right seating area.)
In a homogenous network, the relationships are generic.
Items somehow depend on each other, but the relationships are not specific
to the items in the relationship.
(A house is built from many connected bricks; the
overall structure matters, but it doesn’t matter which particular bricks are
cemented together. A telco may add or subtract
subscribers without caring which of them talk to each other. If the rule is that a bag
of marbles must contain a marble of each colour, then before subtracting a
marble we must first check other marbles, but no particular marble.)
In a heterogeneous network, relationships are
specific. There is a structure in which items depend on each other, and the
relationships are specific to the items in the relationship. Now the manager of
the structure needs to know (remember or record) which items depend on which
other items.
(Highly heterogeneous networks are found in software,
where every component is different, and every component offers different
services to other components. Software configuration is difficult because a
change to any one item can affect several other items.)
Some pseudo mathematics
To generalise, the more variegated the items, the more
specialised the relationships, the more complex the configuration and the more
difficult configuration management becomes.
Variegated items | | Complex |
Uniform items | Simple | |
| Generic relationships | Specific relationships |
I imagine a formula for measuring M, the management
challenge presented by a configuration, where:
· I is the number of item types in a configuration.
· R is the number of relationship types between items.
· C is complexity, calculated as I * R.
· M = C * the sum total of items and relationships.
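The imagined formula can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name and the example figures (a brick wall and a small software system) are assumptions made up for the sketch, not measurements.

```python
def management_challenge(item_types: int, relationship_types: int,
                         items: int, relationships: int) -> int:
    """Return M = C * (items + relationships), where C = I * R."""
    complexity = item_types * relationship_types  # C = I * R
    return complexity * (items + relationships)


# Hypothetical figures, chosen only to show how quickly M grows.
# A brick wall: one item type, one generic relationship type.
brick_wall = management_challenge(item_types=1, relationship_types=1,
                                  items=1000, relationships=2000)
# A small software system: every component and many relationships differ.
software_system = management_challenge(item_types=100, relationship_types=20,
                                       items=100, relationships=300)

print(brick_wall)       # 3000
print(software_system)  # 800000
```

On these made-up numbers the software system has a tenth as many items as the wall, yet its M is more than 250 times larger, because the variety of item and relationship types multiplies the total.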
It might be no exaggeration to say that a one-person
business is 100 times more efficient than a 100 person business. And that is
partly down to the ability of one human brain to manage a configuration faster
and more efficiently than a social organization.
The human brain is amazing. All businesses rely on the
staggering ability of individual human beings to remember large quantities of
information about items in the real world and use that information to
manipulate those items. The billions of components in a human brain can
communicate instantly and efficiently in ways far beyond our comprehension.
Social organisations are far less sophisticated than
brains. Compared with cells within one human brain, people in a society
communicate slowly and inefficiently. Since there is no collective memory,
information about items in the real world has to be painstakingly recorded by
one person for subsequent inspection by other people.
When you alone are responsible for assembling
inter-related items into a structure, and changing it, you can manage a complex
configuration in your head. I don’t mean you can remember it exactly; I mean
you can become so familiar with a structure of related items that you can
maintain it with little or no recourse to documentation of relationships
between items.
You can hold a structure in mind well enough to
remember where to find an item, and which items depend on which other items. If
you have to make a change, you know which items of the structure need
attention. If you have to change one item, you remember which other items need
attention. And if you aren’t sure, you know where to look to find out, quickly.
My rule of thumb is that you (being intelligent enough
to read this kind of abstract discussion) can hope to manage a configuration of
between 100 and 1,000 items, depending on how complex the structure is.
There is a limit to what one person can do. If M is
the challenge presented by a configuration and MY is
the maximum challenge you can manage, then when M exceeds MY, you become
unreliable. There are two remedies.
Where the challenge lies in complexity rather than
size, you will document inter-item relationships, so you can analyse the
impacts of changing any one item. You will probably do this reluctantly, after
being forced into it by making mistakes. It slows you down, and there is a
limit to what one person can manage.
Where the challenge lies in size rather than
complexity, you will share the configuration management with others. How easily
can you do this?
Variegated items | Difficult | Very difficult |
Uniform items | Easy | Difficult |
| Generic relationships | Specific relationships |
Perhaps the most heterogeneous systems, where every
component is different and differently related to other components, are
software systems.
There is a huge step change from one person working on
a heterogeneous configuration to a group of two or three people working in
parallel on it. How to ensure the integrity of the structure – to ensure one
person’s work doesn’t interfere with another person’s work? The team has to
formalize change management processes. Both oral and written communication are
needed. And delays are introduced while one person waits for another to
complete a task.
Managers may assume that the answer to size is to add
more resources.
In “The Mythical Man-Month” Fred Brooks (ref. 1) set
out to explain why adding people to a late project tends to make things worse.
By his group intercommunication formula, the number of communication
channels between individuals in a group is n(n-1)/2,
where n is the number of people. For example: in a 50-person group, there are
1,225 channels of communication. In a 100-person group, there are 4,950
channels of communication.
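The formula is easy to check by evaluating it directly. The short sketch below does so for a few group sizes; the function name is an assumption chosen for the illustration.

```python
def communication_channels(n: int) -> int:
    """Number of distinct pairs among n people: n(n-1)/2."""
    return n * (n - 1) // 2


for size in (2, 10, 50, 100):
    print(size, communication_channels(size))
# 2 1
# 10 45
# 50 1225
# 100 4950
```

Doubling the group from 50 to 100 people roughly quadruples the number of channels, which is the disproportionate growth the argument relies on.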
If group success depends on everybody communicating
with each other, then adding people to a late project adds overheads and
increases group intercommunication at the expense of the problem solving or
product creation activity.
Managers may assume that the answer to complexity is
to add more formal change management processes.
The same formula can explain why adding items to a
configuration can make it unmanageable. The formula suggests the potential
complexity of a configuration rises disproportionately with the size of the
configuration. For example: in a 100-item configuration, there are 4,950
possible pairwise interdependencies.
So when you increase the size of a system
configuration, unless you are careful to minimize interdependencies, there will
be a disproportionate increase in the resources and budget needed for change
management. And this tends to exaggerate the effect of Brooks’s law. Now, an
ever increasing proportion of the effort is devoted to change management
processes, at the expense of actually building and maintaining the system
itself.
Brooks quipped that his book is called "The Bible
of Software Engineering" because "everybody reads it but nobody does
anything about it!"
The operational system – the one that actually runs –
is way, way too large and complex for us to document in every detail. All
system descriptions are abstractions from operational systems. The items in a
configuration management database are composites of a size we choose to record.
Configuration management standards may imply that
every inter-item dependency can and will be recorded before a change is
requested.
“Every dependency between a new item and those already
included in the configuration must be recorded at the point of registration.
“You need not describe the nature of a dependency: it
is likely to change during the life of the item and is better discovered by
impact analysis under change control. Note however that changes to dependencies
are changes to items, and so changes to the whole configuration.
“Record both the name and version number of all
dependent items. Dependences are necessarily version-specific and it is
essential to cross-relate the various versions of interdependent items as they
are changed.
“A complete record of item interdependences is a
pre-requisite for effective impact analysis. In its absence, it is extremely
difficult to determine accurately whether and how a given change will affect
costs and timescales.”
But the dependencies we choose to record are abstractions
from the multitude of relationships that exist in a run-time system. We hope
not to overlook one, but we cannot guarantee it unless we have some automated
way to find them all.
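To make the point concrete, here is a minimal sketch of impact analysis over recorded, version-specific dependencies. The item names, version numbers and the dictionary layout are all hypothetical, invented for the illustration; they are not taken from any standard or tool.

```python
from collections import deque

# Hypothetical dependency records: each (name, version) item maps to the
# (name, version) items it depends on.
DEPENDS_ON = {
    ("billing-app", "2.1"): [("orders-db-schema", "5"), ("app-server", "9.0")],
    ("reports-app", "1.4"): [("orders-db-schema", "5")],
    ("orders-db-schema", "5"): [("dbms", "12")],
    ("app-server", "9.0"): [("operating-system", "7")],
}


def impacted_by(changed, depends_on=DEPENDS_ON):
    """Return every item that depends, directly or indirectly, on `changed`."""
    # Invert the records: for each item, which items depend on it?
    dependents = {}
    for item, targets in depends_on.items():
        for target in targets:
            dependents.setdefault(target, set()).add(item)

    # Breadth-first walk outwards from the changed item.
    impacted, queue = set(), deque([changed])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


print(sorted(impacted_by(("dbms", "12"))))
# [('billing-app', '2.1'), ('orders-db-schema', '5'), ('reports-app', '1.4')]
```

The analysis can only follow the dependencies that were recorded. A run-time relationship that nobody registered is invisible to it, which is exactly the risk described above.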
In building a house, there is a system description
(documented by the architect) and an operational system (constructed by
builders). Both are configurations and should be under some degree of
configuration management. However, changing an operational system and changing
a system description are very different, with different risks, costs and
implications.
The system description – the architect’s drawings | The operational system – the house |
The architect produces an abstract design and bill of materials. | The builder buys and makes concrete parts, and completes a physical construction. |
The drawings are inter-related. The fireplace in one drawing should fit the space on another drawing. | The parts are inter-related. The real fireplace must fit in the space for it. |
Inconsistency in the description does not reveal itself. | Inconsistency in the operational system reveals itself, and may cause it to fail. |
The customer inspects the drawings and confirms them in conversation with the architect. | The customer and architect inspect the growing building and discuss modifications with the builder. |
The customer makes changes - introduces requirements not mentioned in the vision. | The customer and builder make changes, sometimes without telling the architect. |
The architect captures changes in the drawings. | As the house is built and the customer experiences it, the system description may be left behind. |
Changes to drawings are cheap and low risk. | Changes to operational systems are costly and risky. |
Analogies, like building a house, are both appealing
and dangerous. They can create false expectations.
Building a human or computer activity system is a
little like building a house. You do expect that:
· The system description should be internally consistent, though it might not be.
· The system description omits details included in the operational system. (A Gartner report said more than 80% of software has no documented requirements.)
· The operational system must be internally consistent; else the system will fall over.
· The operational system may depart from the system description.
But the world of IT is peculiar in that the
operational system maintained by people is so intangible and malleable. In many
ways, building activity systems is not like building houses.
· Activity system descriptions are obscure to customers. Documented procedures are intangible; they cannot be readily validated by inspection. The more detailed the system description, the less the customer understands it. Getting customers to “sign off” a system description is unreliable as a verification step.
· An operational software system is even more obscure. Executable code is readable only by a computer.
· The operational system is malleable. Components, processes and test cases are readily changed. A whole system may be rebuilt every day or so.
· System description is so unreliable that half the system development cost is system testing and change management.
The world of IT is also peculiar in that the
operational system maintained by people is a bottom-level, computer readable,
description of the system that actually runs.
It is normal to find a large and complex system is
described from several angles and at several levels. If there are describer,
tester and builder versions of the system that actually runs, then there are
four configurations.
· Requirements configuration: may encompass a vision statement, a requirements catalogue, use case definitions, data models and more.
· Testing configuration: should include test harnesses, test cases and test data; sometimes called the “executable requirements” configuration.
· Source code configuration: should include whatever developers maintain by way of source code, database schemas, input and output data formats, and unit tests; “everything you need to build anything, but nothing that you actually build” (Martin Fowler).
· Executable solution configuration: should be an operational system that is capable of running. It must include all the infrastructure software to be deployed on hardware alongside the coded solution (class libraries, workflow package, app server, object request broker, messaging system, operating system, etc.) and all the hardware devices the software is deployed to.
Obviously these configurations are related. In theory,
you can see them as a single configuration, in which every item is related
directly or indirectly to every other item.
But these parallel descriptions are at different
levels of abstraction – made using different methods and tools - by different
people - at different times. If these configurations were not separable, and
could not evolve independently, then a project would grind to a halt.
Managers and customers sometimes expect us to maintain
full traceability between distinct configurations – between system description
items (or requirements), test cases and operational system items (or
components). Sometimes traceability records are a contractual obligation. Yet
it often turns out that they are not actually needed, used or
trusted.
Configuration management standards imply that full
traceability can and will be maintained.
“The traceability process is part of the configuration
management processes.
It should be defined in the project/service initiation
documents and linked to the solution development/maintenance process.
All items in traceability records should be under
configuration management.
Traceability records should be extended as items are
approved and placed under configuration management.
Traceability records should be maintained throughout
specification, design, code, and testing.
Traceability records can be maintained in
spreadsheets, but it is better to use a specialist tool, especially where there
are several hundred requirements, and several thousand test cases and solution
components.
Traceability records should be visible to all project
participants.”
Yet in practice, it proves very difficult to maintain
the comprehensive traceability records implied above. Traceability records are
fertile ground for the quality auditor looking for gaps and discrepancies. The
documentation is a headache to produce; it is tedious and unproductive work.
People are under pressure to find and solve design problems fast, which they
can do in their heads without recourse to the records. So records get out of
step with reality.
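The kind of gap an auditor looks for can be found mechanically once the records exist. The sketch below assumes a deliberately simple, hypothetical record layout (requirement to test cases, test case to components); real traceability tools hold far more detail.

```python
# Hypothetical traceability records, invented for illustration.
REQUIREMENT_TESTS = {
    "REQ-1": ["TC-1", "TC-2"],
    "REQ-2": [],            # a requirement nobody has written a test for
    "REQ-3": ["TC-9"],      # refers to a test case that is not recorded
}
TEST_COMPONENTS = {
    "TC-1": ["OrderService"],
    "TC-2": ["OrderService", "Invoice"],
}


def traceability_gaps(requirement_tests, test_components):
    """Return (requirements with no tests, test cases cited but not recorded)."""
    untested = sorted(req for req, tests in requirement_tests.items()
                      if not tests)
    dangling = sorted({test
                       for tests in requirement_tests.values()
                       for test in tests
                       if test not in test_components})
    return untested, dangling


print(traceability_gaps(REQUIREMENT_TESTS, TEST_COMPONENTS))
# (['REQ-2'], ['TC-9'])
```

A check like this shows only that the records are incomplete or inconsistent with each other. It cannot show whether the records still match the system people are actually building.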
Some people argue that higher level system descriptions should be maintained in
perfect alignment with bottom level ones (cf. enthusiasts for the Zachman
Framework). Yet articles in the BCS Requirements Engineering newsletter have
suggested we must always question why traceability is needed.
It just won’t happen unless there is:
· Time and budget for it.
· A disciplined and effective process for doing it.
· Proof that it helps people.
It is commonly necessary not only to maintain parallel
configurations of the different kinds above, but also configurations at
different states of development.
Configuration State | Requirements | Test cases | Source code | Executable |
Initiation | Version 6 | | | |
Elaboration | Version 5 | Version 5 | | |
Construction | Version 4 | Version 4 | Version 4 | |
Testing | Version 3 | Version 3 | Version 3 | Version 3 |
Operation | Version 2 | Version 2 | Version 2 | Version 2 |
Archive | Version 1 | Version 1 | Version 1 | Version 1 |
You need a process that defines how a system version
moves from one state to the next, and back again if need be.
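Such a process can be sketched as a small state machine over the states in the table. The rule that a version moves only one state at a time, forwards or backwards, is an assumption made for this illustration; a real process would add approvals and other conditions.

```python
# The states from the table above, in order of progression.
STATES = ["Initiation", "Elaboration", "Construction",
          "Testing", "Operation", "Archive"]


def move(state: str, step: int) -> str:
    """Move one state forward (step=+1) or back (step=-1)."""
    if step not in (1, -1):
        raise ValueError("a version moves one state at a time")
    index = STATES.index(state) + step
    if not 0 <= index < len(STATES):
        raise ValueError(f"no state {'after' if step == 1 else 'before'} {state}")
    return STATES[index]


# Hypothetical example: Version 4 is promoted, Version 3 is sent back.
versions = {"Version 3": "Testing", "Version 4": "Construction"}
versions["Version 4"] = move(versions["Version 4"], +1)
versions["Version 3"] = move(versions["Version 3"], -1)
print(versions)
# {'Version 3': 'Construction', 'Version 4': 'Testing'}
```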
IT operations
teams manage operational systems that are composed of concrete hardware and
network items. They see applications as files/documents they deploy onto
machines. They don’t need to know what an application does or how it works.
They use what they call a Configuration Management Database.
IS development teams manage system descriptions at a much more detailed
level. Their components (modules or classes) may be a hundred or a thousand
times more fine-grained than the deployment units recognised by IT operations.
They use what they call a Software Configuration Management tool.
Either team,
in changing what they see as their configuration, can disable a configuration managed
by the other. “Disable” here might mean stop altogether. But it could also mean
reduce the performance or availability of existing software below an acceptable
level. E.g.
· IT operations might upgrade an operating system wherever it appears in the IT estate, unaware that this will disable an existing application.
· IS development might use a different database management system version from the one approved by IT operations.
· IS development might request new software to be deployed onto existing hardware, unaware that it will disable some other software already deployed.
So, each team
has to take care they understand the impact of a change on the configuration(s)
managed by other teams - despite the fact that they use different configuration
management repositories.
How to ensure
both integrity and agility? You can’t maximise both. To ensure integrity you
need careful change management, but that adds costs and slows you down, and
makes some changes infeasible. There is a tension between integrity (requiring
thorough system description and strict configuration management) and agility
(requiring speed of system adaptation/evolution).
The more you
have to document configuration items and their relationships, the more you have
to share the management of a configuration, the more descriptions you have to
maintain of the system you are working on, the more overheads you add to what
might be called the real work.
Change
management consumes so much time and cost that you must explicitly and
repeatedly decide where you want heavy change management processes, and where
you don’t. You must balance the costs of applying heavyweight processes against
the risks of applying lightweight processes. And to do this, you must
understand the various times and places that change management can be relaxed.
It seems
isolation of silos is death to integrity but essential to agility.
Change
management embraces change control and configuration management. Configuration
items are usually related in a structure, often a complex network
structure. A change to one item can have an effect on many other items. This
table defines key terms in change management.
Agile | Willing and able to respond speedily to change. |
Baseline configuration | A specification or product structure that has been formally reviewed and agreed upon. The basis for further development. Can be changed only through formal change management. E.g. a contract, a requirements catalogue, architecture documentation, or a hardware configuration. |
Change Control | The organisation and processes needed within change management to: monitor the potential sources of change; record change requests; perform impact analysis; decide which changes should be made. |
Change management | The organisation and processes needed both to exercise change control over a baseline and to perform configuration management. |
Configuration Item | An item in a baseline configuration. Could be a requirement, a source code component or a hardware device. Can be at any level of granularity. “Component of an Infrastructure under the control of configuration management. A configuration item can range from an entire system (hardware, software, documentation) to a single hardware component.” ITIL |
Configuration management | The organisation and processes needed within change management to establish a baseline configuration and apply changes to that baseline configuration. Involves work to: identify and document the characteristics of each item; define dependencies between items; control the introduction of new versions of items; report the status of configuration items and changes to them. |
Impact analysis | Analysis of a change (perhaps a new requirement or deliverable) to find its effects. How does it impact what has been done so far? How does it constrain what is planned for the future? Leads to an impact analysis report. |
Request for Change | “Form used to record details of a request for a change to any Configuration Item within an Infrastructure or to procedures and items associated with the Infrastructure.” ITIL |
References
Ref. 1: “The Mythical Man-Month: Essays on Software Engineering” by Fred Brooks.
Footnote:
Creative Commons Attribution-No Derivative Works Licence 2.0
Attribution:
You may copy, distribute and display this copyrighted work only if you clearly
credit “Avancier Limited: http://avancier.co.uk” before the start and include this footnote at the
end.
No Derivative
Works: You may copy, distribute, display only complete and verbatim copies of
this page, not derivative works based upon it.
For more
information about the licence, see http://creativecommons.org.