Nexus: Managing for Change
A Discussion Paper
By Erick Engelke
Nexus is viewed by many as primarily
an undergraduate computing environment, but we are now seeing dramatic growth
in administrative and research areas.
This surge is also increasing the need for management changes as we must
treat office areas with a different security model than we do the undergraduate
labs.
History: Windows 2000 and Active Directory 1.0
Our current security model was
defined in 2000, and was based on the needs of administrators within the
confines of what could be accomplished with the Windows 2000 product of that
time. It should be remembered that
Windows 2000 shipped with the first ever (version 1.0) Active Directory
implementation, which was untested commercially and had serious
limitations. Also, third-party
tools were not yet directory-enabled, so they required administrator privileges
to perform even basic everyday operations.
The faculties had experience in Watstar, NT4 and Unix, where they
were masters of their own domain. In the
case of Watstar/Polaris, their local autonomy did not
conflict with their ability to integrate into a larger community.
During discussions in 2000 with IST
and others, two differing views emerged of how the campus Active
Directory should be managed. The
documents of that time stress the differences, possibly understating the
commonalities.
Essentially, IST members wanted everyone to live within the confines of a
well-designed Active Directory, a position that assumed Active Directory was as mature and
integrated as Microsoft documentation claimed.
The faculties believed they should not be limited in their abilities,
especially since no one at the time could have the experience to predict the
limitations imposed by this model, particularly before it was deployed in a
real world production environment.
Both groups wanted security, but
could only accept models where their own needs would not be hindered by imposed
privilege limitations, and both groups had valid points.
With the benefit of several years of
experience, we can now explain the fundamental differences from that time.
Active Directory was immature, and many
basic operations required extraordinary privileges.
Some example challenges (just a few
of the many):
- One could not view (or even review) a GPO created by someone else in another department without advanced privileges. Since sharing of installation details is important, this was a very common problem for us.
- Supposedly common operations, such as adding Norton Antivirus, required either advanced privileges or the use of a local privileged account on each workstation. The latter was considered an unnecessary security risk, and was one of the problems of the NT 4 management model that Active Directory was supposed to fix. Adding thousands of machines should not require constant use of highly privileged accounts.
- The Windows 2000 DHCP server required special privileges for daily administration.
It was only in 2003 and 2004 that
many of these problems were resolved.
- Windows 2003 Server allows OU privileged accounts to view and copy GPOs created by others elsewhere in Active Directory.
- As of 2004, Symantec Antivirus 9 (which replaces Norton) has an MMC snap-in that requires only OU privileges.
- Daily administration of the Server 2003 DHCP server can now be performed with OU privileges.
There was a fundamental shift in
Microsoft and third-party tools from the NT4-style tradition of requiring total
power to perform basic actions, toward a real directory-enabled system
that allows for fine control of administrative powers.
The promise of Active Directory
roles has finally become a reality. We
now have the experience to determine what tasks we must perform in this
environment, so it is appropriate to review our management model.
Nexus Growth
Since our initial production
deployment in April 2001, Nexus has grown at a near-linear rate of
approximately 750 computers per year across all six faculties, and we will likely
reach 3,000 computers in the domain before the four-year anniversary.
When one considers that the six
faculties have a total of 9,000 DNS names in active use (September 2004,
including computers, switches, printers, etc.), Nexus represents approximately
one third of all computers in the faculties.
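The figures above can be cross-checked with a little arithmetic. The following is a minimal sketch that assumes growth is exactly linear at the quoted rate, which the text describes only as approximate; the constant and function names are illustrative and do not come from any Nexus tooling.

```python
# Figures quoted in the text; the strictly linear model is an assumption.
COMPUTERS_PER_YEAR = 750   # approximate growth rate since April 2001
FACULTY_DNS_NAMES = 9000   # DNS names in active use, September 2004


def projected_computers(years_since_launch: float) -> float:
    """Estimate the number of Nexus computers under strictly linear growth."""
    return COMPUTERS_PER_YEAR * years_since_launch


def share_of_dns_names(computers: float) -> float:
    """Fraction of faculty DNS names that the given computer count represents."""
    return computers / FACULTY_DNS_NAMES


if __name__ == "__main__":
    at_four_years = projected_computers(4)
    print(at_four_years)                       # projected count at the four-year mark
    print(share_of_dns_names(at_four_years))   # fraction of the 9,000 DNS names
```

Under this assumption, four years of growth gives 3,000 computers, which is one third of the 9,000 DNS names quoted.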
Growth in offices and research areas
is significant. Several faculty and
departmental computing groups are strongly encouraging their faculty and staff
to use Nexus.
This requires a different model and
different operating principles than we use for student computing.
As examples:
- Sensitivity to data collection. Faculty and staff in offices and research areas are not working in open labs, and they have a right to privacy. For students, by contrast, we collect usage data to watch trends and plan for lab scheduling, software licensing, hardware upgrades, etc.
- Security for data. The current management model permits access to a relatively small list of individuals spread across campus, but that list could be smaller.
World Trends
There are some noteworthy trends in computing, particularly in response to
security issues.
Windows, Unix,
and many of their subcomponents increasingly rely on security defined by roles
and permissions. It is good design to jail a process so its impact cannot
exceed certain predefined security boundaries.
One of the most popular examples of
this strategy is Java. Java’s impressive
security comes from executing its code in a restricted ("sandboxed") environment,
while still allowing the programmer to perform all necessary functions.
It is not academic altruism that has
driven this strategy, but customer demand.
Symantec (makers of an antivirus product) reports 4,496 new malware threats against Windows alone in the first six
months of 2004, representing a 350% increase from the year before.
Our present management environment
was based on solving real-world needs while deploying an immature Windows 2000
product. We adopted a model of
professional trust between all the system administrators because Windows security
features were insufficient to allow a model of well-defined system roles.
With that experience, we now have a better
idea of what security allowances we need to perform our daily tasks.
Our Plan
There is growing concern from
administrators that we should revisit the management strategy in light of all
these developments.
Many of us have observed that we
rarely need to use our bang bang accounts (full
Active Directory permission) anymore, and when we do need to use them, we
usually only need a subset of their total powers.
There are presently 34 bang bang equivalent accounts.
One is used by an automated system process; the rest are assigned to 26
individuals (some individuals need multiple accounts with specialized
attributes). Some of the accounts are
inactive.
Distributed management must balance
inter-administrator trust with exposure to risks by defining clear boundaries
for administrators. To
date these boundaries have existed entirely by convention, enforced by
discussions and some crawling software.
However, we believe they could now be delineated by unambiguous Active
Directory permissions.
Bang Bangs
Our proposal is to leave the
existing bang accounts in place (OU level permissions) - they still make sense
- but usher in a new strategy for bang bang accounts.
Each existing bang bang account holder would keep that account, but be issued
a new set of privileges. It would
essentially be a customized box which grants every necessary privilege, but
nothing more.
- We would share access to the students. The most common use of bang bang today is to move students to a different faculty. This is necessary when students change programs, go on to study at a graduate level in a different faculty or department, or in a variety of special cases. The other common use is to help deal with forgotten passwords when the student (with a valid photo student card) is at a lab session in another faculty’s building. Treating the students as a shared resource has been very successful.
- We would not normally share access to the various office environments. The local bang account would be the only way that these machines are touched, or that GPOs are installed on them. This would treat office environments almost as though they were on a separate management system for security purposes, while still allowing us to use the scalability and sharing of Nexus.
- Student labs would remain under the control of the department that physically manages them, but administrators are encouraged to continue to share information and GPOs for lab management and application deployment.
To give an example, an AHS
administrator would have total control over the AHS portion of the Nexus
environment. He would also have total
access to the student portion of Active Directory – allowing him to move users or
change passwords, but not necessarily to access their file space. He would not have any access to office
areas in any other part of the domain, and might have read access to labs of
departments who wish to share that capability.
These privileges will be applied to
groups, such as !!AHS_ADMIN, and all appropriate
employees will be made members of that group.
If an employee leaves the unit, his removal from that group will
immediately eliminate all the privileges bestowed by membership.
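The group-based scheme described above can be modelled in a few lines. This is only an illustrative sketch, not Active Directory code: the `Area` and `AdminGroup` classes, the access-level strings, and the sample user and unit names are all hypothetical, and real permissions would be expressed as ACLs on OUs rather than in Python.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Area:
    """A hypothetical slice of the domain: who owns it and what kind it is."""
    unit: str   # e.g. "AHS" or "ENG" (illustrative unit names)
    kind: str   # "office", "lab", or "students"


@dataclass
class AdminGroup:
    """Models a group such as !!AHS_ADMIN; privileges come only from membership."""
    name: str
    home_unit: str
    members: set = field(default_factory=set)
    # Units that have chosen to share read access to their labs (assumption).
    shared_lab_units: set = field(default_factory=set)

    def access(self, user: str, area: Area) -> str:
        if user not in self.members:
            return "none"                 # no membership, no privileges
        if area.kind == "students":
            return "manage-accounts"      # move users, reset passwords; not file space
        if area.unit == self.home_unit:
            return "full"                 # total control of the home unit
        if area.kind == "lab" and area.unit in self.shared_lab_units:
            return "read"                 # labs another department chose to share
        return "none"                     # other units' offices stay closed


if __name__ == "__main__":
    ahs = AdminGroup("!!AHS_ADMIN", "AHS", members={"alice"}, shared_lab_units={"ENG"})
    print(ahs.access("alice", Area("AHS", "office")))   # home unit
    print(ahs.access("alice", Area("ENG", "office")))   # another unit's offices
    ahs.members.discard("alice")                        # employee leaves the unit
    print(ahs.access("alice", Area("AHS", "office")))   # privileges are gone
```

Because every privilege hangs off group membership, removing a member is the only step needed to revoke access.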
There are some people who perform
specialized tasks as part of their job.
For example, the employee who performs tape backups of the domain
controllers would have extra permissions to be able to perform that task, but
would not automatically have privileges to perform other actions in another
faculty or department beyond normal responsibilities.
Even though the new bang bang accounts will significantly improve total security, it
would be reasonable for WNAG to specify the appropriate number of bang bang accounts issued.
Domain Emergency Accounts
We would need to create a very small
number of domain ‘emergency’
accounts. For discussion purposes, we
have considered a possible total of four such accounts on the domain. The actual number and distribution will be
decided by WNAG.
The emergency accounts would be
capable of accessing all areas of Nexus, but they would be used as a last
resort, and, by definition, would certainly not be used on a daily basis.
These accounts would be subjected to
more stringent conditions, such as being restricted to a small number of
trusted office or server room computers, and be subjected to intensive logging
beyond that expected for normal users in this university environment.
Any time that these accounts are used,
a notification or summary will be sent to WNAG. The very nature of these
accounts means that some extraordinary event happened, either planned or
emergency, and that information must be shared.
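The conditions proposed for emergency accounts (a short list of trusted computers, intensive logging, and a report to WNAG) can be sketched as follows. This is a toy model under stated assumptions: the `EmergencyAccount` class, its method names, and the host names are hypothetical, and a real deployment would rely on Windows logon restrictions and audit logs rather than application code.

```python
from dataclasses import dataclass, field


@dataclass
class EmergencyAccount:
    """Toy model of a domain emergency account (all names are illustrative)."""
    name: str
    trusted_hosts: frozenset              # the few computers it may be used from
    log: list = field(default_factory=list)

    def logon(self, host: str, reason: str) -> bool:
        """Record every attempt, and allow it only from a trusted host."""
        allowed = host in self.trusted_hosts
        self.log.append((host, reason, allowed))
        return allowed

    def wnag_summary(self) -> list:
        """Build the summary lines that would be forwarded to WNAG."""
        return [
            f"{self.name} used on {host}: {reason}"
            for host, reason, allowed in self.log
            if allowed
        ]


if __name__ == "__main__":
    account = EmergencyAccount("emergency1", frozenset({"server-room-1"}))
    account.logon("server-room-1", "planned restore")
    account.logon("lab-pc-17", "unknown")   # refused, but still logged
    for line in account.wnag_summary():
        print(line)
```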
Hard Coding Change
There are two technical ways of
achieving the end result described so far.
It can be done by starting with a
predefined privilege level (e.g. domain Administrator)
and then subtracting the areas one does not need. The other method is by building up a list of
desired privileges from scratch. The
options are subtractive versus additive.
AHS and Chemical Engineering both
have a large number of office machines, and they have found it necessary to
restrict outside access. In the current
management structure, this is only possible by removing access from everyone
but those who should have it. This is as
secure as a locked screen door; it signals to other administrators that they
have no business accessing these areas, but the security is easily defeated
because they can just re-enable themselves.
While we trust the administrators, the real danger is compromised
software that can re-enable these privileges just as easily as a human could.
We propose to use the other
strategy, and build new security keys from scratch which have all the necessary
privileges enabled, but no more. These
accounts cannot be compromised (barring any design flaws in the Windows
software).
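The difference between the two strategies can be illustrated with a small model. The sketch below is not Active Directory code: the privilege names are hypothetical, and it assumes, as the locked-screen-door argument above implies, that a subtractive account keeps the power to edit permissions while an additive account is never granted it.

```python
# Hypothetical privilege names, used only for illustration.
ALL_PRIVILEGES = frozenset({
    "manage_own_ou",
    "manage_students",
    "manage_other_offices",
    "edit_permissions",
})


def subtractive_account() -> set:
    """Start from full power and remove the areas that are not needed."""
    return set(ALL_PRIVILEGES) - {"manage_other_offices"}


def additive_account() -> set:
    """Start from nothing and grant only what the job requires."""
    return {"manage_own_ou", "manage_students"}


def try_reenable(privileges: set, wanted: str) -> set:
    """Model a person, or compromised software running as the account,
    attempting to grant itself a privilege. This succeeds only when the
    account is able to edit permissions."""
    if "edit_permissions" in privileges:
        return privileges | {wanted}
    return set(privileges)


if __name__ == "__main__":
    target = "manage_other_offices"
    print(target in try_reenable(subtractive_account(), target))  # door reopens
    print(target in try_reenable(additive_account(), target))     # door stays shut
```

In this model the subtractive account regains the removed privilege in one step, while the additive account has no path to it.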
Guiding Principles
This is a significant change in the
management model. It will enhance
security, with little or no discernible difference to the administrators who
will be affected by the change.
Our goals can be outlined as:
- preserving local administrators’ ability to do the job unencumbered. This would include the ability to:
  o add users
  o install and manage workstations, servers and printers
  o install software on the unit’s workstations
  o add scripts as necessary
  o select, review and edit GPOs
  o select an appropriate SUS and NAV strategy for the clients
- enhancing the effectiveness of the local computing unit
  o offer greater assurances of security to the office user community
  o other initiatives not mentioned in this document, e.g. edit the login browser page
- reducing exposure to unnecessary privileges held by ‘outsiders’ of the local department
- providing the ability to select a peer group who could cover during vacations
- enabling emergency accounts possessing extraordinary privileges
  o to deal with crisis situations
  o to provide backup in the rare event that no departmentally selected peer member can be reached
  o to better document changes by requiring WNAG notification
- maintaining a system consistent with the distributed management philosophies embodied in Watstar/Polaris/Nexus over the last twenty years
Final Comments
The success of Nexus is entirely
dependent on individual successes of computing units across campus. The principles of distributed management are
central to our existence. If we cannot
support local flexibility, Nexus will cease to be a good solution for many of
the people it now serves.
Many of the current Nexus
administrators have worked together for many years. They are trustworthy professionals who must
be treated as peers on shared concerns, and as chiefs in their local
departments. The proposed Nexus model
gives them near-autonomous control over their own areas, as well as the ability
to work unencumbered in a large shared environment.
Appendix
Nexus growth has become more
constant in the last year. Prior to
that, there were spikes when faculties and/or departments moved student
computing labs en masse.
The current growth rate cannot be
maintained indefinitely. In less than
four years, Nexus has grown to approximately one third of all faculty-owned DNS
names, which include routers, switches, some printers, etc.
In Active Directory there are
multiple plateaus of permissions.
Certain permissions, such as the ability to restore an Active Directory,
are quite obscure. In Nexus, we commonly
talk about two special privilege levels.
The ! (bang) privilege was
initially defined as the elevated privilege (over a normal unprivileged user)
which can implement changes for the local organizational unit. Computing units are free to add as many bang userids as required for their operation. Other units are not affected at all by their
bang decisions.
The !! (bang bang)
privilege was defined as the elevated privilege equivalent to Administrator@nexus.
!! accounts could initially access any Nexus
computer, and any OU structure.
The !! accounts were intended to be
used only occasionally, and by a small number of people. The Waterloo Nexus Advisory Group (WNAG) and
Engineering Systems Advisory Group (ESAG) agreed to certain limitations on who and how many accounts would be created.
The number of !!
accounts has risen over the years, but the ratio of
computers per !! account indicates that the trend
should not be considered a failure: we have achieved more than 100 computers
per !! account.
References
This document is my attempt to
express a very complex problem in relatively simple terms. I had input from numerous administrators, and
some of the ideas and assumptions are based on Bruce Campbell’s original decisions
when he created Watstar and the distributed
management model that we have enjoyed for two decades.
Many of the Nexus Active Directory
references can be found on the web. I
recommend starting with http://www.eng.uwaterloo.ca/~erick/nexus/useridconventions.html
which includes historical links and best practices.
If any links are dead, or the data
more recent than described, I have backups of most of the original web pages
that can be referenced.
Erick Engelke
October, 2004