Nexus: Managing for Change
A Discussion Paper
By Erick Engelke
Nexus is viewed by many as primarily an undergraduate computing environment, but we are now seeing dramatic growth in administrative and research areas. This surge is also increasing the need for management changes as we must treat office areas with a different security model than we do the undergraduate labs.
History: Windows 2000 and Active Directory 1.0
Our current security model was defined in 2000, and was based on the needs of administrators within the confines of what could be accomplished with the Windows 2000 product of that time. It should be remembered that Windows 2000 shipped with the first ever (version 1.0) Active Directory implementation, which was untested commercially and had serious limitations. Also, 3rd party tools were not yet directory enabled, so they required administrator privileges to perform even basic everyday operations.
The faculties had experience in Watstar, NT4 and Unix, where they were masters of their own domain. In the case of Watstar/Polaris, their local autonomy did not conflict with their ability to integrate into a larger community.
During discussions in 2000 with IST and others, two differing views emerged of how the campus Active Directory should be managed. The documents of that time stress the differences, possibly understating the commonalities.
Essentially it came down to the point that IST members wanted everyone to live within the confines of a well-designed Active Directory, which assumed Active Directory was as mature and integrated as Microsoft documentation claimed. The faculties believed they should not be limited in their abilities, especially since no one at the time could have the experience to predict the limitations imposed by this model, particularly before it was deployed in a real-world production environment.
Both groups wanted security, but could only accept models where their own needs would not be hindered by imposed privilege limitations, and both groups had valid points.
With the benefit of several years of experience, we can now explain the fundamental differences from that time.
Active Directory was immature; many basic operations required extraordinary privileges.
Some example challenges (just a few of the many):
- one could not view (or even review) a GPO created by someone else in another department without advanced privileges. Since sharing of installation details is important, this was a very common problem for us
- supposedly common operations, such as adding Norton Antivirus, required either advanced privileges or the use of a local privileged account on each workstation. The latter was considered an unnecessary security risk, and was one of the problems of the NT 4 management model that Active Directory was supposed to fix. Adding thousands of machines should not require constant use of highly privileged accounts.
- Windows 2000 DHCP server required special privileges for daily administration
It was only in 2003 and 2004 that many of these problems were resolved.
- Windows Server 2003 gives OU privileged accounts the ability to view and copy GPOs created by others elsewhere in Active Directory.
- In 2004, Symantec Antivirus 9 (which replaces Norton) now has an MMC snap-in which requires only OU privileges
- Server 2003 DHCP server daily administration can now be performed with OU privileges
There was a fundamental shift in Microsoft and 3rd party tools away from the NT4-style tradition of requiring total power to perform basic actions, toward a real directory-enabled system that allows for fine control of administrative powers.
The promise of Active Directory roles has finally become a reality. We now have the experience to determine what tasks we must perform in this environment, so it is appropriate to review our management model.
Since our initial production deployment in April 2001, Nexus has grown at a near-linear rate of approximately 750 computers per year across all six faculties, and we will likely reach 3,000 computers in the domain before the four-year anniversary.
When one considers that the six faculties have a total of 9,000 DNS names in active use (September 2004, including computers, switches, printers, etc.), Nexus represents approximately one third of all computers in the faculties.
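The arithmetic behind these figures can be checked with a short sketch. The numbers are the ones quoted above; the variable and function names, and the assumption of exactly linear growth, are ours and are for illustration only.

```python
# Figures quoted in the text. Exactly linear growth is a simplifying assumption.
COMPUTERS_PER_YEAR = 750    # approximate growth rate since April 2001
TARGET_COMPUTERS = 3000     # expected size of the domain
FACULTY_DNS_NAMES = 9000    # DNS names in active use, September 2004


def years_to_reach(target: int, rate_per_year: int) -> float:
    """Years needed to reach `target` machines at a constant yearly rate."""
    return target / rate_per_year


def share_of_names(computers: int, dns_names: int) -> float:
    """Fraction of DNS names accounted for by Nexus computers."""
    return computers / dns_names


years = years_to_reach(TARGET_COMPUTERS, COMPUTERS_PER_YEAR)
share = share_of_names(TARGET_COMPUTERS, FACULTY_DNS_NAMES)
print(f"{years:.1f} years to reach {TARGET_COMPUTERS} computers")
print(f"Nexus share of faculty DNS names: {share:.0%}")
```

Because the DNS count includes switches and printers as well as computers, the one third figure is an approximation rather than an exact share of computers.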
Growth in offices and research areas is significant. Several faculty and departmental computing groups are strongly encouraging their faculty and staff to use Nexus.
This requires a different model and different operating principles than we use for student computing.
- sensitivity to data collection. Office and research users are not working in open labs; they have a right to privacy. By contrast, we collect data on student usage to watch trends and plan for lab scheduling, software licensing, hardware upgrades, etc.
- security for data. The current management model permits access to a relatively small list of individuals spread across campus, but that list could be smaller.
There are some noteworthy trends in computing, particularly in response to security issues.
Windows, Unix, and many of their subcomponents increasingly rely on security defined by roles and permissions. It is good design to jail a process so its impact cannot exceed certain predefined security boundaries.
One of the most popular examples of this strategy is Java. Java’s impressive security comes from executing its code in a padded environment which ensures security, while still allowing the programmer to perform all necessary functions.
It is not academic altruism that has driven this strategy, but customer demand. Symantec (makers of an antivirus product) reports 4,496 new malware threats against Windows alone in the first six months of 2004, representing a 350% increase from the year before.
Our present management environment was based on solving real-world needs while deploying an immature Windows 2000 product. We adopted a model of professional trust between all the system administrators when Windows security features were insufficient to allow a model of well-defined system roles. With that experience, we now have a better idea of what security allowances we need to perform our daily tasks.
There is growing concern from administrators that we should revisit the management strategy in light of all these developments.
Many of us have observed that we rarely need to use our bang bang accounts (full Active Directory permission) anymore, and when we do need to use them, we usually only need a subset of their total powers.
There are presently 34 bang bang equivalent accounts. One is used by an automated system process; the rest are assigned to 26 individuals (multiple accounts for one person are necessary to have specialized attributes). Some of the accounts are inactive.
Distributed management must balance inter-administrator trust with exposure to risks by defining clear boundaries for administrators. To date these boundaries have been defined entirely by convention, and enforced by discussions and some crawling software. However, we believe they could now be delineated by unambiguous Active Directory permissions.
Our proposal is to leave the existing bang accounts in place (OU level permissions) - they still make sense - but usher in a new strategy for bang bang accounts.
Each existing bang bang account holder would keep that account, but be issued a new set of privileges. It would essentially be a customized box which grants every necessary privilege, but nothing more.
- We would share access to the students. The most common use of bang bang today is to move students to a different faculty. This is necessary when students change programs, or go on to study at a graduate level in a different faculty or department, or a variety of special cases. The other common use is to help deal with forgotten passwords when the student (with a valid photo student card) is at a lab session in another faculty’s building. Treating the students as a shared resource has been very successful.
- We would not normally share access to the various office environments. The local bang account would be the only way that these machines are touched, or that GPOs are installed on them. This would treat office environments almost as though they were on a separate management system for security aspects, while allowing us to use the scalability and sharing of Nexus.
- Student labs would remain under the control of the department that physically manages them, but administrators are encouraged to continue to share information and GPOs for lab management and application deployment.
To give an example, an AHS administrator would have total control over the AHS portion of the Nexus environment. He would also have total access to the student portion of Active Directory – allowing him to move users or change passwords, but not necessarily to access their file space. He would not have any access to office areas in any other part of the domain, and might have read access to labs of departments who wish to share that capability.
These privileges will be applied to groups, such as !!AHS_ADMIN, and all appropriate employees will be made a member of that group. If an employee leaves the unit, his removal from that group will immediately eliminate all the privileges bestowed by membership.
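The group-based approach can be sketched in a few lines. This is only a toy model of the idea, not how Active Directory stores or evaluates permissions; the privilege strings and the userid are hypothetical, and only the group name !!AHS_ADMIN comes from the text.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Group:
    """A hypothetical admin group: privileges attach to the group, not the person."""
    name: str
    privileges: Set[str] = field(default_factory=set)
    members: Set[str] = field(default_factory=set)


def effective_privileges(user: str, groups: List[Group]) -> Set[str]:
    """Union of the privileges of every group the user belongs to."""
    result: Set[str] = set()
    for group in groups:
        if user in group.members:
            result |= group.privileges
    return result


ahs_admin = Group(
    name="!!AHS_ADMIN",
    # Illustrative privilege labels, loosely following the AHS example above.
    privileges={"ahs:full_control", "students:move_user", "students:reset_password"},
    members={"alice"},  # hypothetical userid
)

before = effective_privileges("alice", [ahs_admin])
ahs_admin.members.discard("alice")  # the employee leaves the unit
after = effective_privileges("alice", [ahs_admin])

print(sorted(before))
print(sorted(after))
```

Since no privilege is ever attached to the individual, removing the membership is the only step needed; there is no per-person list of rights to track down and revoke.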
There are some people who perform specialized tasks as part of their job. For example, the employee who performs tape backups of the domain controllers would have extra permissions to be able to perform that task, but would not automatically have privileges to perform other actions in another faculty or department beyond normal responsibilities.
Even though the new bang bang accounts will significantly improve total security, it would be reasonable for WNAG to specify the appropriate number of bang bang accounts issued.
Domain Emergency Accounts
We would need to create a very small number of domain ‘emergency’ accounts. For discussion purposes, we have considered a possible total of four such accounts on the domain. The actual number and distribution will be decided by WNAG.
The emergency accounts would be capable of accessing all areas of Nexus, but they would be used as a last resort, and, by definition, would certainly not be used on a daily basis.
These accounts would be subjected to more stringent conditions, such as being restricted to a small number of trusted office or server room computers, and be subjected to intensive logging beyond that expected for normal users in this university environment.
Any time that these accounts are used, some notification or summary will eventually be sent to WNAG. The very nature of these accounts means that some extraordinary event happened, either planned or emergency, and that information must be shared.
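As a rough sketch of the conditions described above (use restricted to trusted machines, every use logged, and a summary available for WNAG), consider the following. The host names, account name, and function names are hypothetical; a real deployment would rely on the domain's own logon restrictions and audit logs rather than code like this.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List

# Hypothetical host names standing in for the trusted office/server room machines.
TRUSTED_HOSTS = {"serverroom-01", "admin-office-01"}


@dataclass
class EmergencyLog:
    """Keeps every attempted use of an emergency account, allowed or not."""
    entries: List[Dict[str, str]] = field(default_factory=list)

    def record(self, account: str, host: str, reason: str, allowed: bool) -> None:
        self.entries.append({
            "when": datetime.now(timezone.utc).isoformat(),
            "account": account,
            "host": host,
            "reason": reason,
            "outcome": "allowed" if allowed else "refused",
        })


def attempt_logon(account: str, host: str, reason: str, log: EmergencyLog) -> bool:
    """Allow the logon only from a trusted host, and log the attempt either way."""
    allowed = host in TRUSTED_HOSTS
    log.record(account, host, reason, allowed)
    return allowed


def wnag_summary(log: EmergencyLog) -> List[str]:
    """One line per attempt, suitable for a later notification to WNAG."""
    return [
        f"{e['when']} {e['account']} on {e['host']}: {e['outcome']} ({e['reason']})"
        for e in log.entries
    ]


log = EmergencyLog()
attempt_logon("!!emergency1", "serverroom-01", "planned domain controller restore", log)
attempt_logon("!!emergency1", "lab-pc-17", "no reason given", log)
for line in wnag_summary(log):
    print(line)
```

Refused attempts are logged as well as successful ones, since an attempt to use an emergency account from an untrusted machine is itself worth reporting.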
Hard Coding Change
There are two technical ways of achieving the end result described so far.
It can be done by starting with a predefined privilege level (e.g. domain Administrator) and then subtracting the areas one does not need. The other method is by building up a list of desired privileges from scratch. The options are subtractive versus additive.
AHS and Chemical Engineering both have a large number of office machines, and they have found it necessary to restrict outside access. In the current management structure, this is only possible by removing access from everyone but those who should have it. This is as secure as a locked screen door; it signals to other administrators that they have no business accessing these areas, but the security is easily defeated because they can just re-enable themselves. While we trust the administrators, the real danger is compromised software that can re-enable these privileges just as easily as a human could.
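We propose to use the other strategy, and build new security keys from scratch which have all the necessary privileges enabled, but no more. Because these accounts never held the excluded privileges, neither an administrator nor compromised software running under them can simply re-enable that access (barring any design flaws in the Windows software).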
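The difference between the two strategies can be illustrated with a toy model. The privilege labels below are invented for the example, and "acl:edit" is a stand-in for whatever right lets an account change its own access; real Active Directory permissions are far more granular than this.

```python
# Hypothetical privilege labels; "acl:edit" represents the right to change
# who holds which privileges.
ALL_PRIVILEGES = frozenset({
    "ahs_office:manage",
    "chem_eng_office:manage",
    "students:move_user",
    "students:reset_password",
    "acl:edit",
})


def subtractive(denied):
    """Start from full (Administrator-like) rights and take areas away."""
    return set(ALL_PRIVILEGES) - set(denied)


def additive(granted):
    """Start from nothing and add only the listed rights."""
    return set(granted) & ALL_PRIVILEGES


def can_reenable(privileges):
    """An account that can still edit access lists can restore what was removed."""
    return "acl:edit" in privileges


# An administrator from another unit, modelled both ways.
outsider_subtractive = subtractive({"ahs_office:manage"})
outsider_additive = additive({"students:move_user", "students:reset_password"})

print(can_reenable(outsider_subtractive))
print(can_reenable(outsider_additive))
```

In the subtractive case the account loses the AHS office area but keeps the means to grant it back, which is the locked screen door described above. In the additive case that means was never granted in the first place.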
This is a significant change in the management model. It will enhance security, with little or no discernible difference to the administrators who will be affected by the change.
Our goals can be outlined as:
- preserving local administrators’ ability to do the job unencumbered. This would include the ability to:
o add users
o install and manage workstations, servers and printers
o install software on unit’s workstations
o add scripts as necessary
o select, review and edit GPOs
o select an appropriate SUS and NAV strategy for the clients
- enhancing the effectiveness of local computing unit
o offer greater assurances of security to the office user community
o other initiatives not mentioned in this document, e.g. editing the login browser page
- reducing exposure to unnecessary privileges from ‘outsiders’ of the local department.
- providing the ability to select a peer group who could cover during vacations
- enabling of emergency accounts possessing extraordinary privileges
o to deal with crisis situations
o to provide backup in the rare event that no departmentally selected peer member can be reached
o to better document changes by requiring WNAG notification
- maintaining a system consistent with the distributed management philosophies embodied in Watstar/Polaris/Nexus of the last twenty years
The success of Nexus is entirely dependent on individual successes of computing units across campus. The principles of distributed management are central to our existence. If we cannot support local flexibility, Nexus would cease to be a good solution for many of the people it now serves.
Many of the current Nexus administrators have worked together for many years. They are trustworthy professionals who must be treated as peers on shared concerns, and as chiefs in their local departments. The proposed Nexus model gives them near-autonomous control over their own areas, as well as the ability to work unencumbered in a large shared environment.
Nexus growth has become more constant in the last year. Prior to that, there were spikes when faculties and/or departments moved student computing labs en masse.
The current growth rate cannot be maintained indefinitely. In less than four years, Nexus has grown to approximately one third of all faculty-owned DNS names, which include routers, switches, some printers, etc.
In Active Directory there are multiple plateaus of permissions. Certain permissions, such as the ability to restore an Active Directory, are quite obscure. In Nexus, we commonly talk about two special privilege levels.
The ! (bang) privilege was initially defined as the elevated privilege (over a normal unprivileged user) which can implement changes for the local organizational unit. Computing units are free to add as many bang userids as required for their operation. Other units are not affected at all by their bang decisions.
The !! (bang bang) privilege was defined as the elevated privilege equivalent to Administrator@nexus. !! accounts could initially access any Nexus computer, and any OU structure.
The !! accounts were intended to be used only occasionally, and by a small number of people. The Waterloo Nexus Advisory Group (WNAG) and Engineering Systems Advisory Group (ESAG) agreed to certain limitations on who and how many accounts would be created.
The number of !! accounts has risen over the years, but the ratio of computers per !! account indicates that the trend should not be considered a failure: we have achieved 100 or more computers per !! account.
This document is my attempt to express a very complex problem in relatively simple terms. I had input from numerous administrators, and some of the ideas and assumptions are based on Bruce Campbell’s original decisions when he created Watstar and the distributed management model that we have enjoyed for two decades.
Many of the Nexus Active Directory references can be found on the web. I recommend starting with http://www.eng.uwaterloo.ca/~erick/nexus/useridconventions.html which includes historical links and best practices.
If any links are dead, or the data is more recent than described, I have backups of most of the original web pages that can be referenced.