Lecture Notes
WatITis 2006
Erick Engelke and Daniel Delattre
Software deployment is one of the promises of a managed
environment.  Our presentation reflects
the experiences we gained from managing the Nexus computer network, a system
with over 3,500 computers.
 
slide:
Nexus Software Deployment
Approximately 1,750 of our computers are in student labs, where we want
each machine to behave identically to the other computers in the room.  That way, computer courses can rely on one
set of instructions producing predictable results on all the workstations.
Another 1,800 computers are in private offices and research labs.  Again, we want the software to be delivered
reliably, but here we may need more flexibility for a variety of software to be
installed to meet the needs of the user. 
Office computers are not “one size fits all”.
Automated software delivery is very important for providing good
service to our users, and it is also necessary to make efficient use of our IT staff
time.
Slide Challenges
We have identified several challenges to reliable software deployment –
and they highlight issues that are not adequately addressed through the obvious
deployment strategies like GPO-based installs or even manual installs. 
We’ll start by outlining our goals.
Slide
Requirements 
Reliable 
–       
We want software to install reliably whenever
possible.  But when it doesn’t, or can’t,
we need to know about the failure so we can fix it before the user is
inconvenienced.  After all, reliable
software delivery is a big part of the argument for a managed environment.
Effective 
–       
there are many aspects
to effective delivery.  The distribution
should be fast, it shouldn’t require a user to have to take manual steps to
assist the installation, and it shouldn’t frustrate the user.  
–       
Some installation techniques require the user to step
through a series of windows or even make decisions.  This is not acceptable.
–       
Users also get frustrated if slow installs happen at boot
time – they begin to dread turning their computers off or rebooting.
–       
If users are afraid of rebooting, they are equally wary
of even logging off.  This makes software
installation more difficult.
Automated
-        
We want workstations to be remotely managed and the delivery
to be automatic.  At some time the
administrator will assign software to the workstations (such as through Active
Directory), and we want the workstation to get the software automatically,
hopefully at an appropriate time so it can do the installation correctly.  
-        
As you will see, we also want repairs to happen somewhat
automatically – nothing should require an actual visit – that simply isn’t
scalable.
-        
Another aspect of automation is that workstation rebuilds
should automatically bring the rebuilt system to the same software
configuration as before, and identical to its peers.
-        
Many technologies handle only certain types of
automation.  For example, GPOs handle only MSI automation, and RoboCopy
handles only file duplication.  We’ll see
examples where both strategies have limitations.
Targeted
-        
We need to be able to target specific software to specific
computers or groups of computers.  For
example, we may want to add some optical scanning software to all computers
with scanners physically attached, but not to computers without the
hardware.  Targeting means software goes
where it’s needed, but not where it isn’t (or isn’t
wanted).
-        
Some automation tools don’t handle targeted delivery.  They are fine for some groups of computers,
but they don’t handle the diverse needs of our big system.
Upgradeable and Removable
-        
Software installation is only one part of the software
lifecycle.  
-        
Upgrades and removal are also important but they are often more
difficult to accomplish.
Secure
-        
Any process for adding software must expose no new security
risks.
slide –
Possible Solutions
There are several solutions which campus administrators have used or
investigated.
As some people remind us, there is always the possibility of manual
installation – sitting down with a bunch of CDs.
RoboCopy is a free Microsoft-provided
tool which is very convenient for copying files to a bunch of Windows
computers.
Active Directory Group Policy Objects are the workhorse Nexus uses for
most software distribution.  
There are a variety of Microsoft and third party commercial
solutions.  These include Microsoft SMS
(Systems Management Server), Novell ZENworks, and LANDesk.
Open Source solutions always deserve a look.
And there may be others.
slide
Manual Installations
-        
Manual installations have one aspect of reliability – they
are vendor supplied so we would almost expect them to be ideal in terms of
reliability.
-        
Experience shows us that this is not true.  The results are not always identical and the
resulting software is not as predictable as with automated installs.
-        
Obviously manual installs are not as effective.  Someone has to visit every workstation, and
there is the inconvenience to all involved.
-        
Some strengths of manual installs are:
o       
software can be targeted to the intended machines
o       
upgrades and removals are usually possible, though they are manual
processes
-        
An often-forgotten weakness of manual installs is that the
administrator password is used and has a greater chance of being exposed.
slide RoboCopy
RoboCopy is a free Microsoft
tool.  It can be very convenient for copying
files to a number of destinations.
The concept of using RoboCopy for software
distribution is very similar to the old SysCtl we had
on Watstar, where we had a nightly refresh to install
or reinstall all needed files.
Compared to some of the other solutions, RoboCopy
is not very effective.  
For one thing, it cannot be used for targeted installs; it only
addresses workstations with identical images.
It is automated, but only suited for file copy installations, not for
MSI or executable installs.
The upgrade or removal process involves overwriting or deleting files.
RoboCopy has no reporting tools.  When it’s done, you
don’t know where it was unsuccessful.
slide
Active Directory GPOs
GPOs are Microsoft’s recommended way to manage software…
well, until you push their limits and need to buy the SMS product.
Our experience is that GPOs are reliable for
smaller software packages.  And later you
will see numbers to support this claim.
GPOs only handle MSI software packages.  If you have an EXE install, for example, you
have to convert it into an MSI – an inexact process that we will
also discuss.
GPOs are automated. 
They are built into Active Directory. 
This does not mean the automation always works.  If the install is unsuccessful, the GPO is
not attempted again, and there is no way that you will know about the failure.  You are left with a mess and no way to fix
it.
The GPO is effective for an initial install of application software.  The upgrade process is not very reliable, nor
is the removal process.  So GPOs do not handle the full software life cycle.
There are no reporting tools provided by Microsoft to gauge the results
or fix the failures.
slide – 3rd
party commercial solutions
There are several commercial options available.
Nexus has nearly 3,600 clients and is still growing.  Most products become very expensive with
these numbers.
Also, most commercial systems are locked into certain technologies and
less extensible than open source.  That
translates into less flexibility for our needs. 
We already have great control over our environment, and
we’ll show examples of how we tweak the experience – which we can do with open
source tools.
One strong benefit of most commercial tools is the excellent
reliability and reporting capabilities.  
slide open
source solutions 
There isn’t actually a large choice of open source
tools for our needs.
Of the ones available, some, like ANI and Unattended,
are best for identical systems.  They
could be used for some teaching labs, but really don’t meet our needs for
flexibility in hardware and software configurations.
Not all of them are in active development either.
There are many other site-specific technologies, and we see them at the
educational conferences.
We were intrigued by WPKG.  It is
popular and flexible.  It also has good
support and is actively kept up-to-date.
slide
Current Environment
Before we could pick among the tools, we needed to determine the
weaknesses of our current environment and find a way to measure the success of
the new tool.
Nexus relies heavily on Active Directory GPOs.  And GPOs rely on
MSI packages.
Not all software comes as an MSI package.
Some software is very
unreliable to install.  And upgrades and
uninstalls are extremely unreliable. 
A particularly vexing problem is that once software is installed
incompletely, it becomes very hard to detect. 
Microsoft offers no reporting tools for this problem.
For example, we have a 120 station computer lab which is used for
teaching all manner of Engineering and other courses.  All the computers are supposed to be
identical – they all had the same GPOs, but we always
heard reports of things not working on some machine or another.  Unfortunately, we rarely heard which exact
machine was missing the software, and even if we did, how would we fix the
problem other than a complete re-install?
slide GPO
Issues
There are other problems. 
Repackaging software means capturing the differences between the
preinstalled state and the same computer after the software is installed.
This strategy captures unrelated, unnecessary or even some damaging
elements.
Software installs also depend on certain logic, such as decisions for
compatibility.  For example, the vendor’s
installation program will look for certain components and install them only if
necessary, or make other decisions based on what is installed such as web
browser enhancements or additions to Word. 
There is logic necessary to complete the installs, and some of that can
be lost in the capture process.  
Large software packages seemed to install less reliably.  As we’ll show, we developed the technology to
actually detect this and even work towards correcting the situation.
slide GPO
tool
GPOtool is the name of a locally
developed tool.  It’s also the name of an
unrelated Microsoft tool – small world.
Our GPOtool looks for broken GPO-based software
and helps fix it – all done remotely from a management station.
It was initially created to deal with the problem in that 120 station
lab I mentioned.  They were supposed to
be identical computers with identical software, so what went wrong, and could we fix it?
GPOtool is now used in a lot of
places on campus.  Not only can it fix
problems, but it can give assurance that the installation is complete and
successful.  If an exam is to be held in
a computer lab, the last thing we want to have to worry about is whether the
software is installed.
GPOtool works by scanning the
computers to see what they have installed and whether the software is
completely installed.  
It creates a report detailing what is where, and also what is only
semi-there – the result of a failed install.
GPOtool can reset the GPO flags, so
that the next time GPOs are applied, the missing
software is reinstalled.
It also includes the ability to uninstall most software using the
vendor supported uninstall programs.
GPOtool’s reports can be exported to a
text file for further analysis or for inclusion in management documents.
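The scan-and-report idea can be pictured with a short sketch. This is not GPOtool’s code: comparing a list of expected files against the files found on each machine is an assumed detection method, and every name here (`classify`, `build_report`, the sample package and machine names) is hypothetical.

```python
from collections import defaultdict

def classify(expected, found):
    """Classify one package on one machine by comparing the files it
    should have installed with the files actually present."""
    present = set(expected) & set(found)
    if not present:
        return "missing"
    if present == set(expected):
        return "complete"
    return "partial"  # the "semi-there" result of a failed install

def build_report(expected_files, scans):
    """Return {package: {status: [machines]}} for every scanned machine.

    expected_files: {package: [file, ...]}
    scans: {machine: [file, ...]}  (files found on that machine)
    """
    report = defaultdict(lambda: defaultdict(list))
    for machine, found in sorted(scans.items()):
        for package, expected in expected_files.items():
            report[package][classify(expected, found)].append(machine)
    return report

# Hypothetical sample data: one package, three lab machines.
expected_files = {"cadtool": ["cad.exe", "cad.dll"]}
scans = {
    "lab-001": ["cad.exe", "cad.dll"],
    "lab-002": ["cad.exe"],
    "lab-003": [],
}
report = build_report(expected_files, scans)
for status, machines in report["cadtool"].items():
    print(status, machines)
```

In a scheme like this, the machines listed under `partial` would be the ones worth repairing remotely.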
slide GPOtool Observations
In our 120 station lab, GPOtool advised us of
what had worked reliably and what hadn’t. 
We were able to fix the failed installs without having to do total
re-installs.
In the case of MasterCAM, the 81% success
rate shows that 23 out of 120 machines failed. 
This underscores the unreliability of GPOs,
and gives us a difficult application we can try to install with a replacement
tool to gauge its usefulness.
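The MasterCAM figure follows directly from the counts quoted above. The function name `success_rate` is only for illustration.

```python
def success_rate(total, failed):
    """Percentage of machines on which the install succeeded."""
    return 100.0 * (total - failed) / total

rate = success_rate(total=120, failed=23)
print(f"{rate:.1f}%")  # 80.8%, which rounds to the 81% quoted above
```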
slide
Open Source: WPKG
WPKG is a popular open source tool for software life-cycle
management.  It can do deployment,
removal and updates.
There are both push and pull features.  Normally workstations are configured to pull
down new software at a convenient time, but one can also push software by
remotely triggering a pull.
WPKG supports MSIs, InstallShield
and other vendor supported formats including EXE files.
Like GPOs, it supports layered images – where
workstations can subscribe to multiple packages to build up the final image.
Files can be distributed for better total scalability.  For example, for our 120 station lab, we want
to distribute the images over four servers to keep performance peppy.  With GPOs this is
accomplished by having four GPOs – one for each
server.  WPKG would allow us to
inherently split the load without making redundant GPOs.
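These notes do not say how WPKG would split the load. Purely as an illustration of the idea, one way to spread 120 stations over four servers without maintaining four separate definitions is to derive the server from the workstation name. The server names, the `lab-NNN` naming pattern and the function `server_for` are all assumptions.

```python
import zlib

SERVERS = ["server1", "server2", "server3", "server4"]  # hypothetical names

def server_for(workstation, servers=SERVERS):
    """Pick a server deterministically from the workstation name."""
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    index = zlib.crc32(workstation.lower().encode("utf-8")) % len(servers)
    return servers[index]

# Count how many of the 120 lab stations land on each server.
stations = [f"lab-{n:03d}" for n in range(1, 121)]
load = {server: 0 for server in SERVERS}
for station in stations:
    load[server_for(station)] += 1
print(load)
```

Because the choice depends only on the name, a given workstation always returns to the same server, and no per-server list has to be kept.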
WPKG includes its own reporting tools. 
Though they are simply workstation based, they are easily made powerful
because Nexus already has mechanisms to collect the reports so we can summarize
the results.
WPKG has a very small footprint. 
slide
WPKG – overview
The way WPKG works is that we define the software packages, e.g. MATLAB, SPSS, or AutoCAD, that we wish to
deploy.  These packages can be
dependent on other packages, so a subscription to one package will drag along
the other packages on which it depends.
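The way a subscription "drags along" its dependencies can be sketched as a small depth-first walk. This is not WPKG’s own code, and the dependency names (`vc-runtime`, `java`) are invented for the example.

```python
def resolve(package, depends):
    """Return `package` plus everything it depends on, dependencies first."""
    order, visiting = [], set()

    def visit(name):
        if name in order:
            return  # already scheduled for install
        if name in visiting:
            raise ValueError(f"circular dependency involving {name!r}")
        visiting.add(name)
        for dep in depends.get(name, []):
            visit(dep)
        visiting.discard(name)
        order.append(name)

    visit(package)
    return order

# Hypothetical dependency table: package -> packages it needs first.
depends = {
    "matlab": ["vc-runtime"],
    "spss": ["vc-runtime", "java"],
}
print(resolve("matlab", depends))  # ['vc-runtime', 'matlab']
```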
Station profiles are defined with a list of packages.  For example, all computers in a lab are put
into one lab profile, and it can be given subscriptions to software packages.
Profiles can also extend other profiles.  For example, a machine in a lab with an
optical scanner needs to have all the other software of the lab PLUS the
scanner software.  So inheritance is
used.
An example is in order.
slide
WPKG Example
Our Engineering GAFF lab needs a variety of software.  All the stations in the lab need the same
software, so they are all described by the GAFF profile.
One computer has an optical scanner; it needs all the GAFF
software, PLUS the scanner software.  So
we make the GAFF SCANNER profile grab just the scanner software and inherit the
rest from the GAFF profile.
slide
GAFF lab Example
As you can see in this graphical view, we organize the computers into group profiles.  We can assign them some generic software
profiles – office and scientific software, and those profiles point to
individual software install packages which will be applied.
Our GAFF scanner inherits all the software from GAFF.  So if we add a new package to GAFF, we don’t
have to remember to add it to GAFF SCANNER – the association by inheritance
applies the software automatically.
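The GAFF and GAFF SCANNER relationship can be sketched in a few lines. The dictionary layout and the function `packages_for` are assumptions for illustration, not WPKG’s data model, and the package names are placeholders.

```python
def packages_for(profile, profiles):
    """Collect a profile's packages, including those inherited from the
    profiles it extends (parents first, duplicates dropped).
    Assumes the 'extends' links contain no cycles."""
    result = []
    for parent in profiles[profile].get("extends", []):
        for pkg in packages_for(parent, profiles):
            if pkg not in result:
                result.append(pkg)
    for pkg in profiles[profile].get("packages", []):
        if pkg not in result:
            result.append(pkg)
    return result

profiles = {
    "GAFF": {"packages": ["office", "scientific"]},
    "GAFF SCANNER": {"extends": ["GAFF"], "packages": ["scanner"]},
}
print(packages_for("GAFF SCANNER", profiles))
# ['office', 'scientific', 'scanner']

# Adding a package to GAFF reaches GAFF SCANNER without further edits.
profiles["GAFF"]["packages"].append("autocad")
print(packages_for("GAFF SCANNER", profiles))
# ['office', 'scientific', 'autocad', 'scanner']
```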
slide
WPKG web tool
To organize its data, WPKG uses XML files to list packages, profiles
and computer names.  
XML is an industry standard – which is a good thing.  And computers can easily read it and act on
it.
Unfortunately XML is hard for users to enter.
Also, administrators of a big system like Nexus need shared access to
the XML files so they can update the data as needed.  That complicates things because a human entry
error in the XML would render the whole WPKG-based system useless.
We needed a better way to maintain the XML files, one that would give us shared access and
prevent people from making syntax errors that corrupt the file.  So we added a web interface and database
backend for WPKG.  In this system, the
XML is generated as needed, and is always correct.
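Generating the XML from stored records, rather than typing it, is what keeps it well formed. A minimal sketch of that idea follows. The element and attribute names are modelled on WPKG’s profiles file as we understand it and should be checked against the WPKG documentation; the row layout standing in for a database query is an assumption.

```python
import xml.etree.ElementTree as ET

def profiles_to_xml(rows):
    """Build a profiles document from (profile, extends, packages) rows,
    such as might come from a database query."""
    root = ET.Element("profiles")
    for profile_id, extends, packages in rows:
        profile = ET.SubElement(root, "profile", {"id": profile_id})
        for parent in extends:
            ET.SubElement(profile, "depends", {"profile-id": parent})
        for package in packages:
            ET.SubElement(profile, "package", {"package-id": package})
    # The serializer escapes special characters, so the output always parses.
    return ET.tostring(root, encoding="unicode")

# Hypothetical rows: (profile id, profiles it extends, packages).
rows = [
    ("gaff", [], ["office", "scientific"]),
    ("gaff-scanner", ["gaff"], ["scanner"]),
]
xml_text = profiles_to_xml(rows)
ET.fromstring(xml_text)  # raises ParseError if the output were malformed
print(xml_text)
```

Since no one edits the file by hand, a stray bracket or unescaped character cannot take the whole system down.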
slide WPKG
BUSY Warning
The other notable shortcoming of WPKG was that users could log into the
system while software was being installed. 
Ideally the software is installed when the user is not active or present.
While GPOs seem to have this ability, some of
our GPO problems occur because the machine slows down during the install and
eventually times out doing GPO processing and allows users in.
What we did was enhance our Nexus login so we could flag the machine to
prevent logins during WPKG installs.  The
warning message is customizable.  
One way this solution is superior to the Microsoft GPO lockout strategy
is that an administrator can cancel the install (in a crisis) and allow the
user to log in.  In contrast, GPOs cannot be stopped once they start processing.  
Arts found this out when all of Microsoft Office started reinstalling
everywhere one unpleasant morning.
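The busy-flag behaviour described on this slide can be sketched as follows. The notes do not say how the Nexus login implements the flag; the marker file used here, the class `BusyFlag` and the file name are assumptions made only to illustrate the set, check and cancel steps.

```python
import os
import tempfile

class BusyFlag:
    """A marker file that exists only while an install is running."""

    def __init__(self, path, message="Software is being installed. Please wait."):
        self.path = path
        self.message = message  # the customizable warning text

    def set(self):
        """Called when the install starts."""
        with open(self.path, "w", encoding="utf-8") as f:
            f.write(self.message)

    def clear(self):
        """Called when the install finishes, or by an administrator
        who cancels the install in a crisis."""
        if os.path.exists(self.path):
            os.remove(self.path)

    def login_allowed(self):
        return not os.path.exists(self.path)

    def warning(self):
        """Return the warning text, or None when no install is running."""
        if self.login_allowed():
            return None
        with open(self.path, encoding="utf-8") as f:
            return f.read()

flag = BusyFlag(os.path.join(tempfile.mkdtemp(), "wpkg-busy.flag"))
flag.set()
print(flag.login_allowed(), flag.warning())
flag.clear()  # install finished, or administrator override
print(flag.login_allowed())
```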
slide
Possibilities
WPKG offers us some new possibilities.
With our web tool, it is possible to configure PC client software from
any web browser – even from a BlackBerry if needed.
We can generate reports so we know what software is being
deployed.  This is great for gauging the
interest in renewing software licenses.
It is possible to reduce duplication of effort in software packaging.  
And we can truly handle the software life cycle including upgrades and
uninstalls.
slide Our
Plan 2007
We currently use GPOs extensively – this will
not change any time soon.  In fact, this
investigation prompted the creation of GPOtool which
gives new life to GPOs by addressing the reporting
and repairing of GPO based software.
But we will also be using WPKG.  
Initially we will use it to target only difficult or unreliable
installs.  We will monitor the success of
those deployments, compare the results, and then decide whether WPKG is the right solution for us.
slide
Questions?