
Information Technology Services The Australian National University


 Office of the Director



The Australian National University

INFORMATION TECHNOLOGY SERVICES & PLANNING UNIT

Defensive Computing Strategies
for Desktop Computers and Local
Area Networks

6 February 1995
 (amended 15 February 2000)

                      ACKNOWLEDGMENT

     The Australian National University would like to  express
     its appreciation to the University of Melbourne which has
     given its  consent to modify its document entitled
    "Technical  Note 107: Defensive Computing Strategies for
     Desktop Computers and Local Area Networks" for use in The
     Australian National University.


        Copyright Information Technology Services
             The University of Melbourne, 1992

       Copyright  Information Technology Services
         The Australian National University, 1995


Contents:

1.	Introduction and Summary
			The Role of Information Technology
			Sharing of Responsibility
			Defensive Measures
2.	Summary of Policy and Procedures
			Administrative Controls
			Security Measures - Servers
			Security Measures - Workstations
			Backup, Archiving and Disaster Recovery Planning
3.	Computing Without Defences	
4.	Minimising Costs with Defensive Computing
5.	Organisational and Administrative Controls
6. 	Security
			Servers
			Workstations
7.	Failsafe - Backup, Archiving and Disaster Recovery
			Backup
			Backup Techniques
			Archiving
8.	Failsoft Strategies
			Failsoft - A Definition
			Failsoft Techniques
			Conditions for Effective Failsoft
				Redundancy of Equipment and Data Storage
				People and Data Mobility
				Equipment Mobility
				Attitudes, Knowledge and Skills
			Developing a Failsoft Strategy
9.	Implementation, Testing and Review
10.	Sources of Information
Appendix 1:  Password Selection Guidelines

1.  INTRODUCTION AND SUMMARY

The University has a very large investment in Information Technology.
The   IT   Directions   Statement  ("Information  Directions   Study:
Directions  Statement,"  Australian National University,  June  1993)
estimated that the University has a current investment in IT hardware
and  software  of  the order of $50m. The Statement  also  identified
recurrent  expenditure in excess of $24m per annum for equipment  and
software  purchased  by the University and for  IT  classified  staff
throughout the University.  An increasing proportion of academic  and
administrative  work is done using desktop computers.   Much  of  the
University's  administrative and academic information is  stored  and
transmitted electronically.  This information is a valuable  asset  -
the  University could not function effectively without  it,  and  the
replacement  cost of the information (if replacement  were  possible)
would  easily exceed the replacement cost of the computing equipment.
This paper is concerned with the risks associated with dependence  on
IT  and ways in which the University can minimise these risks and the
associated  costs.   The  focus  is on  effectively  and  efficiently
meeting  the  University's responsibility for data stewardship  in  a
distributed computing environment.

It  is important, particularly in a climate of prominent and credible
demands for accountability and efficiency in the use of public funds,
that the University make good use of its IT investment.  While IT has
many benefits, in both efficiency and quality, the technology has  to
be  managed  properly.  The benefits depend on  reliable  and  timely
access to data and systems.  Inherent in all computer systems  are  a
number  of  risks,  which must be understood and  prepared  for  with
defensive computing strategies.

The IT risks we face are -

  -  Loss of control of access to information.
  -  Loss of information.
  -  Loss of staff productivity in the event of system failure.
  -  Loss of or damage to computing equipment.
  -  Loss of use of information on departure of key personnel.
  -  Legal  and  ethical risks related to Intellectual  Property  and
     Privacy.
  -  Theft of information and equipment.
  -  Attack by computer viruses.
  -  Unauthorised tampering with University data.
  -  Ineffective information sharing.
  -  Under-utilisation of computing resources.

The  risks  are  not  new.   They apply  to  traditional  information
handling  systems.  In the days of filing cabinets and  photocopiers,
there  was always a risk of information being copied or removed,  and
of  files  being  lost or stolen.  The introduction  of  photocopiers
immediately   increased  the  risk  of  breach  of  copyright.    The
proliferation of IT changes these risks, but does not create them.


THE ROLE OF INFORMATION TECHNOLOGY

The   University's   Information  Technology   Directions   Statement
identified:

     To   maintain   its   leading  position   among   national   and
     international academic institutions and to fulfil its  Statement
     of  Objectives, the University will need to plan for, manage and
     use effectively the whole spectrum of IT.

The  objectives  of the University in publishing  this  note  are  to
ensure that computer systems and facilities throughout the University

  -  are  secure  from  unauthorised access  to  or  modification  of
     University data;
  -  are  secure  from unrecoverable loss of data as  the  result  of
     error or equipment failure;
  -  can be restored in a timely manner in the event of failure;
  -  meet University needs for timely information;
  -  minimise opportunities for theft;
  -  survive the departure of key personnel;
  -  do not infringe Intellectual Property or Privacy rights;
  -  facilitate legitimate access to and sharing of information;  and
  -  use University resources efficiently.

SHARING OF RESPONSIBILITY

Fifteen  years  ago, most computerised information in the  University
was physically stored in the Computer Services Centre (CSC).  The CSC
managed  the  storage  facilities in  such  a  way  that  if  a  user
inadvertently  lost information, it could usually be  recovered  from
backup copies.  Physical security was the clear responsibility of the
CSC,  as  was  the control of access to information  on  the  central
computer   systems.   Licensing  and  installation  of   system   and
application  software was also managed by the CSC.  The CSC  provided
in  essence  a  bureau  service, taking full responsibility  for  the
proper  management  of  the  majority of the  University's  computing
resources.

During  the last fifteen years the number of computers on campus  has
increased  a hundredfold. The bulk of this increase has been  due  to
the  deployment of desktop computers and small servers throughout the
University.  Most computerised information  is  now stored  on
workstations  or file servers physically present in  and  managed  by
departments,  faculties and schools.  Administrative data  entry  and
processing are performed by all departments, using their own  systems
as  well  as the University's central administrative systems operated
by the Management Information Services (MIS).

In  consequence,  the  sharing  of responsibility  for  IT  has  also
changed.   While data stewardship remains crucial to the University's
successful  use  of  IT, this stewardship must  now  be  provided  by
various  University services (such as MIS) and all other departments
working  in  partnership.   In a distributed  computing  environment,
responsibility  for  day to day management of computer  systems  must
reside  at  the  local  level.  Major parts  of  effective  defensive
computing  strategies  are  unavoidably  the  responsibility  of  the
department  controlling the computer.  IT User Support and  MIS  are
resourced  to provide advisory and training services, and,  in  cases
where  economies  of  scale  are significant,  or  devolution  cannot
reasonably  be  achieved, IT Services provide  centrally  funded  and
managed resources which complement those in departments.

In  support  of departmental systems, IT Services, principally  MIS,
are chartered to provide:-

  -  Guidelines  on  security,  backup and  archiving,  failsafe  and
     failsoft  mechanisms, virus management and compliance  with  the
     law.

  -  Public  Access volumes providing public domain or site  licensed
     software  for,  amongst other purposes, virus  protection,  file
     transfer,   file   compression,  backup  and   archiving,   word
     processing and other desktop applications.

  -  Training  in  the  use  and  management  of  Macintosh  and  IBM
     compatible   PC   workstations,  and  in  local   area   network
     administration.

  -  The  IT  Support  Forum  and mailing  list  (Local  IT  support-
     contacts)   for  discussion  amongst  staff  involved   in   the
     management or operation of IT Systems.

  -  Recommendations on standard software and hardware, and advice on
     configuration.

  -  Licensing and maintenance agreements.

  -  Access to administrative systems and their data.

DEFENSIVE MEASURES

SECURITY measures guard against theft of equipment and data,
unauthorised use of equipment and systems, unauthorised copying or
modification of University data, and infection by computer viruses.
Security measures are a response to risks which arise from the
possibility of illegal behaviour against the University's property.
Physical security measures are imperative - while the University
carries insurance that covers the theft of computer equipment, a
significant loss of productivity and information may still result
from theft. (The University's insurance policy provides cover for the
loss of or damage to all IT equipment owned by the University or for
which the University has assumed responsibility. Inquiries regarding
this insurance policy should be directed to Chancelry Services,
Financial and Business Services, ext. 4257.)

ADMINISTRATIVE measures control and monitor the use and management of
computer systems, with the aim of minimising security risks, ensuring
that the University complies with software copyright and privacy law,
and encouraging defensive practices.

The  University  tries  to  minimise the incidence  and  duration  of
computer downtime.  Even if security and administrative measures were
completely  successful in eliminating loss through  theft,  equipment
failure  would still occur because of accidental damage, old  age  or
design  or manufacturing fault.  PREVENTATIVE MAINTENANCE repairs  or
replaces faulty equipment before actual failure, thus minimising  the
consequential costs of failure.

When  computing equipment fails,  work stops, or worse  still  it  is
lost.  FAILSAFE measures minimise long term loss of data by providing
reliable recovery of lost data.

FAILSOFT  is  the  graceful degradation of service  as  one  or  more
components  of  a system fails.  The degradation may  include  slower
service  or less efficient and convenient ways of getting work  done.
A failsoft strategy succeeds if the work of a department can continue
despite failure of parts of the departmental computing system.

DISASTER RECOVERY PLANS encompass the actions to take place if  there
is  substantial loss of a computing facility, through fire  or  flood
for   example.   Such  plans  may  envisage  substantial   delay   in
replacement of equipment and provide for temporary measures  such  as
manual systems or borrowed equipment to be used in the interim.

EMERGENCY PROCEDURES are often expensive and provide no guarantee  of
success,  but offer the hope of recovery of data despite the  absence
of  failsafe.  These procedures rely on a high level of expertise and
are  usually not planned in advance.  They should be used only  as  a
last resort.

None  of these measures, in isolation, provides adequate control over
IT risks.  Departments, and individual computer users, should adopt a
combination of complementary defensive strategies.  The blend  chosen
will depend on the availability of equipment, software and expertise,
risk  assessment, the costs associated with potential losses and  the
cost of the defensive strategies.

Complementary defensive strategies are compared in Table 1.

Table 1:  Complementary Defensive Computing Strategies

  Strategy                    Description
  --------------------------  ------------------------------------------
  Security Measures           Physical and software measures to prevent
                              unauthorised removal or use of computers.
                              These measures provide protection against
                              malicious damage but not accidental loss.

  Administrative Controls     Assigning responsibility for security and
                              other measures, institutionalisation of
                              these measures, approval processes for
                              decisions with widespread effects.

  Preventative Maintenance    Replacement, refurbishment and inspection
                              of equipment before failure, to minimise
                              unexpected downtime.

  Failsafe                    Precautions taken against equipment
                              failure, maintaining an ability to recover
                              and continue work once the faulty
                              equipment has been repaired or replaced.

  Failsoft                    Precautions taken against equipment
                              failure with the goal of maintaining an
                              ability to get work done despite failure
                              of components in the system.

  Disaster Recovery Planning  A planned sequence of assigned actions to
                              be carried out if there is disastrous loss
                              of a computer facility (for example,
                              through fire or flood causing extensive
                              and irreparable damage to equipment).

  Emergency Procedures        Risky or expensive procedures, usually not
                              planned in advance, which may provide
                              recovery despite the absence of adequate
                              failsafe measures. These procedures are
                              adopted as a last resort.

Table 2 classifies defensive practices from several fields in an
attempt to illustrate the concepts.

Table 2:  Classification of Defensive Practices in Using Systems

  System        Defensive Practice    Benefit                   Type of Practice
  ------------  --------------------  ------------------------  ----------------
  Motor Car     Steering Lock         Reduced likelihood of     Security
                                      theft

  Snow Skis     Safety Bindings       Ski is released from the  Failsafe
                                      boot before the strain
                                      is sufficient to break
                                      ankle or leg

  Aircraft      Multi-Engine Design   If not all engines fail,  Failsoft
                                      the aircraft flies with
                                      reduced performance, and
                                      lands safely

  Motor Car     Brake overhaul        Brakes unlikely to fail   Preventative
                                      while in use              Maintenance

  Motor Car     Compulsory motor      Strong encouragement to   Administrative
                vehicle inspections   those who control         Control
                                      registered vehicles to
                                      maintain them in a safe
                                      condition

  City          Airlift of emergency  Minimise loss of life,    Disaster
                food, shelter,        avoid evacuation, and     Recovery
                hospitals,            minimise delay in
                bulldozers and        reconstruction
                building materials
                following natural
                disaster

  Experimental  Parachute             Safe descent of pilot     Emergency
  Aircraft                            after failure of          Procedure
                                      aircraft in flight

2.  SUMMARY OF POLICY AND PROCEDURES

ADMINISTRATIVE CONTROLS

1.   LAN  administrators  should be trained  in  the  principles  and
     operation of networking systems.

2.   The   LAN  administrator  should  record  information  such   as
     department  name,  physical  location  of  servers   and   brief
     descriptions of the hardware and software used on the network.

3.   LAN  administrators are required to maintain documented  records
     of  who  has  access  to the LAN and what  level  of  access  is
     provided.

4.   A  LAN  map should be maintained and an up to date copy provided
     to the Head, IT Services Network Group.

5.   LAN  names  and  numbering systems must be registered  with  the
     Head, IT Services Network Group.

6.   Only  network  versions of software and software for  which  the
     University has a site licence may be loaded onto a file server.

7.   No  software  may be run on any workstation unless permitted  by
     the owner of the copyright on the software.

8.   The  University  may  conduct audits  of  departmental  computer
     systems   to  measure  the  degree  of  compliance  with   these
     guidelines.

SECURITY MEASURES - SERVERS

1.   Servers  should be located in a lockable room, not in a corridor
     or  office.   Ideally,  the room should  be  separate  from  the
     workstations which access the server.  Access to the room should
     be  controlled by limited issue of keys to system administrators
     and their assistants who need physical access to the server.

2.   The room housing the server should be kept locked outside normal
     working hours.

3.   Ventilation  should be adequate (if the room is comfortable  for
     people then ventilation is probably adequate for the server) and
     the  temperature  should  be  kept  well  within  the  operating
     specifications of the server.  Temperatures below 10°C or  above
     28°C lead to unreliability and expensive damage.

4.   Where there is provision to do so, cables should be fastened  in
     their sockets with screws.

5.   Servers  should  be physically attached to an  immovable  anchor
     point.

6.   Servers  should  be  physically  marked  as  property   of   the
     University.

7.   A  detailed  and  accurate record should be  maintained  in  the
     departmental  asset register of the server, all the options with
     which  it  is equipped, and all licensed software  installed  on
     it.

8.   Network and operating system floppy disks should be kept  locked
     in a separate room from the server.

9.   Passwords should be required for all user access.  If, for  some
     reason,  it  is deemed necessary to allow "guest" or "anonymous"
     access,  disk  access should be limited to read-only  access  to
     those files that are needed by such users.

10.  Administrators must determine the level of access  available  to
     individual users on the basis of need.

     In  this  connection,  it should be noted that  under  Macintosh
     System  7, all Macintoshes can act as limited file servers.   If
     file sharing is enabled, the default settings give "Guests" full
     read,  write and delete permission on the computer's hard  disk.
     Similarly,  users of Windows for Workgroups should be  aware  of
     the  capability and risks associated with the sharing facilities
     provided by this product.

11.  Passwords assigned by the system administrator should be changed
     as soon as possible by the user.

12.  Users  should  choose  passwords that are  difficult  to  guess.
     Passwords  should  be  at  least  eight  characters  long.    In
     particular,  passwords such as "secret", or the user's  name  or
     initials  should  be avoided.  Passwords that are  normal  words
     compromise security in that brute force methods such  as  trying
     every  word from a dictionary are effective in discovering them.
     These  brute force methods are easily carried out by a  computer
     program.   Appendix  1 lists password selection  guidelines  for
     users.

13.  User  passwords  should be changed from time to  time.   If  the
     operating  system allows it, users should be obliged  to  change
     their  passwords at regular intervals.  The frequency with which
     password changes are required should not be so high as to  cause
     users to make insecure paper records of their passwords.

14.  User   passwords   should  not  be  written  down,   or   stored
     electronically in documents.  If it is absolutely  necessary  to
     keep  a  written record of the password, it should be  on  paper
     rather  than computer, and the paper should be kept in a  secure
     place.  A Post-It note attached to the user's workstation is not
     acceptable!

15.  Users  are held responsible for illegal access gained by use  of
     the  password.  Users must be advised of this responsibility  at
     the time the username is issued.

16.  Super-user  privileges such as the ability to create  authorised
     users  should be allocated strictly on the basis of need.  There
     should,  however,  be at least two people with these  privileges
     for each server.

17.  User accounts should be immediately disabled or removed from the
     system  in the event that the user leaves the University or  for
     some other reason no longer has a need for access.

18.  If  the  operating system allows recording of  user  logins  and
     login attempts, this feature should be used, and the logs should
     be  checked daily by the system administrator.  Users should  be
     advised that this is the case.
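
Part of item 12 above can be checked mechanically when a password is
set.  The following is a minimal sketch in Python, not a description
of any University system: the function name check_password, the
messages and the tiny word list are assumptions made for
illustration, and a real check would use a full dictionary.  It tests
only the three points item 12 names (length, the user's name or
initials, and dictionary words).

```python
# Illustrative only: the function name, word list and messages are
# assumptions, not part of any University system.
COMMON_WORDS = {"secret", "password", "computer", "university"}


def check_password(password, username, full_name, words=COMMON_WORDS):
    """Return a list of reasons the password is weak (empty if none found)."""
    problems = []
    lowered = password.lower()

    # Item 12: passwords should be at least eight characters long.
    if len(password) < 8:
        problems.append("shorter than eight characters")

    # Item 12: avoid the user's name or initials.
    name_parts = [part.lower() for part in full_name.split() if part]
    initials = "".join(part[0] for part in name_parts)
    personal = set(name_parts) | {username.lower(), initials}
    if lowered in personal:
        problems.append("is the user's name or initials")

    # Item 12: normal words are found quickly by a dictionary attack.
    if lowered in words:
        problems.append("is a dictionary word")

    return problems
```

An empty list means only that none of these three tests failed; it
does not show that the password is strong.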

SECURITY MEASURES - WORKSTATIONS

1.   Where there is provision to do so, cables should be fastened  in
     their sockets with screws.

2.   Workstations  should  be  physically attached  to  an  immovable
     anchor point.

3.   Workstations  should be physically marked  as  property  of  the
     University.

4.   Workstations  that  have a key lock facility  should  be  locked
     outside  normal  working  hours.  Unless  for  some  reason  the
     workstation  needs  to be kept on out of  hours,  it  should  be
     powered off overnight.

5.   A  detailed  and  accurate record should be  maintained  in  the
     departmental  asset  register of the  workstation  and  all  the
     options  and  software with which it is equipped.  The  location
     of the equipment should be recorded also.

6.   Staff  should be regularly reminded that equipment which is  the
     property  of  the  University may not  be  removed  without  the
     written permission of the Head of Department.

7.   At  least  once a year, an audit should be conducted  to  ensure
     that all computing equipment in a department's asset register is
     configured  and  located  as stated in the  register.   Portable
     equipment  such  as  laptop  and notebook  computers  should  be
     checked much more frequently.

8.   All  workstations  should be protected  from  computer  viruses.
     Suitable  virus  protection  software  for  Macintosh  and   IBM
     compatible  PC computers is available at no charge  from  public
     access  file  volumes  maintained  by  IT  User  Support.    The
     licensing  of  this software allows staff and students  who  use
     privately  owned computers in their University work to  run  the
     software on private machines, and this practice is encouraged.

BACKUP, ARCHIVING AND DISASTER RECOVERY PLANNING

1.   Backup  arrangements for each workstation and server  should  be
     determined  by  departments with due regard  to  the  costs  and
     benefits.   The procedures should be documented, as  should  the
     file recovery procedures.

2.   At  least  two backups of each machine should exist at  any  one
     time,  with one set being held in the department and the  second
     set being stored offsite.

3.   Backup  procedures  should be tested - it is  not  uncommon  for
     procedures  to be found inadequate only when it is necessary  to
     restore a lost file.

4.   For  servers,  a Disaster Recovery Plan should be initiated  and
     documented by the local administrator.  The plan should  include
     matters  such  as who is to be contacted if there  is  disaster,
     arrangements   for  reinstating  services  and  provisions   for
     reverting to manual procedures during system downtime.

5.   Disaster Recovery Plans should be tested annually.

6.   Disaster  Recovery Plans should be lodged with the Director,  IT
     Services.
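
One way to carry out the test called for in item 3 is to compare a
backup, or a trial restore of it, against the original files.  The
sketch below is in Python and makes several assumptions: that both
copies are reachable as ordinary directories, and that comparing
SHA-256 digests is an acceptable test of equality.  The function
names file_digest and verify_backup are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source_dir, backup_dir):
    """Return (missing, different): relative paths of source files that
    are absent from the backup, or whose contents differ from it."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    missing, different = [], []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        copy = backup_dir / rel
        if not copy.is_file():
            missing.append(str(rel))
        elif file_digest(src) != file_digest(copy):
            different.append(str(rel))
    return missing, different
```

Two empty lists indicate that every source file has an identical
copy.  Files changed since the backup was taken will be reported as
different, so the check is most useful immediately after a backup or
a trial restore.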

3.  COMPUTING WITHOUT DEFENCES

Information is often the most valuable computing asset in an  office.
The  effectiveness  of staff who use computers  depends  on  constant
access  to  the information.  Most cases of hard failure -  temporary
complete loss of service - cause loss of access to information.  Some
workstations in the University are not secure or failsafe, and theft,
damage  or malfunction would lead to irrevocable loss of information.
Most departments recognise the value of information and have failsafe
procedures  to  preserve  it, but they are not  failsoft.   They  run
systems  in which the failure of one or more components is likely  to
lead  to  complete loss of access to information until the  fault  is
rectified.

The  risks  are not insignificant - the University loses information,
and staff time, when computer systems fail.  Two incidents will serve
as  illustration.  These incidents occurred in departments with well-
run computer systems.  Names are fictitious; the incidents were real.

JIM SMITH
Jim  Smith  is  an  academic  who uses  his  IBM  compatible  PC  for
preparation of exams, preparing research papers and articles, general
correspondence,   electronic  mail,  and  research   on   Information
Technology.   His  department  has a cautious  approach  to  computer
viruses;   it  avoids software from dubious sources  and  runs  virus
checks  as  a  matter  of  course on  all  incoming  disks.   Student
workstations  and  disk  storage are quarantined.   Their  anti-virus
policy  is  intended  to prevent infection in the  first  place,  and
detect   infection   early.   Disinfection  had  not   been   closely
considered.

Despite the precautions, a virus carried on a virus-detection  floppy
disk  infected Jim's PC.  Eradication required reformatting the  hard
disk,  with  the  loss of all data.  He had archival  copies  of  his
software,  and  most  of his data was backed up to  floppy  disks  in
various  places.  There was no comprehensive backup on either  floppy
disk or the network fileserver, or to off-site archive, and no record
of the content and location of the various archives and backups.

Recovery  was  slow and painful, but most of the data  was  recovered
after  reformatting using emergency procedures based on disk  utility
software.   The  painstaking process required resources  not  usually
available in the department, and Jim had no access to his data or use
of  his computer for several days, during which he was unable to work
effectively.

FACULTY OF ACOUSTIC ENGINEERING
The  faculty  office  makes extensive use of word  processing,  mail-
merging and databases.  The workstations are networked IBM compatible
PC   machines  with  one floppy disk drive and  no  hard  disk.   All
software and data is stored on the faculty fileserver.

In the middle of the year, an old fileserver was replaced with a new,
more powerful machine.  The machine failed some time later, with disk
errors.   The  failures  continued despite  several  replacements  of
components including the disk drive, power supply and motherboard.  A
'hot  spare' file server  failed also.  The problem is thought to  be
with the local power supply;  the newer and more powerful fileservers
are  more  demanding and sensitive to current fluctuations,  and  the
server runs reliably with an Uninterruptible Power Supply.

The  process  of  successive failure and repair took several  months,
during which the office endured a number of unscheduled incidents  of
complete  loss of access to data and software for several days  at  a
time.   This was particularly serious towards the end of the Academic
Year,  when the office was especially busy with exam results,  Summer
School programs and student mailouts.

THE COSTS
These cases have two striking similarities.  In each case, users were
dependent on their computers, and knew it.  In each case, users  were
cautious.   There was no permanent loss of data.  But in  each  case,
one  or more people experienced considerable anxiety and frustration,
and  many working hours were effectively lost.  The cost of  the  two
incidents, in terms of lost working hours alone, is estimated at over
$6000.  The full cost is probably much higher.

4.  MINIMISING COSTS WITH DEFENSIVE COMPUTING

It  would  be  best  if  IT  systems never  failed.   The  University
encourages purchase of reliable systems, but even so, there  will  be
failures.  Hard disks with Mean Time Between Failure (MTBF) of 50,000
hours  are considered reliable by current standards.  In the case  of
machines such as fileservers, which are run 24 hours per day,  50,000
operational hours elapse in less than six years.  We have  more  than
100  file servers, and so we could expect, on average, more than  100
server  disk failures every six years:  more than one per month.   On
average,  a faculty running several file servers will experience  one
or more server failures each year.

A workstation runs about 1600 hours per year.  If disk MTBF is 50,000
hours  and the University has 6000 workstations, then an estimate  of
the incidence of workstation disk failure is

   1600 hours/machine/year x 6000 machines
   ---------------------------------------  =  192 failures/year
            50,000 hours/failure

Although it has no theoretical foundation, there is an empirical  law
of  computing which states that most equipment failures occur  during
times  of  peak workload, just before critical deadlines.  About  one
machine  in  thirty is likely to fail each year as a result  of  hard
disk  failure alone.  It is most unlikely that any faculty  will  not
experience such failures each year.
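
The estimates in this section can be reproduced with a few lines of
Python.  This is a sketch under simplifying assumptions: a constant
failure rate of 1/MTBF, exactly 100 file servers (the text says "more
than 100") running 24 hours a day, and the 6000 workstations and 1600
hours per year quoted above.  The function name is hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # continuous operation, as for a fileserver


def expected_failures_per_year(machines, hours_per_machine_per_year, mtbf_hours):
    """Expected failures per year, assuming a constant rate of 1/MTBF."""
    return machines * hours_per_machine_per_year / mtbf_hours


workstations = expected_failures_per_year(6000, 1600, 50_000)
servers = expected_failures_per_year(100, HOURS_PER_YEAR, 50_000)

print(f"Workstation disk failures per year: {workstations:.0f}")   # 192
print(f"One workstation in {6000 / workstations:.0f} per year")     # 31
print(f"Server disk failures per year: {servers:.1f}")              # 17.5
```

The server figure of about 17.5 failures per year is consistent with
the "more than one per month" stated above.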

The  University seeks reliable equipment, and not much  more  can  be
done  to  prevent these failures.  The cost of repair and replacement
is unavoidable, but is largely absorbed through long-term warranties.
The annual cost of staff time lost due to temporary unavailability of
equipment  and data is significant, as is the cost of permanent  loss
of data.  It is in minimising these costs that defensive computing is
important.   On  the  estimates above, hard  disk  failure  alone  in
workstations  and fileservers is likely to cost the  University  400-
500  staff  days  per  year.  Much of this cost  can  be  avoided  if
defensive strategies are in place.

Jim  Smith needed a single, comprehensive backup at a known location.
This didn't exist, because he did not have a quick process for backup
and  restore  as required.  As hard disks have grown from  a  typical
10 MB in 1985 to 100-200 MB in 1994, backup to floppy disk has become
impractical:   it is so time consuming that it is rarely  done  often
enough to provide real protection against loss of current data.

A  policy of regular backup of local hard disk data to the fileserver
would have provided the required safety net.  Such a backup takes  5-
60  minutes, depending on factors such as volume of data and software
used.   Restoration of data takes 5-10 minutes.  Such a  backup  runs
unattended, and so it can run at lunch time or if necessary overnight
with  minimal  impact on other work.  Had such  a  practice  been  in
place, Jim could have reformatted his disk, downloaded all his  files
from the server, deleted the (now known) infected software, and run a
new,  uninfected  virus-checker to check that the  problem  had  been
solved.
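
An unattended backup of this kind could be sketched as follows.  The
example is in Python and assumes that the fileserver volume is
mounted on the workstation as an ordinary directory; the function
name backup_to_server, the folder naming scheme and the number of
copies kept are all illustrative choices, not University standards.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_to_server(local_dir, server_dir, keep=2, now=None):
    """Copy local_dir into a timestamped folder under server_dir, then
    delete all but the `keep` most recent backups.  Returns the new folder."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    now = now or datetime.now()
    server_dir = Path(server_dir)
    server_dir.mkdir(parents=True, exist_ok=True)

    target = server_dir / now.strftime("backup-%Y%m%d-%H%M%S")
    shutil.copytree(local_dir, target)

    # Timestamped names sort chronologically, so the oldest come first.
    backups = sorted(p for p in server_dir.glob("backup-*") if p.is_dir())
    for old in backups[:-keep]:
        shutil.rmtree(old)
    return target
```

Run from a scheduler at lunch time or overnight, a routine like this
leaves a small number of dated copies on the server from which files
can be copied back after the local disk is reformatted.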

At  Acoustic  Engineering,  the fileserver  was  a  single  point  of
failure.  Fileserver data was regularly backed up to tape, and it was
possible  to  restore  the data to another server,  but  the  Faculty
workstations  could  not function effectively  unless  shielded  from
backbone  network traffic by the local server.  The  server  provided
the  only  available printing facility.  A suitable failsoft strategy
might include

  -  equipping  workstations with hard disks  and  regularly  copying
     shared  fileserver  data onto workstation disks,  together  with
     backup copies of the required software, or

  -  developing the skills and procedures required to get work  done,
     albeit  slowly, on floppy-only workstations without the network,
     and

  -  placing  one or more printers on suitable trolleys, and ensuring
     that  all  staff  with printing requirements  were  sufficiently
     skilled to move these printers between workstations as needed.

5.  ORGANISATIONAL AND ADMINISTRATIVE CONTROLS

In  most  cases,  individual workstations are networked.   While  all
workstations  have  access  to  network  resources  provided  by  the
University,  most  departments also provide some local  area  network
resources,  such  as fileservers and printers.   In  that  case,  the
department  should  appoint  a  Local  Network  Administrator,   with
responsibility for the proper management of the departmental  network
and liaison with ITS Network Services, and with authority appropriate
to those responsibilities.  Proper management of a Local Area Network
will  require setting of standards and procedures for use of  network
facilities  by  individual users, and close  liaison  with  users  to
ensure  that  individual workstations do not pose  a  threat  to  the
integrity of the network.

The  University has a responsibility under the law not to breach  the
Intellectual Property rights of software authors, and all  staff  are
under written instructions not to do so in the course of their work.

The following guidelines apply:

1.   LAN  administrators  should be trained  in  the  principles  and
     operation of networking systems.

2.   The   LAN  administrator  should  record  information  such   as
     department  name,  physical  location  of  servers   and   brief
     descriptions of the hardware and software used on the network.

3.   LAN  administrators are required to maintain documented  records
     of  who  has  access  to the LAN and what  level  of  access  is
     provided.

4.   A  LAN  map should be maintained and an up to date copy provided
     to the Head, IT Services Network Group.

5.   LAN  names  and numbering systems  must be registered  with  the
     Head, IT Services Network Group.

6.   Only  network  versions of software and software for  which  the
     University has a site licence may be loaded onto a file server.

7.   No  software  may be run on any workstation unless permitted  by
     the owner of the copyright on the software.

8.   The  University  may  conduct audits  of  departmental  computer
     systems   to  measure  the  degree  of  compliance  with   these
     guidelines.

6.  SECURITY

The  aim  of  security measures is to ensure, so far as is  possible,
that  equipment  and  the  information stored  on  it  is  safe  from
accidental  or  malicious damage, from unauthorised interference  and
from  theft.   A number of physical and administrative  measures  are
necessary to provide suitable security.

SERVERS
Servers  provide  important services to many users.   These  services
include  fileserving, printserving and electronic  mail  as  well  as
remote  login and file transfer.  Servers not only provide  important
services  to  local  users:   they are a part  of  the  communication
infrastructure which provides important but hidden services  such  as
packet   routing  and  protocol  conversion.   Appropriate   security
measures are:

1.   Servers  should be located in a lockable room, not in a corridor
     or  office.   Ideally,  the room should  be  separate  from  the
     workstations which access the server.  Access to the room should
     be  controlled by limited issue of keys to system administrators
     and their assistants who need physical access to the server.

2.   The room housing the server should be kept locked outside normal
     working hours.

3.   Ventilation  should be adequate (if the room is comfortable  for
     people then ventilation is probably adequate for the server) and
     the  temperature  should  be  kept  well  within  the  operating
     specifications of the server.  Temperatures below 10°C or  above
     28°C lead to unreliability and expensive damage.

4.   Where there is provision to do so, cables should be fastened  in
     their sockets with screws.

5.   Servers  should  be physically attached to an  immovable  anchor
     point.

6.   Servers  should  be  physically  marked  as  property   of   the
     University.

7.   A  detailed  and  accurate record should be  maintained  in  the
     departmental asset register of the server, all the options  with
     which  it  is equipped, and all licensed software  installed  on
     it.

8.   Network and operating system floppy disks should be kept  locked
     in a separate room from the server.

9.   Passwords should be required for all user access.  If, for  some
     reason,  it  is deemed necessary to allow "guest" or "anonymous"
     access,  disk  access should be limited to read-only  access  to
     those files which are needed by such users.

10.  Administrators must determine the level of access  available  to
     individual users on the basis of need.

     In  this  connection,  it should be noted that  under  Macintosh
     System  7, all Macintoshes can act as limited file servers.   If
     file sharing is enabled, the default settings give "Guests" full
     read,  write and delete permission on the computer's hard  disk.
     Similarly,  users of Windows for Workgroups should be  aware  of
     the  capability and risks associated with the sharing facilities
     provided by this product.

11.  Passwords assigned by the system administrator should be changed
     as soon as possible by the user.

12.  Users  should  choose  passwords which are difficult  to  guess.
     Passwords  should  be  at  least  eight  characters  long.    In
     particular,  passwords such as "secret", or the user's  name  or
     initials  should be avoided.  Passwords which are  normal  words
     compromise security in that brute force methods such  as  trying
     every  word from a dictionary are effective in discovering them.
     These  brute force methods are easily carried out by a  computer
     program.   Appendix  1 lists password selection  guidelines  for
     users.

13.  User  passwords  should be changed from time to  time.   If  the
     operating  system allows it, users should be obliged  to  change
     their  passwords at regular intervals.  The frequency with which
     password changes are required should not be so high as to  cause
     users to make insecure paper records of their passwords.

14.  User   passwords   should  not  be  written  down,   or   stored
     electronically in documents.  If it is absolutely  necessary  to
     keep  a  written record of the password, it should be  on  paper
     rather  than computer, and the paper should be kept in a  secure
     place.  A Post-It note attached to the user's workstation is not
     acceptable!

15.  Users  are held responsible for illegal access gained by use  of
     their password.  Users must be advised of this responsibility at
     the time the username is issued.

16.  Super-user  privileges such as the ability to create  authorised
     users  should be allocated strictly on the basis of need.  There
     should,  however,  be at least two people with these  privileges
     for each server.  Super-user passwords should also be stored  in
     written form in a very secure place such as in a sealed envelope
     in the departmental safe or similar.

17.  User accounts should be immediately disabled or removed from the
     system  if  the user leaves the University or  for  some  other
     reason  no longer needs access.  Academic  users  leaving  the
     University  may, subject to the approval of the head  of  their
     area, retain access to their account for a strictly limited time
     in order to transfer their files and data.

18.  If  the  operating system allows recording of  user  logins  and
     login attempts, this feature should be used, and the logs should
     be  checked daily by the system administrator.  Users should  be
     advised that this is the case.
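
As an illustration of guideline 12 above, the following is a  minimal
sketch, in Python, of an automated check on a proposed password.  The
function name, the short word list and the sample username are
assumptions made for the example; they are not part of any University
system, and a practical check would load a full dictionary file.

```python
def password_problems(password, username, dictionary_words):
    """Return the reasons a password breaks guideline 12 (empty list if none)."""
    problems = []
    lowered = password.lower()
    if len(password) < 8:
        problems.append("shorter than eight characters")
    if lowered in {word.lower() for word in dictionary_words}:
        problems.append("is a dictionary word")
    if username and username.lower() in lowered:
        problems.append("contains the user's name")
    return problems


# Hypothetical word list, for illustration only.
WORDS = ["secret", "password", "computer", "elephant"]

print(password_problems("secret", "jsmith", WORDS))
print(password_problems("elephant", "jsmith", WORDS))
print(password_problems("tr4m-Qu1lt-09", "jsmith", WORDS))
```

The first two passwords are reported as unsuitable and the third
passes.  A check of this kind only rejects the most obvious choices;
it does not replace the guidelines in Appendix 1.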

WORKSTATIONS
Workstations  provide  one  user  at  a  time  with  access  to   the
workstation's  own  storage  and  processing,  as  well  as   network
facilities.  In the case of staff workstations, each machine is often
allocated  to a particular person, whereas other machines, especially
those  in  computer  laboratories, are usually  shared  between  many
people.  Failure or loss of a workstation, while serious, has a lower
cost  than loss of a server.  Similarly, although it is important  to
maintain  the security and confidentiality of University data  stored
on workstations, more compromise is possible.  The following security
measures are recommended.

1.   Where there is provision to do so, cables should be fastened  in
     their sockets with screws.

2.   Workstations  should  be  physically attached  to  an  immovable
     anchor point.

3.   Workstations  should be physically marked  as  property  of  the
     University.

4.   Workstations  which have a key lock facility  should  be  locked
     outside  normal  working  hours.  Unless  for  some  reason  the
     workstation  needs  to be kept on out of  hours,  it  should  be
     powered off overnight.

5.   A  detailed  and  accurate record should be  maintained  in  the
     departmental  asset  register of the  workstation  and  all  the
     options  and software with which it is equipped.   The  location
     of the equipment should be recorded also.

6.   Staff  should be regularly reminded that equipment which is  the
     property  of  the  University may not  be  removed  without  the
     written permission of the Head of Department.

7.   At  least  once a year, an audit should be conducted  to  ensure
     that all computing equipment in a department's asset register is
     configured  and  located  as stated in the  register.   Portable
     equipment  such  as  laptop  and notebook  computers  should  be
     checked much more frequently.

8.   All  workstations  should be protected  from  computer  viruses.
     Suitable software for Macintosh and IBM compatible PC  computers
     is  available  at  no  charge from public  access  file  volumes
     maintained  by IT User Support.  The licensing of this  software
     allows  staff and students who use privately owned computers  in
     their  University work to run the software on private  machines,
     and this practice is encouraged.

7.  FAILSAFE - BACKUP, ARCHIVING AND DISASTER RECOVERY

Physical  and access security precautions alone do not guarantee  the
integrity of data in computing systems.  User and programmer  errors,
and  hardware  failure, can all lead to corruption or loss  of  data.
With  emergency procedures, it is sometimes possible to recover  data
which  has apparently been lost, but this approach is both unreliable
and  expensive.  The only reliable defence is a backup and  archiving
program.

BACKUP
Data  backup is a common failsafe practice.  It is the regular making
of  a  copy of data and software, from which the files on a  computer
can  be  recovered.   The backup may be a full  backup,  which  is  a
complete  copy  of all data stored on a computer, a  partial  backup,
which  is  a copy of only some disks, directories or folders,  or  an
incremental backup, which is a copy of those files or documents which
have  been  changed since a previous backup.  Partial and incremental
backups are used to provide frequent backup of crucial data, at lower
cost than full backup.  One practice is to make a monthly full backup
and keep each one for two months.  Partial or incremental backups are
made more often and these are kept until the next full backup.
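
The selection of files for an incremental backup can be sketched as
follows.  This is a minimal Python illustration which assumes that
"changed since a previous backup" is judged by each file's
modification time; the function name is hypothetical, and real backup
software may use other indicators such as an archive flag.

```python
import os


def files_changed_since(root, last_backup_time):
    """Return paths under root modified after last_backup_time.

    last_backup_time is in seconds since the epoch, as returned by
    time.time() when the previous backup was made.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

A full backup corresponds to copying every file under the root, and a
partial backup to choosing a smaller root directory; the incremental
backup copies only the list returned here.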

Backup  copies  of  data  are the only way to  provide  for  reliable
recovery  of lost data.  While backup is the crucial, and  often  the
only,  component of failsafe schemes, it is also crucial to providing
mobility of people and data for failsoft strategies, and to efficient
execution of any Disaster Recovery Plan.

BACKUP TECHNIQUES
The  traditional backup technique is backup from local or server hard
disk  to floppy disk.  Backup software enables storage of many  files
from  the  hard  disk on a set of floppy disks, with files  split  as
necessary between floppies and with the directory or folder structure
preserved.   This  is a standard capability of MS-DOS  and  Windows,
using their standard commands.  Standard Macintosh system software does
not  provide  such a facility, but backup software  is  available  at
extra cost.

As  disk  capacity  has  increased, floppy  disk  backup  has  become
inadequate.  Using high density disks, the typical user will  require
40-60  floppy  disks  for a full backup.  Each  disk  takes  about  a
minute,  and  by its nature the process requires constant  attention.
Most  users will either not make backups, or will find an alternative
technique  which runs unattended.  If sufficient fileserver  capacity
is available, a workable technique is to back up from local hard disk
to  fileserver.  Alternatively, if the working data is stored on  the
fileserver,  backup from fileserver to local hard disk  is  feasible.
Removable hard disks and tape drives are other alternatives, but they
will  not  provide user mobility unless all workstations are equipped
with  these drives or the drives can be quickly and easily moved from
one computer to another.
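
The arithmetic behind the floppy disk estimate can be sketched as
below.  The sketch is in Python and assumes a nominal capacity of
1.44 megabytes per high density disk and one minute per disk, as
stated above; it ignores file system overhead and compression, and
the function names are chosen for the example only.

```python
import math

FLOPPY_CAPACITY_MB = 1.44  # assumed nominal capacity of a high density disk


def floppies_needed(data_mb, capacity_mb=FLOPPY_CAPACITY_MB):
    """Number of floppy disks needed to hold data_mb megabytes."""
    return math.ceil(data_mb / capacity_mb)


def attended_minutes(data_mb, minutes_per_disk=1.0):
    """Time during which someone must stand by to swap disks."""
    return floppies_needed(data_mb) * minutes_per_disk


print(floppies_needed(60), floppies_needed(80))
```

On these assumptions, 60 to 80 megabytes of data needs 42 to 56
disks, which falls within the 40-60 disk range quoted above.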

Archiving  and  data compression software, such as PKZIP  (shareware)
and  STUFFIT (shareware) can run unattended and significantly  reduce
the amount of storage needed for backup.
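
As an illustration of unattended compression, the following minimal
sketch uses the zipfile module from the Python standard library in
place of the products named above.  The function name is hypothetical,
and the archive is assumed to be written outside the directory being
compressed.

```python
import os
import zipfile


def compress_directory(source_dir, archive_path):
    """Write every file under source_dir into a compressed ZIP archive.

    archive_path must lie outside source_dir.  Returns a tuple of
    (original_bytes, archive_bytes) so the saving can be reported.
    """
    original = 0
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for dirpath, _dirnames, filenames in os.walk(source_dir):
            for name in filenames:
                path = os.path.join(dirpath, name)
                original += os.path.getsize(path)
                # Store paths relative to source_dir so that the
                # directory or folder structure is preserved.
                zf.write(path, arcname=os.path.relpath(path, source_dir))
    return original, os.path.getsize(archive_path)
```

The saving depends heavily on the type of data: text and spreadsheet
files usually compress well, while files which are already compressed
gain little.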

A  common  failure  of backup disciplines is that while  backups  are
regularly  made,  no-one involved develops enough  knowledge  of  the
processes  by  which backup data can be restored.  In establishing  a
backup  scheme,  data recovery processes should be  tested,  and  all
relevant staff should be given practical experience of the process.
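
One way to test a recovery process is to restore a backup to a spare
location and compare the result with the originals.  The following is
a minimal sketch in Python; the function names are hypothetical, and
the use of SHA-256 checksums for the comparison is an assumption of
the example rather than a requirement of this document.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_mismatches(original_dir, restored_dir):
    """Return relative paths that are missing from, or differ in, restored_dir."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(original_dir):
        for name in filenames:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, original_dir)
            dst = os.path.join(restored_dir, rel)
            if not os.path.isfile(dst) or file_digest(src) != file_digest(dst):
                bad.append(rel)
    return sorted(bad)
```

An empty list indicates that every original file was restored intact.
The comparison is only meaningful if the originals have not changed
since the backup was made.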

Floppy  disks are particularly prone to failure.  Although the drives
are  just  as  reliable  as other drives, the  disks  themselves  are
subject  to much greater physical wear and tear.  Floppies which  are
used for operational software or data should be backed up as a matter
of  course.   One approach is to keep data and software  on  separate
disks.   For each software disk, a duplicate should be made once  and
stored in a safe place.  On failure of the working software disk, the
backup  is put into service and a new backup is made.  Data  floppies
should  be  duplicated at frequent intervals.  On  Macintosh,  floppy
duplication is accomplished by dragging the source disk icon  on  top
of  the  destination  disk icon.  On IBM PC compatible  systems,  the
DISKCOPY command is used.  Both Mac and IBM PC compatible systems can
duplicate floppy disks using only one disk drive if necessary.

There is no central backup of individual server and workstation  disk
storage,  and  data  stored  on  servers  and  workstations  must  be
protected by local backup arrangements.

ARCHIVING
Some  software  and  data  files are used infrequently,  or,  at  the
completion of a project, are not expected to be required  again.   To
conserve  expensive primary disk storage, such applications and  data
should  be  removed.   Nevertheless, it is  not  desirable  that  the
information  be  irrevocably lost for all time.  For example,  it  is
University  policy  ("Guidelines  for  the  Responsible  Practice  of
Research", ANU Paper 578/1993, June 1993) that research data be  held
for  a  period of at least five years.  The appropriate action is  to
copy  the  information to cheaper, less accessible storage media  for
archival purposes.

Archiving  is  often confused with backup, as the same  software  and
techniques are used.  The difference between Backup and Archiving  is
that  whereas  the information in a backup may change frequently  and
exists  also  in primary storage, archived material does  not  change
often  and  may  be stored only on the secondary medium.   Backup  is
performed frequently as part of operating routines, whereas archiving
(and recovery from archival storage) is performed only when required.

Archiving   is  a  useful  technique  for  reducing  the  volume   of
information which is copied in day to day backup operations.  On many
workstations,  much of the primary storage space  is  used  to  store
application software, which does not often change.  User data  files,
which do change often, often account for as little as 25-50% of  disk
space.  If system and application software is archived, there  is  no
need for it to be included in regular backups.  In a department where
a  standard  software environment is established,  a  single  archive
stored  on a fileserver will meet the software recovery needs of  all
users, as well as facilitating rapid setup of new machines.

A  second  significant application of archiving is in  management  of
computer laboratories in which a large number of users have access to
a  number of similar workstations.  In such laboratories, there is  a
tendency for an initial standard software configuration to be rapidly
changed, so that each workstation becomes different.  Maintenance  of
such  workstations is difficult and time-consuming, unless a standard
archive is established at the outset.  Workstation management is then
achieved  by  periodically  reformatting the  workstation  disks  and
reloading  the  standard  software configuration  from  the  standard
archive.   This  process  can take as little  as  15  minutes  for  a
laboratory of 20-30 workstations, using an archive stored on a nearby
fileserver.

Archives should be documented - their long-term nature means that the
person  who  eventually  has to restore the  information  to  primary
storage may well not be the person who created the archive.

The  strengths  and weaknesses of a number of backup  techniques  are
summarised  in Table 3.  Successful and efficient backup arrangements
rely on a combination of several.

Table 3 - Backup and Archiving Techniques

Technique:   Simple file copy to floppy disk
Software:    DOS: COPY, XCOPY.  Windows: File Manager (drag and
             drop).  Mac: the Finder (drag and drop).
Strengths:   Easy and quick for small numbers of small files.  Relies
             only on standard desktop equipment and software.
             Inexpensive storage medium.
Weaknesses:  Directory/folder structure not always preserved; file
             sets larger than floppy disk capacity can't be backed
             up.

Technique:   Copy to multi-disk floppy disk backup set
Software:    DOS: BACKUP/RESTORE (DOS 3, 4 & 5), MSBACKUP (DOS 6),
             XCOPY.  Windows: mwbackup.  Mac: third party software
             (eg. Retrospect).
Strengths:   Relies only on standard desktop equipment.  Inexpensive
             storage medium.  Backup & restore software is part of
             standard DOS.
Weaknesses:  Requires human intervention every minute or so; takes
             about 1 minute per megabyte.  Requires additional
             software on Mac.

Technique:   Copy to spare space on same hard disk drive
Software:    DOS: BACKUP/RESTORE (DOS 3, 4 & 5), MSBACKUP (DOS 6),
             XCOPY.  Windows: mwbackup.  Mac: the Finder (File
             Duplicate).
Strengths:   Fast, uses only standard equipment and software.
Weaknesses:  Requires a lot of spare disk space.

Technique:   Copy to magnetic tape
Software:    DOS/Mac: software supplied with tape drive.
Strengths:   Fast.  No user intervention required.
Weaknesses:  Comparatively expensive storage medium.  Requires
             special equipment and software.  Not all tape units
             have proved reliable.

Technique:   Copy to hard disk on another machine (eg. server to
             workstation or workstation to server)
Software:    DOS: BACKUP/RESTORE (DOS 3, 4 & 5), MSBACKUP (DOS 6),
             XCOPY.  Windows: mwbackup.  Mac: the Finder, third
             party software (eg. Retrospect/remote).
Strengths:   No user intervention required.  Extra protection if the
             copied-to disk is itself backed up.
Weaknesses:  Requires spare hard disk space.  Speed depends on
             network connection.  File restore capability depends on
             the network.

Technique:   File compression, in conjunction with above methods
Software:    DOS: PKZIP.  Mac: StuffIT, CompactPro.
Strengths:   Efficient use of storage space, faster data
             transmission and backup duplication.  Software provides
             flexible partial and incremental backups.  Compression
             runs unattended.
Weaknesses:  Overall compress and copy takes more time than straight
             copy.  Licence fees are payable for compression
             software.

Selection  of  backup  procedures  involves  consideration   of   the
availability and cost of equipment, software and staff, the risk  and
likely  cost of data loss, and the frequency with which backup should
be performed.

The following practices are recommended:

1.   Backup  arrangements for each workstation and server  should  be
     determined  by  departments with due regard  to  the  costs  and
     benefits.   The procedures should be documented, as  should  the
     file recovery procedures.

2.   At  least  two backups of each machine should exist at  any  one
     time,  with one set being held in the department and the  second
     set being stored offsite.

3.   Backup  procedures  should be tested - it is  not  uncommon  for
     procedures  to be found inadequate only when it is necessary  to
     restore a lost file.

4.   For  servers,  a Disaster Recovery Plan should be initiated  and
     documented by the local administrator.  The plan should  include
     matters such as who is to be contacted if there is a  disaster,
     arrangements   for  reinstating  services  and  provisions   for
     reverting to manual procedures during system downtime.

5.   Disaster Recovery Plans should be tested annually.

6.   Disaster  Recovery Plans should be lodged with the Director,  IT
     Services.

8.  FAILSOFT STRATEGIES

A  failsoft  strategy  has  the goal of enabling  the  department  to
continue  work comfortably, despite equipment failure.  The  goal  is
achieved through policies of

  - mobility of people and their work between workstations;
  - mobility of equipment between people and between locations;
  - consistent computing equipment and practices;  and
  - maintenance  of  equipment and skills above the minimum  required
     for effective and efficient work under ideal conditions.

Experience  suggests that although appropriate security and  failsafe
measures  are  in  place  in many departments,  comparatively  little
attention  has been paid to failsoft strategies in either  purchasing
decisions  or  operating procedures.  While the  other  measures  are
basic  and  essential, efficient management of departmental computing
requires  that  failsoft  at least be considered.   Failsoft  is  not
always  cost-justified a priori, but omitting it should be a deliberate
decision, made after considering the relevant costs and risk factors.

8.1  FAILSOFT - A DEFINITION

Graceful  degradation  of service as one or more  components  of  the
system fails.

SERVICE
System  capabilities used to get useful work done.  Examples  include
document  editing, printing, statistical data analysis, data storage,
database  maintenance and interrogation, access to  remote  computers
and databases, and sending, receiving and storing electronic mail.

DEGRADATION
Degraded  service may be slower, less convenient, less  efficient  or
less  attractive  than normal service levels.  For example,  while  a
networked  laser printer is unavailable, one might use a  cheap  dot-
matrix  printer  moved around the department to  provide  a  degraded
printing service.

GRACEFUL
Graceful  degradation implies that the degraded  service  is  quickly
available,  with  little  inconvenience or expense  other  than  that
associated with the degradation itself.

SYSTEM
The  computer system consists of a workstation (PC, Mac, Unix box  or
terminal),  usually located at the workplace of its  principal  user,
together  with equipment elsewhere in the department, the  University
and  throughout the world reached through the network, and the network
infrastructure itself.

COMPONENTS
Cables,   disk   drives,  disks,  CPUs,  RAM,   printers,   plotters,
fileservers,  mailservers, network gateways, modems, nameservers  and
any  other  items  which  the  system  requires  to  maintain  normal
services.

FAILURE
A  state  where  a  component does not perform its  normal  function,
causing normal services to become at least temporarily unavailable.

Failsoft  is  not  the  same  as  disaster  recovery,  failsafe,   or
preventative  maintenance, but it complements all of those  programs.
In  many cases, some of the practices required by a failsoft strategy
will  already  be  in place as part of failsafe or disaster  recovery
procedures.   In a department which has such programs in  place,  the
marginal cost of setting up a failsoft program can be very small.


8.2  FAILSOFT TECHNIQUES

In  principle,  failsoft techniques do not  depend  on  the  type  of
workstation used.  Similar techniques are available for Macintosh, PC
and  Unix  workstations.  Details vary across systems, in consequence
of their relative strengths and weaknesses.

NETWORK-BASED COMPUTING
In  some workgroups, it is convenient to keep the operational  copies
of  all  software and data on a fileserver volume which is accessible
to all workstations in the work group.  There can be cost advantages,
and   network-based  computing  facilitates  standardisation  of  the
workstation environment.  In the event of failure of any component of
a  workstation, work can continue immediately on another workstation,
provided one is available.  LANManager, PC-NFS and AppleTalk networks
(except  those  holding sensitive administrative  information)  allow
login  across the campus, so in the last resort the use of  a  public
workstation  in  the Leonard Huxley Building level  1  training  area
allows work to continue.

There  are disadvantages.  In the event of failure of the network  or
the  fileserver, all workstations are effectively disabled.  In  most
settings,   network   based  computing  has  noticeable   performance
deficiencies compared to workstation-based computing.

WORKSTATION COMPUTING
A  stand-alone  workstation is one which  is  not  dependent  on  the
network.  While electronic mail and access to remote systems will  be
network-dependent,  much  useful work can be  done  on  a  standalone
workstation.  All data and software is stored on local hard disk, and
printing  is  provided  by  a  printer  connected  directly  to   the
workstation.

Standalone  systems are rarely as cost-effective as those  with  some
degree  of reliance on the network.  Where people share data but  not
workstations, it will be better to keep working data on a  fileserver
to  which  all  members  of  the group have  access.   Network  based
printing  is  favoured in most cases because of economies  of  scale.
Nevertheless,  workstations  which can function  stand-alone  are  an
important  part of a failsoft strategy, as the standalone  capability
reduces dependence on the fileserver.

A well-balanced approach is to use standalone capable workstations in
a  networked  environment.  The network provides printer sharing  and
data  mobility,  whether through use of the  network  for  backup  or
through  locating  working  data  on  the  network,  with  backup  to
workstation disks.

HOT SPARES AND REDUNDANT EQUIPMENT
Service  provision despite component failure depends on either  using
equipment in ways other than normal, or on using other equipment.  In
some  cases, failure of a component is best dealt with by on-the-spot
replacement   with  spare  equipment.   Where  personal   workstation
utilisation  approaches  100%, it may  be  necessary  to  maintain  a
complete  spare workstation to ensure timely completion  of  mission-
critical work.  In most cases, the trend within the University of one
workstation  per desk provides sufficient redundancy,  provided  that
equipment  can  be  shared when necessary  and  that  user  and  data
mobility is arranged.

"JURY RIG" PRINTER SHARING
While  networking  is  the ideal method for printer  sharing,  it  is
important  to retain a printing capability even when the  network  is
unavailable.  There are two ways of achieving this:  both  depend  on
an ability to connect a printer directly to a workstation.

A  networked printer can be made portable by placing it on a trolley.
In the event of network failure, the printer can be wheeled around to
the various workstations and used as a local printer.  Alternatively,
a  lightweight, low quality dot matrix printer can be moved around as
necessary.

If  frequent movement of the printer is infeasible, another  approach
to  network failure is to connect the printer to a single workstation
for  the duration of the network failure.  As staff require printing,
they  can  transfer their work to floppy disks and take  it  to  this
printer  workstation.  If even this reconnection of  the  printer  is
impossible, a "worst case" option is to take work to public  computer
laboratories  such as those in the central facility  in  the  Leonard
Huxley Building, and print on public printers.  In these cases, it is
important   that  the  application  software  used  on   the   normal
workstations can be made available on the printer workstation  or  is
available on public workstations.  It is crucial that workstations be
equipped with compatible floppy disk drives.

ALTERNATIVE COMMUNICATION TECHNIQUES
Those who require remote login and file transfer services usually use
TCP/IP  software (FTP and Telnet), which provides those services  via
the  local  Ethernet  or  LocalTalk cabling.  There  are  alternative
techniques in some cases.  Kermit is a file transfer and remote login
protocol  for  serial  communication lines, and Kermit  programs  are
available  free for MS-DOS, Windows, Macintosh and Unix  workstations
as  well as many other systems.  Serial communication is slower  than
Ethernet  or  LocalTalk, but better than nothing.  Serial connections
can be arranged in several ways.

In  cases  where  connection  is  required  between  machines  in   a
department, an inexpensive RS-232 cable between serial ports  can  be
used  to cover distances up to about 100 feet.  For longer distances,
or  when no cabling is possible, a dial-up modem can provide a  login
and  file  transfer capability via the public telephone  network.   A
modem  is  required at both ends of the link:  ITS  Network  Services
operate  a  dial-up modem service that allows connection (subject  to
conditions of use restrictions on the dial-up modems and the computer
to  be  accessed) to any computer on campus connected to  the  Campus
Network.

In  cases where the volume of data is large or the frequency  of  the
need for data transfer is low, it is often most effective to transfer
data  from one machine to another via floppy disk.  A 1 megabyte file
can  be  transferred from one machine to another via floppy  disk  in
about  two  minutes  plus travelling time.  At 9600bps,  serial  file
transfer  of the same file will take about 20 minutes.  In  order  to
use   floppy  disk  transfer,  workstations  must  be  equipped  with
compatible  disk  drives.  Current Macintosh  computers  have  floppy
drives  which can read and write 1.44MB 3.5" IBM compatible PC floppy
disks.
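
The serial transfer estimate above can be checked with a short
calculation.  The sketch below, in Python, assumes that a megabyte is
1,048,576 bytes and that each byte occupies 10 bits on the line (8
data bits plus start and stop bits); both figures are assumptions of
the example, and the function name is hypothetical.

```python
def serial_transfer_minutes(size_bytes, bits_per_second, bits_per_byte=10):
    """Estimate the time in minutes to send size_bytes over a serial line.

    bits_per_byte=10 assumes 8 data bits plus one start and one stop bit.
    """
    return size_bytes * bits_per_byte / bits_per_second / 60


ONE_MEGABYTE = 1024 * 1024
print(round(serial_transfer_minutes(ONE_MEGABYTE, 9600), 1))  # 18.2
```

The result of about 18 minutes excludes file transfer protocol
overhead, which is consistent with the figure of about 20 minutes
quoted above.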

SUMMARY OF TECHNIQUES
Table  4  lists  some components which are subject  to  failure,  the
consequence  of  failure, ways of obtaining degraded service  pending
repair,  the  conditions  which apply to  degraded  service  and  the
failhard alternative.

Table 4:  Failsoft Responses to Component Failure

Failure:           Hard disk
Consequence:       Error message from workstation, no disk access.
                   If it is the system disk, startup fails.
Degraded service:  Restore data and software to another workstation.
                   Boot from a boot disk and work floppy-only, using
                   data and software from the network.
Conditions:        Up-to-date backup must exist on a medium
                   accessible from the replacement workstation.
                   Boot disk available.
Alternative:       Loss of access until repaired; possible permanent
                   loss of data.

Failure:           Floppy drive
Consequence:       Error message from workstation when drive is used.
Degraded service:  Read and write floppy disks on another machine,
                   transporting data via network.  Restrict local
                   operations to floppy formats which remain
                   available.
Conditions:        Workstation not dependent on floppy disk drive for
                   operation.  Access to networked workstation with
                   similar disk drives.
Alternative:       No access to floppy disk data until drive is
                   repaired.

Failure:           Monitor
Consequence:       Cabling correct, but image not present or
                   unstable.
Degraded service:  Use another monitor or workstation.
Conditions:        Access to another monitor or workstation, user
                   mobility.
Alternative:       No computing until monitor is repaired or
                   replaced.

Failure:           Power supply
Consequence:       No action when switched on.
Degraded service:  Restore data and software to another workstation.
Conditions:        Up-to-date backup, user mobility.
Alternative:       No computing until power supply is repaired or
                   replaced.

Failure:           Cabling
Consequence:       Components securely and correctly cabled, but not
                   working together.
Degraded service:  Borrow or share the required cable.
Conditions:        Spare cables, diagnostic skills and confidence
                   such as required to install a new system.
Alternative:       No computing until faulty cable is replaced.

Failure:           Printer
Consequence:       Printer does not print or does not print properly.
Degraded service:  Use another printer, through the network or by
                   local connection.
Conditions:        Mobility of local printers or arrangements to use
                   alternative network printers, and the necessary
                   configuration skills.
Alternative:       No printing until printer is repaired.

Failure:           Network server
Consequence:       One or more file, print, mail and other services
                   become unavailable.
Degraded service:  Work standalone - make alternative arrangements
                   for file transfer, printing, data and software
                   storage, communication with other users.
Conditions:        Workstations which can function standalone;
                   equipment and software for alternative file
                   transfer; printer/user mobility.
Alternative:       Severe disruption of work for all users of the
                   server until it is reinstated.

Failure:           Network infrastructure
Consequence:       Local network up but isolated.
Degraded service:  Most work undisrupted; use public workstations for
                   remote file transfer.
Conditions:        Access to public workstations with compatible
                   floppy disk drives.
Alternative:       Work which relies on campus-wide (or beyond)
                   networking is delayed.

Failure:           Keyboard/mouse
Consequence:       Keystrokes not received by computer.
Degraded service:  Replace with "hot spare" or use another computer.
Conditions:        Hot spare, or access to shared machines and user
                   mobility.
Alternative:       No computing until keyboard is replaced or
                   repaired.
   
8.3  CONDITIONS FOR EFFECTIVE FAILSOFT

The conditions for failsoft may be summarised as needs for

  - redundant storage of data, and deliberate equipment redundancy  -
     maintenance of equipment in excess of that required  to  provide
     services under ideal conditions;

  - flexible   allocation  of  equipment  to  individuals,   achieved
     through  maintenance  of  user  and  data   mobility,  equipment
     mobility and standardisation of hardware and software;

  - arrangements  with  nearby  departments  to  share  resources  to
     overcome temporary unavailability of normal services; and

  - maintenance  of the attitudes, knowledge and skills  required  to
     implement failsoft techniques.

These  conditions  may  best  be met within  a  policy  framework  of
equipment  "ownership" at department or workgroup level  rather  than
that  of  individuals,  and  continuous  management  and  review   of
contingency planning.

REDUNDANCY OF EQUIPMENT AND DATA STORAGE
Some  failsoft techniques depend on maintaining services with reduced
equipment  levels, but most depend on pressing other  equipment  into
abnormal  service to cover failure of equipment normally used.   This
implies  a  need to equip at a level greater than that  required  for
normal  operation.   A  universal  need,  for  failsafe  as  well  as
failsoft, is maintenance of more than one copy of working data.

In  some  cases, failsoft will require the acquisition  of  redundant
equipment,  such as hot spare units.  However, the trend  within  the
University  of one workstation per desk provides for some redundancy.
Although  most University staff will use computers in much  of  their
work, very few will make continuous use of the equipment allocated to
their  use.   For occupational health reasons if nothing  else,  many
staff will only make part-time use of their workstations.

A  consequence  is  that  the  University  is  not  aiming  for  100%
utilisation  of  workstation equipment.  A high rate  of  utilisation
would  be  75%.  This means that in a department with 12 workstation-
equipped  staff, there will be 12 staff workstations, but on  average
only  9  will  be  in use at any one time.  This redundant  equipment
level  is  well justified by the advantages of providing  staff  with
workstations which are available on demand.  At times when  equipment
failure threatens the effectiveness of one or more members of  staff,
it  also  provides  redundancy which can be  exploited  to  meet  the
conditions for failsoft computing.
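
The arithmetic behind this example can be sketched as follows.  The
expected number of idle machines follows directly from the figures
above.  The second function goes further and estimates the chance
that at least one machine is free at a given moment; it assumes that
each workstation is in use independently of the others, which is a
simplifying assumption made here for illustration and not a claim
made by this document.  The function names are hypothetical.

```python
from math import comb


def expected_idle(workstations, utilisation):
    """Average number of workstations not in use at any one time."""
    return workstations * (1 - utilisation)


def prob_at_least_idle(workstations, utilisation, k):
    """Probability that at least k workstations are idle.

    Assumption: each workstation is busy with probability `utilisation`,
    independently of the others (a binomial model).
    """
    idle_p = 1 - utilisation
    return sum(
        comb(workstations, i) * idle_p ** i * utilisation ** (workstations - i)
        for i in range(k, workstations + 1)
    )


print(expected_idle(12, 0.75))            # 12 machines at 75% utilisation
print(round(prob_at_least_idle(12, 0.75, 1), 3))
```

For 12 workstations at 75% utilisation the expected number idle is 3,
and under the independence assumption a free machine is available
roughly 97% of the time.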

PEOPLE AND DATA MOBILITY
Rapid  deployment of temporary replacement equipment to an individual
requires that the user's computing environment can be quickly set  up
on  another machine, and that the machine can be returned quickly  to
its  normal use.  It must be possible to physically move the user  to
the  machine or to bring the machine to the user's normal  workplace.
Where  it  is  possible to acquire floating or hot  spare  computers,
serious   consideration  should  be  given  to  laptop  or   notebook
computers.

In  most  cases,  data mobility will not be achieved through  regular
floppy-disk  backup:  the process is too slow and  requires  constant
user  intervention.   If  floppy disks are the  only  storage  medium
available  other  than workstation hard disks, then  great  attention
must  be  paid to efficiency.  The simple backup of the  entire  hard
disk  content  to  floppy is not efficient.  With  the  use  of  file
compression  and partial and incremental backup, it may sometimes  be
possible   to  bring  the  size  of  the  task  down  to   manageable
proportions.   In most cases, the use of network file server  storage
will  be  far  more  effective.  Data mobility  techniques  based  on
expensive  special-purpose devices such as removable hard  disks  and
tape cartridges should be avoided unless it is possible to equip  all
workstations with these devices.
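
To judge whether floppy-disk backup is manageable, the size of a
compressed incremental backup can be estimated before starting.  The
following is a minimal sketch only: it uses zlib compression as a
stand-in for whatever compression utility is actually in use, takes
the nominal 1,474,560-byte capacity of a 1.44MB disk (usable space
after formatting is slightly less), and all function names are
hypothetical.

```python
import zlib
from pathlib import Path

# Nominal capacity of a "1.44MB" floppy disk (assumption: 1440 x 1024 bytes).
FLOPPY_BYTES = 1440 * 1024


def files_changed_since(root, since_epoch):
    """Yield files under root modified after since_epoch (seconds)."""
    for path in Path(root).rglob("*"):
        if path.is_file() and path.stat().st_mtime > since_epoch:
            yield path


def compressed_size(path):
    """Size in bytes of the file after zlib compression."""
    return len(zlib.compress(path.read_bytes()))


def floppies_needed(total_bytes, capacity=FLOPPY_BYTES):
    """Number of disks required, rounding up."""
    return -(-total_bytes // capacity)


def plan_incremental_backup(root, since_epoch):
    """Return (changed files, compressed bytes, disks needed)."""
    changed = list(files_changed_since(root, since_epoch))
    total = sum(compressed_size(p) for p in changed)
    return changed, total, floppies_needed(total)
```

For example, calling plan_incremental_backup with a data directory and
the time of the last backup reports how many disks the changed files
would occupy.  If the answer is more than a few disks, network file
server storage is likely to be the better choice.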

EQUIPMENT MOBILITY
On occasion, particularly in cases of server failure, the only way to
maintain  some  services  which  are normally  provided  through  the
network   is  to  physically  move  equipment  from  workstation   to
workstation  as  the  need arises.  The obvious example  is  printing
services  in the LAN Manager environment.  Network printing  services
in this environment become unavailable if the local server fails, yet
all  printers installed as network printers can quickly be redeployed
as   workstation   printers,  providing  that  appropriate   physical
arrangements are made.

Network  printers are usually large and heavy, and  it  will  not  be
possible  to  move  them around unless they are  placed  on  suitable
trolleys.   Given  the  ability to bring the  printer  close  to  the
workstation,  electronic connection is easily  established.   For  an
effective  printing service, it is necessary that staff are  able  to
reconfigure  application and system software in  such  a  way  as  to
generate output suitable for the printer, directed to the appropriate
output port.

ATTITUDES, KNOWLEDGE AND SKILLS
Suitable  equipment  and  backup procedures  are  necessary  but  not
sufficient  conditions.  While the University's policy is to  provide
each  member  of staff with an appropriate workstation,  it  must  be
recognised that few staff use their machines continuously,  and  that
this  partial  utilisation provides much of the equipment  redundancy
needed  to  allow work to continue despite failure of some equipment.
Successful  exploitation of this redundancy depends on  the  attitude
that  equipment belongs to the University, not to individual  members
of staff.

Good  management of computing requires detailed contingency planning,
but  no-one  can  foresee  every  emergency.   Success  in   failsoft
therefore depends on users themselves having the knowledge and skills
to  diagnose a fault and devise a work-around to use while waiting for
the fault to be rectified.


8.4  DEVELOPING A FAILSOFT STRATEGY

Use   of   computers  differs  greatly  between  workgroups,  between
departments and between faculties.  Although some defensive  measures
are recommended throughout the University, there is no single off-the-
shelf  failsoft  strategy.  Aspects of organisational  behaviour  are
important:     unless    defensive   practices    are    successfully
institutionalised  they  are  likely to  fail.   Success  depends  on
adopting policies consistent with an appropriate and acceptable level
of  autonomy for the staff involved.  Detailed defensive strategy  is
best  developed  at  the  level  at which  it  will  be  implemented.
Although  the individual user can and should act, greater  efficiency
and   reliability  can  be  achieved  through  strategy  devised  and
implemented  at  work group, department or faculty level.    Assuming
that   administrative  controls,  security  measures   and   failsafe
practices are in place, one approach to developing failsoft practices
is set out here.

IDENTIFY THE SERVICES UPON WHICH USERS DEPEND
Not  all computer use depends on all of the services provided by  the
system.   An  efficient  strategy  will  not  allocate  resources  to
maintaining  or  recovering services which  are  not  required.   The
crucial  services  will  be those which are relied  on  for  mission-
critical work.

IDENTIFY THE SYSTEM WHICH PROVIDES THOSE SERVICES
Which workstations, fileservers, printers, modems and other equipment
are used?

IDENTIFY POTENTIAL COMPONENT FAILURE
Determine  which  components are subject to  failure,  and  for  each
possible failure, consider the likely time to repair or replace,  the
services which would be unavailable while waiting for repair, and the
services (usually databases) which would be permanently lost.

COST THE FAILURES
Ignore  those  failures with no long term effects and for  which  the
impact  of  temporary  loss of service is insignificant.   For  those
potential  failures  which remain, estimate the  incidence  of  these
failures per year, and the likely cost of the loss of service.  As  a
rule of thumb, it is wise to double the initial estimate.

ESTIMATE THE BENEFIT OF FAILSOFT
The goal of the strategy is to avoid loss of certain services despite
component failure.  Multiply the annual incidence of each failure  by
the  estimated  cost  of lost service to obtain an  estimate  of  the
annual benefit associated with failsoft.
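
The benefit estimate can be sketched as a short calculation.  In the
sketch below the doubling rule of thumb is applied to the cost of
lost service; that is one reading of the rule, and it could equally
be applied to the incidence.  The class name, function name and all
figures are hypothetical and serve only to illustrate the method.

```python
from dataclasses import dataclass

# Rule of thumb from COST THE FAILURES: double the initial estimate.
SAFETY_FACTOR = 2


@dataclass
class FailureEstimate:
    component: str
    incidents_per_year: float   # estimated incidence
    cost_per_incident: float    # initial estimate of cost of lost service


def annual_benefit(failure, safety_factor=SAFETY_FACTOR):
    """Annual benefit of failsoft = incidence x (adjusted) cost of loss."""
    return failure.incidents_per_year * failure.cost_per_incident * safety_factor


# Hypothetical figures for illustration only.
estimates = [
    FailureEstimate("hard disk", 0.5, 1000.0),
    FailureEstimate("network server", 2.0, 400.0),
]
for f in estimates:
    print(f.component, annual_benefit(f))
```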

DEVISE AND COST FAILSOFT POLICIES
For  each significant potential failure, devise one or more practices
which  will  allow  useful  work to  continue  despite  the  failure.
Estimate the annual cost of each practice.  If capital expenditure is
involved, a reasonable annualised figure for estimation purposes  may
be obtained by dividing capital cost by three.

CONSIDER OVERLAP
Failsoft  practices tend to overlap with each other and with existing
failsafe  policies.   It  will  not  be  necessary  or  efficient  to
implement  all possible failsoft practices, but neither would  it  be
wise  to  implement a bare minimum.  Choosing a set of practices  for
implementation  requires judgement.  There will be many  alternatives
involving different standards of emergency service, different  levels
of  risk,  different  protection against multiple component  failure,
different  costs, and different likelihood of user acceptance,  which
is crucial to success.

CONSIDER FEASIBILITY
Check  that  the  proposed policies are feasible  -  that  sufficient
resources  can be allocated.  Check also that the estimated  benefits
are  not greatly outweighed by the cost.  A small excess of cost over
benefit may be justified as a premium paid for risk reduction.
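
The cost side of the comparison can be sketched in the same way.  The
sketch below annualises capital cost by dividing by three, as
suggested under DEVISE AND COST FAILSOFT POLICIES, and treats a
practice as justified if its cost exceeds the benefit by no more than
a chosen risk premium.  The 20% premium, the function names and the
figures are assumptions made for illustration; the document does not
specify how large a "small excess" may be.

```python
def annualised_cost(capital_cost, recurring_cost_per_year=0.0,
                    write_off_years=3):
    """Annual cost of a practice: capital spread over three years
    plus any recurring yearly cost."""
    return capital_cost / write_off_years + recurring_cost_per_year


def is_justified(annual_benefit, annual_cost, risk_premium=0.2):
    """True if cost does not exceed benefit by more than the risk premium.

    Assumption: risk_premium is the largest acceptable excess of cost
    over benefit, expressed as a fraction of the benefit.
    """
    return annual_cost <= annual_benefit * (1 + risk_premium)


# Hypothetical example: a hot spare costing 3000 with 500 a year upkeep.
cost = annualised_cost(3000.0, 500.0)
print(cost, is_justified(1400.0, cost), is_justified(1000.0, cost))
```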


9.  IMPLEMENTATION, TESTING AND REVIEW

Defensive measures guard against failure of systems, but they tend to
fail  themselves  unless successfully institutionalised.   Consistent
practices  and  equipment are required among a number of  people  and
over time, and so ad-hoc implementation is unlikely to succeed.

PURCHASING
The  strategy  may call for some immediate purchases of equipment  or
software.   It  should  also  be considered  when  new  equipment  is
acquired.  If possible, new equipment should conform to the standards
established  by the strategy to provide compatibility and flexibility
in  allocation.   If that is not possible, then the  strategy  itself
will require adjustment to accommodate the new equipment.

COMMITMENT
The  organisational  culture  of  the  University  is  one  in  which
commitment  to  policy  is  rarely  achieved  by  decree.   Defensive
measures depend on procedures such as backup being performed  without
fail, despite pressure of other work or absence of key people.  Staff
commitment  is  necessary, and is most likely to be obtained  if  all
staff affected are consulted during development of the strategy.

DOCUMENTATION
Unless  defensive measures and planned responses to component failure
are  written down, Murphy's Law dictates that in the absence  of  key
people,  some crucial defensive measure will be omitted,  or   no-one
available  will  know  how  to  carry out  the  planned  response  to
equipment failure.  This will occur at a time of peak workload.

TESTING
An  emergency  is  not the right time to find out whether  a  certain
technique works properly, and nor is it the time for staff to develop
new  skills.   As part of the implementation process, the  techniques
adopted   should  be  practised.   If  the  strategy  guards  against
fileserver failure by providing for standalone workstation computing,
then  a  test  should  be scheduled to ensure that  work  can  indeed
continue  despite  the  failure.  All staff  who  are  likely  to  be
involved in moving themselves from one workstation to another  should
have  the  opportunity to make such a move under test conditions,  to
ensure that in an emergency they have the necessary experience.

REVIEW
Development  of  defensive strategy is influenced  by  factors  which
change:  technology, the nature of the work, available funds, network
capability  and existing equipment.  The strategy should be  reviewed
from  time  to  time,  and  adjusted to suit changing  circumstances.
After  an equipment failure is also a good time to review what worked
and what didn't work.


10.  SOURCES OF INFORMATION

HARDWARE AND SOFTWARE REFERENCE MANUALS

Software  reference manuals usually have a chapter entitled  "Getting
Started"  or  "Installation", in which the hardware requirements  for
the package are stated, and instructions are given for installing the
software on floppy disk or hard disk.  In many cases, installation on
a network disk is the same as installation on local hard disk, but in
other  cases  there will be special instructions for installation  on
network  disk.  This information will assist in setting  up  software
for use from all three storage media as the need arises.

In the case of application programs and system software which have to
be  configured for particular printers, the installation  section  of
the  reference manual usually gives instructions for  doing  so.   At
installation  time,  it is best to install printer  drivers  for  all
printers  that  are likely to be used at any time, rather  than  just
those printers which will be used under normal conditions.

The  MS-DOS  and  Windows  user's reference manuals contain invaluable
information  on the data backup commands.  Other MS-DOS  and  Windows
capabilities useful in failsoft include DISKCOPY, DISKCOMP and XCOPY,
and MODE and SET, which may be necessary to set up a local printer.

Good  printer  manuals provide extensive information on the  hardware
and  software configuration required to use the printer, and  on  the
precise  detail of the cabling required.  Many printers and computers
offer  more  than  one  means of connection:  a  typical  Mac  offers
LocalTalk  and  serial communications, and some Macs offer  Ethernet.
Most  IBM  compatible  PC  machines have  both  serial  and  parallel
communication ports.  Study of the computer and printer manuals  will
indicate  the  range  of possibilities, which  can  be  exploited  to
provide multiple paths from computer to printer.

INTERNET NEWS
The  Internet  News system is a world-wide broadcast electronic  mail
system  carried  in Australia by AARNet.   Useful
information  on  Macintosh, MS-DOS and Windows shareware  and  public
domain software, some of which is useful in failsoft strategy, can be
found in these newsgroups:

  - aus.archives
  - aus.computers.ibm-pc
  - comp.binaries.ibm.pc
  - comp.binaries.ibm.pc.archives
  - comp.binaries.ibm.pc.d
  - comp.binaries.mac
  - comp.sys.mac.digest

There are many other groups which may be useful too.  Through AARNet,
it  is  possible  to obtain most of the software discussed  in  these
newsgroups, using FTP.

UNIVERSITY ONLINE SERVICES
The University maintains local public access collections of technical
information, equipment and software price-lists and public domain and
shareware  software for Macintosh, MS-DOS and Windows  systems.   The
majority of this information is available from the LITSS Home page.


THE LOCAL IT SUPPORT CONTACTS MAILING LIST
The  IT  Support  Forum uses this electronic mail  mailing  list  for
discussion  of  IT-related matters.  Electronic mail  sent  to  site-
contacts is forwarded to all subscribers.  It is IT Services practice
to publish announcements of new facilities and services, meetings and
seminars through the mailing list as well as through other channels.

APPENDIX 1.  PASSWORD SELECTION GUIDELINES

The  following  guidelines should be considered when  selecting  your
password:
-  Don't   use   your  login  name  in  any  form  (as-is,  reversed,
   capitalised, doubled, etc.).
-  Don't use your first or last name in any form.
-  Don't use your spouse's or child's name.
-  Don't  use  other  information easily obtained  about  you.   This
   includes  license plate numbers, telephone numbers,  the brand  of
   your car, the name of the street you live on, etc.
-  Don't use a password of all digits, or all the same letter.   This
   significantly decreases the search time for a cracker.
-  Don't  use  a  word  contained  in  English  or  foreign  language
   dictionaries, spelling lists, or other lists of words.
-  Don't use a password shorter than eight characters.
-  Do use a password with mixed-case letters.
-  Do  use a password with nonalphabetic characters, e.g., digits  or
   punctuation.
-  Do  use a password that is easy to remember, so you don't have  to
   write it down.
-  Do  use  a  password that you can type quickly, without having  to
   look  at the keyboard.  This makes it harder for someone to  steal
   your password by watching over your shoulder.
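
Several of the guidelines above can be checked mechanically.  The
following is a minimal sketch, not a complete password checker: the
function name, its parameters and the wording of the messages are
hypothetical, and the dictionary check only covers whatever word list
the caller supplies.  Guidelines about memorability and typing speed
cannot be tested in this way.

```python
def check_password(password, login_name, personal_words=(), dictionary=()):
    """Return a list of guideline violations (empty if none found)."""
    problems = []
    lower = password.lower()
    login = login_name.lower()

    # Login name as-is, reversed, capitalised or doubled.
    if login and (login in lower or login[::-1] in lower):
        problems.append("contains login name")
    # Names and other easily obtained information supplied by the caller.
    for word in personal_words:
        if word and word.lower() in lower:
            problems.append("contains personal information: " + word)
    if password.isdigit():
        problems.append("all digits")
    if len(set(password)) == 1:
        problems.append("all the same character")
    if lower in {w.lower() for w in dictionary}:
        problems.append("dictionary word")
    if len(password) < 8:
        problems.append("shorter than eight characters")
    if not (any(c.islower() for c in password)
            and any(c.isupper() for c in password)):
        problems.append("not mixed-case")
    if all(c.isalpha() for c in password):
        problems.append("no nonalphabetic characters")
    return problems


print(check_password("jsmith", "jsmith"))
print(check_password("IXdKKasp;d", "jsmith"))
```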

Although  this  list may seem to restrict passwords  to  an  extreme,
there  are  several  methods  for choosing  secure,  easy-to-remember
passwords  that  obey  the above rules.  Some of  these  include  the
following:

-  Choose  a  line  or  two from a song or poem, and  use  the  first
   letter  of  each word. For example, "In Xanadu did  Kubla  Khan  a
   stately pleasure dome decree" becomes IXdKKaspdd.
-  Alternate  between  one consonant and one or  two  vowels,  up  to
   eight  characters. This provides nonsense words that  are  usually
   pronounceable,  and  thus  easily  remembered.  Examples   include
   routboot, quadpops, and so on.
-  Choose  two  short  words and concatenate  them  together  with  a
   punctuation  character  between  them.  For   example:   dog;rain,
   book+egg, or kid?goat.
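
The three methods above can be sketched in code.  This is an
illustration only: the function names are hypothetical, the word list
passed to two_word_password is supplied by the caller, and the use of
Python's secrets module for random choices is a design choice made
here rather than something this document prescribes.

```python
import secrets
import string

CONSONANTS = "bcdfghjklmnpqrstvwxyz"
VOWELS = "aeiou"


def acronym_password(phrase):
    """First letter of each word in the phrase, keeping its case."""
    return "".join(word[0] for word in phrase.split())


def pronounceable_password(length=8):
    """Alternate one consonant with one or two vowels, cut to length."""
    chars = []
    while len(chars) < length:
        chars.append(secrets.choice(CONSONANTS))
        for _ in range(secrets.choice((1, 2))):
            chars.append(secrets.choice(VOWELS))
    return "".join(chars)[:length]


def two_word_password(words, separators=string.punctuation):
    """Two words from the caller's list joined by a punctuation character."""
    first = secrets.choice(words)
    second = secrets.choice(words)
    return first + secrets.choice(separators) + second


print(acronym_password(
    "In Xanadu did Kubla Khan a stately pleasure dome decree"))
print(pronounceable_password())
print(two_word_password(["dog", "rain", "book", "egg", "kid", "goat"]))
```

The first call reproduces the IXdKKaspdd example given above; the
other two produce a different result on each run.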


The information on this page was updated on Tue, 15 Feb 2000. The page has been authorised by the Director, Information Infrastructure Services as relevant officer.
© 2000 The Australian National University