The Australian National University
INFORMATION TECHNOLOGY SERVICES & PLANNING UNIT
Defensive Computing Strategies
for Desktop Computers and Local
Area Networks
6 February 1995
(amended 15 February 2000)
ACKNOWLEDGMENT
The Australian National University would like to express
its appreciation to the University of Melbourne which has
given its consent to modify its document entitled
"Technical Note 107: Defensive Computing Strategies for
Desktop Computers and Local Area Networks" for use in The
Australian National University.
Copyright Information Technology Services
The University of Melbourne, 1992
Copyright Information Technology Services
The Australian National University, 1995
Contents:
1. Introduction and Summary
The Role of Information Technology
Sharing of Responsibility
Defensive Measures
2. Summary of Policy and Procedures
Administrative Controls
Security Measures - Servers
Security Measures - Workstations
Backup, Archiving and Disaster Recovery Planning
3. Computing Without Defences
4. Minimising Costs with Defensive Computing
5. Organisational and Administrative Control
6. Security
Servers
Workstations
7. Failsafe - Backup, Archiving and Disaster Recovery
Backup
Backup Techniques
Archiving
8. Failsoft Strategies
Failsoft - A Definition
Failsoft Techniques
Conditions for Effective Failsoft
Redundancy of Equipment and Data Storage
People and Data Mobility
Equipment Mobility
Attitudes, Knowledge and Skills
Developing a Failsoft Strategy
9. Implementation, Testing and Review
10. Sources of Information
Appendix 1: Password Selection Guidelines
1. INTRODUCTION AND SUMMARY
The University has a very large investment in Information Technology.
The IT Directions Statement ("Information Directions Study:
Directions Statement," Australian National University, June 1993)
estimated that the University has a current investment in IT hardware
and software of the order of $50m. The Statement also identified
recurrent expenditure in excess of $24m per annum for equipment and
software purchased by the University and for IT classified staff
throughout the University. An increasing proportion of academic and
administrative work is done using desktop computers. Much of the
University's administrative and academic information is stored and
transmitted electronically. This information is a valuable asset -
the University could not function effectively without it, and the
replacement cost of the information (if replacement were possible)
would easily exceed the replacement cost of the computing equipment.
This paper is concerned with the risks associated with dependence on
IT and ways in which the University can minimise these risks and the
associated costs. The focus is on effectively and efficiently
meeting the University's responsibility for data stewardship in a
distributed computing environment.
It is important, particularly in a climate of prominent and credible
demands for accountability and efficiency in the use of public funds,
that the University make good use of its IT investment. While IT has
many benefits, in both efficiency and quality, the technology has to
be managed properly. The benefits depend on reliable and timely
access to data and systems. Inherent in all computer systems are a
number of risks, which must be understood and prepared for with
defensive computing strategies.
The IT risks we face are -
- Loss of control of access to information.
- Loss of information.
- Loss of staff productivity in the event of system failure.
- Loss or damage to computing equipment.
- Loss of use of information on departure of key personnel.
- Legal and ethical risks related to Intellectual Property and
Privacy.
- Theft of information and equipment.
- Attack by computer viruses.
- Unauthorised tampering with University data.
- Ineffective information sharing.
- Under-utilisation of computing resources.
The risks are not new. They apply to traditional information
handling systems. In the days of filing cabinets and photocopiers,
there was always a risk of information being copied or removed, and
of files being lost or stolen. The introduction of photocopiers
immediately increased the risk of breach of copyright. The
proliferation of IT changes these risks, but does not create them.
THE ROLE OF INFORMATION TECHNOLOGY
The University's Information Technology Directions Statement
identified:
To maintain its leading position among national and
international academic institutions and to fulfil its Statement
of Objectives, the University will need to plan for, manage and
use effectively the whole spectrum of IT.
The objectives of the University in publishing this note are to
ensure that computer systems and facilities throughout the University
- are secure from unauthorised access to or modification of
University data;
- are secure from unrecoverable loss of data as the result of
error or equipment failure;
- can be restored in a timely manner in the event of failure;
- meet University needs for timely information;
- minimise opportunities for theft;
- survive the departure of key personnel;
- do not infringe Intellectual Property or Privacy rights;
- facilitate legitimate access to and sharing of information; and
- use University resources efficiently.
SHARING OF RESPONSIBILITY
Fifteen years ago, most computerised information in the University
was physically stored in the Computer Services Centre (CSC). The CSC
managed the storage facilities in such a way that if a user
inadvertently lost information, it could usually be recovered from
backup copies. Physical security was the clear responsibility of the
CSC, as was the control of access to information on the central
computer systems. Licensing and installation of system and
application software was also managed by the CSC. The CSC provided
in essence a bureau service, taking full responsibility for the
proper management of the majority of the University's computing
resources.
During the last fifteen years the number of computers on campus has
increased a hundredfold. The bulk of this increase has been due to
the deployment of desktop computers and small servers throughout the
University. Most computerised information is now stored on
workstations or file servers physically present in and managed by
departments, faculties and schools. Administrative data entry and
processing are performed by all departments, using their own systems
as well as the University's central administrative systems operated
by the Management Information Services (MIS).
In consequence, the sharing of responsibility for IT has also
changed. While data stewardship remains crucial to the University's
successful use of IT, this stewardship must now be provided by
various University services (such as MIS) and all other departments
working in partnership. In a distributed computing environment,
responsibility for day to day management of computer systems must
reside at the local level. Major parts of effective defensive
computing strategies are unavoidably the responsibility of the
department controlling the computer. IT User Support and MIS are
resourced to provide advisory and training services, and, in cases
where economies of scale are significant, or devolution cannot
reasonably be achieved, IT Services provide centrally funded and
managed resources which complement those in departments.
In support of departmental systems, IT Services, principally MIS,
are chartered to provide:-
- Guidelines on security, backup and archiving, failsafe and
failsoft mechanisms, virus management and compliance with the
law.
- Public Access volumes providing public domain or site licensed
software for, amongst other purposes, virus protection, file
transfer, file compression, backup and archiving, word
processing and other desktop applications.
- Training in the use and management of Macintosh and IBM
compatible PC workstations, and in local area network
administration.
- The IT Support Forum and mailing list (Local IT support-
contacts) for discussion amongst staff involved in the
management or operation of IT Systems.
- Recommendations on standard software and hardware, and advice on
configuration.
- Licensing and maintenance agreements.
- Access to administrative systems and their data.
DEFENSIVE MEASURES
SECURITY measures guard against theft of equipment and data,
unauthorised use of equipment and systems, unauthorised copying or
modification of University data, and infection by computer viruses.
Security measures are a response to risks which arise from the
possibility of illegal behaviour against the University's property.
Physical security measures are imperative - while the University
carries insurance that covers the theft of computer equipment, a
significant loss of productivity and information may result from
theft. (The University's insurance policy provides cover for the loss
or damage of all IT equipment owned by the University or for which
the University has assumed responsibility. Inquiries regarding this
insurance policy should be directed to the Chancelry Services.
Financial and Business Services ext.4257)
ADMINISTRATIVE measures control and monitor the use and management of
computer systems, with the aim of minimising security risks, ensuring
that the University complies with software copyright and privacy law,
and encouraging defensive practices.
The University tries to minimise the incidence and duration of
computer downtime. Even if security and administrative measures were
completely successful in eliminating loss through theft, equipment
failure would still occur because of accidental damage, old age or
design or manufacturing fault. PREVENTATIVE MAINTENANCE repairs or
replaces faulty equipment before actual failure, thus minimising the
consequential costs of failure.
When computing equipment fails, work stops, or worse still it is
lost. FAILSAFE measures minimise long term loss of data by providing
reliable recovery of lost data.
FAILSOFT is the graceful degradation of service as one or more
components of a system fails. The degradation may include slower
service or less efficient and convenient ways of getting work done.
A failsoft strategy succeeds if the work of a department can continue
despite failure of parts of the departmental computing system.
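As an illustration only, graceful degradation can be sketched in a few
lines of Python. The function name, the directory arguments and the
file layout are hypothetical and do not describe any University
system. The sketch prefers the copy on a primary store (for example a
fileserver volume) and, if that store is unavailable, degrades to a
local copy, reporting which one was used so that the user knows the
data may be older.

```python
from pathlib import Path


def read_with_fallback(name, primary_dir, fallback_dir):
    """Read a file from the primary store, degrading to a local copy.

    Returns (text, source) so the caller knows whether it is working
    from the possibly older fallback copy.
    """
    stores = (("primary", primary_dir), ("fallback", fallback_dir))
    for source, directory in stores:
        try:
            return (Path(directory) / name).read_text(), source
        except OSError:
            # This store is unavailable or lacks the file; try the next.
            continue
    raise FileNotFoundError(f"{name} is not available from any store")
```

A caller that receives "fallback" as the source can carry on working,
albeit with less current data, which is the essence of failsoft.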
DISASTER RECOVERY PLANS encompass the actions to take place if there
is substantial loss of a computing facility, through fire or flood
for example. Such plans may envisage substantial delay in
replacement of equipment and provide for temporary measures such as
manual systems or borrowed equipment to be used in the interim.
EMERGENCY PROCEDURES are often expensive and provide no guarantee of
success, but offer the hope of recovery of data despite the absence
of failsafe. These procedures rely on a high level of expertise and
are usually not planned in advance. They should be used only as a
last resort.
None of these measures, in isolation, provides adequate control over
IT risks. Departments, and individual computer users, should adopt a
combination of complementary defensive strategies. The blend chosen
will depend on the availability of equipment, software and expertise,
risk assessment, the costs associated with potential losses and the
cost of the defensive strategies.
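The risk assessment part of this choice can be made concrete with
simple arithmetic. The following Python sketch uses the figures quoted
in Section 4 of this note (disk MTBF of 50,000 hours, 6000
workstations each running about 1600 hours per year, and file servers
running 24 hours per day). The server count of 100 is a lower bound
taken from "more than 100", and the constant failure rate model and
the function name are assumptions made for illustration.

```python
HOURS_PER_YEAR = 24 * 365  # a machine that is never switched off


def expected_failures_per_year(machines, hours_per_machine, mtbf_hours):
    """Total operating hours per year divided by MTBF.

    Assumes a constant failure rate, which is a simplification.
    """
    return machines * hours_per_machine / mtbf_hours


# Figures from Section 4; 100 servers is a lower bound.
workstation_failures = expected_failures_per_year(6000, 1600, 50_000)
server_failures = expected_failures_per_year(100, HOURS_PER_YEAR, 50_000)

print(workstation_failures)       # 192.0
print(round(server_failures, 1))  # 17.5
```

A department can substitute its own machine counts and running hours
to estimate how often it should expect to need its backups.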
Complementary defensive strategies are compared in Table 1.
Table 1: Complementary Defensive Computing Strategies
| Security Measures | Physical and software measures to prevent unauthorised removal or use of computers. These measures provide protection against malicious damage but not accidental loss. |
| Administrative Controls | Assigning responsibility for security and other measures, institutionalisation of these measures, approval processes for decisions with widespread effects. |
| Preventative Maintenance | Replacement, refurbishment and inspection of equipment before failure, to minimise unexpected downtime. |
| Failsafe | Precautions taken against equipment failure, maintaining an ability to recover and continue work once the faulty equipment has been repaired or replaced. |
| Failsoft | Precautions taken against equipment failure with the goal of maintaining an ability to get work done despite failure of components in the system. |
| Disaster Recovery Planning | A planned sequence of assigned actions to be carried out if there is disastrous loss of a computer facility (for example, through fire or flood causing extensive and irreparable damage to equipment). |
| Emergency Procedures | Risky or expensive procedures, usually not planned in advance, which may provide recovery despite the absence of adequate failsafe measures. These procedures are adopted as a last resort. |
Table 2 classifies defensive practices from several fields in an
attempt to illustrate the concepts.
Table 2: Classification of Defensive Practices in Using Systems
| System | Defensive Practice | Benefit | Type of Practice |
| Motor Car | Steering Lock | Reduced likelihood of theft | Security |
| Snow Skis | Safety Bindings | Ski is released from the boot before the strain is sufficient to break ankle or leg | Failsafe |
| Aircraft | Multi-Engine Design | If not all engines fail, the aircraft flies with reduced performance, and lands safely | Failsoft |
| Motor Car | Brake overhaul | Brakes unlikely to fail while in use | Preventative Maintenance |
| Motor Car | Compulsory motor vehicle inspections | Strong encouragement to those who control registered vehicles to maintain them in a safe condition. | Administrative Control |
| City | Airlift of emergency food, shelter, hospitals, bulldozers and building materials following natural disaster | Minimise loss of life, avoid evacuation, and minimise delay in reconstruction | Disaster Recovery |
| Experimental Aircraft | Parachute | Safe descent of pilot after failure of aircraft in flight | Emergency Procedure |
2. SUMMARY OF POLICY AND PROCEDURES
ADMINISTRATIVE CONTROLS
1. LAN administrators should be trained in the principles and
   operation of networking systems.
2. The LAN administrator should record information such as
   department name, physical location of servers and brief
   descriptions of the hardware and software used on the network.
3. LAN administrators are required to maintain documented records of
   who has access to the LAN and what level of access is provided.
4. A LAN map should be maintained and an up to date copy provided to
   the Head, IT Services Network Group.
5. LAN names and numbering systems must be registered with the Head,
   IT Services Network Group.
6. Only network versions of software and software for which the
   University has a site licence may be loaded onto a file server.
7. No software may be run on any workstation unless permitted by the
   owner of the copyright on the software.
8. The University may conduct audits of departmental computer systems
   to measure the degree of compliance with these guidelines.
SECURITY MEASURES - SERVERS
1. Servers should be located in a lockable room, not in a corridor or
   office. Ideally, the room should be separate from the workstations
   which access the server. Access to the room should be controlled
   by limited issue of keys to system administrators and their
   assistants who need physical access to the server.
2. The room housing the server should be kept locked outside normal
   working hours.
3. Ventilation should be adequate (if the room is comfortable for
   people then ventilation is probably adequate for the server) and
   the temperature should be kept well within the operating
   specifications of the server. Temperatures below 10°C or above
   28°C lead to unreliability and expensive damage.
4. Where there is provision to do so, cables should be fastened in
   their sockets with screws.
5. Servers should be physically attached to an immovable anchor
   point.
6. Servers should be physically marked as property of the University.
7. A detailed and accurate record should be maintained in the
   departmental asset register of the server and all the options with
   which it is equipped, and all licensed software installed on it.
8. Network and operating system floppy disks should be kept locked in
   a separate room from the server.
9. Passwords should be required for all user access. If, for some
   reason, it is deemed necessary to allow "guest" or "anonymous"
   access, disk access should be limited to read-only access to those
   files that are needed by such users.
10. Administrators must determine the level of access available to
    individual users on the basis of need. In this connection, it
    should be noted that under Macintosh System 7, all Macintoshes
    can act as limited file servers. If file sharing is enabled, the
    default settings give "Guests" full read, write and delete
    permission on the computer's hard disk. Similarly, users of
    Windows for Workgroups should be aware of the capability and
    risks associated with the sharing facilities provided by this
    product.
11. Passwords assigned by the system administrator should be changed
    as soon as possible by the user.
12. Users should choose passwords that are difficult to guess.
    Passwords should be at least eight characters long. In
    particular, passwords such as "secret", or the user's name or
    initials should be avoided. Passwords that are normal words
    compromise security in that brute force methods such as trying
    every word from a dictionary are effective in discovering them.
    These brute force methods are easily carried out by a computer
    program. Appendix 1 lists password selection guidelines for
    users.
13. User passwords should be changed from time to time. If the
    operating system allows it, users should be obliged to change
    their passwords at regular intervals. The frequency with which
    password changes are required should not be so high as to cause
    users to make insecure paper records of their passwords.
14. User passwords should not be written down, or stored
    electronically in documents. If it is absolutely necessary to
    keep a written record of the password, it should be on paper
    rather than computer, and the paper should be kept in a secure
    place. A Post-It note attached to the user's workstation is not
    acceptable!
15. Users are held responsible for illegal access gained by use of
    the password. Users must be advised of this responsibility at the
    time the username is issued.
16. Super-user privileges such as the ability to create authorised
    users should be allocated strictly on the basis of need. There
    should, however, be at least two people with these privileges for
    each server.
17. User accounts should be immediately disabled or removed from the
    system in the event that the user leaves the University or for
    some other reason no longer has a need for access.
18. If the operating system allows recording of user logins and login
    attempts, this feature should be used, and the logs should be
    checked daily by the system administrator. Users should be
    advised that this is the case.
SECURITY MEASURES - WORKSTATIONS
1. Where there is provision to do so, cables should be fastened in
   their sockets with screws.
2. Workstations should be physically attached to an immovable anchor
   point.
3. Workstations should be physically marked as property of the
   University.
4. Workstations that have a key lock facility should be locked
   outside normal working hours. Unless for some reason the
   workstation needs to be kept on out of hours, it should be powered
   off overnight.
5. A detailed and accurate record should be maintained in the
   departmental asset register of the workstation and all the options
   and software with which it is equipped. The location of the
   equipment should be recorded also.
6. Staff should be regularly reminded that equipment which is the
   property of the University may not be removed without the written
   permission of the Head of Department.
7. At least once a year, an audit should be conducted to ensure that
   all computing equipment in a department's asset register is
   configured and located as stated in the register. Portable
   equipment such as laptop and notebook computers should be checked
   much more frequently.
8. All workstations should be protected from computer viruses.
   Suitable virus protection software for Macintosh and IBM
   compatible PC computers is available at no charge from public
   access file volumes maintained by IT User Support. The licensing
   of this software allows staff and students who use privately owned
   computers in their University work to run the software on private
   machines, and this practice is encouraged.
BACKUP, ARCHIVING AND DISASTER RECOVERY PLANNING
1. Backup arrangements for each workstation and server should be
   determined by departments with due regard to the costs and
   benefits. The procedures should be documented, as should the file
   recovery procedures.
2. At least two backups of each machine should exist at any one time,
   with one set being held in the department and the second set being
   stored offsite.
3. Backup procedures should be tested - it is not uncommon for
   procedures to be found inadequate only when it is necessary to
   restore a lost file.
4. For servers, a Disaster Recovery Plan should be initiated and
   documented by the local administrator. The plan should include
   matters such as who is to be contacted if there is a disaster,
   arrangements for reinstating services and provisions for reverting
   to manual procedures during system downtime.
5. Disaster Recovery Plans should be tested annually.
6. Disaster Recovery Plans should be lodged with the Director, IT
   Services.
3. COMPUTING WITHOUT DEFENCES
Information is often the most valuable computing asset in an office.
The effectiveness of staff who use computers depends on constant
access to the information. Most cases of hard failure - temporary
complete loss of service - cause loss of access to information.
Some workstations in the University are not secure or failsafe, and
theft, damage or malfunction would lead to irrevocable loss of
information. Most departments recognise the value of information and
have failsafe procedures to preserve it, but they are not failsoft.
They run systems in which the failure of one or more components is
likely to lead to complete loss of access to information until the
fault is rectified.
The risks are not insignificant - the University loses information,
and staff time, when computer systems fail. Two incidents will serve
as illustration. These incidents occurred in departments with
well-run computer systems. Names are fictitious; the incidents were
real.
JIM SMITH
Jim Smith is an academic who uses his IBM compatible PC for
preparation of exams, preparing research papers and articles, general
correspondence, electronic mail, and research on Information
Technology. His department has a cautious approach to computer
viruses; it avoids software from dubious sources and runs virus
checks as a matter of course on all incoming disks. Student
workstations and disk storage are quarantined. Its anti-virus policy
is intended to prevent infection in the first place, and detect
infection early. Disinfection had not been closely considered.
Despite the precautions, a virus carried on a virus-detection floppy
disk infected Jim's PC. Eradication required reformatting the hard
disk, with the loss of all data. He had archival copies of his
software, and most of his data was backed up to floppy disks in
various places. There was no comprehensive backup on either floppy
disk or the network fileserver, or to off-site archive, and no record
of the content and location of the various archives and backups.
Recovery was slow and painful, but most of the data was recovered
after reformatting using emergency procedures based on disk utility
software.
The painstaking process required resources not usually available in
the department, and Jim had no access to his data or use of his
computer for several days, during which he was unable to work
effectively.
FACULTY OF ACOUSTIC ENGINEERING
The faculty office makes extensive use of word processing,
mail-merging and databases. The workstations are networked IBM
compatible PC machines with one floppy disk drive and no hard disk.
All software and data is stored on the faculty fileserver.
In the middle of the year, an old fileserver was replaced with a new,
more powerful machine. The machine failed some time later, with disk
errors. The failures continued despite several replacements of
components including the disk drive, power supply and motherboard. A
'hot spare' file server failed also. The problem is thought to be
with the local power supply; the newer and more powerful fileservers
are more demanding and sensitive to current fluctuations, and the
server runs reliably with an Uninterruptible Power Supply.
The process of successive failure and repair took several months,
during which the office endured a number of unscheduled incidents of
complete loss of access to data and software for several days at a
time. This was particularly serious towards the end of the Academic
Year, when the office was especially busy with exam results, Summer
School programs and student mailouts.
THE COSTS
These cases have two striking similarities. In each case, users were
dependent on their computers, and knew it. In each case, users were
cautious. There was no permanent loss of data. But in each case, one
or more people experienced considerable anxiety and frustration, and
many working hours were effectively lost. The cost of the two
incidents, in terms of lost working hours alone, is estimated at over
$6000. The full cost is probably much higher.
4. MINIMISING COSTS WITH DEFENSIVE COMPUTING
It would be best if IT systems never failed.
The University encourages purchase of reliable systems, but even so,
there will be failures. Hard disks with Mean Time Between Failure
(MTBF) of 50,000 hours are considered reliable by current standards.
In the case of machines such as fileservers, which are run 24 hours
per day, 50,000 operational hours elapse in less than six years. We
have more than 100 file servers, and so we could expect, on average,
more than 100 server disk failures every six years: more than one per
month. On average, a faculty running several file servers will
experience one or more server failures each year.
A workstation runs about 1600 hours per year. If disk MTBF is 50,000
hours and the University has 6000 workstations, then an estimate of
the incidence of workstation disk failure is
    1600 hours/machine/year x 6000 machines
    --------------------------------------- = 192 failures/year
             50,000 hours/failure
Although it has no theoretical foundation, there is an empirical law
of computing which states that most equipment failures occur during
times of peak workload, just before critical deadlines.
About one machine in thirty is likely to fail each year as a result
of hard disk failure alone. It is most unlikely that any faculty will
escape such failures in a given year.
The University seeks reliable equipment, and not much more can be
done to prevent these failures. The cost of repair and replacement is
unavoidable, but is largely absorbed through long-term warranties.
The annual cost of staff time lost due to temporary unavailability of
equipment and data is significant, as is the cost of permanent loss
of data. It is in minimising these costs that defensive computing is
important. On the estimates above, hard disk failure alone in
workstations and fileservers is likely to cost the University 400-500
staff days per year. Much of this cost can be avoided if defensive
strategies are in place.
Jim Smith needed a single, comprehensive backup at a known location.
This didn't exist, because he did not have a quick process for backup
and restore as required. As hard disks have grown from a typical 10MB
in 1985 to 100-200MB in 1994, backup to floppy disk has become
impractical: it is so time consuming that it is rarely done often
enough to provide real protection against loss of current data.
A policy of regular backup of local hard disk data to the fileserver
would have provided the required safety net. Such a backup takes 5-60
minutes, depending on factors such as volume of data and software
used. Restoration of data takes 5-10 minutes. Such a backup runs
unattended, and so it can run at lunch time or if necessary overnight
with minimal impact on other work. Had such a practice been in place,
Jim could have reformatted his disk, downloaded all his files from
the server, deleted the (now known) infected software, and run a new,
uninfected virus-checker to check that the problem had been solved.
At Acoustic Engineering, the fileserver was a single point of
failure. Fileserver data was regularly backed up to tape, and it was
possible to restore the data to another server, but the Faculty
workstations could not function effectively unless shielded from
backbone network traffic by the local server. The server provided the
only available printing facility. A suitable failsoft strategy might
include
- equipping workstations with hard disks and regularly copying shared
  fileserver data onto workstation disks, together with backup copies
  of the required software, or
- developing the skills and procedures required to get work done,
  albeit slowly, on floppy-only workstations without the network, and
- placing one or more printers on suitable trolleys, and ensuring
  that all staff with printing requirements are sufficiently skilled
  to move these printers between workstations as needed.
5. ORGANISATIONAL AND ADMINISTRATIVE CONTROLS
In most cases, individual workstations are networked.
While all workstations have access to network resources provided by
the University, most departments also provide some local area network
resources, such as fileservers and printers. In that case, the
department should appoint a Local Network Administrator, with
responsibility for the proper management of the departmental network
and liaison with ITS Network Services, and with authority appropriate
to those responsibilities.
Proper management of a Local Area Network will require setting of
standards and procedures for use of network facilities by individual
users, and close liaison with users to ensure that individual
workstations do not pose a threat to the integrity of the network.
The University has a responsibility under the law not to breach the
Intellectual Property rights of software authors, and all staff are
under written instructions not to do so in the course of their work.
The following guidelines apply:
1. LAN administrators should be trained in the principles and
   operation of networking systems.
2. The LAN administrator should record information such as
   department name, physical location of servers and brief
   descriptions of the hardware and software used on the network.
3. LAN administrators are required to maintain documented records of
   who has access to the LAN and what level of access is provided.
4. A LAN map should be maintained and an up to date copy provided to
   the Head, IT Services Network Group.
5. LAN names and numbering systems must be registered with the Head,
   IT Services Network Group.
6. Only network versions of software and software for which the
   University has a site licence may be loaded onto a file server.
7. No software may be run on any workstation unless permitted by the
   owner of the copyright on the software.
8. The University may conduct audits of departmental computer systems
   to measure the degree of compliance with these guidelines.
6. SECURITY
The aim of security measures is to ensure, so far as is possible,
that equipment and the information stored on it is safe from
accidental or malicious damage, from unauthorised interference and
from theft. A number of physical and administrative measures are
necessary to provide suitable security.
SERVERS
Servers provide important services to many users. These services
include fileserving, printserving and electronic mail as well as
remote login and file transfer. Servers not only provide important
services to local users: they are a part of the communication
infrastructure which provides important but hidden services such as
packet routing and protocol conversion.
Appropriate security measures are:
1. Servers should be located in a lockable room, not in a corridor or
   office. Ideally, the room should be separate from the workstations
   which access the server. Access to the room should be controlled
   by limited issue of keys to system administrators and their
   assistants who need physical access to the server.
2. The room housing the server should be kept locked outside normal
   working hours.
3. Ventilation should be adequate (if the room is comfortable for
   people then ventilation is probably adequate for the server) and
   the temperature should be kept well within the operating
   specifications of the server. Temperatures below 10°C or above
   28°C lead to unreliability and expensive damage.
4. Where there is provision to do so, cables should be fastened in
   their sockets with screws.
5. Servers should be physically attached to an immovable anchor
   point.
6. Servers should be physically marked as property of the University.
7. A detailed and accurate record should be maintained in the
   departmental asset register of the server and all the options with
   which it is equipped, and all licensed software installed on it.
8. Network and operating system floppy disks should be kept locked in
   a separate room from the server.
9. Passwords should be required for all user access.
If, for some reason, it is deemed necessary to allow "guest" or "anonymous" access, disk access should be limited to read-only access to those files which are needed by such users.
10. Administrators must determine the level of access available to individual users on the basis of need. In this connection, it should be noted that under Macintosh System 7, all Macintoshes can act as limited file servers. If file sharing is enabled, the default settings give "Guests" full read, write and delete permission on the computer's hard disk. Similarly, users of Windows for Workgroups should be aware of the capability and risks associated with the sharing facilities provided by this product.
11. Passwords assigned by the system administrator should be changed as soon as possible by the user.
12. Users should choose passwords which are difficult to guess. Passwords should be at least eight characters long. In particular, passwords such as "secret", or the user's name or initials should be avoided. Passwords which are normal words compromise security in that brute force methods such as trying every word from a dictionary are effective in discovering them. These brute force methods are easily carried out by a computer program. Appendix 1 lists password selection guidelines for users.
13. User passwords should be changed from time to time. If the operating system allows it, users should be obliged to change their passwords at regular intervals. The frequency with which password changes are required should not be so high as to cause users to make insecure paper records of their passwords.
14. User passwords should not be written down, or stored electronically in documents. If it is absolutely necessary to keep a written record of the password, it should be on paper rather than computer, and the paper should be kept in a secure place. A Post-It note attached to the user's workstation is not acceptable!
15. Users are held responsible for illegal access gained by use of their password.
Users must be advised of this responsibility at the time the username is issued.
16. Super-user privileges such as the ability to create authorised users should be allocated strictly on the basis of need. There should, however, be at least two people with these privileges for each server. Super-user passwords should also be stored in written form in a very secure place such as in a sealed envelope in the departmental safe or similar.
17. User accounts should be immediately disabled or removed from the system if the user leaves the University or for some other reason no longer has a need for access. Academic users when leaving the University may, subject to the approval of the head of their area, retain access to their account for a strictly limited time in order that they may transfer their files and data.
18. If the operating system allows recording of user logins and login attempts, this feature should be used, and the logs should be checked daily by the system administrator. Users should be advised that this is the case.
WORKSTATIONS
Workstations provide one user at a time with access to the workstation's own storage and processing, as well as network facilities. In the case of staff workstations, each machine is often allocated to a particular person, whereas other machines, especially those in computer laboratories, are usually shared between many people. Failure or loss of a workstation, while serious, has a lower cost than loss of a server. Similarly, although it is important to maintain the security and confidentiality of University data stored on workstations, more compromise is possible.
The following security measures are recommended.
1. Where there is provision to do so, cables should be fastened in their sockets with screws.
2. Workstations should be physically attached to an immovable anchor point.
3. Workstations should be physically marked as property of the University.
4. Workstations which have a key lock facility should be locked outside normal working hours. Unless for some reason the workstation needs to be kept on out of hours, it should be powered off overnight.
5. A detailed and accurate record should be maintained in the departmental asset register of the workstation and all the options and software with which it is equipped. The location of the equipment should be recorded also.
6. Staff should be regularly reminded that equipment which is the property of the University may not be removed without the written permission of the Head of Department.
7. At least once a year, an audit should be conducted to ensure that all computing equipment in a department's asset register is configured and located as stated in the register. Portable equipment such as laptop and notebook computers should be checked much more frequently.
8. All workstations should be protected from computer viruses. Suitable software for Macintosh and IBM compatible PC computers is available at no charge from public access file volumes maintained by IT User Support. The licensing of this software allows staff and students who use privately owned computers in their University work to run the software on private machines, and this practice is encouraged.
7. FAILSAFE - BACKUP, ARCHIVING AND DISASTER RECOVERY
Physical and access security precautions alone do not guarantee the integrity of data in computing systems. User and programmer errors, and hardware failure, can all lead to corruption or loss of data. With emergency procedures, it is sometimes possible to recover data which has apparently been lost, but this approach is both unreliable and expensive. The only reliable defence is a backup and archiving program.
BACKUP
Data backup is a common failsafe practice. It is the regular making of a copy of data and software, from which the files on a computer can be recovered.
The backup may be a full backup, which is a complete copy of all data stored on a computer, a partial backup, which is a copy of only some disks, directories or folders, or an incremental backup, which is a copy of those files or documents which have been changed since a previous backup. Partial and incremental backups are used to provide frequent backup of crucial data, at lower cost than full backup.
One practice is to make a monthly full backup and keep each one for two months. Partial or incremental backups are made more often and these are kept until the next full backup.
Backup copies of data are the only way to provide for reliable recovery of lost data. While backup is the crucial, and often the only, component of failsafe schemes, it is also crucial to providing mobility of people and data for failsoft strategies, and to efficient execution of any Disaster Recovery Plan.
BACKUP TECHNIQUES
The traditional backup technique is backup from local or server hard disk to floppy disk. Backup software enables storage of many files from the hard disk on a set of floppy disks, with files split as necessary between floppies and with the directory or folder structure preserved. This is a standard capability in MS-DOS and Windows, using the standard commands. Standard Macintosh system software does not provide such a facility, but backup software is available at extra cost.
As disk capacity has increased, floppy disk backup has become inadequate. Using high density disks, the typical user will require 40-60 floppy disks for a full backup. Each disk takes about a minute, and by its nature the process requires constant attention. Most users will either not make backups, or will find an alternative technique which runs unattended.
If sufficient fileserver capacity is available, a workable technique is to back up from local hard disk to fileserver. Alternatively, if the working data is stored on the fileserver, backup from fileserver to local hard disk is feasible.
Removable hard disks and tape drives are other alternatives, but they will not provide user mobility unless all workstations are equipped with these drives or the drives can be quickly and easily moved from one computer to another.
Archiving and data compression software, such as PKZIP (shareware) and STUFFIT (shareware) can run unattended and significantly reduce the amount of storage needed for backup.
A common failure of backup disciplines is that while backups are regularly made, no-one involved develops enough knowledge of the processes by which backup data can be restored. In establishing a backup scheme, data recovery processes should be tested, and all relevant staff should be given practical experience of the process.
Floppy disks are particularly prone to failure. Although the drives are just as reliable as other drives, the disks themselves are subject to much greater physical wear and tear. Floppies which are used for operational software or data should be backed up as a matter of course. One approach is to keep data and software on separate disks. For each software disk, a duplicate should be made once and stored in a safe place. On failure of the working software disk, the backup is put into service and a new backup is made. Data floppies should be duplicated at frequent intervals.
On Macintosh, floppy duplication is accomplished by dragging the source disk icon on top of the destination disk icon. On IBM PC compatible systems, the DISKCOPY command is used. Both Mac and IBM PC compatible systems can duplicate floppy disks using only one disk drive if necessary.
There is no central backup of individual server and workstation disk storage, and data stored on servers and workstations must be protected by local backup arrangements.
ARCHIVING
Some software and data files are used infrequently, or, at the completion of a project, are not expected to be required again. To conserve expensive primary disk storage, such applications and data should be removed.
Nevertheless, it is not desirable that the information be irrevocably lost for all time. For example, it is University policy ("Guidelines for the Responsible Practice of Research", ANU Paper 578/1993, June 1993) that research data be held for a period of at least five years. The appropriate action is to copy the information to cheaper, less accessible storage media for archival purposes.
Archiving is often confused with backup, as the same software and techniques are used. The difference between Backup and Archiving is that whereas the information in a backup may change frequently and exists also in primary storage, archived material does not change often and may be stored only on the secondary medium. Backup is performed frequently as part of operating routines, whereas archiving (and recovery from archival storage) is performed only when required.
Archiving is a useful technique for reducing the volume of information which is copied in day to day backup operations. On many workstations, much of the primary storage space is used to store application software, which does not often change. User data files, which do change often, often account for as little as 25-50% of disk space. If system and application software is archived, there is no need for it to be included in regular backups. In a department where a standard software environment is established, a single archive stored on a fileserver will meet the software recovery needs of all users, as well as facilitating rapid setup of new machines.
A second significant application of archiving is in management of computer laboratories in which a large number of users have access to a number of similar workstations. In such laboratories, there is a tendency for an initial standard software configuration to be rapidly changed, so that each workstation becomes different. Maintenance of such workstations is difficult and time-consuming, unless a standard archive is established at the outset.
Workstation management is then achieved by periodically reformatting the workstation disks and reloading the standard software configuration from the standard archive. This process can take as little as 15 minutes for a laboratory of 20-30 workstations, using an archive stored on a nearby fileserver.
Archives should be documented - their long-term nature means that the person who eventually has to restore the information to primary storage may well not be the person who created the archive.
The strengths and weaknesses of a number of backup techniques are summarised in Table 3. Successful and efficient backup arrangements rely on a combination of several techniques.
Table 3 - Backup and Archiving Techniques
| Technique | Software Used | Strengths | Weaknesses |
| Simple file copy to floppy disk | DOS: COPY, XCOPY. Windows: File Manager (drag and drop). Mac: the Finder (drag and drop). | Easy and quick for small numbers of small files. Relies only on standard desktop equipment and software. Inexpensive storage medium. | Directory/folder structure not always preserved; file sets larger than floppy disk capacity can't be backed up. |
| Copy to multi-disk floppy disk backup set | DOS: BACKUP/RESTORE (DOS 3, 4 & 5), MSBACKUP (DOS 6), XCOPY. Windows: mwbackup. Mac: third party software (eg. Retrospect). | Relies only on standard desktop equipment. Inexpensive storage medium. Backup & restore software is part of standard DOS. | Requires human intervention every minute or so; takes about 1 minute per megabyte. Requires additional software on Mac. |
| Copy to spare space on same hard disk drive | DOS: BACKUP/RESTORE (DOS 3, 4 & 5), MSBACKUP (DOS 6), XCOPY. Windows: mwbackup. Mac: the Finder (File Duplicate). | Fast, uses only standard equipment and software. | Requires a lot of spare disk space. |
| Copy to magnetic tape | DOS/Mac: software supplied with tape drive. | Fast. No user intervention required. | Comparatively expensive storage medium. Requires special equipment and software. Not all tape units have proved reliable. |
| Copy to hard disk on another machine (eg. server to workstation or workstation to server) | DOS: BACKUP/RESTORE (DOS 3, 4 & 5), MSBACKUP (DOS 6), XCOPY. Windows: mwbackup. Mac: the Finder, third party software (eg. Retrospect/remote). | No user intervention required. Extra protection if the copied-to disk is itself backed up. | Requires spare hard disk space. Speed depends on network connection. File restore capability depends on the network. |
| File compression, in conjunction with above methods | DOS: PKZIP. Mac: StuffIT, CompactPro. | Efficient use of storage space, faster data transmission and backup duplication. Software provides flexible partial and incremental backups. Compression runs unattended. | Overall compress and copy takes more time than straight copy. Licence fees are payable for compression software. |
Selection of backup procedures involves consideration of the
availability and cost of equipment, software and staff, the risk and
likely cost of data loss, and the frequency with which backup should
be performed.
The following practices are recommended:
1. Backup arrangements for each workstation and server should be
determined by departments with due regard to the costs and
benefits. The procedures should be documented, as should the
file recovery procedures.
2. At least two backups of each machine should exist at any one
time, with one set being held in the department and the second
set being stored offsite.
3. Backup procedures should be tested - it is not uncommon for
procedures to be found inadequate only when it is necessary to
restore a lost file.
4. For servers, a Disaster Recovery Plan should be initiated and
documented by the local administrator. The plan should include
matters such as who is to be contacted if there is a disaster,
arrangements for reinstating services and provisions for
reverting to manual procedures during system downtime.
5. Disaster Recovery Plans should be tested annually.
6. Disaster Recovery Plans should be lodged with the Director, IT
Services.
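As an illustration of the incremental backup described under BACKUP above (a copy of those files which have been changed since a previous backup), the following is a minimal sketch in Python. It is an example under stated assumptions rather than a recommended tool: the function name files_changed_since is hypothetical, and the file modification time is assumed to be a reliable indicator of change.

```python
import os


def files_changed_since(root, since_timestamp):
    """Return a sorted list of files under root modified after since_timestamp.

    since_timestamp is the time of the previous backup, in seconds since
    the epoch. Assumption: modification times are trustworthy.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only files touched after the previous backup need copying.
            if os.path.getmtime(path) > since_timestamp:
                changed.append(path)
    return sorted(changed)
```

The list returned would then be handed to whatever copy or compression step the department has chosen. A full backup is the special case in which since_timestamp is zero.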
8. FAILSOFT STRATEGIES
A failsoft strategy has the goal of enabling the department to
continue work comfortably, despite equipment failure. The goal is
achieved through policies of
- mobility of people and their work between workstations;
- mobility of equipment between people and between locations;
- consistent computing equipment and practices; and
- maintenance of equipment and skills above the minimum required
for effective and efficient work under ideal conditions.
Experience suggests that although appropriate security and failsafe
measures are in place in many departments, comparatively little
attention has been paid to failsoft strategies in either purchasing
decisions or operating procedures. While the other measures are
basic and essential, efficient management of departmental computing
requires that failsoft at least be considered. Failsoft is not
always cost-justified a priori, but its absence should be a decision
made considering relevant costs and risk factors.
8.1 FAILSOFT - A DEFINITION
Graceful degradation of service as one or more components of the
system fail.
SERVICE
System capabilities used to get useful work done. Examples include
document editing, printing, statistical data analysis, data storage,
database maintenance and interrogation, access to remote computers
and databases, and sending, receiving and storing electronic mail.
DEGRADATION
Degraded service may be slower, less convenient, less efficient or
less attractive than normal service levels. For example, while a
networked laser printer is unavailable, one might use a cheap dot-
matrix printer moved around the department to provide a degraded
printing service.
GRACEFUL
Graceful degradation implies that the degraded service is quickly
available, with little inconvenience or expense other than that
associated with the degradation itself.
SYSTEM
The computer system consists of workstations (PC, Mac, Unix box or
terminal), each usually located at the workplace of its principal
user, together with equipment elsewhere in the department, the
University and throughout the world through the network, and the
network infrastructure itself.
COMPONENTS
Cables, disk drives, disks, CPUs, RAM, printers, plotters,
fileservers, mailservers, network gateways, modems, nameservers and
any other items which the system requires to maintain normal
services.
FAILURE
A state where a component does not perform its normal function,
causing normal services to become at least temporarily unavailable.
Failsoft is not the same as disaster recovery, failsafe, or
preventative maintenance, but it complements all of those programs.
In many cases, some of the practices required by a failsoft strategy
will already be in place as part of failsafe or disaster recovery
procedures. In a department which has such programs in place, the
marginal cost of setting up a failsoft program can be very small.
8.2 FAILSOFT TECHNIQUES
In principle, failsoft techniques do not depend on the type of
workstation used. Similar techniques are available for Macintosh, PC
and Unix workstations. Details vary across systems, in consequence
of their relative strengths and weaknesses.
NETWORK-BASED COMPUTING
In some workgroups, it is convenient to keep the operational copies
of all software and data on a fileserver volume which is accessible
to all workstations in the work group. There can be cost advantages,
and network-based computing facilitates standardisation of the
workstation environment. In the event of failure of any component of
a workstation, work can continue immediately on another workstation,
provided one is available. LANManager, PC-NFS and AppleTalk networks
(except those holding sensitive administrative information) allow
login across the campus, so in the last resort the use of a public
workstation in the Leonard Huxley Building level 1 training area
allows work to continue.
There are disadvantages. In the event of failure of the network or
the fileserver, all workstations are effectively disabled. In most
settings, network based computing has noticeable performance
deficiencies compared to workstation-based computing.
WORKSTATION COMPUTING
A stand-alone workstation is one which is not dependent on the
network. While electronic mail and access to remote systems will be
network-dependent, much useful work can be done on a standalone
workstation. All data and software is stored on local hard disk, and
printing is provided by a printer connected directly to the
workstation.
Standalone systems are rarely as cost-effective as those with some
degree of reliance on the network. Where people share data but not
workstations, it will be better to keep working data on a fileserver
to which all members of the group have access. Network based
printing is favoured in most cases because of economies of scale.
Nevertheless, workstations which can function stand-alone are an
important part of a failsoft strategy, as the standalone capability
reduces dependence on the fileserver.
A well-balanced approach is to use standalone capable workstations in
a networked environment. The network provides printer sharing and
data mobility, whether through use of the network for backup or
through locating working data on the network, with backup to
workstation disks.
HOT SPARES AND REDUNDANT EQUIPMENT
Service provision despite component failure depends on either using
equipment in ways other than normal, or on using other equipment. In
some cases, failure of a component is best dealt with by on-the-spot
replacement with spare equipment. Where personal workstation
utilisation approaches 100%, it may be necessary to maintain a
complete spare workstation to ensure timely completion of mission-
critical work. In most cases, the trend within the University of one
workstation per desk provides sufficient redundancy, provided that
equipment can be shared when necessary and that user and data
mobility is arranged.
"JURY RIG" PRINTER SHARING
While networking is the ideal method for printer sharing, it is
important to retain a printing capability even when the network is
unavailable. There are two ways of achieving this: both depend on
an ability to connect a printer directly to a workstation.
A networked printer can be made portable by placing it on a trolley.
In the event of network failure, the printer can be wheeled around to
the various workstations and used as a local printer. Alternatively,
a lightweight, low quality dot matrix printer can be moved around as
necessary.
If frequent movement of the printer is infeasible, another approach
to network failure is to connect the printer to a single workstation
for the duration of the network failure. As staff require printing,
they can transfer their work to floppy disks and take it to this
printer workstation. If even this reconnection of the printer is
impossible, a "worst case" option is to take work to public computer
laboratories such as those in the central facility in the Leonard
Huxley Building, and print on public printers. In these cases, it is
important that the application software used on the normal
workstations can be made available on the printer workstation or is
available on public workstations. It is crucial that workstations be
equipped with compatible floppy disk drives.
ALTERNATIVE COMMUNICATION TECHNIQUES
Those who require remote login and file transfer services usually use
TCP/IP software (FTP and Telnet), which provides those services via
the local Ethernet or LocalTalk cabling. There are alternative
techniques in some cases. Kermit is a file transfer and remote login
protocol for serial communication lines, and Kermit programs are
available free for MS-DOS, Windows, Macintosh and Unix workstations
as well as many other systems. Serial communication is slower than
Ethernet or LocalTalk, but better than nothing. Serial connections
can be arranged in several ways.
In cases where connection is required between machines in a
department, an inexpensive RS-232 cable between serial ports can be
used to cover distances up to about 100 feet. For longer distances,
or when no cabling is possible, a dial-up modem can provide a login
and file transfer capability via the public telephone network. A
modem is required at both ends of the link: ITS Network Services
operate a dial-up modem service that allows connection (subject to
conditions of use restrictions on the dial-up modems and the computer
to be accessed) to any computer on campus connected to the Campus
Network.
In cases where the volume of data is large or the frequency of the
need for data transfer is low, it is often most effective to transfer
data from one machine to another via floppy disk. A 1 megabyte file
can be transferred from one machine to another via floppy disk in
about two minutes plus travelling time. At 9600bps, serial file
transfer of the same file will take about 20 minutes. In order to
use floppy disk transfer, workstations must be equipped with
compatible disk drives. Current Macintosh computers have floppy
drives which can read and write 1.44MB 3.5" IBM compatible PC floppy
disks.
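The serial transfer estimate above can be checked with a short calculation. The sketch below, in Python, assumes 10 bits on the line for each byte (8 data bits plus one start and one stop bit) and ignores protocol overhead, which is why the result falls a little under the 20 minutes quoted. The function name is illustrative only.

```python
def serial_transfer_minutes(size_bytes, line_speed_bps, bits_per_byte=10):
    """Estimate the minutes needed to send size_bytes over a serial line.

    bits_per_byte=10 is an assumption: 8 data bits plus start and stop
    bits. Kermit packet overhead and retries are not modelled.
    """
    seconds = size_bytes * bits_per_byte / line_speed_bps
    return seconds / 60


one_megabyte = 1024 * 1024
print(round(serial_transfer_minutes(one_megabyte, 9600), 1))  # prints 18.2
```

Against roughly two minutes plus travelling time for the floppy disk, the comparison favours the floppy whenever the walk between machines takes less than about a quarter of an hour.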
SUMMARY OF TECHNIQUES
Table 4 lists some components which are subject to failure, the
consequence of failure, ways of obtaining degraded service pending
repair, the conditions which apply to degraded service and the
failhard alternative.
Table 4: Failsoft Responses to Component Failure
| Failure | Consequence | Degraded Service | Conditions | Alternative |
| Hard disk | Error message from workstation, no disk access. If it is the system disk, startup fails. | Restore data and software to another workstation. Boot from disk and work floppy-only, using data and software from network. | Up-to-date backup must exist on a medium accessible from the replacement workstation. Boot disk available. | Loss of access until repaired; possible permanent loss of data. |
| Floppy Drive | Error message from workstation when drive is used | Read and write floppy disks on another machine, transporting data via network. Restrict local operations to floppy formats which remain available | Workstation not dependent on floppy disk drive for operation. Access to networked workstation with similar disk drives. | No access to floppy disk data until drive is repaired. |
| Monitor | Cabling correct, but image not present or unstable | Use another monitor or workstation | Access to another monitor or workstation, user mobility | No computing until monitor is repaired or replaced. |
| Power Supply | No action when switched on | Restore data and software to another workstation | Up to date backup, user mobility | No computing until power supply is repaired or replaced. |
| Cabling | Components securely and correctly cabled, but not working together | Borrow or share the required cable. | Spare cables, diagnostic skills and confidence such as required to install a new system. | No computing until faulty cable is replaced. |
| Printer | Printer does not print or does not print properly | Use another printer, through the network or by local connection | Mobility of local printers or arrangements to use alternative network printers and the necessary configuration skills | No printing until printer is repaired. |
| Network server | One or more file, print, mail and other services becomes unavailable. | Work standalone - make alternative arrangements for file transfer, printing, data and software storage, communication with other users. | Workstations which can function standalone; equipment and software for alternative file transfer; printer/user mobility. | Severe disruption of work for all users of the server until it is reinstated. |
| Network infrastructure | Local network up but isolated. | Most work undisrupted, use public workstations for remote file transfer. | Access to public workstations with compatible floppy disk drives. | Work which relies on campus-wide (or beyond) networking is delayed. |
| Keyboard/mouse | Keystrokes not received by computer. | Replace with "hot spare" or use another computer. | Hot spare, or access to shared machines and user mobility. | No computing until keyboard is replaced or repaired. |
8.3 CONDITIONS FOR EFFECTIVE FAILSOFT
The conditions for failsoft may be summarised as needs for
- redundant storage of data, and deliberate equipment redundancy (maintenance of equipment in excess of that required to provide services under ideal conditions);
- flexible allocation of equipment to individuals, achieved through maintenance of user and data mobility, equipment mobility and standardisation of hardware and software;
- arrangements with nearby departments to share resources to overcome temporary unavailability of normal services; and
- maintenance of the attitudes, knowledge and skills required to implement failsoft techniques.
These conditions may best be met within a policy framework of equipment "ownership" at department or workgroup level rather than that of individuals, and continuous management and review of contingency planning.
REDUNDANCY OF EQUIPMENT AND DATA STORAGE
Some failsoft techniques depend on maintaining services with reduced equipment levels, but most depend on pressing other equipment into abnormal service to cover failure of equipment normally used. This implies a need to equip at a level greater than that required for normal operation. A universal need, for failsafe as well as failsoft, is maintenance of more than one copy of working data.
In some cases, failsoft will require the acquisition of redundant equipment, such as hot spare units. However, the trend within the University of one workstation per desk provides for some redundancy. Although most University staff will use computers in much of their work, very few will make continuous use of the equipment allocated to their use. For occupational health reasons if nothing else, many staff will only make part-time use of their workstations. A consequence is that the University is not aiming for 100% utilisation of workstation equipment. A high rate of utilisation would be 75%.
This means that in a department with 12 workstation-equipped staff, there will be 12 staff workstations, but on average only 9 will be in use at any one time. This redundant equipment level is well justified by the advantages of providing staff with workstations which are available on demand. At times when equipment failure threatens the effectiveness of one or more members of staff, it also provides redundancy which can be exploited to meet the conditions for failsoft computing.
PEOPLE AND DATA MOBILITY
Rapid deployment of temporary replacement equipment to an individual requires that the user's computing environment can be quickly set up on another machine, and that the machine can be returned quickly to its normal use. It must be possible to physically move the user to the machine or to bring the machine to the user's normal workplace. Where it is possible to acquire floating or hot spare computers, serious consideration should be given to laptop or notebook computers.
In most cases, data mobility will not be achieved through regular floppy-disk backup: the process is too slow and requires constant user intervention. If floppy disks are the only storage medium available other than workstation hard disks, then great attention must be paid to efficiency. The simple backup of the entire hard disk content to floppy is not efficient. With the use of file compression and partial and incremental backup, it may sometimes be possible to bring the size of the task down to manageable proportions. In most cases, the use of network file server storage will be far more effective.
Data mobility techniques based on expensive special-purpose devices such as removable hard disks and tape cartridges should be avoided unless it is possible to equip all workstations with these devices.
EQUIPMENT MOBILITY
On occasion, particularly in cases of server failure, the only way to maintain some services which are normally provided through the network is to physically move equipment from workstation to workstation as the need arises. The obvious example is printing services in the LAN Manager environment. Network printing services in this environment become unavailable if the local server fails, yet all printers installed as network printers can quickly be redeployed as workstation printers, provided that appropriate physical arrangements are made.
Network printers are usually large and heavy, and it will not be possible to move them around unless they are placed on suitable trolleys. Given the ability to bring the printer close to the workstation, electronic connection is easily established. For an effective printing service, it is necessary that staff are able to reconfigure application and system software in such a way as to generate output suitable for the printer, directed to the appropriate output port.
ATTITUDES, KNOWLEDGE AND SKILLS
Suitable equipment and backup procedures are necessary but not sufficient conditions. While the University's policy is to provide each member of staff with an appropriate workstation, it must be recognised that few staff use their machines continuously, and that this partial utilisation provides much of the equipment redundancy needed to allow work to continue despite failure of some equipment. Successful exploitation of this redundancy depends on the attitude that equipment belongs to the University, not to individual members of staff.
Good management of computing requires detailed contingency planning, but no-one can foresee every emergency which takes place. Success in failsoft depends on users themselves having the knowledge and skills to diagnose a fault and devise a work-around to use while waiting for the fault to be rectified.
8.4 DEVELOPING A FAILSOFT STRATEGY
Use of computers differs greatly between workgroups, between departments and between faculties. Although some defensive measures are recommended throughout the University, there is no single off-the-shelf failsoft strategy. Aspects of organisational behaviour are important: unless defensive practices are successfully institutionalised they are likely to fail. Success depends on adopting policies consistent with an appropriate and acceptable level of autonomy for the staff involved.
Detailed defensive strategy is best developed at the level at which it will be implemented. Although the individual user can and should act, greater efficiency and reliability can be achieved through strategy devised and implemented at work group, department or faculty level.
Assuming that administrative controls, security measures and failsafe practices are in place, one approach to developing failsoft practices is set out here.
IDENTIFY THE SERVICES UPON WHICH USERS DEPEND
Not all computer use depends on all of the services provided by the system. An efficient strategy will not allocate resources to maintaining or recovering services which are not required. The crucial services will be those which are relied on for mission-critical work.
IDENTIFY THE SYSTEM WHICH PROVIDES THOSE SERVICES
Which workstations, fileservers, printers, modems and other equipment are used?
IDENTIFY POTENTIAL COMPONENT FAILURE
Determine which components are subject to failure, and for each possible failure, consider the likely time to repair or replace, the services which would be unavailable while waiting for repair, and the services (usually databases) which would be permanently lost.
COST THE FAILURES
Ignore those failures with no long term effects and for which the impact of temporary loss of service is insignificant. For those potential failures which remain, estimate the incidence of these failures per year, and the likely cost of the loss of service.
As a rule of thumb, it is wise to double the initial estimate.

ESTIMATE THE BENEFIT OF FAILSOFT

The goal of the strategy is to avoid loss of certain services despite component failure. Multiply the annual incidence of each failure by the estimated cost of lost service to obtain an estimate of the annual benefit associated with failsoft.

DEVISE AND COST FAILSOFT POLICIES

For each significant potential failure, devise one or more practices which will enable continuing useful work despite the failure. Estimate the annual cost of each practice. If capital expenditure is involved, a reasonable annualised figure for estimation purposes may be obtained by dividing capital cost by three.

CONSIDER OVERLAP

Failsoft practices tend to overlap with each other and with existing failsafe policies. It will not be necessary or efficient to implement all possible failsoft practices, but neither would it be wise to implement a bare minimum. Choosing a set of practices for implementation requires judgement. There will be many alternatives involving different standards of emergency service, different levels of risk, different protection against multiple component failure, different costs, and different likelihood of user acceptance, which is crucial to success.

CONSIDER FEASIBILITY

Check that the proposed policies are feasible - that sufficient resources can be allocated. Check also that the estimated benefits are not greatly outweighed by the cost. A small excess of cost over benefit may be justified as a premium paid for risk reduction.

9. IMPLEMENTATION, TESTING AND REVIEW

Defensive measures guard against failure of systems, but they tend to fail themselves unless successfully institutionalised. Consistent practices and equipment are required among a number of people and over time, and so ad-hoc implementation is unlikely to succeed.

PURCHASING

The strategy may call for some immediate purchases of equipment or software. It should also be considered when new equipment is acquired.
If possible, new equipment should conform to the standards established by the strategy to provide compatibility and flexibility in allocation. If that is not possible, then the strategy itself will require adjustment to accommodate the new equipment.

COMMITMENT

The organisational culture of the University is one in which commitment to policy is rarely achieved by decree. Defensive measures depend on procedures such as backup being performed without fail, despite pressure of other work or absence of key people. Staff commitment is necessary, and is most likely to be obtained if all staff affected are consulted during development of the strategy.

DOCUMENTATION

Unless defensive measures and planned responses to component failure are written down, Murphy's Law dictates that in the absence of key people, some crucial defensive measure will be omitted, or no one available will know how to carry out the planned response to equipment failure. This will occur at a time of peak workload.

TESTING

An emergency is not the right time to find out whether a certain technique works properly, nor is it the time for staff to develop new skills. As part of the implementation process, the techniques adopted should be practised. If the strategy guards against fileserver failure by providing for standalone workstation computing, then a test should be scheduled to ensure that work can indeed continue despite the failure. All staff who are likely to be involved in moving themselves from one workstation to another should have the opportunity to make such a move under test conditions, to ensure that in an emergency they have the necessary experience.

REVIEW

Development of defensive strategy is influenced by factors which change: technology, the nature of the work, available funds, network capability and existing equipment. The strategy should be reviewed from time to time, and adjusted to suit changing circumstances.
The period immediately after an equipment failure is also a good time to review what worked and what did not.

10. SOURCES OF INFORMATION

HARDWARE AND SOFTWARE REFERENCE MANUALS

Software reference manuals usually have a chapter entitled "Getting Started" or "Installation", in which the hardware requirements for the package are stated, and instructions are given for installing the software on floppy disk or hard disk. In many cases, installation on a network disk is the same as installation on local hard disk, but in other cases there will be special instructions for installation on network disk. This information will assist in setting up software for use from all three storage media as the need arises.

In the case of application programs and system software which have to be configured for particular printers, the installation section of the reference manual usually gives instructions for doing so. At installation time, it is best to install printer drivers for all printers that are likely to be used at any time, rather than just those printers which will be used under normal conditions.

The MS-DOS and Windows User's Reference manuals contain invaluable information on the data backup commands. Other MS-DOS and Windows capabilities useful in failsoft include DISKCOPY, DISKCOMP and XCOPY, and MODE and SET, which may be necessary to set up a local printer.

Good printer manuals provide extensive information on the hardware and software configuration required to use the printer, and on the precise detail of the cabling required. Many printers and computers offer more than one means of connection: a typical Mac offers LocalTalk and serial communications, and some Macs offer Ethernet. Most IBM-compatible PCs have both serial and parallel communication ports. Study of the computer and printer manuals will indicate the range of possibilities, which can be exploited to provide multiple paths from computer to printer.
INTERNET NEWS

The Internet News system is a world-wide broadcast electronic mail system carried in Australia by AARNet. Useful information on Macintosh, MS-DOS and Windows shareware and public domain software, some of which is useful in failsoft strategy, can be found in these newsgroups:

- aus.archives
- aus.computers.ibm-pc
- comp.binaries.ibm.pc
- comp.binaries.ibm.pc.archives
- comp.binaries.ibm.pc.d
- comp.binaries.mac
- comp.sys.mac.digest

There are many other groups which may be useful too. Through AARNet, it is possible to obtain most of the software discussed in these newsgroups, using FTP.

UNIVERSITY ONLINE SERVICES

The University maintains local public access collections of technical information, equipment and software price-lists and public domain and shareware software for Macintosh, MS-DOS and Windows systems. The majority of this information is available from the LITSS Home page.

THE LOCAL IT SUPPORT CONTACTS MAILING LIST

The IT Support Forum uses this electronic mail mailing list for discussion of IT-related matters. Electronic mail sent to site-contacts is forwarded to all subscribers. It is IT Services practice to publish announcements of new facilities and services, meetings and seminars through the mailing list as well as through other channels.

APPENDIX 1. PASSWORD SELECTION GUIDELINES

The following guidelines should be considered when selecting your password:

- Don't use your login name in any form (as-is, reversed, capitalised, doubled, etc.).
- Don't use your first or last name in any form.
- Don't use your spouse's or child's name.
- Don't use other information easily obtained about you. This includes license plate numbers, telephone numbers, the brand of your car, the name of the street you live on, etc.
- Don't use a password of all digits, or all the same letter. This significantly decreases the search time for a cracker.
- Don't use a word contained in English or foreign language dictionaries, spelling lists, or other lists of words.
- Don't use a password shorter than eight characters.
- Do use a password with mixed-case letters.
- Do use a password with nonalphabetic characters, e.g., digits or punctuation.
- Do use a password that is easy to remember, so you don't have to write it down.
- Do use a password that you can type quickly, without having to look at the keyboard. This makes it harder for someone to steal your password by watching over your shoulder.

Although this list may seem to restrict passwords to an extreme, there are several methods for choosing secure, easy-to-remember passwords that obey the above rules. Some of these include the following:

- Choose a line or two from a song or poem, and use the first letter of each word. For example, "In Xanadu did Kubla Khan a stately pleasure dome decree" becomes IXdKKaspdd.
- Alternate between one consonant and one or two vowels, up to eight characters. This provides nonsense words that are usually pronounceable, and thus easily remembered. Examples include routboot, quadpops, and so on.
- Choose two short words and concatenate them with a punctuation character between them. For example: dog;rain, book+egg, or kid?goat.
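
Several of the guidelines above are mechanical enough to be checked by a short program. The following Python sketch is offered only as an illustration and is not part of the original guidelines: the function names, parameters and message wording are all hypothetical, and the caller is assumed to supply the login name, a list of personal details and a word list. It covers only the rules that can be tested automatically; whether a password is easy to remember or quick to type remains a matter of judgement.

```python
def acronym_from_line(line):
    """Return the first letter of each word, keeping the original case."""
    return "".join(word[0] for word in line.split())


def guideline_violations(password, login_name="", personal_words=(),
                         dictionary=()):
    """List the guidelines a candidate password appears to break.

    Hypothetical helper: the caller supplies the login name, personal
    details (names, street, car brand, etc.) and a dictionary word list.
    """
    problems = []
    lowered = password.lower()
    if login_name:
        login = login_name.lower()
        # Catches the as-is, capitalised, doubled and reversed forms.
        if login in lowered or login[::-1] in lowered:
            problems.append("contains the login name")
    if any(word and word.lower() in lowered for word in personal_words):
        problems.append("contains personal information")
    if password.isdigit():
        problems.append("is all digits")
    if password and len(set(lowered)) == 1:
        problems.append("repeats a single character")
    if lowered in {word.lower() for word in dictionary}:
        problems.append("is a dictionary word")
    if len(password) < 8:
        problems.append("is shorter than eight characters")
    has_lower = any(c.islower() for c in password)
    has_upper = any(c.isupper() for c in password)
    if not (has_lower and has_upper):
        problems.append("lacks mixed-case letters")
    if all(c.isalpha() for c in password):
        problems.append("lacks a nonalphabetic character")
    return problems
```

The first function reproduces the song-or-poem method: applied to the Xanadu line quoted above it yields IXdKKaspdd. The second returns an empty list when none of the checked guidelines is broken.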