Whether an organisation is implementing new infrastructure, migrating to a new operating system, or doing something as seemingly innocuous as applying a security patch, technological hitches can cause serious problems. IT mishaps can impede business performance, causing lost productivity and considerable reputational damage. They have also been known to affect share price, as they can knock shareholder confidence in the efficiency and effectiveness of the organisation.
So how can businesses manage IT risk? One element that crops up again and again in discussions of technology disasters is quality assurance and software testing. Over the last few years, numerous high-profile technology failures have been caused by inadequacies in quality assurance and testing procedures, or because testing was left so late in the day that errors could not be spotted early enough.
A prime example is the case of National Air Traffic Services, where a relatively small-scale task ended up having massive nationwide repercussions. Engineers had to test an update that was due for live roll-out, which involved bringing down the complex 30-year-old Flight Data Processing System and running a 45-minute test. However, when the live system was restarted, controllers began to report errors in the flight data. Fearing plane collisions, all take-offs were halted while the pre-test system was restored. The hour-long downtime meant flight cancellations affected 200,000 passengers across the UK. In this instance, an ageing IT infrastructure meant that tests were run on a live system without adequate impact analysis.
Another example of a project failing due to insufficient testing is the UK Child Support Agency (CSA). Dogged by innumerable problems, the CSA's £456m computer system played havoc with the organisation's attempts to track and secure child support payments from absent parents. The Government threatened to pull the plug on the system, while the supplier, EDS, blamed the Government for ordering 2,000 changes to its original requirements. However, it is evident from Government reports that many defects remain outstanding in the IT system, along with incompatibilities with the CSA's business processes. A more stringent process would have ensured that changes were correctly managed and tested, and that the changing requirements did not undermine the quality of the system.
Strategic recognition
The problem is that testing is all too often treated as a box-ticking operational function and is not given the strategic recognition it deserves. Testing and quality assurance should feature as a priority in any organisation's risk management strategy and procedures, yet in practice they are often paid little more than lip service.
Risk-based testing means that the planner identifies the business processes that are at the greatest risk of failure, and those that would present the greatest risk to the business if they were to fail. With this information, the business can decide on the optimum quality assurance and testing strategy given the budget, the time available and the quality objectives. Risk-based testing and quality assurance mean resources are focused squarely on mitigating risk, which saves time and headaches in the long run.
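As a rough illustration of the idea, the sketch below scores hypothetical business processes by likelihood of failure and impact if they fail, then orders them so that test effort goes to the riskiest first. The process names, scales and scores are invented for the example; a real risk model would be agreed with the business.

```python
# Minimal sketch of risk-based test prioritisation: score each business
# process by likelihood of failure and impact if it fails, then spend the
# testing budget on the highest-scoring items first. All values are
# illustrative, not drawn from any real assessment.
from dataclasses import dataclass


@dataclass
class ProcessRisk:
    name: str
    likelihood: int  # 1 (rare) to 5 (almost certain)
    impact: int      # 1 (negligible) to 5 (severe)

    @property
    def score(self) -> int:
        return self.likelihood * self.impact


processes = [
    ProcessRisk("Payment calculation", likelihood=4, impact=5),
    ProcessRisk("Address change", likelihood=2, impact=2),
    ProcessRisk("Overnight batch feed", likelihood=3, impact=4),
]

# Focus test effort on the riskiest processes first.
for p in sorted(processes, key=lambda p: p.score, reverse=True):
    print(f"{p.name}: risk score {p.score}")
```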
Technology failures arising from testing inadequacies have been particularly prevalent in outsourcing arrangements. An example of an outsourced project failing due to insufficient testing is a desktop project at the UK Department for Work and Pensions (DWP). Poor testing procedures meant a local PC operating system upgrade was erroneously applied across the organisation, disabling host access. It may not seem the most serious of incidents, but it highlights how a seemingly simple local upgrade can have widespread effects if the right testing procedures are not in place. Up to 80,000 of the DWP's PCs were affected, making it one of the worst IT failures of the last year.
Another DWP outsourcing project that has been dogged by problems, principally due to defective quality assurance and testing, is the recent Job Centre Plus debacle. In a recent report, the House of Commons Work and Pensions Committee denounced the project as a 'catastrophic failure in customer service'. Problems arose from the outset, with insufficient needs analysis compounded by inadequate capacity testing procedures. The fundamental problem seemed to be that the system was built with little thought given to the actual business needs of Job Centre Plus.
The report stated: 'It seems extraordinary that a system developed at great cost and introduced with some difficulty, is so out of step with the future DWP IT strategy that there are plans to replace it incrementally before it has even been rolled out nationwide.' If there had been sound ongoing testing and quality assurance procedures in place, the alignment between the technology and the business needs could have been validated as the project progressed, and any divergence identified and addressed.
While public sector technology failures caused by problematic systems testing are widely publicised, because Government projects are more publicly accountable, such problems are just as likely to occur in the private sector.
Verification and validation
Testing is often treated as a purely operational function, when it should be about the verification and validation of a solution to fit the business; it is not simply checking that the system works. As in the Job Centre Plus scenario, it seems that, all too often, technology implementations are conducted with little thought for the business requirements. Testing helps to establish whether the specification is correct and whether the system has been built according to that specification. The result of good quality assurance is a system or process that does exactly what it says on the tin, with all areas of the business aligned to the project.
Achieving this verification and validation in-house requires significant management of people and processes, coupled with clarity and visibility of communications and information.
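To make the distinction concrete, the sketch below contrasts a verification check, which asks whether the system matches its specification, with a validation check, which asks whether that specification actually serves the business need. The rebate rule, figures and function names are hypothetical, chosen only to illustrate the two questions.

```python
# Hypothetical example separating verification from validation.


def calculate_rebate(spend: float) -> float:
    """Specified behaviour: 5% rebate on annual spend over £10,000."""
    return spend * 0.05 if spend > 10_000 else 0.0


def test_verification():
    """Verification: has the system been built according to its specification?"""
    assert calculate_rebate(20_000) == 1_000.0
    assert calculate_rebate(5_000) == 0.0


def test_validation():
    """Validation: does the specified behaviour serve the business need?

    Suppose the business actually intends the rebate to apply only to spend
    above the threshold. The code passes verification against its spec, yet
    validation against the business rule exposes the divergence, which must
    be taken back to the stakeholders rather than 'fixed' in the code alone.
    """
    intended_by_business = (20_000 - 10_000) * 0.05  # £500 under the real rule
    specified_by_system = calculate_rebate(20_000)   # £1,000 under the spec
    assert intended_by_business != specified_by_system  # divergence to escalate
```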
Another reason why testing and quality assurance can take a back seat in technology projects is that suppliers and internal IT departments are often under pressure to meet deadlines, stay within budget and meet stringent service level agreements. The pressure to push systems out on time can mean testing is compressed into a shorter window to limit delays in implementation.
To ensure that damage limitation is as ironclad as possible in any technology implementation, testing and quality assurance should play a vital role. If, as recommended, they are involved from the outset, the important questions at each stage are these (one simple way of tracking them is sketched after the list):
- Idea: is the idea viable and commercially sound?
- Requirements definition: are the requirements unambiguous and complete?
- Specification and design: will the design fulfil the requirements?
- Development: has the right outsourced system been built and has it been built correctly?
- Implementation: does the outsourced system/process match what was intended at the start of the implementation stage?
- Value realisation: are the measured benefits the same as those expected?
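To show how lightweight such discipline can be, the snippet below records these questions as explicit quality gates and lists those still awaiting sign-off. The stage names mirror the list above, while the sign-off data and the exact wording of the gate questions are illustrative only.

```python
# Illustrative sketch: the lifecycle questions above treated as quality gates
# that must be signed off before a project stage is considered closed.
GATES = {
    "Idea": "Is the idea viable and commercially sound?",
    "Requirements definition": "Are the requirements unambiguous and complete?",
    "Specification and design": "Will the design fulfil the requirements?",
    "Development": "Has the right system been built, and built correctly?",
    "Implementation": "Does the delivered system match what was intended?",
    "Value realisation": "Are the measured benefits the ones expected?",
}

# Example sign-off state; in practice this would come from project records.
signed_off = {"Idea": True, "Requirements definition": True}


def open_gates(sign_offs):
    """Return the gates still blocking progress, in lifecycle order."""
    return [stage for stage in GATES if not sign_offs.get(stage, False)]


for stage in open_gates(signed_off):
    print(f"Outstanding gate - {stage}: {GATES[stage]}")
```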
As the success of any major technology implementation is so dependent on the robustness of testing and quality assurance procedures, it really is time testing was put firmly on the agenda.
- Adam Ripley is managing director of test consultancy specialist IS Integration, www.isintegration.com
THE ROLE OF TESTING IN MIGRATION
According to recent research, the migration to Windows XP has yet to take off in the UK, with only 11% of UK companies having made the transition, despite the fact that the system was released nearly four years ago. And now organisations have to think about Windows Vista, whose release is scheduled for January 2007.
The biggest hurdles to migrating to a new operating system tend to be concerns about the complexity of the roll-out process and the demands it will place on already stretched IT departments. There is also concern over the ability of existing infrastructures to support the new system. Understandably, with some large organisations running 40,000-50,000 desktops, many are worried about the possible fallout and damage to the organisation should any hitches occur in the migration process. Despite these reservations, migration to new, updated operating systems is inevitable, and, sooner or later, organisations have to bite the bullet.
While there is little doubt that migrating to a new operating system requires more effort than initially anticipated, there is little evidence that organisations carefully plan and prepare for it. In fact, many companies take a firefighting approach to migration: they fail to anticipate problems and deal with them only when they occur.
Successful migration depends on an effective management process, and while some companies have already migrated to a new environment, principally Windows XP, few have documented the successful methods that will help when they finally have to adopt Vista. Testing plays a pivotal role here and should be addressed at every stage, from the conceptual phase of the migration until the new system goes live.
Take co-existence as an example. IS Integration urges any organisation embarking upon a migration programme to define strategies early and determine which critical services need to co-exist during the transition from the old environment to the new. In this context, co-existence activities need to consider both users, who need to share data across different operating environments, and applications, which have to work with different versions of libraries or configuration data sets.
Thorough planning and testing enable organisations to spot and deal with problems such as these before they arise. For example, Microsoft introduced new file formats for packages such as Visio and Project. Until all users had upgraded to the new versions of these packages, companies had to train staff to save documents in older formats, which evidently had cost implications. Knowledge bases and document management systems also required upgrades to continue indexing data correctly.
Other problems include user-developed applications, which tend to be written using programs such as Microsoft Access and may turn out not to be supported by the new version, rendering them useless. And security is a further consideration.