Effective Asset Management Tool Helps Underscore Crucial Role Of Maintenance Function In Platform Design, Operation
Two separate surveys on the motivation for outsourcing the maintenance function were recently held in The Netherlands; one (NIPO, 1994) addressing maintenance staff; another (Ernst & Young, 1998) asking for the opinion of executives.
Fig. 1 shows the wide disparity of their responses.
The maintenance staff mainly valued the additional flexibility in terms of hiring in specialist skills and temporary workforce capacity. However, the representatives from the boardroom essentially challenged the existence of an in-house maintenance function by putting a heavy weighting on the non-core business aspects, the desired change-over from fixed to variable costs, and the expectation of significant cost savings.
Why such a large difference in perception? The importance of process reliability and availability is indisputable:
- In the heavily automated and capital-intensive oil process industry, any loss of production due to equipment unavailability strongly impairs the company profit; operating rate is the single most important factor in determining return on average capital employed (ROACE) (Anonymous, 1994).
- In various sectors of industry, the direct costs (manning and material) of maintaining plant assets are similar to or higher than those of operating the process. A rough estimate of the total costs is two to five times the direct costs. Costs for similar installations may vary by a factor of five. An in-house study at Union Carbide in 1987 showed that maintenance managers are convinced they have only a limited degree of control, about 50%, over all activities (Pierce, 1987). Pacesetters have shown that direct maintenance costs may be reduced by up to 40% (Cooper, 1992) with a simultaneous increase in plant availability.
- Reliability forms the basis of safety and environmental control. Industrial accidents invariably are shown to have precursors in human and system reliability. Of the 121 events recorded in the European Major Accident Reporting System (Drogaris, 1993), 61 were directly related to maintenance-induced errors, component failures, or corrosion. Environmental pollution frequently results from the inability of systems to control large process excursions due to sudden equipment failure.
Part of the answer may lie in the culture of the maintenance staff. Organizational effectiveness and proper education are necessary to achieve the "improvement mode" and the "thinking like an operations manager, not like an appliance repairman" stance (LePree, 1999). We observed earlier a striking difference in evolution and user acceptance between such comparable fields as control engineering and reliability engineering (Van Rijn and Scholten, 1996). In recent decades, a whole body of control and optimization theory has been developed that now forms part of the standard curriculum of young engineers. New, advanced computerized control and optimization techniques were easily accepted by operations and are now considered indispensable.
A comparable theoretical background of asset management still is missing; most of the literature on maintenance optimization models does not fit in an engineering mindset, being too specialist and lacking an overall, systems-engineering viewpoint (Dekker and Van Ryn, 1996). Reliability engineering is hardly ever integrated in the mechanical engineering curriculum, and maintenance managers still lack effective decision-support techniques at large.
Asset management
A fundamental reason for struggling with cost and recognition problems often is that maintenance departments have no proper place in the organizational structure of a company.
The maintenance function, given its influence on profits and integrity, should have its own business plan with mission, vision, and objectives derived from that of the company. Management-control techniques have to be in place to achieve clearly defined targets and milestones at strategic, tactical, and operational levels (Fig. 2).
To allow the company to make strategic decisions, to evaluate new opportunities, or to meet new demands, asset management should be capable of providing detailed information on the production capabilities of existing installations, preferably as a function of time.
Planning and structuring of asset management takes place at the tactical level. Technical input is needed in new designs or in the redesign of existing facilities. Given asset reliability and maintenance characteristics, the maintenance function should be capable of estimating the probability of meeting requested demands, both in long-term average terms and as a function of time over, say, 5-year periods. Critical process elements should be identified early to effectively focus design and maintenance management activities. Where possible, maintenance strategies should be based on cost-benefit calculations. Together with technical information, these findings are laid down in a maintenance reference plan (MRP), the structure of which has to be reflected in the data model of the computerized maintenance management system (CMMS). MRPs are usually updated every 5 years, reflecting changes in business climate and improved know-how of the installation.
Operational management covers the execution of the activities laid down in the MRP. This is the most costly part of asset management, because it is where resources and material are consumed and interference with production takes place. It thus requires close attention of the maintenance manager to guarantee effectiveness and quality of work. Because the first MRP is based on either generic or historic data from comparable equipment, the well-known quality improvement cycle (plan, schedule, execute, analyze, improve) applies here. Technicians should be well aware of the underlying reasons for a chosen strategy, and of the assumptions involved, so that they can provide essential feedback based on their observations.
These three layers are strongly interdependent. For instance, a good plan will not provide the estimated cost benefit if the underlying assumption of high quality of execution is not met. On the other hand, improving execution of tasks that are not rewarding from a business perspective is equally ineffective. Given the differences in goal-setting and execution monitoring, it is essential to locate management problems at the right level in the hierarchy (Grievink, et al., 1993). Decision-support tools that are easy and "safe" for staff at all levels to use, especially those lacking a solid reliability-engineering background, are essential to demonstrate the underlying reasons of the MRP in order to achieve acceptance and integration of the plan in the company.
SPARC
Within Shell Research Amsterdam, these observations led to the development of a decision-support tool called SPARC (System for Production Availability & Resources Consumption) that, fitting in an engineering (rather than a specialist) mindset, allows maintenance staff to make fact-based decisions in the design and operational phase of complex production systems.
Because Monte Carlo simulation packages proved difficult for practical engineers to use (and to be convinced by), SPARC uses an analytical approach. Via a graphical user interface, an engineer can easily model a system and analyze its steady-state or time-dependent behavior. Calculations normally take less than 1-2 min on a standard personal computer, thus making SPARC an effective tool for answering "what if" questions.
SPARC has been successfully used in a large number of Shell Exploration & Production companies with such typical applications as redundancy and availability optimization, equipment selection, risk analysis, maintenance strategy, and maintenance manning level optimization. It proves to be an effective vehicle in integrating skills in a project organization, because various experts can work with the same model. After the business reorientation in Shell, Program 42 Consulting Ltd. obtained an exclusive license to market or lease this package both within and outside Shell and its affiliates.
SPARC uses two different ways of representing the process. The first is a top-down, tree-type decomposition of the process into systems; any failed system (power, compression, etc.) will bring the process into a down state. Systems are composed of equipment units (vessels, pumps, instrumentation, etc.) linked in reliability engineering terms in a serial or parallel configuration (Note 1). Units are characterized by their processing capacity, e.g., in million b/d or cu m/hr, and by their starting date.
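The serial and parallel linkage described above can be illustrated with a minimal sketch. It assumes statistically independent units and treats a parallel group as pure redundancy (one working unit is enough); the function names and availability figures are hypothetical, and the sketch is not a description of SPARC's own algorithm, which also accounts for unit capacities.

```python
from math import prod

def series_availability(availabilities):
    """Serial link: the group is up only if every element is up."""
    return prod(availabilities)

def parallel_availability(availabilities):
    """Parallel (redundant) link: the group is down only if every element is down."""
    return 1.0 - prod(1.0 - a for a in availabilities)

# Hypothetical figures, not taken from the study.
pumps = parallel_availability([0.95, 0.95])          # two redundant pumps
system = series_availability([0.99, pumps, 0.98])    # vessel, pump group, instrumentation
```

Even in this small example the redundant pump group is far more available than a single pump, while the serial chain as a whole is less available than its weakest element.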
Each unit will have one or more specified tasks (Note 2) that have to be maintained; in SPARC terms, these are maintainable conditions (MCs). MCs may fail in one or several failure modes (FMs). The probability of failure as a function of time is expressed via the Weibull a and b parameters, where a represents a characteristic time constant (for b = 1: the mean time between failures) and b the time-dependency (b = 1: constant failure rate, random failures; in general, the failure rate is proportional to t^(b-1)). The consequence of failure is modeled by the expected downtime and the (operational) capabilities to reduce the effect; the (local) capacity after failure is taken to be reduced from its initial value to a bypass capacity that may range from 0 to 100%. Maintenance strategies (Note 3) are based both on these FM descriptions and on the technical effectiveness of planned maintenance. Finally, maintenance activities may be grouped into packages (MPs).
SPARC offers the following maintenance strategies:
- Corrective (break-down) maintenance.
- Time-based (block) preventive maintenance.
- Age-based (e.g., run-hours) preventive maintenance.
The model assumes the failed element to be "as good as new" after each intervention. To model partial replacements where this assumption is invalid, a fourth category, "block with minimal repair" (where repair leads to "as good as old"), is available. Each activity is characterized by downtime, man-hours involved (up to five categories), and costs (set-up, material).
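The role of the two Weibull parameters can be made concrete with a short sketch. It assumes the standard two-parameter Weibull form, R(t) = exp(-(t/a)^b), which is consistent with the description above but is not spelled out in the text; the function names are hypothetical.

```python
import math

def weibull_reliability(t, a, b):
    """Probability of surviving to time t without failure (t >= 0)."""
    return math.exp(-((t / a) ** b))

def weibull_failure_rate(t, a, b):
    """Failure (hazard) rate at time t > 0; proportional to t**(b - 1)."""
    return (b / a) * (t / a) ** (b - 1)

def weibull_mean_life(a, b):
    """Mean time to failure; equals a when b = 1."""
    return a * math.gamma(1.0 + 1.0 / b)

# Hypothetical element with a characteristic time of 10,000 hours.
a_hours = 10_000.0
random_rate_early = weibull_failure_rate(1_000.0, a_hours, 1.0)   # b = 1: constant rate
random_rate_late = weibull_failure_rate(8_000.0, a_hours, 1.0)
wearout_rate_early = weibull_failure_rate(1_000.0, a_hours, 3.0)  # b > 1: rate grows with age
wearout_rate_late = weibull_failure_rate(8_000.0, a_hours, 3.0)
```

With b = 1 the failure rate does not depend on age, so time-based or age-based preventive replacement cannot reduce the number of failures; with b > 1 the rate rises with age, which is the situation in which such strategies can pay off.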
On the basis of this model and the input data, SPARC derives the time-profile and steady state (long-term average) value of the up- and downtime distributions of the various model elements. Using an availability block representation (Fig. 4), a large number of output variables may be calculated, as long-term averages or as time-dependent values in periods and in a time horizon defined by the user:
- Probability of system running at the various capacity levels.
- Effective production.
- Probability of meeting demand (system effectiveness).
- Resource and materials consumption.
- System, unit, MC, FM availability.
The output is generated in the form of Excel tables and graphs, allowing easy connection with standard word processors.
Production platform design evaluation
Fig. 5 shows the availability block diagram of a production platform (Note 4; a screen dump from the SPARC graphical user interface).
Seven wells, each producing 50,000 b/d, feed via risers and templates into a platform with a design capacity of 300,000 boe/d, where oil and gas are separated, metered, and transported via two pipelines. The platform also houses a vapor recovery system, glycol and chemical injection facilities, and a number of auxiliaries.
The purpose of this study is to assess the expected annual production and (unplanned) man-hours at an early stage in the design. Because equipment selection is assumed not yet to be final, only generic reliability, downtime, and resource-consumption data (OREDA, 1997) or historical in-house data at unit level can be used. Such figures are best regarded as long-term averages of "standard" operation, using various maintenance strategies.
Therefore we use a constant failure rate concept (Weibull b = 1) and a characteristic time a equal to the observed MTBF (mean time between failures). About 280 failure modes are taken into account in this model using the OREDA classifications of critical, degraded, and incipient.
A first analysis shows that the system is capable of delivering, on average, 269,400 boe/d, which is 89.8% of the platform capacity. Based on OREDA data, we expect an annual workload for unplanned maintenance of 14,772 man-hours divided over the various system units as given in Fig. 6 (to cover the range, the man-hours are plotted on a logarithmic scale).
With SPARC, the designer easily can carry out a sensitivity-criticality analysis to estimate the change in system capacity if a given subsystem were 100% available.
Fig. 7 shows that the gas-compression train causes most of the production losses; if this system alone were 100% available (with the other units unchanged), the annual production would increase by 5.393 million bbl. Second in importance is the gas-transfer and sales section; if it were 100% available, the output would be 1.924 million bbl/year more than that of the original system.
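To put these sensitivity figures in proportion, the following sketch relates them to the total production shortfall implied by the reported averages. It assumes a 365-day year and treats boe and bbl as interchangeable for the purpose of the comparison; both are simplifications introduced here, and the variable names are hypothetical.

```python
DESIGN_CAPACITY = 300_000      # boe/d, from the platform description
AVERAGE_PRODUCTION = 269_400   # boe/d, from the first analysis
DAYS_PER_YEAR = 365            # assumption

# Annual shortfall relative to design capacity.
shortfall_per_year = (DESIGN_CAPACITY - AVERAGE_PRODUCTION) * DAYS_PER_YEAR

# Reported annual gains if a single system were 100% available.
gains = {
    "gas compression": 5.393e6,
    "gas transfer and sales": 1.924e6,
}

# Share of the total shortfall that each single-system improvement would recover.
shares = {name: gain / shortfall_per_year for name, gain in gains.items()}
```

Under these assumptions the compression train accounts for roughly half of the shortfall and the gas-transfer and sales section for roughly a sixth. Such single-system gains need not add up to the total, because each is evaluated with all other units left unchanged.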
Given the design structure, the many FMs, and bypass capacities, the system can operate at various capacity-levels. The average production of 269,400 boe/d can further be analyzed in a production envelope (Fig. 8). This figure shows the cumulative probability of a given production level over a time interval selected by the user, a parameter of importance in dealing with sales agreements. For example, the original system has a 96% probability of meeting a demand volume per month between 0 and about 70%; for a level between 70% and 82%, this probability drops to 78%, to reach a level of 75% at the maximum system capacity.
We noted already that the compression train turned out to be the most critical system. The designer now may quickly evaluate design alternatives by cutting, copying, and pasting in the GUI. Fig. 8 shows that adding a fourth train of 105% capacity to the existing three has almost the same effect as increasing the capacity of each of the three trains to 150%. The cost benefit of such design alternatives is easily calculated, given the capital investment costs and the SPARC assessment of production and resources.
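The idea behind a production envelope and the comparison of design alternatives can be sketched by enumerating the up and down states of a group of parallel trains. The sketch assumes independent trains, zero bypass capacity for a failed train, and hypothetical capacities and availabilities; it is not the calculation performed by SPARC and does not reproduce the figures of the study.

```python
from itertools import product

def capacity_distribution(trains):
    """trains: list of (capacity_fraction_of_demand, availability).

    Returns {delivered_fraction: probability}, with delivery capped at 100% of demand.
    """
    dist = {}
    for state in product([True, False], repeat=len(trains)):
        prob = 1.0
        capacity = 0.0
        for is_up, (cap, avail) in zip(state, trains):
            prob *= avail if is_up else (1.0 - avail)
            if is_up:
                capacity += cap
        level = round(min(1.0, capacity), 6)  # rounding merges equal levels
        dist[level] = dist.get(level, 0.0) + prob
    return dist

def expected_delivery(dist):
    """Long-term average delivered fraction of demand."""
    return sum(level * prob for level, prob in dist.items())

# Hypothetical alternatives: three trains, a fourth train added, or three uprated trains.
base = [(0.4, 0.9)] * 3
with_fourth_train = base + [(0.4, 0.9)]
uprated_trains = [(0.6, 0.9)] * 3

results = {
    name: expected_delivery(capacity_distribution(trains))
    for name, trains in [
        ("base", base),
        ("fourth train", with_fourth_train),
        ("uprated", uprated_trains),
    ]
}
```

The distribution returned by `capacity_distribution` is the raw material for an envelope of the kind shown in Fig. 8: summing the probabilities of all levels at or above a given demand gives the probability of meeting that demand.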
Time-dependent behavior
Such long-term assessments are a viable step in the first analysis of a preliminary design, where decisions on design alternatives almost without exception will determine the system configuration during its economic life. However, they do not show the ability of the maintenance function to manage unavailability; a well-chosen maintenance strategy can effectively support operations to "produce the right amount of products at the right moment in time."
Fig. 9 shows the screen dump of the availability block diagram of a (reciprocating) compressor system containing the compressor; lubrication oil and cooling water system; and associated vessels, piping, and instrumentation. Reliability and maintenance data for the SPARC model were taken from generic sources, enhanced by expert opinion and the first 4 years of field data in the CMMS.
Fig. 10a shows the evolution of expected monthly availability over time since the starting date of the system. Effective management requires such time-dependent information, rather than a long-term average (Fig. 10b) such as A = MTBF/(MTBF + MTTR), where MTTR is the mean time to recovery. The compressor is seen to have the largest contribution to system unavailability, followed by the lubricating oil system. Clearly visible are the effects of planned maintenance routines, which temporarily reduce system availability due to the associated downtime but restore part of the reliability of operation thereafter.
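The contrast between a long-term average and a time-dependent availability can be illustrated with the simplest textbook case: a single element with constant failure and repair rates that starts in the up state. This is only a sketch under those assumptions, with hypothetical MTBF and MTTR values; the time profiles in Fig. 10a also reflect aging and planned maintenance, which this two-state model does not capture.

```python
import math

def steady_state_availability(mtbf, mttr):
    """Long-term average availability A = MTBF / (MTBF + MTTR)."""
    return mtbf / (mtbf + mttr)

def point_availability(t, mtbf, mttr):
    """Probability of being up at time t for a two-state element that is up at t = 0.

    Assumes constant failure rate 1/MTBF and constant repair rate 1/MTTR.
    """
    lam = 1.0 / mtbf   # failure rate
    mu = 1.0 / mttr    # repair rate
    return mu / (lam + mu) + (lam / (lam + mu)) * math.exp(-(lam + mu) * t)

# Hypothetical element: MTBF of 2,000 hours, MTTR of 50 hours.
MTBF_HOURS = 2_000.0
MTTR_HOURS = 50.0
long_term = steady_state_availability(MTBF_HOURS, MTTR_HOURS)
downtime_hours_per_year = 8_760 * (1.0 - long_term)
after_one_day = point_availability(24.0, MTBF_HOURS, MTTR_HOURS)
```

In this simple model the availability starts at 1 and decays toward the long-term average, so a single average figure hides the fact that the expected availability differs from period to period.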
The installed preventive maintenance program has an appreciable effect on the expected output of the platform. Fig. 11 compares the monthly expected unavailability (loss of production) of the installed system with the same, using breakdown maintenance only. Clearly, the maintenance manager is capable of reducing the unavailability by about 40%, or about 7 days/year of production. This benefit occurs in spite of the involved system shutdowns (the peaks in Fig. 11); a proper timing of these activities with respect to demand can significantly further improve the effectiveness. Fig. 12 shows the additional man-hours similarly required for a proper cost-benefit analysis.
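The two figures quoted above (a reduction of about 40%, worth about 7 days/year) imply the underlying unavailability levels, which can be recovered with a few lines of arithmetic. The sketch assumes a 365-day year and takes the rounded figures at face value, so the results are indicative only; the function name is hypothetical.

```python
DAYS_PER_YEAR = 365  # assumption

def implied_unavailability(days_saved, relative_reduction):
    """Return (baseline_days, remaining_days) of unavailability per year."""
    baseline_days = days_saved / relative_reduction
    remaining_days = baseline_days - days_saved
    return baseline_days, remaining_days

# Breakdown maintenance only versus the installed preventive program.
baseline, remaining = implied_unavailability(days_saved=7, relative_reduction=0.40)
baseline_fraction = baseline / DAYS_PER_YEAR
remaining_fraction = remaining / DAYS_PER_YEAR
```

On these rounded inputs, breakdown maintenance alone corresponds to roughly 17.5 days/year of lost production (just under 5%), against roughly 10.5 days/year (just under 3%) with the preventive program in place.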
Conclusions
Asset management is an important managerial topic in the oil production sector, both from an economic point of view and in relation to safety and the environment. The current low profile of the maintenance function, with its strong focus on technical aspects, observed in a number of companies, has to be changed to a fact-based, proactive management attitude.
In fact, the maintenance function has to develop itself into a business unit with mission, vision, and objectives derived from that of the company. A proper recognition of the various layers in the asset management hierarchy then is essential for structuring decisions. To put the maintenance function on a par with other decision-makers, and to convince them, effective decision-support tools such as SPARC, which engineers and technicians at various levels in the organization can use easily, are essential for comparing alternatives and strategies, deepening insight, and achieving cooperation and feedback of all staff involved.
References
Anonymous (1994), Industry analysts focus on US, Asian, Latin American markets (OGJ, Apr. 25, 1994, p. 45).
Cooper, B. (1992), Maintenance strategy procedures, development, and implementation, Third ESReDA Seminar on Equipment Aging and Maintenance, Chamonix, Oct. 14-15.
Dekker, R. and C.F.H. van Ryn (1996), Operational research supports maintenance decision making, in L. Fortuin, P. van Beek, and L. van Wassenhove, OR at wORk, Taylor & Francis, London, ISBN 0 7484 0456, pp. 93-109.
Drogaris, G. (1993), Major Accident Reporting System: Lessons Learned from Accidents Notified, Elsevier, Amsterdam.
Grievink, J., K. Smit, R. Dekker, and C.F.H. van Ryn (1993), Managing reliability and maintenance in the process industry, Foundations of Computer-Aided Process Operations (FOCAPO), Crested Butte, Colo., July 18-23, CACHE.
LePree, J. (1999), Maintenance at the millennium, IMPOmag.com, Feature Focus, October 1999.
OREDA-97, Offshore Reliability Data Handbook, 3rd Edition, 1997, Det Norske Veritas, Hovik, Norway.
Pierce, F.R. (1987), Invited comments on maintenance planning, in G.V. Reklaitis and H.D. Spriggs (Eds.) Proceedings of FOCAPO, Park City, Utah, CACHE, pp. 272-280.
Van Rijn, C.F.H. and P. Scholten (1996), Integral management of production assets, Maintenance, June.
Notes
1. In a serial configuration, the process is assumed to fail if one of the elements fails; parallel configurations are employed to model redundancy.
2. These tasks are derived from a failure mode and effect analysis, following the Reliability Centered Maintenance approach.
3. With the multitude of maintenance jobs on a platform, the use of strategies rather than individual job assessment is the most viable approach.
4. This illustrative example is based upon an actual study. For reasons of commercial security, equipment names, capacities, and reliability data were altered. If all data are available, such a model can be built in a few man-days. Each calculation (evaluation of a design alternative) takes less than 2 min, making SPARC a powerful "what if" analysis tool.
The Author
Cyp van Rijn spent his working life with Royal Dutch/Shell's Shell Research in Amsterdam. Among other roles, he managed an R&D group on process optimization and control in Shell's chemical industry. Later, he became responsible for R&D on optimization techniques for Shell worldwide; his group developed new decision-support tools for oil production, refineries, chemical plants, and logistics. He was active in organizing IFAC and FOCAPO conferences; in initiating the Dutch Association for Reliability Engineering; in setting up and chairing the European Safety, Reliability, & Data Association; and in editorial and review activities for technical journals. After his early retirement, he initiated an MSc course on maintenance management, where he now acts as a lecturer on reliability assessment and maintenance optimization. He works part-time for the European Commission as a research evaluator-reviewer and as a senior consultant for Program 42/IES.