Reliability, Availability, Maintainability and Safety
AEON were tasked by a UK satellite integration and operational services provider to evaluate their existing In-Orbit Servicing Control Centre and to specify a RAMS analysis programme which would contribute to the prevention of catastrophic mission scenarios due to the failure of terrestrial systems.
The In-Orbit Servicing Control Centre is a dedicated facility to support the operation, configuration, and calibration of satellites whilst they are in orbit. Typically, these centres resemble control rooms, with specialised computing and networking equipment that permits engineers and scientists to communicate with, and command, in-orbit assets.
The aim was to ensure that equipment failures within the In-Orbit Servicing Control Centre (e.g., computing, networking, and satellite connectivity infrastructure such as Ground Stations) would not compromise the success of in-orbit operations, and ultimately, the mission.
To achieve this aim, AEON performed an on-site survey at the In-Orbit Servicing Control Centre, taking inventory of existing equipment and generating an architecture of functionally dependent equipment. By classifying equipment by type, determining reliability and availability rates, and then overlaying common in-mission functions and operations, it was possible to identify equipment which was highly critical to mission success. Knowledge of which equipment items were highly critical would allow for failure probability analyses and mitigation planning.
AEON’s Systems Engineers and Supportability Specialists performed a benchmark assessment of a UK satellite integration and operational services provider’s In-Orbit Servicing Control Centre and analysed the need to implement a Reliability, Availability, Maintainability and Safety analysis.
Standards and Best Practices for RAMS Analysis
Many standards and ‘best practices’ exist for the assessment of reliability, availability, maintainability and safety, including (but not limited to):
- ANSI/EIA 632: Processes for Engineering a System
- IEC 300: Dependability Management
- IEEE 1220: Application and Management of Systems Engineering
- ISO/IEC 15288:2008: Systems Engineering – System Life Cycle Processes
- MIL-STD-499A: System Engineering Management
- MIL-STD-470B: Maintainability Program Requirements for Systems and Equipment
- MIL-STD-785: Reliability Program for Systems and Equipment, Development and Production
- MIL-STD-882C: System Safety Program Requirements
- MIL-STD-1388-1A: Logistics Support Analysis
- MIL-STD-1472D: Human Engineering Design Criteria for Military Systems, Equipment and Facilities
- SAE JA1011: Evaluation Criteria for Reliability Centred Maintenance (RCM) Processes
Determining Functions and Equipment
Because the In-Orbit Servicing Control Centre already exists and was designed with a purpose, a site survey makes it possible to extract two key structural hierarchies which are used to help define a system:
- Functional Breakdown Structure: Its functional definition, which defines the system’s functions and behaviours
- Equipment Breakdown Structure: Its physical definition, which defines the system’s physical construct
Combining these two structures informs which function is performed by which piece of equipment and importantly, it will likely show that some functions rely on several pieces of interfacing equipment performing several sub-functions. By using Enterprise Architect, AEON’s experts were able to effectively model types of equipment, their deployment, functionality, and connectivity. This model is called an architecture and describes an abstracted representation of the In-Orbit Servicing Control Centre.
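As a minimal sketch of how the two breakdown structures can be combined, the allocation of functions to equipment can be held as a simple mapping and inverted to show which equipment items support the most functions. The function names, equipment names and helper names below are hypothetical and are not taken from AEON’s Enterprise Architect model.

```python
from collections import defaultdict

# Hypothetical allocation of functions (Functional Breakdown Structure)
# to the equipment that performs them (Equipment Breakdown Structure).
FUNCTION_TO_EQUIPMENT = {
    "compile command": ["operator workstation", "mission control server"],
    "send command": ["mission control server", "network switch", "ground station modem"],
    "receive telemetry": ["ground station modem", "network switch", "telemetry server"],
}


def equipment_to_functions(allocation):
    """Invert the allocation: for each equipment item, list the functions it supports."""
    reverse = defaultdict(set)
    for function, items in allocation.items():
        for item in items:
            reverse[item].add(function)
    return dict(reverse)


def rank_by_function_count(allocation):
    """Sort equipment by the number of functions that depend on it (most first)."""
    reverse = equipment_to_functions(allocation)
    return sorted(reverse.items(), key=lambda kv: (-len(kv[1]), kv[0]))


if __name__ == "__main__":
    for item, functions in rank_by_function_count(FUNCTION_TO_EQUIPMENT):
        print(f"{item}: {sorted(functions)}")
```

Equipment near the top of this ranking supports several functions at once and is a natural first candidate for closer reliability and availability assessment.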
Traceability of Functions
Following the definition of functions and equipment, it is necessary to trace functions to known in-orbit operations, such as normal communication exchanges, monitoring of parameters, or the specific issuing of commands. These operations restate the functions (as taken from the Functional Breakdown Structure) in a ‘mission context’, for instance “adjust attitude of space segment”. These in-orbit ‘use-cases’ may require multiple functions and sub-functions to be performed either in logical sequence or in parallel (e.g., compile command, send command, acknowledge receipt of command, execute manoeuvre, send confirmation of command execution etc.). This mapping of functions and use-cases allows the evaluation of which item of equipment is required for each in-orbit use-case; it also defines the function duty cycle.
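The traceability described above can be sketched as two lookups chained together: use-case to functions, then functions to equipment. This is an illustrative sketch only; the use-case, function and equipment names are assumptions, as is the helper `equipment_for_use_case`.

```python
# Hypothetical traceability data; all names are illustrative only.
USE_CASE_TO_FUNCTIONS = {
    "adjust attitude of space segment": [
        "compile command",
        "send command",
        "acknowledge receipt of command",
    ],
    "monitor parameters": ["receive telemetry"],
}

FUNCTION_TO_EQUIPMENT = {
    "compile command": ["operator workstation"],
    "send command": ["network switch", "ground station modem"],
    "acknowledge receipt of command": ["ground station modem", "network switch"],
    "receive telemetry": ["ground station modem", "telemetry server"],
}


def equipment_for_use_case(use_case):
    """Return the sorted set of equipment needed by every function in a use-case."""
    required = set()
    for function in USE_CASE_TO_FUNCTIONS[use_case]:
        required.update(FUNCTION_TO_EQUIPMENT[function])
    return sorted(required)


if __name__ == "__main__":
    for name in USE_CASE_TO_FUNCTIONS:
        print(name, "->", equipment_for_use_case(name))
```

An equipment item that appears in the result for many use-cases is one whose failure would affect many in-orbit operations.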
Assessing Reliability and Availability
“A system is a construct of elements”
Equipment can be evaluated in terms of reliability (def: the duration or probability of failure-free performance of the equipment, under stated conditions) and its availability (def: a measure of the degree to which equipment is in an operable and committable state at the start of a mission when the mission is called for at an unknown time).
Knowing the reliability and availability of each equipment item, it is possible to combine them into a net availability for an end-to-end system, with respect to the fulfilment of a specific function or mission use-case. In this activity, AEON’s experts were not looking to perform any kind of Root Cause Analysis (RCA), or to define what the underlying failure might be; the aim was simply to determine how the failure of equipment affects the system’s wider functionality or capability to perform mission use-cases.
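One common way to combine equipment availabilities, assuming failures are independent, is to multiply them for items that must all work (a series chain) and to combine redundant items as one minus the product of their unavailabilities. The sketch below illustrates this; it is not a description of the calculation AEON used, and the steady-state formula MTBF / (MTBF + MTTR) is a standard approximation introduced here as an assumption.

```python
import math


def steady_state_availability(mtbf_hours, mttr_hours):
    """Approximate availability from mean time between failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def series_availability(availabilities):
    """All items must work: availability is the product (independence assumed)."""
    return math.prod(availabilities)


def parallel_availability(availabilities):
    """Any one item suffices: 1 minus the product of unavailabilities (independence assumed)."""
    return 1 - math.prod(1 - a for a in availabilities)


if __name__ == "__main__":
    # Hypothetical chain: workstation -> switch -> two redundant modems.
    modems = parallel_availability([0.99, 0.99])
    chain = series_availability([0.999, 0.998, modems])
    print(f"redundant modems: {modems:.6f}")
    print(f"end-to-end chain: {chain:.6f}")
```

Because a series chain can never be more available than its weakest item, this calculation tends to highlight the same equipment that the functional mapping flags as highly critical.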
The key consideration in this activity is to quantitatively determine the acceptable level of availability for each function or mission use-case. To add further complexity, specific functions or mission use-cases may have a higher or lower criticality at different phases throughout the mission. For instance, the required availability for an in-orbit docking system during deployment may be nil; conversely, the docking system’s required availability during a docking event should be as high as is practically possible. This type of evaluation results in a configurable criticality and may drive decision making should tolerable failures occur during the mission.
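A configurable criticality of this kind can be sketched as a table of required availability per mission phase, checked against the achieved figures. The phase names, the “command uplink” function, the threshold values and the helper `availability_shortfalls` are all hypothetical, chosen only to mirror the docking example above.

```python
# Hypothetical required availability per mission phase and function.
# A requirement of 0.0 means the function is not needed in that phase.
REQUIRED_AVAILABILITY = {
    "deployment": {"docking system": 0.0, "command uplink": 0.99},
    "docking": {"docking system": 0.999, "command uplink": 0.999},
}


def availability_shortfalls(phase, achieved):
    """Return {function: (achieved, required)} for functions below target in a phase."""
    shortfalls = {}
    for function, required in REQUIRED_AVAILABILITY[phase].items():
        actual = achieved.get(function, 0.0)  # unknown functions count as unavailable
        if actual < required:
            shortfalls[function] = (actual, required)
    return shortfalls


if __name__ == "__main__":
    achieved = {"docking system": 0.995, "command uplink": 0.9995}
    for phase in REQUIRED_AVAILABILITY:
        print(phase, availability_shortfalls(phase, achieved))
```

With these illustrative figures the same achieved availability is acceptable during deployment but falls short during docking, which is the phase dependence the text describes.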
Assessing Maintainability and Safety
Maintainability may be defined as the measure (or ability) of an equipment item to be retained in (or restored to) a specified condition when maintenance is performed during the course of a specified mission profile. It is a complex notion which incorporates the skills needed to perform remedial work, the availability of prescribed procedures, the availability of required resources (incl. replacement parts), and the definition of the “basic level of operation” to fulfil key functions (i.e., repairing the equipment to a level of minimum viable functionality).
To sustain the availability of the In-Orbit Servicing Control Centre, it is necessary to perform maintenance. Maintenance assures that equipment fulfils functional requirements, and therefore mission use-cases, satisfactorily; it ensures that equipment failure rates are maintained at their lowest tolerable rate (i.e., the reliability of the equipment is kept high). What constitutes maintenance depends heavily on the type of equipment and how it is deployed. For instance, software patches/updates may or may not form part of regular maintenance planning and may need to be handled in a systematic and coordinated manner following integration testing and/or simulated deployment testing.
In many cases, maintenance is not performed on equipment which is in active operation, and most commonly organisations prefer some element of planned maintenance or Planned Preventative Maintenance (PPM). It should be stated that this is only one approach, and many other operational models and techniques can be used to great success. By performing maintenance planning, an organisation may schedule appropriate windows for the system, in this case the In-Orbit Servicing Control Centre, to be unavailable; this is likely performed between missions and not during critical phases.
Safety is included in RAMS analyses although in this context it is defined as the ability for a system to not harm people, the environment, or any equipment item during a mission. If operators or end-users are inherently part of the functioning system, as is the case in the In-Orbit Servicing Control Centre, it is necessary to consider sustaining their availability, ensuring they are exposed to only tolerable levels of risk of injury. A single point of failure can easily be an individual or specific group/type of operators.
There are overlapping aspects to reliability, maintainability, and safety. The role of the system designer is to ensure these aspects are duly considered when designing a system to satisfy an availability objective.
In performing the site survey and evaluating the needs of the In-Orbit Servicing Control Centre, AEON was able to define the equipment and its functional dependencies, and to identify any obvious single points of failure. An architecture was modelled using Enterprise Architect, facilitating the output of a visual representation (schematic) of the facility. AEON’s experts also prescribed the need to establish traceability between all aspects of the RAMS programme and the mission objectives, and at a lower level, its use-cases. As the facility accommodates differing missions, with differing requirements, AEON recommended that the service provider define and adopt a standardised “Capability and Functionality Requirements Document”, in addition to an “Interface Control Document”. Their purpose is to elicit key RAMS requirements from each mission director and to allow the service provider to affirm that the ground segment complies with these needs. Where necessary, this may also identify key modifications or upgrades to support each in-orbit servicing mission.
AEON provided a RAMS programme roadmap, defining how to initialise a RAMS programme, which activities to perform, in which order, and how information from one phase to the next may impact decision making. Key activities were prescribed to assure that all works were comprehensive, accurate and conclusive in prescribing operational models of the In-Orbit Servicing Control Centre. Based on the feedback received, the service provider felt better informed on the works underpinning a typical RAMS programme and was pleased that they now had a structured approach to assess expansibility and modifications to the existing facility. Additionally, the explicit need to elicit requirements from each mission director was likely to be adopted to better assure that the facility is fit for purpose, and to use any misalignments to justify upgrades and expansibility.
As the world moves toward increasing levels of engineered complexity, there is an inherent need to assess how engineering systems will be built and how they will be sustained throughout their lifespan. RAMS is just one facet of a bigger “Systems Engineering” approach to handling complexity.