Have you read about the latest broadcast disruption? Probably not.
Equipment failures in ground stations typically don’t make the news the way HTS or smallsat constellations do, but that doesn’t mean they are any less important, or any less dramatic.
Let’s not forget that after the rockets are launched and the satellites deployed, the essential role of ground stations kicks in. Straddling RF and IP, digital skyways and terrestrial info-highways, today’s ground stations face new challenges in delivering services through hybrid, interconnected global networks. No longer confined to an RF silo, ground stations today look more like data centers, with tens or hundreds of overlapping networks and a proliferating mix of RF and IP equipment.
Yet, despite the technology convergence, ground station operations are struggling to keep pace from a process and a people standpoint. Some staff monitor RF interference, others watch network performance, and still others handle customer service issues.
When service disruptions occur, a scramble ensues to isolate the problem and unravel which customer services and SLAs are being affected. With numerous alerts, but too little visibility, the lowest priority service or SLA may be treated like the highest priority. All while the SLA clock ticks, penalties mount and customer frustration grows.
Satellite operators won’t find much sympathy from customers, whose only concern is that their broadband, video or voice service is delivered shipboard, in-flight or at a remote site with little to no downtime. That is the value proposition, after all.
Ground station operators are at an inflection point where they face a widening gap between their current concept of operations and the demands of delivering high-availability services through a complex hybrid network environment. This article looks at how ground station operations can leverage advances in automation, big data intelligence, predictive analytics, and SLA management to efficiently deliver high Quality of Service (QoS) for the near 100 percent uptime expectations in an always-on world.
The New Model
Customers no longer want a simple RF hand-off or a ‘fire up and forget’ carrier service; they are looking to buy services in different ways. That may mean a slice of a beam, a data pipe, or deeper integration into their operations. Satellite and terrestrial environments are merging, with more of an IP-type business model and IT service arrangement applied. Yet ground station tools and systems weren’t designed for those environments.
With far more things to break and fail in a mixed RF-IP environment, unraveling the cause and effect of service disruptions can be like playing chess on a three-level board. The jack-of-all-trades RF engineer, who managed hardware and could troubleshoot pure-play satellite operations, has given way to a balkanized division of labor among engineering, network, and customer service teams. Before a modem, server or antenna control unit can be repaired or replaced, the impact on the customer must be gauged. The information needed to make those associations is scattered, so teams must dig through what are often thousands of spreadsheet items and diagrams, translating and reconciling information in an iterative, time-consuming cycle.
Automation
If teleport operations are a mix of people, processes and technology, a concept of operations known as ‘service quality management’ (SQM) is a leap ahead in unifying them. In this more mature model, automation does the heavy lifting. All of the operational data from hundreds or thousands of individual IP and RF components are gathered into a virtual model, or graphic representation, of the end-to-end services delivered to customers.
SQM fuses together and standardizes all device and metric information into a unified dashboard, including data from monitor and control, network management, element management, and carrier monitoring systems.
A major reason these systems haven’t fully converged is that the types of data each one monitors are different, as are the protocols, instrumentation, and terminology. Satellite managers, for example, rely on EIRP, Eb/No and BER as key performance indicators, while terrestrial network managers rely on latency and packet loss. Yet all five factors directly affect QoS and bandwidth usage and need to be managed together. SQM unites these metrics so the data center group and the RF team can manage services that traverse both technologies. This also centralizes the management of hybrid and distributed networks in the NOC, uniting what can be siloed operations in which each teleport runs its own systems.
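As a rough illustration of what managing RF and IP metrics together can look like, the Python sketch below stores both kinds of reading in one record type and checks them against a single table of limits. The record fields, device names and threshold values are assumptions made for this example; they are not taken from any particular SQM product or link budget.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Metric:
    """One reading in a common format, whatever system produced it."""
    device: str   # hypothetical device name
    domain: str   # "rf" or "ip"
    name: str     # key into LIMITS below
    value: float
    unit: str


# Hypothetical acceptable ranges as (low, high). Real limits would come
# from link budgets, equipment specifications and the SLAs themselves.
LIMITS = {
    "eb_no_db": (6.0, float("inf")),
    "ber": (0.0, 1e-6),
    "latency_ms": (0.0, 700.0),
    "packet_loss_pct": (0.0, 1.0),
}


def out_of_range(metrics):
    """Return the readings whose value falls outside its configured range."""
    flagged = []
    for m in metrics:
        low, high = LIMITS[m.name]
        if not (low <= m.value <= high):
            flagged.append(m)
    return flagged


readings = [
    Metric("modem-1", "rf", "eb_no_db", 4.2, "dB"),
    Metric("modem-1", "rf", "ber", 1e-8, "ratio"),
    Metric("router-3", "ip", "latency_ms", 650.0, "ms"),
    Metric("router-3", "ip", "packet_loss_pct", 2.5, "%"),
]

flagged = out_of_range(readings)
for m in flagged:
    print(f"{m.device} [{m.domain}] {m.name} = {m.value} {m.unit}")
```

With the sample readings, one RF metric (Eb/No) and one IP metric (packet loss) are flagged by the same check, which is the point of putting them in a common format.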
Big Data Intelligence
With a single converged view of satellite and terrestrial networks, no longer do staff hit a dead end because their tools only monitor a segment of the network. Big data analytics correlate all key information—equipment names, locations, network links, carrier monitoring and other performance metrics—into a single management layer.
Operators can instantly see the associations between all of the equipment, services, and SLAs in a unified dashboard. Rather than reacting to a series of alerts from racks of indistinguishable equipment, staff gain visibility into each component that makes up the service chain. They can drill down into device issues for quick root cause identification, while also understanding the impact on customer SLAs, whether from a failed modem, performance decay of a high power amplifier, bandwidth saturation, or a drop in carrier power levels.
Is the modem supporting one or 15 customers? Which critical SLAs are being affected, and when do penalties or charge-backs start to kick in, and how much? Having that insight readily available in the runtime system is immensely powerful. With those correlations, staff can make smarter decisions, such as whether to default to a full switchover of traffic to a redundant path, which can be more costly and disruptive, or to isolate and bypass the culprit device, replacing it at the next authorized service window. Even for routine maintenance and repair, staff will know exactly which customer services will be affected before potentially setting off a further cascade of problems or outages.
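The question "is the modem supporting one or 15 customers?" comes down to a lookup from device to the services that depend on it. A minimal Python sketch, assuming each service is described as a chain of devices; the service and device names are hypothetical:

```python
from collections import defaultdict

# Hypothetical service chains: every device a service depends on.
SERVICE_CHAINS = {
    "maritime-broadband": ["antenna-1", "hpa-1", "modem-7", "router-2"],
    "video-backhaul": ["antenna-1", "hpa-2", "modem-7", "router-2"],
    "campus-voice": ["antenna-2", "hpa-3", "modem-9", "router-4"],
}


def build_device_index(chains):
    """Invert the service -> devices map into device -> services."""
    index = defaultdict(set)
    for service, devices in chains.items():
        for device in devices:
            index[device].add(service)
    return index


def affected_services(index, device):
    """List the services that depend on the given device, sorted by name."""
    return sorted(index.get(device, set()))


index = build_device_index(SERVICE_CHAINS)
print(affected_services(index, "modem-7"))
print(affected_services(index, "hpa-3"))
```

In this sample data a fault on modem-7 touches two services while a fault on hpa-3 touches one, so staff could weigh a full switchover against bypassing a single device before acting.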
This accelerates issue response and remediation at the very time when service quality expectations are ratcheting up. Shaped in part by consumer experience with smartphones and always-on broadband at home and work, Service Level Agreements (SLAs) that were once less formalized are becoming more codified and more stringent, reflecting the demands of the new normal: 99.999+ percent uptime.
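To put a "five nines" target in concrete terms, the short Python calculation below converts an availability percentage into the downtime it allows over a year. It assumes a 365-day year and counts every minute of outage against the budget; actual SLAs may define measurement windows and exclusions differently.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a 365-day year


def downtime_budget_minutes(availability_pct):
    """Minutes of downtime per year permitted by an availability target."""
    return MINUTES_PER_YEAR * (100 - availability_pct) / 100


for target in (99.9, 99.99, 99.999):
    minutes = downtime_budget_minutes(target)
    print(f"{target}% uptime allows about {minutes:.2f} minutes of downtime per year")
```

At 99.999 percent the budget is a little over five minutes a year, which leaves almost no room for the manual cross-referencing described earlier.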
Predictive Analytics
Better yet, rather than waiting on alarms for what’s broken, the predictive analytics in service quality management can be applied to equipment to answer the question ‘when will this device fail?’ and notify operators beforehand. These analytics can detect a range of common issues that affect satellite operations, such as increasing power levels of high power amplifiers indicating a possible rain fade event, beam voltage fluctuations highlighting an impending transmitter problem, rising temperatures on low noise amplifiers signaling near-term failure, or server memory utilization pointing to a capacity or performance issue before a crash.
Traditional management systems only provide a real-time alarm for equipment performance, not a future prediction of failure. With a service quality management approach, operators can be alerted well in advance of a device failure, so they can plan for it or have a spare in place to minimize service impacts and SLA penalties.
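One simple way to turn a rising reading into an advance warning is to fit a straight line to recent samples and estimate when it will cross an alarm threshold. The Python sketch below does this with an ordinary least-squares fit; it is an illustrative stand-in, not the method used by any specific product, and the temperature samples and threshold are invented for the example.

```python
def hours_until_threshold(samples, threshold):
    """Estimate hours from the last sample until a rising trend hits threshold.

    samples is a list of (hour, value) pairs in time order. Returns None
    when there is too little data or the fitted trend is flat or falling.
    """
    n = len(samples)
    if n < 2:
        return None
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    var_t = sum((t - mean_t) ** 2 for t, _ in samples)
    if var_t == 0:
        return None
    # Least-squares slope and intercept of value against time.
    slope = sum((t - mean_t) * (v - mean_v) for t, v in samples) / var_t
    if slope <= 0:
        return None
    intercept = mean_v - slope * mean_t
    crossing_hour = (threshold - intercept) / slope
    last_hour = samples[-1][0]
    # A result of 0.0 means the fitted line is already at or past the threshold.
    return max(0.0, crossing_hour - last_hour)


# Hypothetical LNA temperature readings in degrees C, one per hour.
lna_temps = [(0, 40.0), (1, 41.0), (2, 42.0), (3, 43.0)]
remaining = hours_until_threshold(lna_temps, threshold=50.0)
print(remaining)
```

A production system would likely need to handle noise, seasonality and non-linear degradation, but even this straight-line estimate shows how a real-time value can be turned into a lead time for staging a spare.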
SLA Management
Like a cross-section of a garden, SQM reveals grains of sand at the bottom (devices), a root system of networks above them, and the flower emerging on top as the customer service. Just as every ‘flower’ may not need the same watering or TLC, not all services and SLAs require the same immediacy of attention. With limited resources, staff can now prioritize and manage services by the most important criteria, whether that means the most stringent or punitive SLAs or the most critical services.
Picture multiple services having an outage at the same time, such as high-speed Internet service for a deep-sea oil platform, a video conferencing service for an office, and a long-distance learning application for a university. Because they can now view the compliance status and the cost of customer outage credits, operators can clearly see that the video conferencing service carries the most significant penalty for a violation, making it the highest priority issue for resolution, with the long-distance learning application and the Internet service tiered second and third.
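The triage in this scenario amounts to sorting concurrent outages by what each one costs. A minimal Python sketch follows; the hourly penalty figures are invented for illustration and chosen only so that the ordering matches the scenario above.

```python
from dataclasses import dataclass


@dataclass
class Outage:
    service: str
    penalty_per_hour: float  # hypothetical SLA credit in dollars per hour


def prioritize(outages):
    """Order outages from the highest hourly penalty to the lowest."""
    return sorted(outages, key=lambda o: o.penalty_per_hour, reverse=True)


active = [
    Outage("oil-platform-internet", 800.0),
    Outage("office-video-conferencing", 5000.0),
    Outage("university-distance-learning", 2000.0),
]

for rank, outage in enumerate(prioritize(active), start=1):
    print(rank, outage.service, outage.penalty_per_hour)
```

A real ranking would probably combine penalty cost with other criteria, such as how close each SLA is to breach or whether a service is safety-critical, but the principle of ordering by business impact rather than alarm arrival time is the same.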
Understandably, ground station operators proficient in RF may be less so when it comes to these advanced principles of service management. By adopting these concepts and capabilities, they can mature their IT operations in step with the “Gartner Maturity Model.” This model outlines how IT operations can progress from Chaotic and Reactive management (ad hoc, non-centralized alert and event monitoring) to Proactive (predict, prevent and manage availability), then to Service (SLA-service views), and ultimately to Value, which links IT processes to business impact.
At a time when disruptive tech giants and startups are entering the space industry, and satellite and terrestrial service providers are looking more alike in the race to build global end-to-end services, SQM gives operators a competitive advantage in efficiently managing high-availability service over a combination of satellite, fiber, wireless or microwave networks and infrastructure. Customers will choose ground station partners to whom they can confidently outsource more of the complexity, who can provide a single point of accountability for end-to-end service delivery, and who add continuous improvement cycles to realize the network’s full revenue potential.
Future Proofing—Scaling Up
In the face of game-changing new satellite technologies that are bringing exponentially more throughput and bandwidth, ground station operators will want to re-examine their concept of operations, adopting approaches that can scale to the volume, variety and velocity of these mixed network and device environments. By taking advantage of service quality management and its use of automation, big data visualization, predictive analytics, and SLA management, ground station operators can evolve their expertise to better deliver QoS with fewer manual processes.
Here’s An SQM Solution That Meets The Demands Of An Always On World
As customer expectations continue to rise along with the expansion of hybrid networks, ground stations are looking for technologies that help them keep pace with the growth. NeuralStar SQM (www.KratosNetworks.com/SQM) is the first end-to-end enterprise software product for satellite and terrestrial service management.
With NeuralStar SQM, operators are able to manage customer services, SLAs and the supporting devices across the globe and view the status of every service using real-time intelligence and analytics. Service providers are able to quickly assess which customers are affected by a degradation or outage of even a single device anywhere in the network. NeuralStar SQM enables operators to advance their ground station operations to improve quality of service and customer satisfaction, maintain and grow revenue, and optimize satellite and terrestrial operations.
Phill Howard has more than 21 years of experience in satellite and infrastructure management for military and commercial systems. He has diverse hands-on engineering experience as a subject matter expert spanning RF/satellite communications, spectrum management, software development, management and control systems, and large, complex hybrid enterprise systems. Over his career, Phill has helped provide solutions for some of the industry’s largest and most complex systems.