"Big data" may be the most overused buzz-phrase in the smart grid industry today. But that doesn’t mean that managing, analyzing and deriving operational and customer value from the massive amounts of data being collected by utilities isn’t important and valuable -- or that utilities aren't spending on it. GTM Research has pegged the value of the global utility data analytics market at a cumulative $20 billion between 2013 and 2020, growing from an annual spend of $1.1 billion this year to nearly $4 billion by decade’s end.

But there’s a long way to go from where we are today to actually delivering on the vision of a data-enabled utility enterprise. On the technical level, scattered and incompatible data sources must be integrated to avoid the “garbage in, garbage out” problem. On the organizational level, siloed utility operations must share data and analytics resources, to avoid wasting time and money on duplicative and isolated efforts.

And on the economic and regulatory level, data analytics must prove its worth as a business proposition. That means proving to utility regulators that the costs of implementing data analytics, which are passed through to customers as rate increases, are worth the sometimes hard-to-quantify benefits.

Over the past several years, we’ve seen startups like AutoGrid, C3 Energy, Trove, Opower and Verdeeco, IT giants like Oracle, IBM, SAS, Teradata, EMC and SAP, and grid giants including General Electric, Siemens/eMeter, ABB/Ventyx, Schneider Electric/Telvent, Toshiba/Landis+Gyr and more, leap into the big data fray. Perhaps 2014 will be the year for these efforts to begin to shake themselves out, into those that can prove their value for utilities, regulators, customers and markets -- and those that can’t make the cut.

AMI Data Integration and Analysis: Where We Are Today

It’s natural for utilities to start their big data efforts in business operations where the big data is: smart metering, or advanced metering infrastructure (AMI). Over the past few years, we’ve seen AMI-centered projects emerge from pilot deployments to full-scale commercial offerings, whether in the form of embedded IT systems at large investor-owned utilities, or as cloud-based software-as-a-service (SaaS) models.

But the move from one-off, proof-of-concept deployments to broader adoption of analytics is still a work in progress for the utility industry. A host of reports out this year indicate that many utilities haven’t even captured enough of AMI data’s value to make sure their core meter-to-cash and communications systems are working optimally, let alone begun adding more advanced functions to the mix.

At the same time, AMI deployments, while fairly data-heavy on their own, don't compare to the flood of data to come from the other systems on the utility’s analytics roadmaps, as this chart from GTM Research points out:

Taking a project-by-project approach is certainly better than trying to talk utilities into making massive up-front investments in analytics for as-yet unproven business value. “I don’t think you can say that there’s going to be one big enterprise platform that’s going to solve every utility operational requirement,” Larsh Johnson, CTO of eMeter, said in a December interview.

“At the same time, I don't think you can say that siloed operations will be right either,” he said. Instead, “There will be multiple forms of analytic platforms that are being architected to solve multiple use cases.”

This indicates that AMI-based data analytics systems are just starting to integrate into other parts of the utility enterprise. In general terms, we can break down these efforts into three categories: customer systems, grid operations systems, and enterprise systems. Each has its own opportunities, its own target markets, and its own challenges -- and, in keeping with the fuzzy, silo-crossing nature of the big data challenge, each category overlaps with its counterparts.

Customer-Facing Analytics, From Theft Detection to Grid Edge Management

Let’s start with the customer realm. That can include everything from revenue protection and theft detection to help utilities manage their more troublesome customer accounts, to demand response, variable pricing and consumer analytics that help enlist customers as partners in the grid management challenge.

Revenue protection, also known as “non-technical loss” detection (or, in more straightforward language, theft detection), is one category seeing lots of early-stage work. Stolen or misappropriated power can cost utilities hundreds of millions of dollars a year. AMI data can help track this unbilled power, whether it stems from illicit activity like indoor marijuana grow operations or from simple mistakes in managing customer shut-offs and turn-ons. Analytics offerings from the likes of startup C3, General Electric’s GridIQ Insight platform and Siemens’ eMeter analytics system are taking on this task.
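One simple way to picture the AMI-based approach is an energy balance: compare the energy delivered through a distribution transformer with the sum of what the meters behind it recorded, and flag large gaps for review. The sketch below is a hypothetical illustration in Python, not a description of how any of the vendors named above do it. The TransformerInterval structure, the assumption that transformer-level measurements are available, and the 5 percent allowance for ordinary technical losses are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TransformerInterval:
    """One billing interval for one transformer (hypothetical structure)."""
    transformer_id: str
    delivered_kwh: float      # energy measured at the transformer (assumed available)
    metered_kwh: List[float]  # AMI readings from the downstream meters


def unaccounted_fraction(interval: TransformerInterval) -> float:
    """Share of delivered energy that no downstream meter recorded."""
    if interval.delivered_kwh <= 0:
        return 0.0
    billed = sum(interval.metered_kwh)
    return (interval.delivered_kwh - billed) / interval.delivered_kwh


def flag_for_review(intervals, technical_loss_allowance=0.05):
    """Return IDs of transformers whose unbilled share exceeds the allowance."""
    return [
        i.transformer_id
        for i in intervals
        if unaccounted_fraction(i) > technical_loss_allowance
    ]
```

A flag here only means the numbers don't add up. The cause could be theft, a meter mapped to the wrong transformer, or a shut-off that was never recorded, which is why the output is a review list and not an accusation.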

Another core task for customer-facing analytics is demand response -- getting customers to cut energy consumption to reduce system peak loads. That can be applied to integrating multiple utility programs with an array of in-home devices, as startup AutoGrid is doing for utility partner Austin Energy and vendor partner Schneider Electric. It can also focus on customer behavior, or the combination of messaging and incentives that are best at getting customers to turn down energy at the lowest possible cost to utilities, as startup Opower is doing with a variety of partners.

Knowing how customers respond to variable pricing programs, like time-of-use rates or peak-time rebates, is also a field where analytics can play an important role. Oklahoma Gas & Electric, which has won awards for its efforts to integrate smart meter data and customer insight into long-range ratemaking and grid planning, is a good case study on this front.

In the long run, a better understanding of customer behavior could prove invaluable to utilities in a range of fields. But there are two key problems here: the sheer randomness of how millions of customers will choose to act -- and the relative lack of effective methods via which utilities can influence that decision-making process. This could drive investment in traditional market intelligence analytics, whether it's incorporating social media platforms like Facebook and Twitter in pinpointing outages or improving service, or testing the popularity of new utility offerings for the home.

On these fronts, competitive energy markets offer a huge early-stage opportunity. That’s because retail energy providers can use analytics to reduce customer churn, devise marketing campaigns to acquire new customers, and determine how their customer offerings balance out against the costs for buying power on wholesale markets in the long term. Of course, these retail energy providers are a lot less likely to share data analytics best practices with one another -- or admit where their efforts have come up short.

The same goes for the long list of home energy management offerings from telecommunications, home security and home improvement providers. Sure, they have lots of data, but sharing it with utilities will require balancing both parties’ economic interests, while also maintaining customer data privacy and security. While the regulatory framework for utilities and their customers is expected to undergo some significant changes on this front (indeed, in early-adopter markets like California, it's already happening), we may have to wait awhile for the full range of customer-facing data analytics to emerge.

From Tactical to Strategic: The Challenge of Outage Detection, Asset Management, Workforce Deployment and Big Data Integration

In the meantime, utilities have plenty of their own data to collect, clean up and make use of on the grid operations front. Today's analytics efforts are largely on the “tactical” level -- deployments aimed at proving that the data integration and analysis behind them works. To reach their greatest value, however, they'll need to move toward more “strategic” efforts that make use of shared IT infrastructures and analytics capabilities for multiple use cases and benefits.

Outage management system (OMS) improvements are only the first and most obvious way to use AMI data for operational improvements. Lots of utilities are tapping the “last-gasp” outage detection abilities of smart meters, for example, to augment the old-fashioned method of waiting for customers to call in and report that their power is out. We’ve already seen reports of successful implementation of these AMI-assisted outage restorations in the wake of Hurricane Sandy and other weather events. After several years of effort, these systems are also becoming reliable enough to give line crews the ability to fix as-yet-unreported outages, such as those that happen when people are asleep or away from home.
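To make the “last-gasp” idea concrete, here is a minimal sketch in Python of how a batch of last-gasp messages might be turned into a list of likely outage locations. It assumes the utility already has a map of which meters sit behind which transformer. The function name, the five-minute window and the 50 percent threshold are hypothetical choices for illustration, not details of any utility's OMS.

```python
def likely_outages(last_gasps, meters_by_transformer, now,
                   window_seconds=300, min_share=0.5):
    """Return transformer IDs where enough meters recently sent a last gasp.

    last_gasps: iterable of (meter_id, timestamp_seconds) tuples
    meters_by_transformer: dict mapping transformer ID to a set of meter IDs
    now: current time in the same units as the timestamps
    """
    # Keep only messages that arrived inside the look-back window.
    recent = {
        meter_id
        for meter_id, ts in last_gasps
        if 0 <= now - ts <= window_seconds
    }
    flagged = []
    for transformer_id, meters in meters_by_transformer.items():
        if not meters:
            continue
        share = len(meters & recent) / len(meters)
        if share >= min_share:
            flagged.append(transformer_id)
    return sorted(flagged)
```

Grouping by transformer is what lets a crew be sent to a piece of equipment instead of a single address, including at night when nobody is awake to call.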

The next step is to start diagnosing the health of grid systems themselves. Oracle, SAP, Silver Spring Networks, Siemens/eMeter, ABB/Ventyx and many other vendors have launched transformer health monitoring applications, for example, which combine AMI and other grid sensor data with weather and temperature data, asset management system data and other information to determine which transformers may be close to breaking down, or are being put under stress that requires an upgrade.
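As a rough illustration of what combining AMI data with temperature data can look like, the Python sketch below adds up the demand of the meters behind a transformer, compares it with the nameplate rating, and tightens the limit on hot days. It is a simplified sketch under stated assumptions, not how any of the vendors above model transformer health. The 0.95 power factor, the 35-degree threshold and the loading limits are placeholder values.

```python
def transformer_loading(meter_kw_by_hour, rated_kva, power_factor=0.95):
    """Per-hour loading ratio: summed meter demand over nameplate rating.

    meter_kw_by_hour: list of per-meter hourly kW series of equal length
    """
    hours = zip(*meter_kw_by_hour)  # transpose to one tuple of readings per hour
    return [sum(kw) / power_factor / rated_kva for kw in hours]


def overloaded_hours(loading, ambient_c, hot_threshold_c=35.0,
                     hot_limit=0.9, normal_limit=1.0):
    """Indices of hours where loading exceeds a limit that tightens in heat."""
    flagged = []
    for hour, (ratio, temp) in enumerate(zip(loading, ambient_c)):
        limit = hot_limit if temp >= hot_threshold_c else normal_limit
        if ratio > limit:
            flagged.append(hour)
    return flagged
```

A transformer that shows up in this list hour after hour is a candidate for closer inspection or an upgrade. A real application would also draw on the asset management records the article mentions, such as age and maintenance history.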

Of course, managing grid assets in a holistic manner requires a lot more data integration work. A good example is AEP’s Asset Health Center, a long-term project that’s tying location-specific data, system-wide modeling and long-term asset planning into a platform for strategic asset management analytics.

As AEP noted in an October presentation (PDF), automating the aggregation of all the data needed for this process has been a daunting effort. It starts with manually entering data from handwritten notes and inspection log books dating back to the time of equipment installation, then moves to merging data from the differently formatted spreadsheets and databases used to track it over the decades. It then moves to reconfiguring utility SCADA system data to match it to individual assets, and updating workforce data processes to make sure day-to-day field operations don’t get lost in the shuffle.

The core challenge is to avoid gaps in data that will rob the system of the specificity required to make reliable projections of asset health. This costs a lot of money and takes a lot of time, of course, and utilities are hoping that vanguard efforts like AEP’s may help prove the value of this upfront investment, so that regulators will have models for allowing more projects like this to move forward.

The end result, however, could be operational efficiencies and technology-enabled capabilities that start to match the futuristic sales pitches we often hear. Imagine, for example, wirelessly connected tablets that can take pictures of equipment in the field and return a list of that asset’s specifications, overlay 3-D images of how downed power poles should look once they’re rebuilt, and automate the process of ordering new equipment.

That’s a vision that Matt Wakefield, director of information and communication technology for EPRI, described in a November presentation. The trick is to integrate data from systems “that have never been integrated before.” But if it can be done, it could help speed up storm restoration efforts, enable utilities and other emergency responders to coordinate efforts, and otherwise prepare for threats to grid reliability and resiliency.

Moving From Past to Present: Real-Time Grid Awareness

Moving from tactical to strategic data analytics is one challenge for the industry. Another is moving from systems that can only operate in a backward-looking, historical fashion to those that can offer something close to real-time insight into the grid. The catchall term for this is “situational awareness,” and it brings the critical dimension of time into an already complicated picture.

Transmission grids are a good place to look for cutting-edge developments on this front, such as synchrophasor deployments meant to help system operators detect and correct problems like those that led to the massive 2003 blackout across much of the Northeast U.S. and Canada. But the rise of solar PV, distributed energy, plug-in electric vehicles and other disruptive influences on the edges of the grid is bringing some of the same pressures to distribution systems.

Managing these disruptions will require a whole new way of looking at data, starting with ways to smooth out the expensive and onerous work needed to create accurate representations of grids as they exist today. For example, Consolidated Edison’s distribution SCADA project will integrate 1.5 million data collection points into its system over a two-year period.

As Erich Gunther, chairman and CTO of EnerNex, noted at Greentech Media’s Soft Grid 2013 conference in October, today’s grid data is “distributed over many different systems; there are no processes within an organization to keep them synchronized or up to date…and we don’t even have any standards in place in the industry” for tracking them. Soorya Kuloor, CTO of GRIDiant, noted, “If there was a standard format for that, it would speed up the deployment time significantly.”

At the same time, accurate data models of the grid are just the first step, according to Doug Dorr, senior project manager for EPRI’s data analytics initiatives. “It’s not just about the voltage and current sensors,” he said. “It’s about using satellite images to look at vegetation growth, to know where the next storm is likely to bring down lines,” or analyzing weather data to predict where clouds and fog will have an impact on solar generation from rooftop PV and utility-scale solar.

Finally, any system that seeks to turn real-time awareness into actionable information must present it in a format that utility operators can react to, he noted. “The guy in the control room who can see there’s trouble on this circuit because all this distributed generation came on-line -- I don’t want him to subjectively act; I want something that gives us a red, or a green, or a yellow” signal to respond to, he said.
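The red, yellow or green signal Dorr describes can be pictured as a thin layer that reduces many sensor readings to one status per circuit. The Python sketch below is a hypothetical example of that reduction, not an EPRI design. The 120-volt nominal value and the 5 percent and 8 percent deviation bands are illustrative placeholders, not any utility's operating limits.

```python
SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def voltage_status(volts, nominal=120.0, warn_band=0.05, alarm_band=0.08):
    """Classify one voltage reading by its fractional deviation from nominal."""
    deviation = abs(volts - nominal) / nominal
    if deviation > alarm_band:
        return "red"
    if deviation > warn_band:
        return "yellow"
    return "green"


def circuit_status(readings):
    """Report the worst status seen on a circuit, or "unknown" with no data."""
    statuses = [voltage_status(v) for v in readings]
    if not statuses:
        return "unknown"
    return max(statuses, key=SEVERITY.get)
```

Reporting the worst reading, and not the average, is a deliberate choice in this sketch: one customer outside the acceptable range is a problem even if the rest of the circuit looks fine.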

This boiling down of massive amounts of data into information that can be used in a real-world, real-time context is a critical aspect of the renewable energy planning systems being developed by grid giants, as well as IT vendors like IBM. But it’s also an important part of proving the real-world value of simpler distribution grid deployments by collecting the data required to show whether they’re effective or not -- and where future deployments will, or won’t, be beneficial.

Take conservation voltage reduction (CVR), a term that covers a whole range of technologies that lower voltages on distribution circuits to save energy, while keeping every end customer supplied with adequate power. “If you’re going to implement that system-wide, you better make sure you have at least the voltage sensors or smart metering in place so you know what’s happening across the whole circuit, not just at the substation,” Dorr noted.

“Then you can come up with whether that saved you 2 percent of the typical energy that circuit used, or 3 percent. [...] And then you can talk about how many generating plants you took offline,” he said. In other words, “It took a lot of the analytics to cost-justify it for the entire system.”
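The arithmetic behind a claim like “2 percent” is straightforward once the measurements exist. The Python sketch below computes the percentage of energy saved and the percentage of voltage reduction for a circuit, along with their ratio, a figure commonly called the CVR factor. It assumes the baseline and CVR periods are already comparable, for example after adjusting for weather and load, which is the hard part in practice. The function names and sample numbers are hypothetical.

```python
def percent_reduction(baseline, observed):
    """Percentage by which observed falls below baseline."""
    return 100.0 * (baseline - observed) / baseline


def cvr_summary(baseline_kwh, cvr_kwh, baseline_volts, cvr_volts):
    """Summarize CVR results for one circuit over comparable periods."""
    energy_saved_pct = percent_reduction(baseline_kwh, cvr_kwh)
    voltage_reduced_pct = percent_reduction(baseline_volts, cvr_volts)
    # CVR factor: percent energy saved per percent of voltage reduction.
    cvr_factor = (
        energy_saved_pct / voltage_reduced_pct if voltage_reduced_pct else None
    )
    return {
        "energy_saved_pct": energy_saved_pct,
        "voltage_reduced_pct": voltage_reduced_pct,
        "cvr_factor": cvr_factor,
    }


# Hypothetical circuit: 1,000 kWh at 123 V before, 980 kWh at 120 V with CVR.
summary = cvr_summary(1000.0, 980.0, 123.0, 120.0)
```

Here the voltage measurements would come from the sensors or smart meters spread across the circuit, not only from the substation, which is the point Dorr makes above.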

From Representing the Present to Capturing the Future: Predictive Analytics

These same imperatives will come into play as utilities face the challenge of managing their long-term plans for handling all the disruptions happening at the grid edge. This, of course, requires taking another step forward in the dimension of time, from accurately representing today’s grid to creating a prediction of the grid of the future.

There’s an enormous amount of work to do in preparation for this vision. First of all, the task of predicting many different possible futures implies an exponential increase in the amount of data that must be incorporated into the process, compared to creating an accurate view of past events or even current conditions.

Second, some of the key business objectives that predictive analytics can help solve involve variables well beyond a utility’s control, whether those are the rates of customer adoption of rooftop PV or plug-in electric vehicles, or regulatory and economic developments that can wreak havoc with long-range financial plans.

But predictive analytics also promises the greatest bang for the buck of all possible applications of big data technology. Think of systems that can accurately predict where storms are expected to do their worst damage to the grid and allow utilities to stage crews and equipment accordingly. Or consider the opportunity to predict the spread of distributed generation, demand response participation and other customer-side changes to meet regulator requirements to build these expectations into future distribution grid plans, as California’s utilities are being asked to do in the next few years.

Finally, predictive analytics will play a key role in creating financial models that will allow utility regulators to consider the costs and benefits of future deployments. As Dorr noted, much of EPRI’s data analytics work is “centered on doing the analysis on the demos that we feel are far enough along that we can extrapolate to a full-system deployment, and actually get that value, whether it’s improved reliability, or a monetary value. It’s important to do that, because you’re not going to get that back from the regulators to recoup investment unless you’re pretty clear on what that value is.”