It is no secret that the explosion of data across the enterprise is affecting every industry. A Boeing jet generates 10 terabytes every 30 minutes; Wal-Mart handles about one million customer transactions per hour, feeding databases estimated at 2.5 petabytes; by October 2011, Twitter reached the 250-million-tweet-per-day milestone. Utilities are no exception to this trend: smart meters bring a 3,000X increase in meter reads; Phasor Measurement Units (PMUs) bring a 100X increase in grid data; and the share of electric power flowing through power electronics is expected to rise from 30 percent to 80 percent by 2030.

The exponential increase in data volume, the velocity of data creation, ingest and processing, and the variety of data sources together make up the three Vs of Big Data. Harnessing big data is critical to the success of many initiatives, including smart metering, distribution system optimization, load forecasting, profiling and segmentation, predictive outage analytics, and condition-based asset maintenance.

Of course, all this data is meaningless if it can’t be efficiently stored, analyzed and understood. It is often said that utilities are data-rich yet information-poor. Utilities are desperate for more actionable insight from their data: the ability to turn it into something of real value. The stakes are high: outages cost American utilities more than $100 billion per year on average.

Non-technical losses are also becoming a much bigger problem around the world:

- “In some parts of Brazil, as much as 20 percent of electricity output is pinched.”

- “Theft can be the single biggest problem that utilities in developing nations face. India’s government named the ‘reduction of aggregate technical and commercial losses’ -- i.e., better efficiency and less energy theft -- as priorities for the projects it wants to fund.”

- “In some pockets of South Asia, Sub-Saharan Africa, and the former Soviet Union, losses reach 50 percent.”

- “The Northeast Group calculates that in some emerging market countries, the benefit of theft reduction alone can pay back the cost of AMI deployments in less than a decade.”

- In the United States, utilities lose 1 percent to 3 percent of revenue -- or about $6 billion industry-wide -- each year, according to Electric Light & Power magazine.

Unfortunately, traditional data storage and analytics technologies are not able to meet the challenges presented by Big Data. They were never designed to handle such high volumes of real-time data nor to deal with the diversity of format, structure and frequency that the different systems generate. As a result, new approaches are required to transform Big Data into actionable intelligence.

Finding the Needles in the Haystack

New software from EMC and Space-Time Insight can be applied to the problem, providing real-time analytics for even the largest smart grid deployments. The solution, part of a new category of software called situational intelligence, unifies and correlates data from disparate systems and presents it to users in a combination of intuitive geospatial and traditional analytics interfaces.

One of the core components of this solution is a breakthrough technology that focuses on identifying outliers in a massive pool of data. This technology, called Multi-Factor Anomaly Detection (MFAD), uses probabilistic models to highlight assets or situations of interest based on specific criteria. For example, out of millions of smart meters deployed, only a small number may be subject to theft or tampering. MFAD offers users at-a-glance identification of the suspect meters based on real-time data and gives them the flexibility to narrow or expand the scope of the search on the fly, producing the five (or the 100) most likely theft or tampering cases.
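To make the idea of ranking outliers across several factors concrete, here is a minimal sketch in Python. It is not the MFAD algorithm itself, whose internals are not described here. It assumes each factor is roughly Gaussian and independent, scores each meter by its summed squared z-scores against a population baseline, and returns the top k. The factor names (`usage_drop`, `tamper_events`), the function names and the sample values are all hypothetical.

```python
import heapq
from statistics import fmean, pstdev

def fit_baseline(readings, factors):
    """Per-factor mean and standard deviation over the whole meter population."""
    baseline = {}
    for f in factors:
        values = [r[f] for r in readings.values()]
        # Fall back to 1.0 if a factor never varies, to avoid dividing by zero.
        baseline[f] = (fmean(values), pstdev(values) or 1.0)
    return baseline

def anomaly_score(reading, baseline):
    """Sum of squared z-scores: higher means less likely under the baseline."""
    return sum(((reading[f] - mu) / sigma) ** 2
               for f, (mu, sigma) in baseline.items())

def top_suspects(readings, factors, k=5):
    """Return the k meter IDs with the highest anomaly scores, worst first."""
    baseline = fit_baseline(readings, factors)
    return heapq.nlargest(
        k, readings, key=lambda m: anomaly_score(readings[m], baseline))

# Hypothetical sample: fractional usage drop and tamper-event count per meter.
readings = {
    "m1": {"usage_drop": 0.02, "tamper_events": 0},
    "m2": {"usage_drop": 0.05, "tamper_events": 0},
    "m3": {"usage_drop": 0.03, "tamper_events": 1},
    "m4": {"usage_drop": 0.70, "tamper_events": 6},
    "m5": {"usage_drop": 0.04, "tamper_events": 0},
}
suspects = top_suspects(readings, ["usage_drop", "tamper_events"], k=2)
```

Changing `k` is the code-level analogue of widening or narrowing the search from the five to the 100 most likely cases.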

The baseline model used to generate the analysis might span hundreds of millions of events and petabytes of data. A massively parallel system like EMC Greenplum provides the performance required to deliver these results in seconds. Importantly, as users navigate around different parts of a geography, the baseline model can be redefined on the fly to provide accurate projections for the area the user is exploring. In fact, as benchmarked by Space-Time Insight and EMC, a baseline model containing a half-billion meter events can be created in two seconds.
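The idea of redefining the baseline as the user moves around the map can be sketched as follows. This is a toy illustration in plain Python, not how Greenplum or the Space-Time Insight software is implemented; at production scale the same aggregate would presumably be pushed down to the parallel database. The event fields, coordinates and function names are hypothetical.

```python
from statistics import fmean, pstdev

def in_view(event, view):
    """True if the event lies inside the (south, west, north, east) box."""
    south, west, north, east = view
    return south <= event["lat"] <= north and west <= event["lon"] <= east

def baseline_for_view(events, view):
    """Recompute the mean and standard deviation of daily kWh
    using only the events visible in the current map view."""
    values = [e["kwh"] for e in events if in_view(e, view)]
    if not values:
        return None  # nothing visible, so no baseline can be built
    return fmean(values), pstdev(values)

# Hypothetical meter events: two neighborhoods with very different usage.
events = [
    {"meter": "a", "lat": 37.70, "lon": -121.90, "kwh": 30.0},
    {"meter": "b", "lat": 37.71, "lon": -121.91, "kwh": 32.0},
    {"meter": "c", "lat": 37.72, "lon": -121.92, "kwh": 28.0},
    {"meter": "d", "lat": 38.50, "lon": -121.50, "kwh": 90.0},
    {"meter": "e", "lat": 38.51, "lon": -121.51, "kwh": 110.0},
]
whole_map = (37.0, -123.0, 39.0, -121.0)
zoomed_in = (37.6, -122.0, 37.8, -121.8)

wide_mean, wide_std = baseline_for_view(events, whole_map)
local_mean, local_std = baseline_for_view(events, zoomed_in)
```

The same meter can look unremarkable against the wide baseline and anomalous against the local one, which is why the baseline has to follow the area being explored.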

Gaining Deep Insight Into Your Data

With real-time information available from the Multi-Factor Anomaly Detection technology, users have the freedom and flexibility to gain deep insights into data from any collection of systems. Correlating data from multiple systems and finding recurring patterns in that data suddenly become possible, revealing information that was previously unobtainable. Obtaining a list of meters that have not been read over a period of time is relatively easy. But applying to that list the many factors that could contribute to the lack of meter response is a much harder problem, one that spans heterogeneous and disparate networks, hardware and software systems, and applications across the utility’s grid, control center and back office. These factors could include equipment failure, network problems, broader power outages, theft, and tampering.
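As a rough illustration of attaching candidate causes to a list of unread meters, the sketch below cross-references each meter against records that would normally live in separate systems. The data sources (outage feeders, offline collectors, tamper flags), the field names and the fallback label are assumptions made for the example, not a description of any vendor's data model.

```python
def likely_causes(meter, outage_feeders, down_collectors, tamper_flags):
    """List the candidate explanations for one unread meter.

    outage_feeders:  feeder IDs currently affected by a power outage
    down_collectors: network collector IDs reported offline
    tamper_flags:    meter IDs with recent tamper events
    """
    causes = []
    if meter["feeder"] in outage_feeders:
        causes.append("power outage")
    if meter["collector"] in down_collectors:
        causes.append("network problem")
    if meter["id"] in tamper_flags:
        causes.append("possible theft or tampering")
    # Nothing else explains the silence: treat it as a candidate for inspection.
    return causes or ["unexplained: inspect equipment"]

# Hypothetical unread-meter list and records from three other systems.
unread = [
    {"id": "m1", "feeder": "F1", "collector": "C1"},
    {"id": "m2", "feeder": "F2", "collector": "C9"},
    {"id": "m3", "feeder": "F2", "collector": "C1"},
    {"id": "m4", "feeder": "F3", "collector": "C1"},
]
report = {m["id"]: likely_causes(m, {"F1"}, {"C9"}, {"m3"}) for m in unread}
```

The hard part in practice is not this lookup but getting outage, network and tamper data from different systems into one place with consistent identifiers.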

Space-Time Insight’s situational intelligence software helps users quickly sort through the clutter of numerous read failures and understand the broader context for the lack of meter response. For example, if presented with a list of meters unread for the last day, a user can easily visualize where those meters are physically located, from a bird's-eye perspective all the way down to street level. By applying various filters and overlaying data from other sources and systems (such as outage information, service disconnections due to lack of payment, and maintenance work), the user can then identify the root cause of problems in hand-picked geographic areas or for specific meters. As a result, questions such as “Which meters within a five-mile radius of the outage caused by the substation fire on Oak Street are suspected of theft or tampering based on events recorded over the last three weeks in that neighborhood and by comparison with other neighborhoods within a 50-mile range?” become easier to answer.
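The geographic part of a question like that one reduces to a radius filter. The sketch below uses the standard haversine great-circle formula to keep only flagged meters within a given distance of a point. It assumes a `suspect` flag has already been set by some upstream analysis; the coordinates, field names and function names are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(a))

def suspects_near(meters, center, radius_miles):
    """IDs of meters flagged as suspect that lie within radius_miles of center."""
    lat0, lon0 = center
    return [
        m["id"] for m in meters
        if m["suspect"]
        and haversine_miles(lat0, lon0, m["lat"], m["lon"]) <= radius_miles
    ]

# Hypothetical outage location and meter records.
outage_site = (37.70, -121.90)
meters = [
    {"id": "m1", "lat": 37.72, "lon": -121.91, "suspect": True},
    {"id": "m2", "lat": 37.71, "lon": -121.90, "suspect": False},
    {"id": "m3", "lat": 38.50, "lon": -121.50, "suspect": True},
]
nearby = suspects_near(meters, outage_site, radius_miles=5)
```

Widening `radius_miles` to 50 would give the comparison set of neighborhoods mentioned in the question.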

***

Dan Pearl is Technical Architect, Utility Industry at EMC Corp., and Steve Ehrlich is Vice President of Marketing at Space-Time Insight.