From Graveyards to Goldmines: Leveraging the Compliance Data Challenge

Over the past 30 years, the amount of data wastewater treatment plants typically generate has increased exponentially. Systems for storing and organizing that data have struggled to keep up. 

In many cases, water utilities have come to see the sheer amount of data on file as a liability rather than an asset: something that needs to be constantly monitored, corralled, and inevitably pushed to the side.

However, as some researchers have noted, that massive pile of data overwhelming your water utility has the potential to become a goldmine. 

When data is organized and shared effectively, it becomes a powerful tool for upgrading the efficiency and performance of your wastewater treatment system, anticipating disturbances before they happen and adapting to stricter standards of compliance.

The Explosive Growth of Water Treatment Data

One study showed that a single large wastewater treatment plant, serving 800,000 to 3 million people, can generate up to 30,000 data points. These include everything from sampling data essential for reporting compliance and meeting environmental regulations, to GPS coordinates, call logs, field notes, and more.

Such a large volume of data has an impact on individual personnel. For example: a single employee at a large water utility is often responsible for overseeing more than 40,000 backflow prevention devices, each of which generates annual inspection data. The system used to organize such data has an enormous impact on that employee’s day-to-day job, affecting their ability to share information, file reports, and ensure compliance.

On its own, the overarching project of compliance—particularly tracking permits—represents an enormous task. One water utility uses Klir to manage over 3,000 permits, a task that would be daunting without Klir’s fully configurable data management systems.

As Lluís Corominas, a researcher at the Catalan Institute for Water Research, writes:

“Plant operators have an overwhelming stream of data at their hands, which is very difficult to process and analyze in a timely enough fashion to allow for better understanding or proper decision-making.”

The earliest tremors of this explosion of data generation can be traced back to the 1970s, when one of the hottest topics at international wastewater treatment conferences was data collection from sensors.

The sensors being used were adapted from other industries and ill-fitted for use in wastewater treatment systems, but attendees were already discussing the best ways to automate the collection and management of data in their plants.

The same report lists four primary reasons why managing water treatment data (referred to as information, control, and automation [ICA]) has since become such an enormous task:

  • Effluent quality standards, which became more demanding and complex
  • Economic factors, which encouraged water utilities to develop automated, money-saving compliance management tools that generated more data than prior solutions
  • Plant complexity, one of the most important driving factors, which increased as methods of water treatment advanced
  • Improved tools, such as advanced remote sensors, which generated more data for water utilities to manage

With such a large amount of information to deal with, one of the most important tools at a water utility’s disposal is data centralization.

The Importance of Data Centralization

Utilities are increasingly data-rich but information-poor. As Corominas notes, a large number of utilities have become host to “data graveyards,” massive stores of data that cannot be easily navigated or accessed. 

The data graveyard is a sort of invisible weight burdening a water utility, demanding resources to be maintained, causing a constant drain on time and money, but rarely producing outright catastrophic effects. 

Individuals may be forced to enter the graveyard on a regular basis, in order to dredge up information for the sake of renewing permits, for instance, or to confirm the status of different backflow devices. But each of these is simply a slow, laborious task–one that creates drag on standard processes without ever pushing them to their breaking point.

The cumulative effect of the data graveyard may be huge, but it’s difficult to see. That’s especially the case when pieces of it are owned by different individuals and teams, or scattered across multiple disconnected databases. 

If the cumulative effect of a data graveyard is difficult to grasp, its potential for good may be even more elusive. Your water utility could have a huge amount of data on hand that might be leveraged to speed up and improve processes, anticipate problems, and plan for the future. But so long as it’s a fragmentary mess and a headache to access, its potential is impossible to realize.

The first step in converting your data graveyard to a goldmine is centralizing it. Bringing all your data together in one place, under one administrative dashboard, lets you assess its potential.

The best tool for the job is a comprehensive software as a service (SaaS) solution. Learn more about why SaaS makes sense for water. 

Once your data is centralized and easier to navigate, it’s ready to be mined.

Gold Mining for Data

To push the metaphor to the breaking point, once you’ve converted your data graveyard into a goldmine, it’s time to start mining for gold.

“Mining for gold,” in this sense, means converting raw data into information—becoming both data-rich and information-rich. The biggest opportunities for leveraging data into information fall under three categories: machine learning, improvement of remote and real-time monitoring, and increased collaboration.

The Increasing Promise of Machine Learning

Increasingly, machine learning shows potential to have a huge impact on how water utilities leverage their data to improve operations.

Machine learning is, in brief, the process of using computers to analyze large amounts of data, discover patterns, and use those patterns to make predictions, solve problems, and answer questions. 

Already, machine learning has been applied to water utility data in order to track the spread of COVID-19, reduce energy usage, and detect compliance violations.

Machine Learning and Wastewater

By testing wastewater samples, infectious disease experts are already able to predict upsurges in COVID-19 infections three to seven days before standard swab testing does the same. 

That makes wastewater a window into COVID infection rates among particular populations—provided you have the tools to examine the data accurately.

While current systems for monitoring COVID via wastewater suffer some gaps in information—partly due to reduced detectability in people who have been vaccinated—machine learning has shown promise when it comes to predicting upsurges and tracking COVID’s spread.

What’s more, similar techniques can be used to track other viruses, such as norovirus and polio. You can learn more from our article on wastewater-based epidemiology.

Improving Remote and Real-Time Monitoring Capabilities with Water Data

COVID-19 lockdowns around the world fast-forwarded a general trend, across many industries, towards remote-first work policies. The lockdowns also drove home just how important it is for organizations to be able to access and manage their data remotely.

In this sense, water utilities were ahead of the curve: Many utilities already remotely manage thousands of infrastructure assets using sensors, controllers, and transmitters.

That remote capability is wasted, however, if data is fragmentary: stored natively on a variety of different media (hard drives, thumb drives, backup devices, etc.) and accessible only by particular teams or individuals.

Even utilities that stored data in a centralized fashion on their own local servers faced problems when moving to remote working arrangements, as personnel encountered technical barriers to accessing the organization’s intranet from offsite computers.

A cloud-based SaaS platform is the best solution for utilities that want to make their data available to all relevant personnel, regardless of their locations, at all times.

With the help of such a system, a water utility can:

  • Cut down on work-related travel and site visits
  • Put in place more accurate and effective alert and notification systems
  • Shorten response times when issues arise
  • Scale new operations quickly across the organization
  • Respond nimbly to staffing shortages or future lockdown situations

How Water Utilities Can Use Machine Learning to Reduce Their Electric Bill

In Singapore, the Ulu Pandan Water Reclamation Plant used machine learning to analyze its operational data, and was able to reduce aeration energy usage by 15%.

Instead of using reactive control mechanisms, which adjust wastewater treatment processes in reaction to changing nutrient levels, flow rates, etc., the machine learning algorithm in use at Ulu Pandan creates predictive models, making fine adjustments to the system earlier than a reactive controller would.

Effectively, the automated systems at the treatment plant spend less energy playing catch-up with changing conditions—opting, instead, to literally “go with the flow.”

Detecting Violations of the Clean Water Act (CWA) With Machine Learning

In theory, some water treatment facilities are more likely to violate the CWA than others—there’s just no way to know which ones. Unless you apply machine learning to the task, that is.

In 2018, researchers from Stanford demonstrated that machine learning could be used to predict the likelihood of particular water treatment facilities violating the CWA. In theory, with that information, inspectors could be sent to the facilities most likely to be in violation of the CWA, rather than to facilities with a very low likelihood of being found non-compliant.

As their paper in Nature Sustainability demonstrates, using such a system can double the number of violators caught, while allocating inspection resources more effectively. 
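The targeting idea can be sketched in a few lines of Python. The snippet below is a hypothetical illustration, not the Stanford researchers' model: the facility names and risk scores are invented, and `pick_inspections` is a helper made up for this example. It simply ranks facilities by a predicted violation risk and spends a fixed inspection budget on the highest-risk ones.

```python
def pick_inspections(risk_by_facility, budget):
    """Return the `budget` facilities with the highest predicted risk."""
    ranked = sorted(risk_by_facility.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:budget]]


# Hypothetical predicted probabilities of a violation (made-up values).
predicted_risk = {
    "Plant A": 0.05,
    "Plant B": 0.62,
    "Plant C": 0.31,
    "Plant D": 0.88,
    "Plant E": 0.12,
}

# With two inspections available, visit the two riskiest facilities.
targets = pick_inspections(predicted_risk, budget=2)
print(targets)  # ['Plant D', 'Plant B']
```

In practice the risk scores would come from a trained model, and an agency would likely blend risk-based targeting with some random inspections. The sketch only shows the allocation step.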

There’s also an element of deterrence at work: the researchers theorize that, if water treatment facilities know their data is being monitored and that a machine learning algorithm will be able to anticipate any future violations, they will be more diligent, working harder to ensure violations never occur at all.

Improving Collaboration with Centralized Water Data 

Machine learning and the rise of the distributed workforce are both exciting aspects of water utility data management. In fact, they could have a major impact on the future of how water utilities operate. 

But organizing and centralizing data has the most immediate impact upon a water utility’s most valuable resource: its people.

When data is accessible to all personnel, across all teams, collaboration becomes more fluid, easy, and intuitive. It’s easier for engineers, compliance professionals, operations management, and other stakeholders to take advantage of the utility’s vast store of data, and use it to everyone’s benefit.

Ready to Turn Your Compliance Data Into an Asset?

Klir’s compliance tracking tools help utilities get more out of their data while cutting down on administration and record-keeping work, create new opportunities for collaboration, and provide a level of system-wide visibility unmatched by other water data management systems. Learn more and book a demo today.

Wastewater-Based Epidemiology Is Already Here. How Should Utilities Prepare?

Wastewater-based epidemiology is a scientific field that’s surged in popularity during the COVID-19 pandemic. 

Researchers have used wastewater to predict surges before they happen — a breakthrough that’s helped governments focus on preparation instead of reaction. 

And it isn’t limited to COVID. Once the pandemic becomes endemic, researchers will still be on the hunt for pathogens that can cause issues for public health. In the best-case scenario, scientists could catch the next pandemic before it happens.

But the strategy requires gathering and sorting through huge amounts of data. That presents privacy and data management issues for utilities to overcome. We spoke with two experts about the possibilities of wastewater-based epidemiology, and what utilities should be doing now to prepare.

What Is Wastewater Epidemiology?

Think of wastewater-based epidemiology as reading tea leaves, but grosser. 

Scientists take samples of human waste and analyze them for pathogens, like the virus that causes COVID-19. Researchers can then use the data to predict surges based on trends in where the pathogen is found and how much of it is there.
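As a rough illustration of what “predicting surges based on trends” can mean, here is a minimal Python sketch. It is an assumption-laden toy, not a method used by any particular lab: the `flag_surge` helper, the three-sample window, the 1.5x ratio, and the concentration values are all made up for this example. It compares the average of the most recent samples with the average of the samples just before them.

```python
from statistics import mean


def flag_surge(concentrations, window=3, ratio=1.5):
    """Flag a surge when the latest `window` samples average at least
    `ratio` times the average of the `window` samples before them."""
    if len(concentrations) < 2 * window:
        return False  # not enough history to compare
    recent = mean(concentrations[-window:])
    previous = mean(concentrations[-2 * window:-window])
    return previous > 0 and recent >= ratio * previous


# Made-up daily viral concentrations (for example, gene copies per litre).
steady = [100, 110, 95, 105, 100, 98]
rising = [100, 110, 95, 150, 210, 300]

print(flag_surge(steady))  # False
print(flag_surge(rising))  # True
```

Real surveillance programs typically normalize for flow and population and use more robust statistics. The point here is only that a trend signal comes from comparing recent samples with earlier ones.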

That allows local governments to ramp up public health measures like expanding health-care facilities and instituting mandates like masking or lockdowns.

The International Effort to Track COVID Using Wastewater

Most people have learned about wastewater epidemiology through the pandemic. Researchers’ ability to predict COVID trends has led to the field’s expansion.

Some public health officers have found it’s less helpful in their municipalities, though they’re sometimes not sure why. In a sign of the field’s still-emerging status, experts have debated just how much wastewater sampling can tell us about COVID.

In the United States, Canada, and many other jurisdictions, wastewater sampling decisions lie with local governments, limiting the data’s scope. But projects like COVIDPoops19 show the method’s worldwide spread and offer a glimpse of what it could become.

Its creator, University of California Merced professor Colleen Naughton, said she hopes every city in the world will eventually have its own wastewater epidemiology outpost — though she noted it’ll take a while to get there. Even though it’s cheaper than individual testing, “it still requires resources and … utility-level sampling, and then courier services, and then labs to analyze it,” she said.

What Could Wastewater Epidemiology Be Used for in the Future?

Some experts believe COVID tracking is just the beginning. 

Naughton noted that the method was used to track polio outbreaks in the past. The U.S. Centers for Disease Control and Prevention (CDC) now hopes to catch influenza, norovirus, fungal infections, hepatitis A and B, and antimicrobial-resistant (AMR) pathogens — bugs that have evolved to be resistant to antimicrobial drugs. 

“We’ve had large hepatitis A outbreaks in California and Michigan, so it would have been nice to have more of a warning system about that,” she said.

Experts have long warned of AMR “superbugs” that could ravage the world in a similar — or worse — way to COVID-19. Wastewater epidemiology could potentially identify them before they spiral out of hand. 

Wastewater could also be tested for opioids and other illicit drugs to combat the overdose epidemic, Naughton said.

What Challenges Will Wastewater-Based Epidemiology Create for Drinking and Wastewater Utilities?

Naughton and epidemiologist David Larsen raved, unprompted, about partnering and working with wastewater treatment plants. 

“They don’t get a lot of recognition for their important role,” Naughton said. “We go to the treatment plants and they’re always running around, you know, the phone’s ringing off the hook, and they’re like, ‘Yeah, sure, we’ll take an extra sample for you.’ It’s just been amazing what they’ve done to help. So we really appreciate it.”

Plant operators are “really excellent people,” Larsen said. “They’re a huge contributor to this public health initiative. And so I’m just really glad to be working with them and they’ve been great so far.”

Clearly, utilities are doing something right. But for those that want to get ahead of the game as wastewater testing expands in the coming years, the professors had some tips:

Equipment and Staffing

Some water utilities have absorbed extra epidemiological sampling requirements with ease. But others — especially rural and smaller plants — don’t have the equipment, time, or staff to spare, Naughton said.

Some that can only do a grab sample once per day have tried to time it with the morning flush to get the most data, Naughton said. But the “gold standard” of 24-hour composite sampling yields more complete data, Larsen said. 
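To see why a composite can give a more complete picture than a single grab, consider this small Python sketch. It assumes a flow-weighted composite, which is one common way to build one (the experts quoted here did not specify a method), and every number in it is invented for illustration. `composite_concentration` is a hypothetical helper.

```python
def composite_concentration(samples):
    """Flow-weighted average concentration from (concentration, flow) pairs."""
    total_flow = sum(flow for _, flow in samples)
    if total_flow == 0:
        raise ValueError("total flow must be greater than zero")
    return sum(conc * flow for conc, flow in samples) / total_flow


# Made-up readings over one day: (concentration, flow volume) per period.
day = [
    (400, 10),  # morning peak
    (100, 20),  # midday
    (100, 20),  # evening
    (50, 10),   # overnight
]

grab = day[0][0]  # a single grab sample taken during the morning peak
composite = composite_concentration(day)
print(grab, round(composite, 1))
```

With these illustrative numbers, the morning grab captures the day's strongest signal but says nothing about the other periods, while the composite reflects the whole 24 hours.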

Naughton recommended small utilities keep an eye out for programs like the one from the CDC and Water Environment Federation (WEF) where they could apply for free autosamplers. 

Smaller plants will also have more difficulty hiring enough people to carry out the sampling and ship the samples off, Larsen said.

Lots of government grants are going to labs for analysis — but more cash should be flowing to utilities to hire more staff to carry out the increased sampling load, Naughton said. Public health is “chronically underfunded,” Larsen said, adding that any increased burden shouldn’t come out of utilities’ budgets.

Coordination With Researchers

Scientists analyzing wastewater samples rely on utilities’ knowledge of their systems, the experts said. Things like the amount of industrial waste the facility processes, or how much salt is used on the roads in winter, can make a big difference to sampling, Naughton said.

“So when we see things that are weird with the data, they can be like, ‘Oh, that clarifier went offline then, or we had this industrial flow,’” she said. 

Developing those relationships is critical to a smooth working partnership, Larsen said. 

“Know who the epidemiologists are and know who the environmental epidemiologists are,” he said.

Naughton added that she invites utilities she works with to meetings so they can ask questions and engage with public health departments. 

Data Privacy

Health data is a sensitive topic, and utilities need to make sure security is top-of-mind.

As the population sample decreases, the privacy risk increases, Larsen said. Sampling for COVID-19 and other illnesses at a city level can’t put any individuals’ or groups’ data at risk. But what data is made public from sampling at smaller levels must be considered thoughtfully, the experts said. Generally, sampling becomes a data privacy risk at under 3,000 people, Naughton said.
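One simple way a data team might apply that rule of thumb is to screen sampling sites by population served before publishing results. The Python sketch below is a hypothetical illustration: the site records, the `publishable_sites` helper, and the decision to treat 3,000 as a hard cutoff are all assumptions made for this example, and real policies would weigh more than one number.

```python
MIN_POPULATION = 3000  # rule-of-thumb threshold cited above; adjust to local policy


def publishable_sites(sites, min_population=MIN_POPULATION):
    """Keep only sampling sites that serve at least `min_population` people."""
    return [s for s in sites if s["population_served"] >= min_population]


# Hypothetical sampling sites.
sites = [
    {"name": "City plant", "population_served": 250000},
    {"name": "Campus dorm", "population_served": 800},
    {"name": "Small town", "population_served": 3000},
]

for site in publishable_sites(sites):
    print(site["name"])
```

Here the dormitory site is held back from public release because it serves fewer than 3,000 people, while the other two sites pass the screen.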

“It’s public information that public money is going towards… so you should see that data. But we do need to be sensitive to how that data is shared and what it’s used for, and if it’s targeting communities more than helping communities,” she said.

Larsen recommended reading the World Health Organization’s guidelines on public health surveillance. Arizona State University has also studied the field’s privacy implications.

And if local public health units opt to sample for drugs, Naughton said it should be to find out which neighbourhoods need more support — not to increase incarceration or otherwise punish communities.


Fortunately, there’s lots of reading on wastewater-based epidemiology for utilities to do — potentially an “overwhelming” amount, Naughton said. 

She pointed to resources from the Canadian Water Network as a great starting point. Videos like the WEF’s that show what happens to samples can be helpful for utilities as well, she said. 

“Our plants love seeing the data, and getting it back, and (knowing) that it’s being used for something,” she said.

Better Data Management Can Help Utilities Prepare for Wastewater Epidemiology

Wastewater epidemiology generates large amounts of sampling data. Klir’s sampling module allows you to stay on top of it all, so you can be ready when researchers come knocking.

Book a demo today to find out how Klir can help you streamline data throughout your organization.

Request a Demo

Book a demo with our team and receive the latest industry insights and exclusive offers.
