Friday, December 23, 2016

Monitoring System Topology

I recently had a discussion with a client in Taiwan regarding the methods used to collect data from remote monitoring stations.  My client does a lot of landslide monitoring, and their technique is to install a cellular modem in every station and collect data from each.  This is a "Multi-Point Collection" network, where data is collected separately from each station.  Because the terrain is not only remote but also rugged and steep, they are always fighting cell signal issues.  I have been trying to convince my client to use radios at sites with poor cellular signal and route the data to a single station with good signal strength.  For purposes of discussion we'll call this a "Single-Point" collection scheme, where a host station is responsible for polling and collecting data from remote stations over a local wireless network, and data is collected from only the Host station(s).  This is the technique we have incorporated into our wireless EmbankNet ™ dam monitoring system, and we have found it very efficient and cost effective for remote sites.  This post describes each of these approaches and points out their relative benefits.

SCADA vs. Data Logging

Before we discuss the different forms of data collection, I should say that our discussion is centered around a methodology of monitoring called "data logging" as opposed to SCADA systems that use Programmable Logic Controllers (PLCs).  Data loggers typically are low-power devices designed to be connected to sensors and deployed in remote locations for extended periods, usually using a battery and a solar panel as a power source.  PLCs are found on the factory floor where there is abundant power and where they are dedicated to measuring and controlling things in real time.

Due to technological advances, the line between these two concepts is definitely blurring.  But for purposes of this discussion, a SCADA system is dedicated to controlling processes in real time without humans collecting and analyzing the data.  Data logging is dedicated to collecting the  sensor data for analysis, modeling and reporting.  It's important to point out that data may be collected in a SCADA system, and real-time control may be implemented in a data logging system.  But the hardware and software in each has evolved from a different core purpose.

So our discussion is about how we can collect data from data loggers over different network topologies built with modems and/or radios.

Multi-Point Collection

This type of data collection using telemetry is the simplest to implement, as it basically collects data from each monitoring station without reference to the others.  Each station must have its own telemetry that connects directly to the collection point at the home or office.  With this topology we periodically connect to each station directly with a server (or have each station connect to the server) and retrieve the stored data.

Multi-Point Collection

In the diagram above each station is connected to the Server through the Internet.  We tend to use this technique when our stations are spread over a wide area and there is no "line-of-sight" between them.  The benefit of this approach is ease of deployment.  Each station stands alone and only needs some form of communication to access it.  It used to be a phone line, but now it's usually either a satellite or cellular modem.  The major limitation of this approach is that Internet connectivity is not always available at remote locations.  There is also a significant management and cost factor over the long term, as service plans must be procured and maintained for each modem.  This is not a one-time occurrence or cost, as technology and data plans are changing constantly and service providers may not share your sensitivity to the importance of your data.  Service providers are creating machine-to-machine (M2M) data plans, which make management and provisioning easier.  But with over 15 years of experience using this approach I can tell you that change is the constant in this industry.

Single-Point Collection

This form of data collection involves using radios and modems in tandem to adapt to local conditions, and to consolidate data into a smaller number of data collection points, called Host stations.

Single-Point Collection


This technique requires more up-front programming to configure, as remote stations must send data to the Host either on a pre-determined schedule or in response to a request from the Host.  This means that radio configuration, clock synchronization and connection-failure handling have to be added to the standard data logger programming.  Many people think that radio communications are problematic, but we have not found this to be the case.  In fact, properly programmed and configured, radio communications can be extremely robust and reliable, and the brand of radio you use does matter.  Adding radios is a technological challenge that has to be overcome, but it's a one-time cost in labor, and once you develop a system you can reuse it in other systems you build.
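To make that concrete, here is a minimal sketch of the kind of polling loop a Host might run.  The station IDs, helper functions and retry limits are hypothetical, and a real implementation on data logger hardware would look different, but the retry and back-off handling is the part that has to be designed in from the start:

```python
import time
from datetime import datetime, timezone

REMOTE_STATIONS = ["RS-1", "RS-2", "RS-3"]   # hypothetical remote station IDs
MAX_RETRIES = 3

def radio_request(station_id):
    """Placeholder for a radio transaction with a remote station.
    A real implementation would talk to the radio's serial interface
    and raise IOError when the link fails."""
    return [(datetime.now(timezone.utc), "water_level_ft", 3.3)]

def store_record(station_id, record):
    """Placeholder: append a record to the Host logger's data table."""
    print(station_id, record)

def poll_all():
    """Poll every remote over the radio link, retrying with back-off on failure."""
    failures = []
    for station in REMOTE_STATIONS:
        for attempt in range(1, MAX_RETRIES + 1):
            try:
                for record in radio_request(station):
                    store_record(station, record)
                break                              # success; move to the next station
            except IOError:
                time.sleep(5 * attempt)            # back off before retrying
        else:
            failures.append(station)               # all retries failed; pick up next cycle
    return failures

if __name__ == "__main__":
    missed = poll_all()                            # in practice this runs on a schedule
```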

The advantages of this data collection method are more flexibility in design and layout, lower long-term cost, and the fact that it's your network, so you are less dependent on a third-party provider.  The disadvantages are increased programming complexity and the need for at least two communications ports (one for the modem and one for the radio) on at least the Host data logger.

Hybrid Systems

Hybrid systems use a combination of radio and cellular networks to adapt to local conditions and extend a monitoring network over a large area.  Where Internet connections are not available, or where we have a number of stations within a relatively small areal extent, we use a radio network to connect the remote stations to a Host.

Hybrid System - Point and Multi-Point


Host stations are generally located where we have higher quality cellular service.  In this manner information can be consolidated for data collection and also shared from one Host station to another throughout the extended monitoring network.  Enhanced data visualization techniques, like web-based HMIs, can also be used at a Host station to provide real-time access to data from almost anywhere.  This type of hybrid data logging system, with data being freely shared between separate monitoring systems, starts to resemble a factory-floor SCADA system - but with a lower rate of data throughput.  Measurements are taken at a frequency appropriate for the purpose of recording the data, but the data is shared to enhance operations and improve awareness.  This also improves the quality of the collected data, as more stakeholders take an interest in data integrity.


Tuesday, November 22, 2016

Rating Curve Management and Display Tool

SiteHawk Rating Curve Tool
This post is about a specific software tool that Eyasco has developed to aid customers who have to measure river and stream flow.  This is particularly important for maintaining minimum flows where fish migrate up freshwater streams to spawn.

What Is a Rating Curve?
A rating curve is a relationship between stage (water level) and flow or discharge at a cross section of a river. Stage is the height of a water surface above an established point and flow (or discharge) is the volume of water moving down a stream or river per unit of time. Flow values can be obtained by applying these rating curve formulas to stage measurements.




In the example above the discharge is 40 cubic feet per second (cfs) when the stage equals 3.3 feet.  The dots on the curve represent concurrent measurements of stage and discharge used to develop the curve.

To develop a rating curve, flow measurements are taken manually at different stage levels.  This is because the slope or curve of a river bank can change enough with water depth that the relationship between water height and flow is non-linear.  This process is called "rating" the stream.  Historically, log relationships were used to create rating curves because they resulted in a near-straight line on a log plot.  In more recent years, however, polynomial equations have gained favor due to their ability to better handle low-flow conditions.
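For illustration, here is a small sketch of the two formulations.  The coefficients are made up - real values only come from rating the stream:

```python
def discharge_poly(stage_ft, coeffs):
    """Polynomial rating: Q = c0 + c1*h + c2*h^2 + ..."""
    return sum(c * stage_ft ** i for i, c in enumerate(coeffs))

def discharge_power(stage_ft, a, b, h0=0.0):
    """Classic log-based rating: Q = a * (h - h0)^b, a straight line on log axes."""
    return a * (stage_ft - h0) ** b

# Illustrative coefficients only.
print(discharge_poly(3.3, [2.0, 1.5, 3.2]))   # polynomial estimate of discharge at stage 3.3 ft
print(discharge_power(3.3, a=6.5, b=1.6))     # power-law estimate at the same stage
```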

If the channel geometry changes enough then more than one rating curve at a specific location may be necessary. For example, the picture below illustrates how there can be different volumes of water depending on the shape of the river bank, and there may be one rating curve for water heights less than 2 feet, and another for water levels greater than 2 feet.


Change in channel geometry also creates different stage-flow relationships
If the geometry of the river channel changes due to erosion or deposition of sediment at the stage measurement site, then a new rating curve has to be developed.  Over time, a specific rated location will accumulate many rating curves.  Deriving historic or real-time flow from these curves requires applying the correct rating based on the stage values and/or the date and time.

The Rating Curve Tool

The Rating Curve tool included with SiteHawk allows users to enter and manage rating curves and to derive plots of flow based on application of these curves to stage measurements. 




Each rating curve entry allows a user to define a date range for application.
 


And a minimum and maximum stage level



The application will apply rating curves based on their applicability to a specific level measurement, and generate continuous, historically accurate flow data.  The tool can chart both the stage and flow values.  Changes to the polynomials are reflected on the chart the next time it is generated.  Graphs can be generated for viewing or export from the user interface.
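As a rough sketch of the selection logic described above, with a hypothetical data structure standing in for the curves a user would enter in the tool, applying a curve amounts to a lookup by date range and stage range followed by evaluating the polynomial:

```python
from datetime import datetime

# Hypothetical representation of user-entered rating curves: each entry has a
# validity date range, a stage range, and polynomial coefficients.
RATING_CURVES = [
    {"start": datetime(1990, 1, 1), "end": datetime(1999, 12, 31),
     "min_stage": 0.0, "max_stage": 2.0, "coeffs": [0.0, 4.0, 1.1]},
    {"start": datetime(1990, 1, 1), "end": datetime(1999, 12, 31),
     "min_stage": 2.0, "max_stage": 10.0, "coeffs": [1.5, 3.2, 2.4]},
]

def flow_from_stage(timestamp, stage):
    """Return the flow for one stage reading, or None if no curve applies."""
    for curve in RATING_CURVES:
        if (curve["start"] <= timestamp <= curve["end"]
                and curve["min_stage"] <= stage < curve["max_stage"]):
            return sum(c * stage ** i for i, c in enumerate(curve["coeffs"]))
    return None   # gap in the ratings - flag it rather than guess

# Applying the curves to a stage time series yields a continuous flow record.
print(flow_from_stage(datetime(1994, 7, 20), 1.4))
```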


Applying the Rating Curve In Real Time
Once a user has entered the rating curves for a specific rated stream location, they can generate plots of flow manually, or use the curves to generate flow data in real time - as if the flows were actually being calculated or measured in the field!  The benefit of this approach is that flows can be displayed in real time on charts (Infopages) and maps (SiteHawk).


The Rating Curve tool is a powerful application that allows users to manage and organize their rating curves, generate flow calculations, visualize their calculations with graphs, and automate flow calculations for real-time monitoring.

Tuesday, September 27, 2016

Big Springs Ranch Salmon Restoration Project

In 2009 the Nature Conservancy (TNC) began an effort to restore the coho salmon populations in the Klamath basin with the Shasta Big Springs Ranch Project. The Klamath River was once a major salmon-producing river. Changes to Klamath River flows and habitat loss due to human activities and development in the last 150 years have caused declines in the salmon runs. The Shasta River is a tributary to the Klamath River and was historically a major salmon-producing stream.  Shasta Big Springs Ranch, located in the upper portion of the watershed, is a critical area where springs maintain both flow and cool temperatures.  TNC acquired the ranch and adjacent lands several years ago with the aim of improving habitat for coho and Chinook salmon and steelhead trout.
Shasta Big Springs Ranch Study Site (Reference Note 1)
Land and water use changes through time have led to water temperatures in the  Shasta River basin that do not support all life stages of these cool water fishes.   Critical to salmon and steelhead survival are appropriate water temperatures for summer rearing and spring-time juvenile migration.

Big Springs Creek’s water source provides water in the 10-12 °C temperature range year-round, with flow rates that seasonally range from 40 to 80 cfs. Therefore, maintaining the flow and the cool temperatures provided by Big Springs Creek is critical to creating and sustaining suitable salmon habitat along the creek and downstream of it.

Elevated water temperatures on Big Springs Creek were caused by low water levels, lack of shade due to loss of vegetation, and inflows of irrigation return water. In 2009 livestock were prevented from entering Big Springs Creek, which caused an increase in aquatic vegetation, including extensive emergent vegetation. This added vegetation increased water depth (via flow resistance) and shade, both of which helped reduce heating of the water in Big Springs Creek. The added vegetation also provided cover for juvenile salmon and formed the basis of the food web (primary production and invertebrate populations) that supports young salmon.
Year-on-year temperature measurements from Big Springs Creek and nearby irrigation canal

TNC was targeting a living-landscape approach that would support both instream flow and habitat while also providing a means to support agricultural practices.  In 2010 Eyasco was contracted to install a network of wireless solar-powered monitoring stations that would collect temperature and flow data and display it in real time on a web site that could be shared with ranch managers and staff.

Automated temperature and flow monitoring stations on Shasta Big Springs Creek
The concept was that ranch managers could observe water temperatures at multiple locations on the creek and in diversion canals and return-flow facilities, and use this information to operate in a manner that would minimize temperature impacts due to irrigation return flow.  Real-time temperature differences between irrigation water in off-stream canals and water in Big Springs Creek were accessible to managers, allowing them to release water back into the creek when the differences were within acceptable limits.
Daily temperature swings in creek and canal
 
Before the Shasta Big Springs Ranch project, only 30-60 feet of Shasta Big Springs Ranch had suitable habitat for salmon. After the introduction of better management practices, including real-time temperature monitoring, suitable habitat increased to 10 miles, with an overall 7.2 °C drop in water temperature during summer. Eyasco’s low-power real-time monitoring system continues to provide critical year-round information for ranch managers.

Monday, June 6, 2016

Satellite M2M Communications - An Expensive Lesson


Satellite modem traffic (in Mbytes) at 6 sites over a 4 month period

Many of our monitoring systems use either cellular or satellite technology to transmit data from our remote sites to a computer tasked with managing the data.  This use of the technology is called "machine-to-machine" or M2M, and there are many companies now offering service plans tailored for this application. Cellular bandwidth fees are pretty inexpensive - and our data requirements are pretty low compared to the average consumer.  But satellite bandwidth fees are quite a bit more expensive and restrictive in the sense that maximum usage is capped at levels much lower than cellular 'limits'.  For example - a typical cellular usage plan might be something like $40 for 5GB of data a month where a satellite plan might be $44 for 2MB.  This simple fact taught us a valuable lesson about something that is happening on the Internet that most of us never see.  It's amazing really and it has me thinking seriously about the efficacy of connecting our infrastructure through the Internet.

The graph above shows monthly data usage at 6 sites using Galaxy Communications BGAN/M2M service.  The same amount of data was collected each month from each site, yet the usage in the first two months is 3-10 times greater than in the last two months.  What gives?

One word - FIREWALL.  During the first two months shown on the chart there was no firewall enabled, which allowed any IP to access the modem.  There was no real security vulnerability to the connected devices - the attached measurement controllers were not connected to any other infrastructure, and there were no control capabilities built into them.  What was really surprising was analyzing the packets to see which other IPs were accessing, or trying to access, the modems.
  • Egypt
  • Philippines
  • China
  • Hungary
  • Japan
  • Greece
  • Russia
It was only through the diligence and persistence of Eyasco employees that this was even discovered.  It took many hours over several months poring through packet reports to determine the cause of the usage beyond that anticipated for data collection.  Approximately 85% of the bandwidth usage, without the firewall restricting traffic to a single IP, came from "non-native" IPs.  Good for the satellite company, as this resulted in "Out-of-Bundle" usage fees of over $1000.
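For anyone facing the same problem, the analysis boils down to summing bytes by source IP in the provider's packet reports.  A minimal sketch follows; the CSV column names and the allowed IP are assumptions for illustration, not necessarily the format your provider exports:

```python
import csv
from collections import defaultdict

ALLOWED_IPS = {"203.0.113.10"}    # hypothetical: the one server that should talk to the modem

def usage_by_source(report_path):
    """Sum bytes per source IP from a packet report exported as CSV.
    Column names ('src_ip', 'bytes') are assumptions; adjust to the actual export."""
    totals = defaultdict(int)
    with open(report_path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["src_ip"]] += int(row["bytes"])
    return totals

if __name__ == "__main__":
    totals = usage_by_source("packet_report.csv")          # hypothetical file name
    foreign = sum(b for ip, b in totals.items() if ip not in ALLOWED_IPS)
    print(f"Non-native traffic: {foreign / sum(totals.values()):.0%} of total")
```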

It bears repeating that while this level of extra-curricular traffic is huge and costly for the satellite modems, it would probably not even be noticed on a cellular modem. The satellite modems above have monthly plans of 2 Mbytes each. We have a cellular plan that includes 250 Mbytes for any number of modems, and we rarely go over.  It takes some serious IP camera viewing or web HMI viewing to push the costs over the limit.  Even then the penalty is on the order of $50 rather than $1000.

And the conclusion seems to be that there is a significant amount of effort being expended world-wide to hack into any public-facing unprotected access point!




Thursday, May 19, 2016

Multi-Site Management with Merlin Enterprise


Eyasco started as a business building monitoring systems in the geotechnical and drinking water industries (another blog perhaps on "what is geotechnical monitoring?").  Our goal was to build monitoring systems that 'included' data management and display.  From day one we wanted a true end-to-end solution - the kind of thing you connect sensors to and view data on in a web browser or on a smartphone.  It sounds commonplace today - but we started in 2003.

The innovative concept my partner and I came up with was to embed the collected data with enough information that our data collection software would know exactly what to do with it.  In other words, instead of collecting the data in spreadsheets or in a database and cutting and pasting (in the case of the spreadsheet) or programming (in the case of the database), our software would collect the data and it would be ready for display because of the bits of information (metadata) we embedded in the data stream.  Our thinking was that everyone who makes one of these monitoring systems has to program them – so that’s a given.  If we could eliminate the so-called “middle-tier” programming, we would create a fast track to presenting data on the Internet.  We built the monitoring systems in my garage, and were serving data on the Internet with a server in my bedroom.
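The actual format we settled on isn't important here, but the idea is easy to sketch.  The hypothetical record below carries the station name, location, sensor names and units along with the values, so the collection software can store, chart and map the data without any site-specific programming:

```python
import json

# A hypothetical self-describing record: the measurements travel with the
# metadata the server needs to store, chart, and map them without middle-tier code.
record = {
    "station": {"name": "Spring-03", "lat": 41.2345, "lon": -122.4567},
    "timestamp": "2016-05-19T14:00:00Z",
    "readings": [
        {"sensor": "flow",        "units": "cfs", "value": 12.4},
        {"sensor": "turbidity",   "units": "NTU", "value": 0.8},
        {"sensor": "water_level", "units": "ft",  "value": 3.3},
    ],
}

def ingest(payload: str):
    """The collection side needs no per-site programming: it reads the metadata
    and knows which station, sensor, and units each value belongs to."""
    data = json.loads(payload)
    for r in data["readings"]:
        print(data["station"]["name"], r["sensor"], r["value"], r["units"])

ingest(json.dumps(record))
```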
 

First we monitored things like flow, water level, turbidity (water clarity), pH and conductivity at "mountain" spring sites.  After we got really good at these low-power systems, we started doing other types of monitoring - not only adding other sensors, but also adding monitoring for control and security.  We love the challenge of designing new systems, adding new technology and sensors, and integrating new types of telemetry.  But we rarely have to work on our software – unless a client requests something new.

The web display part has credentials and role assignment built into it, so access to data and web parts is controlled through a credential manager.
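I won't reproduce the exact role model here, but a simplified sketch of the idea - roles granting access to specific stations and web parts, with names that are purely illustrative - looks like this:

```python
# Hypothetical role model: a role grants a user group access to specific
# stations and web parts (maps, charts, tables).  Names are illustrative.
ROLES = {
    "viewer": {"stations": {"Spring-03", "Spring-07"}, "web_parts": {"map", "charts"}},
    "admin":  {"stations": {"*"},                      "web_parts": {"*"}},
}

def can_view(role, station, web_part):
    """Return True if the given role may see this station's data in this web part."""
    grants = ROLES.get(role)
    if grants is None:
        return False
    station_ok = "*" in grants["stations"] or station in grants["stations"]
    part_ok = "*" in grants["web_parts"] or web_part in grants["web_parts"]
    return station_ok and part_ok

print(can_view("viewer", "Spring-03", "charts"))   # True
print(can_view("viewer", "Well-12", "map"))        # False
```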



So what makes our approach so good?  Imagine your business is sending water treatment systems all over the world.  You want to monitor the health of all of those systems, and you want to give each end user the ability to monitor their own system.  Our approach would be to connect the sensor outputs to one of our QuB monitoring systems, which include Campbell Scientific measurement and control units (MCUs).  We would program the MCU for the number and types of sensors.  Once the unit was deployed and connected to telemetry (cellular modem, iPhone, whatever), we would see it and download not only the data, but also its location.  It would show up on a map, and all the data from the sensors would be visible in tabular and graphical form.  All of this without any programming on the data collection side.  The only configuration necessary is for the admin user to log in and define who gets to see what.  This is all done by creating and assigning roles through the web interface.

 

This works very well for a small company like ours.  Programming and configuration are largely confined to the controller, which we cannot avoid anyway.  But once that is complete, the display pretty much comes with deployment.  When our field crew finishes an install, the data is available on a password-protected web page before they get in the truck to leave.  All our clients know is that they get their data.

Wednesday, August 26, 2015

The Business Model and Database Design

What is a "relational database"? You can look it up on Wikipedia:

A relational database is a digital database whose organization is based on the relational model of data, as proposed by E.F. Codd in 1970.[1] This model organizes data into one or more tables (or "relations") of rows and columns, with a unique key for each row. Generally, each entity type described in a database has its own table, the rows representing instances of that type of entity and the columns representing values attributed to that instance. Because each row in a table has its own unique key, rows in a table can be linked to rows in other tables by storing the unique key of the row to which it should be linked (where such unique key is known as a "foreign key"). Codd showed that data relationships of arbitrary complexity can be represented using this simple set of concepts.

The definition goes on to explain the differences with hierarchical data structures, etc.  Perhaps technically correct, but it doesn't tell the whole story.  To me, the relational database is used to define how logical subsets of data are related and how they defend the integrity of the business model the database supports.

Data Tables

In general, the tables in a database should be "as small as possible", consisting of the fewest columns needed to define a unique record.  The links to other tables define fundamental relationships between the data - like "parent-child", for example.  When constructed, the entire database defines not only these relationships, but how data flows through the business process that supports it.  A well-designed database defines the entire business model and can accommodate changes and additions with minor modifications.  This can happen when the designer spends enough time with his or her feet on the ground to understand the business process, and creates a data structure that is granular - almost molecular - in its composition.  This takes the most time to create, but also creates the most flexible and long-lasting structure.  There are many other considerations, but there is no substitute for the really hard work of defining the business model with the database.

A relational database is not a spreadsheet - or a collection of spreadsheets.  A spreadsheet makes sense for a 2-dimensional representation of data and is used primarily to inform the human eye.  It works great for the human eye because we can quickly relate to the two dimensions and peer down into the individual data pieces.  But it's not efficient for a computing engine - something that is designed to find and extract pieces of information as quickly as possible.  The example below shows how the eye can quickly find a measurement by triangulating between the dates in the rows and the instrument in the columns.  Suppose you were asked to find OW-12 on 7/20/94:


While the eye can do this in an instant, it's a very inefficient form of data storage.  One way to understand why is to look at the column headers.  They are unique for each instrument, so they essentially require a custom data structure.  If you add a new instrument, you change the table structure.  Every time you search, you don't know which column the result will be in.  Compare that structure with the following.


The table above would be used in a relational database to store the data shown in the spreadsheet.  The data has been 'normalized' to minimize redundancy and provide an efficient search path.  It consists of only three columns, no matter how many instruments you have.  The first two columns - the date and the sensor name - define a unique record.  To find the record we found in the spreadsheet, we work from left to right to find the date, then the sensor, and then the value.  Not as easy for the eye perhaps, but easier for a database.

So how is this structure used to define a business model?  By first defining what constitutes a unique record in each table (the primary key), and then creating relationships between tables, you define how data will be used to support your business model.
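As a quick sketch of the normalized structure (using SQLite purely for illustration - not our production database), the composite primary key is what makes each record unique, and finding OW-12 on 7/20/94 becomes a simple query:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE readings (
        reading_date TEXT NOT NULL,
        sensor_name  TEXT NOT NULL,
        value        REAL,
        PRIMARY KEY (reading_date, sensor_name)  -- first two columns define a unique record
    )
""")
con.executemany("INSERT INTO readings VALUES (?, ?, ?)", [
    ("1994-07-20", "OW-12", 123.4),    # illustrative values only
    ("1994-07-20", "OW-13", 118.9),
])

# Adding a new instrument adds rows, not columns, so the structure never changes.
row = con.execute(
    "SELECT value FROM readings WHERE reading_date = ? AND sensor_name = ?",
    ("1994-07-20", "OW-12"),
).fetchone()
print(row)
```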

Relationships

Example - Customers, Contracts and Plants

Business Model 1

Let's assume your database is defining a customer, the contracts you have with that customer, and the plant where the work will be done.  Maybe you first consider a simple business model like:

" Plants and Contracts belong to a Customer.  Multiple Projects can be grouped under Contracts"

This can be represented with a simple organization chart as follows:



Business Model 2

What if another customer then presented you with another business model scenario, like:
 
"Project Numbers are specific to Plants, with the possibility of multiple projects under a single contract"
 
You need to be able to define something like the structure shown in the figure below where Plants are related to Projects:
 
 
 

Business Model 3

Then another customer presents you with another business model:
 
"Two separate customers with their own contracts and projects are using the same plant."
 
 
 
 
Unless you want to spend all your time programming, you want your database design to be able to represent and enforce the integrity of ALL the business models you have to support. The figure below shows an actual database design that supports the above scenarios.
 
 
 
 
It's not as complicated as it looks.  It almost looks like the data structure for Business Model 1.  The key is that Plants is not connected directly to Customers.  Instead, Plants is linked to Projects (through Contract_Projects) and to Customers through a "relationship table" that defines which plants belong to which customers.  All the relationships defined by business models 1 through 3 are supported by this structure - without data redundancy.
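A simplified sketch of that structure is shown below (SQLite syntax for illustration only - table and column names approximate the diagram and are not the production schema):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Customers        (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Contracts        (contract_id INTEGER PRIMARY KEY,
                                   customer_id INTEGER REFERENCES Customers);
    CREATE TABLE Plants           (plant_id    INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE Contract_Projects(project_id  INTEGER PRIMARY KEY,
                                   contract_id INTEGER REFERENCES Contracts,
                                   plant_id    INTEGER REFERENCES Plants);
    -- Relationship table: which customers use which plants (many-to-many),
    -- so two customers can share one plant without duplicating plant records.
    CREATE TABLE Customer_Plants  (customer_id INTEGER REFERENCES Customers,
                                   plant_id    INTEGER REFERENCES Plants,
                                   PRIMARY KEY (customer_id, plant_id));
""")
# Projects tie a contract to a plant; the Customer_Plants table keeps the
# customer-plant relationship indirect, which is what supports all three models.
```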
 
 

Tuesday, August 25, 2015

Monitoring Dam Safety

Monitoring a dam is a lot harder than you think.  Some might think that it's easy because, in general, dams don't move.  But it's precisely for this reason that dam monitoring is difficult and requires a special discipline.  How do you stay interested when there isn't anything interesting going on?

Did you know that many dams in the US have reached or exceeded their design life?  We don't build many new dams because it's difficult to get a dam through the approval process, and partially because of this, the cost is excessive.  We are increasingly dependent on aging dams (like our aging infrastructure as a whole).  If a dam is past its design life it isn't necessarily in danger of failing.  In fact, in all likelihood most well-designed and well-built dams will still be sitting there when the lakes behind them are filled with silt.  But predicting behavior in an earthquake, for example, becomes difficult.  We can't say with certainty that a dam 10 years past its design life of 50 years (60 years old) will behave a certain way in an earthquake, since we don't have similar observations to go by.  So we must watch our dams a little more carefully.

How do we watch a dam?  In general there are four basic types of measurements one can take to monitor the long term condition of an embankment dam:

  1. Pore pressure measurements
  2. Surveying surface points
  3. Seepage measurements
  4. Visual observation
The first three methods can be automated and often are.  Of these, pore pressure measurement is the simplest way to get a direct measurement of the dam's current condition compared to its theoretical design.

Pore Pressure Measurements
The proper design of a dam requires an understanding of how water will ultimately flow through the dam.  As the diagram below shows, the idea is NOT to let the phreatic surface, or top of the saturated zone, reach the toe of the dam with enough pressure behind it to flow with destructive force.  Many dam designs incorporate clay cores and gravel drains to prevent seepage from reaching the downstream shell of the dam.  Many dam monitoring systems use buried pressure transducers - called "piezometers" - to measure the pore water pressure in the dam and define the phreatic surface.  These devices are buried in the dam and can measure water pressure even when very small amounts of water are present.

Figure 1 - Flow net through embankment dams with and without drain blanket
 

In general, one would like to see measured pressures fall within certain operational limits based on reservoir head and the location of the pressure sensing element.  A pressure sensing element placed anywhere within a dam should read close to the design pressure, defined by the design phreatic surface.  In the event of an earthquake (if the dam is located in a seismic zone), the before and after pressures should be the same.  If they are not, then one has to quickly determine why.  A certain amount of increased pressure might be expected in the less permeable portion of the dam due to pore pressure build-up during seismic shaking.  But if the dam structure fails in any way that allows a more direct connection to reservoir head, this could cause critical failure.  The plot below shows historic pressure readings in a dam in Northern California compared to reservoir elevation data.

Figure 2 - Historic piezometer and reservoir elevation plot
 
 
Historical Trends
Having piezometers wired up to a recording system allows pore pressure to be monitored at regular intervals in all conditions.  This provides a historical perspective for not only evaluating dam safety, but also understanding the actual conditions within the dam.  A piezometer like that shown in Figure 2 shows an attenuated response to reservoir level, at a lower pressure level than full reservoir head.  This is a normal response for a piezometer located in the interior of the dam.
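As a rough illustration of how recorded readings might be screened automatically, here is a minimal sketch that compares each reading to an expected value derived from reservoir elevation.  The expected_head() function is a placeholder, not an engineering formula - in practice the limits come from the design phreatic surface and the judgment of the dam engineer:

```python
ALARM_BAND_FT = 5.0   # illustrative tolerance around the expected reading (feet of head)

def expected_head(reservoir_elev_ft, piezometer):
    """Placeholder: in practice this comes from the design phreatic surface
    and the piezometer tip location, not from a simple formula."""
    return piezometer["design_fraction"] * (reservoir_elev_ft - piezometer["tip_elev_ft"])

def check(piezometer, measured_head_ft, reservoir_elev_ft):
    """Return an alarm message if the reading falls outside the expected band."""
    expected = expected_head(reservoir_elev_ft, piezometer)
    if abs(measured_head_ft - expected) > ALARM_BAND_FT:
        return f"ALARM: {piezometer['name']} reads {measured_head_ft:.1f} ft, expected ~{expected:.1f} ft"
    return None

pz = {"name": "PZ-07", "tip_elev_ft": 410.0, "design_fraction": 0.35}   # hypothetical instrument
print(check(pz, measured_head_ft=22.0, reservoir_elev_ft=455.0))
```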
 
Some piezometers located above the phreatic surface will exhibit no response to reservoir head (Figure 3) and some in highly permeable zones (for example in the dam abutment) will mimic reservoir level when water levels are above their tip elevation (Figure 4). 
 
Figure 3 - Historic piezometer plot with no response
 
Figure 4 - Historic piezometer plot that follows reservoir level
 
 
All of these piezometer time series provide a signature - like an electrocardiogram (EKG) does for a heart - of the interior conditions within a dam.  From the outside a dam may look static and unchanging, but electronic monitoring of the dam's interior pressure provides a more dynamic picture - one that changes with the seasons and with age.  It's an interesting and invaluable perspective for assessing the health of our aging dams.
 
References: