Energy Sum
In this method, the values in each vertical strip of the spectrogram are summed. The sequence of sums, one sum per vertical strip, makes up the detection function. Energy summation is a useful detection method for scanning sounds to find everything of interest, particularly when the target sounds may be unknown or highly variable. It can also be used as a first step for a more sophisticated multi-step classification system.
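As a rough illustration of the idea, the sketch below sums each vertical strip of a small made-up spectrogram. It is not PAMGuard's code: the function name `energy_sum` and the list-of-lists layout (one inner list per time slice) are assumptions chosen for the example.

```python
def energy_sum(spectrogram):
    """Return one sum per vertical strip (time slice) of the spectrogram.

    spectrogram: list of time slices, each a list of values per frequency bin
    (hypothetical layout, chosen for this sketch).
    """
    return [sum(time_slice) for time_slice in spectrogram]


# Three time slices with three frequency bins each (made-up values).
spec = [
    [1, 2, 1],    # quiet
    [1, 20, 30],  # a sound is present
    [2, 1, 1],    # quiet again
]

detection_function = energy_sum(spec)
print(detection_function)
```

The middle slice stands out in the detection function, which is what a threshold would then pick up.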
Adding the module
From the PAMGuard file menu select Add Modules > Detectors > Ishmael energy sum.
Give a name to the detector, and hit OK.
This detector requires that you also configure an FFT (Spectrogram) Engine as its input.
Configure the module
From the PAMGuard Settings menu select Ishmael energy sum Settings … to open the dialog.
Data source and channels
Select the FFT data source. You should think carefully about the best FFT Length and FFT Hop to use based on the sample rate of your data and the type of sound you are trying to detect. See here for more information on this important, but often overlooked, topic.
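The trade-off can be made concrete with two standard relations: the width of a frequency bin is the sample rate divided by the FFT length, and the time step between spectrogram slices is the hop divided by the sample rate. The helper name `fft_resolution` and the example values are illustrative only.

```python
def fft_resolution(sample_rate_hz, fft_length, fft_hop):
    """Return (frequency bin width in Hz, time step between slices in seconds)."""
    freq_resolution_hz = sample_rate_hz / fft_length
    time_step_s = fft_hop / sample_rate_hz
    return freq_resolution_hz, time_step_s


# Example values only: 48 kHz data, 1024-point FFT, 50% overlap.
freq_res, time_step = fft_resolution(48000, 1024, 512)
print(f"{freq_res:.3f} Hz per bin, {time_step * 1000:.2f} ms per slice")
```

A longer FFT gives narrower frequency bins but smears short sounds in time, so the best choice depends on the sound you are trying to detect.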
Select which channels you want to process and how you want them to be grouped. If channels are in the same group, it’s slightly easier for the PAMGuard localisers to measure time delays between data on the different channels, but only if the hydrophones in each group are quite close together.
Energy Sum calculation
The Ishmael Energy Sum detector initially had a static threshold. This means that if the overall noise increased, e.g. due to weather or ship noise, many more sounds would exceed the threshold, resulting in more false positives. To remedy this, there are two options: an “energy ratio” or an “adaptive threshold”, with additional options to smooth the energy sum output and/or use log scales.
Use Energy Ratio
When enabled, the Ishmael detector peak is the ratio of the energy sums in two frequency bands. This can help in situations where, for example, broadband noise triggers the detector.
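The sketch below shows why a ratio helps with broadband noise. It is a simplified illustration, not PAMGuard's implementation: the function `energy_ratio`, the bin ranges, and the choice of which band goes in the numerator are all assumptions.

```python
def energy_ratio(time_slice, band_a, band_b):
    """Ratio of the energy sum in band_a to the energy sum in band_b.

    Bands are (first_bin, last_bin_exclusive) pairs; hypothetical convention.
    """
    energy_a = sum(time_slice[band_a[0]:band_a[1]])
    energy_b = sum(time_slice[band_b[0]:band_b[1]])
    if energy_b == 0:
        # Arbitrary choice for this sketch: avoid dividing by zero.
        return float("inf") if energy_a > 0 else 0.0
    return energy_a / energy_b


bands = ((0, 2), (2, 4))   # made-up bin ranges
broadband = [4, 4, 4, 4]   # noise raises every bin equally
tonal = [8, 8, 1, 1]       # energy concentrated in the first band

broadband_ratio = energy_ratio(broadband, *bands)
tonal_ratio = energy_ratio(tonal, *bands)
print(broadband_ratio, tonal_ratio)
```

Broadband noise raises both bands together, so the ratio stays near 1, while a sound confined to one band produces a large ratio.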
Use Adaptive Threshold
The adaptive threshold tracks background noise, ensuring that the threshold is always some number of dB above the background, which substantially reduces false detections. The downside is that the sensitivity of the detector changes with background noise (i.e. the detection range changes with noise). The long filter defines how noise is tracked: a lower number means a longer averaging window, and a typical value might be 0.00005. If there is a loud, short sound, the adaptive noise filter will take a long time to decay back to normal. If the result reaches Spike Threshold current energy, then noise exponentially decays.
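One common way to track background noise is an exponential moving average, where a smaller coefficient gives a longer averaging window. The sketch below assumes that form. The update rule, the function `adaptive_detect` and its parameter names are assumptions for illustration, and the spike threshold behaviour is deliberately left out.

```python
def adaptive_detect(energy_db, threshold_db, long_filter):
    """Flag samples that exceed the tracked noise level by threshold_db.

    Assumed noise update (not confirmed by the PAMGuard documentation):
        noise = (1 - long_filter) * noise + long_filter * energy
    """
    noise = energy_db[0]  # start the noise estimate at the first sample
    flags = []
    for energy in energy_db:
        flags.append(energy > noise + threshold_db)
        noise = (1.0 - long_filter) * noise + long_filter * energy
    return flags


quiet = [50, 50, 50, 70, 50, 50]   # made-up levels in dB
noisy = [level + 10 for level in quiet]  # same event, background 10 dB higher

# long_filter is much larger than a typical setting so the effect shows
# in a six-sample example.
quiet_flags = adaptive_detect(quiet, threshold_db=6, long_filter=0.01)
noisy_flags = adaptive_detect(noisy, threshold_db=6, long_filter=0.01)

# A static threshold tuned for the quiet case fires on every noisy sample.
static_flags = [level > 56 for level in noisy]
print(quiet_flags, noisy_flags, static_flags)
```

In both cases only the loud sample is flagged, whereas the static threshold triggers continuously once the background rises.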
Use Detector Smoothing
Detector smoothing smooths the energy sum output so that small spikes do not trigger the detector, which is useful for longer sounds. A typical value might be 0.1.
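The effect can be sketched with a simple exponential smoother. Whether PAMGuard uses exactly this formula is an assumption; the function name `smooth`, the threshold of 3 and the input values are made up for the example.

```python
def smooth(values, smoothing):
    """Exponentially smooth a sequence (assumed form, for illustration only)."""
    state = values[0]
    smoothed = []
    for value in values:
        state = (1.0 - smoothing) * state + smoothing * value
        smoothed.append(state)
    return smoothed


threshold = 3.0                   # made-up detection threshold
spike = [0, 0, 10, 0, 0, 0]       # one-sample spike
long_sound = [0, 0] + [10] * 10   # sustained sound of the same height

spike_detected = any(v > threshold for v in smooth(spike, 0.1))
long_detected = any(v > threshold for v in smooth(long_sound, 0.1))
print(spike_detected, long_detected)
```

With a smoothing value of 0.1 the single-sample spike never reaches the threshold, while the sustained sound crosses it after a few samples.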
Use Log Scale
The energy sum is calculated on a base-ten log scale; in other words, the threshold values are in units of dB.
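For reference, a conventional conversion of an energy value to decibels is 10 times the base-ten logarithm. The sketch below assumes that convention; the function `to_db` and the floor value used to avoid taking the log of zero are hypothetical.

```python
import math


def to_db(energy, floor=1e-12):
    """Convert an energy value to dB, assuming the 10*log10 convention."""
    return 10.0 * math.log10(max(energy, floor))


low = to_db(100.0)
high = to_db(1000.0)
print(f"{low:.1f} dB, {high:.1f} dB, difference {high - low:.1f} dB")
```

On this scale a tenfold increase in energy is a 10 dB step, so a threshold is expressed as a number of dB rather than a raw energy value.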