Once captured, a digital video image can be analyzed using software tools. This has led to a number of new or emerging CCTV applications:
In video analytics, AI-based software automatically analyses the video data, looking for pre-identified patterns of behavior; when such behavior is identified, pre-set actions are triggered. An enormous amount of research is underway in this area, since it has the potential to aid monitoring staff by highlighting potentially significant events for their scrutiny. However, the technology is still very much in its infancy: plagued by high false-positive rates, only the very simplest of these applications have so far proved commercially deployable, and these generally require a tightly controlled environment to operate effectively. For example, video analytics is used to reduce employee fraud in supermarkets by reconciling shape recognition in CCTV data of items filmed on the checkout belt with EPOS data from items keyed in by checkout staff. It is also used in anti-terrorist applications to detect abandoned packages on, for example, railway platforms or other public areas where people gather: the software ‘learns’ what the camera view looks like, then compares successive image frames looking for new items that remain stationary. London Underground has trialed this technology with some success. Other video analytics applications, for the moment more a work in progress, include algorithms to identify people running, fighting, falling, or carrying guns or knives. There are also algorithms for car crime (identifying a human-sized object moving from car to car in a car park) and for potential suicides on railway platforms (would-be suicides apparently stand as far from the platform edge as possible, then move to the edge, repeating this several times before making their attempt; this pattern of movement is in theory identifiable using video analytics).
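The abandoned-package technique described above — learning the empty scene, then flagging new items that stay put across successive frames — can be sketched in a few lines. This is a toy illustration only: frames here are small grids of pixel intensities, and the function names, threshold, and frame-averaging ‘learning’ step are illustrative assumptions, not any deployed system's method (real systems use far more robust background modelling).

```python
# Toy sketch of abandoned-object detection via frame differencing.
# Assumption: frames are 2D grids of pixel intensities; a real system
# would use statistical background modelling on full video frames.

def learn_background(frames):
    """Average several frames to 'learn' what the empty scene looks like."""
    h, w = len(frames[0]), len(frames[0][0])
    bg = [[0.0] * w for _ in range(h)]
    for f in frames:
        for y in range(h):
            for x in range(w):
                bg[y][x] += f[y][x] / len(frames)
    return bg

def stationary_new_pixels(background, recent_frames, threshold=30):
    """Pixels that differ from the background in *every* recent frame,
    i.e. a new item that has appeared and remained stationary."""
    h, w = len(background), len(background[0])
    hits = set()
    for y in range(h):
        for x in range(w):
            if all(abs(f[y][x] - background[y][x]) > threshold
                   for f in recent_frames):
                hits.add((y, x))
    return hits

# Usage: a 3x3 scene; a bright 'package' appears at (1, 1) and stays.
empty = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
bg = learn_background([empty, empty])
with_package = [[10, 10, 10], [10, 200, 10], [10, 10, 10]]
print(stationary_new_pixels(bg, [with_package, with_package]))  # {(1, 1)}
```

Requiring the difference to persist across every recent frame is what separates an abandoned item from a person merely walking through the view.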
Facial recognition is a branch of video analytics. Currently it can be made to work with a high degree of reliability, but only in very controlled environments, for example police interview suites, access-control gates, or border control points. The latest generation of systems used at border control points works by projecting a grid of light ‘stripes’ onto the individual’s face and then using CCTV images of the distortion of the grid to build a 3D model of the face. Recently this technology has been shown to work to a useful level of reliability as travelers walk normally through a turnstile-type ‘pinch point’ designed for the purpose. The still-unresolved problem is the time-consuming data crunching required to match each new person passing through against the data of known ‘wanted’ targets. Currently each comparison takes about three seconds, making the system still unworkable for high-footfall urban transit applications.
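The matching bottleneck described above can be illustrated with a sketch of the watchlist-comparison step. The assumptions here are mine, not the source's: each face is reduced to a fixed-length feature vector (3-D here purely for illustration; real 3D-face descriptors are far larger), and matching is brute-force nearest-neighbour, which is linear in the size of the watchlist — one reason per-person comparison time becomes the limiting factor.

```python
# Sketch of watchlist matching, assuming faces are reduced to feature
# vectors. Brute-force comparison scales linearly with watchlist size,
# which is the data-crunching bottleneck described in the text.
import math

def distance(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def best_match(probe, watchlist, max_distance=1.0):
    """Return the closest watchlist entry within max_distance, else None."""
    best_name, best_d = None, max_distance
    for name, template in watchlist.items():  # O(n) in watchlist size
        d = distance(probe, template)
        if d < best_d:
            best_name, best_d = name, d
    return best_name

# Usage with a toy two-entry watchlist (names and vectors are invented).
watchlist = {"target_a": (0.1, 0.9, 0.3), "target_b": (0.8, 0.2, 0.5)}
print(best_match((0.12, 0.88, 0.31), watchlist))                    # target_a
print(best_match((0.0, 0.0, 0.0), watchlist, max_distance=0.2))    # None
```

The `max_distance` cut-off stands in for the acceptance threshold a real system would tune to trade false positives against missed targets.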
Existing video storage systems automatically apply ‘metadata’ tags to video image data to provide various search functionalities. The simplest of these metadata tags are date, time, and camera ID number, allowing users to search CCTV data by time and date, or by the images from a particular camera. However, as a by-product of video analytics, if AI software can identify objects in CCTV image data, then it can also automatically apply appropriate metadata tags to that data; examples might be ‘person running’ or ‘red truck’. In the future such metadata could be streamed live (which would require only small bandwidth) from the various CCTV systems around the country to a central real-time searchable database facility. For the moment, however, the lack of standards for metadata tagging is hampering any such efforts.
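A metadata index of the kind described — searchable by time, date, camera ID, and analytics-derived tags — can be sketched as follows. The record fields and tag strings here are illustrative assumptions; as the text notes, no agreed tagging standard yet exists.

```python
# Sketch of a searchable CCTV metadata index. Each record carries the
# simple tags (time, camera ID) plus analytics-derived tags such as
# 'person running'. Field names and values are illustrative only.
from datetime import datetime

records = [
    {"time": datetime(2024, 5, 1, 14, 3), "camera": 17, "tags": {"red truck"}},
    {"time": datetime(2024, 5, 1, 14, 9), "camera": 17, "tags": {"person running"}},
    {"time": datetime(2024, 5, 2, 9, 30), "camera": 4, "tags": {"red truck"}},
]

def search(records, camera=None, tag=None, start=None, end=None):
    """Filter records by any combination of camera ID, tag, and time window."""
    out = []
    for r in records:
        if camera is not None and r["camera"] != camera:
            continue
        if tag is not None and tag not in r["tags"]:
            continue
        if start is not None and r["time"] < start:
            continue
        if end is not None and r["time"] > end:
            continue
        out.append(r)
    return out

# Usage: combine camera and tag criteria.
print(len(search(records, camera=17)))                    # 2
print(len(search(records, tag="red truck")))              # 2
print(len(search(records, camera=17, tag="red truck")))   # 1
```

Because each record is small text rather than video, streaming such entries live to a central database would indeed need only modest bandwidth, as the paragraph above suggests.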