Computer Vision
What is it
Computer vision is the field of computing and AI concerned with enabling computers to interpret images and video, turning pixels into semantic information such as identified assets, defects or events (for example infrastructure assets, rail defects or trespass events). In rail infrastructure monitoring, it means software analysing camera feeds from trains or fixed installations to detect assets and flag suspected hazards and condition changes for expert review.
Systems such as AIVR in Britain combine high‑definition video with location and telemetry so engineers can review, search and measure the railway remotely, significantly reducing the need for on-site track visits.
Computer vision is closely related to machine vision, which is the applied use of computer vision techniques for inspection, measurement and operational decision support, frequently in industrial environments.
Why it matters
Computer vision reduces ‘boots on ballast’, cutting railway workers’ exposure to the risks of the live railway while still giving maintainers detailed, frequent views of infrastructure. It also supports earlier, more targeted interventions by flagging vegetation, track defects or electrification issues before they become operational incidents. At industry level, it underpins a shift from periodic, manual inspection to continuous, data‑driven monitoring and to condition-based and predictive maintenance, improving both reliability and the efficient use of scarce access windows (possessions).
Where it is used
In the UK, Network Rail deploys computer‑vision‑enabled AIVR devices on infrastructure monitoring trains, Class 153 Visual Inspection Units and in‑service passenger fleets across all regions, with pioneering programmes on the North West & Central route for overhead line monitoring and conductor rail fault detection. Desktop Basic Visual Inspection (BVI) of switches and crossings using AIVR imagery is now routine, giving maintainers across the UK rapid visual access without site visits.
Internationally, North American freight railroads use wayside machine‑vision portals to inspect wagons and running gear at line speed. European and Israeli suppliers apply deep‑learning‑based vision to obstacle detection and driver assistance.
When: key dates
Automated image‑processing for track and pantograph-catenary inspection began to appear in research and pilot systems in the late 1990s and 2000s. In the early 2000s, freight rail trials in North America used machine vision to inspect trains at speed. In Britain, commercial deployment accelerated in the late 2010s: AIVR was launched in 2018 and then rapidly scaled, so that by the mid‑2020s it supported thousands of users and widespread remote visual inspection across the network.
How it works
Computer‑vision systems capture imagery from train‑mounted or lineside cameras and synchronise it with satellite, inertial and other positioning data and with railway referencing systems, so that every frame can be located on the network. Image processing techniques are then applied, and AI models trained on labelled examples of assets and defects classify components, detect anomalies and measure clearances or geometry, often in near real time. The processed outputs feed into web platforms and decision tools, where engineers filter by location, asset type or risk level, compare time series, and export evidence into workbanks, maintenance plans and incident investigations.
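The frame-location step described above can be illustrated with a minimal Python sketch. It assumes that positioning data arrives as timestamped fixes already converted to a distance along the track, and that a frame's position can be estimated by linear interpolation between the two surrounding fixes. The names PositionFix and locate_frame are hypothetical and chosen for illustration only; real systems fuse several positioning sources and are not described in this detail here.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class PositionFix:
    """One positioning sample (hypothetical structure for illustration)."""
    time_s: float   # timestamp of the fix, in seconds
    track_m: float  # distance along the track, in metres


def locate_frame(frame_time_s: float, fixes: list[PositionFix]) -> float:
    """Estimate a frame's track position by linear interpolation between fixes.

    `fixes` must be sorted by time. Frames outside the covered time range
    are rejected rather than extrapolated.
    """
    if len(fixes) < 2:
        raise ValueError("need at least two position fixes")
    times = [f.time_s for f in fixes]
    if not times[0] <= frame_time_s <= times[-1]:
        raise ValueError("frame time is outside the range of position fixes")
    # Index of the first fix strictly after the frame time, clamped so that a
    # frame exactly on the last fix still has a 'before' and an 'after' fix.
    hi = min(bisect_right(times, frame_time_s), len(fixes) - 1)
    before, after = fixes[hi - 1], fixes[hi]
    span = after.time_s - before.time_s
    if span == 0:
        return before.track_m
    fraction = (frame_time_s - before.time_s) / span
    return before.track_m + fraction * (after.track_m - before.track_m)


fixes = [PositionFix(0.0, 1000.0), PositionFix(1.0, 1030.0), PositionFix(2.0, 1062.0)]
print(locate_frame(0.5, fixes))  # halfway between the first two fixes
```

Rejecting frames that fall outside the range of fixes, rather than extrapolating, is a deliberate choice in this sketch: a frame with no trustworthy position is easier to handle downstream than one placed at a guessed location.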
Computer vision has traditionally used ‘pixel inspection’ algorithms, which apply hand-written rules to pixel values in order to interpret the image. Increasingly, it is carried out with machine learning techniques that consider the image as a whole and pass it through a neural network. More recently still, vision transformer architectures have been used for further interpretation of video imagery.
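As an illustration of the rule-based ‘pixel inspection’ style, the Python sketch below flags an image region for review when too large a share of its pixels is darker than a fixed threshold. The function names, the threshold of 60 and the 25% limit are all hypothetical values chosen for the example; they are not taken from AIVR or any other deployed system. A learned model would replace the hand-written rule with parameters fitted to labelled examples.

```python
def dark_fraction(image: list[list[int]], threshold: int) -> float:
    """Fraction of pixels darker than `threshold` in an 8-bit greyscale image."""
    total = sum(len(row) for row in image)
    if total == 0:
        raise ValueError("image is empty")
    dark = sum(1 for row in image for pixel in row if pixel < threshold)
    return dark / total


def flag_region(image: list[list[int]], threshold: int = 60,
                max_dark_fraction: float = 0.25) -> bool:
    """Hand-written rule: flag the region if too much of it is dark."""
    return dark_fraction(image, threshold) > max_dark_fraction


# Toy 4x4 region: a bright surface with a dark patch near the centre.
region = [
    [200, 198, 201, 199],
    [202,  30,  25, 200],
    [199,  28,  22, 201],
    [200,  35, 203, 198],
]
print(dark_fraction(region, 60))  # 5 of the 16 pixels are below the threshold
print(flag_region(region))
```

Rules of this kind are cheap to run and easy to explain, but they depend on fixed thresholds that may not hold when lighting or weather changes, which is one reason learned approaches are increasingly used.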
Computer vision can be applied to image and video data sensed across the electromagnetic spectrum. The data may come from sensors that target a particular part of the spectrum, such as specialist thermal or UV sensors, and from different types of capture, such as a fully compressed video stream from a standard IP camera or line-scanning imagery collected by a machine vision system.
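Line-scanning capture differs from ordinary video in that the sensor records one row of pixels at a time and the image is built up as the train moves. The Python sketch below shows the idea under simplifying assumptions: scan lines arrive in capture order with equal width, and the along-track spacing between rows is simply speed divided by line rate. The function names and the example speed and line rate are hypothetical.

```python
def along_track_spacing_m(speed_m_per_s: float, line_rate_hz: float) -> float:
    """Distance the train travels between consecutive scan lines, in metres."""
    if speed_m_per_s < 0 or line_rate_hz <= 0:
        raise ValueError("speed must be non-negative and line rate must be positive")
    return speed_m_per_s / line_rate_hz


def assemble_line_scan(lines: list[list[int]]) -> list[list[int]]:
    """Stack equal-width scan lines, in capture order, into a 2D image."""
    if not lines:
        raise ValueError("no scan lines supplied")
    width = len(lines[0])
    for index, line in enumerate(lines):
        if len(line) != width:
            raise ValueError(f"scan line {index} has {len(line)} pixels, expected {width}")
    # Copy each row so that later changes to the input do not alter the image.
    return [list(line) for line in lines]


scan_lines = [[10, 12, 11], [200, 198, 201], [9, 11, 10]]
image = assemble_line_scan(scan_lines)
print(len(image), "rows x", len(image[0]), "columns")

# Hypothetical figures: 50 m/s at 50,000 lines per second gives about 1 mm per row.
print(along_track_spacing_m(speed_m_per_s=50.0, line_rate_hz=50_000.0))
```

Because the row spacing depends on speed, a fixed line rate gives coarser along-track resolution as the train goes faster, which is one reason the choice of capture method depends on the use case.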
AIVR uses all of these techniques, because different use cases call for different approaches depending on the type of imagery, the compute available, the accuracy required and any applicable certification framework.