ISEP - Master's Dissertations
Browsing ISEP - Master's Dissertations by author "ABREU, DIOGO PAUPÉRIO ANTÓNIO DE"
- Computer Vision for Image-Based Automated Analogue Gauge Reading
  Publication. ABREU, DIOGO PAUPÉRIO ANTÓNIO DE; Viana, Paula Maria Marques Moura Gomes

  Analogue gauges display physical quantities through a needle moving over a marked scale and remain common in vehicles, medical devices, and industrial systems. In practice, readings are often recorded on site by a human operator. Replacing all gauges with networked sensors is possible but typically costly and disruptive. We developed a lightweight on-device system based on computer vision and machine learning. The pipeline operates in two stages. First, it detects the gauge within an image. Second, it estimates the value using ellipse and needle geometry combined with optical character recognition (OCR) and a robust angle-to-value mapping. The system runs fully offline on a mobile device, following a privacy-by-design approach: images remain on the device, no raw frames or identifiers are transmitted, and intermediate data are kept only for local inference. All quantitative results presented here refer to the desktop pipeline; the mobile application demonstrates on-device feasibility but is not benchmarked for accuracy or latency in this work.

  Two complementary datasets were used for evaluation. The Tailored Test Set contains controlled images with known pose, in which the needle is rotated through fixed angles, and is used to stress-test the reading stage under standardised conditions. The Unfiltered Test Set contains real, heterogeneous images with unknown orientation and scale, and is used to evaluate the complete pipeline under unconstrained conditions. For detection, YOLOv5-Lite-S was trained on the largest dataset used in this line of research to date, combining about ten thousand synthetic images with more than seven hundred real ones. This mixed training improved performance on a real-world benchmark of 568 images from mAP@0.50:0.95 = 30.2% to 36.8%, with mAP@0.50 = 82.7%.

  In task-based evaluation, the model achieved a mean detection rate of R_detection = 91.3% on the Unfiltered Test Set and 100% on the Tailored Test Set. On a shared subset from a prior benchmark, the detector reached 100% detection versus the 98.6% previously reported. For reading, the end-to-end pipeline completed 91.8% of cases on the Tailored set and 82.7% on the Unfiltered set. Within the Unfiltered set, 46.3% of all images (56.0% of completed reads) were within ±5% of full scale. On matched gauges, the system outperformed prior results on most metrics, including black-dial cases that are typically difficult due to low contrast and sparse labels, although unconstrained reading accuracy did not exceed the best published results in every scenario. The system is deployable, privacy-preserving, and has a low computational footprint, which is valuable where processing, connectivity, or data-governance constraints rule out cloud-based solutions. We also explored large multimodal models to assess their potential for understanding scale limits and units. Without any fine-tuning, the model correctly identified minimum and maximum values and units in 81.8% of gauges; it was not integrated into the pipeline, however, and final readings still rely on classical computer vision.

  In summary, a fully automatic detector–reader can be built with modest computational resources. Mixing synthetic and real data improves detection on real images. Reading is reliable under standardised conditions and feasible under unconstrained ones, with accuracy mainly limited by OCR quality and label sparsity. Future work should focus on on-device benchmarking, increased robustness, and the integration of more advanced OCR or multimodal models.
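The angle-to-value mapping mentioned in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the function name `angle_to_value` and its parameters are hypothetical, and it assumes a linear scale whose minimum/maximum angles and values have already been recovered (e.g. via ellipse geometry and OCR), with all angles measured in the same rotational direction as the scale sweep.

```python
def angle_to_value(needle_angle_deg, min_angle_deg, max_angle_deg,
                   min_value, max_value):
    """Map a needle angle to a reading by linear interpolation over the scale.

    Hypothetical sketch: assumes a linear gauge scale, with the scale's
    start/end angles and values known from earlier pipeline stages.
    """
    sweep = max_angle_deg - min_angle_deg          # total angular span of the scale
    frac = (needle_angle_deg - min_angle_deg) / sweep
    frac = min(max(frac, 0.0), 1.0)                # clamp to the physical scale range
    return min_value + frac * (max_value - min_value)

# Example: a 0-10 bar gauge whose scale spans 45° to 315°;
# a needle at 180° sits halfway along the sweep, i.e. 5.0 bar.
print(angle_to_value(180, 45, 315, 0.0, 10.0))
```

A nonlinear scale would instead interpolate between per-tick (angle, value) pairs read by OCR, but the clamped linear form above captures the basic idea.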
