Liu, Yang (2017-08). Human-Tool-Interaction-Based Action Recognition Framework for Automatic Construction Operation Monitoring. Doctoral Dissertation.
Monitoring activities on a construction jobsite is one of the most important tasks that a construction management team performs every day. Construction management teams monitor activities to ensure that a construction project progresses as scheduled and that the construction crew works properly in a safe working environment. However, site monitoring is often time-consuming. Various automated or semi-automated tracking approaches, such as radio frequency identification, Global Positioning System, ultra-wideband, barcodes, and laser scanning, have been introduced to better monitor activities on the construction site. Deploying and maintaining such techniques, however, require substantial involvement from specially trained professionals and can be costly. As an alternative way to monitor sites, object recognition and tracking have the advantage of requiring little human involvement and intervention. Recognizing construction crew activities nevertheless remains a challenge for existing methods, which have a high false recognition rate. This research proposes a new approach for recognizing construction personnel activity from still images or video frames. The new approach mimics the human thinking process by assuming that a construction worker performs a certain activity with a specific body pose using a specific tool. It consists of two recognition tasks: construction worker pose recognition and tool recognition. The two recognition tasks are connected in sequence through an interactive spatial relationship. The proposed method was developed into a computer application using MATLAB. It was compared against a benchmark method that uses only the construction worker's body pose for activity recognition. The benchmark method was also developed into a computer application with MATLAB. The proposed method and the benchmark method were tested with the same sample set containing 500 images of more than 10 different construction activities.
The experimental results show that the proposed framework achieved a higher reliability (precision value), a lower sensitivity (recall value), and an overall better performance (F1 score) than the benchmark method.
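The three metrics named above are related by standard definitions: precision is TP / (TP + FP), recall is TP / (TP + FN), and the F1 score is their harmonic mean. The minimal Python sketch below illustrates how a method can have higher precision and lower recall than a benchmark and still obtain a higher F1 score. The function name and all counts are hypothetical and chosen only for illustration; they are not results from the dissertation, and the original applications were written in MATLAB, not Python.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from true-positive, false-positive,
    and false-negative counts, using the standard definitions."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only (NOT from the dissertation).
proposed = precision_recall_f1(tp=70, fp=10, fn=30)
benchmark = precision_recall_f1(tp=75, fp=50, fn=25)

print("proposed  (precision, recall, F1):", proposed)
print("benchmark (precision, recall, F1):", benchmark)
```

With these made-up counts, the hypothetical "proposed" method makes far fewer false recognitions, so its gain in precision outweighs its small loss in recall when the two are combined into the F1 score.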