ImmersiveEnvs

Towards Situated Imaging: Bridging Augmented Reality and AI for Smart In-Situ Assistants

Project Overview

Integrating augmented reality (AR) with externally hosted computer vision (CV) models can provide enhanced AR experiences. For instance, by utilising an advanced object detection model, an AR system can recognise a range of predefined objects within the user’s immediate surroundings. However, existing AR-CV workflows rarely incorporate user-defined contextual information, which often comes in the form of multi-modal queries blending natural language and body language. Interpreting these intricate user queries, processing them through a sequence of deep learning models, and then adeptly visualising the outcomes remains a formidable challenge.

In this paper, we describe Situated Imaging (SI), an extensible array of techniques for in-situ interactive visual computing. We delineate the architecture of the Situated Imaging framework, which enhances the conventional AR-CV workflow by incorporating a range of advanced interactive and generative computer vision techniques. We also describe a demonstration implementation that illustrates the pipeline’s capabilities, enabling users to engage in activities such as labelling, highlighting, or generating content within a user-defined context. Furthermore, we provide initial guidance for tailoring this framework to example use cases and identify avenues for future research. Our model-agnostic Situated Imaging pipeline acts as a valuable starting point for both academic scholars and industry practitioners interested in enhancing the AR experience by incorporating computationally intensive AI models.
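To make the AR-CV workflow above concrete, the sketch below shows one way an AR client could offload object detection to an externally hosted CV service and receive results to render as in-situ overlays. It is a minimal illustration only: the Flask service, the /detect endpoint, the payload fields (frame, query), the torchvision detector, and the 0.5 confidence threshold are all assumptions for the sake of the example, not the Situated Imaging implementation, which is model-agnostic by design.

```python
# Minimal sketch of an externally hosted CV service that an AR client could call.
# The endpoint, payload fields, and choice of detector are illustrative assumptions.
import io

import torch
import torchvision
from flask import Flask, jsonify, request
from PIL import Image

app = Flask(__name__)

# Load a pretrained object detector once at startup. Any detector could be swapped in,
# since the pipeline described above is model-agnostic.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()
to_tensor = torchvision.transforms.functional.to_tensor


@app.route("/detect", methods=["POST"])
def detect():
    # The AR client posts a camera frame; the optional text query stands in for the
    # user-defined multi-modal context (e.g. "highlight the mugs on my desk").
    frame = Image.open(io.BytesIO(request.files["frame"].read())).convert("RGB")
    query = request.form.get("query", "")

    with torch.no_grad():
        prediction = model([to_tensor(frame)])[0]

    # Return boxes, labels, and scores for the AR client to visualise as overlays.
    detections = [
        {"box": box.tolist(), "label": int(label), "score": float(score)}
        for box, label, score in zip(
            prediction["boxes"], prediction["labels"], prediction["scores"]
        )
        if score > 0.5
    ]
    return jsonify({"query": query, "detections": detections})


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```

In a fuller Situated Imaging setting, the single detector here would be replaced by a sequence of interactive and generative models selected according to the interpreted user query.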

Project Members

Mingze Xi

Madhawa Perera

Stuart Anderson

Matt Adcock