Digitalization of the 4 Mountains Test (4MT) for Robot-Administered Spatial Memory Assessment
Background
The 4MT consists of:
- Topographical tasks: Computer-generated landscapes containing four mountains where the topography (geometry of the surface) can be varied
- Non-spatial information tasks: Tasks where non-spatial visual features can be independently manipulated
In both task types, the non-tested attributes (spatial/non-spatial) remain the same across four choices but differ from the sample. The changes in viewpoint and non-spatial properties between sample and target ensure that topographical tasks depend on matching allocentric topographical information rather than simple visual pattern matching. The 4MT is shown in Figure 1 and Figure 2.
Bottom: Examples of non-spatial and topographical items. In non-spatial tasks, participants matched scenes by features such as cloud cover, lighting, texture, and vegetation color (target bottom-left). In topographical tasks, matches were based solely on landscape layout, with viewpoint and non-spatial features varied (target top-left). Spatial, configural, and elemental foils occupy the top-right, bottom-left, and bottom-right positions, respectively.
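The item structure described above (one sample, one target, three foils, four choices on screen) could be represented roughly as follows. This is a minimal sketch, not the original test's implementation: the class names, field names, and image paths are hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from enum import Enum
from typing import List, Tuple


class TaskType(Enum):
    TOPOGRAPHICAL = "topographical"
    NON_SPATIAL = "non_spatial"


@dataclass(frozen=True)
class Item:
    """One 4MT item: a sample image plus one target and three foils.

    All fields are assumptions made for this sketch; image paths are placeholders.
    """
    task_type: TaskType
    sample: str
    target: str
    foils: Tuple[str, str, str]

    def __post_init__(self) -> None:
        # The 4MT presents four choices, so exactly three foils are expected.
        if len(self.foils) != 3:
            raise ValueError("an item needs exactly three foils")

    def choices(self, rng: random.Random) -> List[str]:
        """Return the four choices in an order shuffled by the given generator."""
        options = [self.target, *self.foils]
        rng.shuffle(options)
        return options

    def is_correct(self, chosen: str) -> bool:
        """A response is correct only if it selects the target image."""
        return chosen == self.target


# Usage: build one topographical item and present its choices.
item = Item(
    task_type=TaskType.TOPOGRAPHICAL,
    sample="sample.png",
    target="target.png",
    foils=("spatial_foil.png", "configural_foil.png", "elemental_foil.png"),
)
presented = item.choices(random.Random(0))
```

Passing the random generator in explicitly keeps the presentation order reproducible, which helps when sessions need to be replayed from a log.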
Objectives
- Design a Digital 4MT: Create a functional prototype where participants can complete the 4MT on a digital medium without human supervision
- Define the System Architecture: Clearly define the system's inputs (images, touch/voice responses, video monitoring) and outputs (performance metrics, logs, scores, user feedback) to ensure robust data collection
- Prepare for Robotic Integration: Design the system with a clear interface for future integration into a larger framework where a robot or virtual agent administers the test autonomously
- Level 1 (Advanced Stimuli) Implement Personalized Stimuli Generation: Enable the system to capture live scenes (e.g., in a user’s home) and automatically generate the necessary test images using multi-view image generation [2]
- Level 2 (Advanced Interaction) Integrate Multimodal Interaction: Use sensors for multimodal input (e.g., audio, visual, touch) to record responses and monitor user behavior so that the system can react to it [3]
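As an illustration of the inputs and outputs named in the objectives, the following sketch records one response per trial and derives simple performance metrics and log lines from those records. It is only one possible design under stated assumptions: the record fields, the modality labels, and the function names are hypothetical and not part of any existing 4MT software.

```python
import json
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class TrialRecord:
    """One participant response (all field names are assumptions of this sketch)."""
    item_id: str
    task_type: str          # e.g. "topographical" or "non_spatial"
    chosen_index: int       # position of the selected image, 0..3
    target_index: int       # position of the correct image, 0..3
    response_time_s: float  # seconds from stimulus onset to response
    modality: str           # e.g. "touch" or "voice"

    @property
    def correct(self) -> bool:
        return self.chosen_index == self.target_index


def summarize(records: List[TrialRecord]) -> Dict[str, dict]:
    """Aggregate accuracy and mean response time per task type."""
    summary = {}
    for task in sorted({r.task_type for r in records}):
        subset = [r for r in records if r.task_type == task]
        n_correct = sum(r.correct for r in subset)
        summary[task] = {
            "n_trials": len(subset),
            "n_correct": n_correct,
            "accuracy": n_correct / len(subset),
            "mean_response_time_s": sum(r.response_time_s for r in subset) / len(subset),
        }
    return summary


def to_log_line(record: TrialRecord) -> str:
    """Serialize one record as a single JSON line for an append-only log."""
    payload = asdict(record)
    payload["correct"] = record.correct
    return json.dumps(payload, sort_keys=True)


# Usage: two topographical trials, one answered correctly.
records = [
    TrialRecord("item01", "topographical", 2, 2, 2.0, "touch"),
    TrialRecord("item02", "topographical", 0, 3, 4.0, "voice"),
]
report = summarize(records)
log_lines = [to_log_line(r) for r in records]
```

Keeping the records as plain data, separate from the presentation layer, is one way to leave room for a robot or virtual agent to administer the test later, since the administering component only needs to produce `TrialRecord` objects.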
Level of Work
- This thesis is suitable for a Bachelor's or Master's student with an interest in Human-Computer/Robot Interaction (HCI/HRI) and its applications in the medical/care field. It combines software implementation with applied research questions on usability in design, software application development, and system integration
- This work requires experience with the Python programming language. Familiarity with ROS, deep learning techniques for computer vision, and code versioning (Git/GitLab) is advantageous
- This work will give you the opportunity to (1) gain experience in Python and ROS programming; (2) learn to use machine learning models for graphical manipulation; (3) practice scientific writing