The MACCHINA project advocates a new paradigm for multimodal problems, based on learning continuous embeddings that are spatially aware. The goal of MACCHINA is to create a system that conducts a natural language dialog with a user about the joint visual context. MACCHINA realizes this through four parts that all work together.
The four different parts in MACCHINA.
Integrating MACCHINA into self-driving cars would let passengers have a dialog with the car about its next action (e.g., where to park the car: in the sun or in the shade?).
Integrating MACCHINA into surgical robots could help surgeons during operations by allowing them to give verbal instructions. The robot might then be able not only to recognize but also to predict certain medical events.