Yoel Zeldes • 3/22/2018

Gated Multimodal Units for Information Fusion

This technical article details the Gated Multimodal Unit (GMU), a neural network component for multimodal information fusion. It explains the GMU's self-attention mechanism, which allows a model to dynamically weight input from different modalities (e.g., vision and text) based on their relevance. The post includes the model's equations and a practical implementation with a synthetic dataset to demonstrate how the GMU learns to ignore noisy input channels.
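As a rough illustration of the gating idea summarized above, here is a minimal NumPy sketch of a two-modality GMU forward pass. It assumes the commonly cited formulation: each modality is projected through a tanh layer, and a sigmoid gate computed from the concatenated inputs mixes the two projections. The function name `gmu_forward`, the weight names, and the dimensions are hypothetical choices for this example, not taken from the post, and biases are omitted for brevity.

```python
import numpy as np


def sigmoid(a):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-a))


def gmu_forward(x_v, x_t, W_v, W_t, W_z):
    """Fuse a visual input x_v and a textual input x_t (hypothetical helper).

    h_v = tanh(x_v W_v)          projected visual features
    h_t = tanh(x_t W_t)          projected textual features
    z   = sigmoid([x_v, x_t] W_z)  gate in (0, 1), one value per hidden unit
    h   = z * h_v + (1 - z) * h_t  gated mixture of the two modalities
    """
    h_v = np.tanh(x_v @ W_v)
    h_t = np.tanh(x_t @ W_t)
    z = sigmoid(np.concatenate([x_v, x_t], axis=-1) @ W_z)
    h = z * h_v + (1.0 - z) * h_t
    return h, z


# Illustrative sizes and random weights (assumptions, not values from the post).
rng = np.random.default_rng(0)
batch, d_v, d_t, d_h = 4, 3, 5, 2
x_v = rng.normal(size=(batch, d_v))
x_t = rng.normal(size=(batch, d_t))
W_v = rng.normal(size=(d_v, d_h))
W_t = rng.normal(size=(d_t, d_h))
W_z = rng.normal(size=(d_v + d_t, d_h))

h, z = gmu_forward(x_v, x_t, W_v, W_t, W_z)
```

A gate value near 1 lets the visual projection dominate a hidden unit, and a value near 0 lets the textual projection dominate. In a trained model, this is how a noisy channel could end up being ignored.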

#Neural Networks #Deep Learning #Tensorflow