I am currently trying to understand attention models. I need this specifically to build a machine comprehension model (a model that can find the answer to a question in a given comprehension passage), but I want to understand attention in general, not only for this task. I have a question about the attention layer, but first I would like to share what I have understood so far.
What I have understood so far (please correct me if I have made any mistakes)
I have been working from this online article: http://www.wildml.com/2016/01/attention-and-memory-in-deep-learning-and-nlp/
It gave a basic intuition of what the attention layer does: it lets the model focus on a specific part of the input in order to generate the output. Here is an image illustrating it - http://www.wildml.com/wp-content/uploads/2015/12/Screen-Shot-2015-12-30-at-1.51.19-PM.png
The pink shades are defined by the attention variables 'a(i)'. Together, these variables form a probability distribution: each one gives the probability that the model should focus on the area it represents. A general figure for any attention-based model is this - http://www.wildml.com/wp-content/uploads/2015/12/Screen-Shot-2015-12-30-at-1.16.08-PM.png - where the 'a's are the attention variables (weights). As I understand it, these variables are trained by the model.
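To make my understanding concrete, here is a minimal sketch in plain Python of how I picture the attention variables 'a(i)' acting as a probability distribution over the input positions. The names `softmax`, `attend`, `scores`, and `input_vectors` are my own, and the scores are made-up numbers. I am assuming the weights come out of a softmax over some scores, and I am not sure where those scores come from in a real model, which is part of what I am asking.

```python
import math

def softmax(scores):
    """Turn arbitrary real-valued scores into a probability distribution."""
    shift = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(scores, input_vectors):
    """Return the attention weights a(i) and the weighted sum of the inputs."""
    a = softmax(scores)
    dim = len(input_vectors[0])
    # Each output component is the a(i)-weighted average over input positions.
    context = [
        sum(a_i * vec[d] for a_i, vec in zip(a, input_vectors))
        for d in range(dim)
    ]
    return a, context

# Hypothetical example: three input positions, each a 2-dimensional vector.
input_vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = [2.0, 0.5, 0.1]  # made-up relevance scores, one per position

a, context = attend(scores, input_vectors)
print("attention weights:", a)   # non-negative and summing to 1
print("weighted sum:", context)  # dominated by the highest-scoring position
```

In this sketch the position with the largest score gets the largest weight, so it contributes most to the weighted sum that is used to generate the output.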
The question
Are the attention weights the same for every input (in machine comprehension, input = 'comprehension')? If so, then regardless of the comprehension, the model will focus on the same small part of every comprehension to find an answer, and the answer will not necessarily lie in the same part of every comprehension.
If the weights are different for different inputs, then how do we train them? As far as I know, in every neural network the weights are trained over all of the input data, irrespective of which input is being processed, i.e. the final weight values at the end of training are the same for all inputs. After training, these final values are the ones used in a real-time application.