Basically, my question is how to make my model work with infinite
vocabulary.
Trying to handle a truly infinite vocabulary directly would be unwise (how would you do optimization with such data?).
But you don't have to. Basically you're asking how to deal with unknown words.
One answer is to use a different representation for words: instead of representing them as one-hot vectors over some fixed vocabulary, you can use subword features such as characters or character n-grams. You can find papers under this terminology; they're also called character-level features.
For intuition, you could look at linguistic knowledge: most words aren't completely unrelated to other words, but are formed from more basic parts (morphemes), so an unseen word usually shares pieces with words the model has already seen.
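
To make the subword idea concrete, here is a minimal sketch of character n-gram features hashed into a fixed-size vector. The function names (`char_ngrams`, `featurize`, `shared_ngrams`), the `<`/`>` boundary markers, the n-gram range 3 to 5, and the bucket count are all my own assumptions for illustration, not something from a particular paper or library.

```python
import zlib


def char_ngrams(word, n_min=3, n_max=5):
    """Return the character n-grams of `word`, padded with boundary markers."""
    padded = f"<{word}>"  # '<' and '>' mark word start/end (an assumed convention)
    return [
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    ]


def featurize(word, num_buckets=1000):
    """Map any word, seen or unseen, to a fixed-size n-gram count vector.

    Each n-gram is hashed into one of `num_buckets` slots, so the input
    dimension stays constant no matter how many distinct words exist.
    """
    vec = [0] * num_buckets
    for gram in char_ngrams(word):
        # crc32 is used because it is deterministic across runs,
        # unlike Python's built-in hash() for strings.
        vec[zlib.crc32(gram.encode("utf-8")) % num_buckets] += 1
    return vec


def shared_ngrams(a, b):
    """N-grams two words have in common (a rough proxy for shared morphemes)."""
    return set(char_ngrams(a)) & set(char_ngrams(b))
```

For example, `shared_ngrams("walking", "talking")` includes `"ing>"`, so a word never seen in training still activates features that known words have trained. The vector from `featurize` can be fed to the model in place of a one-hot vector; hash collisions are possible, and a larger `num_buckets` makes them rarer.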