"Is there a better word for that distribution?"
There's a worthwhile distinction here between using words to describe the properties of the distribution, versus trying to find a "name" for the distribution so that you can identify it as (approximately) an instance of a particular standard distribution: one for which a formula or statistical tables might exist for its distribution function, and for which you could estimate its parameters. In this latter case, you are likely using the named distribution, e.g. "normal/Gaussian" (the two terms are generally synonymous), as a model that captures some of the key features of your data, rather than claiming the population your data is drawn from exactly follows that theoretical distribution. To slightly misquote George Box, all models are "wrong", but some are useful. If you are thinking about the modelling approach, it is worth considering what features you want to incorporate and how complicated or parsimonious you want your model to be.
Being positively skewed is a property of the distribution, but stating it doesn't come close to specifying which off-the-shelf distribution is "the" appropriate model. It does rule out some candidates: for example, the Gaussian (i.e. normal) distribution has zero skew, so it will not be appropriate to model your data if the skew is an important feature. There may be other properties of the data that are important to you too, e.g. that it's unimodal (has just one peak), or that it is bounded between 0 and 24 hours (or between 0 and 1, if you are writing it as a fraction of the day), or that there is a probability mass concentrated at zero (since there are people who do not watch YouTube at all on a given day). You may also be interested in other properties like the kurtosis. And it is worth bearing in mind that even if your distribution had a "hump" or "bell-curve" shape and had zero or near-zero skew, it doesn't automatically follow that the normal distribution is "correct" for it! On the other hand, even if the population your data is drawn from actually did follow a particular distribution precisely, due to sampling error your dataset may not quite resemble it. Small data sets are likely to be "noisy", and it may be unclear whether certain features you can see, e.g. additional small humps or asymmetric tails, are properties of the underlying population the data was drawn from (and perhaps therefore ought to be incorporated in your model) or whether they are just artefacts from your particular sample (and for modelling purposes should be ignored). If you have a small data set and the skew is close to zero, then it is even plausible the underlying distribution is actually symmetric.
The larger your data set and the larger the skewness, the less plausible this becomes. You could perform a significance test to see how strong the evidence in your data is for skewness in the population it was drawn from, but this may be missing the point as to whether a normal (or other zero skew) distribution is appropriate as a model ...
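To get a feel for how noisy the sample skewness is, here is a minimal simulation sketch. It assumes a standard normal population (which has exactly zero skew), and the sample sizes, replicate count and function names are arbitrary choices of mine for illustration, not anything from your data.

```python
import random
import statistics

def sample_skewness(xs):
    """Moment-based sample skewness m3 / m2**1.5 (no small-sample correction)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    return m3 / m2 ** 1.5

def skewness_spread(n, replicates=200, seed=0):
    """Standard deviation of the sample skewness across repeated samples
    of size n drawn from a symmetric (standard normal) population."""
    rng = random.Random(seed)
    skews = [
        sample_skewness([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(replicates)
    ]
    return statistics.stdev(skews)

spread_small = skewness_spread(n=20)
spread_large = skewness_spread(n=2000)
print(f"spread of sample skewness, n=20:   {spread_small:.3f}")
print(f"spread of sample skewness, n=2000: {spread_large:.3f}")
```

The population here is perfectly symmetric, yet small samples routinely show a sample skewness well away from zero, while large samples stay close to it.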
Which properties of the data really matter for the purposes for which you intend to model it? Note that if the skew is reasonably small and you do not care very much about it, even if the underlying population is genuinely skewed, then you might still find the normal distribution a useful model to approximate this true distribution of watching times. But you should check that this doesn't end up making silly predictions. Because a normal distribution has no highest or lowest possible value, your model will always predict some probability of watching for a negative number of hours per day, or for more than 24 hours, even though extremely high or low values become increasingly unlikely. This gets more problematic for you if the predicted probability of such impossible events becomes high. A symmetric distribution like the normal will also predict that as many people watch for longer than, e.g., 1.5 times the mean as watch for less than 0.5 times the mean. If watching times are very skewed, then this kind of prediction may also be so implausible as to be silly, and give you misleading results if you are taking the results of your model and using them as inputs for some other purpose (for instance, you're running a simulation of watching times in order to calculate optimal advertisement scheduling). If the skewness is so noteworthy you want to capture it as part of your model, then the skew normal distribution may be more appropriate. If you want to capture both skewness and kurtosis, then consider the skewed t. If you want to incorporate the physically possible upper and lower bounds, then consider using the truncated versions of these distributions. Many other probability distributions exist that can be skewed and unimodal (for appropriate parameter choices) such as the F or gamma distributions, and again you can truncate these so they do not predict impossibly high watching times.
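As a rough sanity check on a normal model, you can compute how much probability it assigns to impossible watching times. The sketch below uses Python's standard library; the mean and standard deviation are hypothetical values I made up for illustration, so substitute the estimates from your own data.

```python
from statistics import NormalDist

# Hypothetical summary statistics for daily watching time, in hours.
# These numbers are invented for illustration, not taken from real data.
mean_hours = 1.5
sd_hours = 1.2

model = NormalDist(mu=mean_hours, sigma=sd_hours)

# Probability the normal model assigns to impossible values.
p_negative = model.cdf(0.0)          # watching for fewer than 0 hours
p_over_24 = 1.0 - model.cdf(24.0)    # watching for more than 24 hours

# Symmetry: the model predicts equal shares above 1.5x and below 0.5x the mean.
p_above = 1.0 - model.cdf(1.5 * mean_hours)
p_below = model.cdf(0.5 * mean_hours)

print(f"P(hours < 0)  = {p_negative:.4f}")
print(f"P(hours > 24) = {p_over_24:.2e}")
print(f"P(> 1.5 x mean) = {p_above:.4f}, P(< 0.5 x mean) = {p_below:.4f}")
```

With these made-up values, roughly a tenth of the model's probability falls on negative watching times, which would be hard to ignore; with a mean several standard deviations above zero, the same check might show the problem is negligible.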
A beta distribution may be a good choice if you are modelling the fraction of the day spent watching, as this is always bounded between 0 and 1 without further truncation being necessary. If you want to incorporate the concentration of probability at exactly zero due to non-watchers, then consider building in a hurdle model.
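To make the hurdle idea concrete, here is a minimal two-part sketch: one parameter for the probability of an exact zero, and a beta distribution for the positive fractions. The "true" parameter values and the simulated data are assumptions invented for the example, and the beta parameters are fitted by the method of moments purely for simplicity (maximum likelihood would be the more usual choice).

```python
import random
from statistics import fmean, pvariance

rng = random.Random(42)

# Hypothetical "true" parameters, chosen only to generate example data.
TRUE_P_ZERO = 0.3            # share of people who do not watch at all
TRUE_A, TRUE_B = 2.0, 20.0   # beta shape parameters for those who do watch

def simulate_fraction():
    """One person's fraction of the day spent watching, under the hurdle model."""
    if rng.random() < TRUE_P_ZERO:
        return 0.0
    return rng.betavariate(TRUE_A, TRUE_B)

fractions = [simulate_fraction() for _ in range(20_000)]

# Part 1 of the hurdle model: the probability of an exact zero.
positives = [f for f in fractions if f > 0.0]
p_zero_hat = 1.0 - len(positives) / len(fractions)

# Part 2: a beta distribution for the positive values, fitted by the
# method of moments (simple, but not the maximum likelihood estimate).
m = fmean(positives)
v = pvariance(positives, mu=m)
common = m * (1.0 - m) / v - 1.0
a_hat, b_hat = m * common, (1.0 - m) * common

print(f"estimated P(zero) = {p_zero_hat:.3f}")
print(f"estimated beta parameters: a = {a_hat:.2f}, b = {b_hat:.2f}")
```

The two parts are estimated separately, which is what makes hurdle models convenient: the zeros tell you about who watches at all, and only the positive values inform the shape of the beta component.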
But at the point you are trying to throw in every feature you can identify from your data, and build an ever more sophisticated model, perhaps you should ask yourself why you are doing this? Would there be an advantage to a simpler model, for example it being easier to work with mathematically or having fewer parameters to estimate? If you are concerned that such simplification will leave you unable to capture all of the properties of interest to you, it may well be that no "off-the-shelf" distribution does quite what you want. However, we are not restricted to working with named distributions whose mathematical properties have been elucidated previously. Instead, consider using your data to construct an empirical distribution function. This will capture all the behaviour that was present in your data, but you can no longer give it a name like "normal" or "gamma", nor can you apply mathematical properties that pertain only to a particular distribution. For instance, the "95% of the data lies within 1.96 standard deviations of the mean" rule is for normally distributed data and may not apply to your distribution; though note that some rules apply to all distributions, e.g. Chebyshev's inequality guarantees at least 75% of your data must lie within two standard deviations of the mean, regardless of the skew. Unfortunately the empirical distribution will also inherit all those properties of your data set arising purely by sampling error, not just those possessed by the underlying population, so you may find a histogram of your empirical distribution has some humps and dips that the population itself does not. You may want to investigate smoothed empirical distribution functions, or better yet, increasing your sample size.
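An empirical distribution function is simple to build yourself. The sketch below does so with the standard library and then checks the share of observations within two standard deviations of the mean against Chebyshev's bound. The gamma-distributed sample is a hypothetical stand-in for your watching times, and `make_ecdf` and `share_within` are helper names I have made up.

```python
import bisect
import random
from statistics import fmean, pstdev

def make_ecdf(data):
    """Return a function giving the proportion of observations <= x."""
    ordered = sorted(data)
    n = len(ordered)

    def ecdf(x):
        return bisect.bisect_right(ordered, x) / n

    return ecdf

rng = random.Random(1)
# Hypothetical skewed sample of daily watching times in hours.
hours = [rng.gammavariate(2.0, 0.75) for _ in range(500)]

ecdf = make_ecdf(hours)
mean = fmean(hours)
sd = pstdev(hours)  # population-style SD, so Chebyshev holds exactly for the sample

def share_within(k):
    """Proportion of observations within k standard deviations of the mean."""
    return sum(abs(x - mean) <= k * sd for x in hours) / len(hours)

print(f"proportion watching at most 1 hour: {ecdf(1.0):.3f}")
print(f"share within 1.96 SD of the mean:   {share_within(1.96):.3f}")
print(f"share within 2 SD of the mean:      {share_within(2.0):.3f} (Chebyshev: >= 0.75)")
```

The share within 1.96 standard deviations need not be 95% for skewed data like this, but the share within two standard deviations can never drop below 75%, whatever the shape of the sample.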
In summary: although the normal distribution has zero skew, the fact your data are skewed doesn't rule out the normal distribution as a useful model, though it does suggest some other distribution may be more appropriate. You should consider other properties of the data when choosing your model, besides the skew, and consider too the purposes you are going to use the model for. It's safe to say that your true population of watching times does not exactly follow some famous, named distribution, but this does not mean such a distribution is doomed to be useless as a model. However, for some purposes you may prefer to just use the empirical distribution itself, rather than try fitting a standard distribution to it.