Questions tagged [unsupervised-learning]

Finding hidden (statistical) structure in unlabelled data, including clustering and feature extraction for dimensionality reduction.

Because the items are unlabelled, there's nothing that points toward the "correct" labels, as there is with supervised learning. Unsupervised learning uses methods like clustering and principal components analysis to discover structure.
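
As a rough illustration of discovering structure without labels, the sketch below runs a bare-bones k-means (Lloyd's algorithm) in pure Python on made-up 2-D points. The function name `kmeans`, the helper `sq_dist`, the toy data, and the choice to seed the centers with the first k points are all assumptions made for this example, not part of any library. Real implementations typically use random or k-means++ initialisation.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100):
    """Plain Lloyd's algorithm; returns (labels, centers)."""
    # Assumption for this sketch: seed the centers with the first k points.
    centers = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda j: sq_dist(p, centers[j]))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centers.append(mean)
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break  # converged: centers no longer move
        centers = new_centers
    return labels, centers


# Hypothetical toy data: two well-separated groups, no labels supplied.
data = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3),
        (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
labels, centers = kmeans(data, k=2)
print(labels)
```

On this toy data the first three points end up sharing one label and the last three the other, even though both initial centers start in the same group. The label numbers themselves are arbitrary, which is one reason evaluating a clustering is harder than evaluating a classifier.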

Reference:
Wikipedia - Unsupervised learning

637 questions
89
votes
4 answers

How to produce a pretty plot of the results of k-means cluster analysis?

I'm using R to do K-means clustering. I'm using 14 variables to run K-means. What is a pretty way to plot the results of K-means? Are there any existing implementations? Does having 14 variables complicate plotting the results? I found something…
71
votes
2 answers

Performance metrics to evaluate unsupervised learning

With respect to unsupervised learning (such as clustering), are there any metrics to evaluate performance?
user3125
69
votes
2 answers

How can an artificial neural network (ANN) be used for unsupervised clustering?

I understand how an artificial neural network (ANN) can be trained in a supervised manner using backpropagation to improve the fit by decreasing the error in the predictions. I have heard that an ANN can be used for unsupervised learning but…
51
votes
2 answers

Choosing the right linkage method for hierarchical clustering

I am performing hierarchical clustering on data I've gathered and processed from the reddit data dump on Google BigQuery. My process is the following: get the latest 1000 posts in /r/politics; gather all the comments; process the data and compute an…
42
votes
4 answers

Evaluation measures of goodness or validity of clustering (without having truth labels)

I'm clustering a set of data, but I don't have a truth document that allows me to evaluate the result of the clustering (I have unlabelled data), so I cannot use an external evaluation measure. In this case, are there any efficient evaluation measures -…
shn
38
votes
4 answers

What are the differences between sparse coding and autoencoder?

Sparse coding is defined as learning an over-complete set of basis vectors to represent input vectors (<-- why do we want this?). What are the differences between sparse coding and an autoencoder? When would we use sparse coding, and when an autoencoder?
30
votes
5 answers

Clustering procedure where each cluster has an equal number of points?

I have some points $X=\{x_1,\dots,x_n\}$ in $\mathbb{R}^p$, and I want to cluster the points so that: (1) each cluster contains an equal number of elements of $X$ (assume that the number of clusters divides $n$); (2) each cluster is "spatially cohesive" in some…
30
votes
1 answer

Distant supervision: supervised, semi-supervised, or both?

"Distant supervision" is a learning scheme in which a classifier is learned given a weakly labeled training set (training data is labeled automatically based on heuristics / rules). I think that both supervised learning, and semi-supervised…
30
votes
5 answers

Distinguishing between two groups in statistics and machine learning: hypothesis test vs. classification vs. clustering

Assume I have two data groups, labeled A and B (each containing e.g. 200 samples and 1 feature), and I want to know if they are different. I could: a) perform a statistical test (e.g. t-test) to see if they are statistically different. b) use…
30
votes
2 answers

Supervised learning, unsupervised learning and reinforcement learning: Workflow basics

Supervised learning: 1) A human builds a classifier based on input and output data. 2) That classifier is trained with a training set of data. 3) That classifier is tested with a test set of data. 4) Deployment if the output is satisfactory. To be used…
29
votes
3 answers

Unsupervised, supervised and semi-supervised learning

In the context of machine learning, what is the difference between unsupervised learning, supervised learning, and semi-supervised learning? And what are some of the main algorithmic approaches to look at?
28
votes
2 answers

What's the intuition behind contrastive learning or approach?

Maybe a noob's query, but recently I have seen a surge of papers on contrastive learning (a subset of semi-supervised learning). Some of the prominent recent research papers I have read that detail this approach are: Representation…
28
votes
4 answers

Supervised clustering or classification?

My second question: I found a discussion somewhere on the web about "supervised clustering". As far as I know, clustering is unsupervised, so what exactly is the meaning of "supervised clustering"? What is the difference with…
26
votes
2 answers

Generative vs discriminative models (in Bayesian context)

What are the differences between generative and discriminative (discriminant) models (in the context of Bayesian learning and inference)? And how do they relate to prediction, decision theory, or unsupervised learning?
nkint
25
votes
3 answers

How to choose an optimal number of latent factors in non-negative matrix factorization?

Given a matrix $\mathbf V^{m \times n}$, Non-negative Matrix Factorization (NMF) finds two non-negative matrices $\mathbf W^{m \times k}$ and $\mathbf H^{k \times n}$ (i.e. with all elements $\ge 0$) to represent the decomposed matrix as: $$\mathbf…