If you are familiar with machine learning algorithms you’d know that in most of the supervised learning problems what we essentially do is, we predict a target variable based on known variables. And target variables can again be of two types:
1. Target variables that can take only nominal values: True/False; Good/Bad/Average -This is what we call classification as those nominal values can be considered as classes
2. Target variables that can take infinite number of numeric values- This is what we call as regression
There is another set of problems where you don’t need to predict any target value. They fall under unsupervised learning.
Here also you have two approaches: either you try to fit your data into some discrete groups called clusters. Or you find out how strong the fit is into each group.
Forming clusters with the data is known as clustering. Making a numerical estimate for the fit comes under density estimation algorithm.
Density estimation is a simple concept and most of us are familiar with one such technique called histograms. A major problem with histograms, however, is that the choice of binning can have a disproportionate effect on the resulting visualization. This post from scikit-learn gives you a good understanding of density estimation.