Density-based clustering (DBSCAN)
labels = dbscan(X, eps, min_samples)
[labels, core_idx] = dbscan(X, eps, min_samples)
X: an N x D real matrix (N samples, D features).
eps: a positive scalar (default = 0.5). The neighborhood radius: points within distance eps of each other are considered neighbors.
min_samples: an integer (default = 5). The minimum number of points required within the eps neighborhood for a point to be considered a core point.
labels: an N x 1 column vector of integers giving the cluster assignment of each point. The value -1 indicates a noise point.
core_idx: a vector containing the indices of the core points.
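As a rough illustration of how the two outputs might be read, the sketch below runs dbscan on a small hand-made data set and summarizes the result. The data, the parameter values and the variable names noise_idx, cluster_ids and n_clusters are illustrative assumptions, not part of the function's interface; it also assumes that core_idx refers to rows of X.

```scilab
// Illustrative data: two tight groups of five points plus one isolated point.
X = [0 0; 0.1 0; 0 0.1; 0.1 0.1; 0.05 0.05; ...
     3 3; 3.1 3; 3 3.1; 3.1 3.1; 3.05 3.05; ...
     10 10];

// eps = 0.5 and min_samples = 3 are arbitrary choices for this sketch.
[labels, core_idx] = dbscan(X, 0.5, 3);

// Rows of X flagged as noise (label -1).
noise_idx = find(labels == -1);

// Distinct cluster labels, ignoring the noise label.
cluster_ids = unique(labels(labels <> -1));
n_clusters = size(cluster_ids, "*");

mprintf("%d clusters, %d noise points, %d core points\n", ...
        n_clusters, size(noise_idx, "*"), size(core_idx, "*"));
```

Filtering on labels <> -1 before counting keeps the noise label from being mistaken for a cluster.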
The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) algorithm groups
points that are close to each other into clusters, based on local density. Points that are too far from others
are considered noise.
It is controlled by two parameters:
eps: the maximum distance between two points for them to be considered part of the same neighborhood.
min_samples: the minimum number of points needed to form a dense region.
Each cluster is formed around core points, which are points that
have at least min_samples neighbors within distance eps.
Points that lie within eps of a core point but do not have enough neighbors
to be core points themselves are called border points. All remaining points are labeled as noise.
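The core, border and noise categories described above can be sketched directly from a pairwise distance matrix. This is only a minimal illustration of the definitions, not the implementation used by dbscan: the toy data and the names radius, counts, is_core, near_core, is_border and is_noise are hypothetical, and it assumes that a point counts as its own neighbor, which may differ from the convention dbscan follows.

```scilab
// Hypothetical toy data: four tightly packed points, one point on the
// edge of that group, and one isolated point.
X = [0 0; 0.1 0; 0 0.1; 0.1 0.1; 0.35 0; 5 5];
radius = 0.3;        // plays the role of eps
min_samples = 4;

// Pairwise Euclidean distances between all rows of X.
N = size(X, 1);
sq = sum(X.^2, "c");
D2 = sq * ones(1, N) + ones(N, 1) * sq' - 2 * X * X';
D = sqrt(max(D2, 0));   // clamp tiny negative rounding errors

// Assumption: a point counts as its own neighbor.
counts = sum(bool2s(D <= radius), "c");
is_core = counts >= min_samples;

// Border points: not core, but within radius of at least one core point.
near_core = or(D(:, is_core) <= radius, "c");
is_border = ~is_core & near_core;

// Everything else is noise.
is_noise = ~is_core & ~near_core;

disp(find(is_core));    // rows 1 to 4
disp(find(is_border));  // row 5
disp(find(is_noise));   // row 6
```

The full distance matrix needs N x N memory, so this form is only suitable for small data sets; it is meant to make the three categories concrete rather than to be efficient.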
Unlike kmeans, the DBSCAN algorithm does not require the number of clusters to be specified in advance, and it can detect clusters of arbitrary shapes.
Two compact clusters with noise
    rand("seed", 0);
    n = 50;
    X = [rand(n,2); rand(n,2)+3];
    X = [X; 6*rand(10,2)];
    labels = dbscan(X, 0.5, 5);
    scf();
    gca().isoview = "on";
    scatter(X(:,1), X(:,2), [], labels, "fill");
    xtitle("Two compact clusters with noise");
Three clusters of different densities
    rand("seed", 0);
    X1 = 0.3*rand(100,2);
    X2 = rand(50,2) + 3;
    X3 = 1.8*rand(100,2) - 2;
    X = [X1; X2; X3];
    labels = dbscan(X, 0.4, 5);
    scf();
    scatter(X(:,1), X(:,2), [], labels, "fill");
    xtitle("Three clusters of different densities");
    gca().isoview = "on";
Half-moon shaped data
    rand("seed", 0);
    n = 100;
    t = linspace(0, %pi, n)';
    X1 = [cos(t), sin(t)] + 0.05*rand(n,2);
    X2 = [1-cos(t), -sin(t)-0.5] + 0.05*rand(n,2);
    X = [X1; X2];
    labels = dbscan(X, 0.2, 5);
    scf();
    gca().isoview = "on";
    scatter(X(:,1), X(:,2), [], labels, "fill");
Circular cluster with noise
    rand("seed", 0);
    theta = 2*%pi*rand(200,1);
    r = 1 + 0.1*rand(200,1);
    X1 = [r.*cos(theta), r.*sin(theta)]; // circle
    X2 = 3*(rand(30,2)-0.5); // noise
    X = [X1; X2];
    labels = dbscan(X, 0.2, 5);
    scf();
    gca().isoview = "on";
    scatter(X(:,1), X(:,2), [], labels, "fill");
    xtitle("Circular cluster with noise");
Nested spirals
    rand("seed", 0);
    t = linspace(0, 4*%pi, 200)';
    r = linspace(0.1, 1, 200)';
    X1 = [r.*cos(t), r.*sin(t)] + 0.02*rand(200,2);
    X2 = [r.*cos(t+%pi), r.*sin(t+%pi)] + 0.02*rand(200,2);
    X = [X1; X2];
    labels = dbscan(X, 0.15, 5);
    scf();
    gca().isoview = "on";
    scatter(X(:,1), X(:,2), [], labels, "fill");
    xtitle("Nested spirals");

| Version | Description |
| --- | --- |
| 2026.0.0 | Function added. |