# K-Means service specifications

## Service Description

The **K-means Unsupervised Classifier (K-means)** processing service derives a classification map from a set of calibrated single-band assets from the same mission.

The classification algorithm minimises a criterion known as the inertia, or within-cluster sum of squares. When the assets come from multiple Datasets, the processor co-locates all input single-band assets to generate an image stack. K-means then classifies the image stack into N clusters, where N is the number of classes specified by the user. To speed up the k-means computation, the service can optionally apply the Principal Component Analysis (PCA) dimensionality reduction algorithm.
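
To illustrate the inertia criterion, the following minimal sketch runs Lloyd's algorithm, the textbook k-means iteration, on a handful of per-pixel band vectors and reports the resulting inertia. It is not the service's implementation: the function names, the random initialisation, the fixed iteration count, and the sample pixel values are all assumptions made for this example.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two band vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(pixels, n_classes, n_iter=20, seed=0):
    """Cluster per-pixel band vectors; returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    # Assumed initialisation: pick n_classes distinct pixels at random.
    centroids = rng.sample(pixels, n_classes)
    labels = []
    for _ in range(n_iter):
        # Assignment step: each pixel joins its nearest centroid.
        labels = [
            min(range(n_classes), key=lambda k: squared_distance(p, centroids[k]))
            for p in pixels
        ]
        # Update step: each centroid moves to the mean of its members.
        for k in range(n_classes):
            members = [p for p, label in zip(pixels, labels) if label == k]
            if members:
                centroids[k] = tuple(sum(band) / len(members) for band in zip(*members))
    # Inertia: within-cluster sum of squared distances to the centroids.
    inertia = sum(
        squared_distance(p, centroids[label]) for p, label in zip(pixels, labels)
    )
    return labels, centroids, inertia


# Hypothetical two-band image stack with two obvious groups of pixels.
pixels = [(0.10, 0.20), (0.12, 0.19), (0.11, 0.21),
          (0.80, 0.90), (0.82, 0.88), (0.79, 0.91)]
labels, centroids, inertia = kmeans(pixels, n_classes=2)
print(labels, round(inertia, 4))
```

Each pixel is treated as a vector with one value per co-located band, which is why the co-location step has to happen before clustering.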

In Earth Observation applications, PCA is used to reduce the number of bands needed for a given analysis (e.g. classification), because in a multi-spectral satellite image several bands may contain similar information, in particular at close wavelengths. PCA reduces this redundancy by comparing the spectral information in each band with that in every other band and applying an orthogonal transformation, so that the first principal component (PC) captures the greatest variance of the data, the second PC the second greatest variance, and so on. PCs are linear combinations of the input bands, sorted by decreasing eigenvalue (PC1, PC2, etc.).
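
The transformation described above can be written compactly in the standard covariance-based formulation of PCA. The notation is introduced here for illustration and is not taken from the service: $X_c$ is the mean-centred image stack with $n$ pixels as rows and $B$ bands as columns, $v_i$ are eigenvectors and $\lambda_i$ eigenvalues. Whether the service also standardises the bands before the decomposition is not stated, so this is a sketch under that assumption.

```latex
% X_c: n x B matrix of mean-centred pixels (n pixels, B bands)
C = \frac{1}{n-1} X_c^{\top} X_c, \qquad
C\,v_i = \lambda_i v_i, \qquad
\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_B \ge 0
\\
\mathrm{PC}_i = X_c\,v_i, \qquad
\text{fraction of variance captured by } \mathrm{PC}_i
  = \frac{\lambda_i}{\sum_{j=1}^{B} \lambda_j}
```

Keeping only the first few PCs retains the directions with the largest $\lambda_i$, which is what makes the subsequent clustering cheaper.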

In this service, PCA can be applied before k-means clustering so that the clustering works only on PCA-reduced EO data. For example, k-means clustering into N classes can be done using only principal components 1 and 2 derived from the input single-band assets.

The output of the service is a classification map with N classes. In the K-means unsupervised classification the user can define up to 12 classes. The output K-means classification map is offered with a qualitative color scheme, as shown in the legend below.

## Inputs

The **K-means** service requires as input one or more calibrated Datasets from the same mission or constellation.

## Parameters

The **K-means** service takes a number of mandatory and optional parameters. Table 1 describes the K-means service parameters.

Parameter | Description | Required | Default value |
---|---|---|---|
Input reference product(s) | Reference to the input product(s) to be used in the k-means unsupervised classification. If more than one product reference is given, the products are co-located to obtain an image stack on the same grid. | YES | |
List(s) of comma separated assets | List of single-band assets to be extracted from the input product reference(s) and used in the k-means classification. | YES | |
Number of classes | This parameter specifies the number of classes (N_C) to be used in the k-means classification. N_C>1 and N_C<=12. | YES | 5 |
Number of PCs to be used in the EO data reduction | This optional parameter defines the number of PCs (N_PC) to be employed in the k-means classification with PCA EO data reduction. N_PC>1 and N_PC<=3. | NO | |
Area of Interest | This optional parameter defines the area of interest expressed as a Well-Known Text value. If set, it overrides the automatic determination of the maximum common area between the geometries of the input reference products. | YES | |

*Table 1 - Service parameters for the K-means processor.*

### Input product references

The reference(s) to the input Calibrated Datasets containing the single-band assets to be used in the K-means classification.

### List of comma-separated bands

This second mandatory parameter is a comma-separated list of common band names (CBN). It identifies the single-band geophysical assets to be used for the co-location.

Example

To use a Sigma0 single-band asset from SAR data in X-band and HH polarization (e.g. **s0_db_x_hh**) from a **single Radar Calibrated Dataset**, the user shall define **1** input asset in K-means as follows:

```
s0_db_x_hh
```

Example

To use multiple reflectance single-band assets from VIS and NIR (e.g. **blue**, **green**, **red**, and **nir**) from **two Optical Calibrated Datasets**, the user shall define **4** input assets in K-means as follows:

```
blue,green,red,nir
```
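
As a small illustration of how such a value can be interpreted, the sketch below splits a comma-separated asset list into individual common band names. The helper name `parse_assets` is hypothetical, and trimming surrounding whitespace and rejecting empty entries are assumptions, since the service's own parsing rules are not documented here.

```python
def parse_assets(value):
    """Split a comma-separated list of common band names into a list."""
    # Assumption: whitespace around names is not significant.
    assets = [name.strip() for name in value.split(",")]
    if any(not name for name in assets):
        raise ValueError(f"empty asset name in {value!r}")
    return assets


print(parse_assets("s0_db_x_hh"))
print(parse_assets("blue,green,red,nir"))
```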

### Number of classes

This third mandatory parameter specifies the Number of Classes (N_C) to be used in the k-means classification.

Warning

The Number of Classes (N_C) shall satisfy `N_C` > **1** and `N_C` <= **12**.

### Number of PCs to be used in the EO data reduction

If needed, the K-means service offers the possibility to employ the Principal Component Analysis (PCA) dimensionality reduction algorithm. This **optional parameter** defines the number of Principal Components (N_PC) to be employed in the k-means classification with PCA EO data reduction. For example, if `N_PC` is equal to **2**, the first 2 PCs are used in the image classification instead of all input assets.

Warning

The number of Principal Components (N_PC) shall satisfy `N_PC` > **1** and `N_PC` <= **3**.

Note

If the number of Principal Components (N_PC) is not specified, the k-means unsupervised classification is made without PCA dimensionality reduction.
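
The constraints on N_C and N_PC can be summarised as a small validation helper. This is only a sketch: the function name `validate_parameters` and the returned dictionary are hypothetical, and only the numeric ranges, the default of 5 classes, and the "no PCA when N_PC is unset" rule come from this page.

```python
def validate_parameters(n_classes=5, n_pcs=None):
    """Check N_C and N_PC against the documented ranges."""
    # N_C > 1 and N_C <= 12 (default 5).
    if not 1 < n_classes <= 12:
        raise ValueError(f"N_C must satisfy 1 < N_C <= 12, got {n_classes}")
    # N_PC is optional; when given, N_PC > 1 and N_PC <= 3.
    if n_pcs is not None and not 1 < n_pcs <= 3:
        raise ValueError(f"N_PC must satisfy 1 < N_PC <= 3, got {n_pcs}")
    # Without N_PC, classification runs on all input assets (no PCA).
    return {"n_classes": n_classes, "n_pcs": n_pcs, "use_pca": n_pcs is not None}


print(validate_parameters())
print(validate_parameters(n_classes=8, n_pcs=2))
```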

### AOI

This last parameter defines the area of interest expressed as a Well-Known Text value.

Tip

When defining the **"Area of interest expressed as Well-known text"** parameter, the polygon drawn with the area filter can be applied as the AOI. To do so, click on the :fontawesome-solid-magic: button on the left side of the **"Area of interest expressed as Well-known text"** box and select the option **AOI** from the list. The platform will automatically fill the parameter value with the rectangular bounding box taken from the current search area in WKT format.
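
For reference, a rectangular bounding box can be turned into a WKT polygon by hand, as in the sketch below. The helper name `bbox_to_wkt` and the sample coordinates are hypothetical, and the longitude-latitude axis order is an assumption, since the exact WKT string produced by the platform is not shown on this page.

```python
def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Build a closed WKT polygon ring from a rectangular bounding box."""
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("bounding box minimums must be below maximums")
    # A WKT polygon ring must end on the vertex it started from.
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON(({ring}))"


print(bbox_to_wkt(12.0, 41.5, 12.5, 42.0))
```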

## Output

The result product of the **K-means** service is a single-band classification map in Cloud Optimized GeoTIFF (COG) format. Product specifications for this service are shown in the tables below.

Attribute | Value / description |
---|---|
Long Name | K-means unsupervised classification map |
Short Name | k-means-classification |
Description | K-means classification map into N classes |
Data Type | Int16 |
Band | Single |
Format | COG |
Projection | Native or EPSG:4326 - WGS84 |
Fill Value | 0 |
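
Because the fill value is 0, pixels with that value should be left out when summarising the classification map. The sketch below counts pixels per class on a small in-memory grid. It assumes class labels are nonzero integers, which this page does not state explicitly, and the helper name `class_pixel_counts` and the sample grid are hypothetical.

```python
from collections import Counter

FILL_VALUE = 0  # documented fill value of the classification map


def class_pixel_counts(class_map):
    """Count pixels per class label, skipping fill-value pixels."""
    counts = Counter(
        value for row in class_map for value in row if value != FILL_VALUE
    )
    return dict(sorted(counts.items()))


# Hypothetical 3x3 classification map with two fill pixels.
class_map = [
    [0, 1, 1],
    [2, 2, 2],
    [0, 3, 1],
]
print(class_pixel_counts(class_map))
```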

Attribute | Value / description |
---|---|
Long Name | Co-located input single band assets employed in K-Means |
Short Name | pc-1, pc-2, pc-N |
Description | Geophysical quantity (reflectance or backscatter) after a co-location geometric correction. |
Data Type | Float32 |
Band | Single |
Format | COG |
Projection | Native or EPSG:4326 - WGS84 |