Supervised vs Unsupervised (Perspective: Maximum Likelihood to Select The Sample Area)



As we know, data mining techniques come in two main forms: supervised (also known as predictive or directed) and unsupervised (also known as descriptive or undirected). Both categories include functions that can find different hidden patterns in large data sets. Supervised data mining techniques are appropriate when we have a specific target value that we would like to predict from our data. The target attribute can have two or more possible outcomes, or even be a continuous numeric value.
In supervised image classification, the accuracy is determined by the quality of the sampling and the number of samples. The sample area is defined with a Region Of Interest (ROI): the sampling area that serves as the training area for the supervised classification. The ROI must therefore be created before the supervised classification is run.
To use these methods, we ideally have a subset of data points for which this target value is already known. We use that data to build a model of what a typical data point looks like when it has one of the various target values. We then apply that model to data for which that target value is currently unknown. The algorithm identifies the “new” data points that match the model of each target value.
Pixel-based classification using the maximum likelihood algorithm is a supervised classification method based on Bayes' theorem. It assumes that the pixel values of a homogeneous object follow a normal distribution (a normal histogram). MLE uses the probability density as its discriminant function. At classification time, each unclassified pixel is assigned to the class for which its probability of occurrence is highest. If that highest probability is smaller than the specified threshold value, the pixel is left unclassified.
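The classification rule described above can be sketched in a few lines of Python. This is only a minimal illustration and not the ArcGIS implementation: it assumes NumPy, equal prior probabilities for all classes, integer class codes, and a threshold expressed on the log-density. The function names, the two-band training values, and the threshold of -50 are hypothetical and chosen just for the example.

```python
import numpy as np

def fit_class_stats(samples_by_class):
    """Estimate a mean vector and covariance matrix per class from ROI pixels."""
    stats = {}
    for label, pixels in samples_by_class.items():
        pixels = np.asarray(pixels, dtype=float)      # shape (n_samples, n_bands)
        stats[label] = (pixels.mean(axis=0), np.cov(pixels, rowvar=False))
    return stats

def log_density(x, mean, cov):
    """Log of the multivariate normal density, evaluated for every row of x."""
    d = x - mean
    inv = np.linalg.inv(cov)
    _, logdet = np.linalg.slogdet(cov)
    maha = np.einsum("ij,jk,ik->i", d, inv, d)        # squared Mahalanobis distance
    return -0.5 * (maha + logdet + x.shape[1] * np.log(2 * np.pi))

def classify(pixels, stats, min_log_density=-np.inf, unclassified=-1):
    """Assign each pixel to the class with the highest density, or leave it out."""
    pixels = np.asarray(pixels, dtype=float)
    labels = list(stats)                              # integer class codes (assumption)
    scores = np.column_stack([log_density(pixels, *stats[l]) for l in labels])
    result = np.array(labels)[scores.argmax(axis=1)]
    # Pixels whose best score is below the threshold are not grouped into any class.
    result[scores.max(axis=1) < min_log_density] = unclassified
    return result

# Synthetic two-band (red, NIR) training areas; the values are made up.
rng = np.random.default_rng(0)
training = {
    0: rng.normal([0.05, 0.02], 0.01, size=(50, 2)),  # hypothetical "water" ROI
    1: rng.normal([0.04, 0.40], 0.03, size=(50, 2)),  # hypothetical "vegetation" ROI
}
stats = fit_class_stats(training)
new_pixels = np.array([[0.05, 0.03], [0.05, 0.38], [0.90, 0.90]])
print(classify(new_pixels, stats, min_log_density=-50.0))
```

The third pixel lies far from both training areas, so its best log-density falls under the threshold and it receives the code for "unclassified". Working with the log-density instead of the density itself avoids numerical underflow for pixels that are far from every class.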
Fig. 1 Sentinel-2 satellite image, composite 432 plus near-infrared band 8




Figure 1 shows the satellite image, a Sentinel-2 scene dated 01 January 2017 covering part of the Wielkopolski area, downloaded from https://scihub.copernicus.eu. We started from the Sentinel true-color band composite and then changed it to RGB (432) plus NIR band 8. The aim of this change is to make vegetation and water objects stand out more sharply. The satellite image was then processed with supervised classification using the Maximum Likelihood Estimation (MLE) algorithm. Figure 2 shows the resulting map of the supervised classification in ArcGIS 10.3.
 




Fig. 2 Result map of the maximum likelihood (supervised) classification in ArcGIS 10.3.
Using the maximum likelihood method gives the interpreter the chance to determine the types of objects in the image, but its weakness is that a non-normal distribution pattern causes scattered pixels that do not match the actual object. For example, inside the high vegetation (forest) object a few pixels appear as high building, which cannot happen in the real world. These interpretation errors come from the non-normal distribution and from the sampling.
So, the conclusions about using maximum likelihood (a supervised method) to interpret land use/land cover are:
Advantages:

1.    Maximum likelihood provides a consistent approach to parameter estimation problems. This means that maximum likelihood estimates can be developed for a large variety of estimation situations.

2.    Maximum likelihood methods have desirable mathematical and optimality properties: they become minimum variance unbiased estimators as the sample size increases, and they have approximately normal distributions and approximate sample variances that can be used to generate confidence bounds and hypothesis tests for the parameters.

Disadvantages:

1.    The likelihood equations need to be specifically worked out for a given distribution and estimation problem.
2.    Maximum likelihood estimates can be heavily biased for small samples. The optimal properties may not apply for small samples.
3.    Maximum likelihood can be sensitive to the choice of starting values, as with the high building and low building classes.
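The small-sample bias listed among the disadvantages can be illustrated with a short simulation. The sketch below assumes NumPy and normally distributed data with a true variance of 1; the function name, the sample sizes, and the number of trials are hypothetical choices for the example. It averages the maximum likelihood estimate of the variance (the one that divides by n) over many repeated samples.

```python
import numpy as np

def mean_mle_variance(n_samples, n_trials=20000, true_std=1.0, seed=0):
    """Average the ML variance estimate (divide by n) over many simulated samples."""
    rng = np.random.default_rng(seed)
    draws = rng.normal(0.0, true_std, size=(n_trials, n_samples))
    # ddof=0 gives the maximum likelihood estimator of the variance.
    return draws.var(axis=1, ddof=0).mean()

for n in (5, 50, 500):
    print(n, round(mean_mle_variance(n), 3))
```

For a normal distribution, the expected value of this estimator is (n - 1) / n times the true variance, so with 5 samples the average is close to 0.8 instead of 1, and the gap shrinks as the sample grows. This is one reason a training area with very few pixels can give unreliable class statistics.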



 
