TESTING FOR REVEALING OF DATA STRUCTURE BASED ON THE HYBRID APPROACH


Abstract

This paper presents a test for revealing data structure based on a hybrid approach. The hybrid approach consists of defining a pre-clustering hypothesis and a pre-clustering statistic, assuming that the data are homogeneous under the pre-defined hypothesis, applying the same clustering procedure to the data set of interest, and comparing the results obtained under the pre-clustering statistic with the results from the data set of interest. The pros and cons of the hybrid approach are also considered.
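The steps listed in the abstract can be illustrated with a minimal sketch. The abstract does not specify the hypothesis, statistic, or clustering procedure, so everything concrete below is an assumption made for illustration only: the homogeneity hypothesis is taken to be a uniform distribution over the observed range, the clustering procedure is a simple 1-D 2-means split, the statistic is the ratio of within-cluster to total sum of squares, and the names two_means_ratio and homogeneity_test are hypothetical.

```python
import random

def two_means_ratio(xs, iters=20):
    """Within-cluster / total sum of squares for a 1-D 2-means split.

    Assumed statistic (not from the paper): values near 0 suggest two
    compact groups, larger values suggest no clear group structure.
    """
    mean = sum(xs) / len(xs)
    total = sum((x - mean) ** 2 for x in xs)
    if total == 0:
        return 1.0  # all points identical: no structure to reveal
    c0, c1 = min(xs), max(xs)  # initial centres at the extremes
    for _ in range(iters):
        left = [x for x in xs if abs(x - c0) <= abs(x - c1)]
        right = [x for x in xs if abs(x - c0) > abs(x - c1)]
        if not left or not right:
            break
        c0, c1 = sum(left) / len(left), sum(right) / len(right)
    within = sum(min((x - c0) ** 2, (x - c1) ** 2) for x in xs)
    return within / total

def homogeneity_test(data, n_null=199, seed=0):
    """Monte Carlo comparison of the observed statistic with its
    distribution under an assumed homogeneity (uniform) hypothesis."""
    rng = random.Random(seed)
    lo, hi = min(data), max(data)
    observed = two_means_ratio(data)
    # Same clustering procedure applied to data simulated under the hypothesis.
    null_stats = [
        two_means_ratio([rng.uniform(lo, hi) for _ in data])
        for _ in range(n_null)
    ]
    # Small ratios indicate structure, so count null values at least as small.
    p_value = (1 + sum(s <= observed for s in null_stats)) / (n_null + 1)
    return observed, p_value

if __name__ == "__main__":
    rng = random.Random(1)
    sample = [rng.gauss(0, 0.5) for _ in range(30)] + \
             [rng.gauss(10, 0.5) for _ in range(30)]
    print(homogeneity_test(sample))
```

A small p-value here only says that the observed split is tighter than most splits of homogeneous data under the assumed hypothesis; a different hypothesis or statistic could lead to a different conclusion.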


Keywords

pre-clustering hypothesis; data group structure testing; group structure revealing

Published: 2017-06-30


Mosorov, V., Panskyi, T., & Biedron, S. (2017). TESTING FOR REVEALING OF DATA STRUCTURE BASED ON THE HYBRID APPROACH. Informatyka, Automatyka, Pomiary W Gospodarce I Ochronie Środowiska, 7(2), 119-122. https://doi.org/10.5604/01.3001.0010.4853

Volodymyr Mosorov  w.mosorow@kis.p.lodz.pl
Lodz University of Technology, Institute of Applied Computer Science  Poland
Taras Panskyi 
Lodz University of Technology, Institute of Applied Computer Science  Poland
Sebastian Biedron 
Lodz University of Technology, Institute of Applied Computer Science  Poland