CONSTRUCTION AND VERIFICATION OF A MATHEMATICAL MODEL OF MASS SPECTROMETRY DATA
Małgorzata Plechawska-Wójcik
gosiap@cs.pollub.pl
Lublin University of Technology (Politechnika Lubelska), Faculty of Electrical Engineering and Computer Science, Institute of Computer Science, Lublin (Poland)
Abstract
The article presents issues related to the construction, fitting, and implementation of a mathematical model of mass spectra based on normal distributions, mixtures of distributions, and the mean spectrum. Building such a model is a key step in the analysis, and it requires determining many model parameters.
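As a rough illustration of the two ingredients named in the abstract, the sketch below computes a mean spectrum and evaluates a Gaussian mixture on an m/z axis. It is a minimal sketch and not the author's implementation: the function names, the m/z range, and the component parameters (weights, means, standard deviations) are all assumed for illustration, and the fitting step that the article treats as the main difficulty is not shown.

```python
import numpy as np

def mean_spectrum(spectra):
    """Average intensities over spectra that share one m/z axis (rows = spectra)."""
    return np.asarray(spectra, dtype=float).mean(axis=0)

def gaussian_mixture(mz, weights, means, sigmas):
    """Evaluate sum_k w_k * N(mz | mu_k, sigma_k^2) at every m/z value."""
    mz = np.asarray(mz, dtype=float)[:, None]        # shape (n_points, 1)
    weights = np.asarray(weights, dtype=float)       # shape (n_components,)
    means = np.asarray(means, dtype=float)
    sigmas = np.asarray(sigmas, dtype=float)
    # Normal density of each component at each m/z value, shape (n_points, n_components)
    pdf = np.exp(-0.5 * ((mz - means) / sigmas) ** 2) / (sigmas * np.sqrt(2.0 * np.pi))
    return pdf @ weights                             # weighted sum over components

# Hypothetical two-peak model on an assumed m/z axis (values chosen for illustration only)
mz = np.linspace(1000.0, 1100.0, 2001)
model = gaussian_mixture(mz, weights=[0.6, 0.4], means=[1030.0, 1070.0], sigmas=[2.0, 3.0])
```

In this picture each peak corresponds to one mixture component, so the number of components, their weights, locations, and widths are the model parameters that have to be determined.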
Keywords:
MALDI-TOF mass spectrometry, Gaussian distributions, Gaussian mixture models, SVM-RFE classification
License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International license.