Number of found documents: 262

Robustness of High-Dimensional Data Mining
Kalina, Jan; Duintjer Tebbens, Jurjen; Schlenker, Anna
2014 - English
Standard data mining procedures are sensitive to outlying measurements in the data. This work proposes robust versions of some existing data mining procedures, i.e. methods resistant to outliers. In the area of classification analysis, we propose a new robust method based on a regularized version of the minimum weighted covariance determinant estimator. The method is suitable for data in which the number of variables exceeds the number of observations, and it is based on implicit weights assigned to individual observations. Our approach is a unique attempt to combine regularization and high robustness, allowing outlying high-dimensional observations to be downweighted. The classification performance of the new methods and some ideas concerning classification analysis of high-dimensional data are illustrated on real raw data as well as on data contaminated by severe outliers.
Keywords: classification analysis; robust estimation; high-dimensional data
Available in digital repository of the ASCR
Ústav informatiky, 2014

A posteriori algebraic error estimation in numerical solution of linear diffusion PDEs
Papež, Jan; Vohralík, M.
2014 - English
Keywords: finite element method; algebraic error; a posteriori error estimation; stopping criteria
Available in a digital repository NRGL
Ústav informatiky, 2014

Towards Low-Dimensional Gaussian Process Metamodels for CMA-ES
Bajer, Lukáš; Holeňa, Martin
2014 - English
Gaussian processes and kriging models have attracted the attention of researchers from different areas of black-box optimization, especially since Jones’ introduction of the Efficient Global Optimization (EGO) algorithm. However, current implementations and real-world applications of the EGO are rather few. We conjecture that the EGO is not suitable for higher-dimensional optimization and investigate whether hybridizing a low-dimensional local optimization with the current state-of-the-art continuous black-box optimizer CMA-ES (Covariance Matrix Adaptation Evolution Strategy) could help. This paper describes only a first proposal of such a GP/CMA-ES connection and presents some preliminary tests.
Keywords: CMA-ES; Gaussian processes; global optimization; surrogate model; metamodel
Available in digital repository of the ASCR
Ústav informatiky, 2014

Interpreting and Clustering Outliers with Sapling Random Forests
Kopp, Martin; Pevný, T.; Holeňa, Martin
2014 - English
The main objective of outlier detection is to find samples that deviate considerably from the majority. Such outliers, often referred to as anomalies, are increasingly important because they help to uncover interesting events within data. Consequently, a considerable number of statistical and data mining techniques for identifying anomalies have been proposed in the last few years, but only a few works even mention why a given sample was labelled as an anomaly. Therefore, we propose a method based on specifically trained decision trees, called sapling random forest. Our method is able to interpret the output of an arbitrary anomaly detector. The explanation is given as a subset of features in which the sample deviates most, or as conjunctions of atomic conditions, which can be viewed as antecedents of logical rules easily understandable by humans. To simplify the investigation of suspicious samples even further, we propose two methods of clustering anomalies into groups. Such clusters can be investigated at once, saving time and human effort. The feasibility of our approach is demonstrated on several synthetic datasets and one real-world dataset.
Keywords: anomaly detection; anomaly interpretation; clustering; decision trees; feature selection; random forest
Available in digital repository of the ASCR
Ústav informatiky, 2014

Online System for Fire Danger Rating in Colorado
Vejmelka, Martin; Kochanski, A.; Mandel, J.
2014 - English
A method for the data assimilation of fuel moisture surface observations has been developed for use in wildfire forecasting and fire danger rating. In this work, we describe the method itself and an online computer system that implements it and combines it with the Real-Time Mesoscale Analysis to track local weather conditions and estimate the fuel moisture content in the state of Colorado. We discuss the construction of the system and its future development.
Keywords: fire danger; fuel moisture; data assimilation; remote automated weather stations; real-time mesoscale analysis; software
Available in digital repository of the ASCR
Ústav informatiky, 2014

Modifications of the limited-memory BFGS method based on the idea of conjugate directions
Vlček, Jan; Lukšan, Ladislav
2013 - English
Simple modifications of the limited-memory BFGS method (L-BFGS) for large scale unconstrained optimization are considered. They consist of corrections to the difference vectors used (derived from the idea of conjugate directions), utilizing information from the preceding iteration. For quadratic objective functions, the improvement in convergence is the best one in some sense, and all stored difference vectors are conjugate for unit stepsizes. The algorithm is globally convergent for convex, sufficiently smooth functions. Numerical experiments indicate that the new method often improves the L-BFGS method significantly.
Keywords: limited memory; variable metric methods; conjugate directions; large scale optimization; numerical solution
Fulltext is available at external website.
Ústav informatiky, 2013

Mathematical fuzzy logic: first-order and beyond
Cintula, Petr
2013 - English
Keywords: mathematical fuzzy logic; predicate fuzzy logic; metamathematics of fuzzy logic; higher-order fuzzy logics
Available in digital repository of the ASCR
Ústav informatiky, 2013

Autocorrelated residuals of robust regression
Kalina, Jan
2013 - English
The work is devoted to the Durbin-Watson test for robust linear regression methods. First, we explain the consequences of autocorrelated residuals for the estimation of regression parameters. We propose an asymptotic version of the Durbin-Watson test for regression quantiles and trimmed least squares and derive an asymptotic approximation to the exact null distribution of the test statistic, exploiting the asymptotic representation for both regression estimators. Further, we consider the least weighted squares estimator, a highly robust estimator based on the idea of down-weighting less reliable observations. We compare various versions of the Durbin-Watson test for the least weighted squares estimator. The asymptotic test is derived using two versions of the asymptotic representation. Finally, we investigate a weighted Durbin-Watson test using the weights determined by the least weighted squares estimator. The exact test is described, and an asymptotic approximation to the distribution of the weighted statistic under the null hypothesis is also obtained.
Keywords: linear regression; robust statistics; diagnostics; autocorrelation
Fulltext is available at external website.
Ústav informatiky, 2013

Error Analysis of Three Methods for the Parameter Estimation Problem based on Spatio-temporal FRAP Measurement
Papáček, Š.; Matonoha, Ctirad
2013 - English
Keywords: diffusion equation; inverse problem formulation; sensitivity analysis
Available in digital repository of the ASCR
Ústav informatiky, 2013

Forecasting System for Truck Parking Based on Statistical Modeling of Indirect Data
Brabec, Marek; Konár, Ondřej; Kasanický, Ivan; Pelikán, Emil; Malý, Marek; Lokaj, Z.; Zelinka, T.
2013 - English
In this paper, we briefly describe ongoing work on a project devoted to the development and pilot verification of a system for highway truck parking detection and forecasting. The project has been funded by the Technological Agency of the Czech Republic (TACR) during 2012-2014 as project number TA02031411: “Increasing the usage of parking capacity on highways using prediction models”. It is based on a unique collaboration of the Faculty of Transportation Sciences, Czech Technical University in Prague; the Institute of Computer Science, Academy of Sciences of the Czech Republic; Inoxive Ltd.; and Kapsch Telematic Services Ltd.
Keywords: intelligent transport systems; traffic modeling; statistical modeling; generalized additive models; dynamic models
Available in digital repository of the ASCR
Ústav informatiky, 2013
