Regression for High-Dimensional Data: From Regularization to Deep Learning
Kalina, Jan; Vidnerová, Petra
2020 - English
Regression modeling is well known as a fundamental task in current econometrics. However, classical estimation tools for the linear regression model are not applicable to high-dimensional data. Although there is no agreement on a formal definition of high-dimensional data, they are usually understood either as data with the number of variables p exceeding (possibly by far) the number of observations n, or as data with a large p on the order of (at least) thousands. In both situations, which appear in various fields including econometrics, the analysis of the data is difficult due to the so-called curse of dimensionality (cf. Kalina (2013) for discussion). Compared to linear regression, nonlinear regression modeling with an unknown shape of the relationship between the response and the regressors requires even more intricate methods.
Keywords:
regression; neural networks; robustness; high-dimensional data; regularization
Fulltext is available at external website.
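As a small illustration of why classical least squares breaks down when p exceeds n, the sketch below builds a toy design matrix with n = 2 observations and p = 3 variables. The normal equations are then singular, while adding a ridge penalty makes them solvable. Ridge regression is used here only as one standard example of regularization; the abstract does not say which regularized estimators the paper covers, and the data, the penalty value lam = 0.1, and the helper names gram and solve are assumptions made for this example.

```python
def gram(X, lam=0.0):
    """Return X^T X + lam * I for a design matrix X given as a list of rows."""
    p = len(X[0])
    G = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    for i in range(p):
        G[i][i] += lam
    return G


def solve(A, b, tol=1e-12):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[piv][col]) < tol:
            raise ValueError("matrix is singular to working precision")
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


# Toy data (assumed for illustration): n = 2 observations, p = 3 variables.
X = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
y = [1.0, 2.0]
xty = [sum(X[k][i] * y[k] for k in range(len(X))) for i in range(3)]

try:
    solve(gram(X), xty)  # ordinary least squares: X^T X has rank at most n < p
except ValueError as err:
    print("OLS normal equations fail:", err)

beta_ridge = solve(gram(X, lam=0.1), xty)  # ridge: X^T X + lam * I is invertible
print("ridge coefficients:", beta_ridge)
```

The ridge system is solvable for any lam > 0 because X^T X + lam * I is positive definite, regardless of how p compares with n.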
Lexicalized Syntactic Analysis by Restarting Automata
Mráz, F.; Otto, F.; Pardubská, D.; Plátek, Martin
2019 - English
We study h-lexicalized two-way restarting automata that can rewrite at most i times per cycle for some i ≥ 1 (hRLWW(i)-automata). This model is considered useful for the study of lexical (syntactic) disambiguation, which is a concept from linguistics. It is based on certain reduction patterns. We study lexical disambiguation through the formal notion of h-lexicalized syntactic analysis (hLSA). The hLSA is composed of a basic language and the corresponding h-proper language, which is obtained from the basic language by mapping all basic symbols to input symbols. We stress the sensitivity of hLSA by hRLWW(i)-automata to the size of their windows, the number of possible rewrites per cycle, and the degree of (non-)monotonicity. We introduce the concepts of contextually transparent languages (CTL) and contextually transparent lexicalized analyses based on very special reduction patterns, and we present two-dimensional hierarchies of their subclasses based on the size of windows and on the degree of synchronization. The bottoms of these hierarchies correspond to the context-free languages. CTL forms a proper subclass of the context-sensitive languages with syntactically natural properties.
Keywords:
Restarting automaton; h-lexicalization; lexical disambiguation
Fulltext is available at external website.
Laplacian preconditioning of elliptic PDEs: Localization of the eigenvalues of the discretized operator
Gergelits, Tomáš; Mardal, K.-A.; Nielsen, B. F.; Strakoš, Z.
2019 - English
This contribution represents an extension of our earlier studies on the paradigmatic example of the inverse problem of the diffusion parameter estimation from spatio-temporal measurements of fluorescent particle concentration, see [6, 1, 3, 4, 5]. More precisely, we continue to look for an optimal bleaching pattern used in FRAP (Fluorescence Recovery After Photobleaching), being the initial condition of the Fickian diffusion equation maximizing a sensitivity measure. As follows, we define an optimization problem and we show the special feature (so-called complementarity principle) of the optimal binary-valued initial conditions.
Keywords:
second order elliptic PDEs; preconditioning by the inverse Laplacian; eigenvalues of the discretized preconditioned problem; nodal values of the coefficient function; Hall’s theorem; convergence of the conjugate gradient method
Available in digital repository of the ASCR
A Nonparametric Bootstrap Comparison of Variances of Robust Regression Estimators.
Kalina, Jan; Tobišková, Nicole; Tichavský, Jan
2019 - English
While various robust regression estimators are available for the standard linear regression model, performance comparisons of individual robust estimators over real or simulated datasets still seem to be lacking. In general, a reliable robust estimator of regression parameters should be consistent and at the same time should have a relatively small variability, i.e. the variances of individual regression parameters should be small. The aim of this paper is to compare the variability of S-estimators, MM-estimators, least trimmed squares, and least weighted squares estimators. While they are all consistent under general assumptions, the asymptotic covariance matrix of the least weighted squares remains infeasible, because the only available formula for its computation depends on the unknown random errors. Thus, we resort to a nonparametric bootstrap comparison of the variability of different robust regression estimators. It turns out that the best results are obtained either with MM-estimators, or with the least weighted squares with suitable weights. The latter estimator is especially recommended for small sample sizes.
Keywords:
robustness; linear regression; outliers; bootstrap; least weighted squares
Fulltext is available at external website.
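The nonparametric bootstrap comparison described in this abstract can be sketched roughly as follows: resample the observation pairs with replacement, recompute the slope estimate on each resample, and take the sample variance of the resulting estimates. This is only a minimal sketch under stated assumptions. The estimators compared in the paper (S, MM, least trimmed squares, least weighted squares) are not implemented here; ordinary least squares and the Theil-Sen slope are swapped in as simple stand-ins for a non-robust and a robust estimator. The simulated data, the number of resamples, and all function names are hypothetical.

```python
import random
import statistics
from itertools import combinations


def ols_slope(xs, ys):
    """Least squares slope of y on x (non-robust stand-in)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def theil_sen_slope(xs, ys):
    """Median of pairwise slopes (robust stand-in, not an estimator from the paper)."""
    slopes = [(ys[j] - ys[i]) / (xs[j] - xs[i])
              for i, j in combinations(range(len(xs)), 2)
              if xs[j] != xs[i]]
    return statistics.median(slopes)


def bootstrap_variance(estimator, xs, ys, n_boot=500, seed=0):
    """Nonparametric (pairs) bootstrap estimate of the variance of a slope estimator."""
    rng = random.Random(seed)
    n = len(xs)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample pairs with replacement
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:  # slope is undefined if all x values coincide
            continue
        estimates.append(estimator(bx, by))
    return statistics.variance(estimates)


# Simulated data (assumed): a linear trend with Gaussian noise and one gross outlier.
data_rng = random.Random(1)
xs = [float(i) for i in range(30)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0.0, 1.0) for x in xs]
ys[-1] += 40.0

var_ols = bootstrap_variance(ols_slope, xs, ys)
var_ts = bootstrap_variance(theil_sen_slope, xs, ys)
print("bootstrap variance, OLS slope:      ", var_ols)
print("bootstrap variance, Theil-Sen slope:", var_ts)
```

Using the same seed for both estimators means both are evaluated on the same resamples, which makes the comparison of their variances less noisy.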
Implicitly weighted robust estimation of quantiles in linear regression
Kalina, Jan; Vidnerová, Petra
2019 - English
Estimation of quantiles is a very important task in econometric regression modeling, and the standard regression quantiles machinery is well developed and popular, with a large number of econometric applications. Although regression quantiles are commonly known as robust tools, they are vulnerable to the presence of leverage points in the data. We propose here a novel approach for linear regression based on a specific version of the least weighted squares estimator, together with an additional estimator based only on observations between two different novel quantiles. The new methods are conceptually simple and comprehensible. We do not aim to derive theoretical properties of the novel methods; numerical computations reveal them to perform comparably to standard regression quantiles if the data are not contaminated by outliers. Moreover, the new methods seem much more robust on a simulated dataset with severe leverage points.
Keywords:
regression quantiles; robust regression; outliers; leverage points
Fulltext is available at external website.
A Robustified Metalearning Procedure for Regression Estimators
Kalina, Jan; Neoral, A.
2019 - English
Metalearning represents a useful methodology for selecting and recommending a suitable algorithm or method for a new dataset exploiting a database of training datasets. While metalearning is potentially beneficial for the analysis of economic data, we must be aware of its instability and sensitivity to outlying measurements (outliers) as well as measurement errors. The aim of this paper is to robustify the metalearning process. First, we prepare some useful theoretical tools exploiting the idea of implicit weighting, inspired by the least weighted squares estimator. These include a robust coefficient of determination, a robust version of mean square error, and a simple rule for outlier detection in linear regression. We perform a metalearning study for recommending the best linear regression estimator for a new dataset (not included in the training database). The prediction of the optimal estimator is learned over a set of 20 real datasets with economic motivation, while the least squares are compared with several (highly) robust estimators. We investigate the effect of variable selection on the metalearning results. If the training as well as validation data are considered after a proper robust variable selection, the metalearning performance is improved remarkably, especially if a robust prediction error is used.
Keywords:
model choice; computational statistics; robustness; variable selection
Available in digital repository of the ASCR
A Hybrid Method for Nonlinear Least Squares that Uses Quasi-Newton Updates Applied to an Approximation of the Jacobian Matrix
Lukšan, Ladislav; Vlček, Jan
2019 - English
In this contribution, we propose a new hybrid method for minimization of nonlinear least squares. This method is based on quasi-Newton updates, applied to an approximation $A$ of the Jacobian matrix $J$, such that $A^T f = J^T f$. This property allows us to solve a linear least squares problem, minimizing $\|Ad + f\|$ instead of solving the normal equation $A^T A d + J^T f = 0$, where $d \in \mathbb{R}^n$ is the required direction vector. Computational experiments confirm the efficiency of the new method.
Keywords:
nonlinear least squares; hybrid methods; trust-region methods; quasi-Newton methods; numerical algorithms; numerical experiments
Available at various institutes of the ASCR
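The equivalence stated in this abstract can be checked in one step. The following is a short worked derivation, not taken from the paper; the objective $\varphi$ is notation introduced here for convenience.

```latex
\begin{align*}
\varphi(d) &= \tfrac{1}{2}\,\|Ad + f\|^2, \\
\nabla \varphi(d) &= A^T (Ad + f) = A^T A d + A^T f .
\end{align*}
% Setting the gradient to zero and using the property A^T f = J^T f gives
\begin{equation*}
A^T A d + J^T f = 0 .
\end{equation*}
```

So a minimizer of $\|Ad + f\|$ satisfies the normal equation quoted above, which is why the linear least squares problem can be solved directly (for instance by an orthogonal factorization of $A$) without forming $A^T A$.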
On the Optimal Initial Conditions for an Inverse Problem of Model Parameter Estimation - a Complementarity Principle
Matonoha, Ctirad; Papáček, Š.
2019 - English
This contribution represents an extension of our earlier studies on the paradigmatic example of the inverse problem of the diffusion parameter estimation from spatio-temporal measurements of fluorescent particle concentration, see [6, 1, 3, 4, 5]. More precisely, we continue to look for an optimal bleaching pattern used in FRAP (Fluorescence Recovery After Photobleaching), being the initial condition of the Fickian diffusion equation maximizing a sensitivity measure. As follows, we define an optimization problem and we show the special feature (so-called complementarity principle) of the optimal binary-valued initial conditions.
Keywords:
parameter identification; bleaching pattern; initial boundary value problem; sensitivity measure
Available in digital repository of the ASCR
Application of the Infinitely Many Times Repeated BNS Update and Conjugate Directions to Limited-Memory Optimization Methods
Vlček, Jan; Lukšan, Ladislav
2019 - English
To improve the performance of the L-BFGS method for large-scale unconstrained optimization, repeating some BFGS updates was proposed. Since this can be time-consuming, the extra updates need to be selected carefully. We show that groups of these updates can be repeated infinitely many times under some conditions, without a noticeable increase in computational time. The limit update is a block BFGS update. It can be obtained by solving a Lyapunov matrix equation whose order can be decreased by applying vector corrections for conjugacy. Global convergence of the proposed algorithm is established for convex and sufficiently smooth functions. Numerical results indicate the efficiency of the new method.
Keywords:
unconstrained minimization; limited-memory variable metric methods; the repeated Byrd-Nocedal-Schnabel update; the Lyapunov matrix equation; the conjugate directions; global convergence; numerical results
Available at various institutes of the ASCR
Nonparametric Bootstrap Techniques for Implicitly Weighted Robust Estimators
Kalina, Jan
2018 - English
The paper is devoted to highly robust statistical estimators based on implicit weighting, which have the potential to find econometric applications. Two particular methods are a robust correlation coefficient based on the least weighted squares regression and the minimum weighted covariance determinant estimator, where the latter allows the mean and covariance matrix of multivariate data to be estimated. New tools are proposed that allow testing hypotheses about these robust estimators or estimating their variance. The techniques considered in the paper include resampling approaches with or without replacement, i.e. permutation tests, bootstrap variance estimation, and bootstrap confidence intervals. The performance of the newly described tools is illustrated on numerical examples. They reveal the suitability of the robust procedures also for non-contaminated data, as their confidence intervals are not much wider than those for standard maximum likelihood estimators. While resampling without replacement turns out to be more suitable for hypothesis testing, bootstrapping with replacement yields reliable confidence intervals but not corresponding hypothesis tests.
Keywords:
robust statistics; econometrics; correlation coefficient; multivariate data
Fulltext is available at external website.
NRGL provides central access to information on grey literature produced in the Czech Republic in the fields of science, research and education. You can find more information about grey literature and NRGL at service web
Send your suggestions and comments to nusl@techlib.cz