Questions tagged [underdetermined]

Analyses are underdetermined (cf. Wikipedia: underdetermined system) when the number of parameters to be estimated is greater than the number of data points. This problem is also referred to as 'p >> n'. A practical example is a genome-wide association study, which attempts to determine whether any of a large number of genetic variants predict a disease.
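
A minimal numerical sketch of the p >> n situation (assuming Python with numpy and scikit-learn; the data are simulated, not from a real study): with more coefficients than observations, ordinary least squares has infinitely many exact solutions, whereas a penalized estimator such as the lasso picks out a single sparse one.

    import numpy as np
    from sklearn.linear_model import Lasso

    rng = np.random.default_rng(0)
    n, p = 50, 1000                     # far fewer observations than parameters
    X = rng.standard_normal((n, p))
    beta = np.zeros(p)
    beta[:5] = 2.0                      # only five coefficients are truly nonzero
    y = X @ beta + 0.1 * rng.standard_normal(n)

    # Unpenalized least squares: lstsq returns the minimum-norm solution,
    # one of infinitely many coefficient vectors that fit the data exactly.
    ols = np.linalg.lstsq(X, y, rcond=None)[0]
    print("OLS nonzero coefficients:", np.sum(np.abs(ols) > 1e-8))

    # The lasso's L1 penalty makes the problem well posed and the estimate sparse.
    lasso = Lasso(alpha=0.1).fit(X, y)
    print("lasso nonzero coefficients:", np.sum(lasso.coef_ != 0))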

23 questions
31 votes, 1 answer

Feature selection & model with glmnet on Methylation data (p>>N)

I would like to use GLM and Elastic Net to select the relevant features and build a linear regression model (i.e., both prediction and understanding, so it would be better to be left with relatively few parameters). The output is continuous. It's…
PGreen
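
The question above is about R's glmnet; as a rough analogue only (a sketch on simulated placeholder data, not the asker's pipeline), here is the same idea in Python with scikit-learn's ElasticNetCV: cross-validate the penalty, keep the features with nonzero coefficients, and refit a plain linear model on that subset for interpretation (l1_ratio plays the role of glmnet's alpha).

    import numpy as np
    from sklearn.linear_model import ElasticNetCV, LinearRegression

    rng = np.random.default_rng(1)
    n, p = 100, 2000                     # placeholder sizes standing in for p >> N methylation data
    X = rng.standard_normal((n, p))
    y = X[:, :3] @ np.array([1.0, -2.0, 0.5]) + rng.standard_normal(n)

    # Elastic net with the penalty strength chosen by 5-fold cross-validation.
    enet = ElasticNetCV(l1_ratio=0.5, cv=5).fit(X, y)
    selected = np.flatnonzero(enet.coef_)
    print("selected features:", selected)

    # Refit an ordinary linear model on the selected columns only; this is easier
    # to interpret, though post-selection inference caveats apply.
    refit = LinearRegression().fit(X[:, selected], y)
    print("refit coefficients:", refit.coef_)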
31 votes, 5 answers

Detecting significant predictors out of many independent variables

In a dataset of two non-overlapping populations (patients & healthy, total $n=60$) I would like to find (out of $300$ independent variables) significant predictors for a continuous dependent variable. Correlation between predictors is present. I am…
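
One common first pass for a question like this (a sketch on simulated data, not the asker's dataset, and only a screening step rather than a joint model): univariate tests of each predictor against the continuous outcome with a false-discovery-rate correction, here via scikit-learn's SelectFdr. With 300 correlated predictors and n = 60, anything surviving the screen still needs to be checked in a multivariable model.

    import numpy as np
    from sklearn.feature_selection import SelectFdr, f_regression

    rng = np.random.default_rng(2)
    n, p = 60, 300                          # sizes taken from the question
    X = rng.standard_normal((n, p))
    y = 2 * X[:, 0] - X[:, 1] + rng.standard_normal(n)

    # Univariate F-tests of each predictor against the continuous outcome,
    # keeping those below a Benjamini-Hochberg FDR threshold of 5%.
    screen = SelectFdr(score_func=f_regression, alpha=0.05).fit(X, y)
    print("predictors passing the FDR screen:", np.flatnonzero(screen.get_support()))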
14 votes, 2 answers

Can one (theoretically) train a neural network with fewer training samples than weights?

First of all: I know there is no general rule for the sample size required to train a neural network. It depends on far too many factors, such as the complexity of the task, the noise in the data, and so on. And the more training samples I have, the better will be…
Hobbit
13 votes, 1 answer

Applying ridge regression for an underdetermined system of equations?

When $y = X\beta + e$, the least squares problem which imposes a spherical restriction $\delta$ on the value of $\beta$ can be written as $\min_\beta \; \| y - X\beta \|^2_2$ subject to $\| \beta \|^2_2 \le \delta$ …
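
In the penalized (Lagrangian) form of that constrained problem, $\min_\beta \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2$, the solution $\hat\beta = (X^\top X + \lambda I)^{-1} X^\top y$ is well defined even when $X^\top X$ is singular because $n < p$. A small numerical sketch (simulated data and an arbitrary $\lambda$, assuming Python with numpy):

    import numpy as np

    rng = np.random.default_rng(3)
    n, p = 20, 100                    # underdetermined: fewer equations than unknowns
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)
    lam = 1.0                         # arbitrary ridge penalty for illustration

    # X'X is p x p with rank at most n, hence singular, but adding lam*I fixes that.
    beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

    # Equivalent dual form using the n x n matrix XX' + lam*I (cheaper when p >> n).
    beta_dual = X.T @ np.linalg.solve(X @ X.T + lam * np.eye(n), y)
    print(np.allclose(beta_ridge, beta_dual))     # True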
6 votes, 4 answers

Solving a practical machine learning problem

I am currently doing my PhD in computational biology at Stanford. I get the data I need to answer the questions I am interested in. The data sets are sometimes "large", and these large problems take a long time to solve (a couple of days…
Sid
5 votes, 3 answers

SVM has relatively low classification rate for high-dimensional data even though 2-D projections show they are separable

I have another problem, with 14000 features and 500 training samples. It is a binary classification problem, and the data are approximately in the form of an ellipse. My classification accuracy using a 2nd-degree polynomial kernel, estimated via CV, is ~80%. However,…
user27525
5 votes, 2 answers

Why is it bad if number of dimensions / factors > sample size?

I've been told (read) this many times, but I never understood why it's bad for the number of dimensions in your data, or the number of explanatory variables in your model, to be higher than your number of samples. Why is this the case?
tmakino
5 votes, 3 answers

Why is $n < p$ a problem for OLS regression?

I realize I can't invert the $X'X$ matrix but I can use gradient descent on the quadratic loss function and get a solution. I can then use those estimates to calculate standard errors and residuals. Am I going to encounter any problems doing this?
badmax
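
Gradient descent will indeed drive the quadratic loss to zero here, but with $n < p$ the minimizer is not unique, which is the core problem: the individual coefficients (and hence standard errors for them) are not identified. A small sketch showing two different coefficient vectors with identical fitted values (simulated data, assuming Python with numpy):

    import numpy as np

    rng = np.random.default_rng(4)
    n, p = 30, 80
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)

    # Minimum-norm exact solution (what gradient descent started at zero converges to).
    beta1 = np.linalg.lstsq(X, y, rcond=None)[0]

    # Adding any vector from the null space of X gives another exact solution.
    _, _, Vt = np.linalg.svd(X)
    null_vec = Vt[-1]                       # satisfies X @ null_vec ~= 0
    beta2 = beta1 + 10 * null_vec

    print(np.allclose(X @ beta1, y), np.allclose(X @ beta2, y))   # True True
    print(np.allclose(beta1, beta2))                              # False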
5 votes, 0 answers

How to identify a SEM with formative dependent variable (with R's lavaan package)?

I have a formative construct in a structural equation model (SEM) which I would like to estimate with the function sem in the lavaan package in R. Currently, the model is underidentified. I know about four different approaches for identifying the…
jhg
4 votes, 0 answers

What is scikit-learn's LinearRegression doing when there are more features than observations?

I'm trying to understand what sklearn's LinearRegression (which should be using ordinary least squares) is doing when there are more features than observations.

    import numpy as np
    from sklearn.linear_model import LinearRegression
    X =…
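
As far as I can tell from the scikit-learn source, LinearRegression fits dense data with a least-squares solver (scipy.linalg.lstsq), which returns the minimum-norm solution when the system is underdetermined, so with more features than observations it interpolates the training data. A quick check on tiny simulated data (a sketch, not the asker's example):

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(5)
    X = rng.standard_normal((5, 20))        # more features than observations
    y = rng.standard_normal(5)

    lr = LinearRegression().fit(X, y)

    # Compare with the minimum-norm least-squares solution of the centered problem
    # (centering mirrors what fit_intercept=True does internally).
    Xc, yc = X - X.mean(axis=0), y - y.mean()
    min_norm = np.linalg.lstsq(Xc, yc, rcond=None)[0]

    print(np.allclose(lr.coef_, min_norm))   # True: same coefficients
    print(np.allclose(lr.predict(X), y))     # True: training data reproduced exactly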
4 votes, 1 answer

Fitting least squares when the number of predictors is larger than the number of instances

A statement from the book Introduction to Statistical Learning with Applications in R didn't quite make sense to me. It says, "In cases where the number of predictors is greater than the number of instances, we cannot even fit the multiple linear regression…
3 votes, 1 answer

Linear discriminant analysis with $p\gg n$

I am studying Linear Discriminant Analysis (LDA). According to the formula for LDA, we are supposed to take the inverse of the within-group covariance matrix. However, if $p\gg n$ (i.e., the dimension is much larger than the number of samples), what should I…
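
With $p \gg n$ the pooled within-class covariance is singular, so its inverse does not exist; common workarounds are to regularize (shrink) the covariance estimate or to reduce the dimension first (e.g., with PCA). A sketch of shrinkage LDA on simulated data (assuming Python with scikit-learn; not a statement about the asker's data):

    import numpy as np
    from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

    rng = np.random.default_rng(6)
    n, p = 40, 500                       # p >> n: the within-class covariance is singular
    X = rng.standard_normal((n, p))
    y = np.repeat([0, 1], n // 2)
    X[y == 1, :3] += 2.0                 # shift a few dimensions so the classes differ

    # Plain LDA needs the inverse of the pooled covariance, which does not exist here;
    # the 'lsqr' solver with Ledoit-Wolf shrinkage uses a regularized estimate instead.
    lda = LinearDiscriminantAnalysis(solver="lsqr", shrinkage="auto").fit(X, y)
    print("training accuracy:", lda.score(X, y))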
3 votes, 0 answers

LASSO prediction model question

I am trying to create a prediction model with 33 predictors (brain metabolite levels in various regions) and 8 observations (cognitive test scores), a p >> n problem, using LASSO in MATLAB (the lassoglm function). When I run LASSO 100 times with 5-fold…
2 votes, 1 answer

Dealing with underdetermination in Bayesian models

Bayesian models are supposedly well equipped to deal with high-dimensional problems, and can handle sparse data well, too. But suppose I've created a model that estimates more parameters than there are data points. Are there tricks to deal with…
Brash Equilibrium
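
One way to see why proper priors help (a toy numpy sketch with made-up variances, not a full Bayesian workflow): with a Gaussian likelihood and a proper Gaussian prior on the coefficients, the posterior mean/mode is unique even when there are more parameters than data points, because the prior precision makes the relevant matrix invertible.

    import numpy as np

    rng = np.random.default_rng(7)
    n, p = 15, 60                    # more coefficients than data points
    X = rng.standard_normal((n, p))
    y = rng.standard_normal(n)
    sigma2, tau2 = 1.0, 1.0          # assumed noise and prior variances (illustrative)

    # For beta ~ N(0, tau2 I) and y | beta ~ N(X beta, sigma2 I), the posterior mean is
    # (X'X / sigma2 + I / tau2)^{-1} X'y / sigma2 -- unique despite n < p.
    A = X.T @ X / sigma2 + np.eye(p) / tau2
    post_mean = np.linalg.solve(A, X.T @ y / sigma2)
    print(post_mean.shape)           # (60,): a single well-defined estimate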
2 votes, 1 answer

Dataset for Least Angle Regression

I have read that least angle regression is good for high-dimensional data. I didn't actually understand the meaning of high-dimensional data, so does this mean the $p \gg n$ case? And does anyone know any good dataset with such properties on which we can…
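
Yes, "high-dimensional" here usually means the $p \gg n$ case. If no real dataset is at hand, one option (a sketch, not a recommendation of a particular benchmark) is to simulate such data and run a LARS-based fit, e.g. with scikit-learn:

    import numpy as np
    from sklearn.datasets import make_regression
    from sklearn.linear_model import LassoLarsCV

    # Simulated p >> n data: 50 samples, 1000 features, 10 of them informative.
    X, y = make_regression(n_samples=50, n_features=1000, n_informative=10,
                           noise=1.0, random_state=0)

    # Lasso solved by the LARS algorithm, with the penalty chosen by cross-validation.
    model = LassoLarsCV(cv=5).fit(X, y)
    print("nonzero coefficients:", np.count_nonzero(model.coef_))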