Questions tagged [chemometrics]

Statistics used in chemistry.

53 questions
67
votes
5 answers

How small a quantity should be added to x to avoid taking the log of zero?

I have analysed my data as they are. Now I want to look at my analyses after taking the log of all variables. Many variables contain many zeros. Therefore I add a small quantity to avoid taking the log of zero. So far I've added 10^-10, without any…
miura
  • 3,364
  • 3
  • 21
  • 27
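
A minimal sketch of why the size of the added constant matters for the question above. The function name `log_with_offset`, the sample values, and the alternative offset of 0.25 are all hypothetical and chosen only for illustration; the only number taken from the question is 10^-10.

```python
import math

def log_with_offset(values, offset):
    """Return log(x + offset) for each x; a positive offset keeps zeros defined."""
    if offset <= 0:
        raise ValueError("offset must be positive")
    return [math.log(x + offset) for x in values]

data = [0.0, 0.5, 2.0]  # hypothetical measurements containing a zero

tiny = log_with_offset(data, 1e-10)   # the offset mentioned in the question
larger = log_with_offset(data, 0.25)  # assumed alternative: half the smallest non-zero value

# With 1e-10 the zero lands near -23 on the log scale, far from the other
# logged values; with 0.25 it lands near -1.39.
print(tiny[0], larger[0])

# log1p computes log(1 + x) and maps zero to exactly zero.
log1p_values = [math.log1p(x) for x in data]
print(log1p_values)
```

The point of the sketch is only that a very small offset turns every zero into an extreme value after the transform, so the choice of offset is part of the analysis rather than a neutral technicality.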
7
votes
1 answer

Mean centering or not in the context of Partial Least Squares

In my current project, I'm using PLS regression on infrared spectra (FTIR). For this I'm using R and the plsr function from the pls package. pls always mean centers both the input data and the infrared spectra. When predicting using a fitted PLS…
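
A rough sketch of the bookkeeping that mean centering implies at prediction time. This is not PLS: a single-predictor least-squares fit stands in for it, and the names `fit_centered` and `predict_centered` and the data are hypothetical. The only point illustrated is that new data are centered with the training means, not with their own.

```python
from statistics import mean

def fit_centered(x, y):
    """Least-squares fit on mean-centered data; returns (slope, x_mean, y_mean)."""
    x_mean, y_mean = mean(x), mean(y)
    xc = [xi - x_mean for xi in x]
    yc = [yi - y_mean for yi in y]
    slope = sum(a * b for a, b in zip(xc, yc)) / sum(a * a for a in xc)
    return slope, x_mean, y_mean

def predict_centered(model, x_new):
    """Center new x with the stored training mean, then add back the training y mean."""
    slope, x_mean, y_mean = model
    return [y_mean + slope * (xi - x_mean) for xi in x_new]

x_train = [1.0, 2.0, 3.0, 4.0]   # hypothetical predictor values
y_train = [3.0, 5.0, 7.0, 9.0]   # generated as y = 2x + 1

model = fit_centered(x_train, y_train)
pred = predict_centered(model, [5.0])
print(pred)
```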
6
votes
2 answers

Exclude observations with measurements below limit of detection?

I am analysing a dataset for the relationship between an exposure variable x and a response y (in my case, these are urinary concentration of a specific compound and a measure of cognitive function). x is measured using an analytical method which…
Gux
  • 193
  • 2
  • 10
5
votes
1 answer

Statistical handling of lab values below limit of quantitation (BLQ)

There were several samples BLQ because of the lower limit of quantitation (LLQ) of the method, e.g. 5 ng/ml or less. Using the statistical program PRISM6 I marked these values together with the outliers (determined with the Rout method, 1% rule,…
5
votes
2 answers

Predicting chemical property (Boiling Point) from a SMILES string

I was trying to develop a model for predicting Boiling Points (BP) given a chemical name. One good and unique (ok, almost) way to encode a name is the SMILES notation string. The details of the notation are a bit complex (see here) e.g. Name …
curious_cat
  • 1,043
  • 10
  • 28
4
votes
1 answer

How is "Orthogonal distance" computed?

I was reading the vignette of the R package chemometrics (link). In the second paragraph (right below the first equation) of Page 12, the author writes: the OD (Orthogonal Distance) is calculated in the original space as the orthogonal distance of…
Alex
  • 437
  • 1
  • 5
  • 15
4
votes
0 answers

inverse.predict in the chemCal package

I have noticed that the inverse.predict function (chemCal package) does not take into account all the degrees of freedom of the model in order to calculate the confidence interval, and I am wondering why. Let me explain a bit better. I was trying…
4
votes
1 answer

When is it considered a repeated and independent experiment in chemistry?

Sorry for the somewhat confusing title. I was wondering about the use of confidence intervals in the context of chemical and biochemical experiments. The experiments must be repeated and the data have to be independent, I know. But - say you are…
3
votes
0 answers

Should I use the prediction interval or inverse prediction interval to calculate the uncertainty of $x$ when using reverse regression?

I'm calibrating a piece of lab instrumentation. I create solutions of known concentration ($x$) and measure my instrument response ($y$). On unknown samples, I measure the response and use the regression line to predict the actual concentration…
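
A minimal sketch of inverse prediction from a straight-line calibration, as described in the question above. It uses the common textbook approximation for the standard error of the estimated concentration, which assumes constant variance and an unweighted fit. The function names, the dictionary keys, and the data are hypothetical, and this is not the code of any particular package.

```python
import math
from statistics import mean

def calibrate(x, y):
    """Unweighted least-squares calibration line y = intercept + slope * x."""
    n = len(x)
    xbar, ybar = mean(x), mean(y)
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    s = math.sqrt(sum(r * r for r in resid) / (n - 2))  # residual standard deviation
    return {"n": n, "ybar": ybar, "sxx": sxx,
            "slope": slope, "intercept": intercept, "s": s}

def inverse_predict(cal, y0, m=1):
    """Estimate x0 from the mean response y0 of m replicate measurements."""
    b = cal["slope"]
    x0 = (y0 - cal["intercept"]) / b
    # Approximate standard error of x0 (first-order textbook formula).
    se = (cal["s"] / abs(b)) * math.sqrt(
        1 / m + 1 / cal["n"] + (y0 - cal["ybar"]) ** 2 / (b ** 2 * cal["sxx"])
    )
    return x0, se

conc = [0.0, 1.0, 2.0, 3.0, 4.0]       # hypothetical standard concentrations
signal = [0.1, 2.0, 4.1, 5.9, 8.0]     # hypothetical instrument responses

cal = calibrate(conc, signal)
x0, se = inverse_predict(cal, y0=4.02)
print(x0, se)
```

An interval for the unknown concentration would multiply `se` by a t quantile with n - 2 degrees of freedom; that step is left out here because the Python standard library has no t distribution.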
3
votes
1 answer

Multiple Linear Regression with more variables than samples

I'm currently learning chemometrics for my work and I have a simple question about Multiple Linear Regression (MLR). Just to explain the context: I am simply using UV-Vis-NIR spectra (2500 wavelengths) to quantify a molecule in presence of…
Snedron
  • 31
  • 2
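
A small sketch of what goes wrong when there are more variables than samples, as in the question above. With fewer rows than columns, different coefficient vectors can reproduce the same responses exactly, so ordinary least squares has no unique solution. The matrix, the coefficient vectors, and the helper `predict` are hypothetical toy values (2 samples, 3 variables), not spectra.

```python
def predict(X, b):
    """Return X @ b for a list-of-rows matrix X and coefficient list b."""
    return [sum(xij * bj for xij, bj in zip(row, b)) for row in X]

X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]          # 2 samples, 3 variables

b1 = [1.0, 0.0, 1.0]           # one set of coefficients
null_vec = [1.0, -2.0, 1.0]    # X @ null_vec is zero for both rows

# Adding any multiple of null_vec changes the coefficients but not the fit.
b2 = [bi + 5.0 * vi for bi, vi in zip(b1, null_vec)]

y1 = predict(X, b1)
y2 = predict(X, b2)
print(b1, y1)
print(b2, y2)
```

Methods such as PLS, PCR, or ridge regression sidestep this by restricting or penalising the coefficient space so that a single solution is selected.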
3
votes
2 answers

Intersections of chemistry and statistics

I am asking this question for a friend who knows a lot of chemistry and is now studying statistics, primarily since he heard this is the age of data and one should know statistics. However, he is interested to know if there are works on the…
Landon Carter
  • 1,295
  • 11
  • 21
3
votes
1 answer

What is maximum likelihood PCA?

There are many papers on this topic, such as this one (pdf). However, I could not find out what exactly maximum likelihood PCA is, how it is applied and for which purpose. Can anyone explain it?
nik
  • 105
  • 1
  • 10
3
votes
1 answer

Prediction of independent data with PLS

In Matlab's plsregress function and in many other statistics toolboxes, a BETA vector is returned that simplifies the regression problem to (excluding the intercept term in BETA): Y=X*BETA In almost all documentation, this BETA vector is used…
gunakkoc
  • 1,382
  • 1
  • 10
  • 23
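
A minimal sketch of applying a coefficient vector of this kind to new data. It assumes a single response and that the first element of the vector is the intercept, which is what the excerpt's remark about "the intercept term in BETA" suggests. The function name `predict_with_beta` and all numbers are hypothetical.

```python
def predict_with_beta(X_new, beta):
    """Predict y for each row of X_new, treating beta[0] as the intercept."""
    intercept, coefs = beta[0], beta[1:]
    return [intercept + sum(x * b for x, b in zip(row, coefs)) for row in X_new]

beta = [1.0, 2.0, -1.0]        # hypothetical: intercept, then one coefficient per variable
X_new = [[1.0, 1.0],
         [3.0, 0.5]]           # hypothetical new observations, raw (uncentered) units

y_hat = predict_with_beta(X_new, beta)
print(y_hat)
```

Because the intercept absorbs the training means, the new rows are passed in without centering in this sketch.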
3
votes
1 answer

Kruskal-Wallis: how to handle ties that might not be really ties?

I have data from a mass spectrometer that are precise to 6 decimal places and range from 0.1 to 10. However, some items cannot be measured by the mass spectrometer because they are "below the limit of quantitation" (BLQ). Researchers I'm working…
shorty
  • 31
  • 2
3
votes
1 answer

Is removing points from a calibration rigorous?

When a calibration is generated from a set of standards run on an analytical instrument, should the standards be remade and reanalyzed if not all of the points fit within 20%-30% (depending on regulations) of the regression or give a coefficient of…
DifferentialPleiometry
  • 2,274
  • 1
  • 11
  • 27