This is a useful article for validating your statistical analyses. However, much of what a BI analyst does deals with qualitative data that may not adhere as strictly to the validation recommendations and requirements presented here. Within the field of intelligence analysis, much work has been done to identify ways to quantify qualitative assessments of validity, reliability, analytic confidence, and other aspects to ensure validation of intelligence findings, many modeled on statistical validation. Think about your most recent project, whether for work or school. How could you numerically and objectively evaluate the validity of your research?
3. Validation Characteristics
Statistics recommended by ICH, USP, and FDA to evaluate method suitability are presented below.
3.1 Specificity/selectivity
Specificity is a quantitative indication of the extent to which a method can distinguish between the analyte of interest and interfering substances on the basis of signals produced under actual experimental conditions. Random interferences should be determined using representative blank samples. Table 4 presents an example of the specificity of a method developed for the determination of ornidazole in pharmaceutical formulations.
The % recovery was found to be in the range of 99.4–100.4%, so the excipients and additives did not interfere, which indicates the selectivity of the developed method.
Table 4 Determination of ornidazole in presence of excipients and additives.
Excipients and additives | Amount (mg) | %Recovery of ornidazole ± RSD* |
---|---|---|
Titanium dioxide | 40 | 100.1 ± 0.76 |
Talc | 50 | 100.4 ± 0.81 |
Hydroxypropyl methylcellulose | 50 | 100.14 ± 0.9 |
Corn starch | 50 | 99.98 ± 0.93 |
Methylhydroxyethylcellulose | 50 | 100.03 ± 0.94 |
Magnesium stearate | 40 | 99.4 ± 0.84 |
Microcrystalline cellulose | 30 | 99.66 ± 0.89 |
%Recovery = Mean value/Added amount × 100; RSD: relative standard deviation, \(\%RSD = (s / \overline X) \times 100\%\).
* Average of five determinations.
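The two footnote formulas can be sketched in a few lines of Python. The replicate values below are hypothetical (the article reports only summary figures, not the individual determinations), and the function names are illustrative.

```python
from statistics import mean, stdev

def percent_recovery(measured, added):
    """%Recovery = mean measured value / added amount x 100."""
    return mean(measured) / added * 100

def percent_rsd(values):
    """%RSD = sample standard deviation / mean x 100."""
    return stdev(values) / mean(values) * 100

# Hypothetical: five replicate determinations (mg) of a 50 mg spike
replicates = [50.3, 49.8, 50.6, 49.9, 50.4]

print(f"%Recovery = {percent_recovery(replicates, 50.0):.2f}")
print(f"%RSD      = {percent_rsd(replicates):.2f}")
```

A recovery close to 100% with a small %RSD for every excipient is what the table uses as evidence that the matrix does not interfere.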
3.2 Accuracy
Accuracy refers to closeness of agreement between the true value of the analyte concentration and the mean result obtained by applying experimental procedure to a large number of homogeneous samples. It is related to systematic error and analyte recovery.
Systematic errors can be established by the use of appropriate certified reference materials (matrix-matched) or by applying alternative analytical techniques. Table 5 provides an example of accuracy data assessment of an analytical method for atomic absorption spectrometry analysis of Pb in bivalve molluscs.
In this example, the calculated accuracy error is less than 2, so it is considered insignificant. Therefore, the uncertainty associated with the accuracy of the method is equal to the uncertainty of the reference material used in the accuracy study.
Table 5 Result of an accuracy study.
Number (n) | 10 |
Mean (d) | 0.989 |
Standard deviation (Sd) | 0.072 |
Concentration of reference material (C) (μg/g) | 1.19 |
Uncertainty of reference materials | 0.18 |
Relative error (%) | 16.879 |
Accuracy (%) | 83.121 |
Accuracy error | 1.036 |
Relative error % = greatest possible error/measured value; Accuracy % = \((\overline X / m) \times 100\), with: \(\overline X\) = mean of test results obtained for the reference sample, \(m\) = "true" value given for the reference sample; Accuracy error = \(\overline \nu - V\), with: \(\overline \nu\) = mean of measurements, \(V\) = true value.
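As a rough check, the accuracy percentage in Table 5 can be recomputed from the tabulated mean and the reference concentration. The sketch below assumes that accuracy is the ratio of the mean result to the reference value, and that the relative error is the complement to 100% (an inference from the table, where the two percentages sum to 100). The function names are illustrative.

```python
def accuracy_percent(mean_result, true_value):
    """Accuracy % = mean of test results / reference ("true") value x 100."""
    return mean_result / true_value * 100

def relative_error_percent(mean_result, true_value):
    """Assumed here to be the complement of the accuracy percentage."""
    return abs(true_value - mean_result) / true_value * 100

# Values copied from Table 5
mean_d = 0.989   # mean of the 10 determinations
c_ref = 1.19     # concentration of the reference material (ug/g)

print(f"Accuracy %       = {accuracy_percent(mean_d, c_ref):.3f}")
print(f"Relative error % = {relative_error_percent(mean_d, c_ref):.3f}")
```

With the rounded mean from the table this gives about 83.11% and 16.89%, close to the tabulated 83.121% and 16.879%. The small gap is presumably due to rounding of the reported mean.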
3.3 Precision
Precision is evaluated by comparing results obtained from samples prepared to test the following conditions:
- Repeatability expresses the precision under the same operating conditions over a short interval of time. Repeatability is also termed intra-assay precision.
- Intermediate precision expresses within-laboratory variations: different days, different analysts, different equipment, etc.
- Reproducibility expresses the precision between laboratories (collaborative studies, usually applied to standardization of methodology).
Both repeatability and reproducibility are expressed in terms of standard deviation and are generally dependent on analyte concentration. It is thus recommended that the repeatability and within-laboratory reproducibility be determined at different concentrations across the working range, by carrying out 10 repeated determinations at each concentration level. As noted by Horwitz and Albert, the variability between laboratories is the dominating error component in the world of practical ultratrace analysis. They conclude that a single laboratory cannot determine its own error structure, except in the context of certified reference materials or consensus results from other laboratories.
Table 6 provides an example of a typical data analysis summary for the evaluation of a precision study for an analytical method for the quantitation of Ecstasy in seizures by UV-Vis spectrophotometer. In this example, the method was tested in two different laboratories by two different analysts on two different instruments.
The standard deviations and the percentage recoveries (not more than 2%) indicate good precision of the method.
In the example provided in Table 6, precision is determined for a number of different levels during validation, which include system precision, repeatability, intermediate precision, and reproducibility. The system precision is evaluated by comparing the means and relative standard deviations. Reproducibility is assessed by means of an inter-laboratory assay. The intermediate precision is established by comparing analytical results obtained when using different analysts and instruments and performing the analysis on different days. The repeatability is assessed by measuring the variability in the results obtained when using the analytical method in a single determination. In each case, the mean and %RSD are calculated and compared to the established acceptance criteria.
Table 6 Example of results obtained for a precision study.
Parameter | Repeatability (1st day) | Repeatability (2nd day) | Replicability |
---|---|---|---|
Number (n) | 10 | 10 | 10 |
Mean (d) | 48.2531 | 49.05252 | 50 |
Standard deviation (Sd) | 0.0264673 | 0.056026 | 0.0168736 |
CV% | 0.054851 | 0.11421617 | 0.03356 |
Precision | 96.38% | 98.068% | 99.45% |
Tcrit (95;9) | 2.262 | 2.262 | |
Error | 3.62 | 1.931562 | |
Confidence interval | 48.253 ± 0.019 | 49.288 ± 0.196 | |
Mean of means | 48.65281 | | |
Mean of standard deviations | 0.04124665 | | |
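Several entries of Table 6 can be recomputed from the tabulated means and standard deviations. The sketch below assumes the usual definitions, CV% = Sd/mean × 100 and a confidence half-width of t × Sd/√n. These are standard formulas, but the article does not state which ones it used, and the function names are illustrative.

```python
from math import sqrt

def cv_percent(sd, mean):
    """Coefficient of variation: standard deviation as a percentage of the mean."""
    return sd / mean * 100

def ci_half_width(sd, n, t_crit):
    """Half-width of the two-sided confidence interval for the mean."""
    return t_crit * sd / sqrt(n)

# Summary statistics copied from the first two data columns of Table 6
days = {
    "1st day": {"n": 10, "mean": 48.2531, "sd": 0.0264673},
    "2nd day": {"n": 10, "mean": 49.05252, "sd": 0.056026},
}
T_CRIT = 2.262  # Student t, 95 %, 9 degrees of freedom (from Table 6)

for label, d in days.items():
    cv = cv_percent(d["sd"], d["mean"])
    hw = ci_half_width(d["sd"], d["n"], T_CRIT)
    print(f"{label}: CV% = {cv:.4f}, CI = {d['mean']:.3f} ± {hw:.3f}")

mean_of_means = sum(d["mean"] for d in days.values()) / len(days)
mean_of_sds = sum(d["sd"] for d in days.values()) / len(days)
print(f"mean of means = {mean_of_means:.5f}, mean of SDs = {mean_of_sds:.8f}")
```

For the first day this reproduces the tabulated CV% (about 0.0549) and the ± 0.019 half-width. The mean of means and the mean of standard deviations also match the last two rows of the table.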
3.4 Detection limit
The ICH guideline mentions several approaches for determining the detection limit: visual inspection, signal-to-noise, and using the standard deviation of the response and the slope. The detection limit and the method used for determining the detection limit should be presented. If visual evaluation is used, the detection limit is determined by the analysis of samples with known concentration of analyte and by establishing the minimum level at which the analyte can be reliably detected. The signal-to-noise ratio is determined by comparing measured signals from samples with known low concentrations of analyte with those of blank samples. When the detection limit is based on the standard deviation of the response and the slope, it is calculated using the following equation.
\(LDM = \frac{3.3 \sigma}{S}\),
where \(\sigma\) is the standard deviation of the response and \(S\) is the slope of the calibration curve.
The limit of detection is usually expressed as the analyte concentration corresponding to the sample blank plus three sample standard deviations, based on 10 independent analyses of sample blanks.
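The slope-based approach can be illustrated as follows: the detection limit is 3.3 times the standard deviation of the response divided by the calibration slope. The blank readings and the slope below are hypothetical values chosen only to show the arithmetic, and `detection_limit` is an illustrative helper name.

```python
from statistics import stdev

def detection_limit(sigma, slope, factor=3.3):
    """Detection limit = factor x (standard deviation of the response) / slope."""
    return factor * sigma / slope

# Hypothetical: 10 independent blank responses (absorbance units)
blank_responses = [0.011, 0.013, 0.009, 0.012, 0.010,
                   0.014, 0.011, 0.010, 0.012, 0.013]
slope = 0.052  # hypothetical calibration slope (absorbance per ug/mL)

sigma = stdev(blank_responses)
ldm = detection_limit(sigma, slope)
print(f"sigma = {sigma:.5f}, detection limit = {ldm:.3f} ug/mL")
```

Here the standard deviation of the response is estimated from blank readings. The ICH guideline also allows it to be taken from the regression line, in which case only the value passed as `sigma` changes.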
3.5 Quantitation limit
The ICH guideline mentions several approaches for determining the quantitation limit: an approach based on visual evaluation, an approach based on signal-to-noise, and an approach based on the standard deviation of the response and the slope. The quantitation limit and the method used for determining the quantitation limit should be presented. When the quantitation limit is based on the standard deviation of the response and the slope, it is calculated using the equation below:
\(LQM = \frac{10 \sigma}{S}\),
where \(\sigma\) is the standard deviation of the response and \(S\) is the slope of the calibration curve.
The limit of quantitation is set by various conventions to be five, six or ten standard deviations of the blank mean. It is also sometimes known as the limit of determination.
The LDM, defined as the lowest detectable concentration on the calibration curve at which both accuracy and precision are within the maximum tolerable CV of 5.32%, was deemed to be 0.406 μg/mL. This LDM is adequate for the analysis of forensic samples, as this value falls within the concentration range of MDMA in many ecstasy tablets analyzed. Furthermore, the ratio of conformity (6.26) is between 4 and 10, so the LDM is validated.
Table 7 Limit of detection and limit of quantification data of a method for the quantitation of ecstasy in seizures by UV-Vis spectrophotometer.
Number (n) | 10 |
Mean (d) | 2.54293 |
Standard deviation (Sd) | 0.1352983 |
CV% | 5.32 |
Precision | 82.02 |
LDM | 0.406 |
LQM | 1.35 |
R (ratio of conformity) | 6.26 |
\(R = \overline X / LDM\)
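The figures in Table 7 can be reproduced from the tabulated mean and standard deviation. The sketch below assumes that the limits were taken as three and ten standard deviations respectively; this is inferred from the numbers and is not stated by the article. The 4 to 10 acceptance window for the ratio of conformity is the one quoted in the text.

```python
# Values copied from Table 7
mean_d = 2.54293
sd = 0.1352983

cv = sd / mean_d * 100   # coefficient of variation, %
ldm = 3 * sd             # assumed: detection limit = 3 standard deviations
lqm = 10 * sd            # assumed: quantitation limit = 10 standard deviations
ratio = mean_d / ldm     # ratio of conformity, R = mean / LDM

def ldm_is_validated(r, low=4, high=10):
    """Acceptance rule quoted in the text: R must lie between 4 and 10."""
    return low <= r <= high

print(f"CV% = {cv:.2f}, LDM = {ldm:.3f}, LQM = {lqm:.2f}, R = {ratio:.2f}")
print("LDM validated:", ldm_is_validated(ratio))
```

Under these assumptions the script returns values that agree with the table to the reported precision (CV% 5.32, LDM 0.406, LQM 1.35, R about 6.26).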
3.6 Working and linear ranges
For any quantitative method, there is a range of analyte concentrations over which the method may be applied. At the lower end of the concentration range the limiting factor is the value of the limit of detection and/or limit of quantification. At the upper end of the concentration range limitations will be imposed by various effects depending on the detection mechanism.
Within this working range there may exist a linear range, within which the detection response will have a sufficiently linear relation to analyte concentration. The working and linear range may differ in different sample types according to the effect of interferences arising from the sample matrix. It is recommended that, in the first instance, the response relationship be examined over the working range by carrying out a single assessment of the response levels at a minimum of six concentration levels. To determine the response relationship within the linear range, it is recommended that three replicates be carried out at each of at least six concentration levels.
If there is a linear relationship, test results should be evaluated by linear regression analysis. The correlation coefficient, y-intercept, and slope of the regression line and the residual sum of squares should be submitted with a plot of the data.
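The quantities to be reported (slope, y-intercept, correlation coefficient and residual sum of squares) can be obtained with an ordinary least-squares fit. The following is a minimal sketch in plain Python. The calibration data are hypothetical and `linear_fit` is an illustrative helper, not a function from the article.

```python
from math import sqrt

def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept.

    Returns slope, intercept, correlation coefficient r and the
    residual sum of squares.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / sqrt(sxx * syy)
    rss = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    return slope, intercept, r, rss

# Hypothetical calibration: six concentration levels (ng/mL) and responses
conc = [1, 5, 10, 25, 50, 100]
resp = [2.9e4, 1.4e5, 2.8e5, 6.7e5, 1.36e6, 2.70e6]

slope, intercept, r, rss = linear_fit(conc, resp)
print(f"Y = {slope:.3e} X + {intercept:.3e}, r = {r:.4f}, RSS = {rss:.3e}")
```

A high r alone does not prove linearity, which is why the residual sum of squares and a plot of the data are requested alongside it.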
Table 8 Example of linear regression analysis of standard solutions for cocaine by an HPLC/MS/MS method.
Range (ng/mL) | Equation | r |
---|---|---|
0.1–500 | \(Y = 5.43 × 10^3 X - 1.05 × 10^5\) | 0.983 |
0.1–250 | \(Y = 7.82 × 10^3 X + 7.27 × 10^4\) | 0.9905 |
1–100 | \(Y = 2.71 × 10^4 X + 4.59 × 10^4\) | 0.9966 |
Table 9 Example of linear regression analysis of standard solutions for a 1–100 ng/mL concentration range, performed on 3 different days for cocaine by an HPLC/MS/MS method.
Day | Equation | Range (ng/mL) | r |
---|---|---|---|
1 | \(Y = 1.54 × 10^4 X + 6.55 × 10^3\) | 1–100 | 0.9919 |
2 | \(Y = 1.53 × 10^4 X\) | 1–100 | 0.9921 |
3 | \(Y = 2.71 × 10^4 X + 4.59 × 10^4\) | 1–100 | 0.9966 |
3.7 Sensitivity
Sensitivity is the measure of the change in instrument response which corresponds to a change in analyte concentration. Where the response has been established as linear with respect to concentration, sensitivity corresponds to the gradient of the response curve.
The recovery study reported in Table 10 shows that the calculated percentage recovery varies between 79.38% and 131.62%. These percentages were validated by Student's t-test.
Table 10 Result of a recovery study of cadmium quantification in bivalve molluscs using a graphite furnace atomic absorption spectrometry method.
C | Cf | Cf − C | Ca | %Recovery |
---|---|---|---|---|
21.5380 | 31.8610 | 10.323 | 10 | 103.23 |
25.5430 | 33.4810 | 7.938 | 10 | 79.38 |
16.9935 | 28.7975 | 11.804 | 10 | 118.04 |
15.5160 | 23.6830 | 8.167 | 7 | 116.67 |
9.8015 | 16.3830 | 6.581 | 5 | 131.62 |
Mean | | | | 109.788 |
Sd | | | | 19.747 |
α | | | | 0.05 |
Tcrit | | | | 2.78 |
Tobs | | | | 12.432 |
C: concentration before standard additions; Cf: concentration after standard additions; Ca: concentration of standard additions.
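The recovery column and the summary rows of Table 10 can be recomputed from the raw concentrations. In the sketch below, %Recovery is taken as (Cf − C)/Ca × 100, which matches the tabulated values. The expression used for the observed t statistic, mean × √n / Sd, is an inference: it reproduces the tabulated 12.432, but the article does not spell out the formula.

```python
from math import sqrt
from statistics import mean, stdev

# (C, Cf, Ca) triples copied from Table 10
rows = [
    (21.5380, 31.8610, 10),
    (25.5430, 33.4810, 10),
    (16.9935, 28.7975, 10),
    (15.5160, 23.6830, 7),
    (9.8015, 16.3830, 5),
]

def percent_recovery(c, cf, ca):
    """%Recovery = (found after addition - found before) / amount added x 100."""
    return (cf - c) / ca * 100

recoveries = [percent_recovery(*row) for row in rows]
rec_mean = mean(recoveries)
rec_sd = stdev(recoveries)
# Assumed form of the observed statistic (inferred from the table values)
t_obs = rec_mean * sqrt(len(recoveries)) / rec_sd

print([round(r, 2) for r in recoveries])
print(f"mean = {rec_mean:.3f}, Sd = {rec_sd:.3f}, Tobs = {t_obs:.3f}")
```

Small differences in the last digit relative to the table are expected, since the table rounds Cf − C to three decimals before dividing.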
3.8 Robustness
This is a measure of how effectively the performance of the analytical method stands up to less than perfect implementation. In any method there will be certain parts which will severely affect the method performance unless they are carried out with sufficient care. These aspects should be identified and, if possible, their influence on the method performance should be evaluated using robustness tests. These tests provide important information for the evaluation of the measurement uncertainty. The methodology for evaluating uncertainty given in the ISO Guide relies on identifying all parameters that may affect the result and on quantifying the uncertainty contribution from each source. This is very similar to procedures used in robustness tests, which identify all the parameters likely to influence the result and determine the acceptability of their influence through control. If carried out with this in mind, robustness tests can provide information on the contribution to the overall uncertainty from each of the parameters studied. Means and %RSDs are compared against the acceptance criteria to evaluate the impact of changing experimental parameters.
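One way to organise such a comparison is sketched below: each deliberately varied condition is summarised by its mean and %RSD and checked against acceptance limits. All data, condition names and the 2% limits are hypothetical; real criteria would come from the method's own validation protocol.

```python
from statistics import mean, stdev

def summarize(values):
    """Return the mean and %RSD of a set of results."""
    m = mean(values)
    return m, stdev(values) / m * 100

def within_criteria(values, nominal_mean, max_mean_shift_pct, max_rsd_pct):
    """True if the mean shift from nominal and the %RSD are both acceptable."""
    m, rsd = summarize(values)
    shift = abs(m - nominal_mean) / nominal_mean * 100
    return shift <= max_mean_shift_pct and rsd <= max_rsd_pct

# Hypothetical assay results (% of label claim) under nominal conditions
nominal = [100.1, 99.8, 100.3, 99.9, 100.0]
# Hypothetical results after deliberately changing one parameter at a time
variants = {
    "pH + 0.2": [99.6, 99.9, 100.2, 99.7, 100.1],
    "temperature + 5 C": [97.0, 103.5, 99.0, 102.0, 96.5],
}

nominal_mean, _ = summarize(nominal)
results = {
    name: within_criteria(vals, nominal_mean, max_mean_shift_pct=2.0, max_rsd_pct=2.0)
    for name, vals in variants.items()
}
for name, ok in results.items():
    print(f"{name}: {'acceptable' if ok else 'needs tighter control'}")
```

Parameters that fail such a check are the ones the method description should control most tightly, and they are likely to contribute most to the measurement uncertainty.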
The ICH guidelines suggest detailed validation schemes relative to the purpose of the methods. They list recommended data to report for each validation parameter. Acceptance criteria for validation must be based on the previous performance of the method, the product specifications and the phase of development.
The path to validation forms a continuum. It begins in the early phases of development as a set of informal experiments that establishes the soundness of the method for its intended purpose. It is expanded throughout the regulatory submission process into a fully documented report that is required for commercial production. It is repeated whenever there is a significant change in instrumentation, method, specifications or process.