package org.apache.commons.math.stat.inference;

import org.apache.commons.math.MathException;
import org.apache.commons.math.stat.descriptive.StatisticalSummary;

/**
 * An interface for Student's t-tests.
 * <p>
 * Tests can be:<ul>
 * <li>One-sample or two-sample</li>
 * <li>One-sided or two-sided</li>
 * <li>Paired or unpaired (for two-sample tests)</li>
 * <li>Homoscedastic (equal variance assumption) or heteroscedastic
 * (for two-sample tests)</li>
 * <li>Fixed significance level (boolean-valued) or returning p-values.
 * </li></ul>
 * <p>
 * Test statistics are available for all tests. Methods including "Test" in
 * their names perform tests; all other methods return t-statistics. Among
 * the "Test" methods, <code>double-</code>valued methods return p-values;
 * <code>boolean-</code>valued methods perform fixed significance level tests.
 * Significance levels are always specified as numbers between 0 and 0.5
 * (e.g. tests at the 95% level use <code>alpha=0.05</code>).
 * <p>
 * Input to tests can be either <code>double[]</code> arrays or
 * {@link StatisticalSummary} instances.
 *
 * @version $Revision: 161625 $ $Date: 2005-04-16 22:12:15 -0700 (Sat, 16 Apr 2005) $
 */
public interface TTest {
    /**
     * Computes a paired, 2-sample t-statistic based on the data in the input
     * arrays. The t-statistic returned is equivalent to what would be returned by
     * computing the one-sample t-statistic {@link #t(double, double[])}, with
     * <code>mu = 0</code> and the sample array consisting of the (signed)
     * differences between corresponding entries in <code>sample1</code> and
     * <code>sample2</code>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The input arrays must have the same length and their common length
     * must be at least 2.
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @return t statistic
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if the statistic cannot be computed due to a
     *         convergence or other numerical error.
     */
    public abstract double pairedT(double[] sample1, double[] sample2)
        throws IllegalArgumentException, MathException;
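    /*
     * Illustrative sketch only, not part of the TTest contract. The
     * equivalence described for pairedT above can be checked by hand: form the
     * signed differences and apply a one-sample statistic with mu = 0. The
     * names PairedTSketch and oneSampleT are hypothetical, and the one-sample
     * formula (mean - mu) / sqrt(var / n), with the n - 1 sample variance, is
     * an assumption rather than something taken from an implementation of
     * this interface.

```java
final class PairedTSketch {

    // One-sample t statistic: (mean - mu) / sqrt(var / n), where var is the
    // sample variance computed with the n - 1 denominator.
    static double oneSampleT(double mu, double[] x) {
        int n = x.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least 2 observations");
        }
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        double mean = sum / n;
        double ss = 0.0;
        for (double v : x) {
            ss += (v - mean) * (v - mean);
        }
        double var = ss / (n - 1);
        return (mean - mu) / Math.sqrt(var / n);
    }

    // Paired statistic: one-sample statistic of the signed differences, mu = 0.
    static double pairedT(double[] sample1, double[] sample2) {
        if (sample1.length != sample2.length) {
            throw new IllegalArgumentException("samples must have equal length");
        }
        double[] diff = new double[sample1.length];
        for (int i = 0; i < diff.length; i++) {
            diff[i] = sample1[i] - sample2[i];
        }
        return oneSampleT(0.0, diff);
    }
}
```

     * For example, sample1 = {1, 2, 3, 4} and sample2 = {0, 0, 1, 1} give the
     * differences {1, 2, 2, 3}, with mean 2 and sample variance 2/3, so the
     * statistic is 2 / sqrt(1/6), roughly 4.899.
     */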
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a paired, two-sample, two-tailed t-test
     * based on the data in the input arrays.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the mean of the paired
     * differences is 0 in favor of the two-sided alternative that the mean paired
     * difference is not equal to 0. For a one-sided test, divide the returned
     * value by 2.
     * <p>
     * This test is equivalent to a one-sample t-test computed using
     * {@link #tTest(double, double[])} with <code>mu = 0</code> and the sample
     * array consisting of the signed differences between corresponding elements of
     * <code>sample1</code> and <code>sample2</code>.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the p-value depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The input array lengths must be the same and their common length must
     * be at least 2.
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @return p-value for t-test
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double pairedTTest(double[] sample1, double[] sample2)
        throws IllegalArgumentException, MathException;
    /**
     * Performs a paired t-test evaluating the null hypothesis that the
     * mean of the paired differences between <code>sample1</code> and
     * <code>sample2</code> is 0 in favor of the two-sided alternative that the
     * mean paired difference is not equal to 0, with significance level
     * <code>alpha</code>.
     * <p>
     * Returns <code>true</code> iff the null hypothesis can be rejected with
     * confidence <code>1 - alpha</code>. To perform a 1-sided test, use
     * <code>alpha * 2</code>.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The input array lengths must be the same and their common length
     * must be at least 2.
     * </li>
     * <li><code>0 &lt; alpha &lt; 0.5</code>
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @param alpha significance level of the test
     * @return true if the null hypothesis can be rejected with
     *         confidence 1 - alpha
     * @throws IllegalArgumentException if the preconditions are not met
     * @throws MathException if an error occurs performing the test
     */
    public abstract boolean pairedTTest(
        double[] sample1,
        double[] sample2,
        double alpha)
        throws IllegalArgumentException, MathException;
    /**
     * Computes a <a href="http://www.itl.nist.gov/div898/handbook/prc/section2/prc22.htm#formula">
     * t statistic</a> given observed values and a comparison constant.
     * <p>
     * This statistic can be used to perform a one-sample t-test for the mean.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array length must be at least 2.
     * </li></ul>
     *
     * @param mu comparison constant
     * @param observed array of values
     * @return t statistic
     * @throws IllegalArgumentException if input array length is less than 2
     */
    public abstract double t(double mu, double[] observed)
        throws IllegalArgumentException;
    /**
     * Computes a <a href="http://www.itl.nist.gov/div898/handbook/prc/section2/prc22.htm#formula">
     * t statistic</a> to use in comparing the mean of the dataset described by
     * <code>sampleStats</code> to <code>mu</code>.
     * <p>
     * This statistic can be used to perform a one-sample t-test for the mean.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li><code>sampleStats.getN() &gt;= 2</code>.
     * </li></ul>
     *
     * @param mu comparison constant
     * @param sampleStats StatisticalSummary holding sample summary statistics
     * @return t statistic
     * @throws IllegalArgumentException if the precondition is not met
     */
    public abstract double t(double mu, StatisticalSummary sampleStats)
        throws IllegalArgumentException;
    /**
     * Computes a 2-sample t statistic, under the hypothesis of equal
     * subpopulation variances. To compute a t-statistic without the
     * equal variances hypothesis, use {@link #t(double[], double[])}.
     * <p>
     * This statistic can be used to perform a (homoscedastic) two-sample
     * t-test to compare sample means.
     * <p>
     * The t-statistic is
     * <p>
     * <code>t = (m1 - m2) / (sqrt(1/n1 + 1/n2) sqrt(var))</code>
     * <p>
     * where <strong><code>n1</code></strong> is the size of the first sample;
     * <strong><code>n2</code></strong> is the size of the second sample;
     * <strong><code>m1</code></strong> is the mean of the first sample;
     * <strong><code>m2</code></strong> is the mean of the second sample;
     * and <strong><code>var</code></strong> is the pooled variance estimate:
     * <p>
     * <code>var = ((n1 - 1)var1 + (n2 - 1)var2) / ((n1 - 1) + (n2 - 1))</code>
     * <p>
     * with <strong><code>var1</code></strong> the variance of the first sample and
     * <strong><code>var2</code></strong> the variance of the second sample.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array lengths must both be at least 2.
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @return t statistic
     * @throws IllegalArgumentException if the precondition is not met
     */
    public abstract double homoscedasticT(double[] sample1, double[] sample2)
        throws IllegalArgumentException;
    /**
     * Computes a 2-sample t statistic, without the hypothesis of equal
     * subpopulation variances. To compute a t-statistic assuming equal
     * variances, use {@link #homoscedasticT(double[], double[])}.
     * <p>
     * This statistic can be used to perform a two-sample t-test to compare
     * sample means.
     * <p>
     * The t-statistic is
     * <p>
     * <code>t = (m1 - m2) / sqrt(var1/n1 + var2/n2)</code>
     * <p>
     * where <strong><code>n1</code></strong> is the size of the first sample;
     * <strong><code>n2</code></strong> is the size of the second sample;
     * <strong><code>m1</code></strong> is the mean of the first sample;
     * <strong><code>m2</code></strong> is the mean of the second sample;
     * <strong><code>var1</code></strong> is the variance of the first sample;
     * <strong><code>var2</code></strong> is the variance of the second sample.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array lengths must both be at least 2.
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @return t statistic
     * @throws IllegalArgumentException if the precondition is not met
     */
    public abstract double t(double[] sample1, double[] sample2)
        throws IllegalArgumentException;
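    /*
     * Illustrative sketch only, not part of the TTest contract. The two
     * two-sample statistics documented above differ only in the denominator.
     * The hypothetical class TwoSampleTSketch below computes both from summary
     * values (means, variances, sizes). It assumes the pooled estimate is the
     * variance itself, that is, the weighted average of var1 and var2, so that
     * sqrt(var) in the homoscedastic formula is a standard deviation.

```java
final class TwoSampleTSketch {

    // Equal-variance statistic:
    // t = (m1 - m2) / (sqrt(1/n1 + 1/n2) * sqrt(var)), var = pooled variance.
    static double pooledT(double m1, double m2, double var1, double var2,
                          double n1, double n2) {
        double pooledVar =
            ((n1 - 1) * var1 + (n2 - 1) * var2) / ((n1 - 1) + (n2 - 1));
        return (m1 - m2) / (Math.sqrt(1.0 / n1 + 1.0 / n2) * Math.sqrt(pooledVar));
    }

    // Unequal-variance statistic: t = (m1 - m2) / sqrt(var1/n1 + var2/n2).
    static double welchT(double m1, double m2, double var1, double var2,
                         double n1, double n2) {
        return (m1 - m2) / Math.sqrt(var1 / n1 + var2 / n2);
    }
}
```

     * When n1 equals n2 the two denominators are algebraically identical, so
     * the statistics coincide; they diverge once the sample sizes differ and
     * the sample variances are unequal.
     */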
    /**
     * Computes a 2-sample t statistic, comparing the means of the datasets
     * described by two {@link StatisticalSummary} instances, without the
     * assumption of equal subpopulation variances. Use
     * {@link #homoscedasticT(StatisticalSummary, StatisticalSummary)} to
     * compute a t-statistic under the equal variances assumption.
     * <p>
     * This statistic can be used to perform a two-sample t-test to compare
     * sample means.
     * <p>
     * The returned t-statistic is
     * <p>
     * <code>t = (m1 - m2) / sqrt(var1/n1 + var2/n2)</code>
     * <p>
     * where <strong><code>n1</code></strong> is the size of the first sample;
     * <strong><code>n2</code></strong> is the size of the second sample;
     * <strong><code>m1</code></strong> is the mean of the first sample;
     * <strong><code>m2</code></strong> is the mean of the second sample;
     * <strong><code>var1</code></strong> is the variance of the first sample;
     * <strong><code>var2</code></strong> is the variance of the second sample.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The datasets described by the two StatisticalSummary instances must
     * each contain at least 2 observations.
     * </li></ul>
     *
     * @param sampleStats1 StatisticalSummary describing data from the first sample
     * @param sampleStats2 StatisticalSummary describing data from the second sample
     * @return t statistic
     * @throws IllegalArgumentException if the precondition is not met
     */
    public abstract double t(
        StatisticalSummary sampleStats1,
        StatisticalSummary sampleStats2)
        throws IllegalArgumentException;
    /**
     * Computes a 2-sample t statistic, comparing the means of the datasets
     * described by two {@link StatisticalSummary} instances, under the
     * assumption of equal subpopulation variances. To compute a t-statistic
     * without the equal variances assumption, use
     * {@link #t(StatisticalSummary, StatisticalSummary)}.
     * <p>
     * This statistic can be used to perform a (homoscedastic) two-sample
     * t-test to compare sample means.
     * <p>
     * The t-statistic returned is
     * <p>
     * <code>t = (m1 - m2) / (sqrt(1/n1 + 1/n2) sqrt(var))</code>
     * <p>
     * where <strong><code>n1</code></strong> is the size of the first sample;
     * <strong><code>n2</code></strong> is the size of the second sample;
     * <strong><code>m1</code></strong> is the mean of the first sample;
     * <strong><code>m2</code></strong> is the mean of the second sample;
     * and <strong><code>var</code></strong> is the pooled variance estimate:
     * <p>
     * <code>var = ((n1 - 1)var1 + (n2 - 1)var2) / ((n1 - 1) + (n2 - 1))</code>
     * <p>
     * with <strong><code>var1</code></strong> the variance of the first sample and
     * <strong><code>var2</code></strong> the variance of the second sample.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The datasets described by the two StatisticalSummary instances must
     * each contain at least 2 observations.
     * </li></ul>
     *
     * @param sampleStats1 StatisticalSummary describing data from the first sample
     * @param sampleStats2 StatisticalSummary describing data from the second sample
     * @return t statistic
     * @throws IllegalArgumentException if the precondition is not met
     */
    public abstract double homoscedasticT(
        StatisticalSummary sampleStats1,
        StatisticalSummary sampleStats2)
        throws IllegalArgumentException;
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a one-sample, two-tailed t-test
     * comparing the mean of the input array with the constant <code>mu</code>.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the mean equals
     * <code>mu</code> in favor of the two-sided alternative that the mean
     * is different from <code>mu</code>. For a one-sided test, divide the
     * returned value by 2.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array length must be at least 2.
     * </li></ul>
     *
     * @param mu constant value to compare sample mean against
     * @param sample array of sample data values
     * @return p-value
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double tTest(double mu, double[] sample)
        throws IllegalArgumentException, MathException;
    /**
     * Performs a <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm">
     * two-sided t-test</a> evaluating the null hypothesis that the mean of the population from
     * which <code>sample</code> is drawn equals <code>mu</code>.
     * <p>
     * Returns <code>true</code> iff the null hypothesis can be
     * rejected with confidence <code>1 - alpha</code>. To
     * perform a 1-sided test, use <code>alpha * 2</code>.
     * <p>
     * <strong>Examples:</strong><br><ol>
     * <li>To test the (2-sided) hypothesis <code>sample mean = mu</code> at
     * the 95% level, use <br><code>tTest(mu, sample, 0.05)</code>
     * </li>
     * <li>To test the (one-sided) hypothesis <code>sample mean &lt; mu</code>
     * at the 99% level, first verify that the measured sample mean is less
     * than <code>mu</code> and then use
     * <br><code>tTest(mu, sample, 0.02)</code>
     * </li></ol>
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the one-sample
     * parametric t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/sg_glos.html#one-sample">here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array length must be at least 2.
     * </li></ul>
     *
     * @param mu constant value to compare sample mean against
     * @param sample array of sample data values
     * @param alpha significance level of the test
     * @return true if the null hypothesis can be rejected with
     *         confidence 1 - alpha
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs performing the test
     */
    public abstract boolean tTest(double mu, double[] sample, double alpha)
        throws IllegalArgumentException, MathException;
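    /*
     * Illustrative sketch only, not part of the TTest contract. The one-sided
     * recipe in the examples above (check the direction of the sample mean,
     * then run the two-sided test at twice the significance level) can be
     * written out as a decision rule. The class OneSidedSketch and its methods
     * are hypothetical, and the rule "reject when the two-sided p-value is
     * below alpha" is an assumption about how a fixed-level test would be
     * decided, not a statement about any particular implementation.

```java
final class OneSidedSketch {

    // Assumed two-sided rule: reject when the two-sided p-value is below alpha.
    static boolean rejectTwoSided(double pTwoSided, double alpha) {
        return pTwoSided < alpha;
    }

    // One-sided test in favor of "mean < mu" at level alpha: the sample mean
    // must fall on the hypothesized side, and the two-sided test is then run
    // at level 2 * alpha.
    static boolean rejectLowerTail(double sampleMean, double mu,
                                   double pTwoSided, double alpha) {
        return sampleMean < mu && rejectTwoSided(pTwoSided, 2.0 * alpha);
    }
}
```

     * With a two-sided p-value of 0.015 and a sample mean below mu, the
     * one-sided test at the 99% level (alpha = 0.01, so 0.02 is passed to the
     * two-sided rule) rejects, while the two-sided test at alpha = 0.01 does
     * not.
     */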
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a one-sample, two-tailed t-test
     * comparing the mean of the dataset described by <code>sampleStats</code>
     * with the constant <code>mu</code>.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the mean equals
     * <code>mu</code> in favor of the two-sided alternative that the mean
     * is different from <code>mu</code>. For a one-sided test, divide the
     * returned value by 2.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The sample must contain at least 2 observations.
     * </li></ul>
     *
     * @param mu constant value to compare sample mean against
     * @param sampleStats StatisticalSummary describing sample data
     * @return p-value
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double tTest(double mu, StatisticalSummary sampleStats)
        throws IllegalArgumentException, MathException;
    /**
     * Performs a <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm">
     * two-sided t-test</a> evaluating the null hypothesis that the mean of the
     * population from which the dataset described by <code>sampleStats</code> is
     * drawn equals <code>mu</code>.
     * <p>
     * Returns <code>true</code> iff the null hypothesis can be rejected with
     * confidence <code>1 - alpha</code>. To perform a 1-sided test, use
     * <code>alpha * 2</code>.
     * <p>
     * <strong>Examples:</strong><br><ol>
     * <li>To test the (2-sided) hypothesis <code>sample mean = mu</code> at
     * the 95% level, use <br><code>tTest(mu, sampleStats, 0.05)</code>
     * </li>
     * <li>To test the (one-sided) hypothesis <code>sample mean &lt; mu</code>
     * at the 99% level, first verify that the measured sample mean is less
     * than <code>mu</code> and then use
     * <br><code>tTest(mu, sampleStats, 0.02)</code>
     * </li></ol>
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the one-sample
     * parametric t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/sg_glos.html#one-sample">here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The sample must include at least 2 observations.
     * </li></ul>
     *
     * @param mu constant value to compare sample mean against
     * @param sampleStats StatisticalSummary describing sample data values
     * @param alpha significance level of the test
     * @return true if the null hypothesis can be rejected with
     *         confidence 1 - alpha
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs performing the test
     */
    public abstract boolean tTest(
        double mu,
        StatisticalSummary sampleStats,
        double alpha)
        throws IllegalArgumentException, MathException;
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a two-sample, two-tailed t-test
     * comparing the means of the input arrays.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the two means are
     * equal in favor of the two-sided alternative that they are different.
     * For a one-sided test, divide the returned value by 2.
     * <p>
     * The test does not assume that the underlying population variances are
     * equal and it uses approximated degrees of freedom computed from the
     * sample data to compute the p-value. The t-statistic used is as defined in
     * {@link #t(double[], double[])} and the Welch-Satterthwaite approximation
     * to the degrees of freedom is used, as described
     * <a href="http://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm">
     * here</a>. To perform the test under the assumption of equal subpopulation
     * variances, use {@link #homoscedasticTTest(double[], double[])}.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the p-value depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array lengths must both be at least 2.
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @return p-value for t-test
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double tTest(double[] sample1, double[] sample2)
        throws IllegalArgumentException, MathException;
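    /*
     * Illustrative sketch only, not part of the TTest contract. The
     * Welch-Satterthwaite approximation mentioned above is commonly written as
     * df = (var1/n1 + var2/n2)^2
     *      / ((var1/n1)^2 / (n1 - 1) + (var2/n2)^2 / (n2 - 1)).
     * The hypothetical class WelchDfSketch below evaluates that expression
     * from the sample variances and sizes; whether a given implementation of
     * this interface uses exactly this form is an assumption.

```java
final class WelchDfSketch {

    // Approximate degrees of freedom for the unequal-variance two-sample test.
    static double degreesOfFreedom(double var1, double var2, double n1, double n2) {
        double a = var1 / n1; // squared standard error contributed by sample 1
        double b = var2 / n2; // squared standard error contributed by sample 2
        return (a + b) * (a + b) / (a * a / (n1 - 1) + b * b / (n2 - 1));
    }
}
```

     * With equal variances and equal sample sizes n the expression reduces to
     * 2(n - 1), which matches the n1 + n2 - 2 degrees of freedom of the
     * pooled-variance test. In general the value lies between
     * min(n1 - 1, n2 - 1) and n1 + n2 - 2.
     */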
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a two-sample, two-tailed t-test
     * comparing the means of the input arrays, under the assumption that
     * the two samples are drawn from subpopulations with equal variances.
     * To perform the test without the equal variances assumption, use
     * {@link #tTest(double[], double[])}.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the two means are
     * equal in favor of the two-sided alternative that they are different.
     * For a one-sided test, divide the returned value by 2.
     * <p>
     * A pooled variance estimate is used to compute the t-statistic. See
     * {@link #homoscedasticT(double[], double[])}. The sum of the sample sizes
     * minus 2 is used as the degrees of freedom.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the p-value depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array lengths must both be at least 2.
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @return p-value for t-test
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double homoscedasticTTest(
        double[] sample1,
        double[] sample2)
        throws IllegalArgumentException, MathException;
    /**
     * Performs a
     * <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm">
     * two-sided t-test</a> evaluating the null hypothesis that <code>sample1</code>
     * and <code>sample2</code> are drawn from populations with the same mean,
     * with significance level <code>alpha</code>. This test does not assume
     * that the subpopulation variances are equal. To perform the test assuming
     * equal variances, use
     * {@link #homoscedasticTTest(double[], double[], double)}.
     * <p>
     * Returns <code>true</code> iff the null hypothesis that the means are
     * equal can be rejected with confidence <code>1 - alpha</code>. To
     * perform a 1-sided test, use <code>alpha * 2</code>.
     * <p>
     * See {@link #t(double[], double[])} for the formula used to compute the
     * t-statistic. Degrees of freedom are approximated using the
     * <a href="http://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm">
     * Welch-Satterthwaite approximation</a>.
     * <p>
     * <strong>Examples:</strong><br><ol>
     * <li>To test the (2-sided) hypothesis <code>mean 1 = mean 2</code> at
     * the 95% level, use
     * <br><code>tTest(sample1, sample2, 0.05)</code>
     * </li>
     * <li>To test the (one-sided) hypothesis <code>mean 1 &lt; mean 2</code>
     * at the 99% level, first verify that the measured mean of <code>sample 1</code>
     * is less than the mean of <code>sample 2</code> and then use
     * <br><code>tTest(sample1, sample2, 0.02)</code>
     * </li></ol>
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array lengths must both be at least 2.
     * </li>
     * <li><code>0 &lt; alpha &lt; 0.5</code>
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @param alpha significance level of the test
     * @return true if the null hypothesis can be rejected with
     *         confidence 1 - alpha
     * @throws IllegalArgumentException if the preconditions are not met
     * @throws MathException if an error occurs performing the test
     */
    public abstract boolean tTest(
        double[] sample1,
        double[] sample2,
        double alpha)
        throws IllegalArgumentException, MathException;
    /**
     * Performs a
     * <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm">
     * two-sided t-test</a> evaluating the null hypothesis that <code>sample1</code>
     * and <code>sample2</code> are drawn from populations with the same mean,
     * with significance level <code>alpha</code>, assuming that the
     * subpopulation variances are equal. Use
     * {@link #tTest(double[], double[], double)} to perform the test without
     * the assumption of equal variances.
     * <p>
     * Returns <code>true</code> iff the null hypothesis that the means are
     * equal can be rejected with confidence <code>1 - alpha</code>. To
     * perform a 1-sided test, use <code>alpha * 2</code>.
     * <p>
     * A pooled variance estimate is used to compute the t-statistic. See
     * {@link #homoscedasticT(double[], double[])} for the formula. The sum of
     * the sample sizes minus 2 is used as the degrees of freedom.
     * <p>
     * <strong>Examples:</strong><br><ol>
     * <li>To test the (2-sided) hypothesis <code>mean 1 = mean 2</code> at
     * the 95% level, use
     * <br><code>homoscedasticTTest(sample1, sample2, 0.05)</code>
     * </li>
     * <li>To test the (one-sided) hypothesis <code>mean 1 &lt; mean 2</code>
     * at the 99% level, first verify that the measured mean of
     * <code>sample 1</code> is less than the mean of <code>sample 2</code>
     * and then use
     * <br><code>homoscedasticTTest(sample1, sample2, 0.02)</code>
     * </li></ol>
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The observed array lengths must both be at least 2.
     * </li>
     * <li><code>0 &lt; alpha &lt; 0.5</code>
     * </li></ul>
     *
     * @param sample1 array of sample data values
     * @param sample2 array of sample data values
     * @param alpha significance level of the test
     * @return true if the null hypothesis can be rejected with
     *         confidence 1 - alpha
     * @throws IllegalArgumentException if the preconditions are not met
     * @throws MathException if an error occurs performing the test
     */
    public abstract boolean homoscedasticTTest(
        double[] sample1,
        double[] sample2,
        double alpha)
        throws IllegalArgumentException, MathException;
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a two-sample, two-tailed t-test
     * comparing the means of the datasets described by two StatisticalSummary
     * instances.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the two means are
     * equal in favor of the two-sided alternative that they are different.
     * For a one-sided test, divide the returned value by 2.
     * <p>
     * The test does not assume that the underlying population variances are
     * equal and it uses approximated degrees of freedom computed from the
     * sample data to compute the p-value. To perform the test assuming
     * equal variances, use
     * {@link #homoscedasticTTest(StatisticalSummary, StatisticalSummary)}.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the p-value depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The datasets described by the two StatisticalSummary instances must
     * each contain at least 2 observations.
     * </li></ul>
     *
     * @param sampleStats1 StatisticalSummary describing data from the first sample
     * @param sampleStats2 StatisticalSummary describing data from the second sample
     * @return p-value for t-test
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double tTest(
        StatisticalSummary sampleStats1,
        StatisticalSummary sampleStats2)
        throws IllegalArgumentException, MathException;
    /**
     * Returns the <i>observed significance level</i>, or
     * <i>p-value</i>, associated with a two-sample, two-tailed t-test
     * comparing the means of the datasets described by two StatisticalSummary
     * instances, under the hypothesis of equal subpopulation variances. To
     * perform a test without the equal variances assumption, use
     * {@link #tTest(StatisticalSummary, StatisticalSummary)}.
     * <p>
     * The number returned is the smallest significance level
     * at which one can reject the null hypothesis that the two means are
     * equal in favor of the two-sided alternative that they are different.
     * For a one-sided test, divide the returned value by 2.
     * <p>
     * See {@link #homoscedasticT(double[], double[])} for the formula used to
     * compute the t-statistic. The sum of the sample sizes minus 2 is used as
     * the degrees of freedom.
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the p-value depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The datasets described by the two StatisticalSummary instances must
     * each contain at least 2 observations.
     * </li></ul>
     *
     * @param sampleStats1 StatisticalSummary describing data from the first sample
     * @param sampleStats2 StatisticalSummary describing data from the second sample
     * @return p-value for t-test
     * @throws IllegalArgumentException if the precondition is not met
     * @throws MathException if an error occurs computing the p-value
     */
    public abstract double homoscedasticTTest(
        StatisticalSummary sampleStats1,
        StatisticalSummary sampleStats2)
        throws IllegalArgumentException, MathException;
    /**
     * Performs a
     * <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda353.htm">
     * two-sided t-test</a> evaluating the null hypothesis that
     * <code>sampleStats1</code> and <code>sampleStats2</code> describe
     * datasets drawn from populations with the same mean, with significance
     * level <code>alpha</code>. This test does not assume that the
     * subpopulation variances are equal. To perform the test under the equal
     * variances assumption, use
     * {@link #homoscedasticTTest(StatisticalSummary, StatisticalSummary)}.
     * <p>
     * Returns <code>true</code> iff the null hypothesis that the means are
     * equal can be rejected with confidence <code>1 - alpha</code>. To
     * perform a 1-sided test, use <code>alpha * 2</code>.
     * <p>
     * See {@link #t(double[], double[])} for the formula used to compute the
     * t-statistic. Degrees of freedom are approximated using the
     * <a href="http://www.itl.nist.gov/div898/handbook/prc/section3/prc31.htm">
     * Welch-Satterthwaite approximation</a>.
     * <p>
     * <strong>Examples:</strong><br><ol>
     * <li>To test the (2-sided) hypothesis <code>mean 1 = mean 2</code> at
     * the 95% level, use
     * <br><code>tTest(sampleStats1, sampleStats2, 0.05)</code>
     * </li>
     * <li>To test the (one-sided) hypothesis <code>mean 1 &lt; mean 2</code>
     * at the 99% level, first verify that the measured mean of
     * <code>sample 1</code> is less than the mean of <code>sample 2</code>
     * and then use
     * <br><code>tTest(sampleStats1, sampleStats2, 0.02)</code>
     * </li></ol>
     * <p>
     * <strong>Usage Note:</strong><br>
     * The validity of the test depends on the assumptions of the parametric
     * t-test procedure, as discussed
     * <a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
     * here</a>.
     * <p>
     * <strong>Preconditions</strong>: <ul>
     * <li>The datasets described by the two StatisticalSummary instances must
     * each contain at least 2 observations.
     * </li>
     * <li><code>0 &lt; alpha &lt; 0.5</code>
     * </li></ul>
     *
     * @param sampleStats1 StatisticalSummary describing sample data values
     * @param sampleStats2 StatisticalSummary describing sample data values
     * @param alpha significance level of the test
     * @return true if the null hypothesis can be rejected with
     *         confidence 1 - alpha
     * @throws IllegalArgumentException if the preconditions are not met
     * @throws MathException if an error occurs performing the test
     */
    public abstract boolean tTest(
        StatisticalSummary sampleStats1,
        StatisticalSummary sampleStats2,
        double alpha)
        throws IllegalArgumentException, MathException;
}