This is the fourth post of a four-part series on simple statistics for clinical trials. The series is intended to offer, without too much technical detail, some background on how statistical planning can influence study effectiveness and value.

This series has discussed the types of data that can be collected in clinical studies, how sample size can influence reliability and error in interpreting study data, and methods to eliminate or reduce bias in study results. You’ve seen that careful, early planning for all of these aspects with a qualified clinical statistician, before a study begins, is essential.

It is also critical to understand, in the planning stages, how study data will be analyzed after the study is completed. Understanding statistical analysis concepts and terminology will help you communicate with the clinical study planning team, particularly the statistician, whose role is central to the planning process.

There are two forms of statistical analysis: descriptive statistics and inferential statistics.

Descriptive statistics

Descriptive statistics form the foundation of data analysis for clinical studies. Descriptive analysis allows researchers to get acquainted with the sample data by summarizing important information before further analysis is conducted. Descriptive statistics consider only the actual data collected from or about the study subjects, such as:

  • measures of frequency, such as counts or percentages of occurrence;
  • expressions of central tendency, such as mean (average of values), median (central value), and mode (most commonly occurring value);
  • descriptions of data dispersion or variation, such as range of values and standard deviation; and
  • indications of position, such as rankings by percentile or along a scale of possible values.
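
As a rough illustration of the four kinds of summary listed above, here is a minimal Python sketch using only the standard library. The blood pressure readings and skin-reaction labels are invented for the example, and the helper names (describe, percentages, percentile_rank) are hypothetical, not part of any particular statistical package.

```python
import statistics
from collections import Counter

# Hypothetical data: systolic blood pressure (mmHg) for ten subjects.
readings = [118, 122, 122, 125, 130, 131, 135, 122, 140, 128]
# Hypothetical categorical outcome recorded for the same ten subjects.
reactions = ["none", "none", "mild", "none", "moderate",
             "none", "mild", "none", "none", "none"]

def describe(values):
    """Central tendency and dispersion for a list of numbers."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "range": (min(values), max(values)),
        "stdev": statistics.stdev(values),  # sample standard deviation
    }

def percentages(labels):
    """Frequency of each category as a percentage of all observations."""
    counts = Counter(labels)
    return {label: 100 * count / len(labels) for label, count in counts.items()}

def percentile_rank(values, x):
    """Position of x: the percentage of observations at or below x."""
    return 100 * sum(v <= x for v in values) / len(values)

summary = describe(readings)
print(summary)
print(percentages(reactions))
print(percentile_rank(readings, 130))
```

For these made-up readings the mean is 127.3, the median is 126.5 and the mode is 122. Nothing here says anything about subjects outside the sample; it only summarizes what was collected.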

Inferential statistics

Inferential statistics goes further and uses the sample data to make reasoned, evidence-based conclusions and predictions about larger populations. One important function of inferential statistics is to inform judgments about whether an observed difference between groups is a dependable study result or happened by chance.
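
One way to make the "dependable result or chance" question concrete is a permutation test, sketched below in Python. This is only one of many inferential methods and is chosen here because it needs no special libraries; the function name permutation_p_value and the scores for the two groups are assumptions made up for illustration.

```python
from itertools import combinations
from statistics import mean

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test for a difference in group means.

    Every possible way of splitting the pooled data into two groups of the
    original sizes is examined. The p-value is the share of splits whose
    difference in means is at least as large as the one actually observed.
    """
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    for chosen in combinations(range(len(pooled)), len(group_a)):
        chosen = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        # Small tolerance guards against floating-point rounding.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
        total += 1
    return extreme / total

# Hypothetical improvement scores for five treated and five control subjects.
treated = [12, 14, 15, 16, 18]
control = [6, 7, 8, 9, 10]

p_value = permutation_p_value(treated, control)
print(p_value)
```

A small p-value means that a difference this large would rarely arise if group labels were assigned at random. The exhaustive loop is practical only for small groups; with larger samples a statistician would typically use a random subset of splits or a different test altogether.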

Regression analysis is an inferential tool used to determine how a change in one variable relates to a change in another variable, allowing researchers to estimate one value when another is known. 
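
The simplest case, a straight-line relationship between two variables, can be sketched in a few lines of Python using ordinary least squares. The dose and response values are invented, and fit_line and predict are hypothetical helper names; real clinical analyses usually involve more variables and more careful model checking.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(x, slope, intercept):
    """Estimate y for a known x from the fitted line."""
    return intercept + slope * x

# Hypothetical data: dose level and measured response for five subjects.
dose = [1, 2, 3, 4, 5]
response = [2.1, 3.9, 6.2, 7.8, 10.0]

slope, intercept = fit_line(dose, response)
print(slope, intercept)
print(predict(6, slope, intercept))
```

With these made-up numbers the fitted slope is about 1.97, meaning each one-unit increase in dose is associated with a response roughly 1.97 units higher. The prediction for a dose of 6 extends beyond the observed range, which is exactly the kind of estimate that should be treated with caution.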

One important caution with regression analysis is to avoid confusing correlation with causation. Correlation means that a change in one variable is simply linked with a change in another. For example, glove sales and snow shovel sales tend to increase together, so we can say the two are correlated. Causation means that a change in one variable caused the change in another. In the example, the increase in glove sales didn’t cause the increase in snow shovel sales, so no causal relationship exists between the two. Falling temperatures, however, might have a causal relationship with both.
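
The glove and snow shovel example can be sketched numerically. In the Python snippet below, all figures are invented: both sales series are constructed to depend on temperature plus unrelated noise, and the helper names pearson and residuals are hypothetical. The sketch compares the raw correlation between the two sales series with the correlation that remains once the part explained by temperature is removed from each.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def residuals(xs, ys):
    """What is left of ys after removing a straight-line dependence on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# Hypothetical weekly temperatures (degrees C) and unrelated noise terms.
temps = [-7, -5, -3, -1, 1, 3, 5, 7]
glove_noise = [2, -2, -2, 2, 2, -2, -2, 2]
shovel_noise = [2, -2, 2, -2, -2, 2, -2, 2]

# Both products sell more as the temperature falls; neither depends on the other.
gloves = [100 - 4 * t + e for t, e in zip(temps, glove_noise)]
shovels = [60 - 3 * t + e for t, e in zip(temps, shovel_noise)]

raw_r = pearson(gloves, shovels)
adjusted_r = pearson(residuals(temps, gloves), residuals(temps, shovels))
print(raw_r, adjusted_r)
```

The raw correlation is close to 1, yet it vanishes once temperature is accounted for, because the noise terms were deliberately chosen to be unrelated. Real data are never this tidy, and adjusting for one known factor does not by itself establish or rule out causation.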

Parametric data modeling allows inferences to be made about larger populations based on assumptions about the “shape” of the study data, such as a bell curve. Non-parametric data modeling, on the other hand, does not assume a shape for the data but instead allows the data to estimate the model shape. Parametric modeling is often preferred for its straightforward assumptions, but the more complex non-parametric modeling avoids the risk of selecting a model that does not accurately reflect population data. 
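
The trade-off can be illustrated with a small Python sketch. It asks the same question of one data set twice: once assuming a bell curve (parametric) and once using only the observed values (non-parametric). The recovery times are invented and deliberately skewed, and the variable names are assumptions made for this example.

```python
from statistics import NormalDist

# Hypothetical, skewed data: days until a skin reaction resolved.
days = [1, 1, 2, 2, 2, 3, 3, 4, 5, 17]

# Parametric approach: assume the data follow a normal (bell-shaped)
# distribution and estimate its mean and standard deviation from the sample.
model = NormalDist.from_samples(days)

# Non-parametric approach: use the observed proportions directly.
def empirical_share_below(values, threshold):
    """Fraction of observed values strictly below the threshold."""
    return sum(v < threshold for v in values) / len(values)

# Question: what share of the population resolves in fewer than zero days?
parametric_share = model.cdf(0)
empirical_share = empirical_share_below(days, 0)
print(parametric_share, empirical_share)
```

Because one long recovery time inflates the standard deviation, the bell-curve model puts roughly a fifth of the population below zero days, which is impossible, while the observed data put no one there. The parametric model is simpler to state and use, but here its assumed shape does not match the data.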

Start with statistics

Statistical analysis can seem abstract, but its practical consequences have an enormous influence on the value and reliability of clinical studies.

When considering your next study, be sure to work with an established professional testing team that actively involves a qualified clinical statistician in the planning process. When you work with a professional testing team, such as Consumer Product Testing℠ Company, you’ll be rewarded with a more efficient study that provides more compelling data than if statistical analysis had been merely a final step in the study process.