Basics Of Statistics


In the 21st century, professionals often claim that the world’s most valuable resource is ‘DATA’. However valid that claim may be, data is useless unless it is processed and managed in a meaningful way. To produce insightful conclusions, data and analytics must work together.

So, in an age where data is king, statistics has become an essential tool for making sense of the vast amounts of information we generate every day.

What is ‘Data’?

First, let’s define what data is. Different individuals may interpret data differently; this is how I interpret it. Let’s consider ‘12’. We all know 12 is a number, but on its own this ‘12’ carries no meaning or message. It is just a raw number. It could be an age, a weight, a month or even a name. But if we say, ‘the weight of a 2-year-old child is 12’, then it becomes DATA: a number with context.

What is Statistics?

Simply put, statistics is about DATA. More comprehensively, statistics is a branch of mathematics that deals with collecting, organizing, summarizing, analyzing, and interpreting data. The goal of using statistics is to gain an understanding of data by applying various techniques.

It plays a critical role in fields such as science, medicine, business, economics, and the social sciences. For example, it helps researchers evaluate the effectiveness of interventions and treatments.

Statistics is also used to test hypotheses and draw conclusions about future outcomes based on past data. For example, analysts use statistics to predict election results or forecast the weather.

Overall, statistics is a powerful tool for understanding data, making informed decisions, and detecting patterns that may not be visible to the naked eye.

In statistics, there are two technical terms that we come across all the time:

  • Individuals

Individuals are the objects described in a set of data. These objects could be people, animals, trees, or things.

  • Variables 

Variables are the characteristics of the given individuals. We can further divide variables into two types depending on the kind of values they take.

  • Qualitative variables, also known as categorical variables, place individuals into distinct groups based on certain attributes. Gender (male/female), eye color (black/brown/blue), hair color (black/brunette/blonde), and age group are some examples of categorical variables.
  • Quantitative variables, also known as numerical variables, take numerical values on which arithmetic operations can be performed. They can be further divided into two groups: discrete and continuous. Discrete variables take only countable values, typically integers, such as the number of family members or the number of meals per day. Continuous variables can take any value within a given range, such as weight or height.
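
To make the distinction concrete, here is a minimal Python sketch. The records and field names (eye_color, family_members, height_cm) are made up for illustration and are not taken from any real dataset. Each dictionary is one individual and each key is a variable.

```python
from collections import Counter
from statistics import mean

# Hypothetical records: each dict is one individual, each key is a variable.
individuals = [
    {"eye_color": "brown", "family_members": 4, "height_cm": 162.5},
    {"eye_color": "blue", "family_members": 3, "height_cm": 175.0},
    {"eye_color": "brown", "family_members": 5, "height_cm": 168.2},
]

# Qualitative (categorical): the values are labels, so we count them
# rather than do arithmetic on them.
eye_color_counts = Counter(person["eye_color"] for person in individuals)

# Quantitative, discrete: whole-number counts that can be added up.
total_family_members = sum(person["family_members"] for person in individuals)

# Quantitative, continuous: any value within a range, so averaging makes sense.
average_height = mean(person["height_cm"] for person in individuals)

print(eye_color_counts)
print(total_family_members)
print(round(average_height, 1))
```

Counting is the natural summary for a categorical variable, while sums and averages only make sense for quantitative ones.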

Statistical Analysis

Now let’s explore the statistical techniques used to organize and analyze data. Statistical analysis helps us gain a better understanding of complex data and make evidence-based decisions.

Researchers use various techniques such as hypothesis testing, probability theory, and statistical inference to extract meaningful information from data. All of these techniques can be grouped into two main methods:

  1. Descriptive method
  2. Inferential method

Descriptive statistics

Descriptive statistics describe the features of a specific dataset by summarizing and measuring the data. The data usually comes from a sample drawn from the population.

The most recognized types of descriptive analysis are the measures of center (mode, median, and mean) and the measures of spread (range and variance).
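
As a small illustration, the measures named above can be computed with Python’s built-in statistics module. The weights below are hypothetical sample values chosen only for the example.

```python
import statistics

# Hypothetical sample: weights (in kg) of seven children.
weights_kg = [12, 11, 13, 12, 14, 10, 12]

# Measures of center
sample_mean = statistics.mean(weights_kg)      # arithmetic average
sample_median = statistics.median(weights_kg)  # middle value once sorted
sample_mode = statistics.mode(weights_kg)      # most frequent value

# Measures of spread
sample_range = max(weights_kg) - min(weights_kg)
# statistics.variance computes the sample variance (divides by n - 1).
sample_variance = statistics.variance(weights_kg)

print(sample_mean, sample_median, sample_mode)
print(sample_range, round(sample_variance, 2))
```

Note that statistics.variance treats the data as a sample; statistics.pvariance would be the choice if the list were the entire population.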

Descriptive statistics can be presented using various graphical representations such as bar charts, pie charts, histograms, and scatterplots. These graphics allow analysts to present the data and its patterns to stakeholders in a concise way.

The conclusions drawn from this analysis apply only to the sample considered, and there is a high risk of error when we try to generalize them to the population. However, descriptive statistics is an important tool for summarizing data and is often the first step in conducting more advanced statistical analysis.

Inferential statistics

Inferential statistics is used to draw conclusions and make predictions about a larger population based on a limited sample of data. The most recognized inferential techniques are hypothesis testing, confidence intervals, and regression analysis.

In hypothesis testing, we consider a pair of hypotheses, the null hypothesis and the alternative hypothesis, and determine whether the data provide enough evidence to reject the null hypothesis in favor of the alternative. We use this technique when we want to assess a claim about a population. When we need to estimate a population parameter instead, we use confidence intervals: based on a sample of data, we estimate a range of values within which the population parameter is likely to fall.

The conclusions drawn from this analysis are meant to apply to the entire population. However, it is important to be aware of the assumptions and limitations of these techniques, as well as the errors that can occur during the process.

We cannot say outright which method is more accurate or convenient. In most situations, it is advisable to use both together in order to reach more precise and reliable conclusions.
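
The two ideas can be sketched in a few lines of Python. This is a simplified illustration, not a full treatment: the sample values and the helper names (mean_confidence_interval, two_sided_p_value) are hypothetical, and the code uses a normal approximation. For small samples a t-distribution is usually preferred, which the standard library does not provide.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Approximate confidence interval for the population mean (normal approximation)."""
    standard_error = stdev(sample) / sqrt(len(sample))
    # z is about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    center = mean(sample)
    return center - z * standard_error, center + z * standard_error

def two_sided_p_value(sample, null_mean):
    """Approximate two-sided p-value for the null hypothesis 'population mean == null_mean'."""
    standard_error = stdev(sample) / sqrt(len(sample))
    z = (mean(sample) - null_mean) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical sample: weights (in kg) of ten 2-year-old children.
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 12.2, 11.7, 12.5, 12.1]

low, high = mean_confidence_interval(sample)
print(f"95% confidence interval for the mean: ({low:.2f}, {high:.2f})")

# Null hypothesis: the population mean weight is 13 kg.
p_value = two_sided_p_value(sample, null_mean=13.0)
print("Reject the null at the 5% level" if p_value < 0.05 else "Fail to reject the null")
```

A small p-value means the sample would be unlikely if the null hypothesis were true, so we reject it; otherwise we fail to reject it, which is not the same as proving it.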

Statistics can be a valuable skill to have if you are a researcher, data scientist, or analyst, or simply someone interested in understanding the world around you. So, keep learning, keep exploring and keep asking questions. With statistics on your side, there is no limit to what you can discover!

