Friday, November 22, 2024

Predicting My Students' SGPA: From OLS to Machine Learning

Summary: This article highlights the importance of machine learning algorithms and traditional econometrics models. Using a classic classroom example, this article suggests that a student of economics should use both tools in economic modeling.  

While teaching econometrics, the fundamental challenge we face is choosing the right example or dataset, one that explains the method and, ideally, also shows students why being present in class matters. Most of the time, I end up using the classic example of grade points (GPA or SGPA) and how they are affected by attendance, IQ, internal marks, and so on. Both the example and the method I use to show my students that these factors are crucial for a good grade have remained the same for the past few years. The mighty ordinary least squares (OLS) regression always does the trick and shows that students get lower grades if they perform poorly in the internal exams or have low attendance. However, I have always questioned whether OLS is the best model. In most cases, the students are in their first year, so I cannot teach them non-linear equations, time-varying state-space models, or any other fancy model that may fit the data perfectly.

Figure 1: SGPA and Average Internal Marks of the students of the Department of Economics


An OLS model seems perfect for the data presented in Figure 1. However, the data show higher dispersion in specific ranges of internal marks, such as 60 to 70 or 87 to 94, which is a classic sign of heteroskedasticity. One could label these data points outliers and remove them, but then the students would question my intentions. So neither removing data points nor applying a complex model is an option.
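One informal way to check for uneven dispersion is to compare the variance of the regression residuals across bands of internal marks. The sketch below is only an illustration: the function name, the bands, and the residual values are hypothetical and are not taken from the department's data.

```python
from statistics import pvariance

def residual_variance_by_band(marks, residuals, bands):
    """Return {(low, high): variance of residuals for marks in [low, high)}."""
    result = {}
    for low, high in bands:
        in_band = [r for m, r in zip(marks, residuals) if low <= m < high]
        # A variance needs at least two observations; otherwise report None.
        result[(low, high)] = pvariance(in_band) if len(in_band) > 1 else None
    return result

# Hypothetical marks and OLS residuals, for illustration only.
toy_marks = [61, 64, 68, 72, 75, 78, 88, 90, 93]
toy_residuals = [1.2, -1.5, 0.9, 0.2, -0.1, 0.1, 1.0, -1.3, 0.8]
toy_bands = [(60, 70), (70, 80), (87, 95)]

for band, variance in residual_variance_by_band(toy_marks, toy_residuals, toy_bands).items():
    print(band, variance)
```

If the variances differ sharply between bands, the constant-variance assumption behind the usual OLS standard errors is doubtful. A formal test would be the next step.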

If a student with an average internal mark of 65 approaches me and wants to know their predicted SGPA, I will use OLS to show that, based on the regression result in Table 1, the student can expect an SGPA of 4.9, with a mean squared error (MSE) of 0.93 and R² = 0.81. However, as mentioned above, students in this range show higher variation, which means my prediction may be misleading.

Table 1: Simple OLS result of SGPA on Average Internal Marks

Figure 2: OLS prediction of SGPA for the student with 65 average internal marks
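The OLS prediction described above can be sketched in a few lines of Python. This is a minimal, standard-library-only illustration and not the code behind Table 1. The function names and the toy data are assumptions, so the printed numbers will not match the 4.9, 0.93, or 0.81 reported in the text.

```python
from statistics import mean

def fit_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def mse_and_r2(x, y, intercept, slope):
    """In-sample mean squared error and R-squared of the fitted line."""
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    ss_res = sum(r ** 2 for r in residuals)
    y_bar = mean(y)
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return ss_res / len(y), 1 - ss_res / ss_tot

# Hypothetical toy data, NOT the department's records.
marks = [50, 60, 65, 70, 80, 90]
sgpa = [3.0, 4.5, 5.5, 5.0, 7.0, 8.5]

intercept, slope = fit_ols(marks, sgpa)
mse, r2 = mse_and_r2(marks, sgpa, intercept, slope)
prediction_at_65 = intercept + slope * 65
print(f"SGPA = {intercept:.2f} + {slope:.3f} * marks")
print(f"Predicted SGPA at 65 marks: {prediction_at_65:.2f} (MSE={mse:.2f}, R2={r2:.2f})")
```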


In the era of data analytics and machine learning, I should arguably use machine learning techniques to predict my students' SGPAs. One of the basic methods is the K-Nearest Neighbors (KNN) algorithm. The idea is that we can predict the value for a data point by looking at its nearest neighbors. I held out 20% of the data for testing and used K = 8 nearest neighbors to predict the SGPA of the student with 65 average internal marks. The prediction changed to 5.23, with a mean squared error of 2.0 and R² = 0.49, as depicted in Figure 3. I changed the value of K many times, and the prediction remained above the OLS prediction.

Figure 3: K-Nearest Neighbors prediction of SGPA for a student with 65 average internal marks
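The KNN step can be sketched the same way: hold out 20% of the observations, then predict each SGPA as the average SGPA of the K = 8 training students with the closest internal marks. The code below is a standard-library illustration under stated assumptions, not the code used for Figure 3. The helper names and the synthetic data are made up for this example.

```python
import random

def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points whose x is closest to query."""
    nearest = sorted(zip(train_x, train_y), key=lambda pair: abs(pair[0] - query))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def split_train_test(x, y, test_share=0.2, seed=0):
    """Shuffle the row indices and hold out test_share of the rows for testing."""
    indices = list(range(len(x)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, round(test_share * len(x)))
    test, train = indices[:n_test], indices[n_test:]
    return ([x[i] for i in train], [y[i] for i in train],
            [x[i] for i in test], [y[i] for i in test])

# Synthetic data for illustration only, NOT the department's records.
rng = random.Random(42)
marks = [rng.uniform(40, 95) for _ in range(50)]
sgpa = [0.1 * m - 1.5 + rng.gauss(0, 0.8) for m in marks]

train_x, train_y, test_x, test_y = split_train_test(marks, sgpa, test_share=0.2)
k = 8
test_predictions = [knn_predict(train_x, train_y, m, k) for m in test_x]
test_mse = sum((p - t) ** 2 for p, t in zip(test_predictions, test_y)) / len(test_y)
print(f"Predicted SGPA at 65 marks (K={k}): {knn_predict(train_x, train_y, 65, k):.2f}")
print(f"Test MSE: {test_mse:.2f}")
```

Because the prediction is a local average, it depends only on the students whose marks are near 65. This is why it can differ from the OLS line, which is fitted to the whole sample.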


The data's clustering behavior may still lead to wrong predictions. So, I used the decision tree algorithm, which is more appropriate when neighboring clusters display different patterns or the data has a more complex pattern. Using a basic decision tree algorithm, I predicted that the student with an average of 65 internal marks might get an SGPA of 6.28, with a mean squared error of 2.03 and R² = 0.41, which is well above the OLS prediction (Figure 4).

Figure 4: Decision tree prediction of SGPA for a student with 65 average internal marks
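For readers who want to see what a basic regression tree does, here is a small standard-library sketch. It repeatedly splits the marks at the threshold that minimizes the sum of squared errors and predicts the mean SGPA of the resulting leaf. It is not the code behind Figure 4. The function names, the depth and leaf-size settings, and the toy data are all assumptions.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(points, depth=0, max_depth=3, min_leaf=2):
    """Grow a regression tree on (x, y) pairs; returns a nested dict."""
    ys = [y for _, y in points]
    leaf = {"value": sum(ys) / len(ys)}
    if depth >= max_depth or len(points) < 2 * min_leaf:
        return leaf
    points = sorted(points)
    best = None
    for i in range(min_leaf, len(points) - min_leaf + 1):
        if points[i - 1][0] == points[i][0]:
            continue  # cannot split between identical x values
        left, right = points[:i], points[i:]
        cost = sse([y for _, y in left]) + sse([y for _, y in right])
        if best is None or cost < best[0]:
            best = (cost, (points[i - 1][0] + points[i][0]) / 2, left, right)
    if best is None:
        return leaf
    _, threshold, left, right = best
    return {"threshold": threshold,
            "left": build_tree(left, depth + 1, max_depth, min_leaf),
            "right": build_tree(right, depth + 1, max_depth, min_leaf)}

def tree_predict(tree, x):
    """Walk down the tree until a leaf is reached and return its mean."""
    while "threshold" in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["value"]

# Hypothetical toy data, NOT the department's records.
marks = [45, 52, 58, 62, 65, 68, 74, 81, 88, 93]
sgpa = [2.5, 3.5, 4.0, 6.5, 4.2, 6.0, 6.2, 7.0, 8.8, 7.5]
tree = build_tree(list(zip(marks, sgpa)), max_depth=2, min_leaf=2)
print(f"Predicted SGPA at 65 marks: {tree_predict(tree, 65):.2f}")
```

The prediction is a step function of the marks, so a student near a split point can land in a leaf whose mean is far from the OLS line.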

All three models have strengths and weaknesses; no one can claim that one model is better in all situations. As the literature points out, there is always a trade-off between bias and variance. So the investigator should be careful when using these models to forecast or predict a variable. Although machine learning algorithms are popular, OLS is a powerful and simple technique with a solid theoretical background. The overall relationship between internal marks and SGPA, or between attendance and SGPA, is positive and significant, as estimated by OLS. And remember, under all the assumptions of the classical linear regression model, OLS is still BLUE (the Best Linear Unbiased Estimator).

Please note: Don't take this post seriously. Econometrics is just for fun. (All the Python code used here is available from open sources.)


By

Dr. Akash Kumar Baikar

Assistant Professor, Department of Economics, SBSS, MRIIRS

Thursday, November 7, 2024

The Ladder of Happiness Through Macroeconomic Fundamentals or Social Expenditure: A BRICS Perspective

 Summary:

In current geopolitics, the BRICS nations can set new standards for future economic development. This article explores how these countries should progress in their future development paths. Keeping happiness as an objective, this article suggests that social expenditures are much more critical than macroeconomic fundamentals.  

Figure 1: The Ladder of Happiness of BRICS nations from 2014 to 2023

Source: World Bank Open Data. The size of each bubble represents the level of the Cantril Ladder Index of that country in that year.



Introduction

The Cantril Ladder, a simple yet effective tool for gauging subjective well-being, poses a straightforward question: "Imagine a ladder with the best possible life at the top step (10) and the worst possible life at the bottom step (0). On which step do you feel you personally stand at this time?"[1] By quantifying subjective experiences, this scale offers valuable insights into individual life satisfaction. While individual well-being is a complex interplay of personal circumstances and psychological factors, macro-level variables can significantly influence happiness. This study delves into the relationship between these broader economic factors and Cantril Ladder scores, focusing specifically on the BRICS nations as they represent a significant portion of the global economy. Understanding the factors that influence the happiness of their citizens is crucial for policymakers and researchers alike.

Our panel data analysis indicates that three key variables—GDP per capita, expenditure on health and education, and the unemployment rate—significantly influence Cantril Ladder scores within the BRICS nations. The following sections delve into the specific impact of each variable. 

Panel data analysis of BRICS countries

Table 1: Random effect regression result of the Cantril Ladder index on macroeconomic and social expenditure variables

 

| Variable | Model-1 | Model-2 |
| CPI | -0.002 (0.003) | 0.0032 (0.002) |
| Log of GDP Per Capita | 0.471** (0.189) | 1.262*** (0.09) |
| Unemployment | -0.058*** (0.01) | -0.058*** (0.006) |
| Health Expenditure (% of GDP) | 0.332*** (0.05) | |
| Expenditure on Education (% of GDP) | | 0.629*** (0.057) |
| Intercept | -0.083 (1.79) | -8.61*** (0.996) |

***, **, and * denote significance at the 1%, 5%, and 10% levels, respectively. Standard errors are in parentheses.
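The specification implied by Table 1 can be written out as follows. This is inferred from the table's rows rather than stated by the author, and the symbols are assumptions: i indexes countries, t indexes years, S is health expenditure in Model-1 and education expenditure in Model-2, and u_i is the country-specific random effect.

```latex
\[
\text{Ladder}_{it} = \beta_0 + \beta_1\,\text{CPI}_{it} + \beta_2 \ln(\text{GDPpc}_{it})
  + \beta_3\,\text{Unemp}_{it} + \beta_4\,S_{it} + u_i + \varepsilon_{it}
\]
```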

As a measure of average income, GDP per capita has a direct and significant impact on citizens' happiness. As a country's average income rises, so does the general well-being of its people. Our analysis shows that the coefficient on the log of GDP per capita is positive and statistically significant in both models (0.471 in Model-1 and 1.262 in Model-2), so a 1% rise in GDP per capita is associated with a higher Cantril Ladder score. This suggests that income levels are important to a country's happiness.
However, the relationship between income and happiness is not always linear. Studies have shown that while increased income can boost happiness to a certain point, additional wealth may not significantly increase well-being beyond a certain threshold.
Unemployment, a scourge of modern economies, can cast a long shadow over individual well-being. Job loss can lead to a host of negative consequences, including financial insecurity, increased stress, and a diminished sense of purpose. These factors can significantly affect people's perceptions of their lives and, consequently, their Cantril Ladder scores. Our analysis shows that a one-percentage-point increase in the unemployment rate is associated with a statistically significant 0.058 decrease in the Cantril Ladder score. This finding underscores the detrimental impact of unemployment on subjective well-being. Moreover, high unemployment rates can have broader societal implications, such as increased social unrest and political instability, further eroding people's sense of security and overall well-being.
Government investment in education and healthcare is a cornerstone of societal progress. By prioritizing these sectors, nations can foster long-term economic growth, enhance social development, and improve the overall well-being of their citizens. Our results show that spending on education and healthcare is associated with significantly higher happiness. A one-percentage-point increase in health expenditure as a share of GDP is linked to a 0.332 increase in the happiness index (Model-1), while a one-percentage-point increase in education expenditure as a share of GDP is associated with a 0.629 increase (Model-2).
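To make the magnitudes concrete, the short sketch below turns the Model-1 coefficients from Table 1 into changes in the ladder score. It assumes that the log of GDP per capita is a natural log and that unemployment and health expenditure enter in percentage points. Neither is stated explicitly in the text, and the function names are only illustrative.

```python
import math

# Coefficients copied from Table 1 (Model-1).
COEF_LOG_GDP = 0.471
COEF_UNEMPLOYMENT = -0.058
COEF_HEALTH = 0.332

def ladder_change_from_gdp(pct_increase, coef=COEF_LOG_GDP):
    """Ladder-point change when GDP per capita rises by pct_increase percent.

    Assumes a natural-log regressor: change = coef * ln(1 + pct/100).
    """
    return coef * math.log(1 + pct_increase / 100)

def ladder_change_linear(point_change, coef):
    """Ladder-point change for a regressor entered in percentage points."""
    return coef * point_change

print(f"1% higher GDP per capita:  {ladder_change_from_gdp(1):+.4f} ladder points")
print(f"+1 pp unemployment:        {ladder_change_linear(1, COEF_UNEMPLOYMENT):+.3f} ladder points")
print(f"+1 pp health spending/GDP: {ladder_change_linear(1, COEF_HEALTH):+.3f} ladder points")
```

Under these assumptions, a 1% rise in GDP per capita moves the ladder score by roughly 0.0047 points. A one-percentage-point change in health spending moves it by 0.332 points, which is the comparison behind the article's emphasis on social expenditure.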



The BRICS: A comparative analysis

China's sustained economic growth and significant investments in education have contributed to a relatively stable Cantril Ladder score (Figure 1). In contrast, Brazil and Russia have experienced fluctuations in their scores, influenced by economic volatility and political uncertainty. India, despite rapid economic growth, faces challenges related to unemployment and inequality, which can affect its citizens' well-being.
In conclusion, while macroeconomic variables play a significant role in shaping individuals' perceptions of their lives, a holistic approach is necessary to understand the complex interplay between economic conditions and subjective well-being. Policymakers should consider not only economic growth but also social and environmental factors to promote sustainable and equitable development.

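R code for the plot (packages used: dplyr, ggplot2, gganimate)

library(dplyr)
library(ggplot2)
library(gganimate)

# `data` is assumed to be a data frame with one row per country and year,
# with columns Country, Year (numeric), GDPPerCapita, and HappinessIndex.

# Static bubble chart: bubble size tracks the Cantril Ladder score.
p <- ggplot(data, aes(x = GDPPerCapita, y = HappinessIndex, size = HappinessIndex, color = Country)) +
  geom_point(alpha = 0.7) +
  geom_text(aes(label = Country), vjust = 1.5, hjust = 1.5, size = 3) +
  scale_size(range = c(2, 12)) +
  theme_minimal() +
  labs(
    title = 'BRICS Countries: Happiness Index vs GDP Per Capita',
    x = 'GDP Per Capita',
    y = 'Happiness Index',
    size = 'Happiness Index'
  ) +
  theme(
    plot.title = element_text(hjust = 0.5)
  )

# Animate the chart over the years; the subtitle shows the current frame's year.
animated_plot <- p + transition_time(Year) +
  labs(subtitle = 'Year: {frame_time}')

# Render the animation.
animate(animated_plot)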
References:

[1] OECD Guidelines on Measuring Subjective Well-being





By
Tisha Virmani
M.A. Economics (2024-26), Department of Economics, SBSS, MRIIRS, Faridabad