1 Introduction

The first case of COVID-19 in Malaysia was reported on 25th January 2020, involving three Chinese nationals who entered Malaysia via Johor from Singapore [1]. The number of cases continued to rise thereafter, and the Malaysian government declared the first Movement Control Order (MCO) on 18th March 2020, which ended after 2 months [1, 2]. The surge in cases was largely attributed to large religious events, an outbreak among the incarcerated population in Sabah, and a state election in Sabah [3]. Malaysia continued to implement various measures to control the spread of COVID-19, including targeted lockdowns, restrictions on travel and gatherings, and non-pharmaceutical interventions [3].

The year 2021 also saw the arrival of COVID-19 variants on Malaysia’s shores [4]. Cases rose in 2021 and the incidence of COVID-19 was comparatively higher in the central region of Peninsular Malaysia [3]. COVID-19 cases continued to surge, aided by the relaxation of COVID-19 protocols and border restrictions. The relaxation of protocols allowed more economic activities to resume and increased the capacity limit for public gatherings, such as for festive occasions [3].

The severity of the pandemic in Malaysia continued to escalate during the second half of 2021, as the Delta variant became the dominant strain in July 2021 and prevailed for several months thereafter [4]. The Delta variant was deadlier and more transmissible than earlier variants of the virus and was responsible for increases in hospitalisations and deaths due to COVID-19 in many countries [5,6,7]. In Malaysia, COVID-19 pandemic management was under duress, and hospitals and intensive care units were operating above capacity. In August 2021, Malaysia surpassed India’s per-capita COVID-19 death toll [8].

The surge in COVID-19 cases and fatalities throughout Malaysia was paralleled by the rapid mass vaccination of the population [9]. Malaysia started its COVID-19 vaccination programme in February 2021, prioritising front-line workers and older adults with comorbidities [10]. As Malaysia ramped up the mass vaccination rollout, it recorded one of the fastest daily vaccination rates per capita in the world by July 2021 [9]. Thereafter, a significant decrease in COVID-19 admissions, critical hospitalisations, and fatalities was apparent nationwide [11].

In December 2021, the Omicron variant of concern arrived in the country, facilitated by pilgrims returning from Saudi Arabia [4]. Despite public health measures, such as compulsory quarantine and mandatory COVID-19 testing for travellers, the spread of the Omicron variant into communities was inevitable. However, while the number of daily COVID-19 cases increased amid the spread of the Omicron variant, severe cases remained low and the pandemic came under control within the country [11]. As of early March 2022, 98.7% of the country’s adult population had received two doses of COVID-19 vaccine, whereas 64% had received their booster shots [11]. Malaysia subsequently began transitioning to the endemic phase of COVID-19 on 1st April 2022.

Differences in mortality between groups of people are an important proxy indicator of relative risk of death [12]. The case fatality rate (CFR) indicates the severity of a disease and helps identify at-risk populations. Although COVID-19 could infect the young and old at a similar rate, it may cause more severe morbidity and mortality among older adults [13]. Males may also have a higher risk of mortality from COVID-19 compared to females. Differences in the age structure of the affected population may help characterise the true burden of COVID-19 in terms of morbidity and mortality, and guide policy decisions regarding the allocation of medical resources to those at higher risk of severe illness or death from COVID-19 [12, 14]. It is important to identify inequities that exist within populations to form a comprehensive pandemic response.

In this study, we adopt an exploratory multivariate approach to investigate the dimensionality of the COVID-19 fatality within Malaysia. Specifically, we aim to provide novel insights into the sex and age differentials in the risk of mortality due to the disease, beginning from when the disease first landed in Malaysia in 2020, until the beginning of the transition to the endemic COVID-19 phase in 2022. A comparative analysis of differentials in CFR across different points in time will provide a more comprehensive understanding of the pandemic’s impact. Analysing past CFR data helps in evaluating the effectiveness of various policies implemented during the pandemic and can guide policymakers in refining strategies and developing targeted interventions for specific demographics or regions. Hence, this knowledge is valuable for shaping public health policies, improving preparedness, and mitigating the impact of future health crises [15]. Even in an endemic stage, ongoing analysis and reflection on past experiences remain crucial for informed decision-making and effective public health management [16].

Malaysia provides a unique case study, serving as a laboratory for analysing variations across different states, each with its distinct geographic conditions. Notable events unfolded in Malaysia throughout the pandemic. The year 2020 saw the arrival of the virus and strict measures enforced by the Malaysian government. The first half of 2021 saw the arrival of COVID-19 variants on Malaysia’s shores [4], in contrast with a relaxation of COVID-19 protocols and border restrictions in the country. The second half of 2021 saw a devastating rise in hospitalisations and deaths due to the deadly Delta variant [4, 8], although vaccinations of the population were ramped up. The year 2022 marked the Omicron variant phase. Although the number of daily COVID-19 cases increased, severe cases remained low, and almost all of the country’s adult population had received two doses of COVID-19 vaccine [11].

2 Methodology

In this section, we describe the dataset of the study and the statistical analysis performed using the principal component analysis technique.

2.1 Dataset

This study used open data on the number of COVID-19 cases and deaths by state/federal territories in Malaysia, captured from 25th January 2020 until 24th April 2022, made publicly available by the Ministry of Health (MOH) Malaysia [17]. Healthcare facilities adjudicate all deaths as either death due to COVID-19, or death with COVID-19 based on a set of criteria defined by MOH [18]. However, only the former contributed to COVID-19 mortality statistics. The data was filtered for Malaysians only and comprised 13 states (Perlis, Kedah, Kelantan, Perak, Pahang, Penang, Melaka, Terengganu, Johor, Negeri Sembilan, Selangor, Sabah and Sarawak) and 3 federal territories (Kuala Lumpur, Labuan and Putrajaya) in Malaysia. The list of states/federal territories and the respective abbreviations can be found in Table S1.

Case fatality rate/ratio (CFR) is the proportion of individuals diagnosed with a disease who die from that disease [12]. The CFR was computed for each state/federal territory, according to sex, for four different periods: (1) January–December 2020; (2) January–June 2021; (3) July–December 2021; (4) January–April 2022. Additionally, CFR by sex was also computed according to age groups: (a) those between ages 0–17 years were coded as ‘Child’; (b) those between 18 and 59 years were coded as ‘Adult’; and (c) those aged 60 years and above as ‘Senior’. Hence, six values of CFR were computed for each state/federal territory within these four periods for further analysis, i.e., CFR for males among different age groups (M_Child; M_Adult; M_Senior) and similarly for females (F_Child; F_Adult; F_Senior).
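As a rough illustration of the computation described above, the following Python sketch derives CFR values by sex and age group for a single state and period. The record layout, the function names `age_group` and `cfr_by_sex_and_age`, the example records, and the use of Python are all assumptions made for illustration; they are not taken from the MOH dataset or from the authors' own code.

```python
from collections import defaultdict


def age_group(age):
    """Map an age in years to the three groups used in the study."""
    if age <= 17:
        return "Child"
    if age <= 59:
        return "Adult"
    return "Senior"


def cfr_by_sex_and_age(records):
    """Return CFR (%) per sex-age group from (sex, age, died) records."""
    cases = defaultdict(int)
    deaths = defaultdict(int)
    for sex, age, died in records:
        key = f"{sex}_{age_group(age)}"
        cases[key] += 1
        if died:
            deaths[key] += 1
    # CFR = deaths / confirmed cases, expressed as a percentage
    return {key: 100.0 * deaths[key] / cases[key] for key in cases}


# Hypothetical confirmed cases for one state in one period
example_records = [
    ("M", 72, True),
    ("M", 65, False),
    ("M", 40, False),
    ("F", 81, False),
    ("F", 30, False),
    ("F", 8, False),
]
example_cfr = cfr_by_sex_and_age(example_records)
```

In this sketch a group with no confirmed cases is simply absent from the result, since its CFR is undefined; how such groups were handled in the actual analysis is not stated in the text.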

2.2 Data analysis

Subsequently, the dataset comprising these six values of CFR for each state/federal territory during periods 1–4 was used in Principal Component Analysis (PCA) to examine underlying patterns within the data. PCA is a robust multivariate technique that reduces the dimensionality of a dataset and improves interpretability while preserving the most important information or patterns within the data [19, 20]. It does so by deriving new variables called principal components, which are linear combinations of the original variables arranged in order of their ability to explain the variance in the data. The first principal component captures the maximum amount of variance, followed by the second, and so on [19]. This technique is based on the decomposition of the original data matrix into the scores and loadings matrices: scores classify the samples, whereas loadings (also referred to as the weight of each variable) classify the variables [19]. More details about this multivariate technique can be found elsewhere [19, 20].
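The decomposition described above can be written compactly as follows. The notation is an assumption introduced here for illustration and does not appear in the original text: $X$ is the standardized data matrix with one row per state and period and $p = 6$ CFR variables, $T$ is the scores matrix, $P$ is the loadings matrix with orthonormal columns, and $\lambda_k$ is the variance of the $k$-th column of $T$.

```latex
X = T P^{\top}, \qquad
t_{ik} = \sum_{j=1}^{p} x_{ij}\, p_{jk}, \qquad
\mathrm{PVE}_k = \frac{\lambda_k}{\sum_{m=1}^{p} \lambda_m},
\qquad \lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p
```

Here $\mathrm{PVE}_k$ denotes the proportion of variance explained by the $k$-th principal component, which is the quantity by which the components are ordered.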

We first presented the cumulative confirmed cases and deaths, and the respective CFR due to COVID-19, for each state/federal territory by sex over the four periods. We then presented CFR by state, sex and age group. Because the CFR values were skewed, the data was shifted and log-transformed. PCA was performed using the “prcomp” function of the R statistical software, after standardizing each variable to have a mean of zero and a standard deviation of one. The principal component score vectors were extracted from the results and visualised through 2-dimensional plots using the “ggplot2” R package [21].
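The preprocessing steps above can be sketched as follows. This is a minimal Python illustration rather than the authors' R workflow: the size of the shift is not stated in the text, so a shift of 1 is assumed here, and the function names `shift_log` and `standardize` and the example values are hypothetical. The sample standard deviation (n − 1 denominator) is used, matching the default scaling behaviour in R.

```python
import math
import statistics


def shift_log(values, shift=1.0):
    """Shift then log-transform; the shift keeps zero CFR values defined.

    The shift of 1.0 is an assumption; the text does not state the value used.
    """
    return [math.log(v + shift) for v in values]


def standardize(values):
    """Rescale to mean zero and sample standard deviation one."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # n - 1 denominator
    return [(v - mean) / sd for v in values]


# Illustrative CFR values (%) for one variable across several state-periods
example_cfr_values = [0.0, 1.5, 7.69, 33.33]
example_z = standardize(shift_log(example_cfr_values))
```

Each of the six CFR variables would be transformed in this way before the principal components are extracted, so that no single variable dominates purely because of its scale.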

3 Results

3.1 CFR in states with confirmed cases and deaths from COVID-19, by sex

In 2020, the highest CFR was observed among males in Perlis at 7.69%, followed by Putrajaya at 1.92% (Table 1). Aside from Perlis and Putrajaya, Sarawak recorded a comparatively higher CFR than the rest of the states, among both males and females at 1.75% and 1.76% respectively. In Kelantan, the CFR among females stood at 1.69% and was strikingly higher than that among males at 0.51%. Similarly, in Penang, the CFR among females was higher at 0.75% compared to 0.32% among males.

Table 1 CFR in states with confirmed cases and deaths from COVID-19, by sex (period: 2020)

By contrast, during the first half of 2021 (Table 2), the highest CFR was observed among females in Perlis at 2.03%, more than twice that of males in the state. The highest CFR among males was observed in Labuan at 1.84%. Aside from Perlis, the CFR in the other states was usually higher among males than among females.

Table 2 CFR in states with confirmed cases and deaths from COVID-19, by sex (period: 2021 January–June)

During the second half of 2021 (Table 3), the CFR among males was higher than that among females in all states/federal territories. The number of cases and deaths soared and more than doubled in most states and the federal territory of Kuala Lumpur. Johor recorded the highest CFR among males at 2.75%, followed by Sarawak at 2.69%. Similarly, Johor and Sarawak also recorded the highest CFR among females at 1.84% and 1.98% respectively. In comparison, the CFR among males and females in Labuan and Putrajaya dropped to around 0.2% and below.

Table 3 CFR in states with confirmed cases and deaths from COVID-19, by sex (period: 2021 July–December)

Between January and April 2022 (Table 4), the number of cases remained high, but the number of deaths was much lower compared to the previous period. The resulting CFR among males and females in all parts of the country dropped to below 1%.

Table 4 CFR in states with confirmed cases and deaths from COVID-19, by sex (period: 2022 January–April)

3.2 CFR in states by sex and age groups

The CFR among male and female seniors was much higher than that among their younger counterparts. It was also observed that the CFR among female seniors in Kelantan was higher than that of their male counterparts throughout the four periods. In 2020 (Table 5), the CFR was the highest among male seniors in Perlis at 33.33%, followed by Putrajaya at 21.43% and Sarawak at 12%. Among female seniors, Melaka recorded the highest CFR at 8%, which was almost two times higher than for male seniors in the state. Except for Melaka and Kelantan, all other states had a higher CFR among male seniors compared to their female counterparts.

Table 5 CFR in states by sex and age group (period: 2020)

During the first half of 2021 (Table 6), the CFR among both female and male seniors in Perlis was the highest in the country, at 11.11% and 6.9% respectively. However, during the second half of the year (Table 7), Sarawak recorded the highest CFR among both male and female seniors, at 10.40% and 8.98% respectively. During this period, the CFR among female seniors in Kedah, Kelantan and Melaka was higher than that of their male counterparts. The fourth period, between January and April 2022 (Table 8), showed a decline in the risk of death among the older population. The CFR among male and female seniors in Perlis was among the highest, at 3.42% and 2.19% respectively. In the other states, the CFR dropped to below 2%.

Table 6 CFR in states by sex and age group (period: 2021 January–June)
Table 7 CFR in states by sex and age group (period: 2021 July–December)
Table 8 CFR in states by sex and age group (period: 2022 January–April)

Overall, children were less severely affected by COVID-19 than adults. However, Sabah had a relatively higher CFR among the child population in comparison with the rest of the country from 2020 until 2021. When COVID-19 cases and deaths peaked during the second half of 2021, Sabah recorded the highest CFR among male children at 0.33%, whereas Melaka had the highest CFR among female children at 0.18%.

3.3 Proportion of variance explained by principal components

The first, second and third principal components explained around 43.4%, 20.1% and 12.4% of the variance respectively (Table 9). Taken together, the first three components captured around 76% of the variance in the sex- and age-specific CFR. Of the remaining variance, the fourth, fifth and sixth components captured around 11.6%, 9.3% and 3.2% respectively.
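The cumulative figure quoted above follows directly from the per-component percentages. The short Python sketch below reproduces the arithmetic using only the values reported in the text; the variable names are illustrative.

```python
from itertools import accumulate

# Proportion of variance explained (%) by PC1 to PC6, as reported in the text
pve = [43.4, 20.1, 12.4, 11.6, 9.3, 3.2]

# Running total after each component, rounded to one decimal place
cumulative = [round(total, 1) for total in accumulate(pve)]

# Share captured by the first three components (about 76%)
first_three = cumulative[2]
```

The six reported percentages sum to 100%, which is consistent with six principal components being extracted from six CFR variables.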

Table 9 Proportion of variance explained by principal components (PC)

3.4 Principal component loading vectors

The first loading vector placed rather large, positive weights on the CFR among female adults, female seniors, male adults and male seniors (Table 10). Hence, the first component corresponded largely to the severity of the COVID-19 surge among the adult and senior population in the country. Conversely, CFR among female children loaded highly on the second loading vector, followed by the CFR among male children, so the second component predominantly reflected the severity of the COVID-19 surge among the younger population. On the third loading vector, CFR among female seniors had a large positive weight, whereas the weight on the CFR among male adults was of almost similar magnitude but in the opposite direction.

Table 10 Principal component (PC) loading vectors

The principal components were indicative of the severity and risk of mortality due to COVID-19 among different subgroups of the Malaysian population. As the first three principal components accounted for a substantial amount of variance (76%) and were highly representative of the data, we have presented them in plots to further aid interpretation. Differences between the sixteen states/federal territories in Malaysia can be examined via two 2-dimensional plots of the three principal component score vectors, as shown in Figs. 1 and 2. Each state was displayed as coloured text using a short abbreviation labelled with the respective period (1–4), positioned at its principal component scores in the corresponding plot. For example, the state Sabah in period 1 was labelled as ‘SBH_1’ in the plots. State coordinates in the plots can be found in Table S2.

Fig. 1 Principal component 2 (PC2) vs principal component 1 (PC1). ‘SSS_#’, where ‘SSS’ represents a 3-character state abbreviation, and ‘#’ represents the period being assessed

Fig. 2 Principal component 3 (PC3) vs principal component 1 (PC1). ‘SSS_#’, where ‘SSS’ represents a 3-character state abbreviation, and ‘#’ represents the period being assessed

Large positive scores on the PC1 axis indicated higher CFR predominantly among the adult and senior population, whereas large positive scores on the PC2 axis indicated higher CFR mainly among the child population (Table 10). Figure 1 revealed that the COVID-19 pandemic in most states had generally increased in severity, especially among the adult and senior population, from 2020 (period 1) to 2021 (periods 2 and 3), but declined in severity in 2022 (period 4). The overall severity among these population groups in Perlis (PLS) and Sarawak (SWK) was relatively higher in comparison with other states in 2020 (period 1). However, higher severity of COVID-19 was evident in Johor (JHR), Sarawak (SWK) and Melaka (MLK) during the second half of 2021 (period 3).

By contrast, the risk of mortality due to COVID-19 among the child population appeared to have risen from 2020 to 2022 in most parts of the country. It was notable that the COVID-19 severity among the child population in Sabah (SBH) was rather high in 2020 (period 1), and remained quite severe during the first and second half of 2021 (periods 2 and 3). In several states such as Terengganu (TRG), Melaka (MLK) and Sarawak (SWK), the severity was relatively higher during the second half of 2021 (period 3), but declined in 2022 (period 4).

Figure 2 further revealed the delineation in terms of sex in the risk of mortality from COVID-19. States which appeared near the top of the PC3 axis would indicate rather high severity among the female population, especially among female seniors (Table 10). On the contrary, those with large negative scores near the bottom of the axis would indicate high severity among the male population, notably among male adults.

Throughout the four periods, most parts of the country appeared around the middle of the PC3 axis, with the exception of Kelantan (KTN) and Perlis (PLS). Notably, COVID-19 severity in Perlis (PLS) was much higher among the male population in 2020 (period 1), whereas a higher severity among the female population was observed in the state during the first half of 2021 (period 2). Additionally, Kelantan (KTN) exhibited higher COVID-19 severity among the female population in 2020 (period 1).

4 Discussion

This study provided deeper insights into the risk of mortality due to COVID-19 in all states/federal territories in Malaysia, stratified by sex and age groups from January 2020 until April 2022. This work revealed heterogeneity in fatality between states, and identified several states which were more vulnerable during the pandemic, such as Kelantan and Perlis in northern Peninsular Malaysia, and Sabah and Sarawak in East Malaysia. State-level differences in COVID-19 mortalities indicate differences in the severity of the pandemic or the effectiveness of public health interventions, and these are important indicators which guide policy decisions concerning the allocation of limited medical resources during a pandemic [12].

Vaccination coverage, monitoring of hospital capacity, and an evidence-based booster strategy are critical in preventing COVID-19 mortalities [10]. The decline in COVID-19 fatality rates in 2022 points to the success of the mass vaccination rollout in Malaysia. As of February 2022, close to 60% of those aged 18 and above in the country had received the booster dose under the booster programme, which began on 1st October 2021 [22]. However, the acceptance of COVID-19 vaccines varied among different states. At that time, Sarawak had a high booster uptake among its population, and was deemed to be better positioned against the Omicron wave. Kelantan, however, had the lowest uptake of COVID-19 booster shots in the country, while ICU occupancy within the state was the highest in the country [23]. The low vaccination uptake among the Kelantanese population is likely attributable to a lack of confidence in the vaccine, particularly among Muslims who have concerns regarding its compliance with Islamic law [24]. Others have also found that vaccine hesitancy is more common among Muslims due to concerns about safety and the ‘halal’ status of the vaccines [25]. Nevertheless, augmenting clinical studies on COVID-19 booster doses, providing guidance from healthcare professionals or policymakers, and addressing side effect concerns could bolster public confidence and vaccine uptake [26, 27].

The variability in CFR between states may also be attributed to population density, taking into account population size and spatial considerations [28]. In more populous states such as Penang, where districts have high population densities and are highly inter-connected with neighbouring districts, districts are likely to experience a surge in COVID-19 cases within weeks of each other [28]. Higher interstate mobility or higher flows of tourism are also likely reasons for the increase in fatality rates [29]. Perlis is the smallest state in Malaysia, located on the northernmost part of the country’s west coast on the border with Thailand. In 2021, Perlis frequently recorded the highest COVID-19 infectivity rate (Rt) in the country [30,31,32]. The surge in COVID-19 cases was likely caused by high interstate travel involving workers and visitors [33].

In corroboration with other studies, it is evident that COVID-19 affects the senior population the most, and older adults face a much higher risk of developing severe disease or dying from the virus than children [13, 34]. However, it is also notable that fatality rates among children in Sabah were higher in comparison with other states in the country. Many believed that the Sabah state election in September 2020 led to a wave of COVID-19 infections within the state, which spread throughout the country [1]. Sabah was the first state in Malaysia to record more than 10,000 cases [1]. A lack of COVID-19 testing and non-conformance with standard operating procedures regarding physical distancing, wearing face masks and regular hand washing may have exacerbated the severity of the COVID-19 surge. The number of COVID-19 tests conducted per 1000 population in Malaysia, specifically in Sabah, before and after the Sabah cluster, can be considered as another potential factor contributing to the rise in confirmed cases [1]. The surge in cases may have resulted in a spillover effect, leading to a significant number of children under the age of 12 contracting COVID-19 [35].

Previous findings have indicated higher morbidity and mortality from COVID-19 in males than females [36,37,38]. This differential is largely attributed to gendered behaviours or lifestyles, i.e., a higher propensity to smoke and drink among males compared to females, and greater adherence to preventive measures such as frequent handwashing, wearing of face masks, and stay-at-home orders among females than males [36]. While men overall died at a higher rate than women, our findings show that the trend varied between states. The higher risk of fatality from the disease among females in Kelantan is likely due to social or economic factors. Most Kelantanese females are heavily involved in informal business sectors and contribute largely to the state’s economic development [39]. As many of the businesses are operated by females, they were unlikely to have stayed at home during the pandemic. However, other factors may also play a role in the observed disparity [40], and further analysis would be needed to determine the specific reasons behind the higher CFR among older females in Kelantan.

Our findings revealed variation in COVID-19 mortality levels between states, likely attributed to vaccination status, interstate mobility, sex differentials and socioeconomic factors. Lessons can be gleaned from Malaysia as a case study, providing insights into addressing regional disparities and implementing targeted strategies to protect vulnerable population subgroups. It is imperative to strengthen healthcare infrastructure, ensuring access to vital services such as geriatric and paediatric care, to effectively prepare for future health emergencies [41]. Additionally, this study underscores the significance of ongoing data monitoring and analysis for identifying emerging trends, evaluating intervention efficacy, and making well-informed policy decisions [41]. Governments must prioritize transparent and effective communication with the public, disseminating accurate information, promoting adherence to preventive measures, and addressing misconceptions or concerns [42, 43]. By doing so, they lay the foundation for successful public education campaigns, pivotal in raising awareness and nurturing shared responsibility.

There are limitations in this study. First, the reported COVID-19 cases may be an underestimate of the actual disease burden within the Malaysian population [18]. It is important to recognize that the actual case fatality rate (aCFR) is likely lower than the CFR reported here, as other sources estimate considerably more infections than were officially recorded. For instance, Jayaraj et al. [18] suggested that the true number of total infections in Malaysia, after adjusting for underdiagnosis, was approximately 9.95 million cases (95% CI 6.6 million–19 million) between March 2020 and December 2021. This estimate is roughly four times higher than the figure derived from the official data provided by the MOH’s public GitHub resources [17]. We have assumed that, although case under-reporting may alter the CFR, the implication of a lower aCFR applies consistently across all states in Malaysia. As such, we have opted not to apply any state-specific adjustment to the CFR used in this analysis.
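One way to read this assumption is sketched below. The notation is introduced here for illustration only and is not the authors': $D_s$ and $C_s$ are the reported deaths and confirmed cases in state $s$, and $k$ is a single under-reporting factor for cases, taken to be the same in every state (roughly 4 according to the estimate cited above).

```latex
\mathrm{aCFR}_s = \frac{D_s}{k\, C_s} = \frac{\mathrm{CFR}_s}{k},
\qquad
\frac{\mathrm{aCFR}_s}{\mathrm{aCFR}_{s'}} = \frac{\mathrm{CFR}_s}{\mathrm{CFR}_{s'}}
```

Under this reading, a uniform factor lowers every state's fatality rate by the same proportion and leaves the ratio between any two states unchanged. If under-reporting in fact differed between states, this would no longer hold.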

Second, since the death toll and cases of COVID-19 collected by the Malaysian government are up to April 24, 2022, we can only use the available data at that point in time. As the main focus of the study is to investigate the dimensionality of the COVID-19 fatality between states in Malaysia, we only extracted cases and deaths data by state, sex and age groups. Third, we computed the CFR according to four irregular time periods, assuming that patterns in the pandemic are similar within these periods. CFR can be influenced by various factors such as testing capacity, healthcare infrastructure and population demographics. Future research can consider additional contextual information and conduct a more comprehensive analysis to understand the underlying reasons for the differences in CFR between states.

5 Conclusions

Our study distinguishes itself through a detailed examination of sex and age disparities in COVID-19 fatality rates across Malaysian states. By applying PCA, our approach unveils intricate patterns in the data, providing a novel, comprehensive multivariate visualization of COVID-19 fatality throughout Malaysia.

Multivariate visualizations facilitated a standardized comparison between states and revealed heterogeneity in COVID-19 fatality. The identification of vulnerable states, notably in northern Peninsular Malaysia and in East Malaysia, adds depth to our understanding by emphasizing regional disparities in fatality rates. Furthermore, the noted sex disparity in Kelantan and the comparatively higher CFR among the child population in Sabah offer fresh perspectives, enhancing our understanding of the complex interplay between demographic factors and COVID-19 outcomes.

The findings have practical implications for healthcare practices in Malaysia, suggesting a need for adjustments in policies and procedures. Specifically, targeted interventions and resource allocation are recommended in regions identified as vulnerable. Additionally, recognizing demographic disparities, particularly among seniors and children, calls for tailored healthcare approaches.

These insights can also inform global healthcare recommendations, advocating for a more nuanced and region-specific approach in the development of international health policies. The integration of these insights into healthcare policymaking not only enhances the resilience of Malaysia’s health system but also fosters a collaborative global approach, emphasizing the importance of sharing best practices and innovative strategies to address health disparities on an international scale [41].