Estimating the number of unidentified cases of COVID-19 in Italy as of March 31st using South Korean and Chinese mortality rates
Andrea Molle, Chapman University, California USA.
The global panic around the COVID-19 epidemic is fed by alarming estimates of its mortality rate. Italy in particular is watched with great anxiety as a potential global-scale scenario, with a mortality rate currently estimated at about 10%. Using the mortality rates by age group identified in China and South Korea as theoretical mortality rates, and comparing them with the number of deaths in Italy to estimate the number of unidentified COVID-19 cases, I suggest that as many as 500,000 infected, asymptomatic individuals are not included in the official count. This, in turn, results in an overestimation of the overall gross mortality rate, which probably falls around 2%. This has strategic public policy implications for our quarantine and mitigation strategies.
The official number of cases and deaths from COVID-19 in Italy represents a mystery, for the disease seems to have taken on a more aggressive and lethal form than in other countries, with a mortality rate currently estimated at about 10%. In this research note, I assume that this is a statistical artifact and a consequence of extremely unreliable data on the true total number of cases in Italy. First, there is one main factor that contributes to an underestimation of the total number of cases. Italy appears to have performed fewer tests than other countries and, more importantly, it is testing only individuals who experience severe symptoms and who ultimately require hospitalization. Many of the currently infected, asymptomatic people are therefore not included in the official count. Secondly, in more acute cases, there is a lag of about 8 to 10 days between the initial onset of symptoms and the death of the patient. All this clearly results in an overestimation of the overall mortality rate.
Here I suggest that it is possible to get a better understanding of the actual spread of the contagion in Italy by using the mortality rates by age group identified in China and South Korea as theoretical mortality rates and comparing them with the number of deaths in Italy to estimate the number of unidentified COVID-19 cases.
Estimates of Total Cases
First, we need to consider mortality rates in China using the most recent data available. Being the first country to experience an outbreak of COVID-19, it is now probably the closest to having a conclusive outcome for most of its active cases. Chinese estimates are, however, considered highly problematic and show a staggering difference between the mortality rates in Wuhan and in the rest of the country. Therefore, we advise extreme caution when using them as a reference.
The following table (Table 1) computes estimates of the total cases in Italy using Chinese mortality rates as a reference. Using the official number of deaths by age group reported by the Italian Ministry of Health as of March 30th (column B of the table), we estimate the number of true cases by age group (column D), assuming that Italy has the same mortality by age group as China. This is done by dividing the number of deaths in each age group by the corresponding theoretical mortality rate. By subtracting the number of official cases (column C) from these estimates, we determine the estimated number of infected people who are not yet identified (column F). In comparing the latter with the official Italian data, we assume that the more the detected lethality differs from the theoretical mortality, the more infected people remain unidentified.
Table 1 – Estimated true cases (Chinese mortality reference)
For example, to estimate the true number of infected in the 70 to 79 bracket, we divide the number of deaths officially recorded for this age group (3,458) by the corresponding mortality estimated from the Chinese data (8%). This gives a projection of 43,225 cases, which is 25,761 more than the 17,464 currently detected. By repeating this for each age bracket, with the exception of the <30 bracket for which we do not have mortality data available, we estimate that the total number of true cases is 169,408.
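The bracket-level arithmetic above can be reproduced in a few lines. This is a minimal sketch rather than the computation actually used for the tables: the helper name `estimate_true_cases` and its signature are illustrative assumptions, while the three input figures are the ones quoted in the text for the 70 to 79 bracket.

```python
def estimate_true_cases(deaths, detected_cases, reference_rate):
    """Project true cases for one age bracket from a reference mortality rate.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if not 0 < reference_rate <= 1:
        raise ValueError("reference_rate must be a fraction in (0, 1]")
    # Deaths divided by the theoretical mortality rate, rounded to whole cases.
    projected = round(deaths / reference_rate)
    # Projected cases that do not appear in the official count.
    undetected = projected - detected_cases
    return projected, undetected


# Figures quoted in the text: 70-79 bracket, Chinese reference rate of 8%.
projected_cn, undetected_cn = estimate_true_cases(3458, 17464, 0.08)
print(projected_cn, undetected_cn)
```

Run as written, this prints 43225 and 25761, matching the projection and the number of undetected cases given in the example.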
Let us now consider mortality rates in South Korea as of March 30th. The East Asian country has the most accurate estimates of the true size of the infection thanks to its extensive testing; it has already reached its peak in cases and is not far from having a conclusive outcome for most of its currently active cases. Because of the more reliable data and the country's structural and demographic similarities with Italy, and assuming that the standards for reporting case outcomes are the same in both countries, we recommend using the estimates based on the Korean case. In other words, the mortality rates by age group in South Korea are a better approximation than those of China of what the true Italian mortality rates should be. We adopt the same procedure as we did with China; the results are shown in the following table.
Table 2 – Estimated true cases (South Korea mortality reference)
Following the previous example, to estimate the true number of infected in the 70 to 79 bracket using the South Korean mortality rates, we divide the number of deaths officially recorded for this age group (3,458) by the corresponding mortality estimated from Korean data (5.27%). This gives a projection of 65,617 cases, which is 48,153 more than the 17,464 currently detected. By repeating this for each age bracket, with the exception of the <30 bracket for which we do not have mortality data available, we estimate this time that the total number of true cases could be as large as 416,270.
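Because the projection is the number of deaths divided by the reference rate, it is inversely proportional to that rate, so the choice of reference country matters a great deal. The sketch below, which uses only the 70 to 79 figures quoted in the text, puts the two references side by side; the variable names are illustrative assumptions.

```python
DEATHS_70_79 = 3458      # official deaths, 70-79 bracket (from the text)
DETECTED_70_79 = 17464   # officially detected cases, 70-79 bracket (from the text)

# Reference mortality rates for the 70-79 bracket quoted in the text.
reference_rates = {"China": 0.08, "South Korea": 0.0527}

projections = {}
for country, rate in reference_rates.items():
    projected = round(DEATHS_70_79 / rate)      # projected true cases
    undetected = projected - DETECTED_70_79     # cases missing from the official count
    projections[country] = (projected, undetected)
    print(f"{country}: {projected:,} projected, {undetected:,} undetected")

# Ratio between the two projections; it tracks the ratio of the two rates (0.08 / 0.0527).
ratio = projections["South Korea"][0] / projections["China"][0]
print(f"South Korea / China projection ratio: {ratio:.2f}")
```

The Korean reference yields a projection roughly one and a half times the Chinese one for this bracket, which is why the totals in the two tables differ so widely.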
Finally, to obtain a more accurate estimate of unidentified cases, we can factor the window from contagion to death into our calculations. I computed an estimate of future deaths by regressing the current distribution of cases with a fatal outcome up to March 30th. I then opted for a conservative prediction of 14,574 total deaths by April 5th and redistributed them across age brackets using the same proportions as in the original Italian data.
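The redistribution step can be sketched as a simple proportional split. The function name `redistribute_deaths` is a hypothetical helper, and the bracket counts in the example are placeholder numbers chosen to make the arithmetic easy to follow; they are not the Italian figures, which are given in the tables. The regression used to obtain the projected total is not specified here and is not reproduced.

```python
def redistribute_deaths(projected_total, current_deaths_by_bracket):
    """Split a projected death total across brackets in proportion to current deaths.

    Hypothetical helper: the name and signature are illustrative only.
    """
    current_total = sum(current_deaths_by_bracket.values())
    if current_total <= 0:
        raise ValueError("current deaths must sum to a positive number")
    return {
        bracket: projected_total * deaths / current_total
        for bracket, deaths in current_deaths_by_bracket.items()
    }


# Placeholder counts for illustration only; these are NOT the Italian data.
toy_current = {"60-69": 100, "70-79": 300, "80+": 600}
toy_projected = redistribute_deaths(2000, toy_current)
print(toy_projected)
```

Each bracket keeps its share of the total (here 10%, 30% and 60%), so the projected deaths sum to the projected total by construction.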
Table 3 – Estimated true cases with projected deaths (South Korea mortality reference; death cases adjusted for onset-to-death window)
Once again, we can use the resulting distribution to estimate the true number of infected in the 70 to 79 bracket using the South Korean mortality rates. Dividing the projected number of deaths for this age group (5,026) by the corresponding mortality from the Korean data (5.27%), we obtain a projection of 95,374 cases, which is 77,910 more than the 17,464 currently detected. By repeating this for each age bracket, with the exception of the <30 bracket for which we do not have mortality data available, we estimate that the total number of true cases could be as large as 605,330.
The validity of our assumptions and the robustness of our estimates are supported by the resulting mortality rate of 2.408%, which is similar to the Case Fatality Rate at 10 days (2.45%) computed by dividing the number of deaths on March 30th (812) by the cases active at the beginning of March 20th (33,190). The analysis shows that about 78.64% to 85.31% of cases have not been identified, and thus between 327,367 and 516,427 infected people are still potentially contagious. Although these figures should be taken cautiously, the size of the difference between identified and unidentified cases remains alarming. Moreover, as shown in the following table (Table 4), if the true mortality rate in Italy is the same as in South Korea, the age breakdown suggests that more than 70% of undetected cases should be among the active population, between 40 and 69 years old.
Table 4 – Proportion of unidentified cases per age bracket (South Korea mortality reference; current cases vs. adjusted for onset-to-death window)
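The summary figures reported before Table 4 can be cross-checked against each other with basic arithmetic. The sketch below uses only numbers quoted in the text; the variable names are illustrative assumptions.

```python
# All figures below are the ones quoted in the text.
projected_deaths = 14574          # projected total deaths by April 5th
true_cases_current = 416270       # estimate from current deaths (South Korea reference)
true_cases_adjusted = 605330      # estimate from projected deaths (South Korea reference)
unidentified_current = 327367
unidentified_adjusted = 516427

# Mortality rate implied by the adjusted estimate, in percent.
implied_rate = 100 * projected_deaths / true_cases_adjusted
# Case Fatality Rate at 10 days: deaths on March 30th over active cases on March 20th.
cfr_10_days = 100 * 812 / 33190
# Share of cases not identified under each scenario, in percent.
share_current = 100 * unidentified_current / true_cases_current
share_adjusted = 100 * unidentified_adjusted / true_cases_adjusted
# Officially identified cases implied by each scenario.
identified_current = true_cases_current - unidentified_current
identified_adjusted = true_cases_adjusted - unidentified_adjusted

print(f"implied mortality rate: {implied_rate:.3f}%")
print(f"CFR at 10 days: {cfr_10_days:.2f}%")
print(f"unidentified share: {share_current:.2f}% to {share_adjusted:.2f}%")
print(f"identified cases: {identified_current:,} and {identified_adjusted:,}")
```

The output reproduces the 2.408%, 2.45%, 78.64% and 85.31% quoted above, and both scenarios imply the same 88,903 identified cases, so the reported totals are internally consistent.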
Many researchers are now pointing to the importance of comorbidities in determining the severity and the outcome of COVID-19 infection. Having an estimate of undetected cases could help the Italian government, and other governments now facing the same scenario, to better investigate the spread of the virus among their populations. They could then extend targeted testing to underrepresented age brackets, for example by targeting individuals with comorbidities, thereby increasing the effectiveness of their public health strategies in facing the pandemic as well as mitigating public panic.
About the Author
Andrea Molle, Department of Political Science and Institute for the Study of Religion, Economics and Society, Chapman University, Orange, California, 92866 USA
The Novel Coronavirus Pneumonia Emergency Response Epidemiology Team. The Epidemiological Characteristics of an Outbreak of 2019 Novel Coronavirus Diseases (COVID-19) — China, 2020. China CDC Weekly, 2020, 2(8): 113-122.
Age distribution of coronavirus (COVID-19) cases in South Korea as of March 30, 2020, Korean Center for Disease Control. Retrieved through link [Retrieved on March 30th, 2020. The site updates regularly, mortality rates are subject to change].
Characteristics of COVID-19 patients dying in Italy. Report based on available data on March 30th, 2020, Istituto Superiore di Sanità: Link [Retrieved on March 30th, 2020. The site updates regularly, mortality rates are subject to change].
COVID-19 Italia – Monitoraggio situazione by Protezione Civile: Link [Retrieved on March 30th, 2020. The site updates regularly, case numbers are subject to change].
A. C. Ghani, C. A. Donnelly, D. R. Cox, J. T. Griffin, C. Fraser, T. H. Lam, L. M. Ho, W. S. Chan, R. M. Anderson, A. J. Hedley, G. M. Leung, Methods for Estimating the Case Fatality Ratio for a Novel, Emerging Infectious Disease, American Journal of Epidemiology, Volume 162, Issue 5, 1 September 2005, Pages 479–486, Link.