Serum Calcium in Relation to COVID-19 in Asian Population: A Two-Sample Mendelian Randomization Study

Objective: To explore the causal relationship between serum calcium concentration and the risk of COVID-19 in an Asian population using two-sample Mendelian randomization. Methods: Summary statistics from genome-wide association studies (GWAS) of serum calcium concentration were obtained through a literature search and subjected to secondary analysis. From the large-sample GWAS summary data, genetic variants closely associated with serum calcium concentration were selected as instrumental variables. Five models (inverse variance weighted [IVW], MR-Egger, weighted median, simple mode and weighted mode) were used to evaluate the causal relationship between serum calcium concentration and the risk of COVID-19, expressed as odds ratios (OR) with 95% confidence intervals (CI). Results: A total of 8 SNPs were included as instrumental variables. The pleiotropy test showed no evidence of genetic pleiotropy (P = 0.54). The IVW results showed that each one-standard-deviation increase in serum calcium concentration reduced the risk of COVID-19 (OR = 0.01, 95% CI: 0.0001-0.73, P = 0.03). Conclusion: Appropriately increasing serum calcium concentration can help prevent the onset of COVID-19.


Introduction
On February 11, 2020, the Director-General of the World Health Organization, Tedros Adhanom Ghebreyesus, announced in Geneva, Switzerland, that the disease caused by the novel coronavirus SARS-CoV-2 had been named "COVID-19" [1]. Since December 2019, the virus has continued to spread widely around the world. As of 0:08 on October 13, 2021, Beijing time, there had been a total of 237,655,302 confirmed cases of COVID-19 worldwide and a total of 4,846,981 deaths, which has brought huge disasters to society [2]. At present, there is no effective medical treatment that eliminates the virus. Therefore, it is very important to prevent infection with the novel coronavirus. In recent years, with the vigorous development of nutrition science, the role of mineral elements has been increasingly valued by the scientific community. Mineral elements, such as calcium, are active substances involved in the intermediate steps of the body's metabolism. Calcium is the most abundant inorganic element in the human body [3].
It has the following biological functions [3]:

3. Acting as a messenger;

4. Helping to regulate enzyme activity;

6. Reducing the permeability of capillaries and cell membranes [4,5].

Therefore, this study uses the two-sample Mendelian randomization (2SMR) method to conduct a secondary analysis of genome-wide association study (GWAS) data and explore the causal relationship between serum calcium concentration and the risk of COVID-19.

Study Subjects
Material Source: Serum calcium concentration is the exposure, and whether a person is infected with COVID-19 is the outcome. GWAS summary data for serum calcium concentration and for COVID-19 infection status were downloaded from the MR-Base database and the GWAS Catalog database (Table 1).

Screening of Instrumental Variables: A genetic variant must meet three basic requirements to serve as an instrumental variable (IV): it is associated with the exposure; it is not associated with confounders of the relationship between exposure and outcome; and it affects the outcome only through the exposure. First, using whole-genome data from Asian populations as the reference, SNPs associated with serum calcium concentration in adults at the genome-wide significance level (P < 5 × 10⁻⁸) were selected. In genetics, it is generally believed that loci lying very close together on a chromosome are "bundled" together and passed on to offspring.
This also results in a large r² between loci that are very close together. To ensure independence between SNPs, this study set the linkage disequilibrium threshold (r²) to 0.001 and the genetic distance to 10,000 kb; that is, all SNPs within 10,000 kb of the top SNP with r² > 0.001 were removed. Second, if an instrumental variable was not found in the outcome data set, the SNiPA website was used to search for a proxy SNP (r² > 0.9); if no proxy SNP could be obtained, the instrumental variable was excluded. After that, the allele effect values in the exposure and outcome data sets were aligned and the data were merged [6].

Statistical Analysis
The F statistic, F = [(N − K − 1) / K] × [R² / (1 − R²)], was calculated to evaluate whether there are weak instrumental variables. Here, N is the sample size of the exposure data set, K is the number of SNPs, and R² is the proportion of variation in the exposure that is explained by the SNPs [8]. R² is calculated as R² = 2 × (1 − MAF) × MAF × (β / SD)², where MAF is the minor allele frequency (treated as equivalent to the effect allele frequency, EAF, when calculating R²), β is the allele effect value, and SD is the standard deviation.
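As a minimal sketch of the weak-instrument check, the Python functions below compute R² and the F statistic from the quantities defined above (N, K, MAF, β, SD). The formulas used are the forms commonly seen in two-sample MR studies and are an assumption here, as are the function names and the input values, which are illustrative and not taken from this study.

```python
def snp_r2(maf: float, beta: float, sd: float) -> float:
    """Proportion of exposure variance explained by one SNP.

    Assumed formula: R2 = 2 * (1 - MAF) * MAF * (beta / SD)^2.
    """
    return 2.0 * (1.0 - maf) * maf * (beta / sd) ** 2


def f_statistic(r2: float, n: int, k: int) -> float:
    """F statistic for k instruments that together explain r2 of the exposure.

    Assumed formula: F = (N - K - 1) / K * R2 / (1 - R2).
    """
    return (n - k - 1) / k * r2 / (1.0 - r2)


if __name__ == "__main__":
    # Hypothetical single-SNP inputs, for illustration only.
    r2 = snp_r2(maf=0.30, beta=0.05, sd=1.0)
    print(f"R2 = {r2:.5f}, F = {f_statistic(r2, n=50_000, k=1):.2f}")
```

An F statistic above 10 is the threshold conventionally used to argue that an instrument is not weak.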

Gene Pleiotropic Test:
When a genetic variant can affect the outcome through pathways other than the exposure, the instrumental variable is pleiotropic [9].
The existence of pleiotropy invalidates the two-sample Mendelian randomization model. Therefore, this study used the "MR_PLEIOTROPY" test in R to assess pleiotropy of the instrumental variables. In addition, the MR-Egger model was used to correct for the bias caused by pleiotropy.
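The idea behind the MR-Egger intercept can be sketched as a weighted regression of the SNP-outcome effects on the SNP-exposure effects with a free intercept; an intercept close to zero suggests no directional pleiotropy. The Python code below is a simplified illustration and not the R routine used in this study. The function name, the orientation of SNPs to positive exposure effects, and the rule of never shrinking standard errors below the fixed-effect value are assumptions following common practice.

```python
import math


def mr_egger(beta_exp, beta_out, se_out):
    """Weighted least squares of beta_out on beta_exp with an intercept.

    Returns (intercept, se_intercept, slope, se_slope). Weights are 1 / se_out^2.
    """
    # Orient every SNP so that its exposure effect is positive (assumed convention).
    x = [abs(bx) for bx in beta_exp]
    y = [by if bx >= 0 else -by for bx, by in zip(beta_exp, beta_out)]
    w = [1.0 / s ** 2 for s in se_out]

    sw = sum(w)
    sx = sum(wi * xi for wi, xi in zip(w, x))
    sy = sum(wi * yi for wi, yi in zip(w, y))
    sxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    sxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))

    det = sw * sxx - sx ** 2
    slope = (sw * sxy - sx * sy) / det
    intercept = (sxx * sy - sx * sxy) / det

    # Residual dispersion; standard errors are not allowed to shrink below
    # the fixed-effect value (assumed convention).
    rss = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    dispersion = max(1.0, rss / (len(x) - 2))
    se_intercept = math.sqrt(dispersion * sxx / det)
    se_slope = math.sqrt(dispersion * sw / det)
    return intercept, se_intercept, slope, se_slope
```

A P value for the intercept would follow from comparing intercept / se_intercept with a t distribution on (number of SNPs − 2) degrees of freedom; that step is omitted to keep the sketch dependency-free.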

Linkage Disequilibrium Test:
In genetics, it is generally believed that loci lying very close together on a chromosome are "bundled" together and passed on to offspring, which makes r² between such loci very large; this phenomenon is called linkage disequilibrium. Therefore, in this study, the linkage disequilibrium threshold (r²) was set to 0.001 and the genetic distance to 10,000 kb to avoid linkage disequilibrium among the instruments. That is, all SNPs within 10,000 kb of the top SNP with r² > 0.001 were removed.
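The pruning rule described above can be sketched as a greedy clumping procedure: SNPs are visited from most to least significant, and a SNP is dropped if it lies within the window of an already retained SNP and exceeds the r² threshold with it. The Python code below is an illustration under assumptions: the function name, the dictionary layout, the SNP identifiers and the r² values are hypothetical, and pairwise r² is assumed to have been computed beforehand from a reference panel.

```python
def clump(snps, r2_lookup, r2_threshold=0.001, window_kb=10_000):
    """Greedy LD clumping; returns the IDs of the retained SNPs.

    snps: list of dicts with keys "id", "chrom", "pos" (base pairs) and "p".
    r2_lookup: dict mapping frozenset({id_a, id_b}) to pairwise r^2.
               Pairs missing from the lookup are treated as r^2 = 0 (assumption).
    """
    kept = []
    for snp in sorted(snps, key=lambda s: s["p"]):  # most significant first
        in_ld_with_kept = any(
            lead["chrom"] == snp["chrom"]
            and abs(lead["pos"] - snp["pos"]) <= window_kb * 1_000
            and r2_lookup.get(frozenset((lead["id"], snp["id"])), 0.0) > r2_threshold
            for lead in kept
        )
        if not in_ld_with_kept:
            kept.append(snp)
    return [snp["id"] for snp in kept]


# Hypothetical example data, for illustration only.
EXAMPLE_SNPS = [
    {"id": "snp_a", "chrom": 1, "pos": 1_000_000, "p": 1e-20},
    {"id": "snp_b", "chrom": 1, "pos": 1_500_000, "p": 1e-10},   # close to snp_a
    {"id": "snp_c", "chrom": 1, "pos": 50_000_000, "p": 1e-9},   # outside the window
    {"id": "snp_d", "chrom": 2, "pos": 1_200_000, "p": 1e-12},   # other chromosome
]
EXAMPLE_R2 = {frozenset(("snp_a", "snp_b")): 0.4}
```

With the example data, snp_b is removed because it is in linkage disequilibrium with the more significant snp_a, while the other three SNPs are retained.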

Heterogeneity Test:
Cochran's Q test is used to judge the heterogeneity of the instrumental variables [10]. To strengthen the reliability of the results, the "leave-one-out" method is used for sensitivity analysis: each SNP is removed in turn, and the remaining SNPs are used as IVs for two-sample MR analysis to determine whether any single SNP has a strong influence on the causal estimate.
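Both checks can be illustrated with a short Python sketch built on a fixed-effect inverse variance weighted (IVW) estimate of per-SNP Wald ratios. This is a simplified illustration and not the R code used in this study; the function names and the first-order weights (which ignore uncertainty in the SNP-exposure effects) are assumptions.

```python
import math


def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate and standard error from per-SNP Wald ratios."""
    weights = [(bx / s) ** 2 for bx, s in zip(beta_exp, se_out)]
    ratios = [by / bx for bx, by in zip(beta_exp, beta_out)]
    estimate = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    return estimate, math.sqrt(1.0 / sum(weights))


def cochran_q(beta_exp, beta_out, se_out):
    """Cochran's Q around the IVW estimate; returns (Q, degrees of freedom)."""
    estimate, _ = ivw_estimate(beta_exp, beta_out, se_out)
    q = sum(
        (bx / s) ** 2 * (by / bx - estimate) ** 2
        for bx, by, s in zip(beta_exp, beta_out, se_out)
    )
    return q, len(beta_exp) - 1


def leave_one_out(beta_exp, beta_out, se_out):
    """IVW estimate recomputed with each SNP removed in turn."""
    estimates = []
    for i in range(len(beta_exp)):
        keep = [j for j in range(len(beta_exp)) if j != i]
        est, _ = ivw_estimate(
            [beta_exp[j] for j in keep],
            [beta_out[j] for j in keep],
            [se_out[j] for j in keep],
        )
        estimates.append(est)
    return estimates
```

Q is compared with a chi-square distribution on the returned degrees of freedom; a leave-one-out estimate that differs markedly from the full estimate points to an influential SNP.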

All of the above analyses were performed in R v4.0.5, and a two-sided P < 0.05 was considered statistically significant.

Information About Instrumental Variables
This study finally included a total of 8 SNPs as instrumental variables (Table 2). R² was 0.25%, the F statistics of the individual SNPs ranged from 15.32 to 38.50, and the F statistic for the 8 SNPs combined was 22.32, indicating that the causal estimate is unlikely to be affected by weak instrument bias [11]. The MR-Egger regression intercept was 0.33, indicating no genetic pleiotropy between the SNPs and the risk of COVID-19. The MR_PLEIOTROPY test likewise showed no pleiotropic bias (P = 0.54).

Two-Sample MR Result
In Asian populations, a higher serum calcium concentration is associated with a reduced risk of COVID-19 (Figures 1 and 2).
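Because the outcome is binary, MR estimates of this kind are obtained on the log-odds scale and then exponentiated to give an odds ratio per standard deviation of exposure. The Python sketch below shows that conversion under a normal approximation; the function name and the input values are hypothetical and are not the estimates of this study.

```python
import math


def odds_ratio_with_ci(log_or: float, se: float, z: float = 1.96):
    """Convert a log-odds estimate and its standard error to (OR, lower, upper)."""
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


if __name__ == "__main__":
    # Hypothetical log-odds estimate and standard error, for illustration only.
    odds_ratio, lower, upper = odds_ratio_with_ci(log_or=-0.7, se=0.3)
    print(f"OR = {odds_ratio:.2f}, 95% CI: {lower:.2f}-{upper:.2f}")
```

An interval that lies entirely below 1 corresponds to a protective association at the two-sided 0.05 level.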

Heterogeneity Test Result
The Cochran's Q test for IVW gave P = 0.87, indicating no heterogeneity among the SNPs; the Cochran's Q test for MR-Egger regression was consistent with this result.
The results of the "leave-one-out" sensitivity analysis are shown in Figure 3. No matter which SNP was removed, the results for the remaining 7 SNPs were similar to the results for all included SNPs, and no SNP with a large influence on the causal estimate was found.

Conclusion
Appropriately increasing serum calcium concentration can help prevent the onset of COVID-19.

Discussion
This study uses large-scale GWAS summary data and two-sample MR to explore the causal relationship between serum calcium concentration and the risk of COVID-19 in Asian populations. The results show that serum calcium can reduce the risk of COVID-19. This study has the following advantages. First, the use of public GWAS data saves research cost and time. Second, the two GWAS data sets both come from Asian populations and are independent of each other, which avoids population stratification bias. Third, compared with using a single SNP as the IV, using 8 SNPs together as IVs increases the proportion of variation in calcium concentration that can be explained by genetic variation. Fourth, compared with traditional experimental studies, MR simulates a more realistic random allocation process, the research design is relatively simple, and conducting the research does not raise ethical problems. At the same time, this study also has some limitations [12]. First, MR cannot further explore and explain the biological mechanism by which genetic variation influences the pathogenesis of COVID-19.

Previous studies have shown that in
Second, MR assumes a linear relationship between exposure and outcome. If the relationship between calcium ion concentration and the risk of COVID-19 is non-linear, MR will not be suitable for causal inference between the two. Third, this study uses two GWAS summary data sets and lacks individual-level data, so it cannot conduct subgroup analyses by age or sex or compare causal effects between subgroups. At the same time, the small sample size of the COVID-19 data set reduces the statistical power of causal inference. Finally, this study did not adjust for immune-related indicators, and the MR method can only make preliminary inferences about causality, so further research is needed. In summary, this study used two-sample MR to investigate the relationship between serum calcium concentration and the risk of COVID-19. The results showed a negative causal relationship between the two, that is, an increase in serum calcium concentration can reduce the risk of COVID-19, which can provide a certain reference value for the clinical treatment process.