ISSN: 2574-1241


Review Article (Open Access)

Artificial Intelligence in the Future Biobanking: Current Issues in the Biobank and Future Possibilities of Artificial Intelligence

Volume 7 - Issue 3

Jae-Eun Lee*

  • Division of Biobank for Health Sciences, Korea National Institute of Health, Korea

Received: July 23, 2018;   Published: July 31, 2018

*Corresponding author: Jae-Eun Lee, Division of Biobank for Health Sciences, Korea National Institute of Health, Korea

DOI: 10.26717/BJSTR.2018.07.001511

Abstract


Medicine is shifting toward personalized and precision medicine based on individual genetic, environmental, clinical, and lifestyle characteristics. The biobank is an essential infrastructure for the successful implementation of personalized and precision medicine. Research on developing artificial intelligence (AI) technology for personalized and precision medicine is being actively conducted; however, research on using AI in biobanking remains scarce. This article presents current issues in the biobank and the future possibilities of AI in biobanking.

Keywords: Artificial intelligence; Biobank; Biobanking; Precision biobanking; Precision medicine

Abbreviations: AI: Artificial Intelligence; ML: Machine Learning; NLP: Natural Language Processing; SOPs: Standard Operating Procedures; STR: Short Tandem Repeat; SNP: Single Nucleotide Polymorphism

Current Issues in the Biobank

A collection plan for biosamples and related data should be established in view of research trends, preanalytical and analytical variables, disease trends, and health-related information. In particular, the biobank must control the entire lifecycle of biosamples, because biosamples such as serum, plasma, urine, and tissue may be affected by preanalytical variables (e.g., biosample collection, processing, movement, and storage conditions) or analytical variables (e.g., the type of analyte and the method of analysis) [1-4]. The biobank may also need to develop and manage dynamic consent for future biomedical research. Dynamic consent makes it possible to economize the recruitment and management of biobank participants [5], and to continuously secure biosamples and related information (e.g., electronic clinical records and life log data) for the follow-up of participants. It also allows research to be carried out flexibly, reflecting new analytical techniques. The current major issues in biobanking for personalized and precision medicine are summarized as follows:

a) Developing and managing dynamic consent

b) Collecting, processing, transporting, storing, and distributing biosamples in consideration of preanalytical and analytical variables

c) Securing information (such as timestamp) on the entire lifecycle of biosamples

d) Selecting and classifying collected biosamples suitable to specific intended uses

e) Establishing a biosample collection plan in consideration of the research and disease trends and inventory status of the biosample.
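Item c) above, securing timestamped information on the entire lifecycle of a biosample, can be illustrated with a small data structure. The following is a minimal sketch only: the class names (`Biosample`, `LifecycleEvent`), the step labels, and the example dates are hypothetical and are not taken from any existing biobank information system.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class LifecycleEvent:
    step: str            # hypothetical step label, e.g. "collection", "processing", "storage"
    timestamp: datetime  # when the step took place


@dataclass
class Biosample:
    sample_id: str
    sample_type: str
    events: list = field(default_factory=list)

    def record(self, step, timestamp):
        """Append one timestamped lifecycle event."""
        self.events.append(LifecycleEvent(step, timestamp))

    def elapsed(self, start_step, end_step):
        """Time between the first occurrences of two steps, or None if either is missing."""
        first_seen = {}
        for event in self.events:
            first_seen.setdefault(event.step, event.timestamp)
        if start_step not in first_seen or end_step not in first_seen:
            return None
        return first_seen[end_step] - first_seen[start_step]


# Illustrative values only.
sample = Biosample("S-0001", "serum")
sample.record("collection", datetime(2018, 7, 1, 9, 0))
sample.record("processing", datetime(2018, 7, 1, 11, 30))
sample.record("storage", datetime(2018, 7, 1, 12, 0))

delay = sample.elapsed("collection", "processing")
print(delay)
```

With records of this kind, the delay between collection and processing becomes a queryable preanalytical variable rather than a free-text note.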

Future Possibilities of AI in Biobanking

Although research on applying AI to various fields, including medicine, is being actively conducted, research on using AI in biobanking remains scarce. However, I believe that in the foreseeable future, a new generation of biobanking using AI will be launched. AI systems are able to process large amounts of data simultaneously and rapidly, and to learn from each incremental case to continually improve accuracy [6]. AI approaches include machine learning (ML) methods and natural language processing (NLP) techniques. ML and NLP techniques extract information from structured data (such as images, genetic data, and electrophysiological recordings) and unstructured data (such as clinical notes and the literature), respectively [7]. ML techniques can reveal complex relationships [6]. NLP techniques translate text-based data into structured data that can be analyzed using ML techniques [8]. In the medical field, AI applications could be utilized for various tasks including diagnosis and outcome prediction of diseases and medical image analysis. Indeed, Watson, a question-answering AI computer system that can answer questions posed in natural language, was developed in IBM’s DeepQA project. IBM Watson has several variants, including Watson for Genomics, which interprets genetic data, and Watson for Oncology, which recommends treatments for cancer patients. AI also has the potential to play a variety of roles in supporting people working in biobanks.

Bioresource Collection using AI

Dynamic consent in biobank research will be acquired and managed through web-based communication between AI and biobank participants; for example, AI-based systems could read and explain the contents of the consent form to the participants and answer their questions. When a participant withdraws consent, the AI system could discard the biosample-related data and ask the biobank’s administrator to destroy the participant’s biosamples. It could also announce research progress to participants in real time. Biosamples are useful if they are obtained in a standardized way. AI will develop standard operating procedures (SOPs) or standardized criteria for the acquisition of biosamples suitable to specific intended uses, through analysis of the literature on preanalytical and analytical variables by the type of biosample; for example, Marzi et al. [3] proposed that, to detect microRNAs for lung cancer diagnosis, blood must be clotted for 2-3 hr at room temperature and the serum separated immediately after centrifugation. AI could establish serum sampling conditions for a microRNA-based clinical test for lung cancer diagnosis by analyzing similar accumulated research results. AI systems will collect and manage information on the history of collection, processing, movement, and storage of biosamples. This information can be used to select biosamples suitable to research purposes. AI also has the potential to interpret various types of medical image data (e.g., magnetic resonance imaging, radiographs, and ultrasound imaging). AI systems will extract significant information from the electronic medical records of biobank participants and will collect information about health status by analyzing the medical image data of participants.
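The consent withdrawal step described above (discard the related data, then ask the administrator to destroy the biosamples) can be sketched as a simple workflow. This is an illustrative sketch under stated assumptions: `ConsentRegistry`, its fields, and the identifiers are hypothetical, everything is held in memory, and the physical destruction of biosamples is modeled only as a queue of requests for a human administrator.

```python
from dataclasses import dataclass, field


@dataclass
class ConsentRegistry:
    consented: set = field(default_factory=set)            # participants with active consent
    data: dict = field(default_factory=dict)                # participant_id -> related data
    samples: dict = field(default_factory=dict)             # participant_id -> list of sample ids
    destruction_requests: list = field(default_factory=list)  # queue for the administrator

    def enroll(self, participant_id, records, sample_ids):
        """Register consent together with the participant's data and sample ids."""
        self.consented.add(participant_id)
        self.data[participant_id] = dict(records)
        self.samples[participant_id] = list(sample_ids)

    def withdraw(self, participant_id):
        """Handle a withdrawal: drop the data and queue the samples for destruction.

        Returns False if the participant has no active consent.
        """
        if participant_id not in self.consented:
            return False
        self.consented.discard(participant_id)
        self.data.pop(participant_id, None)
        for sample_id in self.samples.pop(participant_id, []):
            self.destruction_requests.append((participant_id, sample_id))
        return True


# Illustrative values only.
registry = ConsentRegistry()
registry.enroll("P-001", {"age": 54}, ["S-0001", "S-0002"])
withdrawn = registry.withdraw("P-001")
print(withdrawn, registry.destruction_requests)
```

Keeping data deletion automatic while routing sample destruction through a request queue mirrors the division of labor in the text, where the system acts on data and the administrator acts on the physical biosamples.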

Management using AI

AI will define and measure the quality of biosamples; for example, AI systems could assess DNA integrity from DNA gel electrophoresis images and could determine the percentage of tumor and necrosis from digital histopathology images of tissue samples. The results of short tandem repeat (STR) analysis and single nucleotide polymorphism (SNP) genotyping conducted for quality control of biosamples could be used to judge whether or not the biosamples match the gender information of participants or the DNA sequencing data. In addition, AI will establish a biosample collection plan for future biomedical research by analyzing the biobank’s distribution and inventory status, and research trends (such as publication and patent trends in biomedical research). As biosamples are used for research, empty spaces arise irregularly in biosample storage equipment. If the AI system is linked with an automated sample storage system, it will relocate biosamples to use storage space efficiently.
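The relocation of biosamples to reclaim irregular empty space can be sketched as a compaction plan over storage positions. This is a minimal sketch, not a description of any real automated storage system: the function name `plan_compaction`, the flat list of slots, and the sample labels are assumptions made for illustration, and the two-pointer strategy is simply one way to fill gaps with few moves.

```python
def plan_compaction(slots):
    """Plan moves that fill empty slots (None) with samples taken from the end of the rack.

    Returns (moves, compacted), where each move is (sample_id, from_index, to_index).
    The input list is not modified.
    """
    slots = list(slots)
    moves = []
    left, right = 0, len(slots) - 1
    while left < right:
        if slots[left] is not None:
            left += 1              # position already occupied, look further right
        elif slots[right] is None:
            right -= 1             # nothing to move from here, look further left
        else:
            moves.append((slots[right], right, left))
            slots[left], slots[right] = slots[right], None
            left += 1
            right -= 1
    return moves, slots


# Illustrative rack: None marks a position emptied by distribution.
rack = ["A", None, "B", None, None, "C", "D"]
moves, compacted = plan_compaction(rack)
print(moves)
print(compacted)
```

Each sample is moved at most once, which matters in practice because every relocation exposes a frozen biosample to handling.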

Bioresource Utilization using AI

AI applications will analyze the contents of a research proposal and then recommend biosamples suitable to the specific use. For this, AI could extract important information (e.g., the type of biosample, the type of analyte, the method of analysis, the target disease, and research purposes) from the research proposal and could analyze the literature on preanalytical and analytical variations related to these elements. Next, AI could select biosamples suitable for the study in consideration of biosample collection, processing, and storage details, results of quality control, and participants’ clinical information. Biobanks are able to assess the value of their biosamples through bibliographic analysis of publications and patents specifying the use of their biosamples [9]. AI systems will extract publications and patents specifying the use of the biobank’s biosamples from bibliographic databases (e.g., Scopus and Embase) and will analyze the research purposes described in them.
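The selection step, matching stored biosamples against requirements derived from a proposal, can be sketched as a filter over inventory records. This is an illustrative sketch only: `SampleRecord`, `Requirement`, and `select_samples` are hypothetical names, the inventory is invented, and the only preanalytical criterion modeled is clotting time, using the 2-3 hr range reported by Marzi et al. [3] as the example requirement. Extracting the requirement from the proposal text is assumed to have happened already.

```python
from dataclasses import dataclass


@dataclass
class SampleRecord:
    sample_id: str
    sample_type: str       # e.g. "serum", "plasma"
    clotting_hours: float  # recorded clotting time at room temperature
    passed_qc: bool        # result of quality control


@dataclass
class Requirement:
    sample_type: str
    min_clotting_hours: float
    max_clotting_hours: float


def select_samples(inventory, requirement):
    """Return ids of samples that match the type, clotting window, and QC status."""
    return [
        s.sample_id
        for s in inventory
        if s.sample_type == requirement.sample_type
        and requirement.min_clotting_hours <= s.clotting_hours <= requirement.max_clotting_hours
        and s.passed_qc
    ]


# Invented inventory for illustration.
inventory = [
    SampleRecord("S-0001", "serum", 2.5, True),
    SampleRecord("S-0002", "serum", 5.0, True),    # clotted too long
    SampleRecord("S-0003", "plasma", 2.0, True),   # wrong sample type
    SampleRecord("S-0004", "serum", 2.0, False),   # failed quality control
    SampleRecord("S-0005", "serum", 3.0, True),
]
requirement = Requirement("serum", 2.0, 3.0)
selected = select_samples(inventory, requirement)
print(selected)
```

A production system would need many more criteria (storage temperature, freeze-thaw cycles, clinical information), but the structure stays the same: requirements as data, and selection as a filter over lifecycle records.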

Conclusion

This article describes how AI technology can promote standardization and innovation in biobanking. AI systems will continue to evolve with the development of big data analysis technology [7]. In the foreseeable future, AI technology will enable precision biobanking [2] by supporting the work of people in the biobank. AI systems for biobanking should be developed in a direction that addresses the challenges presented in this article.


References

  1. Betsou F, Bulla A, Cho SY (2016) Assays for Qualification and Quality Stratification of Clinical Biospecimens Used in Research: A Technical Report from the ISBER Biospecimen Science Working Group. Biopreserv Biobank 14(5): 398-409.
  2. Lee JE, Kim YY (2017) Impact of Preanalytical Variations in Blood-Derived Biospecimens on Omics Studies: Toward Precision Biobanking? OMICS 21(9): 499-508.
  3. Marzi MJ, Montani F, Carletti RM (2016) Optimization and Standardization of Circulating MicroRNA Detection for Clinical Application: The miR-Test Case. Clin Chem 62(5): 743-754.
  4. Zimmerman LJ, Li M, Yarbrough WG (2012) Global stability of plasma proteomes for mass spectrometry-based analyses. Mol Cell Proteomics 11(6): M111.014340.
  5. Budin-Ljøsne I, Teare HJ, Kaye J (2017) Dynamic Consent: A potential solution to some of the challenges of modern biomedical research. BMC Med Ethics 18(1): 4.
  6. Buch VH, Ahmed I, Maruthappu M (2018) Artificial intelligence in medicine: Current trends and future possibilities. Br J Gen Pract 68(668): 143-144.
  7. Jiang F, Jiang Y, Zhi H (2017) Artificial intelligence in healthcare: Past, present and future. Stroke Vasc Neurol 2(4): 230-243.
  8. Murff HJ, FitzHenry F, Matheny ME (2011) Automated identification of postoperative complications within an electronic medical record using natural language processing. JAMA 306(8): 848-855.
  9. Lee JE, Kim YY (2018) How Should Biobanks Prioritize and Diversify Biosample Collections? A 40-Year Scientific Publication Trend Analysis by the Type of Biosample. OMICS 22(4): 255-263.