It’s DNA Day 2019!
DNA Day commemorates the completion of the Human Genome Project in April 2003 and the discovery of the double helix of DNA in 1953.
In honor of the day, we took a closer look at the ~20,000 protein-coding genes in our DNA.
We’ve been able to associate just under 5,500 of the 20,000 genes with a unique disease in our knowledge graph, which unifies data from dozens of databases and more than 28 million articles read with NLP.
Most mentioned genes in research
The most-mentioned genes skew heavily toward cancer and other high-prevalence diseases, and many of them are molecular markers. Understandably, research with the potential to impact more lives gets larger budgets and more attention. In case you were wondering, here are the top 20 genes appearing in research, each with its article count.
|Gene||Article count||Description|
|TP53||9291||Tumor Protein 53 (TP53) is a tumor suppressor gene that plays a critical role in maintaining genomic integrity and preventing mutations. Modifications in this gene can lead to a wide variety of cancers, including the hereditary cancer syndrome Li-Fraumeni syndrome.|
|TNF||5643||Tumor necrosis factor (TNF) is a multifunctional proinflammatory cytokine. Mainly secreted by macrophages, TNF is involved in the regulation of a wide spectrum of biological processes and has been studied in immune diseases such as multiple sclerosis and rheumatoid arthritis.|
|EGFR||5321||The epidermal growth factor receptor (EGFR) controls cell differentiation and proliferation processes and is a potential cancer marker. Somatic mutations in this gene are specifically linked with non-small cell lung cancers.|
|VEGFA||4423||Vascular Endothelial Growth Factor A (VEGFA) encodes a heparin-binding protein that is crucial for physiological and pathological angiogenesis. VEGFA is an inflammatory marker that has been frequently associated with renal cell carcinoma, non-Hodgkin and mantle cell lymphoma, hypertensive intracerebral hemorrhage, Alzheimer’s and diabetic retinopathy.|
|IL6||4322||The interleukin 6 (IL6) gene encodes the cytokine interleukin 6, which plays an important role in inflammation and B cell maturation. IL6 has been studied in acute pancreatitis, periodontitis, Kaposi sarcoma, rheumatoid arthritis, and other autoimmune diseases.|
|APOE||4252||Apolipoprotein E (APOE) encodes a protein that is essential for normal fat metabolism. Mutations in the gene may cause a variety of age-related complications, such as hearing loss, macular degeneration, Alzheimer’s and cardiovascular disease.|
|TGFB1||4101||Transforming growth factor beta 1 (TGFB1) is a multifunctional peptide that controls cell proliferation, differentiation, motility, and apoptosis. Mutations in this gene may cause Camurati-Engelmann disease and several types of cancer.|
|MTHFR||3431||The Methylenetetrahydrofolate Reductase (MTHFR) gene controls the conversion of the amino acid homocysteine into methionine. Mutations in the gene are linked to homocystinuria, spina bifida, anencephaly and other conditions.|
|ESR1||3092||Estrogen receptor 1 (ESR1) is an important biomarker for inflammation and has been associated with estrogen receptor-positive breast cancer, primary ovarian insufficiency, estrogen resistance, myocardial infarction and more.|
|AKT1||3088||The AKT1 gene encodes a serine/threonine protein kinase that plays an important role in cell proliferation and differentiation. This gene is a cancer marker, studied frequently in metastatic prostate, oesophageal and vulvar squamous cell carcinoma, as well as in Proteus syndrome.|
|HIF1A||2976||Hypoxia-inducible factor-1 alpha (HIF1A) is a transcription factor that plays an essential role in cellular and systemic homeostatic responses to hypoxia. Mutations in the gene may also lead to retinal ischemia.|
|NFKB1||2941||Nuclear Factor Kappa B Subunit 1 (NFKB1) plays a crucial role in transcription regulation. Abnormal activation of this gene has been associated with a number of inflammatory diseases while persistent inhibition of NFKB1 leads to inappropriate immune cell development or delayed cell growth.|
|IL10||2938||Interleukin 10 (IL10) is a key anti-inflammatory cytokine produced by activated immune cells that plays a critical role in immune responses. Mutations in the gene are associated with HIV-1 susceptibility, rheumatoid arthritis and other conditions.|
|BRCA1||2778||Breast Cancer Type 1 (BRCA1) plays critical roles in DNA repair, cell cycle checkpoint control, and maintenance of genomic stability. It is most notoriously associated with breast and ovarian cancer. Numerous pathogenic variants in BRCA1 have been identified.|
|ERBB2||2759||Erythroblastic Oncogene B2 (ERBB2) encodes Erb-B2 receptor tyrosine kinase 2, which facilitates cell proliferation and suppresses apoptosis. Overexpression of this gene has been associated with a variety of cancers.|
|MMP9||2690||Matrix Metalloproteinase 9 (MMP9) is a neoplastic marker that has been widely studied in breast, invasive prostate, papillary thyroid, hepatocellular and ovarian cancer. It has also been identified as an inflammation marker. MMP9 proteins are essential for the breakdown of extracellular matrix in various physiological processes, including embryonic development, reproduction and tissue remodeling.|
|IL1B||2658||Interleukin 1 beta (IL1B) belongs to a cytokine family that mediates the acute phase response. Mutations in the gene are associated with gastric cancer, periodontal disease and other conditions. Overexpression of IL1B can also lead to various autoinflammatory syndromes.|
|HLA-DRB1||2654||The human leukocyte antigen-DR beta 1 (HLA-DRB1) complex plays a critical role in the effective functioning of the immune system. Mutations in this gene are associated with autoimmune Addison disease, multiple sclerosis, rheumatoid arthritis, and type 1 diabetes.|
|STAT3||2547||The signal transducer and activator of transcription 3 (STAT3) gene plays a key role in cellular processes such as cell growth and apoptosis. Mutations in STAT3 may lead to cancer, autoimmune disorders and autosomal dominant hyper-IgE syndrome.|
|APP||2520||The amyloid precursor protein (APP) gene encodes an integral membrane protein, the amyloid precursor protein. Evidence suggests it controls synapse formation and neural plasticity. Mutations in the gene are linked with autosomal dominant Alzheimer’s disease and hereditary cerebral amyloid angiopathy.|