Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system. Improved prevention and treatment will depend on a greater understanding of the causes and mechanisms involved in its onset and progression. MS is clearly driven by both environmental and genetic factors. Established contributory environmental factors include lower ultraviolet radiation exposure and lower vitamin D levels, Epstein-Barr virus and smoking. Our current understanding of MS genetics is undergoing a major upgrade as new genetic technologies are applied to large MS studies. In this article, we review the current literature describing a genetic contribution to MS susceptibility and review the methods to detect genetic variants that may underlie the genetic contribution to MS. We also consider how reporting of genetic discoveries in MS in the lay press has caused some confusion among patients and their families, who, not surprisingly, think that these discoveries can be translated into an available genetic test to diagnose MS or recognise family members at risk of developing MS. We review the current limited clinical use of genetics in the diagnosis and management of MS.
In the 1920s, a worldwide investigation of multiple sclerosis (MS) prevalence found that racial and ethnic differences might indicate a genetic influence.1 Later studies found that MS aggregates in families,2 with a markedly higher risk to siblings than to the general population (OR 16.8 (CI 14.0 to 20.3)).3 This was further supported by twin studies showing higher concordance rates in monozygotic twins (24–30%) than dizygotic twins (3–5%),4,5 and by the observations that there is no increased risk for adoptive relatives6 and an intermediate risk for half-siblings7 and offspring of conjugal pairs.8 Although these epidemiological data suggest that genetic variation is important for susceptibility to MS, gene discovery lagged significantly until recently (figure 1).
The definitions of common genetic terms used in this review are shown in box 1.
Definition of common genetic terms
Haplotype: A set of alleles (DNA sequence) of a group of closely linked genes, such as the human leucocyte antigen complex, which are usually inherited as a unit. An individual inherits a separate haplotype from each parent.
The International HapMap Project: This is an organisation that aims to develop a complete haplotype map (HapMap) of the human genome. The information produced by the project is free to researchers around the world. http://hapmap.ncbi.nlm.nih.gov/
Linkage analysis: A gene-hunting technique that traces patterns of heredity in large, high-risk families, in an attempt to locate a disease-causing gene mutation by identifying traits coinherited with it. Linkage is the tendency for genes and other genetic markers to be inherited together because of their location near one another on the same chromosome.
Linkage disequilibrium: The non-random association of alleles at multiple loci.
Minor allele frequency (MAF): The proportion of chromosomes in the population that carry the less common allele of a variant.
Common variants: Variants with MAF ≥5%.
Rare variants: Variants with MAF <5%.
Single nucleotide polymorphism (SNP): SNPs are the most common type of genetic variant, where a single nucleotide is changed in a DNA sequence. SNPs are considered to be point mutations that are evolutionarily successful. Generally one SNP has two alleles. SNPs may be functional or non-functional; rare variants are more likely to be functional than common variants.
Genome-wide association study: This is an examination of many common genetic variants in different individuals to see if any variant is associated with a trait. It uses genotyping microarrays to measure hundreds of thousands of SNPs in one person's DNA and generally tests thousands of cases and controls.
Exome sequencing: This sequences all the exons in the genome (the exome) and is a cheaper but still effective alternative to whole-genome sequencing. An exon is the protein-coding portion of a gene sequence. Exome sequencing may produce significant false-positive and false-negative rates, and coverage varies between exons, with some not covered at all.
Epigenome: This is the complete set of epigenetic modifications in the genetic material of a cell. Its study is analogous to the study of the genome and proteome of a cell.
Epigenetics: This is the study of inherited changes in phenotype (appearance) or gene expression caused by mechanisms other than changes in the underlying DNA sequence.
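Two of the terms in box 1, minor allele frequency and linkage disequilibrium, are simple quantities that can be computed directly from genotype data. The following is a minimal sketch, assuming biallelic SNPs, unphased genotypes written as two-letter strings for the MAF, and phased two-SNP haplotypes for linkage disequilibrium (summarised here by the commonly used r² statistic). The function names and the toy inputs are illustrative and do not come from the review.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the MAF for one biallelic SNP.

    `genotypes` is a list of two-letter strings such as "AG", one per
    person, so each person contributes two chromosomes to the count.
    """
    allele_counts = Counter("".join(genotypes))
    if len(allele_counts) < 2:
        return 0.0  # only one allele seen in this sample
    total = sum(allele_counts.values())
    return min(allele_counts.values()) / total


def ld_r_squared(haplotypes):
    """Return r^2 between two biallelic SNPs from phased haplotypes.

    `haplotypes` is a list of two-letter strings: the first letter is
    the allele at SNP 1, the second the allele at SNP 2.
    """
    n = len(haplotypes)
    a = haplotypes[0][0]  # arbitrary reference allele at SNP 1
    b = haplotypes[0][1]  # arbitrary reference allele at SNP 2
    p_a = sum(h[0] == a for h in haplotypes) / n
    p_b = sum(h[1] == b for h in haplotypes) / n
    p_ab = sum(h == a + b for h in haplotypes) / n
    d = p_ab - p_a * p_b  # departure from independent assortment
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom > 0 else 0.0


# Toy data (hypothetical): four people at one SNP, four haplotypes at two SNPs.
print(minor_allele_frequency(["AA", "AG", "GG", "AA"]))
print(ld_r_squared(["AC", "AC", "GT", "GT"]))
```

An r² near 1 means that one SNP predicts the other almost perfectly, which is the property that lets a genotyping array use one SNP as a proxy for its neighbours.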
Limited success for family-based linkage analysis and candidate-gene studies
Family-based linkage analysis has been powerful in identifying the genes responsible for Mendelian diseases whose traits follow monogenic patterns of inheritance, for example, cystic fibrosis; it is less helpful in identifying the genes underlying complex traits such as MS. In the 1970s,9 linkage analysis showed that MS was associated with the major histocompatibility complex (MHC) region on chromosome 6 encoding the human leucocyte antigen (HLA) genes. Subsequently, several large MS linkage analyses have failed to identify any locus outside the MHC region, even though the largest study included 730 multiplex families with 2692 individuals.10 The lack of success relates to the limitations of classical genetic techniques (for example, microsatellite-based screens and low-density marker sets), along with disease heterogeneity and the absence of large extended families with a clear and homogeneous mode of transmission. Candidate-gene studies are the practical alternative to linkage analysis, and these have successfully identified some genes contributing to susceptibility to common diseases, but not MS. Generally, candidate-gene studies are based upon biological hypotheses or knowledge of the candidate gene's location through linkage analysis. When the fundamental physiological basis of a disease is unknown, as in MS, the candidate-gene approach is usually unsuccessful.
Success with genome-wide association studies
The HapMap project11 defined linkage disequilibrium patterns across the human genome, allowing development of genome-wide association study (GWAS) tools that use single nucleotide polymorphisms (SNPs) as markers of genetic diversity directly on the basis of linkage disequilibrium. The HapMap project defined roughly 11 million common genetic variations (SNPs) in the human genome and found that many groups of neighbouring SNPs correlate nearly perfectly with each other; this allowed the selection of variants that represent short regions of the genome.12 Many SNPs target a genetic locus and may indicate associations with many genes in an area of linkage disequilibrium within the genome. Therefore, it is important to take care when assigning an SNP to a specific gene. The HapMap data allowed the development of high-density SNP mapping of the genome with technologies that simultaneously assess over a million SNPs on a single chip in a highly cost-effective manner.
The major problem with GWAS is the statistical burden of multiple testing, such that larger and larger studies are needed to detect variants with smaller and smaller ORs. Despite this, GWAS have been highly successful in finding common variants in many complex disorders. Also, newly developed statistical methods have improved the analysis, such as identifying and correcting for population stratification and relatedness,13,14 and imputing ungenotyped variants;15 these approaches increase the accuracy and reliability of GWAS outputs.
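The multiple-testing burden can be made concrete with a small calculation. The sketch below uses a Bonferroni correction, which is one common and conservative way to set a genome-wide significance threshold; it is not necessarily the correction used in the studies cited here. The figure of one million tests is an assumption chosen to match the scale of a modern SNP chip, and the function names are illustrative.

```python
def expected_false_positives(alpha, n_tests):
    """Expected number of null SNPs passing an uncorrected threshold."""
    return alpha * n_tests


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value threshold that keeps the family-wise error at alpha."""
    return alpha / n_tests


def passes_genome_wide(p_value, alpha=0.05, n_tests=1_000_000):
    """True if a SNP's p-value survives the Bonferroni-corrected threshold."""
    return p_value < bonferroni_threshold(alpha, n_tests)


# Assumed scale: one million independent tests at a nominal alpha of 0.05.
print(expected_false_positives(0.05, 1_000_000))
print(bonferroni_threshold(0.05, 1_000_000))
print(passes_genome_wide(1e-9), passes_genome_wide(1e-5))
```

Because the corrected threshold is so strict, a variant with a small OR reaches it only when the sample is very large, which is why GWAS sample sizes keep growing.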
To date, GWAS have detected over 60 loci associated with MS outside the HLA region, and have confirmed the major role of the HLA-DRB1*15:01 (HLA-DR15) gene (OR 3.08, p<10^−312).16 All the non-HLA associations are common variants with modest risk (OR in the region of 1.1–1.3), and many are near genes with key roles in the immune system (see supplementary appendix 1).17–27 The largest MS GWAS to date, conducted by the International Multiple Sclerosis Genetics Consortium (IMSGC) and Wellcome Trust Case Control Consortium 2 (WTCCC2), studied 9772 cases and 17 376 controls and identified a further 29 novel susceptibility loci, while replicating almost all the previously associated loci (see supplementary appendix 1). It also pointed out that the identified genetic regions were in, or near to, genes with a primary role in T-cell-mediated immune mechanisms.16
GWAS explain a small part of the heritability of MS
Although GWAS have identified over 60 loci associated with MS, these explain only a small fraction of MS heritability. In our own studies, taking account of the identified loci, allele frequencies and sibling relative risk using Risch's gene effect measuring method,28 we found that all the identified loci explained only 18–24% of the heritability of MS; HLA-DR15 explained 11%3 (figure 2). Hence the obvious question: ‘Where is the missing heritability?’
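To illustrate how a locus's contribution to familial risk can be estimated from its allele frequency and effect size, the sketch below uses a widely quoted formulation from Risch's work for a multiplicative model: the locus-specific sibling risk ratio is λ = (1 + w/2)², with w = pq(γ − 1)²/(pγ + q)², and the share of familial risk explained is taken as the sum of log λ over loci divided by the log of the overall sibling risk. This may not be the exact calculation behind the figures quoted above. The inputs are assumptions: the OR of 3.08 stands in for the per-allele relative risk γ, the sibling OR of 16.8 stands in for the overall sibling risk ratio, and the allele frequency of 0.225 is derived from a 40% carrier frequency under Hardy-Weinberg proportions.

```python
import math


def locus_sibling_risk_ratio(p, gamma):
    """Sibling recurrence risk ratio due to one locus (multiplicative model).

    p     -- frequency of the risk allele
    gamma -- per-allele relative risk (approximated here by the OR)
    """
    q = 1.0 - p
    w = p * q * (gamma - 1.0) ** 2 / (p * gamma + q) ** 2
    return (1.0 + w / 2.0) ** 2


def fraction_explained(loci, overall_sibling_risk):
    """Share of the log sibling risk accounted for by the listed loci.

    loci -- iterable of (risk allele frequency, per-allele relative risk)
    """
    log_sum = sum(math.log(locus_sibling_risk_ratio(p, g)) for p, g in loci)
    return log_sum / math.log(overall_sibling_risk)


# Assumed inputs for HLA-DR15 alone (see the lead-in for their provenance).
hla_dr15 = [(0.225, 3.08)]
print(fraction_explained(hla_dr15, 16.8))

# A hypothetical common non-HLA variant with OR 1.2 adds very little.
print(locus_sibling_risk_ratio(0.3, 1.2))
```

Under these assumptions HLA-DR15 alone accounts for roughly a tenth of the sibling risk on the log scale, whereas a single common variant with an OR near 1.2 yields a sibling risk ratio barely above 1, which is why dozens of such loci still leave most of the heritability unexplained.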
Where is the missing heritability in MS?
Several theories, as yet unproven, may explain the missing heritability:
Rare variants. GWAS are constrained by their design to the study of common variants (minor allele frequencies (MAFs) ≥5%); it is hard to detect rare variants (MAF <5%) by standard genotyping arrays.
Common variants. There are probably many more undiscovered common variants with even smaller effects (1<OR<1.1) waiting to be found by larger and larger GWAS. However, one recent modelling exercise in Crohn's disease suggested that even for an infinitely sized GWAS, common variants explained less than 50% of heritability.29
Epigenetics. Epigenetic modifications may regulate the expression or activity of genes; the effect of a gene may therefore change as its expression is increased or decreased.
Gene–gene interactions. If two or more genes govern a single phenotype, the genes will affect each other's expression and therefore change their effect.
Pathway involvement, where strongly associated variants cluster with weak ones within a known signalling pathway; this may increase the overall effect of the pathway and enhance the genetic susceptibility to disease.
Gene–environment interaction, where the environment increases or decreases the effect of a gene in determining disease onset. For example, the genetic defect underlying phenylketonuria, although completely penetrant when there are two copies of the abnormal gene, is phenotypically silent unless the environmental factor—dietary phenylalanine—is introduced.
Structural variants refer to regions of DNA from around 1 kb to several Mb, and include inversions and balanced translocations or genomic imbalances (insertions and deletions), commonly referred to as ‘copy number variants’. Such variation has been largely unexplored in relation to complex traits.30
Considering each of these issues in turn:
Rare variants: Recently, rare variants have received increased attention in an attempt to explain the ‘missing heritability’ fraction for many complex disorders, including MS. Rare variants with lower MAF (<5%) have not been represented on many of the microarrays used for major genetic studies until very recently. Rare variants are often difficult to detect in case–control study designs unless the samples are from an isolated population or have some other enrichment factor allowing detectable numbers of copies for the minor alleles. Rare variants are likely to have larger effect sizes (within the pedigrees or populations in which they segregate) and could contribute significantly to missing heritability. These variants are also likely to have more obvious functional consequences.31 There are rapidly accumulating data to show that rare variants have a large cumulative effect on normal phenotypic variation and are extremely important in disease.32 Recently, studies using exome sequencing in a set of multiplex multiple sclerosis families found several rare functional variants in CYP27B1 underlying the known MS locus on chromosome 12;33 this was the first description of a rare variant in MS. Interestingly, CYP27B1 encodes the protein that converts 25-hydroxy-vitamin D3 to the active hormone 1,25-dihydroxy-vitamin D3. The rare functional variants in CYP27B1 are associated with the development of vitamin D-dependent rickets. The authors noted that in all proven genetic cases of vitamin D-dependent rickets in Norway, there was a 100% concordance with MS. We await further evaluation of this finding in other cohorts. This finding also supports the increasing evidence for a role for vitamin D deficiency in the onset of MS.
Epigenetics and MS: ‘Epigenetic’ refers to the modification of gene expression or biological phenotypes resulting from changes to the genomic structure that do not alter the underlying genetic code.34 The major epigenetic processes in mammalian cells are methylation and modification of histones. DNA methylation may be responsible for the stable maintenance of a particular gene expression pattern through mitotic cell division.35 DNA methylation is generally associated with reduced or suppressed gene expression. Cells acquire epigenetic modifications over time and the methylation signatures persist through cell division. Histone modifications are the post-translational covalent addition of molecules to the histone subunits of the nucleosome.36 Histone modification acts in diverse biological processes, such as gene regulation, DNA repair, chromosome condensation (mitosis) and spermatogenesis (meiosis). Histone modification may interact with DNA methylation to regulate gene expression. Specific patterns of histone modifications associate with the pattern of DNA methylation.37
The role of epigenetics in the aetiology of MS is still unclear. However, several studies have suggested that epigenetic modifications may be important in determining MS susceptibility. For example, older MZ twins show significantly different patterns of genomic distribution of 5-methylcytosine DNA and histone acetylation,38 indicating the effect of the external environment on the epigenome.39 Also, MS shows maternal parent-of-origin effects associated with DNA methylation40 and these influence the age of MS onset.41 HLA-DR15 status also influences the maternal parent-of-origin effect: thus, MS patients who are HLA-DR15 positive have an increased female-to-male ratio and mothers who are HLA-DR15 positive are more likely to have affected female offspring than those who are HLA-DR15 negative.42 However, this effect was not seen in all studies.43 Recently, a whole-genome sequencing study, assessing DNA methylation and gene expression in three MS-discordant MZ twin pairs, found no consistent differences in DNA sequence, DNA methylation or gene expression in CD4 T cells.44 Researchers have subsequently argued that these data cannot conclusively rule out the possible involvement of somatic changes in twins discordant for MS because of design limitations with the study, particularly the small sample size, the low average depth of coverage of the genome sequencing and the restriction of the analysis to CD4 T cells.45
Gene–gene interactions and MS: For common diseases driven by complex genetic susceptibility, gene–gene interactions appear to be one explanation for the missing heritability.31 Gene–gene interaction refers to the situation where a single characteristic (phenotype) is governed by two or more genes and every gene affects the expression of the other genes involved. Several studies have examined the role of gene–gene interaction in MS, particularly HLA gene–gene interactions. For example, in a Swedish population, the combination of HLA-DR15 carriage and absence of HLA-A*02 alleles increased the risk of MS by 23-fold.46 However, the largest MS GWAS conducted to date found no definitive evidence for gene–gene interaction in MS susceptibility.16
Pathway analysis and MS: In biological systems, genes and their products (proteins) operate within pathways that exert various functional outcomes within the body, such as cell signalling cascades or the metabolism of compounds such as vitamins or amino acids. Pathway analysis involves investigating genes only within a particular pathway and thereby provides specific information about the genetic pathway being affected in disease conditions—yielding biological insights into the disease process. However, most common diseases are influenced by the combined effects of many loci and/or a number of rare variants that interact with environmental factors in multiple complicated ways.47
Pathway-based analytical approaches allow the genetic variants associated with disease to be considered within the context of a known biological pathway. Evidence of enrichment of susceptibility variants within a pathway may offer more insight into the disease process. If disease-associated variants cluster within a known signalling pathway, then disruption or alteration of that pathway may be involved in the disease process. For example, we considered SNPs shown by GWAS to be associated with MS19 and found evidence of enrichment of disease susceptibility SNPs within a module of the vitamin D metabolism and signalling pathway (figure 3). This pathway includes the confirmed MS-associated genes CYP27B1 and IL-12.16 While some of the remaining SNPs are only weakly associated with MS (we used the top 25% of the GWAS sample), the additional evidence from the pathway analysis, and the postulated involvement of vitamin D in the disease process, strengthens their appeal for further consideration.
To date, there have been few published pathway analyses in MS. One pathway-orientated MS study that took into account all SNPs with nominal evidence of association (p<0.05), rather than just those SNPs that exceed the genome-wide significance threshold,48 found three immunological pathways that were over-represented in MS. These include cell adhesion pathways, communication and signalling pathways, and one neural pathway, namely axon-guidance and synaptic potentiation. Recently, by analysing the T helper cell differentiation pathway, the IMSGC/WTCCC2 GWAS found over-representation of genes involved in T-cell maturation, including 12 genes coding for the cytokine pathway, 6 genes coding for costimulatory molecules and 7 genes coding for signal transduction molecules, all of which are immunologically relevant. These findings indicate that T-cell signalling pathways may be important in the pathogenesis of MS.16
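The idea of pathway over-representation described above can be sketched with a one-sided hypergeometric test, a standard way to ask whether a pathway contains more associated genes than chance would predict. This is an illustration of the general technique, not the method used in the cited studies, and every number in the usage example is hypothetical. The test treats genes as independent draws and ignores gene size and linkage disequilibrium, both of which matter in real analyses.

```python
from math import comb


def enrichment_p_value(n_total, n_pathway, n_selected, n_overlap):
    """One-sided hypergeometric p-value for pathway over-representation.

    n_total    -- genes in the background set
    n_pathway  -- background genes that belong to the pathway
    n_selected -- genes flagged as disease associated
    n_overlap  -- flagged genes that fall in the pathway

    Returns P(overlap >= n_overlap) if the flagged genes were drawn at
    random, without replacement, from the background.
    """
    upper = min(n_pathway, n_selected)
    tail = sum(
        comb(n_pathway, k) * comb(n_total - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / comb(n_total, n_selected)


# Hypothetical numbers: 20,000 genes, a 50-gene pathway, 200 flagged genes,
# 5 of which sit in the pathway (about 0.5 would be expected by chance).
print(enrichment_p_value(20_000, 50, 200, 5))
```

A small p-value suggests that associated genes cluster in the pathway more than expected, although the result depends heavily on how the list of flagged genes and the background set are defined.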
Gene–environment interaction and MS: The known environmental risk factors for MS include high Epstein-Barr virus (EBV) IgG antibody titres, low vitamin D level and/or ultraviolet radiation exposure and cigarette smoking. Interactions between genes and environmental factors may explain at least part of the missing heritability: a particular genetic variant in the right environmental situation may exert a significantly greater effect on MS causation than if the environmental exposure were not present. So far, attention has largely focused on the major genetic risk factor HLA-DR15. Thus, the effect of an environmental exposure on MS might depend on whether a person is HLA-DR15 positive or HLA-DR15 negative; conversely, the effect of HLA-DR15 might depend on the level of the environmental factor.
Since HLA-DR15 may be important in determining the CD4+ Th-mediated immune response to EBV infections,49 it is possible that EBV and HLA-DR15 interact with each other. Indeed, several HLA alleles recognise EBNA-1 epitopes.50 Several epidemiological MS studies have looked at the interaction between HLA-DR15 and EBV (reviewed in ref.51), but so far without any clear association.52
Several genetic vitamin D pathway variants (eg, vitamin D receptor (VDR) gene,53 CYP24A1 gene16 and CYP27B1 gene19) appear to increase MS risk, but few studies have examined interactions with 25(OH)D levels, reported sun exposure or vitamin D intake. Regarding the VDR gene, the Tasmanian MS case–control study showed that the MS risk associated with low winter sun exposure in childhood was carried only by people with a copy of G at the Cdx-2 polymorphism (rs11574010) in the VDR gene,54 while in the US Nurses Cohorts study, the effect of low vitamin D intake was seen only in those with the ‘ff’ genotype of the FokI polymorphism (rs10735810) within the VDR gene. There was no significant interaction between latitude of residence and FokI genotype.55 The SNPs for FokI and Cdx-2 are not in linkage disequilibrium with one another. There were no interactions between low vitamin D intake and CYP27B1, CYP24A1, CYP2R1, DBP or HLA-DR15.55
There is some interesting functional evidence of an interaction between HLA and vitamin D. Recently, a vitamin D response element (VDRE) was found in the HLA-DRB1 promoter region.56 The VDRE was highly conserved in the major MS-associated haplotype HLA-DR15, but not conserved among non-MS associated haplotypes.56 This study also showed that the VDRE responded to 1,25(OH)2D3 and influenced gene expression in B-cells that were transiently transfected with the HLA-DR15 promoter.56 These functional studies support the case that higher vitamin D levels at critical times—specifically in HLA-DR15 positive individuals—might reduce the risk of MS onset. This concept may be in line with the finding that people with MS born in April were more often HLA-DR15 positive whereas those born in November were less often positive for this gene.57
Many studies have found a positive association between smoking and MS, including a relationship with dose or duration of smoking. Meanwhile, the HLA-A*02 allele may have a protective role for MS.46,58 A Swedish study found that among those people with both HLA-DR15 and HLA-A*02 alleles, smokers were at significantly increased risk of MS (OR 13.5) compared to non-smokers with neither of these genetic risk factors.59 The OR for smokers without genetic risk was 1.4, compared with 4.9 for non-smokers with both genetic risk factors.59 They also observed a significant interaction between the absence of HLA-A*02 and smoking among those carrying HLA-DR15 alleles.59 However, a pooled study including three MS case–control studies showed no interaction between HLA-DR15 and smoking,52 suggesting the need for further investigation to understand whether there is a true interaction and what potential mechanisms might underlie it.
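The three ORs quoted above (13.5 for the combined exposure, 1.4 for smoking alone and 4.9 for genetic risk alone) are enough to show what an interaction on the additive scale looks like. The sketch below computes three standard epidemiological measures: the relative excess risk due to interaction (RERI), the attributable proportion due to interaction (AP) and the synergy index (S). Whether the cited study reported these particular measures is not stated here, and using ORs in place of relative risks is an approximation. The function name is illustrative.

```python
def additive_interaction(or_both, or_exposure_only, or_gene_only):
    """Return (RERI, AP, S) for two risk factors, using ORs as relative risks.

    With no interaction on the additive scale, RERI and AP are 0 and S is 1.
    """
    # Excess risk of the joint exposure beyond the sum of the separate excesses.
    reri = or_both - or_exposure_only - or_gene_only + 1.0
    # Share of the joint-exposure risk that is attributable to the interaction.
    ap = reri / or_both
    # Ratio of the joint excess risk to the sum of the separate excess risks.
    synergy = (or_both - 1.0) / ((or_exposure_only - 1.0) + (or_gene_only - 1.0))
    return reri, ap, synergy


# ORs as quoted in the text: both factors, smoking only, genetic risk only.
reri, ap, synergy = additive_interaction(13.5, 1.4, 4.9)
print(round(reri, 2), round(ap, 2), round(synergy, 2))
```

Here the joint OR is far larger than the separate effects would predict if they simply added, which is the pattern that a positive RERI and a synergy index above 1 are designed to capture.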
Having younger siblings may be a marker of repeated exposure to common early life microbial infections. We found that low infant sibling exposure was associated with an increased risk of MS.60 Moreover, we found an interaction between HLA-DR15 and low infant sibling exposure:61 the combined effect of HLA-DR15 positivity and low infant sibling exposure on MS was nearly fourfold higher than expected, based on the effects of HLA-DR15 positivity and low infant sibling exposure alone.
Therefore there is now strong evidence to suggest that gene–environment interactions play a significant role in MS onset and may explain some of the missing heritability of MS. Also, the investigations of gene–environment interactions have increased our understanding of how known environmental factors may influence gene expression and function, and have linked known environmental and genetic factors together to provide biologically plausible mechanisms for how components of MS causation may occur.
Implications for MS patients and their families
For people with MS, progressive disability substantially affects life expectancy and impairs quality of life. The median life expectancy for MS patients is 5–10 years shorter than for the age-matched general population.62 People with MS and their families are therefore very interested in anything that potentially may improve outcomes, may allow better and more accurate diagnosis of MS and may provide accurate information about the risk of MS among relatives, especially children and siblings. Many people with MS, on being diagnosed, are alarmed to hear that there is a familial risk; they are keen to know whether relatives can be tested for MS risk and what can be done to reduce that risk. The revolution in our understanding of MS genetics in the last 5 years provides some of these answers.
Despite the dramatic increase in our knowledge of MS genetics, we have a long way to go. What is exciting is the rate at which we are learning. As noted in figure 1, until 2007 there was only one known MS genetic association and now we have information on over 60 loci. As genetic techniques become cheaper and more widely applicable, particularly whole-genome sequencing techniques, we will find more associations. These may well be rare variants with significant effects, such as the CYP27B1 variants; they may be present only in a small fraction of people with MS and they may even be personal family mutations. However, each new causative locus is another chink in the armour of a previously impenetrable disease. Finding genetic loci that are associated opens a whole field of human-based research that allows the development of new treatments, allows us to understand how the genes and the environment interact and provides information that facilitates epigenetic approaches to understanding the cause and progression of MS.
We cannot yet offer specific genetic testing to people with MS or their families. However, the finding of further rare variants should soon enable us to tailor treatments to families by their genetic status. The genetic revolution has been a great step forward in MS research as it allows us to look at molecular mechanisms of this complicated disease in humans, rather than only in animals; this gives a far better chance of unravelling the cause of MS and developing preventative and treatment strategies (box 2).
Key clinical points from this review
No genetic test absolutely defines a person's risk of developing multiple sclerosis (MS). The most strongly associated genetic risk locus, in the human leucocyte antigen (HLA) region, increases the risk of MS threefold. It is present in 70% of people with MS and 40% of people without MS. Thus, for individuals, it provides little information.
The other 62 described MS loci together contribute only a fraction of the risk. All are common alleles with the frequency in MS only slightly different from the normal population. Thus, for individuals, they are not useful markers.
The best test of increased risk is a family history: the increased risk for relatives is relatively well understood (figure 4). Note that all relative risks depend on the background population risk. It is therefore essential to understand the population prevalence for a person's region of residence and ethnic group, estimated from available latitudinal data.63
The recent identification of the first MS rare variant in the CYP27B1 gene may be the first instance where genotype knowledge may influence MS disease outcomes. In multiplex families testing for single nucleotide polymorphisms in the CYP27B1 gene may be useful but cannot yet be advocated.
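The first key point, that the main HLA risk factor says little about an individual, follows from Bayes' theorem. The sketch below takes the two carrier frequencies quoted in box 2 (70% of people with MS, 40% of people without) and combines them with a background prevalence. The prevalence of 1 in 1000 is an assumed round figure for illustration only, since the true value depends on region and ethnic group, and the function names are illustrative.

```python
def risk_given_marker(prevalence, freq_in_cases, freq_in_controls):
    """Probability of disease in a person who carries the marker."""
    carriers_with_disease = freq_in_cases * prevalence
    carriers_without_disease = freq_in_controls * (1.0 - prevalence)
    return carriers_with_disease / (carriers_with_disease + carriers_without_disease)


def risk_without_marker(prevalence, freq_in_cases, freq_in_controls):
    """Probability of disease in a person who does not carry the marker."""
    non_carriers_with_disease = (1.0 - freq_in_cases) * prevalence
    non_carriers_without_disease = (1.0 - freq_in_controls) * (1.0 - prevalence)
    return non_carriers_with_disease / (
        non_carriers_with_disease + non_carriers_without_disease
    )


# Assumed background prevalence of 1 in 1000; carrier frequencies from box 2.
prevalence = 0.001
print(risk_given_marker(prevalence, 0.70, 0.40))
print(risk_without_marker(prevalence, 0.70, 0.40))
```

Under this assumption a carrier's absolute risk remains well below 1%, and a non-carrier's risk is only modestly lower than the background, so a positive or negative result would change very little for the person tested.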
Recent advances in genetic studies have confirmed that MS is a complex trait with genetic effects. Many environmental factors, and interactions between genes and environment, contribute to the development of MS. However, current knowledge of MS genetics is still insufficient and not yet clinically relevant. We cannot yet produce a predictive model of risk based on the genetic profile of an individual. However, gene identification efforts have considerably modified our understanding of the biological mechanisms underpinning the risk of and progression of MS. Promising approaches such as whole-exome and ultimately whole-genome sequencing have the potential to provide an exhaustive map of MS genes in the near future. We believe that the genetic information of MS can be meaningful, not only for scientists and clinicians, but also for MS patients and their families. In the long term, we are confident that progress in genetics research will lead to useful diagnostic and predictive tests and new treatments for MS.
Contributors All authors contributed equally to the development of the review. RL compiled the manuscript with direct assistance from all authors.
Competing interest None.
Provenance and peer review Commissioned. Externally peer reviewed. This manuscript was reviewed by Neil Robertson, Cardiff, UK.