Abstract
Whole-genome sequencing (WGS) has recently become the first-line genetic investigation for many suspected genetic neurological disorders. While its diagnostic capabilities are innumerable, as with any test, it has its limitations. Clinicians should be aware of where WGS is extremely reliable (detecting single-nucleotide variants), where its reliability is much improved (detecting copy number variants and small repeat expansions) and where it may miss/misinterpret a variant (large repeat expansions, balanced structural variants or low heteroplasmy mitochondrial DNA variants). Bioinformatic technology and virtual gene panels are constantly evolving, and it is important to know what genes and what types of variant are being tested; the current National Health Service Genomic Medicine Service WGS offers more than early iterations of the 100 000 Genomes Project analysis. Close communication between clinician and laboratory, ideally through a multidisciplinary team meeting, is encouraged where there is diagnostic uncertainty.
- GENETICS
- NEUROPATHY
- NEUROGENETICS
- HMSN (CHARCOT-MARIE-TOOTH)
Data availability statement
Data are available on reasonable request.
Introduction
Whole-genome sequencing (WGS) and the use of genomic testing in neurology, including consent, indications and results, have recently been expertly reviewed in Practical Neurology.1 2 The success of the Genomics England 100 000 Genomes Project (100KGP), sequencing patients with cancer and rare diseases, has led to the introduction of WGS with virtual panels into routine clinical practice for many neurological diseases via the UK National Health Service Genomic Medicine Service (NHS-GMS, https://www.england.nhs.uk/genomics/nhs-genomic-med-service/). The theoretical benefits of WGS are clear; sequencing the entire genome (many orders of magnitude more DNA than previous routine testing, at comparable costs) wherein the molecular diagnosis should lie, provided the clinical diagnosis of a genetic disorder is correct. However, as with every new technology, WGS has its limitations. This article aims to outline the diagnostic utility of WGS, but also to note where caution is needed. The decision to request WGS is a critical step in any patient’s diagnostic journey. Where appropriate, especially in sporadic cases, acquired diseases should be excluded first. Figure 1 highlights key points to consider before requesting WGS.
Is Charcot-Marie-Tooth disease a good disease prototype for understanding WGS?
Charcot-Marie-Tooth (CMT) disease is an umbrella term for the inherited neuropathies, a clinically and genetically heterogeneous group of diseases. The clinical subtypes of CMT include demyelinating sensory and motor neuropathy (CMT1), axonal sensory and motor neuropathy (CMT2), sensory and motor neuropathy with intermediate conduction slowing (upper limb motor conduction velocity between 25 and 45 m/s, CMTi), hereditary sensory neuropathy (HSN) and hereditary motor neuropathy (HMN).3 4
The diagnostic utility of WGS for an individual lies in its ability to detect vast numbers, and in theory different types, of genetic variant. Figure 2 illustrates the features of a disease group that make it suitable for considering and understanding WGS testing.
One downside of CMT as a disease prototype is that functional validation of novel variants/genes is challenging but this underpins how important WGS is in CMT clinical practice. Gold-standard functional evidence would be ex vivo human diseased tissue demonstrating absent, deficient or dysfunctional protein contributing to pathology. This is possible in theory with peripheral nerve biopsies, but this is an invasive procedure requiring technical expertise. Alternatively, RNA sequencing can be used to demonstrate aberrant transcripts in appropriate tissues; Schwann cells are clearly easier to study than dorsal root ganglia or anterior horn cells. Overall, we feel CMT is an excellent disease to demonstrate the lessons and pitfalls of WGS and will explore these herein.
WGS technologies
A basic understanding of the molecular techniques involved in WGS is important to appreciate its potential pitfalls. First, WGS when used in common medical parlance, refers to ‘short-read’ WGS (srWGS). Table 1 highlights some useful terminology. There are other forms of genomic sequencing, and although used currently mostly in the research setting, their use is increasing in diagnostic genetic laboratories worldwide. Long-read WGS (lrWGS), as suggested in the description, continuously sequences long molecules of DNA, typically tens of kbp in length, but up to many hundreds of kbp depending on the sequencing technology used. The major benefit of lrWGS is the ability to detect and size repeat expansions accurately, and to detect complex, balanced structural variants. The drawbacks include the cost and longer sequencing time, and its error rate on an individual nucleotide level which, when combined with low read depth, affects its ability reliably to detect single-nucleotide variants (SNVs) or insertion-deletions (indels).5 Optical genome mapping is another form of genomic interrogation, and more appropriately termed ‘genome imaging’. Its uses have been compared with those previously investigated with karyotyping (ie, large structural variants) but with the benefit of up to 20 000-fold higher resolution. DNA molecules are enzymatically labelled, and the resultant ligated DNA then ‘imaged’ for its pattern of periodically spaced fluorescent signals. Its ability to detect large structural variants (0.5–1 Mbp) is superior to srWGS and lrWGS, and it is less costly to get higher coverage. As with srWGS and lrWGS it cannot detect aneuploidy (an abnormal number of chromosomes), although this is less relevant in the setting of non-developmental disorders. 
Another potential drawback of optical genome mapping is the requirement for DNA extraction from a fresh blood sample.6 As neither lrWGS nor optical genome mapping are used in standard NHS testing, from this point forward we will not discuss them further, and we will refer to srWGS simply as WGS.
Next-generation sequencing technology
WGS uses next-generation sequencing (NGS) technology, also known as high-throughput or massively parallel sequencing. NGS has been used for many years in clinical diagnostic laboratories for the sequencing of disease-specific gene panels and whole exome sequencing. There are several sequencing platforms,7 but the dominant provider worldwide is Illumina, which is also used by NHS-GMS and whose process is described hereafter. A flow diagram of the process involved is shown in figure 3.8
The first step is library preparation (figure 3A); the genomic DNA library is a series of short fragments ready for sequencing. The DNA (typically extracted from leukocytes in blood; purple EDTA tube) is fragmented and then each fragment amplified. Fragments are then sequenced in a process called ‘sequencing by synthesis’, whereby fluorescently tagged nucleotides are added to a linear single strand of DNA complementary to the fragment; the resultant fluorescent DNA strand is known as a ‘read’ and can be sequenced by its characteristic spectral emission (one wavelength for each of the four nucleotides, figure 3B). The fragment is sequenced from both ends forming ‘paired-end reads’, allowing additional information to be gleaned when the reads are aligned. Data are then fed into the bioinformatic pipeline (figure 3C). The millions of reads are aligned to the reference genome, which when visually represented, form piles of overlapping reads. The overall coverage of the WGS describes what proportion of the reference genome is sequenced to a satisfactory read depth. Figure 4A shows in detail how an unmutated fragment is sequenced and aligned to the reference.
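The distinction between read depth (how many reads overlap a position) and coverage (what proportion of the reference reaches a satisfactory depth) can be sketched as follows. This is a minimal illustration only: the per-base depths are invented, and the 15x threshold for 'satisfactory' is an assumption rather than a value used by any particular laboratory.

```python
def coverage_summary(depths, min_depth=15):
    """Return (mean read depth, fraction of positions with depth >= min_depth)."""
    if not depths:
        raise ValueError("depths must not be empty")
    mean_depth = sum(depths) / len(depths)
    breadth = sum(1 for d in depths if d >= min_depth) / len(depths)
    return mean_depth, breadth


# Invented read depths at ten consecutive reference positions.
toy_depths = [30, 32, 31, 29, 0, 0, 12, 28, 30, 33]

mean_depth, breadth = coverage_summary(toy_depths, min_depth=15)
print(f"mean depth {mean_depth:.1f}x; {breadth:.0%} of positions at >=15x")
```

A region can therefore have a reasonable mean depth while still containing positions that were effectively not sequenced.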
Variant calling is the process of identifying variants, that is, variation in an individual’s genome when compared with the reference. The basic output of a WGS bioinformatic pipeline is the identification of small variants; alteration/insertion/deletion of single nucleotides (SNVs, figure 4B) or a small number of consecutive nucleotides (indels). The universal final output for the millions of variants generated is a .vcf file. Other types of genetic variant can also be detected including structural variants (both copy number variants and balanced rearrangements; the latter where there is no change in dose at a particular locus), repeat expansions and mitochondrial DNA (mtDNA) variants, but their detection and calling is variable (figure 4 and see ‘When WGS might not be the correct test’).
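To make the .vcf output more concrete, the sketch below reads the fixed, tab-separated columns of a VCF data line and labels each small variant by comparing the lengths of the reference and alternate alleles. The two example lines use invented coordinates, and the function names are hypothetical; symbolic alleles used for structural variants (for example, <DEL>) are deliberately not handled.

```python
def parse_vcf_line(line):
    """Parse the first seven fixed VCF columns of one data line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, _id, ref, alt, _qual, flt = fields[:7]
    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alts": alt.split(","),  # several alternate alleles may share a line
        "filter": flt,
    }


def small_variant_type(ref, alt):
    """Label a small variant from the lengths of its REF and ALT alleles."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(alt) > len(ref):
        return "insertion"
    if len(alt) < len(ref):
        return "deletion"
    return "multi-nucleotide substitution"


# Invented example lines: one substitution and one single-base deletion.
toy_lines = [
    "chr1\t12345\t.\tC\tT\t60\tPASS\t.",
    "chr1\t22345\t.\tAG\tA\t55\tPASS\t.",
]

for line in toy_lines:
    record = parse_vcf_line(line)
    for alt in record["alts"]:
        print(record["chrom"], record["pos"], small_variant_type(record["ref"], alt))
```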
After variant calling, the variants are filtered according to specified criteria (see ‘Filtering and prioritisation’). Application of a virtual panel(s) may yield possible candidate variants, which are interpreted by clinical scientists (figure 3D). If there is ambiguity or uncertainty, results are ideally discussed at a multidisciplinary team (MDT) meeting, following which a genetic report can be issued.
Virtual panels
Although WGS in theory allows analysis of variants from an individual’s entire genome, this is neither desirable (incidental unwanted findings) nor practical (a human genome contains approximately five million SNVs), therefore, virtual panels are essential to refine the search. In the NHS, clinicians are required to select virtual gene panel(s) when requesting WGS. The NHS-GMS PanelApp (https://nhsgms-panelapp.genomicsengland.co.uk/panels) is a publicly available resource that uses genetic expertise through crowdsourcing to curate disease-specific gene panels.9 For a gene to be included it needs to be approved as ‘green’ by a number of verified experts; a green gene is broadly one in which plausible disease-causing variants have been found in three or more unrelated individuals/families. However, the panels can only be as correct and up to date as their reviewers and the current available evidence. For example, SORD was discovered as a common, and potentially treatable, cause of CMT in 202010 but was not approved as a green gene until November 2022. Panels are periodically updated, and previous iterations can be found on PanelApp. Genes that cause a complex phenotype which include the disease group of interest, for example, ABHD12 causing polyneuropathy, hearing loss, ataxia, retinitis pigmentosa and cataract (PHARC) syndrome, are often not included if the panel specifies an isolated phenotype; it is not a green gene on the current ‘Hereditary Neuropathy or pain disorder’ panel (R78 V.3.24). Similarly, novel or rare genes may not meet green inclusion criteria. It is, therefore, important to understand which genes are tested in a specific panel, and if there is a particular gene of interest in a clinical case, this should be discussed with the genetic laboratory. It is currently recommended that broad rather than narrow use of panels is applied to maximise chances of identifying causative variants.
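The effect of a virtual panel can be sketched as a simple filter on gene name. The panel below is a toy set chosen for illustration and is not a copy of any PanelApp version; the variant records are likewise invented. The point is only that a variant in a gene outside the panel is never presented for interpretation, however relevant it may be.

```python
def apply_virtual_panel(variants, green_genes):
    """Split variants into those inside and those outside the panel."""
    kept = [v for v in variants if v["gene"] in green_genes]
    excluded = [v for v in variants if v["gene"] not in green_genes]
    return kept, excluded


# Toy panel (assumption): not the content of any real PanelApp panel.
toy_panel = {"PMP22", "MPZ", "SORD"}

# Invented variant records, identified here only by gene.
toy_variants = [
    {"gene": "SORD", "id": "variant_1"},
    {"gene": "ABHD12", "id": "variant_2"},
    {"gene": "ITPR3", "id": "variant_3"},
]

kept, excluded = apply_virtual_panel(toy_variants, toy_panel)
print("interpreted:", [v["gene"] for v in kept])
print("never reviewed:", [v["gene"] for v in excluded])
```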
Filtering and prioritisation
Refining the vast number of variants detected through WGS requires filtering strategies. The two most powerful tools are the allele frequency of the variant in reference databases (the most commonly used is gnomAD; https://gnomad.broadinstitute.org/, box 1) and in family studies, the inheritance pattern, as defined by relative disease status.
Box 1: GnomAD
The Genome Aggregation Database (gnomAD, pronounced nōˌmad) is the most widely used population database of genomic variation. Launched in 2014 as the Exome Aggregation Consortium (ExAC), it is now in its fourth iteration (gnomAD V.4, released in November 2023). The open-access online database contains genomic data from around 730 000 exomes and 76 000 genomes (up to 1.6 million alleles), derived from more than 100 studies in more than 25 countries. The major output is variant frequency data, that is, how many times has a particular variant been observed in this dataset—‘the population’? The genomic data are broadly derived from a mixture of case–control studies, and large biobanks, including more than 400 000 individuals from the UK Biobank; this is not a healthy control database and will contain affected individuals, with a frequency probably no higher than the disease prevalence.
Population allele frequency
Historically, the upper limit for the population allele frequency was set at <1 in 100 for autosomal recessive and <1 in 1000 for autosomal dominant disease; however, we know that for most rare diseases these thresholds are far too high. A useful online calculator for the estimation of a disease-specific population allele count and frequency is found at https://cardiodb.org/allelefrequencyapp/. It is important to remember that if a variant seemingly occurs at too high an allele frequency, it will be filtered by the bioinformatic pipeline, and not considered for interpretation. The most common variant c.757delG in SORD-related CMT is present in a highly homologous non-functioning pseudogene SORD2P in 95% of controls; the two variants can be challenging to delineate bioinformatically and therefore the SORD variant is potentially inappropriately filtered.10 This problem with this particular variant has been overcome but was a barrier to its discovery.
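Why the historical thresholds are too high can be seen with a simplified calculation for a dominant disease. The sketch below assumes that the maximum credible allele frequency is approximately prevalence × the largest proportion of cases attributable to one variant ÷ (2 × penetrance), the factor of 2 reflecting two alleles per person. It is a simplified approximation in the spirit of such calculators, not a reproduction of the online tool, and all input values are illustrative assumptions.

```python
def max_credible_af_dominant(prevalence, max_allelic_contribution, penetrance):
    """Simplified upper bound on allele frequency for a dominant disease."""
    if not 0 < penetrance <= 1:
        raise ValueError("penetrance must be in (0, 1]")
    return prevalence * max_allelic_contribution / (2 * penetrance)


# Illustrative assumptions, not measured values:
prevalence = 1 / 2500             # disease affects 1 in 2500 people
max_allelic_contribution = 0.1    # no single variant explains >10% of cases
penetrance = 1.0                  # fully penetrant

max_af = max_credible_af_dominant(prevalence, max_allelic_contribution, penetrance)
historical_threshold = 1 / 1000

print(f"maximum credible allele frequency: {max_af:.1e}")
print(f"historical threshold is {historical_threshold / max_af:.0f}-fold higher")
```

Under these assumptions, a variant seen in 1 in 2000 alleles would pass the historical dominant threshold yet be far too common to cause the disease.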
One must also be wary of regional ‘hotspots’ for particular variants. The GNE variant p.Val696Met (previously p.Val727Met) causing the rare recessive hereditary inclusion body myopathy/Nonaka myopathy is exceedingly common in the South Asian population where the majority of the disease is seen.11 The overall quoted allele frequency appears too high for the prevalence of the disease in the UK and may result in the variant being discounted. Only when the regional breakdown is examined, can it be appreciated that the variant is very rare in European populations, in keeping with disease prevalence.
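The effect of a regional hotspot on an overall allele frequency can be illustrated with invented allele counts. In this sketch the counts, population labels and filtering threshold are all hypothetical; they are chosen only to show how a pooled frequency can exceed a threshold while the frequency in one population remains very low.

```python
# Invented (allele count, total alleles) per population; not real database values.
toy_counts = {
    "South Asian": (300, 90_000),
    "European": (2, 1_100_000),
    "Other": (10, 400_000),
}

threshold = 1e-4  # hypothetical filtering threshold

per_population_af = {pop: ac / an for pop, (ac, an) in toy_counts.items()}
total_ac = sum(ac for ac, _ in toy_counts.values())
total_an = sum(an for _, an in toy_counts.values())
overall_af = total_ac / total_an

print(f"overall: {overall_af:.2e} (filtered: {overall_af > threshold})")
for pop, af in per_population_af.items():
    print(f"{pop}: {af:.2e} (filtered: {af > threshold})")
```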
Reference genome
The current human reference genome, denoted GRCh38, originates from the genomes of 20 anonymous volunteers from the USA. It has been shown that two-thirds comprises the genome of a single individual of mixed European and African descent.12 It is widely recognised that the current reference genome has significant limitations; it contains some gaps (~5%), has regions of unreliable coverage (eg, around the centromere), and reflects a very narrow ancestry. The Human Pangenome Reference Consortium has set out to rectify the flaws in the current reference by creating a new reference built from 350 human genomes and has recently published a draft from 47 individuals from diverse backgrounds.13 Until the ‘Pangenome’ comes into routine clinical practice, clinicians must be aware that patients from certain ethnic backgrounds (eg, the Indian subcontinent) may have variants missed because the reference does not reflect their ancestry.
Family studies and relative disease status
Variant segregation through family studies (WGS of more than one family member, with the data subsequently analysed together) enhances diagnostic success.14 At recruitment, participants are assigned as affected, unaffected or unknown. Downstream in the process, if a dominant variant is detected in the affected proband and a reportedly unaffected parent, it will be disregarded or deprioritised. Therefore, caution is needed when the disease has an adult onset or a variable presentation, so that relatives’ disease status is appropriately assigned.
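The consequence of an incorrectly assigned disease status can be sketched with a deliberately simple rule for a dominant model: a variant carried by anyone recorded as unaffected is deprioritised. This is an illustrative assumption about how such a filter might behave, not the logic of any specific pipeline, and the family below is hypothetical.

```python
def dominant_filter(carriers, status):
    """Deprioritise a variant if any carrier is recorded as unaffected."""
    for person in carriers:
        if status.get(person, "unknown") == "unaffected":
            return "deprioritised"
    return "retained"


# Hypothetical trio: the variant is carried by the proband and the mother.
carriers = ["proband", "mother"]

# Status as assigned at recruitment, based on symptoms alone.
as_recruited = {"proband": "affected", "mother": "unaffected", "father": "unaffected"}

# Status after further assessment shows the mother is subclinically affected.
after_review = {**as_recruited, "mother": "affected"}

print("as recruited:", dominant_filter(carriers, as_recruited))
print("after review:", dominant_filter(carriers, after_review))
```

Recording a relative as 'unknown' rather than 'unaffected' when there is doubt would, under this rule, also leave the variant available for review.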
Human phenotype ontology terms
As part of the process of requesting WGS through the NHS-GMS, the clinician is required to include Human Phenotype Ontology (HPO) terms (https://hpo.jax.org/app/, box 2). These phenotypic descriptors can be used to prioritise variants using Exomiser, a programme utilised by NHS-GMS.15 For example, a man with a demyelinating neuropathy and upper motor neurone signs underwent WGS in the 100KGP with the Hereditary Neuropathy virtual panel applied. There were no candidate variants from the panel, but because the HPO terms included ‘demyelinating neuropathy’ and ‘Babinski sign’, a variant in ABCD1, known to cause X-linked myeloneuropathy, was identified. Subsequent discussion at our MDT, and further clinical and laboratory assessments, confirmed this to be the causative gene. This gene is not present in the current Hereditary Neuropathy panel.
Box 2: Human Phenotype Ontology (HPO) terms
The concept of HPO is straightforward; to standardise the description of a clinical phenotype. HPO terms can include symptoms, examination findings, syndromes, investigation results, disease severity and onset. The NHS-GMS whole-genome sequencing (WGS) request form requires inputting of at least one, but ideally several, HPO terms for the patient in question. This can be time consuming and seem unnecessary, but detailed clinical information maximises the chances of WGS finding an answer for the patient. Consider the scenario of a patient deemed by the neurologist to have a unique phenotype of ophthalmoplegia (HP:0000602), gastrointestinal dysmotility (HP:0002579) and demyelinating peripheral neuropathy (HP:0007108). These terms inputted together might be very specific for a particular gene (eg, mitochondrial), and any variant found prioritised for analysis (even if not on the requested panel), and its classification potentially upgraded based on the information provided. Importantly, the term peripheral neuropathy (HP:0009830) provides no meaningful extra information if requesting the Hereditary Neuropathy panel. The absence of a clinical feature can also be recorded and may be relevant, for example, the absence of tremor in a syndrome of Parkinsonism. The clinical assessment by the neurologist can be the most powerful tool for refining genetic variants and detailed and specific HPO terms are a way of quantifying this expertise.
Variant interpretation and reporting
Every candidate variant is classified according to established criteria. UK laboratories use the American College of Medical Genetics and Genomics (ACMG) and Association for Clinical Genomic Service (ACGS) guidelines.16 17 Any given variant, with no supporting data, starts as a ‘variant of uncertain significance’ (VUS). Evidence is combined from different categories (including data on allele frequency, functional studies, segregation and prior literature reports) to upgrade the variant as likely pathogenic or pathogenic, or downgrade to likely benign or benign (figure 5). As with gene panels, variant interpretation relies on the available evidence, and its application, and therefore variant classification may differ between laboratories. Ideally, clinicians will have access to an MDT (with clinical scientists) to discuss WGS results of unsolved cases, cases with unexpected pathogenic variants, or those with a very typical phenotype for a particular gene, in which no variants have been reported. There is a criterion within the ACMG/ACGS criteria (PP4) that uses phenotype specificity to upgrade variants, for example, absence of dystrophin in a muscle biopsy in a male patient with muscular dystrophy phenotype, when considering a variant in DMD. Without the communication of clinical information from clinician to laboratory, the variant might remain a VUS.
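How pieces of evidence combine to move a variant away from VUS can be illustrated with a points-based scheme. The sketch below assumes one proposed points formulation of the guidelines (supporting = 1, moderate = 2, strong = 4, very strong = 8, with benign evidence counted negatively); the guidelines themselves are expressed as combining rules, and laboratories may apply them differently, so this should be read as a simplified illustration rather than the classification method used for any report.

```python
POINTS = {"supporting": 1, "moderate": 2, "strong": 4, "very_strong": 8}


def classify(pathogenic_evidence, benign_evidence=()):
    """Sum evidence points and map the total to a classification band."""
    score = sum(POINTS[s] for s in pathogenic_evidence)
    score -= sum(POINTS[s] for s in benign_evidence)
    if score >= 10:
        return "pathogenic"
    if score >= 6:
        return "likely pathogenic"
    if score >= 0:
        return "variant of uncertain significance"
    if score >= -6:
        return "likely benign"
    return "benign"


# Hypothetical variant: one strong and one supporting piece of evidence.
without_phenotype = ["strong", "supporting"]

# The same variant once a specific phenotype is communicated to the
# laboratory and counted as one further supporting item (assumption).
with_phenotype = without_phenotype + ["supporting"]

print(classify(without_phenotype))
print(classify(with_phenotype))
```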
Historically, relevant VUSs were listed as an addendum to genetic reports. However, NHS-GMS has adopted guidance from the ACGS that VUSs should not be reported unless exceptional circumstances apply, after a discussion at an MDT meeting. This change is critical for practising clinicians to be aware of. The rationale is that reporting a VUS may lead to confusion on the part of referring clinician or patient, misinterpretation and potentially misdiagnosis. Even when a VUS is likely to be causative, family screening for the variant would still need careful discussion and counselling, and preimplantation genetic testing or entry into a clinical drug trial would only be considered in exceptional circumstances.
However, we have experience that transparent reporting of VUSs to clinicians with genetic expertise, has been vital in clinching a genetic diagnosis with the passage of time. A ‘warm’ VUS may be upgraded to pathogenic following, for example, a new publication implicating the gene/variant in disease. Without information about VUSs made available on a genetic report, such cases may remain unsolved.
Another example of the need for careful reporting is the presence of a single pathogenic variant in a recessive gene, a so called ‘single hit’. If reported, it should be made clear that the diagnosis is not confirmed, but a single pathogenic variant has been detected. With a suggestive phenotype, a ‘single hit’ will often trigger a discussion with the laboratory to look on the other allele for deep intronic variants (that might affect splicing or create pseudoexons), or structural variants (ie, deletion of a portion of the gene), or explorative analysis of the genome in a research setting.18
When WGS might not be the correct test
The essential first step for genetic testing is ensuring the right test is sent. Jain et al have previously discussed this in detail.1 In the UK, clinicians must consult the NHS Genomic Test Directory (https://www.england.nhs.uk/publication/national-genomic-test-directories/). Many neurological diseases, including some that are treatable, have their molecular basis in non-SNV genetic variation. Huntington’s disease (CAG trinucleotide repeat expansion in HTT), genetic motor neurone disease/frontotemporal dementia (GGGGCC hexanucleotide repeat expansion in C9orf72), spinal muscular atrophy (biallelic deletion of exon 7±8 in SMN1), fragile X syndrome (CGG trinucleotide repeat expansion in FMR1) and Duchenne muscular dystrophy (~60% caused by exon-level deletions in the X-linked DMD) are all caused by either repeat expansion or structural variants. More than 50% of CMT is caused by a duplication of PMP22, and the remainder by a mixture of genetic variant types.
The limitation of WGS to accurately detect structural variants and repeat expansions lies in the read length. Put simply, it is difficult to quantify a variant with genomic size potentially orders of magnitude larger than the unit of measurement. Figure 4 details the use of paired-end reads in the sequencing and alignment process, and how they can be used to detect non-SNV variants. When the DNA fragment is sequenced from both ends, the two paired-end reads contain markers that identify them as a pair. If, when the reads are aligned to the reference genome, they align too far apart or too close together, this can be bioinformatically detected. Similarly, if a read aligns without a ‘mate’ (the other part of the pair cannot satisfactorily align to the reference), this can also be flagged. This approach for detecting non-SNV variants is shown in figure 4C and is known as a ‘paired-end’ (or ‘read-pair’) approach to detecting structural variants. Similarly, the ‘split-read’ approach uses information that a single read is disrupted, or split, by a structural variant. The read depth or ‘depth of coverage’ approach relies on algorithms detecting regions where there is a significant increase or decrease in coverage (figure 4D). All these computational approaches have their limitations for different structural variants, and the best algorithms combine more than one approach.19 Structural variation on a chromosomal level, for example, aneuploidy or ringed chromosomes, will not be detected by WGS and karyotyping should be requested separately.
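The paired-end and read-depth approaches described above can be sketched in a few lines. The expected fragment size, its tolerance and the depth-ratio cut-offs are illustrative assumptions; real structural variant callers use statistical models and combine several signals.

```python
def classify_read_pair(mate_mapped, insert_size=None, expected=350, tolerance=100):
    """Label a read pair by the distance between its two aligned reads."""
    if not mate_mapped:
        return "unpaired"  # the mate could not be aligned satisfactorily
    if insert_size > expected + tolerance:
        return "too far apart"  # may indicate a deletion in the sample
    if insert_size < expected - tolerance:
        return "too close together"  # may indicate an insertion in the sample
    return "concordant"


def copy_number_hint(region_depth, genome_depth, gain=1.3, loss=0.7):
    """Compare read depth in a region with the genome-wide average."""
    ratio = region_depth / genome_depth
    if ratio >= gain:
        return ratio, "possible gain"
    if ratio <= loss:
        return ratio, "possible loss"
    return ratio, "no change detected"


print(classify_read_pair(True, 360))
print(classify_read_pair(True, 5000))
print(classify_read_pair(False))

# Three copies instead of two gives about 1.5x the depth;
# one copy instead of two gives about 0.5x.
print(copy_number_hint(45, 30))
print(copy_number_hint(15, 30))
```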
Repeat expansions, where the number of repeats is critical to the diagnosis, can be challenging to size through WGS; large repeat expansions will be longer than the read, or read-pair (figure 4Civ). ExpansionHunter is a tool that estimates the repeat size at the loci of known expansions, which when paired with visual inspection, was sensitive and specific for correctly sizing expansions in the 100KGP when the expansion size was less than the read length.20
However, there are three important caveats to the above. First, as with virtual panels, if the gene and specifically the expansion (if that is the diagnostic question) is not on the virtual panel, non-SNVs will not be tested. Second, when the expansion is larger than the read length (as seen in FMR1, C9orf72, DMPK (myotonic dystrophy type 1) and FXN (Friedreich’s ataxia)), although an expansion could be identified, it was often significantly underestimated by ExpansionHunter (figure 4Civ). Although RFC1, the gene recently identified as causing cerebellar ataxia, neuropathy and vestibular areflexia syndrome (CANVAS) through biallelic pentanucleotide repeat expansions, was not examined by Ibañez et al, the same would apply; the expansion is typically >1000 repeats (>5000 nucleotides).21 In NHS laboratories, RFC1 is currently tested using non-WGS methods. Third, early iterations of the 100KGP pipeline did not routinely analyse for any non-SNVs, and many were missed and not reported.
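The arithmetic behind the second caveat is simple: the length of an expansion in nucleotides is the motif length multiplied by the number of repeats, and a single read can only size it directly if the whole expansion fits inside the read. The sketch below assumes a read length of 150 nucleotides, which is typical of short-read sequencing but is not stated in the text; the repeat count used for HTT is an illustrative value.

```python
def expansion_length(motif, repeat_units):
    """Length of a repeat expansion in nucleotides."""
    return len(motif) * repeat_units


def fits_in_read(motif, repeat_units, read_length=150):
    """True if the whole expansion is shorter than one read.

    Flanking sequence is also needed to anchor the read, so in practice
    the usable length is somewhat shorter than this check implies.
    """
    return expansion_length(motif, repeat_units) < read_length


examples = [
    ("HTT", "CAG", 40),       # illustrative repeat count (assumption)
    ("RFC1", "AAGGG", 1000),  # typical expansion size given in the text
]

for gene, motif, units in examples:
    length = expansion_length(motif, units)
    print(f"{gene}: {length} nucleotides, fits in one read: {fits_in_read(motif, units)}")
```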
MtDNA sequencing is currently requested as a separate test to sequencing of the nuclear genome. Studies have shown that with a satisfactory read depth, WGS can detect mtDNA variants at a heteroplasmy level down to 10%,22 but if there is a significant suspicion for mitochondrial disease, mtDNA sequencing should be requested separately. Other types of genetic mechanism including epigenetic factors such as DNA methylation or imprinting will not be detected using WGS and should have separate testing requested. Lastly, in the NHS, if a rapid result is critical to guide management, the R14 ‘Acutely unwell children with a likely monogenic disorder’ WGS can be requested for critically ill children and adults, with a turnaround time of 2–3 weeks.23
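The dependence of mtDNA variant detection on read depth can be illustrated with a binomial calculation: at a given heteroplasmy level, how likely is it that enough variant reads are seen? The requirement of at least five variant reads and the two read depths are illustrative assumptions, not parameters of any laboratory pipeline.

```python
from math import comb


def heteroplasmy_estimate(variant_reads, total_reads):
    """Observed heteroplasmy: fraction of reads carrying the variant."""
    return variant_reads / total_reads


def prob_at_least(k, depth, heteroplasmy):
    """Binomial probability of at least k variant reads at a given depth."""
    below = sum(
        comb(depth, i) * heteroplasmy**i * (1 - heteroplasmy) ** (depth - i)
        for i in range(k)
    )
    return 1.0 - below


min_variant_reads = 5  # assumed minimum for a confident call
p_low_depth = prob_at_least(min_variant_reads, depth=20, heteroplasmy=0.10)
p_high_depth = prob_at_least(min_variant_reads, depth=1000, heteroplasmy=0.10)

print(f"depth 20:   P(>= 5 variant reads) = {p_low_depth:.3f}")
print(f"depth 1000: P(>= 5 variant reads) = {p_high_depth:.3f}")
print(f"100 of 1000 reads: heteroplasmy = {heteroplasmy_estimate(100, 1000):.0%}")
```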
Examples from the clinic
Key to our diagnostic success in the 100KGP was access to the data in the research environment, and regular review of cases at a dedicated clinical-research-genetic MDT. We illustrate with clinical cases practical examples of potential pitfalls discussed above.
Case 1
A woman in her late teens presented with a subacute history of sensory changes in her hands, a few weeks following a viral illness. She developed progressive weakness and wasting of intrinsic hand muscles. At initial assessment she also had mild subclinical distal lower limb weakness (figure 6A–E). There was no family history of neuromuscular illness and parents were non-consanguineous. Initial neurophysiology showed a patchy, widespread, conduction slowing neuropathy. She was treated in her local hospital with intravenous immunoglobulin for presumed chronic inflammatory demyelinating neuropathy. Subsequent CSF examination showed normal constituents; nerve roots were markedly thickened and pathologically enhancing on MRI, and nerve biopsy showed a chronic demyelinating neuropathy without inflammation (figure 6G). She progressed slowly despite treatment; initial genetic testing, including CMT1A with multiplex ligation-dependent probe amplification (MLPA), and a 14-gene panel in 2015, was negative. She was enrolled into the 100KGP with her parents, with no primary findings. Through a research collaboration we identified the variant c.4271C>T p.(Thr1424Met) in ITPR3, a gene only reported in three families and not included in the virtual panel.24 25 Additionally, the variant was maternally inherited (figure 6F). Clinically the mother had no symptoms and a completely normal neurological examination, but neurophysiology showed a clear conduction slowing neuropathy. The diagnosis is CMT, with remarkable variability in severity, due to an ITPR3 variant. This case highlighted the importance of the assigned affected status; segregation was confirmed but only by neurophysiology. Similarly, research access to the 100KGP data was essential to identify a gene not on the virtual panel but in the literature.
Case 2
A man in his late 60s was referred for a diagnostic opinion. He had a progressive sensory and motor neuropathy since his 20s. Neurophysiology was clearly demyelinating with a median nerve motor conduction velocity of 22 m/s. The family history was of autosomal dominant disease. His 100KGP primary findings report was negative. We examined the aligned sequence data and discovered 1.5× the read depth in the region of PMP22 compared with the rest of the genome (figure 6H). MLPA confirmed the 17p12 duplication; the diagnosis was CMT1A. The bioinformatic pipeline did not call this common copy number variant seen in CMT. We have now seen three cases of CMT1A referred for a diagnostic opinion where the chromosome 17 duplication was either missed or not looked for as clinicians were not aware that NGS gene panels and WGS in the 100KGP did not reliably detect the duplication.26 Despite the panel name ‘Hereditary Neuropathy NOT PMP22 copy number’, the current WGS panel does now include the PMP22 duplication, but the first line test in conduction slowing neuropathies should still be ‘R77 Hereditary Neuropathy—PMP22 copy number’ (MLPA).
Case 3
A man in his early 50s had a 4-year history of progressive unsteadiness, particularly in the dark, and reduced sensation in his distal limbs. He had a longstanding cough. Examination identified a sensory ataxia and large and small fibre sensory loss, without weakness. Neurophysiology showed a severe pure sensory axonal neur(on)opathy. Extensive investigations including antibody testing, neuroaxis imaging, positron emission tomography scan, nerve and lip biopsy excluded inflammatory, nutritional and malignant causes. A 56-gene CMT panel, FXN and POLG sequencing and 100KGP testing were negative. We examined the aligned WGS sequence data of RFC1 in the research environment and found a complete drop of read depth within intron 2 (figure 6I). Subsequent repeat-primed PCR confirmed biallelic AAGGG repeat expansions in RFC1, and a diagnosis of CANVAS. This case highlights a missed large intronic repeat expansion, still not reliably called on WGS. Currently, RFC1 testing must be requested separately.
Case 4
A man in his early 40s presented with a 10-year history of progressive walking difficulties due to distal lower limb weakness. There was no family history. Examination showed a length-dependent motor neuropathy; this was confirmed on neurophysiology and there was no slowing or conduction block. Lead and hexosaminidase A levels were normal. Testing for AR expansion, 32-gene CMT2/distal HMN panel and 100KGP were negative. Review in the research environment identified a heterozygous variant in MME (c.202C>T p.(Arg68Ter)), a gene known to cause adult-onset recessive, motor predominant CMT.27 The single variant is classed as pathogenic when in trans with a second pathogenic variant; this was a single hit in a recessive disease. We then examined the aligned sequence data and identified a 9 kbp drop in read depth in MME, consistent with a deletion including exons 15 and 16, predicted to be pathogenic (figure 6J). Both variants were confirmed in the diagnostic laboratory. The diagnosis was distal HMN due to compound heterozygous variants in MME; it was missed because a single recessive variant was not reported and the structural variant was not identified by the analysis pipelines.
Case 5
A man in his late teens was assessed as he transitioned to the adult neuropathy clinic. He had a normal birth but began walking with in-turning feet aged 4. His feet then began to slap as he developed slowly progressive weakness. His father had mild symptoms compatible with CMT. Examination of the proband revealed relatively mild, length-dependent motor deficits (figure 6K–L). His neurophysiology showed a sensory and motor demyelinating neuropathy; a clinical diagnosis of CMT1 was made. A 56-gene CMT panel was negative and the 100KGP had no primary findings. Review of genes not included in the virtual panel used by the 100KGP in the research environment revealed a paternally inherited, previously reported pathogenic variant in the myelin protein gene PMP2, confirming the genetic diagnosis (figure 6M).28 Despite PMP2 being established as a cause of CMT in 2016, the gene was not included in the 100KGP panel.29
Conclusions
The diagnostic opportunities through WGS are clear and are reflected in the introduction of WGS into routine NHS diagnostic testing. However, caution must be taken when reading a ‘negative’ report. WGS has its technical limitations. It very reliably detects SNVs and small indels. Bioinformatic algorithms now confidently detect copy number variants, although this was not always the case, and detecting balanced structural variants and sizing large repeat expansions remain unreliable. Variants are prioritised according to the information provided by the requesting clinician; a detailed phenotypic description and, if applicable, broad use of virtual panels increase the chances of a correct genetic diagnosis. Family studies increase the diagnostic yield but rely on correct assignment of disease status of relatives. If a negative report is received but there is high diagnostic suspicion, we encourage discussion with the genetic laboratory and/or an MDT meeting to consider further focused analysis. Provided the diagnosis of a genetic disorder is correct (and excluding mtDNA disorders), the answer should in theory lie within the WGS data; even so, WGS is not always the correct test to request. Lastly, all the cases in this review were diagnosed through research access to 100KGP data. There will always be unsolved cases and novel causes of neurological disease, and the authors feel strongly that clinical genomic researchers should, where their patient has consented, have access to their data to ensure we continue to increase genetic diagnoses for individuals and their families, and advance the field as a whole. Access to research data is not universal, and if after discussion with the local genetic laboratory there is no diagnosis, clinicians should consider referring to a specialist centre.
Key points
Whole-genome sequencing (WGS) is the first-line test for many, but not all, suspected genetic neurological disorders. Before requesting WGS, clinicians should first ensure relevant initial single genetic tests are negative (eg, PMP22 duplication in Charcot-Marie-Tooth disease).
Gene panels are constantly evolving, and it is important to check which genes and/or types of genetic variant are offered, particularly if there is a specific genetic diagnosis in mind.
Accurate phenotype information, via Human Phenotype Ontology terms, and correct assignment of relatives' affected status are critical to maximise diagnostic yield. Testing of relatives is desirable and sometimes essential.
Discussion with the genetics laboratory, ideally in a multidisciplinary team setting, is recommended for selected unsolved cases and where there are unexpected or uncertain results. Where variants of uncertain significance remain unreported, communication of specific phenotype data may be the key to their reclassification to pathogenic.
Further reading
100 000 Genomes Project Pilot Investigators; Smedley D, Smith KR, et al. 100 000 Genomes Pilot on Rare-Disease Diagnosis in Health Care – Preliminary Report. N Engl J Med. 2021 Nov 11;385(20):1868–1880.
Ibañez K, Polke J, Hagelstrom RT, et al. Whole genome sequencing for the diagnosis of neurological repeat expansion disorders in the UK: a retrospective diagnostic accuracy and prospective clinical validation study. Lancet Neurol. 2022 Mar;21(3):234–245.
Moore AR, Yu J, Pei Y, Cheng EWY, et al. Use of genome sequencing to hunt for cryptic second-hit variants: analysis of 31 cases recruited to the 100 000 Genomes Project. J Med Genet. 2023 Nov 27;60(12):1235–1244.
Pipis M, Rossor AM, Laura M, Reilly MM. Next-generation sequencing in Charcot-Marie-Tooth disease: opportunities and challenges. Nat Rev Neurol. 2019 Nov;15(11):644–656.
Ethics statements
Patient consent for publication
Ethics approval
This study involves human participants and was approved by 09/H0716/61. Participants gave informed consent to participate in the study before taking part.
Acknowledgments
CJR and MMR are grateful to the Medical Research Council (MRC MR/S005021/1) and the National Institute of Neurological Disorders and Stroke and Office of Rare Diseases (U54NS065712, 1U01NS109403-01 and R21TR003034); MMR is also grateful to the Muscular Dystrophy Association (MDA510281) and the Charcot Marie Tooth Association (CMTA) for their support. This research was also supported by the National Institute for Health Research University College London Hospitals Biomedical Research Centre.
This research was made possible through access to data in the National Genomic Research Library, which is managed by Genomics England Limited (a wholly owned company of the Department of Health and Social Care). The National Genomic Research Library holds data provided by patients and collected by the NHS as part of their care and data collected as part of their participation in research. The National Genomic Research Library is funded by the National Institute for Health Research and NHS England. The Wellcome Trust, Cancer Research UK and the Medical Research Council have also funded research infrastructure.
The authors acknowledge the work of the whole clinical-genetic team at the Centre for Neuromuscular Disease, UCL Queen Square Institute of Neurology, London: Dr Julian Blake, Dr Andrea Cortese, Dr Saif Haddad, Dr Matilde Laurá, Dr Menelaos Pipis, Dr Roy Poh, Dr James Polke, Dr Alexander Rossor and Ms Mariola Skorupinska. Particular thanks go to Dr Laurá, who oversees the care of patient 5. The authors would also like to thank Professor Sebastian Brandner, Professor Zane Jaunmuktane and Dr Thomas Millner for providing expert neuropathological diagnostics for case 1. Figures were created using BioRender.com.
References
Footnotes
Contributors CJR analysed the data and wrote the manuscript. MMR conceptualised the study and provided senior critical review and revisions.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Commissioned. Externally peer reviewed by Rhys Thomas, Newcastle-upon-Tyne, UK.