Acknowledgments
An apology is extended to those colleagues whose papers could not be cited because of space considerations.
Human neoplasms display a wide variety of genetic alterations, most of which arise after conception, so-called somatic mutations. The types and patterns of mutation are highly variable and heterogeneous among, and sometimes within, different tumor entities, ranging from ploidy shifts to single-base substitutions and epigenetic alterations. Some of the neoplasia-associated mutations are promiscuous in that they are found in several tumor types, whereas others appear to be disease specific, occasionally even pathognomonic. The latter types of mutation, showing a strong association with a particular tumor phenotype, can be exploited for clinical diagnostic purposes. Indeed, genetic analyses are now routinely used as an adjunct to traditional morphologic and immunohistochemical investigations in the diagnosis of many neoplasms, including soft tissue tumors. It is also well recognized that certain types of mutation have a strong impact on the aggressiveness of the tumor cells, and that the mutation status thus must be assessed to select the correct type and level of treatment. Furthermore, a number of novel treatment strategies have recently been developed, often with a dramatic improvement of patient outcome, based on the identification of specific mutations in tumor cells, such as kinase inhibitors for chronic myeloid leukemia, malignant melanoma, or gastrointestinal stromal tumor. Thus, for many tumors with known biomarkers, genetic characterization, commonly known as molecular pathology, is now an integral part of routine health care that contributes to improved diagnostic precision and treatment stratification and allows evaluation of treatment response.
Similarly, genetic analyses have demonstrated that the clinical and biologic variation among soft tissue tumors is reflected in their genotypes, and it was recently shown that the addition of genetic information significantly improves the diagnostic precision. Still, many soft tissue tumor types, including sarcomas, remain poorly investigated, and the use of genetic data for treatment stratification lags far behind the way molecular genetics forms an integral part of the management of patients with leukemias and carcinomas. Indeed, there is no official requirement to perform genetic analyses as part of soft tissue tumor diagnostics, only vague general recommendations. For example, current European Society for Medical Oncology (ESMO) guidelines suggest that the morphologic and immunohistochemical analyses should be complemented by molecular pathology when the specific histologic diagnosis is doubtful, when the clinicopathologic presentation is unusual, or when the genetic information may have prognostic or predictive relevance. Thus the use of supportive genetic data varies considerably among sarcoma centers, depending on local traditions, technical and economic conditions, and skill of the pathologists.
The cursory knowledge of only a few soft tissue tumors notwithstanding, many genetic features have been shown to be strongly associated with morphologic features, and a rapidly growing subset of mutations promise to shed light on patient outcome. This chapter discusses major molecular pathogenetic features of soft tissue tumors, focusing on those that are clinically relevant, either as diagnostic markers or for treatment stratification.
Organization of the Human Genome: Implications for Molecular Genetic Pathology
The human diploid genome consists of 22 pairs of autosomes and one pair of sex chromosomes, XX in women and XY in men. Each chromosome is a single DNA molecule, to which a variety of proteins are associated, protecting the DNA and regulating its accessibility. The haploid genome (i.e., one copy of each chromosome) comprises approximately 3 × 10⁹ nucleotides. A small fraction (approximately 1%) of the genome can be transcribed into RNA molecules that are subsequently translated into proteins; each such protein-coding gene consists of one or more exons, separated by introns that are spliced out from the messenger RNA (mRNA) molecule that forms the protein. Depending on which exons are included in the final mRNA, a process known as alternative splicing, different isoforms of a protein may be produced. The function of the remaining 99% of the genome is only partly understood. A substantial fraction consists of repetitive sequences, some of which, such as α-satellite DNA around the centromeres, are critical for the maintenance of a correct chromosome number at mitosis, and some of which, such as the repeated telomeric TTAGGG hexamers at each chromosome end, ensure the structural integrity of chromosomes. Other sequences, such as microRNA and long noncoding RNA, are transcribed into RNA molecules that are not translated into proteins but that have important roles in the regulation of the transcription of protein-coding genes. The significance, if any, of most of the noncoding DNA remains unexplored, but it is becoming increasingly apparent that what was previously known as “junk DNA” harbors sequences that are important for the regulation of protein-coding genes.
The organization of the human genome into different chromosomes with distinct sets of genes, of genes into exons that may be combined in multiple ways, and of the “noncoding” DNA into a variety of structures affecting gene transcription and chromosomal integrity opens up a multitude of mechanisms by which the expression, structure, and subcellular localization of proteins could be altered. In addition, the expression or function of proteins may be affected by more or less stable epigenetic changes, such as histone methylation, as well as by posttranslational modifications, such as glycosylation of proteins. Consequently, mutations that are associated with neoplastic transformation range from single nucleotide variants to complex genomic changes affecting entire chromosomes, augmented by epigenetic changes. Although it may be argued that the proper study of the neoplastic phenotype is at the protein level rather than at the DNA or RNA level, techniques for assessing all proteins in one experiment ( proteomics ) are currently not robust enough for clinical purposes, and data on soft tissue tumors are lacking. However, this restriction does not apply to the many immunohistochemical markers that complement or replace genetic markers.
Constitutional Mutations Predisposing to Soft Tissue Tumors
Germline mutations, also known as constitutional mutations, are genetic changes already present at conception (i.e., in the zygote). Several such mutations are known to be associated with an increased risk of developing soft tissue tumors ( Table 4.1 ). Often the phenotypic consequences of these mutations are quite extensive, leading to a recognizable collection of phenotypic effects—a syndrome—that may include malformations and intellectual impairment, as well as an increased risk for various neoplasms. Although inherited cancer predisposition is usually caused by small genetic variants, constitutional chromosomal rearrangements may also occasionally lead to an increased risk for soft tissue tumors.
Disorder | MIM ∗ | Inheritance † | Gene | Locus | Soft Tissue Tumors |
---|---|---|---|---|---|
Bannayan-Riley-Ruvalcaba syndrome | 153480 | AD | PTEN | 10q23 | Hemangioma, lipoma |
Basal cell nevus syndrome | 109400 | AD | PTCH1 PTCH2 SUFU | 9q22 1p34 10q24 | Cardiac fibroma, fetal rhabdomyoma, rhabdomyosarcoma |
Beckwith-Wiedemann syndrome | 130650 | Sporadic/AD | Complex, e.g., CDKN1C, IGF2 | 11p15 | Embryonal rhabdomyosarcoma, fibroma, hamartoma, myxoma |
Carney complex, type 1 | 160980 | AD | PRKAR1A | 17q24 | Cardiac and other myxomas, melanocytic schwannoma |
Carney-Stratakis syndrome | 606864 | AD | SDHB SDHC SDHD | 1p36 1q23 11q23 | Paragangliomas, gastrointestinal stromal tumors |
CLOVE syndrome, somatic | 612918 | Sporadic | PIK3CA | 3q26 | Lipomas, vascular malformations |
Costello syndrome | 218040 | AD | HRAS | 11p15 | Embryonal rhabdomyosarcomas |
Cowden syndrome | 158350 | AD | PTEN | 10q23 | Lipomas, hemangiomas |
Desmoid disease, hereditary | 135290 | AD | APC | 5q21 | Desmoid tumors |
Dicer 1 syndrome | 601200 | AD | DICER1 | 14q32 | Embryonal rhabdomyosarcoma |
Familial adenomatous polyposis | 175100 | AD | APC | 5q21 | Desmoid tumors, Gardner fibroma |
Gastrointestinal stromal tumor, familial | 606764 | AD | KIT PDGFRA SDHB SDHC | 4q12 4q12 1p36 1q23 | Gastrointestinal stromal tumors |
Glomuvenous malformations | 138000 | AD | GLMN | 1p22 | Glomus tumors |
Hemangioma, capillary infantile | 602089 | AD | ANTXR1 KDR | 2p13 4q12 | Hemangiomas |
Hyaline fibromatosis syndrome | 228600 | AR | ANTXR2 | 4q21 | Fibromatosis |
Juvenile myofibromatosis, type 1 | 228550 | AD | PDGFRB | 5q32 | Myofibroblastic tumors |
Juvenile myofibromatosis, type 2 | 615293 | AD | NOTCH3 | 19p13 | Myofibroblastic tumors |
Klippel-Trenaunay-Weber syndrome | 149000 | Sporadic | PIK3CA | 3q26 | Cutaneous hemangiomas |
Hereditary leiomyomatosis and renal cancer | 150800 | AD | FH | 1q42 | Leiomyomas of skin and uterus |
Li-Fraumeni syndromes | 151623 | AD | TP53 CHEK2 | 17p13 22q12 | Various soft tissue sarcomas |
Lynch syndromes | 120435 | AD | MLH1 MSH2 MSH6 PMS2 | 3p22 2p21 2p16 7p22 | Various soft tissue tumors |
Maffucci syndrome | 614569 | Sporadic | IDH1 IDH2 | 2q34 15q26 | Spindle cell hemangiomas, angiosarcomas |
McCune-Albright syndrome | 174800 | Sporadic | GNAS1 | 20q13 | Intramuscular myxomas |
Mismatch repair cancer syndrome | 276300 | AR | MLH1 MSH2 MSH6 PMS2 | 3p22 2p21 2p16 7p22 | Various soft tissue sarcomas |
Mosaic variegated aneuploidy syndrome | 257300 | AR | BUB1B | 15q15 | Embryonal rhabdomyosarcomas |
Multiple endocrine neoplasia type 1 | 131100 | AD | MEN1 | 11q13 | Lipomas |
Neurofibromatosis type 1 | 162200 | AD | NF1 | 17q11 | Neurofibromas, malignant peripheral nerve sheath tumors, gastrointestinal stromal tumors |
Neurofibromatosis type 2 | 101000 | AD | NF2 | 22q12 | Schwannomas |
Nijmegen breakage syndrome | 251260 | AR | NBN | 8q21 | Rhabdomyosarcomas |
Noonan syndrome (includes LEOPARD syndrome, others) | 163950 | AD | Heterogeneous, e.g., PTPN11 KRAS SOS1 NRAS SHOC2 RAF1 | 12q24 12p12 2p22 1p13 10q25 3p25 | Various soft tissue tumors |
Proteus syndrome | 176920 | Sporadic | AKT1 | 14q32 | Lipomas |
Retinoblastoma | 180200 | AD | RB1 | 13q14 | Various soft tissue sarcomas |
Rhabdoid predisposition syndrome | 609322 | AD | SMARCB1 | 22q11 | Rhabdoid tumors |
Rhabdoid predisposition syndrome | 613325 | AD | SMARCA4 | 19p13 | Rhabdoid tumors |
Rubinstein-Taybi syndrome | 180849 | AD | CREBBP | 16p13 | Myogenic sarcomas |
Tuberous sclerosis | 191100, 613254 | AD | TSC1 TSC2 | 9q34 16p13 | Fibromas, cardiac rhabdomyomas, angiomyolipomas |
Venous malformations, multiple cutaneous and mucosal | 600195 | AD | TIE2 | 9p21 | Hemangiomas |
Venous malformations with glomus cells | 138000 | AD | GLMN | 1p22 | Glomus tumors |
Von Hippel–Lindau syndrome | 193300 | AD | VHL | 3p25 | Hemangioblastomas |
Werner syndrome | 277700 | AR | RECQL2 | 8p12 | Various soft tissue sarcomas |
∗ MIM, Entry number in the online database Mendelian Inheritance in Man ( https://www.omim.org/ ).
† AD, Autosomal dominant; AR, autosomal recessive.
For several reasons, it is clinically important to recognize monogenic cancer syndromes. First, the underlying mutation may affect the choice of treatment; for example, more cautious use of radiotherapy is recommended in the management of sarcomas in patients with mutations in the TP53 (Li-Fraumeni syndrome) or RB1 genes. Second, the mutation may be informative as to how the patient should be followed (e.g., with regard to breast cancer development in women with TP53 mutations). However, because most predisposing syndromes are exceedingly rare, and soft tissue tumors may appear anywhere in the body, no clinical consensus has been reached regarding if and how the patients and their relatives should be followed with regard to sarcomas. Third, constitutional mutations are often inherited from one of the parents. Therefore an extended investigation of the family, starting with the parents, is warranted to identify siblings and other relatives who may be carriers of the same mutation and thus should be monitored for early cancer detection. Lastly, the study of constitutional tumor-predisposing mutations may shed light on the pathogenesis of sporadic lesions.
Although clinically recognized syndromes account for only a small minority of soft tissue sarcomas, except for malignant peripheral nerve sheath tumors—half of which arise in patients with neurofibromatosis type 1 (NF1) —other types of constitutional mutations, alone or in combination, may account for more substantial proportions. When constitutional DNA from more than 1100 sarcoma patients was analyzed for mutations in 72 genes known to be associated with increased cancer risk, sarcoma patients more often than controls had one or more pathogenic variants, with the affected genes often being implicated in DNA damage response, such as TP53 , ATM , ATR , and ERCC2 . Importantly, the presence of such mutations was associated with younger age at onset and shorter disease-free survival. Thus, apart from mutations causing classic mendelian traits, it is likely that abundant “weak” mutations, combined with other mutations (polygenic inheritance) and/or environmental factors, significantly affect the risk for soft tissue tumor development, and that a large (25%) proportion of those mutations are of potential therapeutic relevance.
Some predisposing mutations are not present in the zygote but occur early in embryogenesis. Therefore, such mutations are confined to a variable fraction of the individual’s cells, a phenomenon known as mosaicism. Many are associated with overgrowth features, and whether the soft tissue lesions arising in this context are malformations or true neoplasms is debatable. Examples of conditions that are associated with soft tissue tumors and always caused by such mosaic mutations are Maffucci syndrome (IDH1 and IDH2 mutations), Klippel-Trenaunay syndrome, and other phenotypes collectively known as the PIK3CA-related overgrowth spectrum (PIK3CA mutations). Other conditions may be caused by either germline or mosaic mutations, such as Beckwith-Wiedemann syndrome (various mutations affecting imprinted loci in chromosome band 11p15), Noonan syndrome, and other RASopathies (mutations of KRAS, NRAS, BRAF, NF1, or other genes involved in the RAS/MAPK signaling pathway).
In summary, an increasing number of patients with soft tissue tumors have some type of predisposing constitutional mutation(s), making it prudent to include information on at least first-degree relatives when obtaining the medical history of the patient.
Somatic Mutations in Soft Tissue Tumors: General Concepts
The “somatic mutation” theory of cancer, first presented more than a century ago, stipulates that neoplastic transformation is caused by genetic changes. The validity of this hypothesis has been demonstrated through numerous studies, and it is now commonly accepted that all neoplasms arise through mutations. In most cases, several mutations are needed to achieve the proliferative advantages that separate the neoplastic cells from their normal counterparts. As outlined by Hanahan and Weinberg, the neoplastic cells must become self-sufficient in growth signals, develop reduced sensitivity to growth-inhibitory signals, and be able to evade apoptosis (programmed cell death). Furthermore, as the tumor grows, it must be able to induce vascular supply (angiogenesis), and malignant lesions need to acquire the ability to invade surrounding tissues and spread to other sites. Since all these features are unlikely to be achieved through a single mutation, it has also been suggested that an increased mutational rate (genetic instability), allowing for rapid evolvement of subpopulations with increased fitness, is a prerequisite, at least for malignant lesions.
Indeed, most tumors, especially malignant ones, display numerous mutations at the chromosome and nucleotide levels. Only some of these, called “driver mutations,” contribute to tumor development; most are instead thought to constitute “passenger mutations,” with little or no impact on tumorigenesis or tumor progression. The more complex the genome of a neoplasm, the more difficult it is to distinguish driver mutations from passenger mutations, but by focusing on those that are recurrent, present in the earliest stages of tumor development, or accompanied by few other changes, it has been possible to identify a surprisingly large set of genes and mutations that play an active role in tumor development. Three main categories of genes with driver mutations can be discerned: those that positively (“oncogenes”) or negatively (“tumor suppressor genes”) regulate cell growth and survival and those that are involved in the maintenance of genomic stability (“caretaker genes”). While useful as general terms, these designations are context dependent. Thus some genes (e.g., RET, TP53) could act as both oncogenes and tumor suppressor genes, depending on the type of mutation. Furthermore, although most caretaker genes could be classified as tumor suppressor genes, some (e.g., TERT, important for maintaining adequate telomere length) fit better as oncogenes.
Although there are many ways by which oncogenes can become activated and tumor suppressor genes inactivated, three mechanisms are particularly important in tumor development: small genetic variants (single nucleotide mutations, insertions, and deletions), chromosomal imbalances (copy number changes), and gene fusions. All three mechanisms are operative in soft tissue tumors.
Small Genetic Variants
Mutations affecting a single nucleotide (single nucleotide variants, SNVs) or insertions/deletions (indels) of up to 10,000 nucleotides are often referred to as small genetic variants. Such mutations are common in neoplasia as well as in constitutional DNA. Indeed, any individual is thought to deviate from the reference genome at 4 to 5 million sites and to have 2000 to 2500 larger structural variants (gains/losses) affecting up to 20 million nucleotides. Some SNVs are fairly common (present in at least 1% of the population); these are also known as single nucleotide polymorphisms (SNPs) and may confer increased risk for certain diseases. Neoplastic cells may harbor thousands of SNVs and indels that are not seen in the corresponding constitutional DNA. The vast majority of these mutations are located outside the coding regions, although this does not prevent them from having a significant impact on tumorigenesis; for example, SNVs in the promoter region of TERT increase the affinity for certain transcription factors, a common finding in myxoid liposarcoma and solitary fibrous tumor. Still, the mutations that lead to protein-level changes have attracted the most attention. SNVs could result in the exchange of a single amino acid (“nonsynonymous exonic variants”), introduce a premature stop codon, or abolish a splice recognition site. Similarly, indels could lead to frameshift mutations, truncated proteins, or proteins with novel amino acid sequences. Consequently, SNVs and indels can activate oncogenes, as well as inactivate tumor suppressor genes. Only a minority of all the exonic nonsynonymous SNVs and indels found in neoplasms have a major impact on tumor development (driver mutations), while the remaining mutational load represents mutations occurring before neoplastic transformation or noise caused by increased genetic instability (passenger mutations).
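The size and frequency definitions in the preceding paragraph can be expressed as a small classification routine. The following Python sketch is illustrative only: the `Variant` class, its `population_frequency` field, and the `classify` function are hypothetical names introduced here, and real variant callers apply far more elaborate rules. Only the two thresholds (10,000 nucleotides and 1%) are taken from the text.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; all names below are illustrative assumptions.
MAX_INDEL_LENGTH = 10_000   # indels of "up to 10,000 nucleotides"
SNP_MIN_FREQUENCY = 0.01    # common SNVs: "at least 1%" of the population


@dataclass
class Variant:
    ref: str                            # reference allele, e.g. "A"
    alt: str                            # alternate allele, e.g. "G"
    population_frequency: float = 0.0   # hypothetical field, 0.0 if unknown


def classify(variant: Variant) -> str:
    """Return a coarse label for a variant based on allele lengths."""
    ref_len, alt_len = len(variant.ref), len(variant.alt)
    # A single-base substitution is an SNV; a common SNV is also called an SNP.
    if ref_len == 1 and alt_len == 1 and variant.ref != variant.alt:
        if variant.population_frequency >= SNP_MIN_FREQUENCY:
            return "SNP"
        return "SNV"
    # A net length change within the size limit counts as an indel.
    size = abs(ref_len - alt_len)
    if 0 < size <= MAX_INDEL_LENGTH:
        return "indel"
    # Everything else (no change, multi-base substitutions, larger events).
    return "other"
```

For example, `classify(Variant("A", "G"))` gives `"SNV"`, whereas the same substitution with `population_frequency=0.05` gives `"SNP"`. Whether a variant is a driver or a passenger cannot be derived from such a label and requires the functional and recurrence evidence discussed in the text.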
Several bioinformatic tools can predict the functional outcome of a given mutation, and numerous databases, such as COSMIC ( http://cancer.sanger.ac.uk/cosmic ), list reported mutations by gene and disease, greatly improving the interpretation of detected mutations. However, many mutations can still be difficult to evaluate and may require functional analysis in an experimental system.
Whole exome sequencing efforts initially focused on common epithelial malignancies, such as carcinomas of the breast, colon, and lung. When similar large-scale studies were done on soft tissue sarcomas, some interesting differences were noticed. First, SNVs and indels are much less common in sarcomas than in carcinomas, a difference that could reflect that many carcinomas develop through a gradual transformation of a normal cell, whereas most sarcomas seem to start as malignant lesions, without a recognizable dysplastic or benign precursor lesion. Another possible explanation is that carcinomas develop from stemlike progenitor cells that have undergone numerous cell divisions, each of which may generate SNVs and indels, before transformation, while the turnover of putative mesenchymal progenitor cells is much slower. Second, most sarcomas seem to have few and infrequent recurrent mutations, a phenomenon that may be explained by the presence of other strong driver mutations (see later). However, important exceptions include the frequent KIT and PDGFRA mutations in gastrointestinal stromal tumors (GISTs), RAS signaling pathway mutations in embryonal rhabdomyosarcoma, and MYOD1 mutations in spindle cell rhabdomyosarcoma. Still, there are comparatively few whole exome and whole genome sequencing analyses of sarcomas. Benign soft tissue tumors are even less extensively investigated with regard to somatic SNVs and indels. Frequent and recurrent mutations have thus far been observed in only a handful of benign lesions (e.g., of CTNNB1 in desmoid fibromatosis, NF2 in schwannoma, and PRKD2 in angiolipoma).
Although SNVs and indels presently have a limited impact on soft tissue tumor diagnostics, they may provide important information on treatment response. The prime example is GIST; not only the absence or presence of mutations, but also their precise location in the KIT and PDGFRA genes shows strong correlation with response to treatment with different tyrosine kinase inhibitors. Even if actionable mutations are otherwise rare in sarcomas, targeted treatment based on the presence of certain mutations might improve outcome. Therefore, in patients responding poorly to conventional treatment, mutational screening should be considered.
Chromosomal Imbalances
The term chromosomal imbalance is used here to refer to a structural or numeric rearrangement resulting in a quantitative deviation from the normal diploid state. Such chromosomal imbalances may theoretically range from gain or loss of a single nucleotide to whole chromosomes, but it is reasonable to reserve the term for rearrangements affecting at least an entire gene.
Numeric chromosomal aberrations are poorly tolerated at the organism level but are common in tumors, ranging from gain or loss of individual chromosomes ( aneusomy ) to gain or loss of one or more copies of the entire genome ( ploidy shift ). Chromosome counts as low as 23 and as high as 400 or more have been detected at chromosome banding analysis of tumors ( Fig. 4.1 ). In soft tissue tumors, numeric chromosome aberrations are found in almost two-thirds of all cases subjected to chromosome banding analysis, being much more common among sarcomas than among benign soft tissue tumors, about 90% versus one-third. The difference between benign and malignant soft tissue tumors becomes even more obvious when considering only tumors with chromosome numbers below 45 or above 47; such numbers are seen in only 10% of the benign lesions but in two-thirds of sarcomas. Aneusomies probably arise through incomplete separation of homologues at mitosis, a phenomenon facilitated by a variety of neoplasia-associated disturbances (e.g., of centrioles, centromeres, and telomere status) and tolerated through mutations in various caretaker genes. Gain of one or more complete sets of all chromosomes, polyploidization, is strongly associated with malignancy and is most likely caused by accidental events such as abortive mitoses, cell fusions, and endoreduplication. Also, more extensive, meiosis-like, loss of chromosomes ( near-haploidization ) has been observed in soft tissue tumors, notably inflammatory leiomyosarcoma and undifferentiated pleomorphic sarcoma; the mechanisms behind this phenomenon remain unknown.
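The comparison above rests on a simple cutoff: chromosome numbers below 45 or above 47. As a minimal sketch, assuming each case is summarized only by its modal chromosome count, the proportion of cases outside that range could be computed as follows. The function names are hypothetical, and the sketch deliberately ignores that a tumor with 46 chromosomes may still carry numeric aberrations (e.g., one gain balanced by one loss).

```python
def is_outside_near_diploid(chromosome_count: int) -> bool:
    """True if the count is below 45 or above 47, the cutoff used in the text."""
    return chromosome_count < 45 or chromosome_count > 47


def fraction_outside_near_diploid(counts: list[int]) -> float:
    """Fraction of cases whose chromosome count falls outside 45-47."""
    if not counts:
        raise ValueError("at least one case is required")
    return sum(is_outside_near_diploid(c) for c in counts) / len(counts)
```

With the made-up counts `[46, 47, 44, 92]`, two of four cases fall outside the range, so `fraction_outside_near_diploid` returns `0.5`.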
The pathogenetic consequences of aneusomies are difficult to assess because they affect hundreds to thousands of genes. In general, however, gene expression levels vary with the number of copies; that is, trisomies and monosomies result in increased and decreased expression, respectively, of many, but not all, genes located on these chromosomes. In some cases the outcome of monosomies can be reduced to represent one step in the biallelic inactivation of one or more tumor suppressor genes. A good example is schwannoma, where inactivation of the NF2 gene, which maps to chromosome band 22q12, could be achieved through any combination of monosomy 22, partial deletions of chromosome arm 22q, and structural variants targeting NF2 specifically. In other tumors, where monosomy and partial deletions alternate, such as spindle cell lipomas, which display either monosomy 13 or partial deletions, with 13q14 as a minimal shared target, the remaining copy of chromosome 13 seems intact, suggesting that loss of one copy (haploinsufficiency) is enough for tumorigenesis. Further, some inactivating mutations, such as those affecting the CDKN2A, CDKN2B, and MTAP loci in chromosome band 9p21, typically do not affect larger segments of chromosome 9, indicating that loss of adjacent segments is deleterious.
Gain of entire chromosomes is even more difficult to reduce to single-gene effects, but it has been suggested that extra copies of chromosome 8 could substitute for gene fusions activating PLAG1 , which maps to chromosome 8, in lipoblastoma . However, the significance of most characteristic numeric aberrations in soft tissue tumors, such as extra copies of chromosome 8 in embryonal rhabdomyosarcoma or trisomies occurring as secondary changes in myxoid liposarcoma or infantile fibrosarcoma, remains poorly understood.
Also, chromosomal imbalances resulting from structural rearrangements are common in soft tissue tumors, especially in sarcomas. For several subtypes, such as GIST, it has even been suggested that the overall number of structural rearrangements leading to imbalances could be used to predict patient outcome. Two types of chromosomal imbalance have attracted particular attention: homozygous deletions and gene amplifications. Homozygous deletions are relatively rare and do not always have to affect bona fide tumor suppressor genes; indeed, some homozygous deletions are constitutional variants without phenotypic consequences. However, pinpointing homozygous deletions in tumors has been instrumental in detecting many of the classic tumor suppressor genes, such as RB1 and SMARCB1. Because men usually have only one X chromosome, a single deletion event on this chromosome will have the same effect as a homozygous deletion on an autosome (i.e., complete loss of one or more genes). Several potential targets for such deletions on the X chromosome have been described in soft tissue tumors, notably the DMD gene in myogenic sarcomas and sclerosing epithelioid fibrosarcoma.
Gene amplification is poorly defined but usually refers to a selective, more than three- to fivefold gain of a DNA sequence relative to adjacent sequences on the same chromosome. Different types of chromosome rearrangement are associated with gene amplification: double minutes (dmin), homogeneously staining regions (hsr), and ring chromosomes, all arising through different mechanisms. The presence of hsr or dmin is strongly associated with a malignant phenotype, whereas ring chromosomes are found in both high-grade malignant and benign/low-grade malignant soft tissue tumors; the latter tumors, however, have a clear propensity for transforming into high-grade tumors, probably because of the intrinsic mitotic instability of ring chromosomes. Ring chromosomes are particularly common among atypical lipomatous tumors/dedifferentiated liposarcomas and dermatofibrosarcoma protuberans, where they occur in 80% and 50% of cases, respectively, and often are seen as the sole cytogenetic aberration.
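Because amplification is defined as a relative rather than an absolute gain, the same copy number can count as amplified or not depending on the surrounding sequences. The following Python sketch illustrates this under stated assumptions: copy numbers for the target and its flanking region are taken as given, the function names are hypothetical, and the default threshold of 3.0 is simply the lower bound of the "three- to fivefold" range mentioned above, not an established cutoff.

```python
def fold_gain(target_copies: float, flanking_copies: float) -> float:
    """Copy number of the target relative to adjacent sequences."""
    if flanking_copies <= 0:
        raise ValueError("flanking copy number must be positive")
    return target_copies / flanking_copies


def is_amplified(target_copies: float, flanking_copies: float,
                 min_fold: float = 3.0) -> bool:
    """True if the relative gain exceeds the chosen fold threshold.

    min_fold is an assumption; the text gives a range of three- to fivefold.
    """
    return fold_gain(target_copies, flanking_copies) > min_fold
```

A gene present in 20 copies on a chromosome arm otherwise present in 2 copies shows a 10-fold relative gain and is amplified under either end of the range. A gene present in 8 copies against a background of 2 copies (fourfold) is amplified with `min_fold=3.0` but not with `min_fold=5.0`, which illustrates why the term is described as poorly defined.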
Another type of chromosomal imbalance, which does not change the gene copy number, is uniparental isodisomy (UPiD), which may affect entire chromosomes or parts of chromosomes. For UPiDs to occur, one normal copy must be lost, followed by duplication of the remaining copy. The effect could be the unmasking of a recessive mutation or, if imprinted genetic loci are involved, deregulated expression of genes inherited from the mother or father. The most commonly affected chromosomal region in soft tissue tumors, as well as in a variety of pediatric malignancies, is 11p. This chromosome arm contains a set of paternally and maternally imprinted loci in band 11p15, including the IGF2 gene, which is expressed exclusively from the copy inherited from the father. Various combinations of loss of the maternal copy of 11p, with or without duplication of the paternal copy, are common in embryonal rhabdomyosarcoma.
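The distinction between copy number and parental origin can be made concrete with a small example. The Python sketch below assumes that the number of maternally and paternally derived copies of a region is known, which in practice has to be inferred from allele-specific data; the function name and the returned labels are hypothetical. Counts alone cannot show that two copies from one parent are identical, so the sketch reports uniparental disomy rather than isodisomy.

```python
def describe_allelic_state(maternal: int, paternal: int) -> str:
    """Label a region by total copy number and parental origin of the copies."""
    if maternal < 0 or paternal < 0:
        raise ValueError("copy numbers cannot be negative")
    total = maternal + paternal
    if total < 2:
        return "loss"
    if total > 2:
        return "gain"
    # Two copies in total: copy number is unchanged (copy-neutral).
    if maternal == 0:
        return "uniparental disomy (paternal)"
    if paternal == 0:
        return "uniparental disomy (maternal)"
    return "biparental disomy"
```

For 11p15, loss of the maternal copy alone corresponds to `describe_allelic_state(0, 1)`, which returns `"loss"`, whereas loss followed by duplication of the paternal copy corresponds to `describe_allelic_state(0, 2)`, which returns `"uniparental disomy (paternal)"` even though the total copy number is normal.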
Lastly, it should be emphasized that the functional outcome of some imbalances, in particular deletions, does not necessarily need to be the copy number change as such. Each structural rearrangement resulting in loss or gain of chromosomal material could also lead to the juxtapositioning of two genes, one in each breakpoint. Indeed, the HAS2-PLAG1 fusion in lipoblastoma is typically created through an interstitial deletion of the intervening sequences on chromosome 8.
Gene Fusions
Structural chromosome rearrangements reshuffle the genetic material and may thus result in the juxtapositioning of (parts of) two genes. This in turn may lead to the translation of a deregulated and/or chimeric protein (Fig. 4.2). Such gene fusions have been described in all types of neoplasia, including benign as well as malignant lesions. The pathogenetic impact is unknown for most of the gene fusions that have been reported in the literature, the vast majority of which were detected through large-scale deep sequencing studies. Many have only been described once and are accompanied by numerous other chromosome-level mutations and thus likely represent passenger events. Other fusions are repeatedly detected, are often restricted to one or a few morphologic subtypes, and are associated with relatively few other mutations, all suggesting that they constitute strong driver mutations; this is especially true for soft tissue tumors, where characteristic gene fusions abound. This indirect evidence for an important role in tumorigenesis has been further strengthened in a few cases by results from in vitro studies and from experimental animal models, showing that the gene fusion, at least if it occurs in a permissive cellular context, is sometimes sufficient for malignant transformation. It is important to emphasize, however, that several pathogenetically important gene fusions are present also in tumors with highly complex genomes, suggesting either that those gene fusions are weak transformers or that they facilitate the occurrence of other mutations. Further, the strong impact of the gene fusions on the tumor cells, coupled with chimeric genes being specific for tumor cells, makes them attractive as potential targets for treatment. Indeed, pharmacologic treatment of sarcomas displaying fusions that activate protein kinases (e.g., ALK) or growth factors (e.g., PDGFB) is already in clinical use.
The first gene fusions to be detected in soft tissue tumors were EWSR1-FLI1 in Ewing sarcoma in 1992 and FUS-DDIT3 in myxoid liposarcomas in 1993. Since then, almost 200 different gene fusions have been found, more than half being recurrent in a specific subtype; about one-third of all soft tissue tumor subtypes display one or more recurrent gene fusions. Table 4.2 shows gene fusions estimated to occur in at least 10% of a specific tumor type.
Gene Fusion | Tumor Type | Frequency (%) |
---|---|---|
ACTB-GLI1 | Pericytoma with t(7;12) | 100 |
AHRR – NCOA2 | Soft tissue angiofibroma | >50 |
ASPSCR1 – TFE3 | Alveolar soft part sarcoma | 100 |
BCOR – CCNB3 | Undifferentiated round cell sarcoma | >20 |
C11ORF95 – MKL2 | Chondroid lipoma | >90 |
CIC – DUX4 | Undifferentiated round cell sarcoma | >25 |
CIC – DUX4L10 | Undifferentiated round cell sarcoma | >25 |
CLTC – ALK | Inflammatory myofibroblastic tumor | 15 |
COL1A1 – PDGFB | Dermatofibrosarcoma protuberans | >95 |
COL3A1-PLAG1 | Lipoblastoma | >10 |
COL6A3 – CSF1 | Tenosynovial giant cell tumor | 30 |
EML4-ALK | Inflammatory myofibroblastic tumor | 10 |
EP400 – PHF1 | Ossifying fibromyxoid tumor | 45 |
ETV6 – NTRK3 | Infantile fibrosarcoma | >90 |
EWSR1 – ATF1 | Clear cell sarcoma | >90 |
EWSR1 – CREB1 | Clear cell sarcoma | >50 |
EWSR1 – CREB1 | Angiomatoid fibrous histiocytoma | >80 |
EWSR1 – CREB3L1 | Sclerosing epithelioid fibrosarcoma | >60 |
EWSR1 – CREB3L2 | Sclerosing epithelioid fibrosarcoma | >10 |
EWSR1 – FLI1 | Ewing sarcoma | 90 |
EWSR1 – NR4A3 | Extraskeletal myxoid chondrosarcoma | 70 |
EWSR1 – WT1 | Desmoplastic small round cell tumor | >95 |
FN1-EGF | Calcifying aponeurotic fibroma | >80 |
FN1 – FGFR1 | Phosphaturic mesenchymal tumor | 60 |
FUS – CREB3L2 | Low-grade fibromyxoid sarcoma | 90 |
FUS – CREB3L2 | Hybrid LGFMS/SEF | >90 |
FUS – DDIT3 | Myxoid liposarcoma | 95 |
HAS2 – PLAG1 | Lipoblastoma | >10 |
HEY1 – NCOA2 | Mesenchymal chondrosarcoma | 65 |
HMGA2 – LPP | Lipoma | 15 |
MIR143 – NOTCH2 | Glomus tumor | 35 |
MYH9 – USP6 | Nodular fasciitis | 75 |
NAB2 – STAT6 | Solitary fibrous tumor | >95 |
PAX3 – FOXO1 | Alveolar rhabdomyosarcoma | 65 |
PAX3-MAML3 | Biphenotypic sinonasal sarcoma | 80 |
PAX7 – FOXO1 | Alveolar rhabdomyosarcoma | 20 |
SERPINE1 – FOSB | Pseudomyogenic hemangioendothelioma | 100 |
SS18 – SSX1 | Synovial sarcoma | 60 |
SS18 – SSX2 | Synovial sarcoma | 35 |
TAF15 – NR4A3 | Extraskeletal myxoid chondrosarcoma | 25 |
TEAD1-NCOA2 | Congenital spindle cell rhabdomyosarcoma | 20 |
TPM3-ALK | Inflammatory myofibroblastic tumor | 15 |
VGLL2-CITED2 | Congenital spindle cell rhabdomyosarcoma | 30 |
VGLL2-NCOA2 | Congenital spindle cell rhabdomyosarcoma | 20 |
WWTR1 – CAMTA1 | Epithelioid hemangioendothelioma | 90 |
ZFP36 – FOSB | Epithelioid hemangioma | 15 |
∗ Only gene fusions that have been detected in at least 10% of a specific tumor type are listed. The approximate frequency (as suggested by cytogenetic, molecular, and/or FISH studies) is given in the Frequency column.
Gene fusions were first discovered when the molecular outcomes of recurrent cytogenetic aberrations, in particular translocations, were pursued. At chromosome banding analysis, many of these cytogenetic rearrangements appear balanced; that is, no material seems to be lost or gained, but later analyses at the molecular level show that the recombination causing the fusion is frequently associated with smaller or larger deletions. Often the parts of the genes not included in the fusions are deleted, as is the case for the FUS and CREB3L2 genes in low-grade fibromyxoid sarcoma. Furthermore, some fusions, such as ASPSCR1 – TFE3 in alveolar soft part sarcoma, almost always arise through an unbalanced translocation, possibly adding to the transforming impact by simultaneously creating a fusion gene and deleting or gaining syntenic genes. Other fusions, however, such as COL1A1-PDGFB in dermatofibrosarcoma protuberans and PAX7-FOXO1 in alveolar rhabdomyosarcoma, are typically amplified, either in ring chromosomes or in double minutes. Many sarcoma-associated gene fusions could thus be indirectly detected by genomic arrays, revealing copy number shifts at the sites of fusion.
Fusions Involving Transcription Factors
About two-thirds of gene fusions in soft tissue tumors include a gene, typically as the 3′ partner, that encodes a transcription factor (TF) or another type of protein, such as a transcriptional coactivator or corepressor, that is directly involved in DNA transcription. More than 1500 human proteins are classified as TFs, which can be further subdivided into classes and families on the basis of their DNA-binding domains ( http://tfclass.bioinf.med.uni-goettingen.de/ ). Some of these TF classes are more frequently affected than others in soft tissue tumors: basic leucine zipper factors (ATF1, CREB1, CREB3L1, CREB3L2, DDIT3, FOSB), C2H2 zinc finger factors (GLI1, KLF17, PLAG1, PRDM10, WT1, ZNF444), tryptophan cluster factors (TCf; ERG, ETV1, ETV4, ETV6, FEV, and FLI1), and basic helix-loop-helix factors (AHRR, HEY1, NCOA1, NCOA2, TFE3); notably, all six TCf proteins belong to the family of Ets-related factors. Therefore the spectrum of TFs involved in fusions in soft tissue tumors is clearly nonrandom, and the involvement of a specific type of TF is typically seen in only one tumor type; for example, only Ewing sarcoma shows recurrent fusions involving an Ets-related factor as the carboxy-terminal partner. One possible explanation for the involvement of certain TFs in certain tumor types is that only these TFs are relevant for the genetic programs in the cell of origin. Indeed, the few chimeric TFs that have been analyzed in experimental systems, such as EWSR1-FLI1 or EWSR1-ATF1, show that only certain cell types can be transformed, that different genetic programs are affected in different cell types, and that the phenotypic effects vary depending on the cell type in which the chimeric TF is expressed. Furthermore, some of the TFs involved are crucial for the differentiation of the corresponding “normal” lineage. For example, HMGA2 and DDIT3, involved in fusions in lipoma and myxoid liposarcoma, respectively, are important for adipogenesis, and PAX3 and PAX7, involved in fusions in alveolar rhabdomyosarcoma, are crucial for rhabdomyogenesis.
Fusions Involving Protein Kinases
The second largest group of proteins involved in gene fusions in soft tissue tumors comprises protein kinases (PKs), most of which are receptor tyrosine kinases. The PK-encoding gene is always the 3′ partner, and fusion at the protein level results in constitutive activation of the kinase domain. In contrast to the gene fusions involving TFs, the tumor types showing PK fusions, such as benign fibrous histiocytoma and inflammatory myofibroblastic tumor, often display a large variety of different 5′ partners, reflecting that the main role of the 5′ partner is to ensure a high level of transcription by providing a more active promoter (Fig. 4.2C). As further support for this interpretation, the same PK, notably ALK, can be activated through other types of mutation. However, the amino-terminal partner may be important as well through contributing oligomerization domains or by ensuring a particular subcellular localization of the chimeric protein. Gene fusions involving PKs seem less tissue specific than those affecting TFs. For example, the ETV6 – NTRK3 and EML4 – ALK fusions occur not only in soft tissue tumors, but also in a variety of other neoplasms. The chimeric PKs in sarcomas, as in other malignancies with receptor tyrosine kinases activated by gene fusions or mutations, are excellent therapeutic targets.
Fusions Involving Chromatin Regulators and Other Protein Classes
Proteins that are involved in chromatin modification and remodeling have emerged as important players in tumorigenesis. Fusion proteins involving such chromatin regulators have a profound impact on the transcription machinery, which might help explain why soft tissue tumors with such fusions are either undifferentiated, as exemplified by undifferentiated round cell sarcomas with the BCOR – CCNB3 fusion, or display disparate lines of differentiation, such as synovial sarcoma with SS18 – SSX fusions or ossifying fibromyxoid tumor with PHF1 fusions. Several fusions involving TFs also likely exert their pathogenetic impact, to some extent, by affecting the chromatin configuration. For example, the amino-terminal part of the EWSR1 protein is known to interact with the SWI/SNF complex, and fusions combining an amino-terminal TF with the carboxy-terminal part of NCOA1 or NCOA2 retain the histone methyltransferase-interacting domains of the NCOA protein. Thus, inhibitors of DNA methylation and of histone deacetylases, initially developed for other neoplasms, might become useful for epigenetic treatment of some sarcomas as well.
Some gene fusions involve growth factors, such as the COL1A1 – PDGFB fusion in dermatofibrosarcoma protuberans or giant cell fibroblastoma and the COL6A3 – CSF1 fusion in tenosynovial giant cell tumor. Indirectly, however, these result in activation of PKs, the tyrosine kinase receptors PDGFRB and CSF1R, respectively. Therefore, in line with targeted therapies for fusions involving PK, tyrosine kinase inhibitors are of clinical value in unresectable or metastatic cases of dermatofibrosarcoma protuberans.
Completely different mechanisms seem to be involved in other tumors, such as nodular fasciitis. These tumors display fusions that activate USP6 expression through promoter swapping with MYH9. USP6 encodes a ubiquitin-specific protease, and overexpressed USP6 appears to activate the nuclear factor κB (NF-κB) TF complex, but the exact pathogenetic mechanisms remain to be elucidated.
From this brief summary of pathogenetic mechanisms in soft tissue tumors, it is clear that mutations need to be sought at different levels to support or refute a particular diagnosis. Thus it is important to understand what type of information can be obtained, and what cannot, from different types of genetic analysis.
Genetic Techniques
A multitude of genetic methods can be used to search for neoplasia-associated mutations. This summary outlines only techniques that currently are widely used in clinical molecular pathology or that have an obvious potential of becoming predominant within the next few years.
Chromosome Banding Analysis
Chromosome banding analysis is an excellent screening method for detecting both numeric and structural chromosome aberrations and was for many years the main method to identify new genetic subgroups among soft tissue tumors. It can only be performed on cells in mitosis, more specifically at the metaphase stage, when the chromosomes are contracted enough to be visualized under the microscope. Thus it requires access to fresh tumor tissue, obtained within 2 to 4 days after sampling. After mechanical and enzymatic disaggregation of the sample, cells can be cultured, typically for 1 to 7 days, to achieve metaphase spreads of sufficient quality and quantity.
The band staining of metaphase chromosomes can be achieved through a number of techniques. The most common, G-banding, is obtained through pretreatment with a saline solution or a proteolytic enzyme, followed by staining with Giemsa or similar stains. Several logistical and technical drawbacks hamper the use of chromosome banding analysis in the clinic. First, the need for fresh samples taken under sterile conditions demands efficient transportation from the surgeon or pathologist to the genetic laboratory. Second, even in cytogenetic laboratories with extensive experience, the analysis fails, usually because of overgrowth of normal stromal cells, in 20% to 35% of cases. Third, it is a work-intensive and relatively slow method compared with molecular and molecular cytogenetic techniques. Fourth, the resolution level is poor; structural rearrangements affecting less than 5 to 10 Mb (a chromosome band averages 10 Mb) cannot be detected. Finally, it has proved difficult to obtain tumor-representative karyotypes from fine-needle and core-needle biopsies.
Although chromosome banding techniques are thus increasingly being exchanged for other methods, they are still widely used clinically as well as for scientific purposes and continue to provide important reference data for other genetic studies; more than 2500 soft tissue tumors with abnormal karyotypes have been reported in the literature. In addition, chromosomal banding has shaped the terminology of cancer genetics, emphasizing the importance of being familiar with basic aspects of cytogenetic nomenclature. The autosomal chromosomes in principle are numbered according to size, from 1 to 22. Each chromosome is divided into two arms, separated by the centromere; the shorter, upper arm is called p and the lower, longer arm, q. Each arm is divided into one to four regions, each of which is further subdivided into bands; regions and bands are numbered from the centromere toward the telomere. Thus, 1p34 denotes chromosome band 4 in region 3 on the short arm of chromosome 1 ( Fig. 4.3A ). Numeric aberrations are specified by a plus or minus sign; for example, +8 and −8 denote gain and loss, respectively, of one copy of chromosome 8 ( Fig. 4.3B ). Structural rearrangements are denoted by an abbreviation for the type of rearrangement ( Table 4.3 ); for example, a t(12;16)(q13;p11) denotes a balanced translocation between chromosomes 12 and 16 with breakpoints in bands q13 and p11, respectively. In a karyotype, which is the sum of all observed clonal changes in a sample, the chromosome number is specified first, followed by the sex chromosome complement, and then by the clonal aberrations observed ( Fig. 4.3B ). Rules for how to report karyotypes and details concerning the nomenclature can be found in the International System for Human Cytogenomic Nomenclature (ISCN, 2016). ISCN also provides directions on how to report results obtained through in situ hybridization, microarrays, and sequence-based assays.
Abbreviation | Meaning |
---|---|
cx | Complex karyotype with clonal changes that cannot be described |
del | Deletion |
dmin | Double minute chromosome (sign of gene amplification) |
dup | Duplication |
hsr | Homogeneously staining region (sign of gene amplification) |
i | Isochromosome |
ins | Insertion |
inv | Inversion |
mar | Marker chromosome (its centromere cannot be assigned to any specific chromosome) |
r | Ring chromosome |
t | Translocation |
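The nomenclature rules described above are regular enough to be read by a short program. The following Python sketch is only an illustration, not an ISCN-compliant parser: it handles simple band designations without sub-bands (such as 1p34), two-way translocations (such as t(12;16)(q13;p11)), and whole-chromosome gains and losses. The function names are hypothetical and chosen for this example.

```python
import re

# Band designation without sub-bands, e.g. "1p34": chromosome, arm, region, band.
BAND_RE = re.compile(
    r"^(?P<chrom>1[0-9]|2[0-2]|[1-9]|X|Y)(?P<arm>[pq])(?P<region>\d)(?P<band>\d)$"
)
# Two-way translocation, e.g. "t(12;16)(q13;p11)".
TRANSLOCATION_RE = re.compile(
    r"^t\(([^;()]+);([^;()]+)\)\(([pq]\d\d);([pq]\d\d)\)$"
)


def parse_band(text):
    """Split a band such as '1p34' into chromosome, arm, region, and band."""
    match = BAND_RE.match(text)
    if match is None:
        raise ValueError(f"not a simple band designation: {text!r}")
    return {
        "chromosome": match["chrom"],
        "arm": match["arm"],  # p = short arm, q = long arm
        "region": int(match["region"]),
        "band": int(match["band"]),
    }


def parse_translocation(text):
    """Return both breakpoints of a translocation such as 't(12;16)(q13;p11)'."""
    match = TRANSLOCATION_RE.match(text)
    if match is None:
        raise ValueError(f"not a simple two-way translocation: {text!r}")
    chrom_a, chrom_b, break_a, break_b = match.groups()
    # The first breakpoint belongs to the first chromosome, and so on.
    return [parse_band(chrom_a + break_a), parse_band(chrom_b + break_b)]


def parse_numeric(text):
    """Interpret '+8' or '-8' (also the typographic minus) as gain or loss."""
    if len(text) < 2:
        raise ValueError(f"not a numeric aberration: {text!r}")
    sign, chromosome = text[0], text[1:]
    if sign == "+":
        return ("gain", chromosome)
    if sign in ("-", "\u2212"):
        return ("loss", chromosome)
    raise ValueError(f"not a numeric aberration: {text!r}")


print(parse_band("1p34"))
print(parse_translocation("t(12;16)(q13;p11)"))
print(parse_numeric("+8"))
```

For 1p34 the sketch returns chromosome 1, arm p, region 3, band 4, matching the reading given in the text. Sub-bands, complex rearrangements, and full karyotype strings are deliberately left out.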
Genomic Arrays
Genomic imbalances (i.e., gains and losses of chromosomal segments) in tumor cells may be detected by hybridizing extracted DNA to defined short DNA fragments, known as probes, attached to a surface, so-called genomic arrays ( Table 4.4 ). The signal intensity depends on the amount of DNA attaching to a certain probe; chromosomal segments that are under- or overrepresented in relation to the average copy number (which depends on the ploidy level of the tumor cells) will thus be recorded as lost or gained. By adding probes that detect SNPs, copy-neutral loss of heterozygosity also can be detected, and the copy number state in aneuploid tumors can be more readily appreciated (see Fig. 4.1 ). The resolution of the analysis depends chiefly on the number and chromosomal distribution of probes, typically amounting to more than 1 million in modern, high-resolution arrays; the information is thus at the exon level in such arrays. Although initially developed for DNA of high quality from fresh or frozen tissue samples, platforms have now been developed that work well with DNA from formalin-fixed, paraffin-embedded (FFPE) samples. Lastly, the amount of input DNA needed for the analysis has kept decreasing, making genomic arrays highly efficient also for analysis of preoperative needle biopsies.
Technique | Resolution | Balanced Chromosomal Rearrangements/Gene Fusions | Chromosomal Imbalances | SNVs/Indels | Advantages | Disadvantages |
---|---|---|---|---|---|---|
Chromosome banding | 5-10 Mb | Yes | Yes | No | Unbiased; information on cell-cell variation | Dependent on fresh samples; technically challenging; often noninformative |
Genomic arrays | <100 kb | Rarely | Yes | No | Unbiased | Balanced rearrangements not detectable; suboptimal results for FFPE samples |
Gene expression profiling | Exon/gene level | Rarely | Poorly | No | Unbiased | Dependent on sample quality; difficult to standardize |
FISH | Gene level | Yes | Yes | No | Robust; few cells needed; FFPE compatible | Directed |
RT-PCR | Nt level | Yes | No | No | High sensitivity | Directed; dependent on sample quality |
MPS, gene panels | Nt level | No | No | Yes | Detects subclonal mutations; FFPE compatible | Directed; currently few clinically important mutations in STT; mutations associated with STT often not covered in commercial panels |
MPS, gene fusion panels | Nt level | Yes | No | No | High sensitivity; FFPE compatible | Not all STT fusions covered |
MPS, whole exome | Nt level | Rarely | Yes | Yes | Both SNVs and imbalances detectable | Poor software for copy number analysis; normal tissue needed for adequate SNV/indel detection |
MPS, whole genome | Nt level | Yes | Yes | Yes | Detects all types of DNA-level mutation | Expensive; large datasets |
MPS, transcriptome | Nt level | Yes | No | Some | Unbiased detection of fusion transcripts; robust information on gene expression levels; FFPE compatible | Only hot spot mutations in expressed genes detectable |
The main conceptual limitation of genomic arrays is that they fail to identify balanced chromosomal rearrangements. Thus, most gene fusions associated with soft tissue tumors cannot be detected. However, as previously mentioned, some characteristic fusions are almost always amplified in the tumor cells, whereas other genes involved in fusions often display partial deletions; these will then be indirectly identified as copy number shifts in or near the respective genes. Another obstacle is that the organization of the genome (i.e., how different parts of the genome are attached to each other) is not visualized. Thus, coamplified sequences in ring chromosomes in well-differentiated liposarcomas, for example, are seen as separate amplicons in different chromosomes. The major clinical drawback, however, is that the results are highly dependent on the admixture of normal cells in the sample from which DNA was extracted; if tumor cells constitute less than 15% to 20% of the cells in a sample, tumor-associated imbalances will not be detected.
Despite these technical and biologic issues, genomic arrays provide a useful screening method for soft tissue tumors. Unfortunately, comprehensive databases on the copy number profiles of soft tissue tumors are lacking.
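The dependence of array results on normal-cell admixture can be illustrated with a simple mixture model. The Python sketch below is a simplified illustration under stated assumptions: normal cells are diploid, the array signal is proportional to the average copy number across all cells in the sample, and a segment is called as gained or lost only if its log2 ratio exceeds a threshold. The threshold of 0.2 is a hypothetical value chosen for the example, not a recommended cutoff, and the function names are likewise illustrative.

```python
import math


def expected_log2_ratio(tumor_copies, tumor_fraction, tumor_ploidy=2.0,
                        normal_copies=2.0):
    """Expected log2 ratio of a segment in a mixture of tumor and normal cells.

    The segment signal is compared with the sample-wide average copy number,
    which itself depends on the tumor ploidy and the tumor cell fraction.
    """
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must lie between 0 and 1")
    normal_fraction = 1.0 - tumor_fraction
    segment = tumor_fraction * tumor_copies + normal_fraction * normal_copies
    average = tumor_fraction * tumor_ploidy + normal_fraction * normal_copies
    if segment == 0:
        return float("-inf")  # homozygous loss in a pure tumor sample
    return math.log2(segment / average)


def call_segment(log2_ratio, threshold=0.2):
    """Call gain, loss, or neutral; the threshold is an assumed example value."""
    if log2_ratio >= threshold:
        return "gain"
    if log2_ratio <= -threshold:
        return "loss"
    return "neutral"


# Loss of one copy in a diploid tumor at decreasing tumor cell fractions.
for fraction in (1.0, 0.8, 0.5, 0.2):
    ratio = expected_log2_ratio(tumor_copies=1, tumor_fraction=fraction)
    print(f"tumor fraction {fraction:.1f}: log2 ratio {ratio:+.2f} -> "
          f"{call_segment(ratio)}")
```

Under these assumptions, a single-copy loss gives a log2 ratio of −1 in a pure tumor sample, but only about −0.15 when tumor cells make up 20% of the sample, which falls below the example threshold and is therefore called neutral.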
Gene Expression Profiling
Array-based global gene expression profiling addresses the expression of all transcribed genes in the genome. Several platforms for such studies, with different resolution levels, have been developed. Although theoretically alluring, global gene expression profiling, or analysis of a restricted set of genes, has not yet become standard in soft tissue pathology. The main reasons are that quantitative RNA-based studies, especially when multiple genes are involved, are difficult to standardize, and that RNA molecules typically are less stable than DNA molecules. However, there have been several promising studies on the association between gene expression profile and clinical outcome. Notably, the French Sarcoma Group showed that a gene expression signature based on the expression levels of 67 genes outperformed both morphologic and genomic metastasis predictors in sarcomas with complex genomes (undifferentiated sarcomas, leiomyosarcomas, and dedifferentiated liposarcomas). This gene signature, called complexity index in sarcomas (CINSARC), mainly included genes involved in chromosome integrity and mitotic control. In later studies this group confirmed the prognostic impact of CINSARC in GISTs, synovial sarcomas, and leiomyosarcomas. Importantly, they validated the reproducibility of CINSARC when using RNA sequencing instead of traditional array-based profiling, and showed that RNA from FFPE samples can also be used.
Fluorescence In Situ Hybridization
In situ hybridization (ISH) methods utilize the fact that DNA is organized into two antiparallel complementary strands. Thus, if the two strands are denatured (separated) through heating, a single-stranded probe can bind its complementary target. At fluorescence ISH (FISH), the probes have been labelled directly or indirectly by fluorophores, allowing for detection by fluorescence microscopy. Sequences ranging in size from approximately 10 kb up to entire chromosomes can be visualized with FISH probes, and by using different fluorophores for different chromosomes, the entire genome can be studied (multicolor FISH, spectral karyotyping). FISH is thus highly versatile; “painting probes” label entire or parts of chromosomes, locus-specific probes target individual genes or unique sequences, and repeat sequence probes label specific chromosomal structures present in multiple copies (e.g., centromeres, telomeres). The probes can also be labeled with nonfluorescent haptens that can be used for secondary detection by enzymatic methods, so-called chromogenic ISH (CISH), of particular use for analysis of fixed tissue sections. More important clinically, ISH analyses can be performed on both dividing cells (metaphase; Fig. 4.4A and B) and nondividing cells (interphase; Fig. 4.4C and D ). Thus, in contrast to chromosome banding, prior culturing is not needed. Furthermore, ISH can be successfully performed on minute samples and thus is well suited for analysis of cells from fine- or core-needle biopsies. In addition, the technique can provide rapid results (within 1 to 2 days).
ISH, especially FISH with locus-specific probes, has thus become a robust and useful ancillary method in soft tissue tumor pathology. It is particularly useful for detecting gene rearrangements by “break-apart probes”; the status of the gene in question is queried by probes that flank the gene, typically with one end labeled in red and the other in green. If the gene locus is intact, the two signals remain close to each other and are perceived as a yellow signal, but if the gene is affected by a structural rearrangement such as a translocation or inversion, the probe is split and seen as separate red and green signals ( Fig. 4.4C ). To increase specificity and stringency, separate probes for the two partners in a gene fusion could be labeled in red and green, giving rise to yellow fusion signals when positive ( Fig. 4.4D ). Locus-specific probes can also be used to detect deletions and amplifications ( Fig. 4.4A ). Reliable FISH probes are now commercially available for many clinically relevant gene fusions and amplicons in soft tissue tumors.
FISH on FFPE tissue sections poses technical problems not encountered when using cell spreads or imprint slides. First, the chemical procedures involved in the fixation of tumor tissue damage the DNA, reducing the stringency of the hybridization. Second, as an effect of cells not being neatly separated in vivo and thus possibly overlapping each other, and because some nuclei are cut when the sections are prepared, the cutoff levels for false-positive and false-negative signals could be quite high.
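The logic of scoring a break-apart assay against a cutoff can be sketched as follows. This Python example is a hypothetical illustration only: the data layout, the minimum gap for calling a signal pair split (two signal diameters), and the 15% positivity cutoff are all assumptions made for the example. In practice each laboratory validates its own criteria for each probe and tissue type.

```python
from dataclasses import dataclass


@dataclass
class Nucleus:
    # Distance between each paired red and green signal, in signal diameters.
    red_green_gaps: list


def is_split(nucleus, min_gap=2.0):
    """A nucleus counts as rearranged if any red/green pair is clearly apart."""
    return any(gap >= min_gap for gap in nucleus.red_green_gaps)


def score_break_apart(nuclei, min_gap=2.0, cutoff=0.15):
    """Return the fraction of split nuclei and whether it reaches the cutoff.

    Both min_gap and cutoff are assumed example values, not validated ones.
    """
    if not nuclei:
        raise ValueError("no nuclei were scored")
    split = sum(is_split(nucleus, min_gap) for nucleus in nuclei)
    fraction = split / len(nuclei)
    return fraction, fraction >= cutoff


# 80 nuclei with two intact (fused) signals, 20 with one intact and one split.
scored = [Nucleus([0.5, 0.4])] * 80 + [Nucleus([0.5, 3.0])] * 20
fraction, positive = score_break_apart(scored)
print(f"split nuclei: {fraction:.0%}, positive: {positive}")
```

The cutoff exists because truncated or overlapping nuclei in tissue sections produce apparent split or missing signals even in normal cells, which is why the threshold for FFPE sections may need to be set higher than for cell spreads.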
Reverse-Transcriptase Polymerase Chain Reaction
The reverse-transcriptase polymerase chain reaction (RT-PCR) is based on the conversion of RNA into complementary DNA (cDNA), thus allowing highly sensitive analysis of gene transcripts. Although useful for quantitative as well as qualitative analyses of various aspects of gene expression, the role of RT-PCR in soft tissue tumor pathology is mainly gene fusion detection. Most gene fusions arise through the juxtapositioning of intronic sequences; because introns are typically much larger than exons, analysis at the DNA level would require amplification of very long (several kb) PCR products. In contrast, at the transcript level, where introns have been spliced out, a single primer combination can detect fusion transcripts arising through different exon combinations (see Fig. 4.2B ). Primer combinations used in the clinical setting typically give rise to PCR products of 100 to 300 bp. The size of the amplified product suggests which exons have been fused, but subsequent sequencing is required to verify this at the nucleotide level.
RT-PCR is much more sensitive than FISH. Furthermore, in contrast to FISH with break-apart probes, which only gives indirect support for the presence of a particular gene fusion, RT-PCR provides information on the exact breakpoints in both partner genes. However, the extreme sensitivity of RT-PCR also makes it highly susceptible to contamination; thus it should not be performed without including a negative control. In addition, because the performance of the amplification process depends on the quality of the RNA used as starting material, one or more positive controls and a housekeeping gene should be run together with the sample of interest. When performing RT-PCR on FFPE samples instead of fresh or frozen tissue, the degradation of RNA must be taken into account. Primers must be designed to amplify shorter products, and often several primer combinations are required to detect variants that can be readily detected in a single PCR reaction when using high-quality RNA.
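How the size of the amplified product points to the exons that have been fused can be shown with simple arithmetic. The Python sketch below assumes a single forward primer in the 5′ partner and a single reverse primer in the 3′ partner; the genes, exon numbers, exon lengths, and primer positions are invented for illustration and do not correspond to any real fusion.

```python
def product_size(exons_5p, exons_3p, fwd_exon, fwd_offset,
                 last_5p_exon, first_3p_exon, rev_exon, rev_end):
    """Expected RT-PCR product length (bp) for one exon-exon junction.

    exons_5p / exons_3p map exon number -> exon length in bp.
    fwd_offset: bases of fwd_exon lying upstream of the forward primer.
    rev_end: bases of rev_exon up to and including the reverse primer.
    last_5p_exon / first_3p_exon: the exons joined at the fusion junction.
    """
    if fwd_exon > last_5p_exon or first_3p_exon > rev_exon:
        raise ValueError("a primer site is not retained in this transcript")
    # From the forward primer to the end of the last retained 5' exon.
    upstream = exons_5p[fwd_exon] - fwd_offset
    upstream += sum(exons_5p[e] for e in range(fwd_exon + 1, last_5p_exon + 1))
    # From the first retained 3' exon to the end of the reverse primer.
    downstream = sum(exons_3p[e] for e in range(first_3p_exon, rev_exon))
    downstream += rev_end
    return upstream + downstream


# Hypothetical exon lengths (bp) for two invented partner genes.
GENE_A = {5: 100, 6: 120, 7: 90}   # 5' partner
GENE_B = {4: 80, 5: 110, 6: 150}   # 3' partner

for last_a in (5, 6, 7):
    size = product_size(GENE_A, GENE_B, fwd_exon=5, fwd_offset=40,
                        last_5p_exon=last_a, first_3p_exon=6,
                        rev_exon=6, rev_end=30)
    print(f"A exon {last_a} fused to B exon 6: {size} bp")
```

With these invented values the three junctions give products of 90, 210, and 300 bp, so the band size on a gel already suggests the exon combination, although sequencing is still needed to confirm it. The same arithmetic shows why degraded RNA from FFPE samples calls for primers placed closer to the junction.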
Massively Parallel Sequencing
The previously mentioned genetic methods suffer from having insufficient resolution and success rates (chromosome banding, genomic arrays) or from being directed (FISH, RT-PCR). Massively parallel sequencing (MPS; also known as next-generation sequencing or deep sequencing) overcomes these obstacles by simultaneously providing both breadth and depth in a single analysis. Multiple nucleotide sequences, up to the entire genome or transcriptome, can be analyzed at the same time, and each target nucleotide sequence is analyzed several times, allowing for the detection of rare, subclonal variants ( Fig. 4.5 ). Since its introduction in tumor biology more than a decade ago, MPS has been used in thousands of studies, providing detailed information on SNVs, fusion transcripts, genomic structures, and gene expression profiles in all types of neoplasia, including soft tissue tumors. To illustrate the virtual flood of new cancer data generated by MPS studies, the number of known neoplasia-associated gene fusions has increased from fewer than 1000 in 2008 to more than 21,000 at present. Furthermore, MPS has identified several new architectural features of cancer cells, such as chromothripsis (extensive fragmentation and reassembly of individual chromosomes) and kataegis (clusters of SNVs, often in conjunction with structural rearrangements), and associations between mutation patterns and etiologic agents have been disclosed.
Despite the spectacular success of MPS in cancer research, its introduction in clinical diagnostics has been relatively slow. This is mainly a result of the extensive efforts and costs required to set up a diagnostic laboratory with adequate sequencing machines, an infrastructure that can handle analysis and storage of massive datasets, and bioinformatic solutions that can reliably sort out clinically important findings from technical and biologic artifacts. The following sections summarize the current status of MPS in soft tissue tumor molecular pathology.
Gene Panels
Predesigned gene sequencing panels can be used to search for mutations in genes or parts of genes that are important for a disease or a phenotype. Initially, panels were designed to detect SNVs and indels at the DNA level, but now RNA also can be used, and the spectrum of mutations that can be found includes gene amplification and gene fusions. For cancer diagnostics, several commercial solutions based on target enrichment or amplicon sequencing are available, usually focusing on genes of general interest in carcinogenesis; the number of genes in the panels varies from fewer than 20 to more than 4000. Because each type of neoplasia has its own mutational signature, commercial panels are increasingly being designed for particular tumor types. The main benefits of gene panels are that they require low input of DNA or RNA, they produce relatively small and thus more manageable datasets, and each target region is analyzed at great depth (typically >200×), allowing detection of mutations occurring in only a small proportion of the cells, as when the sample is heavily contaminated with normal cells or when mutations are subclonal. Gene panels also work well on DNA and RNA from FFPE samples, so it is possible to select areas that are tumor representative, further increasing the sensitivity.
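Why depth matters for mutations present in only a small proportion of cells can be made concrete with a binomial sampling model. The following Python sketch is a simplified illustration: it assumes that each read independently carries the variant with a probability equal to the variant allele fraction, that a heterozygous mutation in a diploid region is present on half of the alleles in each tumor cell, and that a variant is reported only when at least five reads support it. It ignores sequencing errors. The five-read requirement and the function names are assumptions made for this example.

```python
from math import comb


def heterozygous_vaf(tumor_fraction):
    """Variant allele fraction of a clonal heterozygous mutation.

    Assumes a diploid, copy-neutral locus in both tumor and normal cells.
    """
    return tumor_fraction / 2.0


def detection_probability(depth, vaf, min_variant_reads=5):
    """Probability of seeing at least min_variant_reads variant reads.

    Reads are modeled as independent draws (binomial sampling); the
    min_variant_reads default is an assumed example value.
    """
    p_too_few = sum(
        comb(depth, k) * vaf**k * (1.0 - vaf) ** (depth - k)
        for k in range(min_variant_reads)
    )
    return 1.0 - p_too_few


# A heterozygous mutation in a sample with 10% tumor cells (VAF 5%).
vaf = heterozygous_vaf(0.10)
for depth in (30, 100, 200, 500):
    print(f"depth {depth:>3}x: P(detect) = {detection_probability(depth, vaf):.3f}")
```

Under these assumptions, a variant at 5% allele fraction is detected in only about 2% of analyses at 30× depth but in more than 95% at 200×, which is in line with the depths quoted above for gene panels.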
Gene panels are rapidly becoming the gold standard for identifying mutations that predict response to therapy. Among soft tissue tumors, there are still relatively few examples of subtypes for which specific mutations are used for treatment stratification. An important exception is GIST, where the presence and location of mutations in the KIT and PDGFRA (and occasionally SDH and NF1 ) genes predict response to treatment with tyrosine kinase inhibitors. Also, panels detecting gene fusions are becoming widely used for diagnostic purposes.
Whole Exome Sequencing
Most commercial cancer gene panels have been designed to focus on common mutations in common cancers. Thus the spectrum of SNVs and indels in soft tissue tumors, which still remain poorly explored, is not necessarily covered by any commercial gene panel. One option is therefore to design specific gene panels (custom-targeted gene sequencing); another is to perform sequencing of all coding parts of the genome, called whole exome sequencing (WES). Such analyses require more input DNA, and the results are cumbersome to analyze without access to corresponding data on normal, constitutional DNA from the patient; each individual differs from the “reference genome” at thousands of nucleotide positions, making it difficult to identify the relevant somatic mutations if only tumor DNA is analyzed. Thus, costs for WES are relatively high, restricting its use in clinical practice. However, in patients with unclear tumors or an unexpected clinical course, WES could add important information. WES also yields copy number information and thus in some cases could replace genomic array analysis. WES has been applied to DNA from FFPE samples as well.
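The reason a matched normal sample simplifies WES interpretation can be shown in a few lines. This Python sketch is a toy illustration only: variants are represented as (chromosome, position, reference allele, variant allele) tuples with invented coordinates, and somatic candidates are simply those tumor variants that are absent from the patient's constitutional DNA. Real pipelines additionally account for coverage, allele fractions, and sequencing artifacts.

```python
def somatic_candidates(tumor_variants, normal_variants):
    """Keep tumor variants that are not seen in the matched normal sample."""
    germline = set(normal_variants)
    return [variant for variant in tumor_variants if variant not in germline]


# Invented example calls: (chromosome, position, reference, variant allele).
normal = [
    ("chr1", 1000, "A", "G"),
    ("chr2", 2500, "C", "T"),
    ("chr7", 4200, "G", "A"),
]
tumor = normal + [
    ("chr4", 5500, "T", "A"),  # present only in the tumor
]

print(somatic_candidates(tumor, normal))
```

Without the normal sample, all four tumor variants would have to be assessed against the reference genome, and the three constitutional differences could not be separated from the single somatic candidate.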
RNA Sequencing
Because gene fusions are the most important diagnostic mutations in soft tissue tumor pathology, most sarcoma centers employ some method(s) to detect at least a subset of the fusions occurring in diagnostically challenging tumors. As mentioned earlier, FISH typically does not reveal the fusion partner, and RT-PCR is cumbersome if a certain tumor type (e.g., small round cell tumors) can display many different gene fusions, or if (as in solitary fibrous tumor) a single gene fusion requires many primer combinations to be detected. Thus, MPS analysis of all transcribed genes (the transcriptome), called RNA-seq, is an alternative. In theory, RNA-seq will detect all potential gene fusions, both known and previously unknown. In practice, however, RNA-seq is limited by the quality of the RNA, the expression level of the gene fusion, and the depth of the analysis. For high-quality RNA from fresh-frozen tissue samples, at least 10 million reads are usually recommended. There is no need for running a normal sample in parallel, making the analysis less expensive than WES. In addition to gene fusion status, RNA-seq provides excellent information on splice variants and global gene expression levels. Mutations in expressed genes are present also at the transcript level.
Although most RNA-seq studies of soft tissue tumors have used high-quality RNA from fresh-frozen samples, it also works, although less efficiently, on RNA from FFPE samples. It remains to be explored how the results vary with type of fixation and time in storage.
Other MPS Applications
MPS is highly versatile, both in scale and target sequences. The most extensive analysis is whole genome sequencing (WGS), by which the entire genome (except repetitive sequences) can be studied. In principle, this would provide comprehensive information about nucleotide-level as well as chromosome-level mutations. However, due to high costs, the huge amounts of data generated, and the depth needed to identify subclonal mutations, WGS is mainly used as an exploratory tool or at low depth to detect chromosomal rearrangements. The noncoding parts of the genome remain poorly investigated, but such findings as activating mutations in the TERT promoter in myxoid liposarcoma and solitary fibrous tumor suggest that clinically relevant mutations may go undetected when restricting the analysis to exons. Another approach to identify chromosomal rearrangements is mate-pair sequencing, a technique to produce paired-end reads with long inserts. The short inserts (200–550 bp) used at WGS can miss structural variants because of repetitive sequences or complex genomic features.
The complexity at the RNA level is much greater than at the DNA level. Most MPS studies have focused on protein-coding (mRNA) molecules; many protocols used for library preparation specifically enrich this subset of RNA molecules. However, there are numerous subtypes of noncoding RNA molecules with important roles in gene regulation and carcinogenesis, such as microRNA (miRNA), long noncoding RNA, or circular RNA. Several intriguing observations on the impact of such RNA molecules on the pathogenesis of various sarcomas have been reported, but clinical applications are not imminent. Also, epigenetic mechanisms, such as methylation of cytosines in CpG dinucleotides and various histone modifications, affect the accessibility and transcription of genes in both normal and neoplastic cells. Comprehensive information on the methylome can be obtained from MPS analysis of bisulfite-converted DNA, and DNA sequences attached to histones can be identified from chromatin immunoprecipitation sequencing (ChIP-seq).
Clinically, perhaps the most promising MPS application is sequencing of liquid biopsies. The most common approach is to sequence the cell-free DNA that circulates in the bloodstream. Although most of this DNA derives from normal cells, neoplastic cells in cancer patients also contribute and can be detected when sequenced at sufficient depth. Therefore, copy number alterations, point mutations, or gene fusions could be detected by MPS. Potential applications range from improved prediction of metastatic dissemination to evaluation of treatment response and up-front diagnostics.