Sarah J. Wheelan, MD, PhD, Associate Professor, School of Medicine, Johns Hopkins University
Protein-coding sequences (genes) are not contiguous in the genome; rather, the pre-mRNA transcripts derived from these regions are cut into pieces and assembled into a final, mature mRNA that is exported from the nucleus and translated into a protein. The machinery that performs this cutting and pasting, called splicing, is thought to have very high fidelity and to be highly regulated, so that each final RNA sequence is derived from a single gene.
Rearrangements in the DNA, such as chromosome fusions, deletions, inversions, and other structural variations, can create novel genes in which protein-coding sequences from more than one original gene are brought into proximity and fused into a single transcript, which may have new activity. In fact, many cancers are characterized by such fusion genes, as they can significantly disrupt normal gene regulation, protein function, and cell growth.
Our hypothesis is that RNA-level fusions can also occasionally occur when there is no underlying DNA fusion; this would happen if the splicing machinery erroneously pastes two separate transcripts together to form a new fusion transcript. Interestingly, this could happen when the two parent genes are proximal in the nucleus, so such events would be extremely informative about the 3D structure of DNA.