A DNA Transcription Model

The technologies of DNA and RNA sequencing have enabled experiments that are providing a large amount of information about the complex processes required for and associated with RNA polymerase II transcription of DNA into RNA. This comment suggests a model for the central event of DNA transcription. A more explicit model of this event may help in the assembly of all the emerging information about DNA transcription and its regulation into a more organized model of all the involved processes.

In this model, the core step of DNA transcription transcribes the DNA wrapped around one nucleosome. The step requires the transcription machinery to break down the nucleosome enough to gain access to the DNA. It then exposes the DNA's sequence information in the RNA transcript. Finally, it reassociates the nucleosome in a way that preserves the association between the nucleosome's histone tail chemical marks and that particular region of DNA.

The inputs to a step's processing include the transcribed DNA's sequence information, histone tail chemical marks, and chemical information associated with the polymerase's carboxy-terminal domain (CTD). Potential outputs include protein binding to the new RNA, modifications to the reassociated nucleosome's histone tails, and modifications to the chemical state of the polymerase's CTD. Much of this processing is likely carried out by processing machinery associated with the CTD. The changing state information associated with the CTD integrates the information gathered by the polymerase as it carries out a sequence of these steps. In this model, the CTD and its associated machinery play an information processing role: they integrate the information provided by histone tail epigenetic marks, by DNA sequences, and by various signaling mechanisms, all gathered as the polymerase traverses an extended segment of DNA, into the choices made by the polymerase in all of the remaining steps of its DNA transcription.
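The step-by-step flow of information described above can be sketched as a small state machine. This is only an illustrative sketch, not a biological simulation: the names `Nucleosome`, `StepOutput`, `step`, and `run` are hypothetical, the mark labels are placeholders, and the two update rules inside `step` are toy assumptions chosen only to show state being carried from one step to the next.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class Nucleosome:
    """One step's worth of DNA plus its histone tail marks (hypothetical type)."""
    dna: str                   # DNA wrapped around this nucleosome
    marks: FrozenSet[str]      # histone tail chemical marks (placeholder labels)


@dataclass(frozen=True)
class StepOutput:
    """What one step leaves behind (hypothetical type)."""
    rna: str                   # RNA transcribed during this step
    new_marks: FrozenSet[str]  # marks on the reassociated nucleosome


def transcribe(dna: str) -> str:
    """Copy the sequence into RNA, coding-strand convention (T becomes U)."""
    return dna.upper().replace("T", "U")


def step(nuc: Nucleosome, ctd_state: FrozenSet[str]) -> Tuple[StepOutput, FrozenSet[str]]:
    """Process one nucleosome: inputs are DNA, histone marks, and CTD state."""
    rna = transcribe(nuc.dna)
    # Toy rule (assumption): the CTD state accumulates every mark seen so far.
    new_state = ctd_state | nuc.marks
    # Toy rule (assumption): the reassociated nucleosome keeps its marks
    # and gains a placeholder "transcribed" mark.
    new_marks = nuc.marks | {"transcribed"}
    return StepOutput(rna, new_marks), new_state


def run(nucleosomes: Iterable[Nucleosome],
        initial_state: FrozenSet[str] = frozenset()) -> Tuple[List[StepOutput], FrozenSet[str]]:
    """Carry the CTD state across a sequence of steps."""
    state = initial_state
    outputs = []
    for nuc in nucleosomes:
        out, state = step(nuc, state)
        outputs.append(out)
    return outputs, state
```

The point of the sketch is the signature of `step`: each step reads the CTD state and returns an updated one, so later steps can depend on what was seen at earlier nucleosomes.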

Splicing provides an example of how this model could work. The unspliced exon RNA at the 3' end of the nascent RNA could be associated with the polymerase CTD. At each processing step, a decision could be made whether or not the newly transcribed RNA is an exon. That decision could be driven by information from the RNA sequence and information from the nucleosome's histone tails. When an exon is identified, an additional decision could be made whether or not it should be included in the spliced transcript. That decision could depend on RNA sequence information, histone tail information, and/or state information associated with the polymerase CTD. If the new exon is to be included, SR proteins could be bound to it to select it for splicing. Some kind of association with its CTD-associated exon splice partner could be formed to prime the splicing process. The new exon could then replace its 5' exon splice partner as the CTD-associated exon at the 3' end of the nascent RNA, waiting for a splice partner of its own.
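The sequence of decisions in this splicing example can be written out as a short loop. The sketch below is a hedged illustration only: `splice_walk`, `is_exon`, and `include` are hypothetical names, and the two decision functions are supplied by the caller because the text does not specify how the decisions are actually made.

```python
from typing import Callable, FrozenSet, Iterable, List, Optional, Tuple

# One processing step's input: (newly transcribed RNA, histone tail marks).
Segment = Tuple[str, FrozenSet[str]]


def splice_walk(
    segments: Iterable[Segment],
    is_exon: Callable[[str, FrozenSet[str]], bool],
    include: Callable[[str, FrozenSet[str], Optional[str]], bool],
) -> Tuple[List[Tuple[str, str]], Optional[str]]:
    """Walk the segments in transcription order, pairing included exons.

    Returns the list of (5' exon, 3' exon) pairings that were primed and
    the exon still held at the CTD when the walk ends.
    """
    pending: Optional[str] = None   # CTD-associated exon awaiting a partner
    pairings: List[Tuple[str, str]] = []
    for rna, marks in segments:
        # First decision: is the newly transcribed RNA an exon?
        if not is_exon(rna, marks):
            continue
        # Second decision: should this exon be included? The pending exon
        # stands in for the state information associated with the CTD.
        if not include(rna, marks, pending):
            continue                # skipped exon; pending exon is unchanged
        if pending is not None:
            pairings.append((pending, rna))   # prime the splice
        pending = rna               # new exon becomes the CTD-associated exon
    return pairings, pending


# Toy decision rules (assumptions): marks carry placeholder labels.
segments = [
    ("AUGG", frozenset({"exon"})),
    ("GUAAGU", frozenset()),
    ("CCUU", frozenset({"exon", "skip"})),
    ("GGAA", frozenset({"exon"})),
]
pairings, pending = splice_walk(
    segments,
    is_exon=lambda rna, marks: "exon" in marks,
    include=lambda rna, marks, held: "skip" not in marks,
)
```

With these toy rules the third segment is recognized as an exon but excluded, so the first exon is paired directly with the fourth, which then becomes the exon held at the CTD.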

Other kinds of polymerase II transcription activities could be directed at chromatin modifications associated with gene activation, cell type change, or establishment of chromatin configuration. In those kinds of activities, RNA transcription would serve both as a motor that drives a highly organized process across a long stretch of DNA and as an information processor that integrates sequence information from that DNA into the decisions that determine the processing's output events. Those output events could include marking of histone tails or direct action to remove, add, or position nucleosomes. To limit the consequences of random interactions with RNA transcribed by this kind of process, the nascent RNA would be rapidly degraded once it had served its purpose.