Ciliates are an ancient group of eukaryotes (about 2.5 billion years old). They are regarded as the most complex unicellular organisms on Earth. The feature that distinguishes them from other eukaryotes is nuclear duality: ciliates have two types of nuclei (micronucleus and macronucleus) that perform completely different functions. Micronuclei mainly store genetic information for future generations, while macronuclei contain the genes used to produce proteins during the lifetime of a cell. The genomes are stored in these two types of nuclei in completely different ways: micronuclear genes are highly fragmented and shuffled, with the fragments (coding blocks) separated from each other by non-coding blocks, while in macronuclei each DNA molecule usually contains a single gene stored in assembled (non-fragmented) form. During sexual reproduction, genes from micronuclei are assembled into macronuclear genes. This microbiological phenomenon involves heavy manipulation of DNA molecules and is regarded as the most complex DNA manipulation process known in nature.
Gene assembly in ciliates is also interesting from a theoretical point of view. The structure of micronuclear genes resembles that of linked lists in computer science. Coding blocks refer to each other by means of “pointers”: short nucleotide sequences that occur twice in a micronuclear gene, once at the end of a coding block and once at the beginning of the corresponding successor coding block. In this way, micronuclear gene patterns can be formalized as double-occurrence strings or permutations of integers. Formally, the gene assembly process in ciliates can be characterized in terms of operations on strings, permutations, graphs, etc.
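
The linked-list analogy can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not a model of the biological process: the class name CodingBlock, the pointer labels p2, p3, p4 and the shuffled block order M3, M1, M4, M2 are hypothetical, inverted blocks are ignored, and pointers are represented by abstract labels rather than nucleotide sequences. The sketch reads off the pointer string of a shuffled gene pattern, checks that every pointer occurs exactly twice, and recovers the assembled order by following pointers from block to block.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class CodingBlock:
    """A coding block with its two pointers (hypothetical representation)."""
    name: str
    head: Optional[str]  # pointer at the beginning; None for the first block of the gene
    tail: Optional[str]  # pointer at the end; None for the last block of the gene


def pointer_string(micronuclear: List[CodingBlock]) -> List[str]:
    """Read the pointers left to right along the micronuclear gene pattern."""
    pointers = []
    for block in micronuclear:
        for pointer in (block.head, block.tail):
            if pointer is not None:
                pointers.append(pointer)
    return pointers


def is_double_occurrence(pointers: List[str]) -> bool:
    """True if every pointer label occurs exactly twice."""
    return all(count == 2 for count in Counter(pointers).values())


def assemble(micronuclear: List[CodingBlock]) -> List[str]:
    """Follow tail -> head pointers, as in a linked list, to get the assembled order."""
    by_head = {block.head: block for block in micronuclear}
    current = by_head[None]          # the first block has no incoming pointer
    ordered = [current.name]
    while current.tail is not None:  # the last block has no outgoing pointer
        current = by_head[current.tail]
        ordered.append(current.name)
    return ordered


# Hypothetical shuffled micronuclear pattern with four coding blocks.
MICRONUCLEAR = [
    CodingBlock("M3", "p3", "p4"),
    CodingBlock("M1", None, "p2"),
    CodingBlock("M4", "p4", None),
    CodingBlock("M2", "p2", "p3"),
]

if __name__ == "__main__":
    pointers = pointer_string(MICRONUCLEAR)
    print(pointers)                        # ['p3', 'p4', 'p2', 'p4', 'p2', 'p3']
    print(is_double_occurrence(pointers))  # True
    print(assemble(MICRONUCLEAR))          # ['M1', 'M2', 'M3', 'M4']
```

In this toy pattern the pointer string p3 p4 p2 p4 p2 p3 is a double-occurrence string, and the order of the blocks (3, 1, 4, 2) is the corresponding permutation of integers.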