• Superposition of signals in DNA molecule is a sufficiently general principle of information coding. The degeneracy of the code is necessary for the effective realization of this principle, with which the possibility to transmit differently certain information also allows to place some other information on the same DNA fragment. Code words that are equivalent in the information sense -- synonyms -- form synonymous group; and the entire set of code words is broken into the synonymous groups. This paper is dedicated to construction and analysis of the model of synonymous coding. Sequence containing one word from each synonymous group is selected as the model signal. Meaningfully such sequences can be examined, for example, with respect to the triplet code. Cartesian synonymous groups are selected as a model of synonymous partition. Namely, assume that there is given a basic alphabet As = {a1, ... ,as}. Let us examine q of its, enerally speaking, different partitions Ai = {a(1,i), ..., a(ri,i)} where i = 1, ... ,q. Then, by definition, the set of the words from a(j1,1) x a(j2,2) x ...x a(jq,q) forms a Cartesian synonymous group.
  • We define and examine Cartesian synonymous partitions and their simplest properties. It is shown, that the task of constructing the model signal can be represented as the task of constructing a certain sequence in the special alphabet, letters of which form the subsets of the initial alphabet. The necessary concepts of dense sequence in the terms of this alphabet are introduced and their connection with the dense sequences of the letters of the basic alphabet is investigated. Transition to the new alphabet removes some combinatory difficulties related to the selection of the representative of a synonymous group. This makes possible to apply effectively the modification of the De Bruijn graph when examining the existence of a dense signal. It is convenient to investigate the concepts appearing in this case within the framework of a more general system - the system of abstract alphabets. This system is a straight generalization of sufficiently real biological models on the one side. On the other side it makes possible to describe more complex hypothetical rules of coding. Various conditions of existence of dense coding are obtained for a such system of abstract alphabets, based, in particular, on the theorem of Hall about perfect matchings in a bipartite graph and on the theory of flows in the networks. It is shown also that the sequences over abstract alphabets cannot always be transferred into one another by transposition of fragments, in contrast to the well known property of sequences of letters of a regular alphabet. The obtained general results are applied to the case of Cartesian synonymous partition for the words of length 2 and to the triplet code of amino acids. In the latter case it is proved that the synonymous groups of the amino-acid code are not always Cartesian, but they can be represented as an association of a small number of Cartesian synonymous groups. Therefore the analysis of the amino-acid code is reduced to the independent analysis of 16 versions of Cartesian synonymous partitions. This decomposition of the amino-acid code into a series of Cartesian codes made possible not only to analyze an existence of dense sequences in the triplet code, but also to calculate them all using a De Bruijn graph.

M. Gorel and V.M. Kirzhner (2006) Degenerate coding and sequence compacting. Preprint series: ESI preprints 1819. (http://www.esi.ac.at/preprints/ESI-Preprints.html)


V. Kirzhner, Z. Volkovich (2012) Model of Overlapping Messages with Degenerate Coding, Applied Mathematics 3: 188-197.