Information Theory—part 4

Fundamental theorems of Coded Information Systems Theory


Appendix—Examples to illustrate principles of coded Information systems

Example A1. The human brain is made up of about 100 billion nerve cells, organized into about a hundred specialized structures responsible for processing signals and helping to make decisions.[1] Nerve signals resulting from millions of photons impinging on the retina are processed sequentially until an object is identified, complete with properties such as distance, colour, and movement.

Example A2. A single fertilized egg produces about 210 different kinds of cells in the human body.[2] Each has a special pattern of activated and deactivated genes, and cell morphologies can be very different. How to create new cells, when to do so during embryonic development, and where to locate them are not communicated exclusively by DNA sequences, the information source usually credited for cellular processes. Proteins and other biomolecules supplement what DNA alone provides, and external environmental cues provide other guiding resources.

Example A3. Suppose a mechanical vehicle on Mars is to be guided to a particular location in a promising valley divided into 64 equally sized rectangles. Specifying one rectangle requires log2(64) = 6 bits. A binary coding alphabet such as {0,1} would communicate one bit per symbol, and thus six symbols would be needed to identify a rectangle.[3] A quaternary alphabet like {U,D,L,R} would communicate two bits per symbol (log2(4) = 2 bits), and thus three symbols would be needed in the message.[4] The mapping of each unique message to a specific location is part of the coding convention. For example, a message [011101] could mean ‘Go to the rectangle identified by this address’ (figure 2, part A in the main text). Here the message is very short, since each symbol eliminates half the possible target locations,[5] but additional software resources must interpret what to do with the received message (determine the actual end location specified by the message) and guide the behaviour of the hardware. The hardware in this scenario would also need to be designed to respond flexibly, a challenging engineering requirement.
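
The symbol counts in Example A3 can be checked with a minimal sketch. The function names, the numbering of rectangles from 0 to 63, and the choice of index 29 are assumptions made for illustration only; the text leaves the actual mapping of messages to rectangles to the coding convention.

```python
def symbols_needed(n_targets: int, alphabet_size: int) -> int:
    """Smallest message length able to address n_targets distinct targets."""
    length, capacity = 0, 1
    while capacity < n_targets:
        capacity *= alphabet_size
        length += 1
    return length


def encode(index: int, alphabet: str, length: int) -> str:
    """Write a target index as a fixed-length address over the alphabet."""
    base = len(alphabet)
    if not 0 <= index < base ** length:
        raise ValueError("index does not fit in the requested length")
    digits = []
    for _ in range(length):
        index, remainder = divmod(index, base)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


n_rectangles = 64
binary_len = symbols_needed(n_rectangles, 2)      # 6 symbols
quaternary_len = symbols_needed(n_rectangles, 4)  # 3 symbols

# Hypothetical convention: rectangles are simply numbered 0..63.
print(encode(29, "01", binary_len))         # 011101
print(encode(29, "0123", quaternary_len))   # 131
```

Under this assumed numbering, the address 011101 used in the example corresponds to rectangle 29, and the same rectangle needs only three quaternary symbols.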

Example A4. Another CIS design is shown in figure 2, part B in the main text. The quaternary alphabet {U,D,L,R} communicates whether to move one unit Up, Down, Left, or Right. Now the vehicle would only need to be designed to respond to sequential instructions and would not need to calculate the final destination. The limited range of movements necessary would simplify the engineering requirements.
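
A receiver for this design can be sketched in a few lines. The coordinate convention (Up increases y, Right increases x), the starting point, and the sample message are assumptions for illustration; the point is only that each symbol is executed as it arrives, with no destination lookup.

```python
# Assumed convention: one symbol = one unit step on a grid.
MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def follow(message: str, start=(0, 0)):
    """Execute each symbol in turn and return the final grid position."""
    x, y = start
    for symbol in message:
        dx, dy = MOVES[symbol]  # a KeyError flags a symbol outside the alphabet
        x, y = x + dx, y + dy
    return x, y


print(follow("UURUL"))  # (0, 3)
```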

Many other designs are possible. The quaternary code could be enhanced to include instructions to move at 45° in four directions, saving time, energy, and matter. Another design could use messages which communicate how many degrees leftward or rightward to adjust during movement. If great distances are to be covered, the code could be made more efficient by indicating the number of units to proceed, for example 25U 11L 3U (i.e. 25 units straight Up; 11 Left; 3 Up).
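
The counted-move variant could be decoded along the following lines. This is a sketch under assumed conventions: instructions are separated by spaces, each is a whole number followed by one of U, D, L, R, and the grid directions are the same hypothetical ones as before.

```python
import re

MOVES = {"U": (0, 1), "D": (0, -1), "L": (-1, 0), "R": (1, 0)}


def parse(message: str):
    """Split a message such as '25U 11L 3U' into (count, direction) pairs."""
    steps = []
    for token in message.split():
        match = re.fullmatch(r"(\d+)([UDLR])", token)
        if match is None:
            raise ValueError(f"unrecognised instruction: {token!r}")
        steps.append((int(match.group(1)), match.group(2)))
    return steps


def displacement(message: str):
    """Net (x, y) displacement after carrying out the whole message."""
    x = y = 0
    for count, direction in parse(message):
        dx, dy = MOVES[direction]
        x += count * dx
        y += count * dy
    return x, y


print(parse("25U 11L 3U"))         # [(25, 'U'), (11, 'L'), (3, 'U')]
print(displacement("25U 11L 3U"))  # (-11, 28)
```

Three instructions here replace the 39 single-step symbols that the plain {U,D,L,R} code would need for the same path.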

Figure 4. Bees observing the scout’s waggle-dance fly off in a certain direction and to a certain distance. The trapezoid represents the communicated location of the food supply.

In an extreme design case, the message could communicate how to manipulate the equipment, or even how to build portions of it. This we find in cells, for example. The proteins and RNA needed to build molecular machines (ribosomes, enzymes, polymerases, etc.) are coded for and produced, to permit processing of subsequent instructions to produce new proteins.

Figure 2 also illustrates that before the vehicle can work with messages, it must be located at the edge of the search space. Placing the vehicle near the target within the search space in advance would simplify the content of the message needed afterward. However, such fortuitous advantages must not be taken for granted. They also represent a refinement factor towards a goal!

Example A5. Each missile launcher of an air defence system is to cover a region of space a kilometre away. The region is divided into n = 65,536 equal-sized portions, based on the destructive power the explosive can deliver. Coded messages can then communicate which portion to target. Each message would need log2(65,536) = 16 bits of information to always be able to identify the target location.

But suppose it would always be possible to split the area of coverage into four equal-sized quadrants and to know in advance at which one to aim. For example, a satellite might be responsible for making this decision and for initiating the missile launcher. This refinement stage provides an improvement factor of log2(4) = 2 bits of information, so that the next resource, the message, would only need to provide 14 more bits. The defence system, including the message transmitters and receivers, and the launchers, could be designed to work with the alternate scheme.
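
The bookkeeping in Example A5 can be verified directly. The variable names below are illustrative only; the figures are those given in the example.

```python
import math

n_portions = 65_536
n_quadrants = 4

total_bits = math.log2(n_portions)          # 16.0 bits to single out one portion
satellite_bits = math.log2(n_quadrants)     # 2.0 bits supplied by the quadrant choice
message_bits = total_bits - satellite_bits  # 14.0 bits left for the message

# Equivalent view: the message only has to select among the portions
# inside the chosen quadrant.
portions_per_quadrant = n_portions // n_quadrants  # 16384
assert math.log2(portions_per_quadrant) == message_bits

print(total_bits, satellite_bits, message_bits)  # 16.0 2.0 14.0
```

The refinement factors multiply (4 × 16,384 = 65,536) while the corresponding bits add (2 + 14 = 16).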

Example A6. Theorems 7 and 8 will be covered by this example. We notice that after a scout bee finds a source of food, other bees often arrive at the same location. We suspect a CIS and go about dissecting how it works. We place sugar-coated flowers at different locations and record what the scout does back in the hive. We observe the ‘waggle-dance’ and then document the behaviour of the bees which watched the dance. We conclude that direction and distance are communicated via the coded message, and wish to separate the contribution of this message from that of other refinement components.

The candidate region the other bees should arrive at is represented by the trapezoid in figure 4. To help visualize, the edge of the trapezoid closest to the hive would be about 40 m long if located about 500 m from the hive.[6]

There are several ways to go about performing the analysis, and the experiments should be repeated several times to calculate an average value.

In one possible setup, we could compare the amount of time all bees spend in the trapezoid with the number which left the hive, before and after a scout found the food and communicated this fact. In this case several factors would be at play: the location of food; the distance from the hive; and the certainty that food is nearby, prompting longer searches in that region. The design must carefully address the question being asked.

In a more refined experiment, the number of bees exiting a hive and the number entering the trapezoid region during a fixed period of time are recorded, before and after the waggle-dance message is received.[7] The ‘before’ measurement provides the baseline behaviour, the proportion p_random (part A of figure 5).

A bunch of flowers is covered with sugar, and we wait until a scout bee finds the food, returns to the hive, and concludes the waggle-dance. This time we record the number of bees which had observed the dance and then exit the hive during a fixed period of time, and determine the fraction of unique bees[8] which cross the line into the trapezoid. We’ll call this proportion p_message.

Figure 5. Quantitative analysis of the direction of food supply communicated by scout bees. A: proportion of bees entering the region shown in figure 4 before food was deposited there. B: proportion of bees entering the region after the scout communicated the presence of food. The horizontal line is the baseline behaviour, due to chance, in the absence of the waggle-dance.

The improvement is the ratio p_message / p_random. If values like 0.9 / 0.0127 = 71 result,[9] then log2(71) = 6.1 bits are provided by the part of the waggle-dance which communicates location with respect to the hive.[10] The analysis could also be performed by calculating the entropies of the before and after states and finding the difference, in bits. This was discussed in Parts 2 and 3 of this series. The region around the target would be divided into small cells, and the probability p_i of a bee being located in cell i during some time unit determined, to find the entropy:

$$H = -\sum_{i} p_i \log_2 p_i$$
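
Both routes to the contribution in bits can be sketched as follows. The function names are illustrative, and the two eight-cell distributions are invented toy values, not measurements; only 0.9 and 40/3,140 come from the text and footnotes.

```python
import math


def improvement_bits(p_message: float, p_random: float) -> float:
    """Bits contributed by a refinement stage, from the ratio of proportions."""
    return math.log2(p_message / p_random)


def entropy_bits(probabilities) -> float:
    """Shannon entropy in bits; cells with zero probability contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Ratio route, using the figures from the text.
p_random = 40 / 3140  # edge length / circumference, about 0.0127
p_message = 0.9
print(round(improvement_bits(p_message, p_random), 1))  # 6.1

# Entropy route, with hypothetical cell probabilities (toy numbers).
before = [1 / 8] * 8               # bees spread evenly over 8 cells
after = [0.5, 0.5, 0, 0, 0, 0, 0, 0]  # bees concentrated in 2 cells
print(entropy_bits(before) - entropy_bits(after))  # 2.0
```

In the toy case the search space shrinks from eight equally likely cells to two, a factor of four, which matches the 2-bit entropy difference.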


In addition to location, the waggle-dance communicates distance. Perhaps some of the ‘informed’ bees did not cross into the trapezoid because they did not fly far enough (which is reflected in the 0.9 factor estimated above), and not because they went too far left or right. If calculating the contribution of the distance component is desired, additional experiments and baseline tests would be needed. One could determine how effective the waggle-dance is at directing bees to a specific trapezoid region. The experimenter would monitor the ‘informed’ bees, see what proportion enters the region of interest, and discard those that fly on out without beginning to search. This proportion would be compared to that found for the ‘uninformed’ bees. An additional improvement factor from the waggle-dance message of at least 20 would be reasonable,[11] or log2(20) = 4.3 bits more.

Once within the trapezoid region communicated by the coded message, additional refinement components must be activated to find the much smaller sugar-coated flowers. One factor is the conviction that nutrition is indeed nearby, something the ‘uninformed’ bees don’t know. One could compare the average time spent by ‘informed’ and ‘uninformed’ bees within the trapezoid the first time they entered. (The flowers could be hidden to isolate the single factor under consideration.) The new log2(p_message / p_random) expresses this contribution, in additional bits.

The next factor one could analyze is the improvement provided by being able to see. The baseline to compare against now is non-vision, so the size of objects which can be seen would be tested. Another improvement factor is provided by the sense of taste. One could use identical bunches of flowers (or objects far smaller) coated with salt.

The net effect of all these refining factors is to guide bees to a region of about 125 cm^3 out of a volume of about 1.57 × 10^10 cm^3.[12] Of course, bees favour various distances and heights for their searches, so that the distribution of possible search areas is not totally random. Nevertheless, these rough calculations suggest an effectiveness for this CIS, using all the refining components available, of about log2(1.57 × 10^10 / 125) ≈ 27 bits per bee and event.
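
Taking the two volumes quoted above at face value, the final figure can be worked through as:

$$\log_2\!\left(\frac{1.57 \times 10^{10}\ \text{cm}^3}{125\ \text{cm}^3}\right) = \log_2\!\left(1.256 \times 10^{8}\right) \approx 26.9 \approx 27\ \text{bits}$$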


  1. Ramachandran, V.S., The Tell-Tale Brain, W.W. Norton, New York, 2011; see p. 14.
  2. en.wikipedia.org/wiki/List_of_distinct_cell_types_in_the_adult_human_body.
  3. With a binary code each rectangle could be labelled: 000000; 000001; 000010; 000011; … 111111.
  4. With a quaternary code the regions could be labelled: 000; 001; 002; 003; 010; 011; 012; 013; … 333.
  5. Each bit provided halves the search space on average, if done properly. This resembles the game of ‘20 guesses’, where each question should eliminate as many possibilities as possible. One way to see this for a binary code is to recognize that on average a 0 or 1 will show up half the time for every position of the message. If a 1 is observed, half the possible codes will be eliminated, and so on for each next bit received.
  6. If we offer food 500 m away, the circumference around the hive would be 3.14 × 2 × 500 = 3,140 m. The waggle-dance communicates direction with respect to the current location of the sun. Using a reasonable estimate of 10° in communicating this factor, the edge of the search space would be about 1,570 m × 10°/360° = 43.6 m, which we’ll round down to 40 m.
  7. Once a bee has the information from the waggle-dance, she could return several times to the target region. Multiple counting should be avoided if the contribution of the dance alone is intended.
  8. The intention is to identify the contribution from the waggle-dance. Once a bee has identified the food, she can return because of memory, which is a different contribution factor.
  9. The edge was estimated at 40 m, and the circumference at 3,140 m, at a location 500 m from the hive. 40/3,140 = 0.0127.
  10. There are other ways to design the experiments. For example, the proportion of bees which did and did not watch the waggle-dance that then entered the trapezoid could be used to calculate the improvement factor over chance.
  11. Based on documentary films of the waggle-dance. The length of the waggle-dance communicates distance, and dances seem to range between about two and twelve seconds. So there is a limited amount of precision possible. If we estimate the useful precision to be half a second, roughly one out of 20 slices of distance (10 seconds × 2 slices per second) would be communicated.
  12. Based on an estimated search volume of (500 m radius)^2 × 3.14 × 10 m height.