Digital Name Assimilation

Digital Name Assimilation is a sound composition/installation that uses DNA (Deoxyribonucleic acid) data taken from a dairy cow and her three daughters and realises it as audio through a process of synthesis using the computer software Max/MSP. The Max/MSP patch also makes predictions based on the mother’s DNA of what the DNA of her three daughters might be like, then outputs this as audio.

The piece is suitable for fixed listening over seven speakers in a large space (speaker layout diagram included below), but I have yet to work out a decent way to present it online. I don’t know of many people who browse blogs whilst hooked up to an 8 channel surround system, so for the time being I will resort to presenting the work as a textual commentary. Through this commentary I hope to draw attention to the extensive similarities in the DNA data (and in a computer prediction too) and to the comparisons that can be made between a digital representation of life and digital media.

As the music industry slowly responds to the internet age, specifically the loss of profits through illegal downloading of music, we have seen an adaptation of copyright into a technology called Digital Rights Management (DRM). This is currently a form of access control technology designed to restrict and control our use of digital media, as a digital recording of a work is still the same work that was encoded on the original CD sold or the download licensed to the consumer. This is similar to our ability (as consumers) to use software for a period of time. We never own the digital media (software), just a licence to use it on a given number of devices. This confusion over the ownership of digital media has meant that it has only very recently been possible for insurance companies to offer coverage not just for downloaded software (that never existed physically, yet cost $450 to use in the case of Max/MSP) but also for a person’s digital media collection. It is quite plausible for a person to own licences to over £1000 worth of media, considering that I have a collection of over 500 physical CDs and that one song costs 79p to download from Apple’s iTunes store.

DNA data has been subject to similar protection rights issues; indeed, attempts have been made to ‘own’ a sequence of DNA, although most attempts have failed due to the ubiquitous nature of genes (Engineering the Farm: The Social and Ethical Aspects of Agricultural Biotechnology Britt Bailey and Marc A. Lappe 2002: p.72). It is of immense importance to humankind that this information stays in the public domain because subtle differences in sequences of base pairs create disease resistant plants, animals that grow more quickly and humans that are more or less susceptible to disease; this last point is of particular importance to insurance companies.

All animal life is made from one cell (the fusion of one sperm and one egg). Inside this one cell are the instructions on how the cell will function and ultimately divide into two cells, each of which will then have its own complete set of instructions on how to function and divide yet again. Eventually a living organism will exist, made up of about 10,000 trillion cells (A Short History of Nearly Everything Bill Bryson 2004: p.450) which will each contain the full set of instructions on how to construct (by this method of division and growth) the same living being yet again. The process of organised and controlled cell division and differentiation is called growth. Uncontrolled it is called cancer. The instructions are the same in both cases.

This set of instructions is contained in DNA. The typical illustration of DNA (shown below) of the double helix represents the two parts that make up the list of instructions, which run in order along the double helix. It is these two strands that divide in two and each side goes into each of the two new cells(*1). Once in these two new cells it replicates the one strand into two and becomes a whole new cell (a complete double helix structure) ready for dividing again.

Joining the two sides of the double helix together are pairs of molecules, each molecule being one of four types (one attached to each side of the double helix). This pair of molecules is called a base pair, and each member of the pair is a nucleotide base. The instructions of how to construct an organism from just one cell are contained in the order and type of these base pairs joined to the backbone of the double helix (genes). That is, to observe the order of base pairs is to observe the genetic make up of a living being.

The four bases contained in DNA are Adenine, Guanine, Thymine and Cytosine, hence the shorthand AGTC. A will only bind to T, and C only to G(*2). The sequence of the base pairs determines what type of protein is produced and the shape of the protein molecule, which in turn goes on to establish how it acts in the cell. So we can see that the sequence of the base pairs determines directly how the cell operates and what it produces (Almost Like a Whale Steve Jones 1999: p.120).

The technological ability to extract and analyse this quantity of base pairs at a reasonable cost is very new (available since the end of 2007) and demonstrates the increasing rate at which we can gain insight into the natural world around us.

Incidentally, the 54,000 base pairs taken from each cow are not the first 54,000 of the sequence of billions. They are evenly distributed across all the chromosomes (like street lamps spread along a street). In the future, technology will allow us to extract 300,000 base pairs in a similar manner to adding in street lamps between already existing ones to observe the finer details of the street.

In animal breeding the best animals are mated together to theoretically produce the best offspring as a form of artificial selection (also referred to as selective breeding). This was first noted by Robert Bakewell (1725-1795) who is “usually regarded as the pioneer of livestock improvement as we know it” (Genetic Improvement of Cattle and Sheep Geoff Simm 1998: p.4). He spent time noting the best animals in his neighbourhood then followed the basic idea that ‘like begets like’ (Darwin) which itself draws on comparisons between animals favouring ‘good’ families. Of course for like to beget like there would need to be additional information passed on in the DNA from one generation to the next on top of the general blueprint of the making.

There are large similarities in the DNA (both in data and audio form) of the daughters and their mother, even though the mother’s DNA is only responsible for half of the genetic make up of each daughter (the father being responsible for the other half). The same striking similarity is present between each daughter and the prediction of the daughter, even though the prediction of the daughter is only based on the mother’s DNA. If the father’s DNA were available, the prediction could have been based on both contributors. So large similarities and small variations in DNA make up the huge selection of species present in nature, much like the large similarities and small variations in sequences of only twelve notes that make up the massive variety of melodies in Western music. This relationship between cause and effect (along with most of the technical aspects of DNA) is much better explained in the generously thorough The Blind Watchmaker by Richard Dawkins.

At the core of my Max/MSP patch is a sound engine based on very simple FM (frequency modulation) and AM (amplitude modulation) synthesis and the relative levels of these modulated sounds (dry/FM/AM). There are actually seven of these engines inside the patch, one for each of the cows (including the predicted cows). The sound engine receives data from the DNA file of its cow in the form of one of sixteen numbers every 10 milliseconds. Each of these sixteen numbers represents one of the sixteen possible combinations of the two bases recorded at each position in the DNA data. The DNA file is stepped through in order and the numbers one to sixteen are fed into a unique table for each DNA file. This table acts as a probability distribution for the random object that (through the use of a coll object) triggers the pitch of the sound engine. So as to cover a wide pitch range I mapped the sixteen combinations of base pairs to a four-octave major 7 #11 arpeggio.
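The table-driven pitch choice described above can be sketched outside Max/MSP. The following Python is only an illustration under stated assumptions, not the patch itself: the ordering of the sixteen combinations (A-A as 1, T-T as 16), the root note, and the particular voicing of the major 7 #11 arpeggio (four chord tones per octave, with the fifth omitted) are all guesses made for the example.

```python
import random
from collections import Counter

BASES = "ACGT"  # assumed ordering; the patch's own ordering is not stated

# Number the sixteen ordered combinations 1..16 (A-A = 1, ..., T-T = 16).
PAIR_TO_INDEX = {}
for first in BASES:
    for second in BASES:
        PAIR_TO_INDEX[first + second] = len(PAIR_TO_INDEX) + 1

# Hypothetical voicing: root, major third, sharp eleventh (folded into the
# octave) and major seventh, repeated over four octaves to give 16 pitches.
CHORD_TONES = [0, 4, 6, 11]
BASE_MIDI_NOTE = 36  # hypothetical root note (C2)
PITCHES = [BASE_MIDI_NOTE + 12 * octave + tone
           for octave in range(4) for tone in CHORD_TONES]


def encode(pairs):
    """Turn two-letter combinations such as 'AA' into numbers 1..16."""
    return [PAIR_TO_INDEX[pair] for pair in pairs]


def build_table(indices):
    """Count how often each number 1..16 occurs (the per-cow table)."""
    counts = Counter(indices)
    return [counts.get(i, 0) for i in range(1, 17)]


def choose_pitch(table, rng):
    """Pick a number 1..16 weighted by the table and return its pitch."""
    index = rng.choices(range(1, 17), weights=table, k=1)[0]
    return PITCHES[index - 1]
```

A table built from one cow's file with `build_table(encode(...))` would then be passed to `choose_pitch` once per step, so combinations that occur more often in that cow's data are heard more often as their assigned pitch.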

My first experiments in sounding the data involved attaching a pitch and an attack to every step of the sequence of base pairs, resulting in 54,000 notes taken from a range of sixteen pitches, all occurring with the same rhythmic interval. At eight notes per second (equivalent to 16th notes, or semiquavers, at 120 bpm) each pitch was detectable to the ear and nice harmonic combinations would become apparent between the cows/speakers (whether the 16 notes were of a scale or chromatic). These would manifest spatially, as different pitch interval relationships would appear in different areas between the clashing speakers (whenever two or more speakers played the same pitch the sound would appear to originate from in between the given speakers, similar to ‘big mono’).

One of the problems with this arrangement was that it would have taken 1 hour, 52 minutes and 30 seconds to get through the 54,000 steps. On top of this, aesthetically, it sounded very similar all the way through the running time as each cow just sounded one of sixteen pitches. I also tried a similar setup, stepping through the data at a much faster rate, somewhere near 32 notes per second. At this speed, many of the same attributes existed; to complete the 54,000 steps still took a considerable amount of time, but now the pitch of each step was mostly indistinguishable and the overall sound was close to noise. In either of these arrangements it would have been reasonable for a listener to gather all they could from just observing a random two or three minute segment.
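The running times above follow directly from dividing the number of steps by the step rate. As a quick check, here is a small Python helper (the function name is invented for this example):

```python
def running_time(steps, steps_per_second):
    """Return (hours, minutes, seconds) needed to play every step once."""
    total_seconds = steps / steps_per_second
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return int(hours), int(minutes), seconds


running_time(54_000, 8)    # (1, 52, 30.0)
running_time(54_000, 32)   # (0, 28, 7.5)
running_time(54_000, 100)  # (0, 9, 0.0)
```

The last call assumes one step every 10 milliseconds, the rate given for the sound engine, and yields nine minutes.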

So as to get away from the minute details in the data and move toward the surprisingly subtle differences in the overall architecture of each cow’s DNA(*3), all further sound manipulation is the result of groups of base pairs present over a period of time.

For example, a parameter of the sound engine (say for this explanation, the FM depth) will measure how many milliseconds it takes to receive two hundred occurrences of the first combination of the two base pairs (A-A). Once it has counted two hundred of those base pairs the time taken is reported, scaled to a usable number and applied as the new FM depth, and the count starts again. A very similar mapping applies to all aspects of the sound machines, each changeable attribute counting a different number of base pairs (ranging from fifty to five hundred) and a different combination of base pairs.
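As a rough illustration of this count-and-time mapping, the Python below measures how long it takes to see a given number of occurrences of one combination and scales the result to a parameter range. It is a sketch under assumptions rather than a transcription of the patch: the class and function names, the clamped linear scaling and the input/output ranges are all hypothetical, and the 10 millisecond step is taken from the description of the sound engine.

```python
class IntervalCounter:
    """Times how long it takes to receive `threshold` copies of one number."""

    def __init__(self, target_index, threshold, step_ms=10):
        self.target_index = target_index  # e.g. 1 for the A-A combination
        self.threshold = threshold        # e.g. 200 occurrences
        self.step_ms = step_ms            # time between incoming numbers
        self.count = 0
        self.elapsed_ms = 0

    def step(self, index):
        """Feed one number; return elapsed ms when the threshold is hit."""
        self.elapsed_ms += self.step_ms
        if index == self.target_index:
            self.count += 1
            if self.count == self.threshold:
                elapsed = self.elapsed_ms
                self.count = 0        # start counting the next group
                self.elapsed_ms = 0
                return elapsed
        return None


def scale(value, in_low, in_high, out_low, out_high):
    """Clamp `value` to the input range and map it linearly to the output."""
    value = max(in_low, min(in_high, value))
    fraction = (value - in_low) / (in_high - in_low)
    return out_low + fraction * (out_high - out_low)
```

Each controllable attribute would own one counter with its own target combination and threshold; whenever `step` returns a time, that time is passed through `scale` and applied as the new parameter value.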

Upon opening the master patch the mother’s DNA is automatically loaded into a table which acts as a probability distribution map for three random number generators (with range 1 to 16) that feed the remaining three sound machines, representing the three predictions of what each daughter’s DNA will sound like. This simple method offers an unbiased resolution, not dependent on chemistry or biology, yet still with a foundation in the mother, and so if one were not able to consider this mutant creation a prediction of the mother’s daughter, it could at least be considered a further new daughter (and a bastard daughter at that).
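The prediction step amounts to drawing random numbers whose frequencies follow the mother's data. A minimal Python sketch of that idea follows; the function name and the seed parameter are inventions for the example, and the real patch generates its numbers live rather than as whole lists.

```python
import random
from collections import Counter


def predict_daughters(mother_indices, n_daughters=3, seed=None):
    """Draw n sequences of numbers 1..16 weighted by the mother's counts."""
    counts = Counter(mother_indices)
    values = list(range(1, 17))
    weights = [counts.get(value, 0) for value in values]
    rng = random.Random(seed)
    length = len(mother_indices)
    # Each predicted daughter is an independent weighted random sequence
    # of the same length as the mother's data.
    return [rng.choices(values, weights=weights, k=length)
            for _ in range(n_daughters)]
```

Because only the overall proportions of the sixteen numbers are carried over, a predicted sequence shares the mother's distribution but not the order of her base pairs.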

My choice of purely synthetic sounds is founded on the notion that although the inner parts of a living body may make noise (as everything vibrates), this DNA data is a step removed from the real physical world and is simply a list of instructions on how to create the real and physical, similar to a life score. “Sounds are inaudible usually because they are small, they take place where we cannot hear, or we cannot hear them unaided.” (Noise, Water, Meat: A History of Sound in the Arts Douglas Kahn 2001: p.201). Representing something aurally that is not just inaudible to the human ear, but that is just inaudible period (data as data) takes us toward either a realisation of the unreal or back to a form of the real in which the unreal has been derived (Baudrillard). This also touches on Frayn’s idea of information ‘traffic’ in the amazing The Human Touch: Our Part in the Creation of a Universe.

Synthesised sounds are constructed by a method similar to biological development from a genetic code: a set of simple generic data organised into an appropriate and manageable arrangement that can be applied to waves and inform them (and their modifiers) on how to operate, thus creating physical movements in air from digital information (that never once existed in the real world, but was generated digitally).

The context of this work lies in the extreme notion of a digital life, or a digital representation of life. This notion seemed difficult to grasp for people introduced later in life to virtual reality, owning (a licence for) digital media, social networking websites and advanced artificial intelligence in video games. Yet for anyone born after circa 1996, these ideas form the foundations of their acceptance of a very small and blurred boundary between the real and the unreal, the physical and the virtual. In the work’s current form, it is most suitable for fixed listening of some sort. With the inclusion of n more sets of DNA files (each a mother and her three daughters), the Max/MSP patch could be installed in a gallery and could run through the DNA of family after family. It could even be linked to a network so as to update its database of DNA when more became available.

Listening to the piece in a space at least 5 metres squared at a volume you could just about talk over allows the listener to move around the speaker array and observe overlapping aural arenas of each cow without too much interference from the room itself (listening in an acoustically controlled studio is helpful). This size of room and volume is preferred to limit (but not destroy) the overlapping of these aural arenas and to preserve the mono quality of each speaker. It is suggested that the listener begin by familiarising themselves with the sound of the mother, followed by her three daughters and lastly the sound of the three predicted daughters by standing in close proximity to the relevant speaker. The listener might then explore the overlap that should occur in a vertical line (shown on the speaker configuration diagram as the interphonic knot (The Sonic Composition of the City in The Auditory Culture Reader Jean-Paul Thibaud 2003: p.335)) through the middle of the speakers. Moving in and around this line is perhaps the most interesting activity as the relative differences (and extreme similarities) are not only most apparent here, but can also be changed dynamically by moving around the space.

The Max/MSP patch is laid out in the same way as the speaker array with outputs 1-7 matching up with the numbered speakers in the SPEAKER CONFIGURATION diagram below. Pressing space bar will start the sequence and a silence of anywhere up to thirty seconds can be expected before any audio is heard. Additionally there is a bar at the top of the patch that charts the progression of the nine minutes; the blue LEDs in the centre will also go out when the piece is finished.


I would like to thank the Scottish Agricultural College (SAC) for the use of DNA data from four anonymous cows from their Langhill lines of dairy cows.


*1: The process of cell division is called mitosis where the DNA strands separate, move to opposite ends of the cell and the cell membrane closes around each end, making two new cells.

*2: Upon inspecting the included data of the 54,000 base pairs taken from a mother and her three daughters it may seem confusing that A appears able to bind with C, G and T and indeed, that all sixteen combinations of bases appear to be present. This data actually represents 54,000 base pairs taken from two chromosomes (in the text document: one chromosome is on the left, the other chromosome on the right). Since A will only bind with T (and C only with G), the data only requires one letter per base pair as we always know what the other letter will be.

*3: Humans share about 95% of their genes with fish, demonstrating that very small changes in a DNA sequence produce large variation in the final organism. Animals have about three billion base pairs and may differ between each other at only one million of these. However, this small group of differing genes may govern how the remaining shared genes operate. The genes are not immune; they can be transformed by the mother’s uterus and be passed on to offspring in an altered fashion. If that alteration is not favourable then death is inevitable, but out of all successful pregnancies comes variation, and improvement, hence evolution.