I decided to drop my study. It took me six months of dithering and one nervous breakdown to decide. Maybe I can pick it up again, unofficially, in future, but for now I’d like to stop killing myself trying to do the equivalent of three full-time jobs. Well-meaning people have tried to talk me out of it. One mentioned how selfish I was to not publish. On the contrary – I’ll try to snip out some parts and run them here.
Keep in mind that these bits are both unfinished and carved out of a much longer document, but they should be of some interest.
The dictionary definition of music includes its emotional expressiveness, immediately demanding a definition of emotions. These are difficult to define; sometimes measurable, sometimes culturally based, mostly accounted for in individual reports. I need to form some operational definition of emotions before I can make any progress on their place in music.
Emotion is contingent on an event. If a lion were to chase me up and down the campus, I would show evidence of strong emotion in measurable physiological changes – rapid breathing, heart rate and so on. If many years later I was reminded of this episode I might again show measurable symptoms of fear, based on my recall. So an emotion can be seen as a process – a physiological response to a situation, or even recall of that situation.
But I may feel pride at having outrun the lion, or bereavement at having lost a leg. It is possible to report a feeling of pride and have no measurable symptoms of it, which may be described as holding an emotional state, not immediately responding to a current event.
Some viewpoints in the study of emotion can be quickly summarised.
The materialistic view is that emotions are part of a ‘folk psychology’ that will one day be better explained as specific operations of a neural network (for example Dennett). Until that day comes, I feel we had best work with what is known. Another ‘hard science’ viewpoint, following on from William James, is that physiological changes are the emotion itself. Fear is the name we give to the changes of heart rate, breathing etc. which come about by the automatic response of what he calls ‘neural machinery’. An opposing view from Roseman and others is that an appraisal is always required. If someone spills a drink on you by accident, you will be more forgiving than if it was intended, and the extent of that forgiveness will vary between personality types. It has been shown that emotions are felt and expressed in different ways by different people in different cultures; the various cultural mechanisms of grief, for example, have been studied extensively.
The appraisal and neural viewpoints are compatible. It is reasonable to say that emotions are held in physiological responses, which are modulated in intensity and duration by the attention given to them by the individual. The degree of attention can be individual, contextual and cultural. Using the terms we saw before, that means the process maintains the state, in which case music could act to direct our attention to physiological affects.
Emotions would seem to have an adaptive purpose. Fear motivates flight from danger, and sadness guides us to act to avoid more sadness. But bereavement, for example, may offer no realistic learning outcome – I cannot grow back the leg taken by a lion. I might instead feel empathy on seeing a one-legged person, which causes me to avoid lions. Empathy seems like a mechanism by which music could convey emotions, but for whom do we feel this empathy?
A literature review by Juslin and Västfjäll (2008) found that over 1400 papers mentioned a connection between music and emotion, but only one outlined an empirical mechanism. In response they suggest six simultaneous pathways:
- The acoustic signal is taken as an urgent event – such as discordant sounds being taken as threats.
- Features of the music such as the pacing or the vocal quality of some instrumentation emulate human emotional communication.
- The music triggers a conditioned emotional reflex, involving no recall.
- The music is already associated with an episodic memory and recalls it.
- The sounds evoke visual imagery, which then maps to narrative.
- There is an exception to an expected pattern, which is socially determined.
I’ve re-organised the list to pair up three main paths that become evident in studies that follow – features of the acoustic signal itself, memory and recall, and comparison with narrative structure.
Matravers (2011) puts the question simply – a statement such as ‘the music is sad’ is unclear. Does it mean that the music sounds sad, or that it makes one feel sad? Where is the emotion stored? He outlines the three main positions. The first is that tone, rhythm, progression etc. provide ‘tokens’ that we accept as representing emotion. Fast paced music might for example represent urgency, and discord a threat to an expected pattern (as with atonal music in the horror film soundtrack).
A second view is that music contains no emotion itself, but by presenting a flow of agreement and conflict (changes of timbre, harmony and disharmony) arouses sequential mental states that constitute a ‘terrain’ of feeling. In that case the listener’s imagination is prompted to feelings akin to those from hearing a dramatic storyline, perhaps to the point of visualising such a story, or relating it to personal experience. Opera is overt in this storytelling function, but even an exercise by Bach has an implied progression of causal events that can be storified.
The more difficult view is that music prompts the listener’s empathy with a musical persona (cf. also Robinson and Hatten 2012). A solo violin may not only have a vocal quality (a token), but by this recognition be perceived as carrying an emotion usually ascribed to another person. It is this persona that moves through a range of emotions in a musical piece and we feel empathy for it. The listener personifies the music. This is reasonable given that we are quicker to identify emotion by the qualities of the vocalisation than the words themselves (Pell et al. 2015).
Each of these views has problems better handled by the others. A cymbal crash is startling and urgent, but it will have different contexts in fearful and exultant musical passages – the tokens don’t exist in isolation. We don’t experience music as disassociated sound and an additional internally generated terrain of feelings. In film music for example, we fuse the music with events on screen, rather than construct non-diegetic emotional subtitles. But a persona, properly being an aspect of personality, demands a reason why music would be recognised by the brain as such.
As with theories of the emotions, there’s room for all of these views to collaborate – music involves empathy for a persona, characterised and embodied by sound events, animated through changes of emotion that are interpreted as causal chains similar to those that we find in narratives. But why should the brain recognise music in this complex way? What is the advantage?
Narrative as a mnemonic device
Narrative structure plays a part in episodic memory, being the long-term recall of life events (as opposed to short-term memory – a phone number, and categorical memory – that ‘Madrid is the capital of Spain’). Memories from one’s childhood, or a dispute at yesterday’s meeting, are both examples of episodic memory, which involves a combination of “who, what, when, where” along with the emotional context. It is the emotion that marks the episode as worth remembering – bad emotions are guides to avoiding such events in future, while good emotions encourage behaviour that brings more of the same.
The narrative faculty of the brain is used both to discard information when storing memory and to flesh out the shorthand of retrieved information. Narrative acts as a kind of ‘lossy codec’ of life experience. When storing experience, any commonplace aspect lacking instructional value can be left out. In retrieval we manufacture details that bridge the stored impressions in our imperfect recall (Martindale 1981 p343). In experiments with electrical stimulation of the brain, subjects describe convincing sight and sound recall of past experience, but their descriptions include impossible elements such as seeing themselves in the third person. This suggests that a narrative is assembled in the brain before the vision is reconstructed. Even when brain damage or dementia erases sections of memory, this narrative system still attempts to assemble memories as a false story in a symptom called confabulation.
Studies in dreaming indicate that we replay events of the day at high speed (in non-REM dreaming) and some more complex events at normal speed (in REM dreaming) with an associated high level of brain activity in the emotional centres – and reportedly mostly bad emotions. This recall has been shown in rats, which repeatedly reproduce the eye movements they had made in maze testing that day. It is still not clear whether dreaming is part of the condensation of memory, the erasure of unneeded experience, or even just an afterglow of the daytime workload. But it is clear from our own interrupted dreaming that the mind is forming causal chains for the improbable events thrown up in the dream function.
Why is narrative an effective means to condense experience? Aristotle summarised it in Poetics, describing a properly formed story as encompassing ‘an action that is complete, and whole, and of a certain magnitude’. The events taking place in a story run from cause to effect, with no earlier cause or later effect required for the audience to make sense of the narrative purpose. A well-formed story provides the least complex causal chain of events, and is efficient storage.
The value attached to the event chain is described in the associated emotion. The emotions we feel from reading, hearing or watching stories, and our selective recall of them over long periods of time indicate that these are valued in the same manner as personal experience. That music evokes emotions, is strongly associated with past events, and that we prefer music from particular periods of our life are all evidence that hearing it involves episodic memory, in which case it is (perhaps as a false positive) condensed as a narrative structure.
The Persona as an animated caricature of emotion
In narrative film there are traditions for designing engagement with the audience (e.g. McKee 1997 p135, Field 2005 p20). We are shown a protagonist in an unsatisfactory situation. They begin a process of self-development, which leads to antagonism and conflict, which needs to be overcome. At the end of this journey the protagonist is shown elevated in their position, and wiser to the workings of the world. They earn this wisdom even in tragedy, in which they lose the conflict. The simplest template of this narrative structure is the fairy tale, where a lowly, virtuous character defeats an evil foe to be able to marry their prince or princess and live ‘happily ever after’ (Bettelheim 1976 p35).
From Joseph Campbell we have the suggestion of monomyths that have appeared in different cultures around the world, being universal mythic tales providing both explanations for the operation of the universe and moral guidance for living in society and nature. He found, for example, many variants on the death and resurrection story at the heart of Christianity. There is supportive evidence from other methodologies such as phylogenetic studies of folkloric elements (Graça da Silva & Tehrani 2016). The monomyth suggests that the human mind is attuned to a particular narrative form, involving an individual’s life journey and struggle.
Carl Jung named this shared preference the collective unconscious, in which there are found symbolic actors or archetypes such as the wise woman and the trickster that enact specific roles in myths and tales (Fordham 1961 p47). He described this as a genetic memory, although subsequently we have preferred to see these universal symbols as cultural rather than biological (Richerson & Boyd 2010). It is through Jung that we use the psychological term persona, which originally meant the caricatured masks worn by stage actors. In the context of this discussion, music can be compared to a series of masks that convey emotion – or perhaps a single animated mask.
The persona identifies a character class, with a function (much as ‘a judge’ can be anyone dressed in the correct robes and wig). The identity of the actor behind the persona is secondary to their function in the narrative structure. Another way of saying this is – given the chance, the audience will assign the person of the actor to their role (dare I call this ‘the James Bond effect’?)
But do we have to provide a ‘somebody’ behind the mask? Does this persona have to be based on a realistic image of a person? An interesting finding from animation is that an audience will more likely empathise with a simple caricature than a naturalistic human. A character such as Wall-E is simply a cube on which are stacked cylinders and triangles. This is more effective than the complex humans that populate ‘realistic’ animated films such as The Spirits Within. One reason appears to be that the exaggerated facial expressions and movements these shapes provide fill more of our visual field. Another is that ‘realistic’ animated characters leave less blank space onto which the audience can project themselves. This may also be why some human actors, particularly comedians, disguise themselves with simplified high-contrast faces and costumes.
There seems to be no lower complexity limit for empathy, as characters such as those in the 1965 short film The Dot and the Line made clear. As long as we are able to identify a narrative, we will apply motives even to fundamental shapes. Personification was at one time described (Piaget) as a primitive or childish means of understanding the animation of objects. It has more recently come to be seen as part of reasoning by analogy, in the situation where one’s own consciousness is the best available model for a comparison (Inagaki & Hatano 1987). At least in terms of our consumption of art, if it walks like a duck and quacks like a duck – it’s probably an animated duck.
So in an abstract video there need be no identifiable person visible to convey emotion. Instead the features of the image – colour, form and the contrasts of harmony and discord – may provide a sequence of events sufficient to be recognised as a narrative, involving a persona, for which the audience feels empathy. For example, when music is sad, it provides cues in its pacing, timbre, texture and so on that have the tokens of sadness. Rather than specify all of these factors, we shorthand our experience in the analogy that the music is itself an entity, for which we feel sad.
If we accept that an artwork can hold a representative character, it may be possible to measure and then control it. We talk of ‘a sad melody’ or ‘a brooding image’ for example, based on the feelings that they evoke. These terms could be translated into parameters that allow the notation of feelings within the framework of an individual score for visual music. To be clear, the measurement is not a scientific result in the real world, but rather the use of a well-defined rule within the internal logic of an artwork. In a later section of this chapter I’ll come back to the comparison between this internal logic and the metaphysical and spiritual spaces of previous visual music.
At what level should the control take place? Two of the pathways in Juslin and Västfjäll’s list describe designed actions in the music – urgent events and disruptions to an expected pattern. Should a notation specify the content of these events? For the aims of this project it is better to differentiate between the score and the performance, or to use film terms, between the directing and the acting. The arrangement and orchestration of the work is left up to the individual performance of the score; the event could be indicated by a rapid change of colour, by cutting between discordant images, or by the motion. What appears in the score is the intent that a rapid change of mood takes place at this point. The correct level for the notation is in directing the emotion of the persona over time.
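To make the score-versus-performance split a little more concrete, here is a minimal sketch of how a score that only directs the persona's mood over time might be written down. Everything in it is an assumption of mine for illustration: the names `MoodCue` and `mood_at`, the idea of a 0 to 1 intensity, and the example moods are hypothetical, not part of any existing notation.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class MoodCue:
    """One entry in a hypothetical score: the persona's intended mood from time_s onward."""
    time_s: float     # when the mood change takes place, in seconds
    mood: str         # label for the intended emotion, e.g. "calm"
    intensity: float  # assumed scale: 0.0 (barely present) to 1.0 (overwhelming)


def mood_at(score, t):
    """Return the cue in force at time t: the latest cue starting at or before t.

    Assumes `score` is sorted by time_s. How the mood is realised
    (colour, cutting, motion) is deliberately left to the performance.
    """
    times = [cue.time_s for cue in score]
    i = bisect_right(times, t)
    if i == 0:
        raise ValueError("no cue has started yet at this time")
    return score[i - 1]


# An invented three-cue score, with a rapid change of mood at 12 seconds.
score = [
    MoodCue(0.0, "calm", 0.3),
    MoodCue(12.0, "fearful", 0.9),
    MoodCue(20.0, "relieved", 0.5),
]

print(mood_at(score, 13.0).mood)
```

The score says nothing about colour or cutting; a performer would query it for the current mood and decide how to act it out.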
Psychology offers competing ‘deep’ and ‘shallow’ theories for the description of personality. The ‘depth’ systems such as those put forward by Freud and Jung do not rest with the apparent emotion, but go on to hypothesise hidden origins for it. I only really need to notate that an image is intended to be ‘angry’ – the artist and audience will find their own narrative for it. That’s in keeping with a piece of music having meaning for the listener’s own life experience.
The so-called ‘shallow’ theories such as Behaviourism and Individual Psychology only deal with what is visible. The latter is particularly concerned with taking inventories, and categorising attributes, which is ideal for this task.
The next problem is that thousands of terms for emotions exist – a person can be ‘angry’, ‘irate’, ‘furious’ and so on. These can all potentially be controls on an interface, and need to be somehow reduced to the smallest viable set. It’s inevitable that we will do some violence to the language, and we need to have a good reason to do so. Fortunately much work on terminology has already been done in Individual Psychology, and rather than having to invent some new and idiosyncratic field of knowledge, as did the visual musicians of the previous chapter, we can borrow from a well-documented practice, with all the justification that brings.
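As a rough sketch of what that reduction could look like in practice, many emotion words can be folded into a handful of controls with a simple lookup. The grouping below is invented purely for illustration (only ‘angry’, ‘irate’ and ‘furious’ come from the text above); it is not taken from any published inventory, and the names `FAMILIES`, `build_lookup` and `to_control` are hypothetical.

```python
# Illustrative grouping only: these families are made up for the example,
# not drawn from any psychological inventory.
FAMILIES = {
    "anger": {"angry", "irate", "furious"},
    "sadness": {"sad", "sorrowful", "mournful"},
}


def build_lookup(families):
    """Invert {control: terms} into {term: control}, rejecting ambiguous terms."""
    lookup = {}
    for control, terms in families.items():
        for term in terms:
            if term in lookup:
                raise ValueError(f"term {term!r} is assigned to two controls")
            lookup[term] = control
    return lookup


def to_control(term, lookup):
    """Map an emotion word to its control, or None if the word is not covered."""
    return lookup.get(term.strip().lower())


lookup = build_lookup(FAMILIES)
print(to_control("Furious", lookup))  # anger
print(to_control("wistful", lookup))  # None
```

Words that return `None` are where the violence to the language happens: each one either has to be assigned to an existing control or justify a new one.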