Composition of Vocal Music. Against Primacy of Content

PhD candidate, Department of Philosophy, History and Art Studies, University of Helsinki

Rebecka Sofia Ahvenniemi (b. 1982), originally from Finland, was educated in composition and philosophy in Bergen, Berlin, New York and Helsinki. She finished her master’s degree in composition at the Grieg Academy, where she studied with Morten Eide Pedersen. She is currently a PhD candidate in philosophical aesthetics at the University of Helsinki. Her joint disciplines of music and philosophy are expressed in her compositions, essay writing, research, and her teaching. These joint interests have led her to consider issues of cultural politics: she was a board member of the Norwegian Society of Composers from 2013 to 2018, and today she is a member of Kulturrådet.

This article critically examines the common idea that language itself carries pure content, while the acoustic or musical dimensions are something additional. According to philosopher Vibeke Tellmann, a ‘gap’ has emerged between music and language, especially where the formalist approach to music, deriving from 19th century romanticism, is concerned. Building further on the idea of this gap, five ‘myths’, in the sense of established but possibly false ideas about the relationship between music and language, are discussed in this article. The aim is to investigate the issue specifically from the perspective of composition of vocal music. The emphasis lies on theories that focus on timbral, embodied, and cultural aspects of performed language, such as the phenomenological concept of ‘dramaturgical voice’, presented by Don Ihde, and the ‘grain’ of the voice, as Roland Barthes describes it. A central question is whether language can ever be denied its musical or dramaturgical axis as constitutive of its meaning.

Keywords: Composition, vocal music, formalism, dramaturgy, embodiment


The purpose of this article is to critically examine the common idea that language itself carries pure content, while the acoustic or musical dimensions are something additional. This approach to language typically assumes the content to exist as such, while the speaking or singing voice, or other dramaturgical aspects, are considered something extrinsic: either pure sonic material or an added expressive layer on top of the content. While this article may dispute the primacy of content, the claim is not that something else comes first and content second, but that ‘pure content’, in an essentialist sense, might not exist.

The idea of language having content independent of form implies that the embodied or written dimension of the word functions as an expression of an immaterial thought. This approach to language might have had its culmination during Logical Positivism, which posited an understanding of language as a tool for expressing logical and scientific facts in an unambiguous way. Today it appears to be the point of departure in several disciplines, such as formal semantics and formal logic, but it is also deeply rooted in the prevailing ways in which language is approached.

The subject will be discussed specifically from the perspective of composition of vocal music.1 Basing the discussions on the many situations and challenges emerging from how one chooses to treat language or vocal expression in a musical context offers the advantage of observing language ‘in the making’. I will engage with the argument, presented by philosopher Vibeke Tellmann (2017), that a ‘gap’ has emerged between music and language, especially where the formalist approach to music, deriving from 19th century romanticism, is concerned.2 This gap implies that the content of music is purely musical, located in the internal relationships of tone and harmony, while the lyrics of vocal music function as an extra-musical element.

While the relationship between language and music has been widely analysed in the philosophy of music, asking, for example, whether music could be its own kind of language, the discussion appears to presuppose a specific idea of both language and music in the first place that rarely challenges the dualist point of departure. On the other hand, in the field of composition the relationship between language and music is not necessarily considered problematic. The ‘gap’ may manifest itself as apparent freedom in regard to how music and language could be linked together. In addition, there are few established ways for discussing the relationship without the ‘content of the text’ and ‘music’ appearing as fundamentally separate axes.3

Following in the footsteps of Susan Sontag’s critique presented in her essay ‘Against Interpretation’ (1964), I will begin by examining the consequences of the privileged position of the ‘content-oriented mind’ in discussions about art. The following five subsections will each deal with a specific compositional issue, here referred to as ‘myths’ in the sense of established but possibly false ideas, and observe how each of them resonates with approaches to language. The emphasis will be on theories that focus on timbral, embodied, and cultural aspects of performed language, such as the phenomenology of Don Ihde (2007) and Maurice Merleau-Ponty (1965, 1988), the ‘grain’ of the voice, as Roland Barthes describes it (1972), and the performative aspect of language as understood in the thinking of Judith Butler (1993). The discussion in this article will also benefit from perspectives from Theodor W. Adorno, combined with the compositional thinking of Helmut Lachenmann, as representatives of the critical school. The critique of the existence of an isolated meaning of a text of vocal music concerns language on an epistemological level. At root, the issue is to examine the simultaneous power and vulnerability of works that engage music and language in ways that do not support a dualist division between them.

Fear of the form

A central challenge when discussing works of vocal music – which might reveal something fundamental to the issue – is finding a common ground for discussing language in a theoretical and musical context. As we read, write and speak, we already comprehend the language and are oriented towards the content. As I write words and sentences that are tied together to ‘deliver’ meaningful ideas, my writing already appears to support a specific privileging of content. The intention of the text seems fixed.

This paints a picture of a power hierarchy at work in the many situations where the content of a work of vocal music – whether a dramatic work, such as an opera, or another form of vocally performed music – is discussed. What may lie behind this is the assumption that an abstract meaning could be extracted from the work, that the work is about something. To frame the issue in Susan Sontag’s terms: ‘It is still assumed that a work of art is its content. Or, as it's usually put today, that a work of art by definition says something’ (1964, 4). In Sontag’s critique, the expectation of art expressing content concerns art in general. Sontag’s is an important voice in this discussion, as she bans ‘interpretation’ when it merely attempts to abstract a message from the artwork. She describes the Western tradition of interpretation as having its roots in the need to defend, explain and legitimise art, making art a product of mind. Sontag points out that what is needed is more attention to form in art; a descriptive rather than a prescriptive vocabulary (1964, 12). Instead of turning art into ‘thought’, making practice fit interpretation, we should rediscover our senses.

When applying Sontag’s critique of the ‘aboutness’ of a work of art to language in vocal music, my main argument is that the content-oriented approach impoverishes our understanding of the ways in which language comes to mean something in a work of vocal music. Could we look through the aspect of ‘content’ of language and discover elements such as embodied expression and social convention as constitutive of its meaning? For Sontag, ‘form’ refers to the sensuous quality of the work of art, and performs as a counter-expression to ‘abstract content’. Sontag’s approach should not be understood as concerning our senses in a purely empirical way, but rather as a call to pay more attention to the ‘depth’ of the material. When discussing the ‘form’ of language in the context of vocal music, Sontag's definition of form is pertinent, as it is charged with the suggestion that form may be able to ‘make us nervous’, performing as a negation of the established, the ‘safe’. Sontag writes: ‘By reducing the work of art to its content and then interpreting that, one tames the work of art. Interpretation makes art manageable, conformable’ (1964, 8). However, I will extend Sontag’s concept of form to cover the social implications of materials: the way in which they interact with the cultural context that gave birth to them in the first place.

In terms of art, it is a truism that the ‘form’ is not irrelevant to the content but performs as a part of it (Eisner 1981, 4). According to Adorno, the difficulty of ‘getting a grasp on it is in part due to the entwinement of all aesthetic form with content’ (1997 [1970], 185). In the following discussions, my aim is to study critically the dualist paradigm of form and content specifically from the viewpoint of musical composition. I will begin from the heteronomous situations and problems that arise when confronted with questions concerning vocal music, and observe how these questions reflect back on a content-centred understanding of language.

Myth I: Music adds a sonic layer to language

One of the consequences of the content of language being regarded as detached from its form is that ‘voice’ comes to function merely as an acoustic element. The phonetic or timbral aspect of articulated words comes to exist as a means of conveying the content of language. What a text says is approached from the outside, as already established, while the timbral aspect of words becomes extrinsic to their meaning. One particular way this may impoverish the process of composition is by reducing voice – along with instrumental sounds – to mere sonic material. This approach appears to be supported by the scientific paradigm of our time, where acoustics are often thought to form a basis for music. The timbral quality can further be analysed in detail as sound spectra.

Composer Pierre Boulez can be mentioned as a critical voice in the discussion of musical timbre. He describes the need to find a basis for discussing music in a way that reduces it neither to the sonic nor to individual experience. Boulez writes: ‘Even when one deals with the perception of sound phenomena and their quality, it is mostly a question of perception in isolation, exempt from any context. I feel that the truly artistic value of timbre is fundamentally forgotten using this approach’ (1987, 161). Boulez suggests that ‘timbre’ always refers to some stylistic criteria of compositional and instrumental tradition. For Boulez, timbre is not merely rooted in the quantitative qualities of a sound, but perceived through its context. However, today acoustic frameworks appear to offer some type of ‘stability’, a scientifically proved reference point. It is today seldom questioned that music concerns ‘sound’, ‘sound waves’, ‘acoustics’, or ‘sonic processes’. Boulez' concern with this quantitative approach to timbre appears to correlate with the matter of ‘sonification’ under discussion.

A relevant question, ‘[w]hen did music become an art for the ears?’, was presented by Lydia Goehr at a conference in Bergen (Ahvenniemi 2019).4 In Antiquity, music was performed, Goehr explains. Specific senses were not assigned to specific forms of art to begin with. The tendency to do so might have its roots in the formalist view that arose during the Romantic period. Musical formalism came to be associated with the expression or notion of the ‘absolute’, possessing meaning intrinsically; not in extramusical elements or purposes. Formalism in itself does not demand that music is treated ‘sonically’. However, the formalist approach of Eduard Hanslick, for example, which Goehr exemplifies in her work The Quest for Voice (1998) as empirical and objective, may have sown a seed of a positivistic approach to music (ibid., 13–14). Hanslick stands for the idea that the only thing tone materials express is the music itself: ‘[I]t is an end in itself, and it is in no way primarily a medium or material for the representation of feelings or conceptions’ (1986 [1854], 28). Goehr describes two trains of thought in aesthetic theory during the Romantic era, one being metaphysical, encompassing an upward move from concrete and empirical meaning to abstract and universal meaning, the other being formal, capturing the inward move, accommodating the work’s desired ‘unity of form and content’. Both regard music as pure. Goehr views Hanslick’s thinking as representing the latter, describing this tendency as ‘more empirical and gradually positivistic’ (1998, 14). It is not unreasonable to consider the modern tendency of sonification as a continuation of this development. When it comes to language, as Roger Lustig points out, formalism often results in ‘the belief that words, instead of being an essential component of a piece of music, are either irrelevant to or even distracting from its meaning [...]’ (1989 [1928], vii). The text is regarded as excessive to the musical content.

In her doctoral thesis, Vibeke Tellmann calls this ‘the romantic turn’, music and language coming to be regarded as radically different from each other (2017, 2). Tellmann depicts a shift from the Baroque era, when rhetoric was applied as a point of departure for the composition and analysis of music, this relationship being regarded as natural, until the Romantic era, when it came to be seen as problematic. Even though 20th century theorists have concerned themselves with contextual critique of music, the formalist approach still seems paradigmatic in the sense that the emphasis of musical discourse often lies on the internal relationships within harmonic material. ‘Music is music is only music’, to quote Susan McClary’s similar critique of this issue (1991, 116).

In compositional work, the split between the acoustic dimension and content of language is sometimes revealed through the composer’s ‘not really knowing what to do’ with the specific, embodied quality of voice in vocal music. One may falsely assume that by avoiding engagement with the voice, treating it as a producer of sound vibrations – as context-free sound – this quality will magically disappear. The voice may be treated as an instrument, with few notational specifications that concern singing. However, one may be surprised how much the interpretive space around the voice and style of singing contributes to the work. For example, in Maurice Ravel's Vocalise-étude en forme de Habanera (1907), the voice sings a melody with one vowel, running virtuosically along the scale, perhaps resembling a romantic flute or harp. Even though the Vocalise is performed without text, what one hears is not simply sound as such. Rather, one hears that ‘someone sings without a text’. A voice evokes specific expectations that an instrument does not. In addition, one may hear that the singing is light and virtuosically performed – almost instrument-like. The expression is thus interpreted against a cultural background.

Another strategy of avoidance is trying to gain ‘total control’ of the sonic aspect of the voice by over-prescribing timbral attributes, such as the width or speed of vibrato, in ways that are poorly suited to the art of singing. On a string instrument, it is simple to perform poco vibrato, mezzo vibrato and molto vibrato, whereas these might present challenges to a vocalist, the singing being embodied in a peculiar resonance space. Another example is large interval gaps, combined with microtonality, if performed often and quickly. When employed without specific intentions other than that of treating voice as a ‘communicator of pitch’, this may show estrangement from vocal tradition.

‘Sonification’ of the spoken or sung word has several shortcomings, one of which is failing to see the way in which sounds are culturally coded and interpreted. This duality of sound and meaning is well expressed in the phenomenological thinking of Don Ihde: ‘The philosopher, concerned with comprehensiveness, must eventually call for attention to the word as soundful. On the other side, the sciences that attend to the soundful, from phonetics to acoustics, do so as if the sound were bare and empty of significance in a physics of the soundful’ (2007, 4). In vocal music, this means that there is no way around engaging with the expression as an embodied and cultural expression, from within the interpretive spaces of the culture it is performed in. Even when nonsensical language or other embodied expressions are employed, one does not primarily hear noises, but words or sounds that do not make sense. Nonsense in the sense of pure randomness of sound may not be possible to create at all.

Although sonification, as a reductive, scientific approach, may be seductive, it offers a false promise of objectivity.5

Myth II: Music adds an emotional layer to language

The qualities of the voice that come across as emotional might be seen to sit awkwardly in compositional work, and the way in which they do so could reveal something about the status of the relationship between language and music today. The expressivity of the voice exhibits qualities that seem to be tied to personal expression. Even though these qualities are difficult to pin down, and this thread might be an elusive one to follow, it is a useful path to explore.

The previously mentioned view of Hanslick creates a context for this subject. Hanslick makes the claim that the way in which music may arouse certain feelings ‘depends entirely upon the circumstances of each particular instance’ (1986 [1854], 7). His way of locating the content of music in music itself functions as an attempt to ‘save’ music from the contingency of emotion and historical change. According to Hanslick, music has its content neither in the arousal of ‘delicate feelings’ (ibid., 5), nor the ‘trembling of [the composer's] soul’ (ibid., 33) but in its harmonic relationships and development.6

The relationship between music and emotion has been explored extensively in 20th century music philosophy, often in an attempt to attach a value to the expressive (or emotional) quality of music, while simultaneously avoiding reducing it to something merely subjective. A question raised by many thinkers is whether music could be its own kind of language with syntactic rules. Susanne K. Langer, for example, describes musical elements as symbols of emotions. These symbols cannot be expressed in language. They do not refer to anything concrete or particular; they are ‘abstract’ in their nature (Langer 1963). Peter Kivy, who describes his own view as having a formalist point of departure, resembles Langer to some extent in his claim that music expresses forms of feelings. Kivy speaks of ‘garden-variety emotions’, general feelings such as sorrow, joy and fear, that are perceived properties of the music itself, not based on the effect music has on the listener (Kivy 1989, 31).

What appears central to both Langer’s and Kivy’s thinking is a presumption of language as referential content. Stephen Davies claims that Langer’s view remains ‘between analytical assumptions about the “fixed meanings” which refer to objects in the world in verbal language, and a Romantic awareness, [...] of the need to explore ways of making sense beyond discursive language’ (Davies, n.d.). A similar approach is present in many modern attempts within philosophy, which, to quote Andrew Bowie, like logical positivism, relegate ‘statements about music as an art of “meaninglessness”, on the assumption that the meaning of a statement lies only in the ways in which it can be scientifically verified’ (Bowie, n.d.). If meaningful language is regarded as sentences with meaning that are verified, then indeed, one needs to make music an ‘art of meaninglessness’ in order to locate its value elsewhere.

If the ‘garden-variety emotions’ of Kivy were taken as a point of departure, a further consequence could be the emotional aspect tending towards the use of clichés, as music would only come to express ‘fixed feelings’. To quote Bowie again, music would express ‘an already familiar emotion as embodied in the music’ (Bowie 2007, 23). This calls to mind programmatic music, or film music when it delivers ‘idiomatic feelings’ to create certain effects on top of a story. Hanslick also makes a comment about this programmatic aspect, as he describes the harmfulness of an attempt to understand music as a kind of language. Hanslick claims, perhaps accurately, that it is those ‘composers of not much creative power, [that] opt for the programmatic significance of music’ (1986 [1854], 43). In this light, Kivy’s idea of ‘garden-variety emotions’ appears as romantic in a simplistic way, forwarding an approach that Hanslick himself criticised more than a hundred years earlier. Composition would consist of cobbling together existing idiomatic forms.

Perhaps the idea that the quality of music that comes across as emotional would be based on a fixed vocabulary of feelings contributes to some composers’ reluctance to engage with the emotional quality of music. In this scenario, language and emotional forms both appear as already fixed, and extrinsic to each other. It seems to be a simplistic point of departure to merely connect them together in compositional work.

Is there a way to ‘relocate’ the core of singing in a way that accommodates this dualism? Bowie makes the claim that ‘[t]he tone and rhythm of an utterance can be more significant than its “propositional content”, and [that] this already indicates one way in which the musical may play a role in signification’ (2007, 3). This specific formulation points out something important about the significance of embodied expression, but also appears to allow for the existence of such a thing as ‘propositional content’ separate from the dramaturgy of voice.

A very different approach to this relationship was presented by Jean-Jacques Rousseau, who made the claim that music and language share a common origin, located in voice and gesture. ‘At first only poetry was spoken; there was no hint of reasoning until much later’, he writes in his famous essay ‘On the Origin of Language’ (1996 [1781], 12). Rousseau further depicts passion as the source of singing and speaking: ‘[A]ll voices speak under the influence of passion, which adorns them with their éclat. Thus verse, singing, and speech have a common origin’ (ibid., 50). As a late 18th century Enlightenment thinker, Rousseau follows ‘the state of nature’ as a normative guideline. Today, relocating language in something naturalistic in this manner might only lead to further speculations on the origin of language. However, from a compositional standpoint, the thinking of Rousseau appears to resonate to some extent with the attempt to locate the core of music and language in embodied expression. Vibeke Tellmann reads Rousseau and Roland Barthes in parallel, in an attempt to find common grounds for music and language in the voice. Tellmann mentions that Barthes touches on similar questions to Rousseau in his essay ‘The Grain of the Voice’, identifying the voice as the source of both language and music (2017, 8).7 To quote Barthes directly, he wishes to outline ‘the very precise space (genre) of the encounter between a language and voice’ (1977 [1972], 181). Barthes locates the centre of language in the voice and body, calling this ‘the grain’. Barthes himself describes the grain as ‘the body in voice as it sings, the hand as it writes, the limb as it performs’ (ibid., 188).

Simply defining language as ‘embodied expression’ as opposed to ‘communicating fixed ideas’, would solve little. This could remain within phonetic, or naturalistic, frameworks of voice. In addition, the vocal expression could be regarded as a personal or subjective expression. However, Barthes himself makes the following observations about the theatrical aspect of a verbal expression in the interview ‘From Speech to Writing’:

Not that speech is in itself fresh, natural, spontaneous, truthful, expressive of a kind of pure interiority; quite on the contrary, our speech […] is immediately theatrical, it borrows its turns (in the stylistic and ludic senses of the term) from a whole collection of cultural and oratorical codes: speech is always tactical. (1985 [1981], 3–4)

According to Barthes, the act of speaking is itself social, stylistic and theatrical. One could also depict it as participating in dramaturgical codes. It is not an ‘innocent’, naturalistic expression of the individual. This resonates with the thinking of Ihde, who uses the idea of a ‘dramaturgical voice’. Ihde writes: ‘All language is dramaturgical in a significant sense’ (2007, 196).

Moving away from using the expression ‘emotional’ and rather speaking of the ‘dramaturgical’ has two advantages with respect to a critique of ‘music adding an emotional layer to language’. First, it allows common, non-dualist grounds for the axis of language and voice in vocal music. Second, it allows engagement with these grounds in a non-naturalistic, non-private way. When engaging in vocal dramaturgy as a composer, one thus does not need to engage with ‘feelings attached to content’. Instead, one could regard vocal expression as it relates to common reference points of cultural understanding – an idea which will be explored further in the following subsections.

Myth III: Vocal music is interpretation of a text

There are several inadequacies in the vocabulary often employed when describing the encounter between music and text. On a concert programme, for example, the author and the composer are referred to as the originators of the text and the music, most often without further specifications about the interaction of these elements. It is mentioned neither to what extent a text is quoted nor whether the text is actually ‘used’ or engaged with in other ways.

The idea that vocal music interprets a text presupposes a duality: the text is thought to carry the meaning already, while the music comes to concern the way in which it is expressed. The order in which text and music are written and composed sets the framework for the process and, if not approached with awareness, may narrow the possibilities. It could, for example, result in dealing with the text ‘from the outside’ as a narrative. The shortcomings of approaching a text in this way are evident in the difficulty of musically interpreting an already well-functioning text, for example one's favourite poem. The content appears as already complete in its written form, not necessarily creating space for musical engagement. The experience might be that the text is destroyed in the attempt to interpret it.

What needs to be acknowledged is that the original ‘written dramaturgy’ cannot be preserved. A text always turns into something else in the compositional process; a ‘new’ text is created. There are multiple ways of describing the encounter between an existing text and music. In the following examples ‘x’ stands for the text: a composer could be inspired by x, react to x, deconstruct x, or engage with x. To encounter a text musically one often needs to ‘create space’ for its dramaturgy, for example by quoting only a part of it, treating it non-chronologically, or interpreting its atmosphere rather than all the words as such.8

The dramaturgical possibilities do not only concern the vocal expression as such, but are deeply rooted in cultural context. The Requiem as a musical work illustrates this engagement aptly. It was originally a part of the Catholic liturgy, as a Mass for the dead, but later became an independent musical format. The Requiem exhibits a specific rituality, with layers of meaning emerging through the many requiems of history. When, for example, the text from Dies irae is performed – Dies irae, dies illa; Solvet saeclum in favilla; Teste David cum Sibylla – it appears to communicate something other than merely the content of ‘what the text tells us’. One might almost hear all previous requiems echoing through the performed word. Yet another aspect is the ‘mysticism’ of the Latin text that carries the connotation of concealment. Liturgical Latin, an artificial language, was only comprehensible to specific guilds, such as priests, to begin with. Simultaneously, the English translation of the Latin phrases often appears either on a board or in the concert programme, affecting the interpretive space: ‘Day of wrath, that day; will dissolve the earth in ashes; As David and the Sibyl bear witness.’ The ‘subtitles’ bring the attention to a more content-centred approach to the text, which adds a new layer to the interpretive space of the work. The translation appears as more referential and explanatory than the rituality of the sung Latin word.

It would be strange to call the Latin text the ‘lyrics’ of the Requiem, or to say that a composer ‘uses’ the biblical text in his or her work. Even saying that a composer ‘interprets’ the text is a stretch, as what is interpreted – or responded to – is a whole history of requiems. One could perhaps say that a composer participates in the Requiem by composing his or her own version of it – a version from the viewpoint of their own time.9 Choosing a position where all vocal expression is connected to a culture and a tradition also turns the process of composition into an activity where one does not simply ‘use’ or ‘interpret’ text to express ideas musically, but engages in a complex web of cultural codes.

Myth IV: Classical singing is free from cultural identity

Extending the discussion of the ‘grain’ of the voice under Myth II, this subsection will focus on the culture of singing, based on the concepts of ‘geno-song’ and ‘pheno-song’ as described by Barthes (1977 [1972]). In Western art music, the specific codes of classical singing are sometimes not regarded as part of a historical development, but as pure musical expression serving the music in accordance with an ideal of musical autonomy.

Barthes borrows Julia Kristeva’s concepts of ‘geno-text’ and ‘pheno-text’, which describe a continual movement in a text where the ‘structure’ emerges from the ‘structured’ (Kristeva 1986, 121).10 Within Barthes’ definition geno and pheno come to concern two ideals of singing. The ‘grain’, already discussed, indicating the materiality and roughness of the voice, is found within the geno-song. This is an approach to singing which, according to Barthes, has been overlooked. It is ‘the singing and speaking voice, the space where significations germinate “from within language and in its very materiality”’ (Barthes 1977 [1972], 182). The pheno-song, on the other hand, covers

all of the features which belong to the structure of the language being sung, the rules of the genre, the coded form of the melisma, the composer’s idiolect, the style of the interpretation: in short, everything in the performance which is in the service of communication, representation, expression, everything which it is customary to talk about, which forms the tissue of cultural values [...]. (ibid., 182)

Cultural codes, rules and meanings form the pheno-song. The voice is subordinated, a medium through which to say or express something.

The definition of pheno-song appears to have explanatory power especially when a clear message is expressed, such as in the German Lied, where the text is to some degree approached ‘from the outside’ as a narrative, performed through tidy, proper, rule-bound classical singing. However, pheno is also the ideal of the former examples of the Vocalise and the Requiem, in that the voice is in the service of what is communicated. The focus is on expression that is mediated through strict cultural codes. A singer in a choir, for example, is not to be heard individually, but to contribute to a bigger whole. If a singer needs to inhale where there is no natural end of a phrase, he or she does this inaudibly, in a spot where it will not draw attention.

The pheno-ideal appears to correlate to some degree with the idea of subduing nature in the pursuit of sublime order. The embodiment of voice and its restrictive properties are, in many ways, minimised: the shift in vocal register from chest to head voice, for example, is considered vulgar, and is not to be heard. Hanslick may have a similar ideal in mind, when he criticises the singer lowering her voice in range when ‘saying’ something important. In this scenario the music follows the expression of spoken language instead of the opposite (Hanslick 1986 [1854], 42). Hanslick also comments on interruptions in the melodic flow, such as a recitative, as follows: ‘These startle the hearer and behave as if they signify something special, but in fact they signify nothing but ugliness’ (ibid., 43). For Hanslick, music itself – harmony and melody – ought to come first and the vocal part is to be subordinated to the musical point of the venture.

Relating this to the subject at hand, it appears that in the pheno-ideal the embodied voice withdraws, or creates space, so that ‘what is expressed’ can be as present as possible. While Kristeva’s concepts of the geno and the pheno appear to some degree to be essentialist categories, being presented as part of her semiotic system, it is not always clear whether Barthes’ concepts are to be understood in an essentialist way, or whether he merely describes two interpretational approaches to singing: two different ideals. It is, however, clear that Barthes’ concept of the ‘grain’, present in the geno-voice, indicates something in the culture that he claims to have been overlooked. I suggest that the difference drawn between geno-song and pheno-song is most productive if not considered in an essentialist way, but rather as difference of emphasis.

The geno-song may give the impression of being more natural or personal, in the sense that it places emphasis on articulations which come close to speech and are more often located in the chest voice. In the writing of Barthes it is clear, however, that ‘geno’ does not refer to anything personal. He depicts the grain of the Russian bass singer, manifest and stubborn, and writes explicitly as follows:

The voice is not personal: it expresses nothing of the cantor, of his soul; it is not original (all Russian cantors have roughly the same voice), and at the same time it is individual: it has us hear a body which has no civil identity, no ‘personality’, but which is nevertheless a separate body. (Barthes 1977 [1972], 182)

This quote shows that the ‘individual body’, for Barthes, is also a ‘cultural body’. The ‘cultural grain’ of singing could perhaps be compared to learning a dialect, an aspect discussed by Tove Dahlberg in a conference presentation (2018).11 The body, with its tiny muscles at work, has adopted a style or a practice through years of participation in a specific expression and musical repertoire.

There are several advantages to looking at classical song as a geno-tradition, too. In any tradition of song, there are unarticulated rules that are not notated, but implicitly understood. Comparing, for example, the decorative melismas in performance of renaissance works by Claudio Monteverdi to the virtuosically sung melismas of the R&B artist Beyoncé, one may find similarities in vocal technique (Habbestad and Ahvenniemi 2018).12

The implicit codes of singing within a tradition, the cultural grain, and how these are present in the shaping of a voice, have not always been taken into account by composers of the classical tradition, while the content – that which is communicated – stands in focus. In wordless music, such as a vocalise, singing may be regarded as communicating melody, or pitch material. It would appear advantageous to promote a conscious interaction with the tradition. This could affect compositional choices, offering ways of specifying the materiality of voice as a part of the expression. The specific ‘cultural grain’ of voice will always be present in vocally performed music.

Myth V: Singing forms a foreground, instrumental music a background

When a singer faces the audience from the stage, standing in front of the ensemble, this suggests that the singer is going to communicate something that is likely to appear as the foreground, while the instruments function as the accompaniment, the background. The constellation on the stage creates specific expectations within a listener, and often correlates with the structures of a musical work, too. This constellation is even mirrored in recordings, where mixing conventions usually place the vocals front and centre in the mix. We can refer to a reflective engagement in the encounter between music and language – either as composer or dramaturge – as exploring the middle ground. This implies challenging idiomatic approaches to both language and music.

To understand the mindset behind this constellation, and how it forms the contours of the compositional process, one ought to reverse the assumed order in which a musical work and its context are envisioned to appear. It is not the case that a work is first composed, as a product of mind, and then put in a context within a specific cultural framework, such as a concert situation or a performative practice. The scenic structures and the rituals around performing and listening are already present when the music is made. Maurice Merleau-Ponty’s phenomenological thinking illustrates this, as he takes the embodied, intentional subject as the point of departure – someone who always already finds herself in a concrete and spatial world. Merleau-Ponty presents the example of a football field not existing merely as an object for the player in action: ‘The field itself is not given to him, but present as the immanent term of his practical intentions; the player becomes one with it and feels the direction of the goal [...]’ (1965, 168). In a similar way, the ‘culture of vocal music’ with its complex webs of habits and expectations, including stage constellations, is already present in an immediate way when a composition is made. It forms a framework which one – either knowingly or unknowingly – reacts to.

When exploring the middle ground, one approaches something hidden. Instead of a constellation that dictates ‘now the soprano will sing’ and ‘now the violin will play’, one may try to explore the possibilities of intertwining them, or encountering them from another dimension, for example from the inside of the emerging expression. To cite an example, composition pedagogue Morten Eide Pedersen (1958–2014) writes about ‘unstable’, ‘shimmering’ and ‘nervous’ sound objects, and ‘questioning’ fragments (2005, 11).13 This may challenge both an ordinary, mimetic understanding of language and formalist tendencies within the understanding of music. The culturally determined contours of the ‘vocal singer’ and the ‘instrumental sounds’ start to fade. The instruments are freed from their traditional ties to sound (ibid., 9).14 This can also be seen as relating to composer Helmut Lachenmann’s typology of sounds as presented in Musik als existentielle Erfahrung (1996). Lachenmann shifts our attention to sound types and sound processes, either for single or multiple instruments. He ‘defamiliarises’ sounds from their ordinary instrumental prominence and redirects our focus to such qualities as accents of sound and the timbral colour of a sound fading away.

This activity of ‘defamiliarising’ could be misunderstood to refer to finding mere sonic interaction between instruments and vocal sounds. However, even when the qualities under discussion are described with words such as ‘timbral’, ‘material’, ‘sonic’ or ‘acoustic’, this may be for a lack of better vocabulary. ‘Exploring the middle ground’ is to some extent a critical engagement with the interpretative space. According to Lachenmann, sounds always carry historical implications, and there is no such thing as ‘natural material’ in music (Wellmer 2009, 35).15 In an interview in Tempo magazine, he describes the defamiliarising process as follows:

Such a perspective demands changes in compositional technique, so that the classical base-parameters, such as pitch, duration, timbre, volume and their derivatives retain their significance only as subordinate aspects of the compositional category which deals with the manifestation of energy. (Ryan and Lachenmann 1999, 21)

For Lachenmann composition is a ‘negative’ activity in a positive sense, whereby the composer seeks to engage with something ‘unknown’. Lachenmann writes: ‘It goes back to the idea of renewal: after listening to a work of music, the listener should be a new person’ (ibid., 22).

The connections between musical thinking and dramaturgical space touch on questions of institutional practice, too. The frameworks for a project are not necessarily chosen by an individual composer or participant. If ‘text’, ‘music’ and ‘scenic dramaturgy’ are regarded as strictly separated categories in the process of creating a work, this may create structural resistance to ‘exploring the middle ground’. These parameters are co-dependent. The thinking of Judith Butler sheds light on aspects of this, since for Butler language is based on repetition of norms (Butler 1993). The way we refer to things, or perform language, is not based on individual expression, but norms and structures. As Butler argues: ‘Discursive performativity appears to produce that which it names, to enact its own referent, to name and to do, to name and to make’ (ibid., 107). In this view language functions as a social institution, and its practice both consolidates and is consolidated by existing power dynamics. Language functions as a political force. From the viewpoint of Butler, no language is neutral.

There are often concrete, institutional frameworks that shape the process of composition, such as scenic possibilities and the possibility of applying electronics. In contemporary music, one may use amplification to zoom into sounds and explore their entwinement in new ways, even in terms of minimal dynamics, while more traditional musical practices that embrace idiomatic approaches to singing may consider the use of electronics solely as a means of compensating for a lack of volume in singing. These frameworks are often rooted in cultural values that are present in ways of speaking or acting. When repeating norms (for example, in categories such as ‘text’, ‘music’, or ‘singing’), one does not only refer to specific isolated objects or phenomena, but to a whole tradition, to the ways in which they have been referred to previously. Structural resistance to ‘exploring the middle ground’ is partly due to the lack of an established language that can articulate alternative frameworks.

Language as creation

Each of the five myths discussed raises the question of whether language can exist purely as content. This links to the question raised by Bowie, when he asks whether it is possible to ‘establish context-independent criteria for identifying when a piece of language can be understood purely literally, so that metaphorical, performative, “musical” and other dimensions of language can be separated from it’ (2007, 4). The specific aim of this article has been to observe language in the situations of composition of vocal music, where it is ‘in the making’. A further question, following from this, could be raised as follows: is it possible to explore the middle ground of language outside of a musical context, too, and to regard its musical and performative sides not as separate, but as a part of it?

The most extreme approach towards a purely literal understanding of language may be found in the historical attempt of Logical Positivism to rid language of any metaphysical confusion or ambiguity; Rudolf Carnap, for example, posited the idea of a ‘universal language’ to which all language could be translatable, and which would express states of affairs without ambiguity. Such attempts are in line with a scientific ideal of preventing form from confounding content. However, the elimination of ‘form’ would seem to eliminate some of the preconditions that allow meaning to be constituted in the first place.

A counterclaim to the positivist approach is found within Adorno's critique of modern rationality, according to which the relationship between a statement and the world has become circular. Rather than describing the world, language reduces the world to already established categories and ideas. Adorno calls this ‘identity thinking’: the heteronomous and concrete world – which Adorno refers to as the ‘other’ or the ‘unknown’ – is forced into identity with pre-existing ideas. A cat becomes a sample of the abstract concept of ‘cat’, just as a human being becomes an exemplification of ‘human species’. Adorno and Horkheimer argue that the untruth of the modern rationality ‘does not lie in the analytical method, the reduction to elements, the decomposition through reflection [...] but in its assumption that the trial is prejudged’ (2002 [1944], 18). In the end, what is considered ‘real’ is that which corresponds to pre-existing ideas. The circularity stems from the fact that the unknown is interpreted through what is already known, which then becomes confirmed by the unknown.

Adorno’s view may contribute to the subject at hand through a reversed epistemology, where art is considered to carry the potential of indicating something ‘true’ – precisely due to the way it opposes rationality, including the positivist approach to language. Instead of raising counterclaims to existing ideologies, art identifies with the codes of the society it belongs to, and negates them through its form. According to Adorno, in highly developed artworks, ‘form tends to dissociate unity, either in the interest of expression or to criticize art’s affirmative character’ (1997 [1970], 186). Adorno also writes: ‘the most advanced musical works, as a consequence of that very rationalization, emit forces from within themselves that ultimately, perhaps, may heal the wounds that rationalization and perfection have inflicted on the work of art’ (2002, 153). Art offers a critique of the world without withdrawing from it.

Many of the theories discussed here – the phenomenological views of Don Ihde and Merleau-Ponty, the grain of the voice by Barthes, and the institutional focus of Judith Butler – appear to be applicable to the myths discussed. However, Adorno’s thinking, even though primarily concerned with relationships between music and language, appears to be useless when exploring embodied or dramaturgical dimensions of language. This observation raises a relevant question about the possibility of discussing language objectively from within language at all. The reason for Adorno’s critique of phenomenological thinking, for example, which often forms new words to embrace the precondition of ‘always-already-being-in-the-world’,16 is that it, according to Adorno, functions as an ever-new ideology created by the human subject, projected upon the world. There is no ‘outside’ to modern rationality. According to Adorno, the task of philosophy is not to form new theories, but to perform a ‘self critique of rationality’. This method is continuously present in the writing of Adorno himself, and is evident in his revealing affinities and discussions of the opposite poles of an issue without the text forming a linear, conclusive development.

Helmut Lachenmann’s approach to composition offers insight into what the epistemological reversal described above means for artistic choices. He points out that a composer ‘who knows exactly what he wants, wants only what he knows – and that is one way or another too little’ (Ryan and Lachenmann 1999, 24).17 In other words, composition should not be based on ‘delivering musical ideas’ but on engaging in an exploration. Lachenmann distances himself from working in a sedimented style or making a ‘sport’ out of contemporary composition: ‘There were, and are, composers who have in this sense reduced the slogan Épatez le bourgeoisie to nothing more than a musical cooking recipe’ (ibid., 21). For Lachenmann, the possibilities of composition lie in the activity that challenges the cultural connotations in the very context in which they are presented. These are individual for each work.

If the lack of context-independent criteria for deciding when language is to be understood purely literally is accepted as a premise, this creates space for language, too, ‘being in the making’ in a similar way as it would in musical composition. One of the reasons for the difficulty of seeing this engagement may be that one is oriented towards the already constituted meaning, or to quote Merleau-Ponty, ‘to observations about thoughts that have already matured in the person speaking and are at least immanent in the person listening’ (1988, 80). According to Merleau-Ponty, the result is that such theory loses sight of the heuristic value of language. ‘Exploring the middle ground’ of language, thus, must also be a reflective activity, performed in the very contexts in which it occurs.

Vocal music lacks its voice

Returning to the initial question of interpretation, in the light of the five myths presented, the work of vocal music could be said to already deliver what it says through the ways in which its materials come to mean something within a specific context. Both its strength and its weakness may lie in its inability to be directly reducible to explanatory language. This correlates with the way Adorno depicts the general situation of art: ‘As a thing that negates the world of things, every artwork is a priori helpless when it’s called to legitimate itself to this world’ (1997 [1970], 159). The challenges of discussing what a work of art is about, or what it ‘means’ in literal language, manifest themselves in the potential for misinterpretation, as demonstrated by Sontag.

This does not imply that art should not be discussed in language. There are several ways of creating a dialogue around a work of vocal music that do not give ‘content’ primacy through focussing on a supposed abstract idea that the work expresses. In conclusion, I suggest two possibilities. First, one could discuss what the work does instead of what it says. This could bring the attention to the interpretive space the work opens up through the ways its elements are intertwined and how the work interacts with its socio-historical context. Secondly, one may ask what the urge of the work is – that is, why it is important to make this work, today.18 This presents the possibility to look at the work in its contemporary context, the social implications of its materials, and how the work engages culturally.

Perhaps language needs to challenge itself, and the frameworks of established meaning, when interpreting art. In his essay ‘Music and Language: A Fragment’, Adorno writes: ‘To interpret language means: to understand language. To interpret music means: to make music’ (1998, 3). In the case of vocal music, this sentence could be altered to: to interpret language means: to make language. To overcome the impoverished understanding of how a work of vocal music may speak to us, one ought to interpret it in a way that reflects its complexity. The embodiment and cultural implications of the performing voice, just as much as other musical elements, are not additional to the content, but function as a part of the meaning of the work to begin with.


1The subject of this analysis is centred around vocally performed music in Western culture. However, the aim is not to focus on case studies, but issues that arise from the activity of composition. ‘Composition’ is here designated to refer to an artistically reflective activity that raises questions about the tradition through choice of musical material and method, and seeks new paths.
2In her dissertation Musikalitet i teorien. Om relasjoner mellom musikk og språk (2017), Vibeke Tellmann poses the question of why it is so difficult to talk about how we experience music, and investigates the relationship between music and language from several perspectives, such as the historical and phenomenological, in pursuit of an approach that listens to music and language ‘based on a common sensory mediality’ (ix).
3Even though several theories of the 20th century have explored approaches to language that understand it in a wider context, as a part of our ‘lived world’, such as the language games of Ludwig Wittgenstein or the linguistic acts of J. L. Austin, the dual approach to language – the split between form and content – still appears as normative in many situations, such as when the relationship between music and language is discussed.
4 Quotes confirmed by Lydia Goehr.
5This does not only apply to voice, but to other musical or acoustic elements, too. From a phenomenological perspective the question concerns beginning the reflection ‘from the middle of the world’, observing things through the way they are given to us rather than just as objects. According to Martin Heidegger, we first and foremost do not hear complexes of sounds, but a squeaking car, or a crackling fire (2006 [1927], 163–164).
6Hanslick does not deny the fact that music arouses feelings. Rather, he makes the claim that its content cannot be defined by them.
7Original quote: ‘Barthes identifiserer stemmen som å være en kilde til både musikk og språk. Det vil si at han på mange måter identifiserer et felles opprinnelsessted for musikk og språk som betydningsproduserende uttrykk. Barthes’ essay berører derfor, om enn på andre måter og andre premisser, et liknende spørsmål om språkets og musikkens opprinnelse som er utgangspunktet for Rousseaus og Herders essays.’
8Taking this thought experiment further, a text which is extracted from a composition after it is made, can function as its own type of ‘poetry’ which seldom works well on its own. In popular music, for example, the written lyrics could appear as ‘Oh, oh, yes, padam,’ and seem nonsensical. However, as discussed in reference to Ravel's Vocalise, the lack of meaningful text does not suggest a lack of meaning altogether. The vocal expression comes to mean something within a cultural framework.
9Art historian Gunnar Danbolt pulls the Requiem out of its specifically religious context as he writes: ‘One has to remember that the Middle Age Hell, especially how Dante pictures it, is a blueprint of our own world. This work is equally much about the fear that Hell could also “hit us”, here and now’ (my translation). Further, Danbolt draws a parallel to what is going on in Syria today, and a world filled with war, conflict and threatening climate change (Danbolt 2018, 5).
10Julia Kristeva presents geno-text as the process from which text emerges, the ‘language's underlying foundation’, while pheno-text designates the language that ‘serves to communicate’, referring to the formula and structure.
11In her ongoing research, as yet unpublished, Tove Dahlberg further develops Kristina Hagström-Ståhl’s idea of a feminine ‘dialect’. Dahlberg explores the connections between singing style and performative practice from a gender perspective. Dahlberg suggests that cultural practices and expectations socialise a singer to perform in specific, gendered ways. This quote is confirmed by Dahlberg and Hagström-Ståhl.
12See Habbestad and Ahvenniemi (2018) for a discussion of the similarities of a pompous self-presentation in baroque and R&B styles of singing.
13Original quote: ‘I sammensetningen av flyktige, skimrende, “nervøse”, klangobjekter til større utkomponerte forløp, provoserer Sciarrino ved at han i liten grad følger tradisjonelle språkmimetiske koder (som melodi, motiv, tema) som står så sentralt i vårt musikalske forståelsesapparat.’
14Original quote: ‘(...) hvilket instrument som egentlig “eier lyden”, blir tilslørt når den lever så på kanten av eksistensen. Det blir ikke lenger så relevant å bekrefte for seg selv at “nå spiller fiolinen, nå spiller fløyten”.’
15Original quote of Wellmer: ‘[...] wobei Lachenmann zugleich deutlich macht, dass es einen wirklichen “Naturstoff” der Musik gar nicht gibt, weil alles, was wir so nennen könnten, immer schon vorweg mit, sei es musikalischen Bezügen, sei es aussermusikalischen Konnotationen, aufgeladen ist’.
16Martin Heidegger forms expressions such as ‘da-sein, mit-sein and in-der-Welt-sein’, that indicate that one is always already situated in the world and cannot initiate a reflection from an isolated position.
17To avoid misunderstanding, Lachenmann adds: ‘intuition is no substitute for compositional, constructional thinking. And intuition is not the same thing as “instinct”’ (1999, 24).
18The question of the ‘urge’ was posed by opera director Sjaron Minailo for three composers to answer, at a workshop arranged by Bergen National Opera (9 January 2018). See Miller (2018).

