Research Article

From idea to acoustics and back again: the creation and analysis of information in music¹

Joe Wolfe
University of New South Wales, Sydney, 2052, Australia
E-mail: J.Wolfe@unsw.edu.au

Abstract. The information in musical signals – including recordings, written music, mechanical or electronic storage files and the signal in the auditory nerve – is compared as we trace the information chain that links the minds of composer, performer and listener. The (uncompressed) information content of music increases during stages such as theme, development, orchestration and performance. The analysis of performed music by the ear and brain of a listener may reverse the process: several stages of processing simplify or analyse the content in steps that resemble, in reverse, those used to produce the music. Musical signals have a low algorithmic entropy, and are thus readily compressed. For instance, pitch implies periodicity, which implies redundancy. Physiological analyses of these signals use these and other structures to produce relatively compact codings. At another level, the algorithms whereby themes are developed, harmonised and orchestrated by composers resemble, in reverse, the means whereby complete scores may be coded more compactly and thus understood and remembered. Features used to convey information in music (transients, spectra, pitch and timing) are also used to convey information in speech, which is unsurprising, given the shared hard- and soft-ware used in production and analysis. The coding, however, is different, which may give insight into the way music is understood and appreciated.

Keywords. Information, music, composition, cognition, coding.

¹ This paper was originally presented and published as a plenary lecture at the Eighth Western Pacific Acoustics Conference, Melbourne, 2003. (C. Don, ed.) Aust. Acoust. Soc., Castlemaine, Australia.

INTRODUCTION

Many digital recordings encode microphone signals as 16 bit numbers, which gives a dynamic range (maximum signal range/digitisation step) of 2^16, or about 96 dB. The signal is sampled at 44.1 kHz. This gives a data transmission rate of 706,000 bits per second or 706 kBaud per channel, not counting error correction bits. A traditional compact disc (CD) can store about a thousand megabytes of data: enough to store several hundred novels, or about eighty minutes of uncompressed recorded music. This raises the questions: Where do all these data come from? How much is provided by the composer, by the players and the instruments? What happens to that torrent of data when it reaches the listener?
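The arithmetic behind these figures is simple to check. The sketch below (Python; the 16-bit, 44.1 kHz, stereo, 80-minute parameters come from the text, and everything else is an assumption for illustration only) reproduces the per-channel rate, the stereo rate quoted in the next paragraph, and the approximate capacity of a disc:

    # Rough check of the data rates quoted above. Assumptions: 16-bit samples,
    # 44.1 kHz sampling, two channels, 80 minutes of audio, no error correction.
    BITS_PER_SAMPLE = 16
    SAMPLE_RATE_HZ = 44_100
    CHANNELS = 2
    DURATION_S = 80 * 60

    per_channel = BITS_PER_SAMPLE * SAMPLE_RATE_HZ   # 705,600 bit/s, i.e. ~706 kBaud
    stereo = per_channel * CHANNELS                   # ~1.4 million bit/s
    megabytes = stereo * DURATION_S / 8 / 1e6         # ~850 MB for 80 minutes

    print(f"per channel: {per_channel:,} bit/s")
    print(f"stereo:      {stereo:,} bit/s")
    print(f"80 minutes:  {megabytes:.0f} MB before error-correction bits")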
The rate delivered by a stereo CD – about one and a half million bits per second – appears to be equivalent to a novel every several seconds. Can our ears and brains cope with such a rate? And finally: Why do we like it? As a composer and physicist, I try here to address these questions from both sides. I suggest some answers, and indicate where research is currently look- ing for others. Data compression Data files can usually be simplified or compressed because they contain much redundancy. For instance, a CD could contain 75 minutes of 1  kHz test tone. This is redundancy on a scale of 1  ms: to a suitably sophis- ticated receiver, the signal could be sent as the text instruction “p  =  (1  mPa)  sin  (2pi*t/ms), 0  <  t  < 4500  s”, which requires only 352 bits in ASCII. For an example of redundancy on a longer scale, consider “house music” in which short sound segments are sampled and repeated many times. Kolmogorov [1] and Chaitin [2] independently intro- duced algorithmic entropy to quantify the difference between unpredictable and redundant signals. To para- phrase Chaitin, consider two binary numbers: 1011110010001101010110111011000001101010 and 0101010101010101010101010101010101010101. The first “looks” random: it was obtained by toss- ing a coin forty times. The simplest way of transmitting that number is sending the number itself. The second does not “look ” random: it can be reconstructed from the instruction “print ‘01’ twenty times”. That instruc- tion contains more than forty bits of information, but for a very long predictable number, the reproduction instruction may be rather smaller than the number (e.g. the 208 bit instruction “print ‘01’ a million times” pro- duces a 2 million bit output). The algorithmic entropy is proportional to the number of bits of information in the minimum message needed to reconstruct a signal. (It is thus proportional to the log of the number of permuta- tions and consistent with Gibbs’ definition.) The more simple or predictable a signal, the lower its algorithmic entropy and the more it may be compressed. Conversely, the richer in information, the higher the entropy, and the more it resembles a random signal – at least to a receiver that cannot decode it. When sound signals are stored to be heard by humans, they are often compressed using the MPEG (mp3) algorithms. These take advan- tage of masking in human hearing: one frequency band may mask others, so the masked sounds are omitted. A reconstructed MPEG waveform produces an auditory illusion: its waveform has little resemblance to the origi- nal, but it sounds very similar. Recorded music has relatively small algorithmic entropy. Indeed, its underlying order, at several differ- ent levels, is one of its attractions. At the lowest level, there is high redundancy in the waveform. A note with a definite pitch is quasi-periodic: one cycle with the pitch period is followed by many others very like it. Of course, in real, interesting instruments, the periodicity is only approximate: transients and vibrato lead to varying waveforms, as do non-harmonic components in percus- sion and plucked strings. Systems of music notation take advantage of this redundancy. In standard (Western) notation, vertical positions of notes on the staff plus accidentals specify pitches and thus, approximately, frequencies. A discrete set of note symbols, plus a few other data (tempo and articulation), specify durations. 
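The link between periodicity and compressibility asserted above is easy to demonstrate with a general-purpose compressor. The sketch below is an illustration only, not the MPEG/mp3 scheme mentioned earlier: it applies Python's standard zlib to one second of an 8-bit quantised sine wave (chosen at 441 Hz so that the period is exactly 100 samples at a 44.1 kHz sampling rate) and to one second of noise. The quasi-periodic signal shrinks to a few per cent of its size; the noise barely compresses at all.

    import math
    import os
    import zlib

    RATE = 44_100                 # samples per second (CD-like rate)
    N = RATE                      # one second of signal

    # Quasi-periodic signal: 441 Hz sine, period of exactly 100 samples,
    # quantised to 8 bits for simplicity.
    sine = bytes(int(127.5 + 127.5 * math.sin(2 * math.pi * 441 * n / RATE))
                 for n in range(N))

    # High-entropy signal: one second of bytes from the OS entropy source.
    noise = os.urandom(N)

    for name, data in (("sine ", sine), ("noise", noise)):
        ratio = len(zlib.compress(data, 9)) / len(data)
        print(f"{name}: compressed to {ratio:.0%} of the original size")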
Some information about the type of waveform, and much else, is contained in a word at the beginning of the music: the name of the instrument that is to play it. From this relatively small data set, performers and instruments construct complete waveforms.

The information content of written music is relatively easy to quantify because written music is digital in pitch and in time: relatively small sets of discrete pitches and durations are used. In contrast, performed music is only approximately digital: musicians make fine adjustments to the durations and timing and, except for keyboard players, adjust the pitch slightly according to context. These adjustments contribute to musical interpretation, to which topic we shall return.

Fig. 2 shows a short example: the first two phrases of the theme of the slow movement in Mozart’s clarinet concerto. One way of coding it is to sample the pitch regularly in time. The lowest suitable sampling frequency is the metronome marking times the lowest common multiple of its subdivisions. Most simple themes could be adequately sampled at a rate of order 10 Hz. Five octaves (61 notes) covers the range of most orchestral instruments and can be coded with 6 bits (i.e. 61 < 2^6), so the notes and rests could be coded at about 60 bits per second (60 Baud).

Most notes are longer than the sampling time, however, so this signal can be compressed by coding for the durations of the notes as well as their pitch. Traditional notation does just this, inter alia (Fig. 2b). The bar lines appear to be redundant, but to musicians they also give contextual information relevant to musical expression [3]. They also provide a correction mechanism for accumulated errors in duration decoding.

Figure 2c shows how a simplified binary parallel coding can represent those aspects of traditional notation used here. This example has a data content of 266 bits and, over a duration of about 13 s, a transmission rate of only 20 Baud. No correlation between the quantity of information and its value is implied, of course: many people consider this 266 bit theme more valuable than, say, a Gbyte of white noise!

The encoding used by music sequencers is close to that of music notation. These, the electronic progeny of the musical automata in Fig. 1, are computer programs that output signals to synthesisers via a standard Musical Instrument Digital Interface (MIDI). The MIDI standard transmits data at 31.25 kBaud in serial form. This permits parallel voices and a range of instructions, and its design allowed bandwidth for further developments. Alternative coding protocols have been proposed [4]. More sophisticated representations include expression – variations in loudness, amount of vibrato, fine adjustments to pitch and to timing [5,6].

Another crude but pragmatic way of computing data content is to look at the data files of note processors. These are to music what word processors are to text, and are widely used by composers and editors to write and to print music (Sibelius and Finale are commercial examples). They store written music in digital files that are similar to, but more elaborate than, that in Fig. 2c. On my hard disc is a 160 kbyte note processor file for a symphonic work. It takes 23 minutes to play, and so its printed score delivers data to the conductor at an average rate of 900 Baud, or 900 bits per second. To achieve the same transmission rate reading this article (not counting figures), one would need to read it at 1100 words per minute. It should be noted that conductors do not absorb all the information in a score in real time.

While comparing written music and written text, it is worthwhile contrasting them as well. One difference is cultural: more people can read text than can read music. Even to those literate in both, however, the aural re-creation is more important in music. Most musicians prefer hearing performances to reading scores, whereas I expect that most text-literate people prefer reading novels (at a rate of several hundred Baud) to hearing them read aloud, at slower rates. In both cases, the auditory transmission contains a great deal more information than does the written version.

Figure 1. Four digital storage media. (a) The cylinder and comb from a music box play 16 bars from Lara’s Theme (M. Jarre). The 18 tines of the comb have different masses and thus play different notes when struck by spikes on the cylinder. It has 18 parallel channels – circles round the cylinder. The loudness is binary (spike or no spike, note or no note). The timing is in principle analog, but is here quantised in multiples of 1/12 of a bar. The uncompressed data content of this cylinder is therefore 18 x 12 x 16 = 3456 bits. (b) The pianola roll in the background also has parallel binary channels, but the length of the hole determines the time the strings sound before the damper is replaced. In that sense, both duration and timing could be analogue, but again they are quantised in this example. The uncompressed data content is 35,000 bits per metre. (c) Standard Western music notation is (largely) parallel binary digital coding: each line and space (parallel channels) represents a pitch, though that pitch can be varied by sharps and flats. The time coding is encoded digitally in symbols (see Fig. 2). This example (The Rite of Spring, I. Stravinsky) has about 30,000 bits on this page, which lasts a few seconds, using a coding somewhat like that in Fig. 2c. (d) The CD also carries a binary digital signal (“pit” or “no-pit” in the track) but it is different in all other aspects. The signal is carried in serial rather than in parallel, and it encodes numbers that are proportional to the pressure of a sound wave. This CD records about 5 x 10^9 bits, not counting error correction bits. The storage efficiencies are approximately: (a) 5 x 10^5 bit.kg^-1, (b) 10^6 bit.kg^-1, (c) 10^7 bit.kg^-1, (d) 3 x 10^11 bit.kg^-1. The apparatus required for re-creation varies greatly in size: that for (a) is shown (~ 0.01 kg), that for (c) is ~ 10^4 kg.

Figure 2. Three ways of coding the first four bars of the theme of the slow movement of Mozart’s clarinet concerto. (a) is a semi-log plot of the pitch frequency as a function of time. On the time axis, the larger tics are bars (measures) and the smaller are beats. On the frequency axis, the larger tics are octaves. Notes an octave apart have the same letter name, e.g. C5 and C6. The reference frequency is the note called C0, which is currently about 16.3 Hz. The smaller tics are one twelfth of an octave (i.e. a frequency ratio of 2^(1/12) ≅ 1.059). These are called equal-tempered semitones: they correspond to the notes on an electronic keyboard. (b) is essentially traditional notation. The vertical and horizontal axes have been adjusted to make it an exactly semi-log plot by varying the spacing between lines, which may represent 3 or 4 semitones. The shapes of notes are a digitised code for duration that has several advantages over the analog time scale used in (a). (c) is a parsimonious parallel binary coding, which is more akin to traditional notation than to (a). The pitches of notes are shown by their octave (top 3 bits) and the note names (next 3 bits) with the most significant bit at the top. The next 2 bits allow for accidentals (sharps, flats and naturals) that are not needed in this example unless the key signature is omitted. The next bit indicates slurs: whether the note is continuous with the preceding one (the curved lines or slurs in (b)). The next bit indicates a rest (silence) of the appropriate length. The next 3 bits show the negative log durations with respect to a whole note. Semibreves, minims, crotchets, quavers and semiquavers (whole, half, quarter, eighth and sixteenth notes) are represented by 000 to 100. 101 is used for a bar line. The final bit allows an increase of 50% in duration (indicated by a dot in (b)). The duration code 111 is reserved as a signal to toggle the coding to text, so that occasional data such as tempo, key signature and expression marks can be added more efficiently. (The unequal spacing of channels is a guide for the eye only.)

THE ORIGIN OF INFORMATION IN MUSIC

Melodic and harmonic structures are good examples of redundancy. In a high information/high entropy signal, all pitches would occur in approximately equal numbers and it would be impossible to predict the next note: a high information signal sounds or looks random. Music is ordered², and this order makes music files compressible.

The generation of information is easy to follow in (Western) concert music because it is usually written down at several different stages, which may be (i) motifs; (ii) their extension to melody, their transformation and development; (iii) the addition of other voices (usually in harmony or polyphony); and (iv) orchestration or arranging. In formal music, this results in an orchestral score. In less formal music, analogous processes may lead to a score that is stored in one or more person’s memory. In improvised music, the entire “score” may never be stored.

A motif is a characteristic phrase of several notes. The opening four notes of Beethoven’s fifth symphony are an example, of which more anon. A motif is usually the origin of a musical composition. Several different pitches over a modest pitch range, allowing for several different note durations, imply a possible information content of a few hundred bits.

Although the production of this information is difficult to study in detail, textbooks on composition give advice on producing motifs from simpler patterns. Schönberg [7], for example, gives numerous examples of how musically interesting phrases can be constructed from the three notes of a major chord by adding passing notes, repetitions, upbeats, appoggiaturas and alterations of notes. Many composers use comparable techniques to produce melodies.

The processes used by human composers are rarely written down, and are difficult to study explicitly [3]. It may seem prosaic to speculate that they are algorithms (as yet unknown) operating on aspects of the composer’s background and stimuli, but to do otherwise seems to

² Predictability necessarily implies redundancy.
Hearing an unknown piece of tonal music from which some notes had been replaced with obvious blanks, many listeners would be able to guess the missing notes with better than chance scores, just as yo_ cou_d gues_ the _issing lette_s in this sentence. lead to Cartesian dualism. A range of explicit automata have been devised to create melodies. A famous exam- ple is the dice music attributed to Mozart, in which cast- ing a die decides among several possible subunits. In electronic versions, a random number generator replac- es the die. Further, while Mozart’s subunits are musi- cal phrases, some composition algorithms start with a scale of notes, some random input and a set of rules. Various automatic composers have thus been devised [8] since Harry Olsen created one in 1951 using rules gen- eralised from the songs of Stephen Foster [9]. Michael Smetanin is an example of a contemporary composer who has used simple rules or algorithms to create musi- cal compositions. It is difficult for an outsider to judge the success of such algorithms per se, however, because there is usually some discretionary intervention by a human at the input or output stage. In ‘Strange Attrac- tions’, Smetanin [10] chose a particular algorithm because it gave melodies that he found attractive. An extreme example of choosing an algorithm and then let- ting nature take its course is ‘White Knight and Beaver’ by Martin Wesley-Smith [11], in which the composer assigns a note to each of the four bases of the DNA code, and then notates musically a section of the genome of the bacterium E. coli3. When other examples are given of tunes created by various algorithms, however, it is usu- ally the case that only the ‘best’ results are presented – so human decision-making has intervened at the output stage. Use of a set of “rules” or fashions to generate com- binations of notes and then a decision about which ones to keep is a simple model for the way some human com- posers work. The “rules” need not be laws (such as “the leading note always rises”4) decreed by some author- ity and observed by composers [12]. Rather they may be habits or tendencies in styles of music. For instance, virtually all composers recognise the octave as the most important and harmonious interval. Even the ‘democ- ratisation’ of intervals by serialist composers leaves the octave as a very special case [13]. In this case there is a physical explanation: the harmonics of a particular note are a subset of those of the note one octave below, so adding an octave does not, or need not, add any new frequency components. In other cases, the “rules” have more complicated origins: for instance, most compos- 3 Does it sound like something that came out of a human colon, one might ask. Well, there are only four notes and they are not discordant. It sounds pleasant and musical, but this listener cannot readily extract a musical meaning. 4 This rule shows a good example of redundancy: if the leading note were always followed by the note above, then an encoding could omit the pitch of the latter, just as one could omit the “u” following “q” in coding English. 82 Joe Wolfe ers confine themselves to scales with twelve semitones to the octave. This has a little to do with the physical basis of harmony [14], but it also has to do with what conventional instruments and players can play, what we are used to hearing, and a series of compromises among consonance and keeping the number of notes small. 
The “rules” for composition in most styles would be difficult to list specifically, but the musical heritage and educa- tion of the composer must incline him/her towards some patterns and combinations. Composers have a variety of processes (algorithms) for transforming an old motif into a new one, such as inverting it, changing the rhythm, reversing it, changing one or more inter- vals [15]. Perhaps the most important stage in produc- ing a good motif is deciding which of many candidates is good. This process, while difficult to analyse, is at least almost universally comprehensible because many music lovers claim an ability to discern a good theme from a bad. Thus, in one common method of composition, input data and a series of different, often unconscious algo- rithms generate a short phrase or idea with perhaps some tens or hundreds of bits. This may be developed into a longer melody. In written music, the data content increases in proportion with the length of the melody, but many of the extra data thus produced are redundant, in the scientific sense. The “same” motif may be repeat- ed, transposed, inverted and otherwise transformed to create a much larger work. For one example, note the similarity in the two phrases in Fig. 2. For another, con- sider the famous opening phrase of Beethoven’s fifth symphony: . Much is made of this simple phrase: the motif of three quavers followed by a descent of a third is used dozens of times in the begin- ning. Simple modifications of it occur in almost every bar of the movement: it is transposed to different posi- tions in the scale, the final interval is changed to a sec- ond and sometimes a fourth, the last of the quavers sometimes falls, or the whole phrase is inverted in pitch. Further variants appear in the other movements – a remarkable example of much created from little. The redundancy or structure that is created by rep- etition with variation is very common in melodies. In the sixteen bar ‘Freude’ air of Beethoven’s ninth, for example, the phrase of the first four bars is repeated with slight variations in bars five to eight and thirteen to sixteen. This pattern (a,a,b,a) is extremely common, especially in songs. On a larger time-scale, redundancy through explicit repetition is so common that a variety of musical notations exist, including various repeat signs and musical ‘goto’ statements. In formal music, there is often a development sec- tion in which the original idea is variously transformed: it may appear in different keys, different rhy thms, inverted or melodically varied or decorated. The trans- formed phrase is often sufficiently different that a sim- ple coding cannot easily reduce the length of the sim- plest representation. The data contained in such sections are thus created by treating the input data (the initial phrase). The existence of important structures with a variety of time scales5 have made it difficult to formalise or to automate this operation, however. Further, selec- tion among different algorithms and outputs is again an important process. (See the discussions in [3,16].) Adding harmonies and counter melodies to a prin- cipal melodic line adds more data, but in some instances the extra data have relatively great redundancy. A canon is an extreme case, in which the original melody accom- panies itself with a time lag, so the only extra informa- tion required is the period of the delay. In a fugue, the same or a similar melody enters with a delay, and often a symmetry operation, i.e. 
transposed or inverted in pitch, with doubled or halved tempo. In these cases, and in polyphony, several parallel channels of melody are of approximately equal importance. In much music how- ever, there is one melody (or foreground) of pre-eminent importance and a harmony or accompaniment (middle- ground and background). In many musical styles, the harmony is subject to rules of varying strictness, which to some extent limit the freedom of other voices and thus introduce further redundancy. Students of traditional Western harmony will agree: it often seems that the combination of strict harmony rules and voice ranges, when applied to the melody set in a harmony exercise, allow only a small number of possible ‘solutions’. In many styles of music, the second most important line is the bass. If strict har- mony rules are applied to a given melody and bass line, the possibilities for further parts is severely limited. Altos and tenors in choirs, or the players of second vio- lin or viola sometimes feel that theirs are the ‘left over’ notes and that the result is a part that both more dif- ficult and less satisfying than the top or bottom lines. Strict rules are extreme examples [12], but it is rare that harmony or polyphony is without rules, whether for- mal or informal, rigorous or fuzzy. Thus the generation of the harmony or accompaniment is often aided by the operation of algorithms on the information in the mel- ody [18,19]. Sometimes the harmony is coded in a com- 5 For example, the use of time-series analysis to predict the next note from the previous several notes may work well for short time scales, but is prone to wander rapidly among keys. Reviewed by Dubnov and Assayag [17]. 83From idea to acoustics and back again: the creation and analysis of information in music pact but inexplicit way, such as chord symbols or figured bass. Some of its information (e.g. the chord symbol) is sufficiently important that the composer chooses to specify it, but the octave in which the notes occur, or their timing, is left to the performer. Information other than notes, including articulation, ornamentation and expression marks, may be written above or below the musical staff, to convey information about pitch and duration (e.g. trill, staccato etc.) in ways that are more compact and legible than the explicit nota- tion. Others carry information about loudness, articula- tion and tempo (pp, sfz, accel. etc). Others, particularly in contemporary music, contain instructions about tim- bre or tone colour [20]. Schönberg proposed the devel- opment of Klangfarbenmelodie (tone colour melody) in which changing patterns and structures of timbre would attain a status similar to that of changing pitch in tra- ditional melody. Achievement of this aim might require extra data at a rate of tens or hundreds of bits per second. Some contemporary concert music contains highly spe- cific instructions for performance, sometimes even sever- al instructions per note. Where pitch intervals less than a semitone (microtones) are explicitly required, this is indi- cated by further qualification (half flat etc.). The require- ment for slight pitch adjustments is usually implicit: many musicians do not play exactly tempered scales but, according to musical context, make fine adjustments. One of the most important instructions about tim- bre is the name of the instrument that plays each part. Orchestration, the process of distributing the parts among the instruments of the orchestra, adds further information. 
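Among these ways of adding voices, the canon is the easiest to quantify: the added voice is completely determined by the leading voice plus a delay (and perhaps a transposition), so almost no new information is added. A minimal sketch follows; the note representation here is a hypothetical illustration, not the coding of Fig. 2c.

    from dataclasses import dataclass

    @dataclass
    class Note:
        pitch: int        # MIDI-style note number (60 = middle C)
        start: float      # starting time, in beats
        length: float     # duration, in beats

    def canon_voice(melody, delay_beats, transpose=0):
        """Derive the imitating voice of a canon from the leading voice."""
        return [Note(n.pitch + transpose, n.start + delay_beats, n.length)
                for n in melody]

    # Leading voice: the opening of the round "Frere Jacques" (C D E C).
    lead = [Note(60, 0, 1), Note(62, 1, 1), Note(64, 2, 1), Note(60, 3, 1)]
    follower = canon_voice(lead, delay_beats=4)

    # Writing both voices out explicitly doubles the number of note symbols;
    # coding the canon needs only the melody plus two parameters.
    print(len(lead) + len(follower), "notes written out, versus",
          len(lead), "notes + 2 parameters")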
However, there is sometimes a high redun- dancy when the same notes are played by different instruments. How many data are stored in an orchestral score? Stravinsky’s “The Rite of Spring” [21] provides an exam- ple of high content: it is written for a large orchestra and often the parts are relatively independent. In some sections, there are more than 40 distinct musical lines, although of course at any instant there is doubling of notes (Fig 1c). Coding just the notes of this score by sampling in time (cf Fig 2a) would require high trans- mission rates – over 100  kBaud – because of the compli- cated rhythms. Traditional coding (Fig 2b) is more eco- nomical, and requires only several thousand Baud6. So a transfer rate of up to several kBaud (equiva- lent to a few hundred words per second) is available to 6 The example cited is from rehearsal mark 11 in [21]. Demisemiquavers with triplets, quintuplets and septuplets at crotchet = 66 require sam- pling at 924  Hz. With 6 bits for pitch, the 31 parts require 172  kBaud. Using a code like Fig 2b, but with several more bits of articulation and expression marking, 200-300 notes per bar require several kBaud. the conductor of such a work, from the score alone. Not all of this is discernible: if one player in a tutti failed to accent a note, or if the bass clarinet and second bassoon exchanged parts, this would probably pass unnoticed. When one is not conducting nor listening to a perfor- mance, there is no need to read a score in real time, and one may spend minutes reading carefully a single page of score, which is played in several seconds. The performer: information input and output Orchestral players usually read only one line, so they receive and process their written parts at rates of up to a few hundred bits per second. Other visual inputs come from the movements by other musicians, especially the conductor’s baton, the leader’s bow and the ‘body lan- guage’ of section leaders. Musicians hear the sound around them, and read the gestures and ‘body language’ of the conductor. This affects their processing of the written information. The interpretation of a dynamic instruction such as forte depends on the ensemble loud- ness at the time. Fine pitch adjustments depend on the prevailing pitch and harmonic context. Players also receive feedback from the interaction with the instru- ment of their hands, arms and mouths – but this is get- ting ahead of the logical order, in which the obvious next question is: how much information does the musi- cian put out? Some instruments have a binary digital compo- nent. In keyboard instruments, and in some percussion, the individual pitches are effectively a finite number of parallel pitch channels. In harpsichords and organs, the keys are strictly digital: a key is either depressed or not, and the player’s control of the loudness of that note is binary. Bach reportedly said, disingenuously, of his organ playing: “There is nothing remarkable about it. All you have to do is hit the right notes at the right time, and the instrument plays itself ” [22]. Bach, who played the viola too, would of course have known that playing a single, beautiful note on such an instrument requires much more than simply starting and stopping at the right time. The exact timing of the depressing and release of keys are analogue parameters of great impor- tance in musical expression. In the piano, another ana- logue parameter is the momentum with which the ham- mer strikes the string. 
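MIDI, mentioned earlier, records precisely this mixture of digital and quasi-analogue control: the note number is a discrete choice, while the momentum of the key survives only as a "velocity" quantised to 7 bits. A sketch of the standard three-byte note-on message (the helper function itself is only illustrative):

    def midi_note_on(channel: int, note: int, velocity: int) -> bytes:
        """Standard three-byte MIDI note-on message.

        channel:  0-15   one of 16 parallel voices
        note:     0-127  digital pitch, one semitone per step
        velocity: 0-127  the 'analogue' key speed, quantised to 7 bits
        """
        assert 0 <= channel < 16 and 0 <= note < 128 and 0 <= velocity < 128
        return bytes([0x90 | channel, note, velocity])

    # Middle C, moderately loud, channel 0: three bytes per note-on.
    # At 31.25 kBaud, with a start and stop bit per byte, that is about
    # one millisecond of transmission time per message.
    print(midi_note_on(0, 60, 80).hex())    # -> '903c50'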
In percussion instruments, there are the complications of the position, speed and angle of the strike. Most woodwind and brass instruments have keys and valves used almost always in a binary way: either depressed or not. This does not however restrict the pitch to discrete values because pitch is also con- trolled by the player’s lips and air pressure. In orches- 84 Joe Wolfe tral string instruments, the pitch is controlled by a con- tinuous parameter (position of the finger stopping the string) plus choice of string. Phrasing and expression are largely supplied by performers. Consciously or unconsciously, musicians decide how to ‘shape’ the phrase. This includes varying the loudness and amount of vibrato of individual notes, and making slight adjustments to indicated durations. A note judged to be important might be given emphasis by increasing the loudness and vibrato, and by increasing its duration slightly beyond the indicated value. This is one notable – and valuable! – difference between a per- formance by a musician and one by a primitive music sequencer. To some extent these elements of interpre- tation are similar among musicians [23] and so they may, to that extent, be codified. Acousticians Friberg, Sundberg and colleagues, in consultation with promi- nent musicians, have induced and formalised perfor- mance rules that add such elements of interpretation to a sequence representing written music [5,24,25]. Their software produces a ‘performance’ that is much more idiomatic and “musical” than that produced by an ordi- nary sequencer. These ideas have inf luenced modern commercial music sequencers7. The instrument: input and output Written music is an incomplete instruction set. To oversimplify, the individual musician reads at typi- cally 100 Baud or less, and outputs time-varying con- trol signals, which may have several times this rate. The instrument outputs an analogue signal. For most mono- phonic instruments the output spectrum is dominated by approximately harmonic components whose fun- damental frequency (which equals the harmonic spac- ing) determines the pitch. The pitch varies in time (with vibrato and with successive notes) and the amplitudes of the spectral components vary in time. The information required to encode this output depends on the fidelity and dynamic range required. It is at this stage that there is a great increase in the data required for encoding. If the performance is recorded uncompressed on a CD, then it results in the same enormous data transfer rate whether it be the intricate orchestration of ‘The Rite of Spring’ or one of the much simpler examples given above. On many instruments, players control several inter- dependent analogue parameters connected with phras- ing, such as vibrato, loudness, and variations in tim- ing and intonation. Performers may also control sev- eral parameters that contribute to the timbre. In string instruments these include bow position, speed and force. In wind instruments, they include blowing pressure, sev- eral aspects of embouchure (e.g. lip tension, jaw position, position of lips on reed) and the shape of the vocal tract. These parameters may be adjusted several times per sec- ond, and each may have several bits of precision. Togeth- er they may contribute up to a few hundred Baud. The instrument, then, is where the data rate increas- es dramatically. But surely the instrument is not creating information? 
Rather, we could say that the instrument increases the redundancy – creates redundant data – by a large factor: one period of the note is very similar to the preceding one. This oversimplifies a little: two simi- lar hypothetically identical performances by a player – or even by a music sequencer and synthesiser – will not 7 These typically have a range of settings for ‘expressive’ performance, from meccanico to molto espressivo and molto rubato, with varying inter- pretations including straight, swing, Viennese waltz and funk. Figure 3. One information chain, from composer’s original ideas to performed music. The approximate data content is given in bits and kilobytes (1  kbyte ≅ 8000 bits) and the rate of data transfer is given in Baud (1 Baud = 1 bit per second) and kiloBaud. 85From idea to acoustics and back again: the creation and analysis of information in music produce the same waveform, but the differences are not information to be transmitted from composer and player to listener. TRANSMISSION AND RADIATION In performance, instruments radiate sound into the air. These signals, plus background noise, are convoluted by the delays and multiple reflections of the performance venue. This extra information is recognised by listeners who can discern some details about the venue from lis- tening to a recording – the difference between a cathedral and open air is an extreme example. This information contributes feedback to the conductor and players, who in general adapt their performance to the acoustic envi- ronment. For instance, they might play more quietly in a room with a low background noise and more slowly and more marcato in a room with a long reverberation time. A performance creates a sound pressure field: the sound pressure p varies with position vector (r) and time (t). It would take a prodigious number of data to record such a field with a resolution in space and time corresponding to the half-wavelength and half-period of the highest audible frequencies (say 30  µs and 1  cm). Of course, the whole field is not sampled by a single lis- tener, who receives just the sound pressure at each ear (p(r1,t) and p(r2,t)), although the positions of the ears may vary in time as the listener moves his/her head. So each ear receives an analogue signal which, if the level of background noise is sufficiently low, may have the same dynamic and frequency range as the sum of the signals from the instruments. Our imaginary composer, orchestrator, musicians, conductor and performance venue have now delivered to the ear the great data rate mentioned in the introduc- tion. Because of the high signal redundancy, the infor- mation rate or algorithmic entropy rate is considerably lower, but still perhaps impressive. The information has been generated by mental processes of the composer and performers, which we may consider as algorithms – subtle and in many cases not understood – processing inputs from memory, education and culture. The instru- ment has turned this information into the radiated sig- nal, which has been filtered and convolved by the acous- tic environment. It’s now time to follow the signal into the listener’s head. THE ANALYSIS OF INFORMATION The outer and middle ear are, for our purposes, pri- marily acoustic and mechanical impedance transformers that overcome the mismatch between the air of the radi- ation field and the cochlear fluid in the inner ear. (They are also filters, transmitting some frequencies more effectively than others.) 
The qualitative change occurs in the cochlea of the inner ear in which the input signal – single channel analog – is actively filtered, compressed and converted to parallel digital electrical signals in the auditory nerve. Because of the position-dependent mechanical prop- erties of the basilar membrane, pitch is in part coded by channel: only low frequency waves reach the apical end of the membrane, so nerve fibres from this region carry information about low frequencies. It is also partly cod- ed in rate of firing, at low frequencies at least, because the hair cells are stimulated at the frequency of the motion8. Signal amplitude is also partly coded by chan- nel (some fibres only respond to large signals) and partly by (analog) signal firing rate: overall, larger stimuli pro- duce higher firing rates. The minimum firing rate is not however zero: most neurones have a ‘background firing rate’ – a rate at which they fire in the absence of any sig- nal. This makes a neuron capable of carrying a “nega- tive” signal: if the cell is inhibited by a neighbour, its firing rate falls below the background rate. Lateral inhi- bition among neighbouring cells is useful in amplify- ing small simultaneous differences. Nerves also become less sensitive with continued stimulation, so a changing signal usually has a greater effect than a steady one. For more detail, the reader is referred to reviews of percep- tion and neurobiology [27,28,29,30,31,32]. Coding in the auditory nerve The pulses in the nerve fibres, called action poten- tials9, are binary – either the stimulus is strong enough produce an action potential, which travels along the nerve fibre, or else nothing happens. As in electron- ics, the advantage of digital signals is their immunity to noise and distortion. Nerve fibres are very lossy coaxial cables, so an unamplified signal is substantially lost after transmission of a few millimetres. Many stages of ampli- fication and pulse shaping are conducted by the nerve membrane where it is exposed at the nodes of Ranvier. What is the data transfer rate at this stage? There are about 30,000 nerve fibres or channels, each capable of 8 Experiments with implanted electrodes show that, at low stimulation rates, perceived pitch depends approximately logarithmically on the stimulation rate but also linearly on the electrode position [26]. 9 The voltage inside biological cells is usually tens of mV negative. When nerve cells are stimulated (by briefly making their membrane “insula- tion” leaky), the internal voltage rises ~100 mV before returning to the resting value. 86 Joe Wolfe transmitting a few hundred action potentials per second. If the coding were strictly digital, the data transfer rate would surpass that of a CD. The practical rate is much less, because of redundancy: in part because nearby fibres carry highly correlated signals. What happens to this signal in the brain is difficult to follow directly. The experimental observations of psychophysics include inte- gration, sampling and signal treatment at higher levels. Effects including the active filtering in the basi- lar membrane give rise to the masking of weak signals by strong signals in nearby frequency bands. There are only roughly 30 critical bands so, instead of 30,000 par- allel frequency channels, perception effectively involves only of the order of 30. For an unmasked tone, the just noticeable difference (JND) in sound level is roughly 1  dB. 
Over a short term dynamic range of 60 dB, this gives about 60 perceptible loudness levels (requiring 6 bits). The JND for frequency may be as small as tenths of a percent for sustained signals, but in our calculation the maximum frequency resolution is limited over most of the range by the temporal sampling rate. The great- est perceptual resolution in time is a few tens of milli- seconds. At this rate, the number of different frequency percepts is about 1000 (10 bits). So there are about 16 bits, sampled at up to 30 times per second, in 30 chan- nels. The product gives data transmission rate of 16 kBaud: a considerable overestimate because the JNDs increase towards the ends of the frequency range and as sampling rate and number of simultaneous stimu- li increases10. Whatever the actual maximum rate, to achieve it would require a signal that, at the perceptual level, had no redundancy or order: a signal that sounded random. Not music. Processing – sorting into notes It is easier to perceive notes (which usually include several or many separate frequency components) than to perceive the individual frequency components of its spectrum. With practice and careful listening, one can distinguish some spectral components in notes in some circumstances11. That naïve listeners rarely do so suggests that we have either a very well-learned or an inbuilt mechanism for combining the various frequen- cy components of a note together and perceiving it as a whole. This capacity is partly explained in terms of two 10 There are further complications such as feedback loops and other control signals which come “downwards” from the brain to the ear, and these affect the “upwards” signals to the brain [33]. 11 Or, conversely, a small number of harmonics may be made sufficiently louder than the rest that they can be identified as separate notes, as in harmonic singing. general properties attributed to the nervous system: that change is more noticeable than lack of change, and that things that change in the same way are often grouped together. Consider a note comprising several harmon- ics: if the pitch of the note changes (either melodically or due to vibrato), then the pitches of all its components change in exact proportion; if the loudness changes, then the loudness of the harmonics also changes. Evi- dently we possess signal processors that group these separate, but similarly changing elements together and identify them as a single note. Instrumental and operatic soloists make use of vibrato to make their notes identifi- able against the sound of the orchestra12. The system works especially well for notes whose spectral components are approximately harmonic, which we identify as having a definite pitch. This capacity may have been important in the evolution of human audi- tion. Many human vocal sounds (the vowels in speech, but also inarticulate cries and screams, whether sung or spoken) have at any instant a definite pitch and spec- tral components which fall in the harmonic series. It is likely that we have evolved hard- and soft-ware capable of identifying vocalised sounds among other sounds that do not have harmonic structure, such as wind noise. The system works so well that we hear missing fundamentals and Tartini tones. Analysis in time The shortest time scale of interest in music is the period of the vibration. This ranges from about 50  μs to 50  ms. 
For low pitches, the auditory nerve carries some information about pressure variation on this time scale, but while we are aware of pitch, we are rarely aware of the variation in pressure that gives rise to that pitch13. The next time scale is that of transients. When an instrument begins to play a note, there is a short time (tens of milliseconds) over which the amplitudes of the various components vary considerably before ‘settling down’ to establish a relatively unvarying spectrum. These transients are so important to the timbre of a note that different wind instruments are readily confused if the initial and final transients are removed [34]. Tran- sients in musical notes are analogous to plosive conso- 12 This effect is especially useful if some of the harmonics of the soloist occur in a frequency range where the accompanying sounds have rela- tively low level – if we can hear one component clearly, it seems that we can track other components which have the same vibrato and phrasing. 13 A contrabassoon can play Bb0 at 29  Hz. When this note is played loudly, we can just detect a periodic variation as the reed opens and closes 29  times per second. Most of the sound we hear, however, is in the higher harmonics rather than the fundamental. 87From idea to acoustics and back again: the creation and analysis of information in music nants (d, t, g, k, b, p) in speech or singing. In both cases we are capable of concentrating and hearing them with some clarity, but under most circumstances these details are analysed subconsciously. The third time scale (several tens of milliseconds and longer) is that of notes [35,36]. It is at this level that we sense pitch and timing: the basic elements of melody. With little concentration, we can readily be conscious of the rhythm and the pitch, and also of the timbre of the instrument playing it. It is, however, difficult to intro- spect much beyond this: although our ears and their associated low-level processing have coded the various component frequencies and how they vary on the scale of tens of milliseconds, we are usually aware of the sig- nal at a higher level: that of pitches, rhythms and tim- bres. A changing signal is less redundant than a constant one, and our senses reflect this. After a while we no longer notice the sound of the wind, the weight of our clothes, the strange colour of artificial lighting; but we do notice sudden changes in them – changes over time. Similarly, we notice sharp boundaries in a visual image rather than a gradual change between two colours or shades – changes in space or channel. Changes in time are enhanced by the property of nerves to fire more rapidly when first excited than they do during a steady stimulus. Differences in space or channel number are enhanced by neural circuits that effectively subtract the signals from adjacent nerves using lateral inhibition [37]. Pitch sensitivity provides a good example. A single note without vibrato is a steady signal, which is prob- ably carried at all times by the same nerve fibres. A note with vibrato is a varying signal, which is probably car- ried at different times by different nerve fibres. Vibrato makes notes more noticeable, and also makes it easier to identify a single instrument in an ensemble. Timing sensitivity provides another example. 
We are usually less conscious of the duration and end of the note than the beginning: a variation in the timing of the end of each note is noticed as a change in articulation – some notes more staccato than others; a variation in the timing of the beginning is noticed as a variation in the rhythm, and is more noticeable. Symmetries: the ear and the instrument In this sense, our ears and their associated low-lev- el processing perform a role that is the reverse of that of the instrument: the player controls the note’s pitch, duration and often the timbre; the instrument converts the player’s partly digital, partly analogue parallel sig- nal into a complicated vibration, or equivalently a set of simple (usually harmonic) vibrations in a mechani- cal oscillator (string or air column). These vibrations, often via an impedance transformer (bridge and body of string instruments, bells of brass instruments) cause a pressure wave that is a single analogue signal: p(t). The ear receives a wave p’(t) and, via impedance transformers (the outer and middle ear) this causes a complicated vibration, or equivalently a set of sim- ple, often harmonic, vibrations in a mechanical oscilla- tor (the basilar membrane). These vibrations are sensed and processed, and we perceive the note’s timing, pitch, duration and timbre. The perception of notes is subject to categorisation (i.e. digitisation): when fine differences in pitch are pre- sented, listeners, especially those with musical training, tend to sort them into the discrete notes in a scale [38]. Thus the perception of pitch is partly digital and partly analogue – we perceive a note, but may remark that it was a little sharp or flat. More symmetries: the listener and the composer On time-scales larger than those discussed above, listeners are capable of perceiving structures and fea- tures in music: we may identify (whether consciously or otherwise) themes, harmonies, orchestration etc. This article gives no more than some pointers to research in this area. Sloboda [3] compares the analysis of linguis- tic structure by Chomsky with the analysis of musi- cal structure by Schenker, which uses hierarchies of note groupings and their functions. Some seem general, while others are specific to certain cultures. One way of studying this level of structure is by proposing plausible models and comparing their performance with that of human subjects [39-41]. These processes complete the communication sym- metry. To the extent that the listener hears melodic pat- terns, repeats and transformations of thematic mate- rial, s/he reverses the process of composition and may leave the concert hall humming the themes or ideas that began the whole process. The information transmitted between the minds of composer and listener may differ in detail, but the cod- ing is physiologically similar in the two minds, in that it involves many parallel digital signals in neurones. Between the two, however, the information passes through a coding totally foreign to the operation of the brain – a data-rich, serial, analogue signal. The inter- preters for this foreign signal are the musical instrument in one direction and the ear in the other, whose symme- try is discussed above. The performing musicians direct and supervise translation at one end. The listener has an 88 Joe Wolfe interpretive role that may be the reverse of those of play- er and composer, depending on training and attitude. A discussion of this is beyond our current aim. 
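The partly digital, partly analogue character of pitch perception noted above is easy to mimic in code: a measured frequency is assigned to the nearest note of the equal-tempered scale (the category), and what is left over is an analogue remainder in cents – the "a little sharp or flat". A rough sketch, using the C0 ≅ 16.35 Hz reference of Figure 2; the function and note spellings are illustrative only.

    import math

    NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
    C0_HZ = 16.352      # reference frequency of C0 for A4 = 440 Hz (as in Fig. 2)

    def categorise(freq_hz):
        """Quantise a frequency to the nearest equal-tempered note.

        Returns the note name (the 'digital' percept) and the deviation
        in cents (the 'analogue' remainder: + is sharp, - is flat).
        """
        semitones = 12 * math.log2(freq_hz / C0_HZ)
        nearest = round(semitones)
        cents = 100 * (semitones - nearest)
        return NOTE_NAMES[nearest % 12] + str(nearest // 12), cents

    # A4 in tune, A4 a little sharp, and the contrabassoon's Bb0 (= A#0)
    # of the footnote above.
    for f in (440.0, 446.0, 29.0):
        name, cents = categorise(f)
        print(f"{f:7.1f} Hz -> {name} {cents:+.0f} cents")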
MUSICAL COMMUNICATION To a communications engineer, music might seem inef f icient and unreliable. Dif ferent listeners may extract different messages from the same signal. Lis- teners may differ with the composer over the question “what is it about?” This does not mean, of course, that it is without meaning or value: the signal is rich in information often input by different people (composer, performers, conductor) so it is not surprising that dif- ferent people extract different subsets of that informa- tion, or interpret it differently. To quote Aaron Cop- land: “‘Is there a meaning to music?’ My answer to that would be, ‘Yes’. And ‘Can you state in so many words what the meaning is?’ My answer to that would be, ‘No’.” [42]. Researchers are however quantifying aspects of the meaning. Schubert [43], for instance, measures emotional responses to music in a two-parameter space and finds reasonably consistent responses, with a reso- lution of a few bits in each direction and a time reso- lution of seconds. This gives a Baud rate not far below that of text being read. In the context of musical enjoyment, the processes of encoding and decoding may be at least as important as any part of the communication. But why do we so enjoy this encoding and decoding? Why have we evolved the capacity for this sophisticated, complicated but imprecise method of communication of abstract ideas? Does musical ability confer survival advantages on indi- viduals possessing it? Why can such abstract communi- cation have powerful emotional effects? These questions are invitations for speculation, but it is interesting to look at them with regard to information coding. Music and speech: similarities and complementarities The physiological hardware used for listening to music and speech is the same, and some of the software may be shared too. Most speech sounds involve vibra- tion of the vocal folds. The time scale of these vibra- tions is shorter than that of nerve or muscle response, so any given vibration is very similar to its predecessor, so the sound produced is usually quasi-periodic. These periodic speech sounds (as well as screams, cries, and moans) have harmonic spectra. The ability to discern a set of harmonic frequency components as an entity, and to track simultaneous changes in that set, is an ability to discern one voice or cry from background sound. It is also much of the ability to follow a melody. On the other hand, the signal codings of speech and music are different. Oversimplifying for the sake of the argument, we could say that they are almost comple- mentary, especially with regard to digitisation. Speech coding is digital in that it uses a discrete set of speech sounds (phonemes). In alphabetic languages, (a subset of) these are all that is recorded, as letters, in the text or transcribed form. Further, they are digitised in percep- tion (i.e. they are perceived categorically [44]). Phonemes are encoded by features of the sound spectrum (for- mants and formant trajectories) and by transients. But in music, transients (especially the way notes start) and fea- tures of the spectrum are together what we call timbre. Most of the ‘text’ of music is notes: digital representation of pitch and timing. These are also perceived digitally (categorically) in music [38]. In speech, however, these features are prosody and (except in tonal languages such as Mandarin and Thai) they are analog variables, which are not notated. 
So the texts of music and speech use the acoustical features and digitisation in almost comple- mentary ways, as the table shows. I discuss this in great- er detail elsewhere [45]. Why music? The capacity to communicate using sound, wheth- er by speech or more primitive articulations, may have been sufficiently important to select for a suitable capac- ity for sound analysis. This explains (at the evolutionary level) why we have the mechanisms that we use for ana- lysing music. But why do we so use those mechanisms? Why do parents sing to infants? Why do we like and make music? Perhaps signal processing can provide part of the answer. Those who write or use automatic speech recogni- tion software know that it is non-trivial to extract the spectral features, envelope and pitch that carry informa- tion in both speech and music, especially in the pres- ence of background noise. In some cases, however, it may be easier in music. Consider an unaccompanied melody, sung or played by a single instrument, which might be an example of music from our early pre-his- tory. This signal has frequencies that are usually sta- ble during a note, compared with the rapid, continu- ous (i.e. analogue) pitch changes in speech. Rhythms in music are also more regular in music than in speech. In instrumental music and in vocalise (singing without words),  the spectral features change less, and in a more regular way, than they do speech. When we sing to babies [46], is it possible that we are using the reduction- 89From idea to acoustics and back again: the creation and analysis of information in music ist method to teach them how to listen, developing the skills necessary to understand speech? Could music be a game for the ear? Games are often described as models of social behaviour, that develop use- ful mental and physical skills. Games develop reflexes, co-ordination and muscular strength that may con- fer evolutionary advantages. Intellectual and socialising games develop skills that could also confer survival or mating advantages. If speech and signal processing skills enhanced our ancestors’ chances of survival or mating, the game of music may have been selected, whether it were transferred between generations by genetics or culture. The basic skills of sound analysis are subtle and beyond introspection, but that is true of many games: we are no more conscious of how we analyse sounds than we are of the muscular control we used to catch a ball. What we do with these skills is sometimes elaborate, but that is also true of games such as cricket and chess. In games and in music, our enjoyment of neurological exer- cise and challenges seems to require successively more complicated games as our capacities develop. Speech carries the meaning of the words spoken, but it also carries information in the way in which the words are spoken. The rhythms and tempi, subtle pauses and variations in articulation and loudness, the overall register and the changing pitch – all carry information. Information of this latter type gives subtle shades to the meaning conveyed by the words, and it often tells of the speaker’s emotional state. The ability to convey this information distinguishes a good actor from someone who just reads the words. Music also carries expressive information in subtle variations in rhythm and phrasing [24,47], coded in a comparable way [48]. However, an important vehicle for affective infor- mation in speech is prosody. 
These features, completely omitted in the text of speech, are the dominant features of music, whereas the features used to encode the explicit information in speech are used, in music, for timbre and are often varied little. I end by inviting the reader to wonder, as I do, whether this may be one of the reasons for the attraction and emotional power of music, this peculiarly coded, abstract method of communication.

Table 1. Acoustical features of music and speech signals show complementary coding. (Reproduced from [45].)

Acoustical feature: Fundamental frequency (when quasi-periodic)
  Music without words: pitch component of melody; categorised; notated; precision possible
  Speech: pitch component of prosody; not categorised; not notated; variability common

Acoustical feature: Temporal regularities and quantisation on a longer time scale
  Music without words: rhythmic component of melody; categorised; notated; precision possible
  Speech: rhythmic component of prosody; not categorised; not notated; variability common

Acoustical feature: Short silences
  Music without words: articulation; sometimes notated
  Speech: parts of plosive phonemes; implicitly notated

Acoustical feature: Steady formants
  Music without words: components of instrumental timbre; not notated; not categorised per se
  Speech: components of sustained phonemes; notated; categorised

Acoustical feature: Varying formants
  Music without words: not widely used
  Speech: components of plosive phonemes; categorised; notated

Acoustical feature: Transient spectral details
  Music without words: components of timbre; not categorised; sometimes notated
  Speech: components of consonants; categorised; notated

ACKNOWLEDGMENT

I thank Sten Ternström and Emery Schubert for helpful comments.

REFERENCES

1. Kolmogorov, A.N. “Three approaches to the definition of the concept ‘amount of information’”, Problemy Peredachi Informatsii, 1, 3-11 (1965) (in Russian; cited by Chaitin [2]).
2. Chaitin, G.J. “Information, Randomness and Incompleteness”. (World Scientific, Singapore, 1987).
3. Sloboda, J. “The Musical Mind”. (Clarendon Press, Oxford, 1985).
4. Garnett, G. “Music, Signals, and Representations: a Survey” in “Representations of Musical Signals”, de Poli, Piccialli and Roads, eds. (MIT Press, Cambridge, MA, 1991).
5. Sundberg, J. “Musical performance: a synthesis-by-rule approach”, Computer Music J., 7, 37-43 (1987).
6. Friberg, A. “Generative rules for music performance: a formal description of a rule system”, Computer Music J., 15, 49-55 (1991).
7. Schönberg, A. “Fundamentals of Musical Composition”. (Faber, London, 1967).
8. Schwanauer, S.M. and Levitt, D.A., eds. “Machine Models of Music”. (MIT Press, Cambridge, MA, 1993).
9. Cope, D. “Experiments in musical intelligence (EMI): non-linear linguistic-based composition”, Interface, 18, 117-139 (1989).
10. Smetanin, M. “Strange Attractions”. (Sounds Australian, Sydney, 1990).
11. Wesley-Smith, M. “White Knight and Beaver”. (Sounds Australian, Sydney, 1984).
12. Masson, C. “Nouveau Traité des Règles pour la Composition de la Musique” (1705). (Facsimile edition, Minkoff, Geneva, 1971).
13. Leibowitz, R. “Introduction à la musique de douze sons”. (L’Arche, Paris, 1949).
14. Helmholtz, H.L.F. “On the Sensations of Tone as a Physiological Basis for the Theory of Music” (1877), English translation by A.J. Ellis. (Dover, NY, 1954).
15. Stravinsky, I. “The Rite of Spring: sketches 1911-1913. Facsimile reproductions with commentary by R. Craft”. (Boosey & Hawkes, London, 1969).
16. Dirst, M. and Weigend, A.S. “Baroque forecasting: on completing J.S. Bach’s last fugue”, in “Time Series Prediction: Forecasting the Future and Understanding the Past”, A.S. Weigend and N.A. Gershenfeld, eds. (Addison-Wesley, Reading, MA, 1993).
17. Dubnov, S. and Assayag, G.
“Universal prediction applied to stylistic music generation”, in “Mathematics and Music”, G. Assayag, H.G. Feichtinger and J.F. Rodrigues, eds. (Springer, Berlin, 2002).
18. Maxwell, H.J. “An expert system for harmonizing analysis of tonal music”, in “Understanding Music with AI: Perspectives on Music Cognition”, M. Balaban, K. Ebcioglu and O. Laske, eds., pp 335-353. (MIT Press, Cambridge, MA, 1992).
19. Hild, H., Feulner, J. and Menzel, W. “HARMONET: a neural net for harmonizing chorales in the style of J.S. Bach”, in “Advances in Neural Information Processing Systems”, J.E. Moody, S.J. Hanson and R.P. Lippmann, eds., 4:267-274. (Morgan Kaufmann, San Mateo, CA, 1992).
20. Stone, K. “Music Notation in the Twentieth Century”. (Norton, New York, 1980).
21. Stravinsky, I. “The Rite of Spring” (1921). The example cited is from rehearsal mark 11. (Boosey & Hawkes, London, 1967).
22. Köhler, J.F. “Historia Scholarum Lipsiensium” (1776), quoted by David, H.T. and Mendel, A. “The Bach Reader”. (Norton, NY, 1972).
23. Repp, B.H. “A constraint on the expressive timing of a melodic gesture: evidence from performance and aesthetic judgment”, Music Perception, 10, 221-242 (1992).
24. Sundberg, J., Friberg, A. and Fryden, L. “Threshold and preference quantities of rules for music performance”, Music Perception, 9, 71-92 (1991).
25. Juslin, P.N., Friberg, A. and Bresin, R. “Toward a computational model of expression in music performance: the GERM model”, Musicae Scientiae, Spec. Issue 2001-2002, 63-122 (2002).
26. Fearn, R., Carter, P. and Wolfe, J. “The perception of pitch by users of cochlear implants: possible significance for rate and place theories of pitch”, Acoustics Australia, 27, 41-43 (1999).
27. Barlow, H.B. in “Physics and Mathematics of the Nervous System”, Conrad, M., Güttinger, W. and Dal Cin, M., eds. (Springer-Verlag, Berlin, 1974).
28. Møller, A.R. “Auditory Physiology”. (Academic, NY, 1983).
29. Fletcher, N.H. “The physical bases of perception”, Interdisciplinary Sci. Rev., 9, 6-13 (1984).
30. Kandel, E.R. and Schwartz, J.H. “Principles of Neural Science”. (Elsevier, 1985).
31. Altschuler, R.A., Bobbin, R.P., Clopton, B.M. and Hoffman, D.W. “Neurobiology of Hearing: the Central Auditory System”. (Raven, NY, 1991).
32. Yates, G.K. “The Ear as an Acoustical Transducer”, Acoustics Australia, 21, 77-81 (1993).
33. Spangler, K.M. and Warr, W.B. “The descending auditory system” in “Neurobiology of Hearing”, R.A. Altschuler et al., eds., pp 27-45. (Raven, NY, 1991).
34. Berger, K.W. “Some factors in the recognition of timbre”, J. Acoust. Soc. Am., 36, 1888 (1963).
35. Warren, R.M., Gardner, D.A., Brubaker, B.S. and Bashford, J.A. “Melodic and nonmelodic sequences of tones: effects of duration on perception”, Music Perception, 8, 277-290 (1991).
36. Warren, R.M. “La perception des séquences acoustiques: intégration globale ou résolution temporelle?” in “Penser les Sons. Psychologie Cognitive de l’Audition”, McAdams, S. and Bigand, E., eds. (Presses Universitaires de France, 1994).
37. Shepard, G.M. “Neurobiology”. (Oxford University Press, 1988).
38. Locke, S. and Kellar, L. “Categorical perception in a non-linguistic mode”, Cortex, 9, 355-369 (1973).
39. Lischka, C. “Understanding Music Cognition: A Connectionist View” in “Representations of Musical Signals”, de Poli, Piccialli and Roads, eds. (MIT Press, Cambridge, MA, 1991).
40. Longuet-Higgins, H.C. “Artificial intelligence and musical cognition”, Phil. Trans. R.
Soc. Lond. A 349, 103-113 (1994).
41. Longuet-Higgins, H.C. and Lisle, E.R. “Modelling musical cognition”, Contemporary Music Review, 3, 15-27 (1989).
42. Copland, A. “What to Listen for in Music”. (New American Library, NY, 1967).
43. Schubert, E. “Continuous measurement of self-report emotional response to music” in “Music and Emotion: Theory and Research (Series in Affective Science)”, P.N. Juslin and J.A. Sloboda, eds., pp 393-414. (Oxford University Press, London, 2000).
44. Clark, J. and Yallop, C. “An Introduction to Phonetics and Phonology”. (Blackwell, Oxford, 1990).
45. Wolfe, J. “Speech and music, acoustics and coding, and what music might be ‘for’”, International Conference on Music Perception and Cognition, Sydney, 2002, K. Stevens, D. Burnham, G. McPherson, E. Schubert and J. Renwick, eds., pp 10-13 (2002). www.phys.unsw.edu.au/~jw/ICMPC.pdf
46. Gérard, C. and Auxiette, C. “The processing of musical prosody by musical and nonmusical children”, Music Perception, 10, 93-126 (1992).
47. Mersenne, M. “Harmonie Universelle, contenant la Théorie et la Pratique de la Musique” (1636). (Facsimile edition, CNRS, Paris, 1975).
48. Banse, R. and Scherer, K.R. “Acoustic profiles in vocal emotion and expression”, J. Personality and Social Psychology, 70, 614-636 (1996).