Granular synthesis is a technique in which a source sound or waveform is broken into many fragments, often of very short duration, which are then restructured and rearranged according to various patterning and indeterminacy functions.
If we imagine the simplest possible granular synthesis algorithm, in which a precise fragment of sound is repeated with regularity, there are two principal attributes of this process that we are most concerned with. Firstly, the duration of each sound grain is significant: if the grain duration is very small, typically less than 0.02 seconds, then less of the character of the source sound will be evident. If the grain duration is greater than 0.02 seconds then more of the character of the source sound or waveform will be evident. Secondly, the rate at which grains are generated is significant: if grain generation is below 20 hertz, i.e. fewer than 20 grains per second, then the stream of grains will be perceived as a rhythmic pulsation; if the rate of grain generation increases beyond 20 Hz then individual grains will be harder to distinguish and instead we will begin to perceive a buzzing tone, the fundamental of which will correspond to the frequency of grain generation. Whenever grain generation is periodic, any pitch contained within the source material is not normally perceived as the fundamental of the tone. Instead, the pitch of the source material or waveform will be perceived as a resonance peak (sometimes referred to as a formant); transposition of the source material will therefore shift this resonance peak.
The following example illustrates the concepts discussed above. None of Csound's built-in granular synthesis opcodes are used; instead, schedkwhen in instrument 1 is used to precisely control the triggering of grains in instrument 2. Three notes in instrument 1 are called from the score one after the other, which in turn generate three streams of grains in instrument 2. The first note demonstrates the transition from pulsation to the perception of a tone as the rate of grain generation extends beyond 20 Hz. The second note demonstrates the loss of influence of the source material as the grain duration is reduced below 0.02 seconds. The third note demonstrates how shifting the pitch of the source material for the grains results in the shifting of a resonance peak in the output tone. In each case, information regarding rate of grain generation, duration and formant (source material pitch) is output to the terminal every 1/2 second so that the user can observe the changing parameters.
It should also be noted how the amplitude of each grain is enveloped in instrument 2. If grains were left unenveloped they would likely produce clicks on account of discontinuities in the waveform produced at the beginning and ending of each grain.
Granular synthesis in which grain generation occurs with perceivable periodicity is referred to as synchronous granular synthesis. Granular synthesis in which this periodicity is not evident is referred to as asynchronous granular synthesis.
<CsoundSynthesizer>
<CsOptions>
-odevaudio -b512 -dm0
</CsOptions>
<CsInstruments>
;Example by Iain McCurdy

sr = 44100
ksmps = 1
nchnls = 1
0dbfs = 1

giSine ftgen 0,0,4096,10,1

instr 1
kRate expon p4,p3,p5 ; rate of grain generation created as an exponential function from p-field values
kTrig metro kRate    ; a trigger to generate grains
kDur  expon p6,p3,p7 ; grain duration is created as an exponential function from p-field values
kForm expon p8,p3,p9 ; formant is created as an exponential function from p-field values
;                     p1 p2 p3   p4
schedkwhen kTrig,0,0, 2, 0, kDur,kForm ;trigger a note(grain) in instr 2
;print data to terminal every 1/2 second
printks "Rate:%5.2F Dur:%5.2F Formant:%5.2F%n", 0.5, kRate , kDur, kForm
endin

instr 2
iForm = p4
aEnv linseg 0,0.005,0.2,p3-0.01,0.2,0.005,0
aSig poscil aEnv, iForm, giSine
out aSig
endin
</CsInstruments>
<CsScore>
;p4 = rate begin
;p5 = rate end
;p6 = duration begin
;p7 = duration end
;p8 = formant begin
;p9 = formant end
; p1 p2 p3 p4 p5  p6   p7   p8  p9
i 1  0  30 1  100 0.02 0.02 400 400  ;demo of grain generation rate
i 1  31 10 10 10  0.4  0.01 400 400  ;demo of grain size
i 1  42 20 50 50  0.02 0.02 100 5000 ;demo of changing formant
e
</CsScore>
</CsoundSynthesizer>
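The grain envelope in instrument 2 above is built from linseg segments. As a variation, the following minimal sketch reads a window function from a table once per grain, which is one common way to avoid clicks at grain boundaries. The table name giWin, the fixed rate of 40 grains per second and the use of -odac are assumptions made for this illustration and are not part of the original example.

```csound
<CsoundSynthesizer>
<CsOptions>
-odac -dm0
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 1
nchnls = 1
0dbfs = 1

giSine ftgen 0, 0, 4096, 10, 1
; GEN20 window type 2 (Hanning), peak value 1 - giWin is a hypothetical name
giWin  ftgen 0, 0, 8192, 20, 2, 1

instr 1 ; grain generating instrument
  kTrig metro 40                     ; 40 grains per second
  ;                       p1 p2 p3    p4
  schedkwhen kTrig, 0, 0, 2, 0, 0.02, 400
endin

instr 2 ; grain sounding instrument
  ; one pass through the window table per grain (frequency = 1/p3)
  aEnv poscil 0.2, 1/p3, giWin
  aSig poscil aEnv, p4, giSine
  out  aSig
endin
</CsInstruments>
<CsScore>
i 1 0 5
e
</CsScore>
</CsoundSynthesizer>
```

Because the window starts and ends at zero, each grain fades in and out smoothly regardless of its duration, so the envelope does not need to be recalculated when p3 changes.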
The principles outlined in the previous example can be extended to imitate vowel sounds produced by the human voice. This type of granular synthesis is referred to as FOF (fonction d'onde formantique) synthesis and is based on work by Xavier Rodet on his CHANT program at IRCAM. Typically five synchronous granular synthesis streams will be used to create five different resonant peaks in a fundamental tone in order to imitate different vowel sounds expressible by the human voice. The most crucial element in defining a vowel imitation is the degree to which the source material within each of the five grain streams is transposed. Bandwidth (essentially grain duration) and intensity (loudness) of each grain stream are also important indicators in defining the resultant sound.
Csound has a number of opcodes that make working with FOF synthesis easier. We will be using fof.
Information regarding frequency, bandwidth and intensity values that will produce various vowel sounds for different voice types can be found in the appendix of the Csound manual. These values are stored in function tables in the FOF synthesis example. GEN07, which produces linear break point envelopes, is chosen as we will then be able to morph continuously between vowels.
<CsoundSynthesizer>
<CsOptions>
-odevaudio -b512 -dm0
</CsOptions>
<CsInstruments>
;example by Iain McCurdy

sr = 44100
ksmps = 16
nchnls = 2
0dbfs = 1

instr 1
kFund expon p4,p3,p5 ; fundamental
kVow  line  p6,p3,p7 ; vowel select
kBW   line  p8,p3,p9 ; bandwidth factor
iVoice = p10         ; voice select

; read formant frequencies from tables
kForm1 table kVow,1+(iVoice*15),1
kForm2 table kVow,2+(iVoice*15),1
kForm3 table kVow,3+(iVoice*15),1
kForm4 table kVow,4+(iVoice*15),1
kForm5 table kVow,5+(iVoice*15),1
; read formant intensity values from tables
kDB1 table kVow,6+(iVoice*15),1
kDB2 table kVow,7+(iVoice*15),1
kDB3 table kVow,8+(iVoice*15),1
kDB4 table kVow,9+(iVoice*15),1
kDB5 table kVow,10+(iVoice*15),1
; read formant bandwidths from tables
kBW1 table kVow,11+(iVoice*15),1
kBW2 table kVow,12+(iVoice*15),1
kBW3 table kVow,13+(iVoice*15),1
kBW4 table kVow,14+(iVoice*15),1
kBW5 table kVow,15+(iVoice*15),1
; create resonant formants using the fof opcode
koct = 1
aForm1 fof ampdb(kDB1), kFund, kForm1, 0, kBW1, 0.003, 0.02, 0.007, 1000, 101, 102, 3600
aForm2 fof ampdb(kDB2), kFund, kForm2, 0, kBW2, 0.003, 0.02, 0.007, 1000, 101, 102, 3600
aForm3 fof ampdb(kDB3), kFund, kForm3, 0, kBW3, 0.003, 0.02, 0.007, 1000, 101, 102, 3600
aForm4 fof ampdb(kDB4), kFund, kForm4, 0, kBW4, 0.003, 0.02, 0.007, 1000, 101, 102, 3600
aForm5 fof ampdb(kDB5), kFund, kForm5, 0, kBW5, 0.003, 0.02, 0.007, 1000, 101, 102, 3600

; formants are mixed; their intensities were set from the table values above
aMix sum aForm1,aForm2,aForm3,aForm4,aForm5
kEnv linseg 0,3,1,p3-6,1,3,0 ; an amplitude envelope
outs aMix*kEnv, aMix*kEnv    ; send audio to outputs
endin
</CsInstruments>
<CsScore>
f 0 3600 ;DUMMY SCORE EVENT - PERMITS REALTIME PERFORMANCE FOR UP TO 1 HOUR

;FUNCTION TABLES STORING FORMANT DATA FOR EACH OF THE FIVE VOICE TYPES REPRESENTED
;BASS
f 1 0 32768 -7 600 10922 400 10922 250 10924 350 ;FREQ
f 2 0 32768 -7 1040 10922 1620 10922 1750 10924 600 ;FREQ
f 3 0 32768 -7 2250 10922 2400 10922 2600 10924 2400 ;FREQ
f 4 0 32768 -7 2450 10922 2800 10922 3050 10924 2675 ;FREQ
f 5 0 32768 -7 2750 10922 3100 10922 3340 10924 2950 ;FREQ
f 6 0 32768 -7 0 10922 0 10922 0 10924 0 ;dB
f 7 0 32768 -7 -7 10922 -12 10922 -30 10924 -20 ;dB
f 8 0 32768 -7 -9 10922 -9 10922 -16 10924 -32 ;dB
f 9 0 32768 -7 -9 10922 -12 10922 -22 10924 -28 ;dB
f 10 0 32768 -7 -20 10922 -18 10922 -28 10924 -36 ;dB
f 11 0 32768 -7 60 10922 40 10922 60 10924 40 ;BAND WIDTH
f 12 0 32768 -7 70 10922 80 10922 90 10924 80 ;BAND WIDTH
f 13 0 32768 -7 110 10922 100 10922 100 10924 100 ;BAND WIDTH
f 14 0 32768 -7 120 10922 120 10922 120 10924 120 ;BAND WIDTH
f 15 0 32768 -7 130 10922 120 10922 120 10924 120 ;BAND WIDTH
;TENOR
f 16 0 32768 -7 650 8192 400 8192 290 8192 400 8192 350 ;FREQ
f 17 0 32768 -7 1080 8192 1700 8192 1870 8192 800 8192 600 ;FREQ
f 18 0 32768 -7 2650 8192 2600 8192 2800 8192 2600 8192 2700 ;FREQ
f 19 0 32768 -7 2900 8192 3200 8192 3250 8192 2800 8192 2900 ;FREQ
f 20 0 32768 -7 3250 8192 3580 8192 3540 8192 3000 8192 3300 ;FREQ
f 21 0 32768 -7 0 8192 0 8192 0 8192 0 8192 0 ;dB
f 22 0 32768 -7 -6 8192 -14 8192 -15 8192 -10 8192 -20 ;dB
f 23 0 32768 -7 -7 8192 -12 8192 -18 8192 -12 8192 -17 ;dB
f 24 0 32768 -7 -8 8192 -14 8192 -20 8192 -12 8192 -14 ;dB
f 25 0 32768 -7 -22 8192 -20 8192 -30 8192 -26 8192 -26 ;dB
f 26 0 32768 -7 80 8192 70 8192 40 8192 40 8192 40 ;BAND WIDTH
f 27 0 32768 -7 90 8192 80 8192 90 8192 80 8192 60 ;BAND WIDTH
f 28 0 32768 -7 120 8192 100 8192 100 8192 100 8192 100 ;BAND WIDTH
f 29 0 32768 -7 130 8192 120 8192 120 8192 120 8192 120 ;BAND WIDTH
f 30 0 32768 -7 140 8192 120 8192 120 8192 120 8192 120 ;BAND WIDTH
;COUNTER TENOR
f 31 0 32768 -7 660 8192 440 8192 270 8192 430 8192 370 ;FREQ
f 32 0 32768 -7 1120 8192 1800 8192 1850 8192 820 8192 630 ;FREQ
f 33 0 32768 -7 2750 8192 2700 8192 2900 8192 2700 8192 2750 ;FREQ
f 34 0 32768 -7 3000 8192 3000 8192 3350 8192 3000 8192 3000 ;FREQ
f 35 0 32768 -7 3350 8192 3300 8192 3590 8192 3300 8192 3400 ;FREQ
f 36 0 32768 -7 0 8192 0 8192 0 8192 0 8192 0 ;dB
f 37 0 32768 -7 -6 8192 -14 8192 -24 8192 -10 8192 -20 ;dB
f 38 0 32768 -7 -23 8192 -18 8192 -24 8192 -26 8192 -23 ;dB
f 39 0 32768 -7 -24 8192 -20 8192 -36 8192 -22 8192 -30 ;dB
f 40 0 32768 -7 -38 8192 -20 8192 -36 8192 -34 8192 -30 ;dB
f 41 0 32768 -7 80 8192 70 8192 40 8192 40 8192 40 ;BAND WIDTH
f 42 0 32768 -7 90 8192 80 8192 90 8192 80 8192 60 ;BAND WIDTH
f 43 0 32768 -7 120 8192 100 8192 100 8192 100 8192 100 ;BAND WIDTH
f 44 0 32768 -7 130 8192 120 8192 120 8192 120 8192 120 ;BAND WIDTH
f 45 0 32768 -7 140 8192 120 8192 120 8192 120 8192 120 ;BAND WIDTH
;ALTO
f 46 0 32768 -7 800 8192 400 8192 350 8192 450 8192 325 ;FREQ
f 47 0 32768 -7 1150 8192 1600 8192 1700 8192 800 8192 700 ;FREQ
f 48 0 32768 -7 2800 8192 2700 8192 2700 8192 2830 8192 2530 ;FREQ
f 49 0 32768 -7 3500 8192 3300 8192 3700 8192 3500 8192 2500 ;FREQ
f 50 0 32768 -7 4950 8192 4950 8192 4950 8192 4950 8192 4950 ;FREQ
f 51 0 32768 -7 0 8192 0 8192 0 8192 0 8192 0 ;dB
f 52 0 32768 -7 -4 8192 -24 8192 -20 8192 -9 8192 -12 ;dB
f 53 0 32768 -7 -20 8192 -30 8192 -30 8192 -16 8192 -30 ;dB
f 54 0 32768 -7 -36 8192 -35 8192 -36 8192 -28 8192 -40 ;dB
f 55 0 32768 -7 -60 8192 -60 8192 -60 8192 -55 8192 -64 ;dB
f 56 0 32768 -7 50 8192 60 8192 50 8192 70 8192 50 ;BAND WIDTH
f 57 0 32768 -7 60 8192 80 8192 100 8192 80 8192 60 ;BAND WIDTH
f 58 0 32768 -7 170 8192 120 8192 120 8192 100 8192 170 ;BAND WIDTH
f 59 0 32768 -7 180 8192 150 8192 150 8192 130 8192 180 ;BAND WIDTH
f 60 0 32768 -7 200 8192 200 8192 200 8192 135 8192 200 ;BAND WIDTH
;SOPRANO
f 61 0 32768 -7 800 8192 350 8192 270 8192 450 8192 325 ;FREQ
f 62 0 32768 -7 1150 8192 2000 8192 2140 8192 800 8192 700 ;FREQ
f 63 0 32768 -7 2900 8192 2800 8192 2950 8192 2830 8192 2700 ;FREQ
f 64 0 32768 -7 3900 8192 3600 8192 3900 8192 3800 8192 3800 ;FREQ
f 65 0 32768 -7 4950 8192 4950 8192 4950 8192 4950 8192 4950 ;FREQ
f 66 0 32768 -7 0 8192 0 8192 0 8192 0 8192 0 ;dB
f 67 0 32768 -7 -6 8192 -20 8192 -12 8192 -11 8192 -16 ;dB
f 68 0 32768 -7 -32 8192 -15 8192 -26 8192 -22 8192 -35 ;dB
f 69 0 32768 -7 -20 8192 -40 8192 -26 8192 -22 8192 -40 ;dB
f 70 0 32768 -7 -50 8192 -56 8192 -44 8192 -50 8192 -60 ;dB
f 71 0 32768 -7 80 8192 60 8192 60 8192 70 8192 50 ;BAND WIDTH
f 72 0 32768 -7 90 8192 90 8192 90 8192 80 8192 60 ;BAND WIDTH
f 73 0 32768 -7 120 8192 100 8192 100 8192 100 8192 170 ;BAND WIDTH
f 74 0 32768 -7 130 8192 150 8192 120 8192 130 8192 180 ;BAND WIDTH
f 75 0 32768 -7 140 8192 200 8192 120 8192 135 8192 200 ;BAND WIDTH

f 101 0 4096 10 1 ;SINE WAVE
f 102 0 1024 19 0.5 0.5 270 0.5 ;EXPONENTIAL CURVE USED TO DEFINE THE ENVELOPE SHAPE OF FOF PULSES

; p4 = fundamental begin value (c.p.s.)
; p5 = fundamental end value
; p6 = vowel begin value (0 - 1 : a e i o u)
; p7 = vowel end value
; p8 = bandwidth factor begin (suggested range 0 - 2)
; p9 = bandwidth factor end
; p10 = voice (0=bass; 1=tenor; 2=counter_tenor; 3=alto; 4=soprano)

; p1 p2 p3 p4  p5  p6 p7 p8  p9 p10
i 1  0  10 50  100 0  1  2   0  0
i 1  8  .  78  77  1  0  1   0  1
i 1  16 .  150 118 0  1  1   0  2
i 1  24 .  200 220 1  0  0.2 0  3
i 1  32 .  400 800 0  1  0.2 0  4
e
</CsScore>
</CsoundSynthesizer>
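To see in isolation how one GEN07 table turns a continuously moving vowel index into a formant value, the following minimal sketch sweeps through the tenor first-formant data from the example above and prints the interpolated frequency. No audio is produced (-n). The table name giF1 and the 4-second sweep are assumptions made for this illustration only.

```csound
<CsoundSynthesizer>
<CsOptions>
-n -dm0
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 16
nchnls = 1
0dbfs = 1

; first formant frequency of the tenor voice for the vowels a e i o u
; (same breakpoints as table 16 above; giF1 is a hypothetical name)
giF1 ftgen 0, 0, 32768, -7, 650, 8192, 400, 8192, 290, 8192, 400, 8192, 350

instr 1
  kVow  line  0, p3, 1          ; sweep the vowel index from a (0) to u (1)
  kForm table kVow, giF1, 1     ; final argument 1 = normalised index (0 - 1)
  printks "Vowel index:%5.2f  Formant 1:%7.2f Hz%n", 0.5, kVow, kForm
endin
</CsInstruments>
<CsScore>
i 1 0 4
e
</CsScore>
</CsoundSynthesizer>
```

Index values that fall between two vowels return a linearly interpolated frequency, which is what allows the FOF example to glide from one vowel to the next rather than switching abruptly.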
The previous two examples have exploited psychoacoustic phenomena associated with the perception of granular textures that exhibit periodicity and patterns. If we introduce indeterminacy into some of the parameters of granular synthesis we begin to lose the coherence of some of these harmonic structures.
The next example is based on the design of example 04F01.csd. Two streams of grains are generated. The first stream begins as a synchronous stream, but as the note progresses the periodicity of grain generation is eroded through the addition of an increasing degree of Gaussian noise. It will be heard how the tone metamorphoses from one characterized by steady purity to one of fuzzy airiness. The second stream applies a similar process of increasing indeterminacy to the formant parameter (frequency of material within each grain).
Other parameters of granular synthesis such as the amplitude of each grain, grain duration, spatial location etc. can be similarly modulated with random functions to offset the psychoacoustic effects of synchronicity when using constant values.
<CsoundSynthesizer>
<CsOptions>
-odevaudio -b512 -dm0
</CsOptions>
<CsInstruments>
;Example by Iain McCurdy

sr = 44100
ksmps = 1
nchnls = 1
0dbfs = 1

giWave ftgen 0,0,2^10,10,1,1/2,1/4,1/8,1/16,1/32,1/64

instr 1 ;grain generating instrument
kRate = p4
kTrig metro kRate ; a trigger to generate grains
kDur = p5
kForm = p6
;note delay time (p2) is defined using a random function -
;- beginning with no randomization but then gradually increasing
kDelayRange transeg 0,1,0,0, p3-1,4,0.03
kDelay gauss kDelayRange
;                     p1 p2           p3   p4
schedkwhen kTrig,0,0, 3, abs(kDelay), kDur,kForm ;trigger a note (grain) in instr 3
endin

instr 2 ;grain generating instrument
kRate = p4
kTrig metro kRate ; a trigger to generate grains
kDur = p5
;formant frequency (p4) is multiplied by a random function -
;- beginning with no randomization but then gradually increasing
kForm = p6
kFormOSRange transeg 0,1,0,0, p3-1,2,12 ;range defined in semitones
kFormOS gauss kFormOSRange
;                     p1 p2 p3   p4
schedkwhen kTrig,0,0, 3, 0, kDur,kForm*semitone(kFormOS) ;trigger a note (grain) in instr 3
endin

instr 3 ;grain sounding instrument
iForm = p4
aEnv linseg 0,0.005,0.2,p3-0.01,0.2,0.005,0
aSig poscil aEnv, iForm, giWave
out aSig
endin
</CsInstruments>
<CsScore>
;p4 = rate
;p5 = duration
;p6 = formant
; p1 p2   p3 p4  p5   p6
i 1  0    12 200 0.02 400
i 2  12.5 12 200 0.02 400
e
</CsScore>
</CsoundSynthesizer>
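The same approach can be applied to the other parameters mentioned above. As a minimal sketch, and not part of the original example, the following instrument randomizes the amplitude of each grain by passing it to the grain sounding instrument as an extra p-field. The names kRange, kDev, kAmp and iAmp, the use of p5 for amplitude, and the chosen ranges are all assumptions made for this illustration.

```csound
<CsoundSynthesizer>
<CsOptions>
-odac -dm0
</CsOptions>
<CsInstruments>
sr = 44100
ksmps = 1
nchnls = 1
0dbfs = 1

giSine ftgen 0, 0, 4096, 10, 1

instr 1 ; grain generating instrument
  kTrig  metro 200               ; 200 grains per second
  kRange line  0, p3, 0.15       ; deviation range grows from 0 to 0.15
  kDev   gauss kRange            ; random deviation within +/- kRange
  kAmp   =     0.2 - abs(kDev)   ; grain peak amplitude between 0.05 and 0.2
  ;                       p1 p2 p3    p4   p5
  schedkwhen kTrig, 0, 0, 2, 0, 0.02, 400, kAmp
endin

instr 2 ; grain sounding instrument
  iAmp = p5                      ; per-grain amplitude (hypothetical p-field)
  aEnv linseg 0, 0.005, iAmp, p3-0.01, iAmp, 0.005, 0
  aSig poscil aEnv, p4, giSine
  out  aSig
endin
</CsInstruments>
<CsScore>
i 1 0 12
e
</CsScore>
</CsoundSynthesizer>
```

The note begins with every grain at the same amplitude and the variation between grains increases as the note progresses, following the same "gradually increasing indeterminacy" design used for delay and formant in the example above.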
The next example introduces another of Csound's built-in granular synthesis opcodes to demonstrate the range of dynamic sound spectra that are possible with granular synthesis.
Several parameters are modulated slowly using Csound's random spline generator rspline. These parameters are formant frequency, grain duration and grain density (rate of grain generation). The waveform used in generating the content for each grain is randomly chosen using a slow sample and hold random function: a new waveform will be selected every 10 seconds. Five waveforms are provided: a sawtooth, a square wave, a triangle wave, a pulse wave and a band limited buzz-like waveform. Some of these waveforms, particularly the sawtooth, square and pulse waveforms, can generate very high overtones. For this reason a high sample rate is recommended to reduce the risk of aliasing (see chapter 01A).
Current values for formant (cps), grain duration, density and waveform are printed to the terminal every second. The key for waveforms is: 1:sawtooth; 2:square; 3:triangle; 4:pulse; 5:buzz.
<CsoundSynthesizer>
<CsOptions>
-odevaudio -b512 -dm0
</CsOptions>
<CsInstruments>
;example by Iain McCurdy

sr = 96000
ksmps = 16
nchnls = 1
0dbfs = 1

;waveforms used for granulation
giSaw  ftgen 1,0,4096,7,0,4096,1
giSq   ftgen 2,0,4096,7,0,2046,0,0,1,2046,1
giTri  ftgen 3,0,4096,7,0,2046,1,2046,0
giPls  ftgen 4,0,4096,7,1,200,1,0,0,4096-200,0
giBuzz ftgen 5,0,4096,11,20,1,1

;window function - used as an amplitude envelope for each grain
;(hanning window)
giWFn ftgen 7,0,16384,20,2,1

instr 1
;random spline generates formant values in oct format
kOct rspline 4,8,0.1,0.5
;oct format values converted to cps format
kCPS = cpsoct(kOct)
;phase location is left at 0 (the beginning of the waveform)
kPhs = 0
;formant(frequency) randomization and phase randomization are not used
kFmd = 0
kPmd = 1
;grain duration and density (rate of grain generation) created as random spline functions
kGDur rspline 0.01,0.2,0.05,0.2
kDens rspline 10,200,0.05,0.5
;maximum number of grain overlaps allowed. This is used as a CPU brake
iMaxOvr = 1000
;function table for source waveform for content of the grain is randomized
;kFn will choose a different waveform from the five provided once every 10 seconds
kFn randomh 1,5.99,0.1
;print info. to the terminal
printks "CPS:%5.2F%TDur:%5.2F%TDensity:%5.2F%TWaveform:%1.0F%n",1,kCPS,kGDur,kDens,kFn
aSig grain3 kCPS, kPhs, kFmd, kPmd, kGDur, kDens, iMaxOvr, kFn, giWFn, 0, 0
out aSig*0.06
endin
</CsInstruments>
<CsScore>
i 1 0 300
e
</CsScore>
</CsoundSynthesizer>
The final example introduces grain3's two built-in randomizing functions for phase and pitch. Phase refers to the location in the source waveform from which a grain will be read; pitch refers to the pitch of the material within grains. In this example a long note is played. Initially no randomization is employed, but gradually phase randomization is increased and then reduced back to zero. The same process is then applied to the pitch randomization amount parameter. This time grain size is relatively large (0.8 seconds) and density correspondingly low (20 Hz).
<CsoundSynthesizer>
<CsOptions>
-odevaudio -b512 -dm0
</CsOptions>
<CsInstruments>
;example by Iain McCurdy

sr = 44100
ksmps = 16
nchnls = 1
0dbfs = 1

;waveforms used for granulation
giBuzz ftgen 1,0,4096,11,40,1,0.9

;window function - used as an amplitude envelope for each grain
;(bartlett window)
giWFn ftgen 2,0,16384,20,3,1

instr 1
kCPS = 100
kPhs = 0
kFmd transeg 0,21,0,0, 10,4,15, 10,-4,0
kPmd transeg 0,1,0,0, 10,4,1, 10,-4,0
kGDur = 0.8
kDens = 20
iMaxOvr = 1000
kFn = 1
;print info. to the terminal
printks "Random Phase:%5.2F%TPitch Random:%5.2F%n",1,kPmd,kFmd
aSig grain3 kCPS, kPhs, kFmd, kPmd, kGDur, kDens, iMaxOvr, kFn, giWFn, 0, 0
out aSig*0.06
endin
</CsInstruments>
<CsScore>
i 1 0 51
e
</CsScore>
</CsoundSynthesizer>
This chapter has introduced some of the concepts behind the synthesis of new sounds from simple waveforms using granular synthesis techniques. Only two of Csound's built-in opcodes for granular synthesis, fof and grain3, have been used; it is beyond the scope of this work to cover all of the many opcodes for granulation that Csound provides. This chapter has focussed mainly on synchronous granular synthesis; chapter 05G, which introduces granulation of recorded sound files, makes greater use of asynchronous granular synthesis for time-stretching and pitch shifting. That chapter also introduces some of Csound's other opcodes for granular synthesis.