4 – Listening Tests

4.1           Array Comparison

The three recording arrays being compared are the Multi Microphone Array and two Soundfield derivations, which were discussed in Sections 3.3 and 3.5.1 respectively. This section outlines how these arrays were compared. The Multi Microphone Array will be referred to as MMA, while the Soundfield in ITU specification and the Soundfield in adjusted specification will be referred to as SFI and SFA respectively.

The arrays will be compared pairwise: they will be presented in pairs and participants will be asked questions about each pair. To cover every possible pairing of the three arrays, the following comparisons will be made:

1) MMA vs. SFI

  • The Multi Microphone Array versus the Soundfield derivation in ITU specification will highlight whether the default settings of the Harpex-B decoding are preferable.

2) MMA vs. SFA

  • The Multi Microphone Array versus the adjusted Soundfield derivation will highlight whether the post recording benefits of the Soundfield system translate into a significant preference for it.

3) SFI vs. SFA

  • By comparing the two Soundfield derivations, differences between them can be established. As the front of the image for both derivations is the same, results here will focus on how the difference in rear pickup will influence preference.

The results of comparisons one and two have a bearing on the expected result of comparison three. For example, if MMA is preferred in comparison one and SFA is preferred in comparison two, then SFI should not be preferred in comparison three, as shown in Equation 4.1.
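Assuming that preference is transitive, this expectation can be written as follows. The symbol ≻ (read "is preferred to") is notation introduced here for illustration and is not necessarily the form used in the original equation:

```latex
\[
(\mathrm{MMA} \succ \mathrm{SFI}) \;\land\; (\mathrm{SFA} \succ \mathrm{MMA})
\;\Longrightarrow\; \mathrm{SFA} \succ \mathrm{SFI}
\]
```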


Equation 4.1 – Expected Results

Two pieces of music were selected for testing. The pieces differed in performance characteristics such as perceived loudness and dynamic range. Two twenty-second sections of each piece were selected for use in the test, each sourced from a distinct part of the piece. The aim was to establish whether the difference in performance characteristics had an influence on the results. Table 4.1 outlines the comparison of the arrays used. As there are three comparisons per set, a Bonferroni correction of three was applied to the statistical analysis, which is detailed in full in Appendix B.1.
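As a minimal sketch of what a Bonferroni correction of three implies, the Python below divides an assumed significance level of 0.05 by the three comparisons and applies the adjusted threshold to an exact two-sided binomial test. The 0.05 level, the choice of a binomial test and the example count of 16 preferences out of 20 are assumptions made for illustration only; the analysis actually performed is the one detailed in Appendix B.1.

```python
from math import comb


def bonferroni_alpha(alpha, n_comparisons):
    """Significance threshold per comparison after Bonferroni correction."""
    return alpha / n_comparisons


def binomial_two_sided_p(successes, trials):
    """Exact two-sided binomial p-value against a 50/50 null hypothesis.

    The null distribution is symmetric, so the upper tail is doubled
    and the result is capped at 1.
    """
    k = max(successes, trials - successes)
    tail = sum(comb(trials, i) for i in range(k, trials + 1)) / 2 ** trials
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    threshold = bonferroni_alpha(0.05, 3)   # assumed alpha, three comparisons
    p_value = binomial_two_sided_p(16, 20)  # hypothetical preference count
    print(f"threshold={threshold:.4f}, p={p_value:.4f}, "
          f"significant={p_value < threshold}")
```

With a correction of three, a comparison only counts as significant when its p-value falls below one third of the uncorrected threshold.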

Piece 1, section 1 | Piece 1, section 2 | Piece 2, section 1 | Piece 2, section 2

Table 4.1 – Extract Comparison

4.2           The Listening Room

Studio D of the Newton building is the University of Salford’s audio post production and mastering suite. Equipment applicable to this project includes:

  • Blue Sky Sat 6.5 MkII surround monitoring system including LFE.
  • Digidesign C24 control surface with 5.1 monitor controller.

Studio D is isolated from the building structure and other studios by acoustic damping, which helps to prevent external noise from influencing the environment within the studio. It has been designed to minimise the effects of room modes and is acoustically treated to provide an accurate monitoring environment (Salford University, 2014). The studio’s dimensions are shown in Figure 4.1.


Figure 4.1 – Studio D dimensions (H. Mattsson, personal communication, 15 April 2014)

4.2.1       ITU-R BS.1116-1 Room Standards

This section outlines the requirements for background noise, room dimensions and speaker setup given by the ITU-R BS.1116-1 subjective audio testing standard, together with the compliance of Studio D in each respect.

Area

The floor area of a room used for multichannel playback should be between 30 m² and 70 m² (ITU, BS.1116-1, 1997, p. 10). The floor area of Studio D is 15.86 m², so it does not meet this requirement.

Shape

The listening room should be symmetrical about the vertical plane through the centre of the room, perpendicular to the listening position, and Studio D complies with this (ITU, BS.1116-1, 1997, p. 10). The room should also be rectangular or trapezoidal in shape (ITU, BS.1116-1, 1997, p. 10). Studio D complies with the former; however, the corners at the listening end are tapered to accommodate a doorway in one corner and to preserve room symmetry in the other. These characteristics are shown in Figure 4.1.

Proportions

ITU-R BS.1116-1 supplies three formulae to assess the suitability of the room’s dimensions (ITU, BS.1116-1, 1997, p. 11). Studio D fails the first, as can be seen in Equation 4.2; however, it satisfies the remaining equations which are detailed in Equation 4.3 and Equation 4.4.
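The three conditions can be sketched in Python as below. The inequalities are written as they are commonly quoted from the standard (1.1 w/h ≤ l/h ≤ 4.5 w/h − 4, l/h < 3 and w/h < 3) and should be checked against ITU-R BS.1116-1 itself. The dimensions in the usage example are hypothetical placeholders, not the measured dimensions of Studio D, which are given in Figure 4.1.

```python
def check_room_proportions(length, width, height):
    """Evaluate the three room proportion conditions (all lengths in metres).

    Returns a dict mapping a descriptive name to True (met) or False (not met).
    """
    l_h = length / height
    w_h = width / height
    return {
        # 1.1 * (w/h) <= l/h <= 4.5 * (w/h) - 4
        "ratio_window": 1.1 * w_h <= l_h <= 4.5 * w_h - 4,
        # l/h < 3
        "length_to_height": l_h < 3,
        # w/h < 3
        "width_to_height": w_h < 3,
    }


if __name__ == "__main__":
    # Hypothetical room, NOT Studio D: 5.0 m long, 3.2 m wide, 2.5 m high.
    for name, met in check_room_proportions(5.0, 3.2, 2.5).items():
        print(f"{name}: {'met' if met else 'not met'}")
```

A room can fail the first condition while meeting the other two, which is the pattern reported for Studio D.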


Equation 4.2 – BS.1116-1 Room Dimensions Calculation and Result 1


Equation 4.3 – BS.1116-1 Room Dimensions Calculation and Result 2


Equation 4.4 – BS.1116-1 Room Dimensions Calculation and Result 3

Background Noise


Figure 4.2 – Background Noise Measurement
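The background noise criterion applied in this section is the NR10 noise rating curve. A minimal sketch of how octave-band measurements can be checked against it is given below in Python. It assumes the usual octave-band form of the noise rating curves, L = a + b × NR, with the constants as they are commonly tabulated; both the constants and the example band levels are assumptions for illustration and are not the levels measured in Studio D.

```python
# Octave-band constants (a, b) for noise rating curves, L = a + b * NR.
# Values as commonly tabulated; verify against the standard before use.
NR_CONSTANTS = {
    31.5: (55.4, 0.681),
    63: (35.5, 0.790),
    125: (22.0, 0.870),
    250: (12.0, 0.930),
    500: (4.8, 0.974),
    1000: (0.0, 1.000),
    2000: (-3.5, 1.015),
    4000: (-6.1, 1.025),
    8000: (-8.0, 1.030),
}


def nr_limit(nr, band_hz):
    """Sound pressure level limit (dB) of curve NR<nr> in one octave band."""
    a, b = NR_CONSTANTS[band_hz]
    return a + b * nr


def bands_exceeding(nr, measured):
    """Return the octave bands whose measured level exceeds curve NR<nr>."""
    return [band for band, level in measured.items()
            if level > nr_limit(nr, band)]


if __name__ == "__main__":
    # Hypothetical octave-band levels in dB, for illustration only.
    example = {63: 40.0, 250: 18.0, 1000: 12.0, 4000: 3.0}
    print("Bands above NR10:", bands_exceeding(10, example))
```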

The background noise of the testing venue is recommended to be below the NR10 noise rating curve (ITU, BS.1116-1, 1997, p. 14). Figure 4.2 shows the measured levels against the ITU recommendation. The measurements were made with Studio D configured for the listening test and with all equipment not required for the test switched off.

Listening Level

The ITU-R BS.1116-1 calculations state that, for playback over a five speaker system, the playback level should be calibrated to 83.6 dBA, as shown in Equation 4.5 (ITU, BS.1116-1, 1997, p. 14). However, the playback level was calibrated to 80 dBA, following the guidelines of the testing venue and taking into account the findings of the European Commission’s Scientific Committee on Emerging and Newly Identified Health Risks (European Commission, 2014). This adjustment of 3.6 dB is within the margin described in Section 4.3.4, which allows participants to adjust the playback level to a comfortable setting within 6 dB above or below the listening level calculated from ITU-R BS.1116-1.


Equation 4.5 – BS.1116-1 Playback Level Calculation and Result

Monitor Placement

The monitors should be at a height of 1.2 metres and at least 1 metre from each wall (ITU, BS.1116-1, 1997, p. 15). Studio D meets the former; however, all speakers are within a metre of a wall due to space constraints, as reflected by the room area figures given under Area in Section 4.2.1.

Listener Position

The distance from the loudspeaker to the listener should be between 2 and 5.1 metres and meet ITU-R BS.775-3 speaker angle specifications (ITU, BS.1116-1, 1997, p. 16). Studio D meets these criteria.

4.3           Delivery

A total of twenty-one volunteers took part in the listening tests. The results of one participant were excluded in post screening as they skipped two sections of the test. Out of the remaining allowable results, participant ages ranged from nineteen to thirty-eight years with an average age of twenty-four years. Twenty males and one female took part. All participants were students at the University of Salford. Subject areas ranged from undergraduate video, audio and acoustic courses to postgraduate audio, acoustics and spatial audio courses.

Participants took part in the listening test individually. Each test took place in Studio D in the Newton Building of the University of Salford. Of the listening spaces available during the project time frame, this studio came closest to meeting ITU-R BS.1116-1. A Max 6.1 patch was used to play the audio material, collect answers and export them for statistical analysis. A Dell E6530 was used to run the test interface, provide keyboard and mouse input and play back the audio extracts. A Focusrite Saffire Pro 24 FireWire audio interface was connected to the Studio D monitoring equipment via the patch bay. Technical specifications of the Pro 24 interface can be found in Appendix C.

4.3.1       Introduction and Briefing

Participants used the provided keyboard and mouse to fill in details such as their age, gender and occupation. These details were collected to identify any age, gender or occupation based trends, provided the test sample offered such opportunities.

Participants were then briefed on what to expect from the test in terms of duration and questioning. An introductory paragraph of text was provided on screen which welcomed them to the session and described the testing process. They were introduced to the test interface and guided through the training and interface familiarisation section.

4.3.2       Participant Training

For the attribute questions of the test, each definition was supported by two audio extracts, one displaying more of the attribute and one displaying less of it. Participants were then asked to select which extract displayed more of each attribute. Once the participant was comfortable with all aspects of the test, the test supervisor left the room. The supervisor used a window behind the participant to monitor how they interacted with the interface. If an issue was identified, the supervisor would enter the room to correct it.

4.3.3       Test Interface

Cycling74’s Max 6.1 software was used to create the test interface (Cycling 74, 2014). It is a piece of object based programming software designed for multimedia and interactive applications. It was used to:

  • Provide a means of training of participants,
  • Provide a test interface which will allow for pairwise comparison of two recording extracts,
  • Provide a means to randomise playback of test stimuli,
  • Provide looped surround sound playback of test stimuli,
  • Provide real time switching of test stimuli,
  • Allow for participants to submit their answers to test questions,
  • Provide test data for collation and processing.

Details of the patch can be found in Appendix A.
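The randomised presentation of stimuli can be illustrated outside Max with a short Python sketch. It is not the patch itself: it assumes that each of the four extracts is crossed with the three array pairings to give the twelve comparison sections, and that both the section order and the assignment of arrays to the A and B buttons are shuffled. The names `build_trials`, `ARRAYS` and `EXTRACTS` are hypothetical.

```python
import random
from itertools import combinations

ARRAYS = ["MMA", "SFI", "SFA"]
# (piece, section) for the four extracts used in the test.
EXTRACTS = [(piece, section) for piece in (1, 2) for section in (1, 2)]


def build_trials(rng):
    """Return the twelve pairwise trials in randomised order.

    rng is a random.Random instance so that a run can be reproduced.
    """
    trials = []
    for piece, section in EXTRACTS:
        for first, second in combinations(ARRAYS, 2):
            pair = [first, second]
            rng.shuffle(pair)  # randomise which array sits on button A or B
            trials.append({"piece": piece, "section": section,
                           "A": pair[0], "B": pair[1]})
    rng.shuffle(trials)  # randomise the order of the twelve sections
    return trials


if __name__ == "__main__":
    for trial in build_trials(random.Random()):
        print(trial)
```

Passing a seeded `random.Random` makes a given participant's presentation order repeatable, which is useful when matching exported answers back to stimuli.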

A copy of the pairwise comparison sections was provided to participants, as seen in Figure 4.3. In this section, participants were given details on each part of the test interface. Participants were encouraged to press various buttons to familiarise themselves with the progression of the test. For example, when all ‘Click me!’ buttons were pressed and the answers were confirmed, the patch displayed a box instructing the participant to move to the next section.

With the participant familiar and comfortable with the interface and text questions, they continued to the twelve pairwise comparison sections. All recording extracts used in the test up until this point were taken from a separate performance, recorded in the same session as the material used in the paired comparison sections. The definitions were supplied to participants on screen and on paper for reference throughout the test.

Figure 4.3 – Example of the layout section

4.3.4       Listening Level Adjustment

If required, the listening level of the playback system could be adjusted to a comfortable level by the listener. ITU-R BS.1116-1 outlines a reference listening level calculation for test playback, which is described under Listening Level in Section 4.2.1 along with the alternative use of a maximum level of 80 dBA.

Because the test assessed general impressions of musical sound sources rather than the audio codec defects that ITU-R BS.1116-1 covers, participants were given a flexibility of ± 6 dBA around the listening level to allow comfortable listening to the dynamic music (ITU, BS.1116-1, 1997, pp. 12, 13). The adjusted playback level of 80 dBA, outlined under Listening Level in Section 4.2.1, is within the ± 6 dBA margin. Although this facility for level adjustment was offered to participants during each test, all participants were comfortable with the initial playback level of 80 dBA.

4.3.5       Completion

With the test complete, the participants were instructed by the test interface program to contact the test supervisor. Participants were thanked for their time, any questions they had were answered, and they were asked not to share any information about the test with waiting participants. Their results were then logged and saved in a master Excel spreadsheet.

4.4           Summary

The makeup of the test sample has been described, and the test delivery method and procedure have been outlined. The Multi Microphone Array and two derivations of the Soundfield system were compared using a non-parametric pairwise comparison method. The results given by participants were categorical: for each presented pair of stimuli they indicated which they preferred and which conveyed the most of certain sonic attributes. The testing venue was compliant in most respects with the relevant testing standards.
