The Value of Subjective Testing
By Jim Thompson
Many noise control engineers focus on objective tests and the data provided by such tests. This is natural because of the possible distortions of subjective impressions or personal bias. At the same time, many noise control engineers working in areas related to subjective response or sound quality do extensive testing to evaluate subjective responses to noise. In many of these cases, one is trying to turn subjective responses or preferences into objective data.
Most engineers are a little suspicious of subjective data. They have been trained in school to believe in hard data. All their training has been focused on how to work with such data. I thought relating some of my experiences dealing with subjective data, specifically subjective ratings by test drivers, might be of interest and might even illustrate some of the strengths and weaknesses of such data.
Let me begin with when I first went to work for a tire company. I had a Ph.D. and nearly 10 years of experience in occupational and building acoustics at that point. Within the first week of my joining the company, I was asked to go on a ride with one of the test drivers on our track. He was doing tire noise evaluations and my boss thought it would be a good experience for me. It was a rude awakening for me. I could not hear most of the sounds he was talking about rating on a 10-point scale. I could hear the difference between the 7.5 and 4.5 rated tires.
One of the smart things I did was to go back to my boss’s office and tell him about my inability. He congratulated me on my honesty and suggested that I remember this vital lesson for the rest of my career. Yes, we wanted our tires to be successful for the owner of the vehicle, but we were designing our tires based on the ratings of our own and the automotive manufacturer’s test drivers.
It is also important to note that different auto manufacturers had different test procedures that our test drivers had to match. One company ran tests on six different road surfaces to rank tires. In each case, these were well-defined tests that employed controls and repetition to ensure repeatability. In the end, these test drivers did the ratings and could reliably rank differences the public would seldom notice.
This is a good point to note that this reliance on test driver subjective ratings was a strange criterion and could lead to odd circumstances. I feel comfortable in saying that 99% of vehicle owners or occupants would not notice the differences between 6 and 9 (out of 10) rated tires. No tire ever got a 10 rating. However, there was one customer in 100 or 1,000 who might hear the noise. That customer was the one driving the ratings and test procedures.
Once I was put in charge of the acoustics group, my first assignment for every new engineer or technician was to have them go on an evaluation with one of the test drivers. The technicians would come back from that ride and freely admit they could not hear what the test driver was rating. I had to prompt most of the engineers to admit their limitations. For all, it was a good lesson and helped them to understand the difficulty of our job and the skill of the test drivers.
Over time, we all learned to be proficient at subjective evaluations and worked well with the test drivers and product development engineers. The designers developing tires believed in and relied upon these subjective ratings. One of the first major successes I had working for this company was developing algorithms to convert objective sound spectra to the equivalent of test driver subjective ratings. The tire designers still had more faith in the real subjective ratings, but since we could turn around multiple tire tests in the lab in hours compared to days for tests on the track, they often accepted our estimated subjective ratings. Over time they even began to trust them.
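For readers who want a feel for what such a conversion can look like, here is a minimal sketch in Python. It is not the algorithm we used, which I have not described here. It assumes the simplest possible model: sum the band levels of a spectrum into one overall level, then fit a straight line from that level to the drivers' ratings. The function names, the band levels, the ratings, and the lower scale limit of 1 are all made up for illustration.

```python
from math import log10


def overall_level(band_levels_db):
    """Energy-sum a list of band levels (dB) into one overall level (dB)."""
    return 10 * log10(sum(10 ** (level / 10) for level in band_levels_db))


def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return slope, mean_y - slope * mean_x


# Hypothetical training set: lab spectra (three bands each, in dB) paired
# with the subjective ratings given to the same tires on the track.
spectra = [[60, 62, 58], [63, 66, 61], [66, 69, 65], [70, 72, 68]]
ratings = [8.0, 7.0, 6.0, 4.5]

levels = [overall_level(s) for s in spectra]
slope, intercept = fit_line(levels, ratings)


def estimate_rating(band_levels_db):
    """Estimated subjective rating, clamped to an assumed 1-to-10 scale."""
    raw = intercept + slope * overall_level(band_levels_db)
    return max(1.0, min(10.0, raw))
```

Calling `estimate_rating([64, 67, 62])` returns an estimate for a tire that has only been measured in the lab. A real conversion would almost certainly weight individual frequency bands rather than rely on one overall level, and it would have to be refit whenever new track ratings became available.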
So, does this mean I fully accept and believe in subjective ratings? Not in the least. This was a special case of highly trained and experienced test drivers who knew multimillion-dollar programs rested on their ratings. They had strict procedures, were careful, and if they were not sure of their ratings, would rerun multiple sets of tires to be sure.
So, as you may expect, there were some exceptions. During my time at the tire company, we had a very successful all-season passenger tire. It was always a challenge to make all-season tires quiet for several reasons. When we first introduced this tire, it was rated by one auto company as a 7.5. This was the highest rating ever given to an all-season tire. Four years later when we introduced a replacement for this phenomenally successful tire, it was receiving ratings of 4.5.
What had happened? The test drivers had gotten accustomed to the performance of our tire. Our competitors had made improvements. Finally, auto manufacturers are always looking for better performance in noise and everything else. Objectively, the tires performed slightly better than when they were introduced because of running improvements, but subjective ratings are subjective. By the way, this meant we had to continuously adjust our objective-to-subjective conversion algorithms.
I have one example that illustrates an important aspect of subjective ratings. Many years after leaving the company, I was working with a large team to develop a brake squeal measurement standard. This took years of work and the participation of brake and brake component manufacturers from around the world. In the end, a highly successful measurement standard was developed, and I wrote a book on how to do this measurement.
Now you may be asking what this has to do with subjective testing. Like tires, brakes are evaluated subjectively for noise performance by most auto manufacturers. Towards the end of our test specification development process, we asked each of the participants to run tests in the lab using the procedure we had developed, along with subjective tests, to assess correlation. When we got together to review the results, there was a good correlation with a few exceptions. The major exception was that one company had little to no correlation with its subjective tests.
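To make the idea of correlation concrete, here is a small Python sketch of one common way to compare a lab measurement with subjective ratings: the Spearman rank correlation, which asks only whether the two methods rank the test items in the same order. This is an illustration, not the analysis the team performed, and every number in it is invented. It assumes a hypothetical lab metric where higher means more squeal and a 10-point rating where higher means quieter, so good agreement shows up as a value near -1.

```python
from math import sqrt


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Invented data: a lab squeal metric for six brake sets, and the
# subjective ratings two hypothetical companies gave the same sets.
lab_metric = [2, 5, 9, 14, 20, 27]
company_a = [9.0, 8.5, 7.5, 7.0, 5.5, 4.0]   # ratings track the lab metric
company_b = [7.0, 7.5, 7.0, 7.5, 7.0, 7.5]   # ratings ignore the lab metric

rho_a = spearman(lab_metric, company_a)
rho_b = spearman(lab_metric, company_b)
```

With these invented numbers, `rho_a` has a magnitude of 1 because the ratings fall steadily as the lab metric rises, while `rho_b` is close to zero because the ratings barely respond to the lab metric at all.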
There was a lengthy discussion as to why this one company was so different. To make a long story short, they seemed to be following similar test procedures to the other companies. However, when we discussed how frequently their test drivers’ hearing was tested, they were quiet. Since brake squeal commonly occurs at frequencies of 10,000 Hz and up, even moderate hearing damage can adversely affect one’s ability to hear the squeal. I found out later that this company did test their drivers’ hearing and had to disqualify all of them from subjective brake tests.
In closing, I would say that I have seen cases where subjective testing can be effective and reliable. However, this takes care and rigid adherence to procedures and policies. So, I guess I believe in subjective testing with a lot of qualifications.