Why is the soundstage portion given a lower weight in 1.5?
I assume the answer is that all IEMs have lousy soundstage, and a high weight would inflate the significance of tiny differences between terrible and marginally less terrible soundstages. It seems like ideally there would be separate scoring formulas for headphones and IEMs, but I think they’re trying to do a balancing act with a single formula.
GriffinSilver216, gotta say, Test Bench 1.5 looks like a measured but real shift towards what you suggested. If they took another 7% off FRC and put it on Imaging, it’d be basically your proposal.
So, the Apple AirPods Pro have an H1 chip, which allows you to seamlessly pair them with other Apple devices. Features like Spatial Audio and their companion interface are also only available when using an iOS device (Spatial Audio only works on iOS/iPadOS 14 or higher). They also charge via Lightning, which, if you already use an Apple device, means you don’t need a different cable to charge them. Aside from Apple-specific features, they also have ANC, which does an excellent overall job of cutting down the noise around you. Simply put, they’re definitely compatible with both platforms, but there are some features that are specific to the ‘matching’ device, which definitely plays a role in the recommendation here! :)
Thanks for explaining. How does Rtings weight things like Lightning cable vs. higher bass accuracy scores here? Is there a formula like there is for neutral sound? If so, is ANC given more weight for iPhone users than in the general “Best Wireless Bluetooth Earbuds” category?
I think the answer is that a very large part of RTings’ neutral sound score just measures the extent to which an earbud’s tuning matches the Harman In-Ear Target. Because the Buds Pro have a warmer tuning than the Buds+ (and the Harman Target), RTings registers that as a lower score. In my view, this isn’t ideal, because the Buds Pro tuning actually addresses complaints some audiophiles have about the Harman Target’s midrange. But RTings’ focus is on objective measurement (which is cool and unique, in my view), and I’m not sure if there are better proxies for sound quality that are objectively measurable (at reasonable cost).
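To make the "deviation from a target curve" idea concrete, here's a minimal sketch. This is not RTINGS' actual scoring formula; the RMS-of-dB-error approach, the frequencies, and the toy numbers are all illustrative assumptions, just to show why a warmer tuning would register as a lower compliance score.

```python
# Hypothetical sketch (NOT RTINGS' actual formula): one common way to
# quantify how closely a measured frequency response matches a target
# curve is the RMS deviation in dB. Lower deviation = closer to target.
import math

def rms_deviation_db(measured, target):
    """RMS of the dB error between a measured response and a target,
    both sampled at the same set of frequencies."""
    errors = [m - t for m, t in zip(measured, target)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Toy data at four frequency points (values in dB relative to target):
target  = [0.0, 0.0, 0.0, 0.0]   # the target curve itself
warm    = [3.0, 2.0, 0.0, 0.0]   # warmer tuning: elevated low end
neutral = [0.5, 0.0, -0.5, 0.0]  # close to target

print(rms_deviation_db(warm, target))     # ~1.80 dB: larger deviation
print(rms_deviation_db(neutral, target))  # ~0.35 dB: smaller deviation
```

Under a scheme like this, any deliberate departure from the target (even one listeners prefer) shows up purely as error, which is exactly the tension raised above.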
If you look up Oratory1990 on Reddit, he/she has posted how to EQ a ton of different products to get to his/her perception of a neutral tuning, which is very popular. I think it’s basically Harman with the upper-mids reined in a bit, similar to Crinacle’s preference curve.
As thomaslmusic55 mentioned, our dummy head is different from the one used in Harman’s research so even measuring the same headphone would give slightly different results between the two dummy heads. That’s why the target also looks a bit different.
Sorry to resurrect an old thread, but I’d suggest including something like the following language in your Test Results & Methodology pages for Bass Accuracy, Mid Accuracy, Treble Accuracy: “[Bass/mid/treble] accuracy measures how closely headphones follow RTings’ target curve between [X] and [Y] Hz. Our target curve complies with the latest research by Harman. For over-ear headphones, our target curve seeks to reproduce the _____[Harman 2018 Over-Ear Target?]. For in-ears, our target curve mimics the Harman 2019 In-Ear Target. Our dummy head (HMS) is different from the one used by Harman, so our target curve differs slightly from Harman’s, mainly at [frequencies where differences are most notable] for over-ears and [same thing] for in-ears. In most cases, the differences between RTings’ curves and Harman’s should be relatively minor for both over-ears and in-ears.”
For the Sound Profile page, I’d explain that you’re showing the frequency response compensated for the RTings target curve, and add the same explanation about its relationship to Harman’s targets.
If the suggestion (which, if I understand, is allowing users to customize target curves) isn’t feasible, one alternative could be to just have a selection of curves (Harman, Diffuse Field, Oratory1990, warm Harman, etc.). Sounds like Danegraphics might want something like the Etymotic curve.
That said, I realize the fact that in-ear and over-ear curves are different adds some work here.
Thanks for the consideration!
I do understand the desire for consistency, but consider that in-ears are different enough from over-ears/on-ears that they are basically different products with different use cases. Almost as different as speakers are from headphones in general.
If there were an overall headphone score, I wouldn’t weigh the Sports/Fitness category the same between in-ears and over/on-ears. In fact, Sports/Fitness really can’t be reasonably looked at for over/on-ears almost at all (except for in like, I dunno, cycling classes or something).
So I think that different weights should be used between different product types depending on the features that customers tend to reasonably look for in them.
Regardless, I’m happy that I can at least create my own ratings chart to filter out what I am and am not looking for.
But thank you again for at least considering it!
It looks from the ratings, though, that while all in-ears have lousy soundstage, some are slightly less lousy than others. That strikes me as relevant to sound quality if I’m picking between two in-ears. If the downside of that is that there’s a functional ceiling on how high an in-ear can realistically score, yeah, that’s slightly inelegant, but isn’t what matters the relative score of two in-ears?
Hey danegraphics, nice to see you again!
Thanks again for such detailed recommendations! I think you raise some good points, and it does seem like a good idea to add a test measuring how faithful headphones are to the original signal. We’ve also been considering adding a Cumulative Spectral Decay test! That said, I’ve added this to our TB suggestions for future consideration in a TB update! It’s worth mentioning, though, that it’s unlikely to be included in TB 1.5, as this test bench has been dedicated to fixing an issue!
With regards to overshoot/ringing amplitude, I’ve passed on this information to our test devs and they mentioned that it’s an interesting idea. It’s something they’ll be looking into! :)
Thanks again for always being so active! :)
Well, with in-ears, FRC becomes a lot more negligible, so the same applies, but even more so. Imaging should still be upgraded in importance in place of Distortion, and Soundstage (as measured on RTINGS) should have no bearing on scores for an in-ear, as it bypasses the pinnae. As far as Treble Accuracy being nerfed, as described at the end of the OP, I still think that’s a good idea due to the individual anatomy of the inner ear playing a potentially huge part in causing personalized deviations, though it’s possible treble deviations would be minimized due to lack of interaction with the pinnae.
Interesting, so like Bass 27%, Mid 27%, Imaging 24%, Treble 17%, WHD 5%? Something like that? I take it peaks/dips aren’t an important factor?
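For readers following along, a weighting like the one floated above just blends per-category sub-scores into one number. Here's a minimal sketch; the weights come from the comment, and the sub-scores and 0-10 scale are hypothetical, not RTINGS' published formula.

```python
# Hypothetical sketch of blending sub-scores with the weights proposed
# above. Weights and example sub-scores are illustrative, not RTINGS'.
def weighted_score(subscores, weights):
    """Blend per-category scores (0-10) using fractional weights summing to 1."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(subscores[name] * w for name, w in weights.items())

weights = {"bass": 0.27, "mid": 0.27, "imaging": 0.24, "treble": 0.17, "whd": 0.05}

# Made-up sub-scores for a hypothetical IEM:
iem = {"bass": 8.5, "mid": 8.0, "imaging": 7.0, "treble": 6.5, "whd": 9.0}
print(round(weighted_score(iem, weights), 2))  # -> 7.69
```

One consequence worth noting: with Soundstage at 0%, an IEM's ceiling is set entirely by the remaining categories, which is exactly the trade-off being debated in this thread.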
I’m a lot less knowledgeable, frankly, but insofar as FRC is more a matter of taste than are “technicals,” I think there’s a lot in these suggestions that sounds right to me.
GriffinSilver216: Would your recommendations be the same for in-ear truly wireless as well as for other types of products?
Hey! Thanks for asking!
It just has to do with the total number of headphones. We are a small team with limited resources and we retested 216 headphones with that update. We try to select the models we think most users are searching for and this pair just didn’t make the cut.
That makes sense generally, and appreciate how many headphones are out there. But ER4XR seems like a pretty high-profile item in the audiophile world (not that everyone loves it, but everyone’s aware of it and has opinions about the sound).
Actually, to put a finer point on this, I would not only test attack and fade/decay, but would weight technical ability more heavily than tonality (i.e. frequency response), since preferences for tonality (Harman, Diffuse Field, etc.) are probably much more diverse than preferences for speedy attack and a graceful, concave decay.
Love this idea!
Seems like a no-brainer.
This seems like a pretty good idea. Attack and also decay.
Strongly agree re the headphone comparison. I think the feature is a great idea, but as implemented, it’s not very usable.