TV 2.0 is finally here! With this massive update, we've completely revamped the way we test and score TVs, with an emphasis on how a TV performs in a bright room. You can read about all the changes in our 2.0 changelog.

Test Bench 2.0
Changelog

Updated
Bright room reflection handling on 12 TVs.

See the previous 1.11 changelog.

Goal

TV 2.0 represents a significant change in our TV testing. The culmination of thousands of hours of research and development, it changes nearly every aspect of our TV reviews in some way, from test coverage to the individual objective scoring splines. We've added new reflection tests, changed how we measure color volume, and completely overhauled our response time testing. We've also restructured our reviews, as some sections like "Picture Quality" were getting very long, making it hard to find specific tests. We now group related tests together, making it easier to find the information you care the most about.

Summary Of Changes

Structure

One of the first things you'll notice when you open a review on TV 2.0 is that we've completely changed the structure. Some of our old sections, like Picture Quality, were getting very long and filled with tests that not everyone cared about. To remedy this, we've restructured the reviews to show the most important tests first. Instead of 6 main test categories on TV 1.11, we've broken down TV 2.0 into 12 separate sections. These sections more or less match the usages at the top of the review.

Scoring

Mixed Usage normal distribution on TV 1.11 vs. 2.0

From individual test scores to usages at the top of the review, TV 2.0 represents the biggest change to our scoring we've ever made. Most TVs score significantly lower, which better represents how they actually perform. The mixed usage score, for example, has decreased by an average of 0.7, but the difference is even more noticeable at the low end of the market, with many cheap TVs now scoring in the deep red. While there can still be a market for some of these TVs, if you care about picture quality and performance, the scoring now better reflects the overall experience with these TVs.

Usage Scores

Usage Name | Change | Description
Mixed Usage | Updated | Revamped weights to use individual scores instead of other usages.
Sports | Updated | Significantly increased the weight of SDR Brightness, adjusted other weights down.
Video Games | Updated | Renamed to 'Gaming,' simplified weights.
HDR Movies | Updated | Renamed to 'Home Theater.' Simplified scoring breakdown.
TV Shows | Removed | Replaced with Bright Room.
HDR Gaming | Removed | Updated the Gaming usage to cover HDR as well, so this usage is no longer needed.
PC Monitor | Removed | Removed for now, but we'd like to revisit this if there's enough interest. We're missing a few key tests like text clarity.
Bright Room | New | New usage. Covers watching TV and movies in a bright room.
Brightness | New | New performance usage.
Black Level | New | New performance usage.
Color | New | New performance usage.
Processing (In Development) | New | We're in the process of improving our tests related to image processing, but this score should give you a general idea of how a TV performs overall with its image processing capabilities.
Game Mode Responsiveness | New | New performance usage.
Motion Handling (Broken) | New | We're in the process of fixing the way we evaluate a TV's overall motion handling. This section is currently broken, and the score isn't indicative of how well a TV handles motion overall.

Test Changes

Test Group | Change | Description
Accelerated Longevity Test | Removed | None of the TVs that are running on this test will be updated to 2.0.
Contrast | New Methodology | Adjusted the test pattern to now take two sets of measurements. Adjusted the scoring curve.
Blooming | Updated | Renamed to 'Lighting Zone Precision'. The philosophy of the test has changed: we're now looking at how precisely the TV's local dimming can display a bright highlight without bleeding into surrounding areas, so TVs with no local dimming are now scoring a '0' on this test.
HDR Brightness | Updated | Adjusted the scoring to only look at peak brightness. ABL and sustained brightness are no longer scored. Adjusted the scoring splines.
HDR Brightness In Game Mode | Updated | Adjusted the scoring to only look at peak brightness. ABL and sustained brightness are no longer scored. Adjusted the scoring splines.
PQ EOTF Tracking | Updated | Adjusted the scoring splines.
SDR Brightness | Updated | Adjusted the scoring to only look at peak brightness. ABL and sustained brightness are no longer scored. Adjusted the scoring splines.
Color Gamut | Removed |
SDR Color Volume | New |
Color Volume | Updated | Renamed to 'HDR Color Volume'. Adjusted the scoring splines.
Pre Calibration | Updated | Renamed to 'SDR Pre-Calibration Color Accuracy'. Scoring weights changed slightly.
Post Calibration | Updated | Renamed to 'SDR Post-Calibration Color Accuracy'. Scoring weights changed slightly.
HDR Pre-Calibration Color Accuracy | New |
HDR Post-Calibration Color Accuracy | New |
Gray Uniformity | Updated | Adjusted the scoring splines and individual weights.
Black Uniformity | Updated | Adjusted the scoring splines and individual weights.
Viewing Angle | Updated | Adjusted the scoring splines.
Reflections | Removed | Replaced by the four new tests listed below.
Total Reflected Light | New |
Direct Reflections | New |
Ambient Black Level Raise | New |
Ambient Color Saturation | New |
Panel Technology | Updated | Added SPD chart.
Response Time | New Methodology | Changed to new pursuit photo. Added heatmap and new transition charts. Replaced 80% and 100% response time measurements with First Response Time, Total Response Time, and Worst 10% Response Time.
CAD In Game Mode @ Max Refresh Rate | New |
CAD In Game Mode @ 120Hz | New |
CAD In Game Mode @ 60Hz | New |
Flicker-Free | Updated | Renamed to 'Flicker,' and removed score.
Black Frame Insertion (BFI) | New Methodology | Changed to new pursuit photo.
Stutter | Updated | Adjusted scoring spline and weights.
Variable Refresh Rate | Updated | Removed 1440p tests. Adjusted scoring splines.
Input Lag | Updated | Removed 1440p tests. Changed 144Hz tests to max refresh rate. Adjusted scoring splines.
Supported Resolutions | Updated | Removed 1440p tests. Adjusted weights.
HDR Format Support | New | Moved from Inputs Specifications.
Inputs Specifications | Updated | Moved HDR tests to new box. Renamed 'HDMI 2.1 Class Bandwidth' to 'HDMI 2.1 Rated Speed.' Removed outdated tests.
Total Inputs | Removed | Combined with the Inputs Specifications box.
Sound | Updated | Removed distortion test. Added test for Digital Room Correction.
Interface | Updated | Removed performance tests.
Apps & Features | Removed |
Remote | Updated | Removed subjective tests.
TV Controls | Updated | Added a 'Mute Switch' test.

Detailed Changelog

Usage Scores

Usage scores under TV 1.11
Updated usages under TV 2.0

We've completely revamped our usage scores. We've removed a few outdated usages and simplified others. We've also added new Performance Usages to TV reviews. These usage scores look at specific aspects of a TV's performance, so if you care about brightness, for example, you just have to look at the Brightness score to understand how well the TV performs in that aspect. Some of these new performance usages are very much a work in progress. Processing is lacking some key tests that we're still working on, and our motion handling usage is decidedly broken. We're including them anyway because we want your feedback. What tests do you think we should feature in those usages? Let us know in the comments!

Usage (weights on 1.11, then on 2.0)

Mixed Usage
  1.11:
  • 38.6% HDR Movies
  • 18.8% TV Shows
  • 18.7% Sports
  • 10.4% Video Games
  • 8.3% HDR Gaming
  • 5.2% PC Monitor
  2.0:
  • 25% Brightness
  • 22% Black Level
  • 22% Color
  • 14% Processing
  • 7% Game Mode Responsiveness
  • 5% Direct Reflections
  • 3% Stutter
  • 2% Ambient Black Level Raise

Sports
  1.11:
  • 23% Reflections
  • 19% SDR Brightness
  • 15% Response Time
  • 14% Gray Uniformity
  • 13% Viewing Angle
  • 8% Upscaling: Sharpness Processing
  • 4% Frequency Response
  • 2% Flicker-Free
  • 2% Distortion
  2.0:
  • 50% SDR Brightness
  • 15% Processing
  • 10% Color
  • 10% Viewing Angle
  • 5% Gray Uniformity
  • 5% Worst 10% Response Time
  • 5% Direct Reflections

Video Games - Renamed to 'Gaming'
  1.11:
  • 30% Contrast And Dark Details In Game Mode
  • 20% Input Lag
  • 15% Response Time
  • 10% Reflections
  • 8% SDR Brightness
  • 6% Gray Uniformity
  • 3% Pre Calibration
  • 1% Variable Refresh Rate
  • 1% 4k @ 120Hz
  • 1% 1080p @ 120Hz
  • 1% 4k @ 120Hz Input Lag
  • 1% 1080p @ 120Hz Input Lag
  • 1% ALLM
  • 0.5% VRR
  • 0.5% Frequency Response
  • 0.5% Distortion
  2.0:
  • 45% Game Mode Responsiveness
  • 20% HDR Brightness in Game Mode
  • 17.5% Color
  • 17.5% Contrast and Dark Details in Game Mode

HDR Movies - Renamed to 'Home Theater'
  1.11:
  • 24% Contrast
  • 12% Pre Calibration
  • 10% Low-Quality Content Smoothing
  • 10% HDR Brightness
  • 8% HDR Native Gradient
  • 7% 24p Judder
  • 6% Black Uniformity
  • 5% Upscaling: Sharpness Processing
  • 5% Blooming
  • 4% PQ EOTF Tracking
  • 3% Color Volume
  • 3% Stutter
  • 2% Lighting Zone Transitions
  • 1% Dolby Vision
  2.0:
  • 35% Black Level
  • 25% Color
  • 20% Brightness
  • 15% Processing
  • 5% Stutter

Bright Room
  1.11: N/A
  2.0:
  • 55% SDR Brightness
  • 20% Direct Reflections
  • 15% Color
  • 5% Ambient Color Saturation
  • 5% Ambient Black Level Raise

Brightness
  1.11: N/A
  2.0:
  • 70% HDR Brightness
  • 30% SDR Brightness

Black Level
  1.11: N/A
  2.0:
  • 50% Contrast
  • 30% Black Uniformity
  • 20% Lighting Zone Precision

Color
  1.11: N/A
  2.0:
  • 40% HDR Color Volume
  • 30% SDR Color Volume
  • 15% SDR Pre-Calibration Color Accuracy
  • 15% HDR Pre-Calibration Color Accuracy

Processing (In Development)
  1.11: N/A
  2.0:
  • 30% Low-Quality Content Smoothing
  • 30% Upscaling: Sharpness Processing
  • 25% PQ EOTF Tracking
  • 15% HDR Native Gradient

Game Mode Responsiveness
  1.11: N/A
  2.0:
  • 25% Input Lag
  • 20% CAD in Game Mode @ 60Hz
  • 20% CAD in Game Mode @ 120Hz
  • 15% VRR
  • 10% CAD in Game Mode @ max
  • 10% Supported Resolutions

Motion Handling (Broken)
  1.11: N/A
  2.0:
  • 35% Stutter
  • 35% Worst 10% Response Time
  • 15% 24p Judder
  • 10% Lighting Zone Transitions
  • 5% Gray Uniformity

As you can see, we've greatly simplified most usage scores to better reflect what people care about with each usage. We also removed the TV Shows, HDR Gaming, and PC Monitor usages. We'd like to revisit the PC Monitor usage in a future update, but we're missing a few key tests that people care about.
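
To make the weighting concrete, here is a minimal Python sketch of how a usage score could be assembled from the 2.0 Mixed Usage weights listed above. It assumes a usage score is simply the weighted average of its component scores on a 0 to 10 scale, which the changelog implies but doesn't spell out. The function name `usage_score` and the sample component scores are hypothetical and not taken from any real review.

```python
# Mixed Usage weights on TV 2.0, taken from the list above (as fractions of 1).
MIXED_USAGE_WEIGHTS = {
    "Brightness": 0.25,
    "Black Level": 0.22,
    "Color": 0.22,
    "Processing": 0.14,
    "Game Mode Responsiveness": 0.07,
    "Direct Reflections": 0.05,
    "Stutter": 0.03,
    "Ambient Black Level Raise": 0.02,
}


def usage_score(component_scores, weights):
    """Weighted average of component scores (assumed scoring model)."""
    total_weight = sum(weights.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total_weight}, expected 1.0")
    missing = [name for name in weights if name not in component_scores]
    if missing:
        raise KeyError(f"missing component scores: {missing}")
    return sum(weights[name] * component_scores[name] for name in weights)


# Hypothetical component scores for an imaginary TV (not real review data).
example_scores = {
    "Brightness": 8.0,
    "Black Level": 9.0,
    "Color": 7.5,
    "Processing": 7.0,
    "Game Mode Responsiveness": 9.0,
    "Direct Reflections": 6.0,
    "Stutter": 5.0,
    "Ambient Black Level Raise": 8.0,
}

print(round(usage_score(example_scores, MIXED_USAGE_WEIGHTS), 2))
```

Because Brightness, Black Level, and Color together carry 69% of the weight in this model, a weak result in Stutter or Ambient Black Level Raise moves the overall number very little.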

Contrast

The changes to our contrast test in TV 2.0 are minor but far-reaching. Instead of displaying the single image of the cave opening, we now display it twice, with the second image a mirror flip of the first one. We take measurements on the left and right sides of the test image and average them out. This gives a more realistic black level for displays with noticeable vignetting or poor black uniformity. We've also made the overall scoring curve stricter, raising the threshold for "good" contrast from 15,000:1 to 80,000:1.
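
As a rough illustration of why averaging matters, the Python sketch below computes a contrast ratio from one white measurement and two black measurements. It assumes contrast is white luminance divided by the mean of the left and right black readings. The function name and the luminance values are hypothetical, and the actual measurement procedure may differ in its details.

```python
def contrast_ratio(white_nits, black_left_nits, black_right_nits):
    """Contrast from the average of two black readings (assumed formula)."""
    average_black = (black_left_nits + black_right_nits) / 2
    if average_black <= 0:
        # A perfect black (e.g. an OLED pixel that is off) gives infinite contrast.
        return float("inf")
    return white_nits / average_black


# Hypothetical panel whose right side is noticeably brighter in dark scenes.
left_only = contrast_ratio(100.0, 0.02, 0.02)   # what one good spot would suggest
averaged = contrast_ratio(100.0, 0.02, 0.03)    # both sides taken into account

print(f"left side only: {left_only:.0f}:1, averaged: {averaged:.0f}:1")
```

In this made-up case, a single measurement on the better side would overstate the contrast, while the averaged figure reflects the uneven black level.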

Blooming

This test has been renamed from 'Blooming' to 'Lighting Zone Precision' to better define the philosophy behind it. We still run the test the exact same way, but instead of looking at it as the amount of light that blooms into dark areas caused by the local dimming, we're now looking at it as how precisely the local dimming feature can display a single point of bright light. One of the issues with the old approach to this test was that TVs that didn't have local dimming at all scored a 10. Since TVs without local dimming can't display a single point of bright light with any precision, they now score a 0.

Brightness

There are no changes to how we run the HDR Brightness test in this update, but we've changed the scoring a bit. Sustained brightness isn't a very representative test, as even TVs that perform badly on this test rarely do so with real content. With that in mind, we've decided to remove the scores from the sustained brightness measurements and the ABL calculation. The data is still there, so if you care about this, you can still see how each TV performs, but it's no longer part of the scoring. We've readjusted the scoring weights to place an even greater emphasis on our real scene tests. We've also adjusted the scoring splines across each individual test. TVs have become much brighter in recent years, so we expect higher brightness levels.

These changes have been applied to the HDR Brightness, HDR Brightness In Game Mode, and SDR Brightness boxes. The scoring curves for the SDR Brightness box are different from HDR.

PQ EOTF Tracking

The changes to this test are minor. We've adjusted the scoring splines for all three curves: the threshold for a good result on this test has been tightened from < 0.018 to < 0.008.

Color Gamut

The color gamut test has been removed completely, as the 2D chromaticity plots much of the industry uses are no longer considered to be a valid way to measure chromaticity. Gamut rings have replaced this test.

SDR Color Volume

Gamut rings on the LG C4.
Gamut rings on the Samsung S95D.

Gamut rings are a new way of looking at color volume in SDR by splitting it into rings of increasing lightness, from 0 to 100. We take over 600 individual measurements to produce the rings shown above. As you move from the center out, each ring increases in lightness, and the gray area in each ring shows the parts the TV can't display. This makes it easier to see differences between panel types that weren't very obvious in our previous tests. As you can see above, colors on the LG C4 are noticeably desaturated at high lightness levels. The benefits of the S95D, which doesn't rely on a white subpixel to boost brightness, are immediately apparent.

Color Volume

This has been renamed to 'HDR Color Volume' to avoid confusion with the above test. Other than that, this test is pretty much the same as it's been for a while. We've adjusted the scoring spline slightly, moving the good value for 10,000 cd/m² BT.2020 Coverage ITP from 30% to 35%.

Pre-Calibration/Post-Calibration

These two tests have been renamed to 'SDR Pre-Calibration Color Accuracy' and 'SDR Post-Calibration Color Accuracy,' and the weight of each test in the overall score has been adjusted slightly.

HDR Pre-Calibration/Post-Calibration Color Accuracy

Pre- and post-calibration HDR accuracy on the Samsung S95D

We can't talk about HDR without talking about creative intent, but until now, we've only been able to talk about the range of brightness and colors that a TV could display. Besides our PQ EOTF test, we couldn't really talk about how well a TV tracks creative intent. TV 2.0 takes one small step closer to answering those questions, as we now measure the white balance dE, color dE, and the overall color temperature in HDR10. We do this twice, once with the TV in its most accurate pre-calibration settings and again after calibrating it.

Gray Uniformity

Although we've not changed how we run this test, we've updated the scoring splines and weights to better reflect how TVs look in person. The scoring splines were a bit too generous before, and it wasn't possible for any TV to score below a 5 or a 6 on certain tests. We've corrected this, and most TVs are now scoring lower.

Black Uniformity

Similar to the Gray Uniformity test above, we've slightly adjusted the scoring splines for this test. The differences aren't as significant, but again, most TVs are scoring a bit worse than they used to.

Viewing Angle

Again, we've slightly tweaked the scoring splines for our viewing angle test. Some of the recent QD-OLED releases, like the Samsung S95D, have incredibly wide viewing angles, remaining accurate beyond 70°, but our scoring didn't look beyond that angle. We've adjusted the splines to account for this. Unlike most of the changes we've made, this time we also decreased the 'Good' value, as we've heard from a lot of you that our viewing angle scoring was a bit harsh. As a result, most TVs are scoring slightly higher on 2.0 than they were on 1.11.

Reflections

Last year's Samsung S95D OLED, with its matte coating and QD-OLED panel, exposed some limitations in our reflections handling. There are pros and cons to any TV coating, but other than measuring the pure intensity of reflections, we didn't really have any way to quantify how light impacts a TV's picture quality. So, we decided to remove the existing Reflections test and replace it with two new tests that better show how a TV handles both direct and total reflections.

Total Reflected Light & Direct Reflections

Total reflected light on the Samsung S95D.
Samsung S95D direct reflections.

One of our goals with this test bench was to expand our reflection handling tests to better capture the nuances between different coatings so you can make an informed decision. We've redone our total and direct reflections tests to make the test easier to run and more representative. These new tests use a ring light to simulate a consistent bright light source and measure the intensity of the light reflected. This makes it easier to see the impact of bright lights, and you can also quickly see if there are any distracting diffraction artifacts like rainbow smearing.

Ambient Black Level Raise

These new tests are a good start, but they're not enough. When we talk about TVs, many reviewers assume that viewers are watching movies in a perfectly black room. But that's simply not the case for the vast majority of users. So, we needed to find a way to measure not only the reflectivity of a TV's screen but also the impact that those reflections have on a TV's picture quality.

Ambient black level raise on the Samsung S95D
Ambient black level raise on the LG C4

The Ambient Black Level Raise test measures how much a TV's black levels rise when you're in a bright room. This became more of an issue in recent years with the release of QD-OLED panels. Since these TVs lack a polarizer, they have a noticeable purple tint when used in a bright room, and blacks aren't as deep. This new test lets you quickly see how the panel technology impacts contrast when watching TV in a bright room.

Ambient Color Saturation

The Ambient Color Saturation test is very similar to the black level raise test mentioned above. Instead of looking at black levels, though, it looks at a TV's perceived color volume as a function of ambient lighting. This shows you how bright and vibrant colors will be in a bright room. Unlike the black level raise test, which shows the change in black levels relative to a dark room, the color saturation test shows the absolute values in both dark and bright rooms.

Ambient color saturation on the Samsung S95D
Ambient color saturation on the LG C4

Taking the above examples of the LG C4 and the Samsung S95D, you can immediately see the benefits of each model. The C4 retains its black levels better in a bright room, so it maintains more of those deep, inky blacks OLEDs are known for. On the other hand, the S95D maintains bright colors better, so if you prefer bright, saturated colors and usually watch TV in a bright room, you can immediately see why the S95D is a better choice.

Panel Technology

Spectral power distribution of the Sony X90L showing the presence of a KSF phosphor coating.
Spectral power distribution of the LG UT7570 showing no wide color gamut coating.
Spectral power distribution of the Hisense U9N using quantum dot technology to produce better color separation and a wide color gamut.

We've been collecting this data for years as part of the process needed to calibrate our equipment before each test, and by popular demand, we're now including the spectral power distribution (SPD) charts for all TVs under the Panel Technology section. Although most people won't need to pay any attention to this, a TV's SPD tells us a lot about how it produces light. More precise peaks on each primary lead to better color separation and (usually) a wider color gamut. We can also see which TVs use certain technologies, such as KSF phosphors on the Sony X90L above or quantum dots on the Hisense U9N.

Response Time & Cumulative Absolute Deviation

Our response time testing has remained relatively unchanged for the last 7 years. It was fine when people mainly used TVs for watching shows/movies or playing casual games on older consoles, but with the rise of high refresh rate TVs, it simply isn't good enough anymore. With more and more people looking to use their TVs as they would a high-end gaming monitor, we needed a better test. The solution to this was fairly easy, and by popular request, we ported a portion of the response time and cumulative absolute deviation tests we developed for Monitor 2.0.

Response Time - Calibrated Mode
Cumulative absolute deviation graphs, Game Mode @ 60Hz
Cumulative absolute deviation graphs, Game Mode @ 120Hz
Cumulative absolute deviation graphs at the max refresh rate, Game Mode @ 144Hz

With this update, we've changed our response time test to use the new pursuit photo. Compared to our old photo, the updated one makes it easier to spot things like overshoot and differences in response time between specific colors. Like our previous test, we still run the response time test in the most accurate settings, and it's intended to show how motion is handled when watching movies, shows, or sports.

We've also added three new tests. Instead of measuring the response time itself, these tests measure the cumulative absolute deviation, or in other words, the total area shaded in yellow on the charts above. This new way of measuring transitions takes into account how severe overshoot is, so if you have two TVs that take the same amount of time to transition between two shades, but one overshoots the target before falling back, it'll score worse in this test. Unlike the response time test, the CAD tests are done in Game Mode, at 60Hz, 120Hz, and the max refresh rate of the TV.
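
The idea behind cumulative absolute deviation can be sketched in a few lines of Python. This is an illustrative approximation only: it assumes CAD is the area between the measured level and the target level over the transition, integrated with the trapezoid rule, and the sample curves are invented. The function name and the units (level × milliseconds) are assumptions, not the exact definition used in the reviews.

```python
def cumulative_absolute_deviation(times_ms, measured_levels, target_level):
    """Area between the measured curve and the target (trapezoid rule)."""
    if len(times_ms) != len(measured_levels):
        raise ValueError("times and levels must have the same length")
    area = 0.0
    for i in range(1, len(times_ms)):
        dt = times_ms[i] - times_ms[i - 1]
        dev_prev = abs(measured_levels[i - 1] - target_level)
        dev_curr = abs(measured_levels[i] - target_level)
        area += 0.5 * (dev_prev + dev_curr) * dt
    return area


# Two invented transitions from level 0 to level 100, sampled every 1 ms.
times = [0, 1, 2, 3, 4]
clean = [0, 50, 100, 100, 100]       # reaches the target and stays there
overshoot = [0, 50, 100, 120, 100]   # same rise, then overshoots and falls back

print(cumulative_absolute_deviation(times, clean, 100))      # 100.0
print(cumulative_absolute_deviation(times, overshoot, 100))  # 120.0
```

Both curves reach the target at the same moment, so a pure response time figure would treat them alike, while the overshooting curve accumulates more deviation and would therefore score worse under this model.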

Flicker-Free

We've renamed this box to simply 'Flicker' and removed the score, but this test has not been changed in any other way.

Black Frame Insertion (BFI)

With the updates to our response time testing, we also updated the pursuit photo for our BFI test. Other than that, our BFI test is exactly the same as it was on 1.11.

Stutter

With the changes to the response time test mentioned above, we also had to update our stutter test, as it uses the results of the response time test to calculate the frame hold time. Instead of the 100% Response Time result, for TV 2.0, stutter is calculated from the Total Response Time result. We've also updated the scoring weights and spline to increase the focus on 24 fps stutter, as 60 fps stutter is almost never an issue.
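
The relationship between response time and frame hold time can be sketched as follows. This Python example assumes that the hold time is the frame duration minus the Total Response Time, floored at zero; the changelog doesn't give the exact formula, so treat `frame_hold_time_ms` and the 10 ms response time as hypothetical.

```python
def frame_hold_time_ms(frames_per_second, total_response_time_ms):
    """Time a frame stays static on screen (assumed formula)."""
    if frames_per_second <= 0:
        raise ValueError("frames_per_second must be positive")
    frame_duration_ms = 1000.0 / frames_per_second
    # A frame can't be held for a negative amount of time.
    return max(frame_duration_ms - total_response_time_ms, 0.0)


# Hypothetical TV with a 10 ms Total Response Time.
for fps in (24, 60):
    hold = frame_hold_time_ms(fps, 10.0)
    print(f"{fps} fps: frame held for {hold:.1f} ms")
```

Under this assumption, the same panel holds each 24 fps frame far longer than each 60 fps frame, which is consistent with the added focus on 24 fps stutter.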

Input Lag & Supported Resolutions

Original input lag box on the Samsung S95D on version 1.11
Input Lag box on the Samsung S95D after TV 2.0

On TV 2.0, we decided to simplify the input lag test to remove formats that don't matter as much anymore, like 1440p. TVs were never designed for 1440p inputs, and with the rise of HDMI 2.1, PC gamers looking to render their games at 1440p to achieve a higher framerate can simply have their graphics card upscale the image to 4k anyway, so it doesn't matter. This change isn't just on input lag, as we've also removed tests for 1440p in the Variable Refresh Rate and Supported Resolutions sections of the review.

Beyond the test coverage itself, we've also made significant changes to our input lag scoring. Our scoring curve hasn't changed much in the last few years, but TVs have gotten a lot better. If input lag is important to you, it's difficult to make a buying decision when the best and worst TVs all score almost the same.

We also changed the 144Hz input lag measurements to instead measure the input lag at the maximum refresh rate a TV supports for both 1080p and 4k. With more and more TVs supporting 165Hz and even higher refresh rates, this ensures we're showing you the best input lag a given TV can provide.

HDR Format Support

We've moved the various HDR tests (HDR10, Dolby Vision, HDR10+, HLG) from the Inputs Specifications box into a new section called HDR Format Support.

Inputs Specifications & Total Inputs

Minor changes; we've moved the HDR tests mentioned above and combined this section with the Total Inputs section. We removed the tests for CEC and HDCP 2.2, as all TVs support these features. We also removed the tests for Component In, Analog Audio Out RCA, DisplayPort, and IR In.

Accelerated Longevity Test

Our accelerated longevity test is running until Summer 2025, but since none of the TVs on the test are being updated to our 2.0 test bench, we decided to remove this section from 2.0 entirely. Those TVs remain on their existing test bench, so no data is lost, and you'll still be able to see the brightness change over time and the uniformity slides for any TV that ran on the test.

Sound Quality

We've made the difficult decision to remove the distortion test, as it's not very useful in its current form. We plan to revisit our sound quality testing in a future test bench update.

Interface

We've removed the subjective performance-related tests, as they're not very representative of real-world performance.

Apps & Features

We removed this section entirely, as it wasn't very useful in its current form.

Remote

We removed the Size, CEC Menu Control, Other Smart Features, and Remote App tests, as they weren't useful. We also changed the Voice Control test to a simple Yes/No.

TV Controls

Here, we added a test to indicate whether a TV has a mute switch for the built-in mic.

22 TVs Updated So Far

We are retesting popular models first. So far, the test results for the following models have been converted to the new testing methodology. However, the text might be inconsistent with the new results.

19 TVs Planned To Be Updated

We are also planning to retest the following products over the course of the next few weeks:


Want to learn more? Check out our complete list of articles and tests on the R&D page.
