A little less than a decade ago, Nike released a shoe that would revolutionize the running shoe industry. First introduced in 2017, the Nike Zoom Vaporfly Elite was the first step towards what we now know as "super shoes." The evolution of running shoe technology has accelerated rapidly since then, with nearly every shoe manufacturer today producing its own version of a super shoe.
While past running shoe developments have focused on stability and comfort, the modern super shoe revolution is driven by performance-enhancing technologies designed to maximize efficiency, reduce fatigue, and improve running economy. There are a few design elements that virtually all super shoes leverage to some degree: cushioning, weight, energy return, and carbon fiber plates.
All of the design elements above impact performance, though the degree to which each one does is debated and depends on the runner. We decided to focus our testing efforts on the midsole foam, specifically its energy return and cushioning properties. With midsole technologies rapidly evolving, we wanted a way to quantify and compare this critical aspect of running shoe performance.
There are multiple ways to evaluate running shoe energy return and cushioning properties ranging from in-vivo testing to mechanical testing. In-vivo testing of running shoes involves human subjects running while collecting biometric measurements and data. At RTINGS, we try to focus our efforts on objective, repeatable testing when possible, so for that reason (and many other logistical reasons), we decided against in-vivo testing. That said, we still run in the shoes to corroborate our objective results with real-world experience and performance.
Outside of in-vivo testing, the next best option for evaluating cushioning and energy return properties is to mechanically simulate foot strikes using specialized testing equipment. These types of machines provide precise control over testing conditions, which allows us to measure, replicate, and evaluate key parameters to compare performance across all shoes.
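We had two key criteria to meet when deciding on what machine to purchase for our running shoe tests: accurate foot strike replication and the ability to deliver repetitive impacts at a running-like frequency.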
When it comes to mechanical testing on footwear, there are two common approaches: gravity-driven impact machines and universal testing machines (UTM).
A gravity-driven impact testing machine is designed to measure how a material or product reacts to impact forces. In the case of running shoes, the impact machine would drop a gravity-driven mass from a specific height and measure the impact characteristics. By adjusting the mass, drop height, and impact area, specific impact characteristics can be simulated. The results of the impact test can be used to determine the cushioning and energy return properties of the shoe being tested.
If a correctly sized machine is chosen, it can easily replicate the speed and forces required to simulate an individual impact like a foot strike. In fact, the now-withdrawn ASTM F1976-13 standard test method for evaluating athletic shoe cushioning systems called for the use of a machine like this. Therefore, this type of machine would meet our first criterion: accurate foot strike replication.
However, gravity-driven impact machines lack the ability to simulate repetitive impacts in a short period of time. The impact mass needs to be reset after each impact, so they can't deliver impacts at the same frequency as a person running. Because of this, they couldn't satisfy our second criterion: repetitive impacts.
Much like gravity-driven impact testing machines, universal testing machines (UTM) come in multiple sizes and configurations. A UTM could be configured to perform compression testing on a running shoe midsole, including handling various loads and operating at various speeds depending on the testing needs. Because of this, there are UTMs capable of reaching the forces and speed required to simulate foot strikes. Therefore, if configured properly, a UTM would satisfy our first criterion.
A UTM would also satisfy our second criterion, as many machines can repetitively impact a sample for as long as needed. This would allow us to gather detailed data not only about the impact but also about multiple impacts over a period of time. This is particularly important when looking at how running shoe foams perform after a certain amount of mileage.
For the two reasons above, we decided that a UTM would best fit our needs. Because of this, we purchased a custom UTM fitted with an electrodynamic linear actuator and a load cell with sufficient force and speed capabilities to accurately and repeatably measure running shoe performance. The next question was to determine the exact specifications needed to best replicate running impacts.
After choosing our testing machine, we needed to design a test that would evaluate running shoe midsoles in a way that replicated real-world running. To do this, we needed to design an impact head attachment and determine the test parameters that allowed us to closely replicate the impacts of a human foot.
Since we planned to test the shoes on small regions of the foot rather than the entire foot at once, the test head attachment didn't need to replicate the whole foot. The goal was to design an attachment that could closely replicate the bottom of a heel and the metatarsophalangeal (MTP) joints.
For this reason, we decided on an attachment whose shape is based on the impact head described in ASTM F1976-13, with some alterations. We designed our impact head to be slightly wider and more rounded on the edges than the ASTM version. The result is that our impact head more accurately reflects the rounded shape of a runner's foot around their heel and MTP joints.
We decided to use the same impact head to test the forefoot as we did to test the heel for one main reason: comparability. Using a consistent impact head ensures that our results can be directly compared, which means we can see exactly how the cushioning properties change from heel to forefoot.
The next decision was to determine where along the shoe we should be testing. Since our goal was to characterize the heel and forefoot, we initially aligned our testing at the same points that World Athletics measures heel and forefoot stack heights. This corresponds to 12% and 75% of the internal length of the shoe, respectively.
For the heel, the position at 12% of the internal length lined up well with where heel strikers tend to land, so there was no need for adjustment. However, after filming some of the compression tests and running in the shoes, the position at 75% of the internal length didn't line up with where we—and others (see Reddit image below)—felt it should occur.
For these reasons, we moved the forefoot test positioning to 65% of the internal length of the midsole. This new position more accurately represents the area of the forefoot that experiences the most load for most runners.
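As a quick illustration of how these percentages translate into physical positions, the sketch below converts them into distances along the inside of the shoe. It assumes the percentages are measured from the rear of the shoe's interior, and the 280 mm internal length is a hypothetical value chosen only for the example. The function name is ours and not part of any standard.

```python
def position_from_heel_mm(internal_length_mm: float, fraction: float) -> float:
    """Distance from the rear of the shoe's interior to a test point."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return internal_length_mm * fraction


INTERNAL_LENGTH_MM = 280.0  # hypothetical shoe, for illustration only

heel_mm = position_from_heel_mm(INTERNAL_LENGTH_MM, 0.12)
forefoot_initial_mm = position_from_heel_mm(INTERNAL_LENGTH_MM, 0.75)
forefoot_final_mm = position_from_heel_mm(INTERNAL_LENGTH_MM, 0.65)

print(f"Heel test point: {heel_mm:.1f} mm")
print(f"Forefoot test point at 75%: {forefoot_initial_mm:.1f} mm")
print(f"Forefoot test point at 65%: {forefoot_final_mm:.1f} mm")
```

For this hypothetical shoe, moving from 75% to 65% shifts the forefoot test point back by 10% of the internal length.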
Most studies that report ground reaction forces (GRF) during running use one of two techniques: force plates mounted in the ground or in a treadmill, or force sensors in the insoles. This study uses force plates on the ground and an instrumented treadmill to compare vertical GRF between overground and treadmill running.
Peak forces are typically reported in the range of 2–3 times body weight. For a runner who weighs 150 lbs, this translates to 300–450 lbs of force. Taking the upper end of this range and converting 450 lbs of force to newtons gives ~2000N.
However, instead of targeting a peak force, we aimed to replicate the pressure felt in certain regions of the shoe. This allowed us to test smaller areas of the shoe at lower peak forces.
For instance, if we focus our testing on only the heel or only the forefoot, we can assume that the forces will be lower than the previously calculated peak force of 2000N. That's because the object impacting the shoe contacts a smaller area, rather than the entire foot, so less force is needed to produce the same pressure.
This study from the Journal of Sports Sciences evaluated multiple rearfoot strikers (RFS) and non-rearfoot strikers (NRFS) using pressure sensor insoles. They then reported the forces for various regions of the foot, the contact area, and the peak pressure data for each region. Using their findings, we were able to calibrate our testing forces to better align with real-world foot strike mechanics and ensure meaningful measurements.
If we combine the heel region for example (M1 and M2 in the foot diagram above), we can see from the bottom table shown above that the maximum force for RFS is 104.8 %BW (percentage of body weight). If we go back to our example of a 150-lb runner, this will equate to about 700N of force.
In contrast, as we previously determined, testing the entire foot area would require about 2000N. So by compression testing smaller regions of the shoe, we can use lower forces and still obtain meaningful results.
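The two estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch: the function name is ours, and the only inputs are the 150-lb example runner, the 2–3 times body weight range, and the 104.8 %BW heel value quoted from the study.

```python
LBF_TO_N = 4.4482216152605  # newtons per pound-force


def force_from_body_weight_n(weight_lb: float, bw_multiple: float) -> float:
    """Force in newtons for a load expressed as a multiple of body weight."""
    return weight_lb * bw_multiple * LBF_TO_N


WEIGHT_LB = 150.0  # example runner from the text

whole_foot_low_n = force_from_body_weight_n(WEIGHT_LB, 2.0)
whole_foot_high_n = force_from_body_weight_n(WEIGHT_LB, 3.0)
heel_rfs_n = force_from_body_weight_n(WEIGHT_LB, 1.048)  # 104.8 %BW

print(f"Whole foot: {whole_foot_low_n:.0f} to {whole_foot_high_n:.0f} N")
print(f"Heel region (RFS): {heel_rfs_n:.0f} N")
```

The heel-only figure comes out at roughly a third of the upper whole-foot estimate, which is why testing by region allows lower forces.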
Based on these results, the UTM should ideally be able to comfortably reach above ~1000N. This would allow us to replicate foot strikes for a wide range of runner types.
To ensure our impact head applies forces that reflect realistic conditions, we converted the values from the study above and adjusted them based on our new test head attachment. This approach allowed us to replicate the pressure that is felt in these regions of the shoe without needing excessive force.
As seen by the overlay of our test location and the study's foot map, our heel region lines up with M1 and M2, and our forefoot region lines up with M6. If we take these regions and evaluate the peak forces and contact areas for RFS and NRFS, we can determine the forces required for our test impact area. Since these forces are in relation to body weight (BW), we can correlate our testing forces with runner weight.
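The exact conversion isn't detailed here, so the following is only a sketch of the general idea of pressure matching: keep the mean pressure of a foot region constant and scale the force by the ratio of contact areas. Both area values are hypothetical placeholders, not figures from the study or from our impact head, and the function name is ours.

```python
def pressure_matched_force_n(
    region_force_n: float, region_area_cm2: float, head_area_cm2: float
) -> float:
    """Force on the impact head that reproduces a region's mean pressure."""
    if region_area_cm2 <= 0 or head_area_cm2 <= 0:
        raise ValueError("areas must be positive")
    mean_pressure_n_per_cm2 = region_force_n / region_area_cm2
    return mean_pressure_n_per_cm2 * head_area_cm2


region_force_n = 700.0   # approximate RFS heel force for a 150-lb runner
region_area_cm2 = 40.0   # hypothetical contact area of the foot region
head_area_cm2 = 30.0     # hypothetical contact area of the impact head

test_force_n = pressure_matched_force_n(
    region_force_n, region_area_cm2, head_area_cm2
)
print(f"Pressure-matched test force: {test_force_n:.1f} N")
```

With these placeholder areas, a smaller impact head needs proportionally less force to produce the same pressure on the midsole.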
| Runner Weight (lbs) | Runner Weight (kg) | Adjusted Force - Heel (N) | Adjusted Force - Forefoot (N) |
| --- | --- | --- | --- |
| 100 | 45.36 | 355.95 | 279.21 |
| 120 | 54.43 | 427.14 | 335.06 |
| 140 | 63.50 | 498.33 | 390.90 |
| 160 | 72.57 | 569.51 | 446.74 |
| 180 | 81.65 | 640.70 | 502.59 |
| 200 | 90.72 | 711.89 | 558.43 |
| 220 | 99.79 | 783.08 | 614.27 |
| 240 | 108.86 | 854.27 | 670.12 |
| 260 | 117.93 | 925.46 | 725.96 |
Estimated conversion for rearfoot strike (RFS) runner weight to testing force in the heel and forefoot.
| Runner Weight (lbs) | Runner Weight (kg) | Adjusted Force - Heel (N) | Adjusted Force - Forefoot (N) |
| --- | --- | --- | --- |
| 100 | 45.36 | 125.75 | 308.13 |
| 120 | 54.43 | 150.90 | 369.75 |
| 140 | 63.50 | 176.05 | 431.38 |
| 160 | 72.57 | 201.20 | 493.01 |
| 180 | 81.65 | 226.35 | 554.63 |
| 200 | 90.72 | 251.50 | 616.26 |
| 220 | 99.79 | 276.65 | 677.88 |
| 240 | 108.86 | 301.80 | 739.51 |
| 260 | 117.93 | 326.95 | 801.13 |
Estimated conversion for non-rearfoot strike (NRFS) runner weight to testing force in the heel and forefoot.
The two conversion tables above tell us that, with our test head attachment, the applicable force range for the heel and forefoot runs from just above 100N for NRFS runners to just above 900N for RFS runners. This range is what dictated our decision to test up to a force of 900N on both the heel and forefoot.
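Because the forces are expressed relative to body weight, the values in the tables scale linearly with runner weight. The sketch below assumes exactly that and uses the 200-lb rows as the reference; it reproduces the other rows to within rounding. The dictionary and function names are ours.

```python
REFERENCE_WEIGHT_LB = 200.0

# Adjusted forces (N) taken from the 200-lb rows of the two tables.
REFERENCE_FORCES_N = {
    ("RFS", "heel"): 711.89,
    ("RFS", "forefoot"): 558.43,
    ("NRFS", "heel"): 251.50,
    ("NRFS", "forefoot"): 616.26,
}


def adjusted_force_n(weight_lb: float, strike: str, region: str) -> float:
    """Scale the 200-lb table value linearly with runner weight."""
    reference_n = REFERENCE_FORCES_N[(strike, region)]
    return reference_n * weight_lb / REFERENCE_WEIGHT_LB


for weight_lb in (100, 160, 260):
    heel_n = adjusted_force_n(weight_lb, "RFS", "heel")
    forefoot_n = adjusted_force_n(weight_lb, "NRFS", "forefoot")
    print(f"{weight_lb} lb: RFS heel {heel_n:.2f} N, NRFS forefoot {forefoot_n:.2f} N")
```

The same function can estimate a test force for weights that fall between the table rows.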
The last test parameter we needed to determine was impact frequency. In an ideal world, we would replicate the exact force-time profile of a foot strike. The GRF profile for an individual foot while running has a few key characteristics. Each foot will have a loading phase, an unloading phase, and a no-load phase. An example of this can be visualized using this dataset of running biomechanics measurements.
Instead of directly replicating a foot impact GRF, we decided to use a sine wave to approximate the loading and unloading cycle. For the sine wave, we decided on a frequency of 2Hz and an amplitude of 900N (based on the results from the previous section). At this frequency, since we are only replicating one foot strike, doubling this to two feet translates to a running cadence of 240 steps per minute (SPM).
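The cadence figure and the shape of the load cycle can be sketched as follows. One assumption is made loudly here: we read "an amplitude of 900N" as a force that swings between zero load and a 900N peak once per cycle. The exact waveform the machine is commanded to follow may differ, and the function names are ours.

```python
import math


def cadence_spm(frequency_hz: float, feet: int = 2) -> float:
    """Steps per minute if each foot strikes once per test cycle."""
    return frequency_hz * 60.0 * feet


def sine_force_n(t_s: float, frequency_hz: float = 2.0, peak_n: float = 900.0) -> float:
    """Force at time t for a sine-shaped cycle between 0 and peak_n.

    Assumption: the cycle starts unloaded at t = 0 and reaches its
    peak halfway through the cycle.
    """
    return 0.5 * peak_n * (1.0 - math.cos(2.0 * math.pi * frequency_hz * t_s))


print(f"Equivalent cadence at 2 Hz: {cadence_spm(2.0):.0f} SPM")
for t_s in (0.0, 0.125, 0.25, 0.375, 0.5):
    print(f"t = {t_s:.3f} s -> {sine_force_n(t_s):.1f} N")
```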
We recognize this isn't a very realistic cadence. However, even when testing at a more realistic 180 SPM, we found that the loading rate is much lower than what is seen in actual runners. This is because the GRF pattern for an individual foot strike has a no-load phase that isn't accounted for in our sine wave test.
However, if we wanted to match the loading rate using a sine wave, we would need to increase the frequency to a point where there would be almost no time for the foam to recover from each impact. For these reasons, we settled on 2Hz as our test frequency — a balance between realistic load rates and allowing the foam sufficient recovery time between impacts.
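The trade-off between loading rate and cycle time can be illustrated with a short calculation. It assumes the same idealized waveform as a force swinging between zero and a 900N peak, F(t) = (peak / 2) * (1 - cos(2πft)), whose steepest slope is π * f * peak. This is an idealization rather than a measurement from our machine, and the function names are ours.

```python
import math


def peak_loading_rate_n_per_s(frequency_hz: float, peak_n: float = 900.0) -> float:
    """Steepest slope of F(t) = (peak / 2) * (1 - cos(2*pi*f*t))."""
    return math.pi * frequency_hz * peak_n


def cycle_period_s(frequency_hz: float) -> float:
    """Duration of one full load and unload cycle."""
    return 1.0 / frequency_hz


for frequency_hz in (1.5, 2.0, 4.0):
    rate = peak_loading_rate_n_per_s(frequency_hz)
    period = cycle_period_s(frequency_hz)
    print(f"{frequency_hz:.1f} Hz: peak loading rate {rate:.0f} N/s, cycle {period:.3f} s")
```

In this idealized model, the peak loading rate grows in direct proportion to frequency while the time available per cycle shrinks by the same factor, which is the compromise described above.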
To ensure consistency, we also incorporated a brief 'warm-up' period before recording data. Specifically, we impacted the shoe for 100 cycles and collected measurements on the 95th cycle, ensuring the material had settled into a stable response.
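Picking a single cycle out of a continuous recording is a simple slicing operation. The sketch below assumes a fixed number of samples per cycle, which is a simplification, and uses a synthetic recording in which every sample just stores its own cycle number. The 50 samples per cycle is a hypothetical value, and the function name is ours.

```python
def extract_cycle(samples: list, samples_per_cycle: int, cycle_number: int) -> list:
    """Return the samples that belong to one cycle (cycles are 1-indexed)."""
    if cycle_number < 1:
        raise ValueError("cycle_number starts at 1")
    start = (cycle_number - 1) * samples_per_cycle
    end = start + samples_per_cycle
    if end > len(samples):
        raise ValueError("recording is shorter than the requested cycle")
    return samples[start:end]


SAMPLES_PER_CYCLE = 50  # hypothetical sampling density
TOTAL_CYCLES = 100

# Synthetic recording: each sample holds the number of the cycle it came from.
recording = [
    cycle for cycle in range(1, TOTAL_CYCLES + 1) for _ in range(SAMPLES_PER_CYCLE)
]

measured_cycle = extract_cycle(recording, SAMPLES_PER_CYCLE, 95)
print(f"Samples kept: {len(measured_cycle)}, from cycle {measured_cycle[0]}")
```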
Using our compression testing results, we evaluated how the shoes perform in terms of energy return, cushioning, and firmness at both the heel and forefoot. Given all the results, it's interesting to observe how cushioning properties vary across brands and models, highlighting differences in midsole technology, material composition, and design.
As previously mentioned, energy return refers to the amount of energy the midsole will return to the foot after an impact. This can sometimes be misinterpreted as the midsole returning more energy than what is put into it, which is not the case. Since some energy is always expended, the amount of energy returned will always be less than the energy input. Running shoe manufacturers have aimed to minimize energy loss and, therefore, maximize energy return.
Looking at the force vs. displacement plot of a given impact, we can see how energy return is determined in our test. The area between the loading and unloading curves is the energy loss for a given cycle. The smaller the energy loss relative to the energy put in during loading, the higher the shoe's energy return.
For example, the area between the loading and unloading curves for the Nike Alphafly 3 is significantly smaller than that of the ASICS GEL-KAYANO 31, meaning there's less energy loss for the Alphafly 3. The result is that the Alphafly 3 has a forefoot energy return of 88.4% compared to 70.5% for the GEL-KAYANO 31.
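The calculation described above can be sketched with a trapezoidal integration of the two curves. The force-displacement points below are synthetic values made up for the example, not measurements from any shoe, and the function names are ours. Energy return is taken as the area under the unloading curve divided by the area under the loading curve.

```python
def trapezoid_area(x: list, y: list) -> float:
    """Area under y(x) by the trapezoidal rule; x must be increasing."""
    return sum(
        0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i]) for i in range(len(x) - 1)
    )


def energy_return_percent(disp_load, force_load, disp_unload, force_unload) -> float:
    """Energy recovered on unloading as a percentage of energy put in."""
    energy_in = trapezoid_area(disp_load, force_load)
    # Unloading is recorded from maximum displacement back to zero,
    # so reverse it to integrate over increasing displacement.
    energy_out = trapezoid_area(disp_unload[::-1], force_unload[::-1])
    return 100.0 * energy_out / energy_in


# Synthetic cycle (displacement in mm, force in N), for illustration only.
disp_load = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
force_load = [0.0, 180.0, 360.0, 540.0, 720.0, 900.0]
disp_unload = [10.0, 8.0, 6.0, 4.0, 2.0, 0.0]
force_unload = [900.0, 576.0, 324.0, 144.0, 36.0, 0.0]

energy_in_mj = trapezoid_area(disp_load, force_load)
energy_out_mj = trapezoid_area(disp_unload[::-1], force_unload[::-1])
energy_return = energy_return_percent(disp_load, force_load, disp_unload, force_unload)

print(f"Energy in: {energy_in_mj:.0f} mJ, energy out: {energy_out_mj:.0f} mJ")
print(f"Energy loss: {energy_in_mj - energy_out_mj:.0f} mJ")
print(f"Energy return: {energy_return:.1f}%")
```

Since force is in newtons and displacement in millimetres, the areas come out in millijoules.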
The cushioning and firmness tests use the exact same data as the energy return test, but we focused on the loading phase instead.
For cushioning, we looked at the cumulative energy absorbed during the loading phase. Again, this is determined from the area under the force-displacement curve. Higher energy absorbed at a given force will result in a shoe that can absorb more of an impact.
In the above example, the forefoot of the Alphafly 3 has a higher cumulative energy absorbed for nearly the entire force range compared to the GEL-KAYANO 31. This means that for most runners in most situations, the Alphafly 3 will provide better cushioning in the forefoot.
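Cumulative energy absorbed is the running integral of the loading curve, so it can be sketched with a cumulative trapezoidal sum. The two loading curves below are synthetic and purely illustrative: shoe A compresses further than shoe B at every force level. The function name is ours.

```python
def cumulative_energy_absorbed_mj(displacement_mm: list, force_n: list) -> list:
    """Running trapezoidal integral of force over displacement (N*mm = mJ)."""
    energy = [0.0]
    for i in range(1, len(force_n)):
        step = (
            0.5
            * (force_n[i] + force_n[i - 1])
            * (displacement_mm[i] - displacement_mm[i - 1])
        )
        energy.append(energy[-1] + step)
    return energy


# Synthetic loading curves sampled at the same force levels.
force_n = [0.0, 225.0, 450.0, 675.0, 900.0]
shoe_a_disp_mm = [0.0, 3.0, 6.0, 9.0, 12.0]  # compresses more
shoe_b_disp_mm = [0.0, 2.0, 4.0, 6.0, 8.0]   # compresses less

shoe_a_energy = cumulative_energy_absorbed_mj(shoe_a_disp_mm, force_n)
shoe_b_energy = cumulative_energy_absorbed_mj(shoe_b_disp_mm, force_n)

for f, a, b in zip(force_n, shoe_a_energy, shoe_b_energy):
    print(f"{f:5.0f} N: shoe A {a:7.1f} mJ, shoe B {b:7.1f} mJ")
```

In this made-up comparison, shoe A absorbs more energy than shoe B at every non-zero force, which is the pattern that corresponds to a higher cushioning result.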
Firmness is defined as the resistance of the midsole to deformation for a given load. Simply put, it's how firm the foam will feel underfoot. We took the derivative of the force-displacement curve to determine how firm the midsole feels over the range of forces tested. Cushioning and firmness are generally inversely related.
The above graph shows the firmness results for the GEL-KAYANO 31 and the Alphafly 3. The forefoot of the GEL-KAYANO 31 is firmer than that of the Alphafly 3 throughout the entire force range.
Firmness graphs are very sensitive to small fluctuations in the force-displacement curve. For this reason, we ignore the first few data points, which correspond to the moments when the impact head is changing direction and don't accurately represent the behavior of the midsole.
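A minimal sketch of the firmness calculation is a finite-difference slope of the loading curve with the first few points discarded. The data are synthetic, the choice of three skipped points is a hypothetical stand-in for "the first few," and the function name is ours.

```python
def firmness_n_per_mm(displacement_mm: list, force_n: list, skip_points: int = 3) -> list:
    """Finite-difference slope dF/dx along the loading curve.

    Returns (force, firmness) pairs and drops the first `skip_points`
    of them, where the impact head is still changing direction.
    """
    pairs = []
    for i in range(1, len(force_n)):
        dx = displacement_mm[i] - displacement_mm[i - 1]
        if dx <= 0:
            continue  # skip samples where the head is not advancing
        slope = (force_n[i] - force_n[i - 1]) / dx
        pairs.append((force_n[i], slope))
    return pairs[skip_points:]


# Synthetic loading curve that stiffens as it compresses (F = 25 * x^2).
displacement_mm = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
force_n = [0.0, 25.0, 100.0, 225.0, 400.0, 625.0, 900.0]

firmness = firmness_n_per_mm(displacement_mm, force_n)
for force, slope in firmness:
    print(f"{force:5.0f} N: {slope:6.1f} N/mm")
```

Because each slope is a difference between neighbouring samples, small noise in the raw curve is amplified, which is why the earliest points are the least reliable.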
Having tested over 30 running shoe models, we can see some interesting trends beginning to develop. As expected, shoes with higher stack heights exhibit higher cushioning, regardless of foam type.
Note that the outlier in the top right corner of the heel graph is the Mizuno Wave Rebellion Pro 3, which has a unique midsole design. According to World Athletics guidelines, its stack height is a race-legal 40 mm. However, due to the angle at which the heel rests on the ground, we were only able to compression test it at a point where the stack measures an incredible 61 mm.
What's more interesting is when we compare the results of our cushioning and energy return tests. It becomes apparent that super shoes, with more premium foams, tend to have a higher energy return.
Super shoes designed for long-distance running combine high cushioning with high energy return. These trends are evident in some of the more popular marathon shoes on the market, such as the adidas Adizero Adios Pro 4, the Nike Alphafly 3, and the New Balance FuelCell SuperComp Elite v4.
By carefully selecting our testing equipment and optimizing our test parameters, we believe our tests can highlight important performance aspects for runners of all types. Our chosen testing machine enables precise, repeated impacts, providing us with the flexibility to refine our methodology and expand our testing capabilities in the future.
While not perfect, our test parameters aim to closely replicate the localized forces in the heel and forefoot for a variety of runners. This allows us to evaluate energy return and cushioning characteristics, providing an accurate and meaningful assessment of each shoe's performance.
With these results, we can identify key differences between running shoe midsoles, particularly in terms of energy return and cushioning properties. This allows us to provide detailed and accurate measurements with the goal of helping you find the best running shoes for your needs. If you're interested in checking out our reviews, you can find them here.
We recognize that our testing approach has limitations and are committed to continually improving. We remain open to any suggestions you may have regarding our testing procedures and will continue updating our parameters to ensure we're capturing the most relevant and valuable data for all runners. If you have any suggestions on what you would like to see in future test bench updates, let us know in the comments or send us an email at feedback@rtings.com.