The Testing Process

How we actually test bags.

Every CC Score starts here. Before a number is assigned, a bag goes through a defined protocol — real weight, real terrain, real time. This page documents exactly what that means.

20 hr minimum carry  ·  3 load tiers  ·  4 test environments  ·  No shortcuts

20 Hours minimum

No bag receives a CC Score before this floor is reached. No exceptions.

Testing Duration

Time on body is the only truth test.

Any bag can feel good in a store. Any bag can feel good for ten minutes. The 20-hour minimum exists because that's when the real carry reveals itself — hotspots emerge, strap padding compresses, back panels stop feeling like they did on day one.

What Counts

Active carry time only

Hours count when the bag is on your back under load — commuting, hiking, rucking, traveling. Time sitting in a chair with the bag on the floor does not count. Time in transit as checked luggage does not count. The clock runs when you're moving.

How Hours Are Logged

Session by session

Every test session is logged with start time, end time, load weight, terrain type, and duration window. Sessions under 30 minutes are not counted toward the total. The 20-hour floor is the sum of qualifying sessions across all test environments.
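A minimal sketch of how the qualifying-hours tally could be computed from a session log. The `Session` class, its field names, and the sample dates are illustrative assumptions, not the actual logging tool. Only the 30-minute cutoff and the 20-hour floor come from the protocol.

```python
from dataclasses import dataclass
from datetime import datetime

MIN_SESSION_MINUTES = 30  # sessions shorter than this are not counted
FLOOR_HOURS = 20          # minimum qualifying carry before a score


@dataclass
class Session:  # hypothetical record; field names are assumptions
    start: datetime
    end: datetime
    load_lb: float
    terrain: str

    @property
    def minutes(self) -> float:
        return (self.end - self.start).total_seconds() / 60


def qualifying_hours(sessions: list[Session]) -> float:
    """Sum the hours of every session that meets the 30-minute minimum."""
    return sum(s.minutes for s in sessions if s.minutes >= MIN_SESSION_MINUTES) / 60


def floor_reached(sessions: list[Session]) -> bool:
    return qualifying_hours(sessions) >= FLOOR_HOURS


# Made-up example log: one 2-hour commute and one 20-minute errand.
log = [
    Session(datetime(2024, 5, 1, 8, 0), datetime(2024, 5, 1, 10, 0), 15, "pavement"),
    Session(datetime(2024, 5, 1, 17, 0), datetime(2024, 5, 1, 17, 20), 15, "pavement"),
]
print(qualifying_hours(log))  # 2.0 (the 20-minute session is ignored)
print(floor_reached(log))     # False
```

The short session is dropped entirely rather than pro-rated, which is the most literal reading of "not counted toward the total".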

Why 20 Hours

Roughly two commute weeks

Twenty hours is approximately two full weeks of daily commuting, or four full trail days. It's long enough that first impressions have faded and real carry behavior has emerged. Some bags get more — complex bags, bags with borderline scores, or bags being retested after an update.

Duration Windows

Four observation checkpoints

Notes are taken at four standardized windows within each session — 30 minutes, 2 hours, 4 hours, and full-day (8+ hours). Not every session hits every window. But every bag must be observed at each window at least once before scoring. The full-day window is mandatory for any bag claiming long-haul or travel credentials.
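The coverage rule above can be sketched as a simple check. This assumes a session "hits" a window once its length reaches that checkpoint, and it reads the rule literally: every bag needs all four windows at least once. The names `WINDOWS`, `windows_reached`, and `missing_windows` are hypothetical.

```python
# Checkpoint name -> elapsed hours at which the checkpoint is reached.
WINDOWS = {"30 min": 0.5, "2 hr": 2.0, "4 hr": 4.0, "full day": 8.0}


def windows_reached(session_hours: float) -> set[str]:
    """Checkpoints a single session is long enough to hit."""
    return {name for name, at in WINDOWS.items() if session_hours >= at}


def missing_windows(session_lengths: list[float]) -> set[str]:
    """Checkpoints not yet observed in any session."""
    seen: set[str] = set()
    for hours in session_lengths:
        seen |= windows_reached(hours)
    return set(WINDOWS) - seen


print(missing_windows([0.75, 2.5, 4.0]))  # {'full day'}
print(missing_windows([1.0, 9.0]))        # set()
```

A bag is ready for scoring on this criterion only when `missing_windows` returns an empty set.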

The Rule: A bag listed as "in testing" has not yet reached 20 hours of qualifying carry. It will not receive a score, a tier, or a verdict until it does. No timeline is given. The protocol finishes when it finishes.

Load Weights

Three tiers. Calibrated to the carry.

Not every bag gets tested at the same weight. A 50-pound test makes sense for a rucker — it makes no sense for a sling bag. Every bag is tested across three load tiers calibrated to its carry category. The tiers are fixed. The categories determine the actual pound targets.

Tier 1
Light
Baseline load

The minimum functional load for the bag's category. Establishes baseline carry comfort — how the bag sits and moves with realistic light use. Most commuters live here.

Tier 2
Mid
Working load

The most common real-world carry weight for the category. This is where most scoring happens — it's the weight that exposes strap system weaknesses, back panel compression, and load distribution failures.

Tier 3
Max
Stress load

The upper end of realistic carry for the category. Reveals how a bag behaves under pressure — load transfer failures, strap creep, hip belt engagement, and back panel collapse all show up here first.

/ Load Tier Matrix

Pound targets by carry category. Fixed per protocol.

These are the exact pound targets used in every review. They don't change between bags in the same category. A bag reviewed today uses the same targets as a bag reviewed six months from now.

Carry Category           | Light        | Mid   | Max
EDC · Daily Carry        | Empty + 5 lb | 15 lb | 25 lb
Slings & Crossbody       | Empty        | 5 lb  | 9 lb
Travel · One Bag         | Empty        | 25 lb | 40 lb
Hiking & Trail           | 10 lb        | 25 lb | 50 lb
Ruckers & Load Carriers  | 20 lb        | 35 lb | 50 lb
Totes & Carry-All        | Empty        | 10 lb | 20 lb
Duffels                  | Empty        | 15 lb | 30 lb
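Because the targets are fixed per category, the matrix above works as a plain lookup table. The sketch below is one hypothetical way to encode it. The dictionary layout and the `target` helper are assumptions. Light-tier entries are kept as labels because "Empty" describes a condition rather than a weight.

```python
# (light, mid, max) per carry category, copied from the matrix above.
# Mid and max are pounds; light stays a label ("Empty" is not a number).
LOAD_TARGETS = {
    "EDC / Daily Carry": ("Empty + 5 lb", 15, 25),
    "Slings & Crossbody": ("Empty", 5, 9),
    "Travel / One Bag": ("Empty", 25, 40),
    "Hiking & Trail": ("10 lb", 25, 50),
    "Ruckers & Load Carriers": ("20 lb", 35, 50),
    "Totes & Carry-All": ("Empty", 10, 20),
    "Duffels": ("Empty", 15, 30),
}
TIERS = ("light", "mid", "max")


def target(category: str, tier: str):
    """Return the protocol target for a category and tier."""
    return LOAD_TARGETS[category][TIERS.index(tier)]


print(target("Slings & Crossbody", "max"))   # 9
print(target("EDC / Daily Carry", "light"))  # Empty + 5 lb
```

An unknown category raises `KeyError` and an unknown tier raises `ValueError`, so a typo cannot silently fall back to the wrong weight.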

LD-4 (Long-Haul Wearability) is scored against mid load. LD-5 (Body Fit Range) is scored against max load. Light load is captured implicitly during normal protocol coverage and early session windows.

Why This Matters: Load weight is the single biggest variable in how a bag carries. A bag that feels incredible at 10 pounds can be a disaster at 30. The three-tier system exists to catch both ends and everything in between. A bag that only performs at light load gets scored accordingly.

Test Environments

Four environments. One protocol.

The protocol defines four distinct test environments. Not every bag is tested in every environment at every load, but the protocol sets the minimum coverage a bag must complete for its carry category before it earns a score. A commuter bag doesn't need a trail day. A rucker does.

01

Commute Testing

Urban · Daily · Multi-surface

The bread and butter. Sidewalks, stairs, transit, revolving doors, overhead bins, coffee shops, offices. This environment tests how a bag performs in the stop-and-go of real daily carry — not a continuous walk, but constant transitions between moving and stationary.

  • Terrain: Pavement, stairs, transit, indoor
  • Duration: 30 min to full work day
  • Load tier: Light and mid
  • Mandatory for: All bags

02

Ruck Testing

Weighted · Sustained · Distance

Sustained weighted carry over distance. No stops, no breaks longer than five minutes. This is where strap systems fail, load distribution collapses, and back panels stop performing. It's also where the bags that earn Elite scores separate themselves from everything else.

  • Terrain: Pavement, hills, mixed surface
  • Duration: 2 hr minimum per session
  • Load tier: Mid and max
  • Mandatory for: All bags 20L and above

03

Travel Testing

Transit · Packed · Full-day carry

Airport, train, hotel, street. Full packing load across a real travel day — not simulated. This tests how the bag handles packed weight for 8+ hours of intermittent carry. Overhead access, security lanes, and carry-on compliance are all observed and noted.

  • Terrain: Airport, transit, hotel, street
  • Duration: Full travel day (8+ hrs)
  • Load tier: Mid and max
  • Mandatory for: Travel and one-bag bags

04

Trail & Urban Terrain

Elevation · Uneven · Variable surface

Rough terrain, elevation change, and uneven surfaces. This environment reveals how a bag handles body movement — lateral shifts, incline loading, and step-up transitions. Urban hike routes (uneven blocks, staircase climbs) and trail sections are both used depending on bag category.

  • Terrain: Trail, uneven urban, elevation
  • Duration: 2 hr to full day
  • Load tier: Mid and max
  • Mandatory for: Hiking, trail, and rucker bags

Coverage by Category: Every bag must complete commute testing. Ruck testing is mandatory for bags 20L and above. Travel testing is mandatory for bags marketed as travel or one-bag. Trail and urban terrain testing is mandatory for hiking, trail, and rucker bags. A bag that doesn't qualify for an environment is noted in the review, not penalized for it.
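The coverage rules above reduce to a few conditions. A rough sketch, assuming hypothetical category keys (`"edc"`, `"travel"`, `"hiking"`, `"trail"`, `"rucker"`) and a volume in liters. Neither the function nor the keys come from the actual protocol tooling.

```python
def mandatory_environments(category: str, volume_l: float) -> list[str]:
    """Environments a bag must complete before scoring, per the coverage rules."""
    envs = ["commute"]                  # every bag
    if volume_l >= 20:
        envs.append("ruck")             # bags 20L and above
    if category == "travel":
        envs.append("travel")           # travel and one-bag bags
    if category in {"hiking", "trail", "rucker"}:
        envs.append("trail")            # hiking, trail, and rucker bags
    return envs


print(mandatory_environments("edc", 18))     # ['commute']
print(mandatory_environments("travel", 35))  # ['commute', 'ruck', 'travel']
print(mandatory_environments("rucker", 25))  # ['commute', 'ruck', 'trail']
```

The rules stack rather than exclude each other, so a 35-liter travel bag picks up ruck testing from its volume and travel testing from its category.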
What Goes Inside

Real gear. Real weight.

The load isn't random and it isn't simulated with sandbags. Every test uses real gear packed the way a real person would pack it — because how weight distributes inside a bag affects how it carries outside. A misloaded bag is a bad data point.

Tier 1

Light Load

Baseline carry — everyday essentials
  • Laptop or tablet: ~3 lb
  • Water bottle (empty or half): ~0.5 lb
  • Jacket or layer: ~1 lb
  • EDC items (wallet, keys, cables): ~0.5 lb
Approx total: ~5 lb

Tier 2

Mid Load

Working carry — full day packed
  • Laptop + charger: ~4 lb
  • Water bottle (full): ~2 lb
  • Jacket + base layer: ~2 lb
  • Food / snacks: ~2 lb
  • EDC + accessories: ~1 lb
  • Weight filler to target: variable
Approx total: ~15–25 lb

Tier 3

Max Load

Stress carry — upper limit test
  • Laptop + charger + accessories: ~5 lb
  • Water (full, 2L where applicable): ~4.5 lb
  • Full clothing layer set: ~3 lb
  • Food + meal items: ~3 lb
  • Weight plates or filler to target: variable
Approx total: ~30–50 lb
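The "filler to target" line is simple arithmetic: the category's pound target minus the weight of the real gear. A small sketch using the approximate mid-load gear weights listed above. The function name and the error behavior are assumptions.

```python
# Approximate mid-load gear weights in pounds, taken from the Tier 2 list.
MID_GEAR_LB = {
    "laptop + charger": 4,
    "water bottle (full)": 2,
    "jacket + base layer": 2,
    "food / snacks": 2,
    "EDC + accessories": 1,
}


def filler_needed(target_lb: float, gear: dict[str, float]) -> float:
    """Pounds of filler to add so gear plus filler reaches the target."""
    shortfall = target_lb - sum(gear.values())
    if shortfall < 0:
        # The gear alone already exceeds the target, so the list must be trimmed.
        raise ValueError(f"gear is {-shortfall} lb over the {target_lb} lb target")
    return shortfall


print(filler_needed(15, MID_GEAR_LB))  # 4  (EDC mid target)
print(filler_needed(25, MID_GEAR_LB))  # 14 (Travel mid target)
```

For a category with a low target, such as the 5 lb sling mid load, this gear list is already too heavy and the function raises instead of returning a negative number.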
/ Packing Method

Packed the way a real person packs it

Gear is packed according to the bag's intended use — laptop in the laptop sleeve, clothes in the main compartment, water in the water bottle pocket where one exists. Weight is not concentrated or artificially distributed to stress-test a specific area. The goal is to replicate how the bag's actual user would load it on a real day. If the bag has no organization, it's packed loose. If it has structure, the structure is used. Packing method is noted in every review.

Why Real Gear Matters: Sandbags and weight plates pack differently than real gear. They concentrate mass in ways that don't reflect actual use. A laptop sits flat against your back. Clothes fill corners. Food shifts. Real gear produces real data, and real data produces honest scores.

How Notes Are Taken

Observations logged in the field.

Notes aren't written after the fact from memory. They're logged during testing at four standardized time windows — so the observation captures what the bag actually felt like at that moment, not a reconstruction of it hours later.

When Notes Are Logged

Four windows per session

30 Minutes

First impression under load. How the bag sits, initial strap feel, load position, first contact points. This is where fit problems show up immediately — or don't.

2 Hours

Early fatigue window. Strap padding compression, shoulder hotspots, back panel sweat, load shift. Most commute bags are fully evaluated by this window.

4 Hours

Mid-carry assessment. Cumulative fatigue, adjustment frequency, strap creep, pressure point development. Where the difference between a good bag and a great bag starts to show.

Full Day (8+ hrs)

End-of-day verdict. What hurts, what held up, what failed. This window is mandatory for any bag claiming long-haul or travel credentials. A bag that falls apart at hour six doesn't earn a passing score in Long-Haul Wearability.

What Is Recorded

Per-session observation fields

  • Date & time: Session start and end, total duration
  • Environment: Commute, ruck, travel, or trail, plus specific terrain conditions
  • Load weight: Exact weight carried, tier designation, and gear list
  • Load distribution: How weight sat on the body, contact points, sag or hug assessment
  • Strap performance: Shoulder pad compression, sternum strap placement, hip belt engagement
  • Back panel: Heat, sweat, airflow, spine contact, foam compression
  • Fatigue markers: Hotspots, chafing, adjustment frequency, strap creep, shoulder fatigue
  • Body fit: Torso fit assessment, shoulder width match, frame stiffness observation
  • Incidents: Hardware failures, zipper issues, strap slippage, structural observations
  • Overall verdict: One-sentence session summary (would carry again at this load and duration)
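The observation fields above map naturally onto a single record per session. A hypothetical sketch of such a record follows. The class name, field names, and the completeness check are assumptions for illustration, not the actual note format.

```python
from dataclasses import dataclass, field


@dataclass
class SessionNote:  # hypothetical structure mirroring the fields above
    started: str
    ended: str
    environment: str              # commute, ruck, travel, or trail
    load_lb: float
    tier: str                     # light, mid, or max
    gear: list[str] = field(default_factory=list)
    load_distribution: str = ""
    strap_performance: str = ""
    back_panel: str = ""
    fatigue_markers: str = ""
    body_fit: str = ""
    incidents: list[str] = field(default_factory=list)  # may stay empty
    verdict: str = ""

    def is_complete(self) -> bool:
        """True when every free-text observation has been written down."""
        required = (self.load_distribution, self.strap_performance,
                    self.back_panel, self.fatigue_markers,
                    self.body_fit, self.verdict)
        return all(text.strip() for text in required)


note = SessionNote("08:00", "10:00", "commute", 15, "mid",
                   strap_performance="no creep, light compression")
print(note.is_complete())  # False (most observation fields are still empty)
```

Incidents are deliberately left out of the completeness check, since a session with no failures has nothing to record there.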
/ How Notes Feed the Score

Observations become category scores

After testing is complete, session notes are reviewed across all windows and environments. Each of the five scoring categories is scored based on the accumulated observations — not a single session, but the pattern across all of them. A bag that performs well in the 30-minute window but collapses at hour four doesn't score well in Long-Haul Wearability, regardless of how good it felt early. The score reflects the full carry, not the first impression.

No Mid-Test Scoring: Scores are not assigned during testing. Notes are taken. Scoring happens after all protocol sessions are complete and all notes are reviewed together. This prevents first-impression bias from inflating early scores and fatigue bias from deflating late ones.

When a Bag Gets Retested

Scores can change. Here's when.

A CC Score is not permanent. If a bag changes in a meaningful way — materials, construction, strap system, back panel — it gets retested against the same protocol. The new score replaces the old one. The old score is archived in the review for reference.

01

Material or Construction Update

A new fabric, foam, or structural change that affects how the bag carries. A colorway change doesn't qualify. A new back panel system does. A new strap foam does. A new frame sheet does. If the carry could be different, it gets retested.

02

Version or Generation Change

A bag released as a V2, Gen 2, or updated model gets treated as a new bag. The prior version's score stays on its own review page. The new version starts fresh with a full protocol. Same brand, same name — different carry, different score.

03

Extended Use Finding

If a bag continues to be carried after scoring and a significant finding emerges — hardware failure, strap delamination, foam collapse — the review is updated with the finding and the score is reassessed. Long-term durability observations are noted even after a score is published.

/ Does Not Trigger a Retest

What stays the same

  • Colorway or color option change
  • Price increase or decrease
  • Limited edition or special release
  • New size in the same line
  • Buckle color or hardware finish change
  • Brand rebrand or logo update
  • Change in distributor or retail availability
  • New marketing or brand story
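The split between changes that trigger a retest and changes that don't can be sketched as two sets and one check. The change labels below are hypothetical shorthand for the items listed above. Version changes and extended-use findings are handled separately in the protocol and are left out of this sketch.

```python
# Changes that can alter how the bag carries (material or construction updates).
CARRY_RELEVANT = {"fabric", "foam", "strap_system", "back_panel",
                  "frame_sheet", "construction"}

# Changes the protocol explicitly lists as not triggering a retest.
NOT_CARRY_RELEVANT = {"colorway", "price", "limited_edition", "new_size",
                      "hardware_finish", "rebrand", "distribution", "marketing"}


def triggers_retest(changes: set[str]) -> bool:
    """True if any reported change is carry-relevant."""
    unknown = changes - CARRY_RELEVANT - NOT_CARRY_RELEVANT
    if unknown:
        # Refuse to guess: an unclassified change needs a human decision.
        raise ValueError(f"unclassified change(s): {sorted(unknown)}")
    return bool(changes & CARRY_RELEVANT)


print(triggers_retest({"colorway", "price"}))       # False
print(triggers_retest({"colorway", "back_panel"}))  # True
```

One carry-relevant change is enough, no matter how many cosmetic changes ship alongside it.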
/ How Retests Are Disclosed

Full transparency on every score change

When a bag is retested, the review page is updated with the new score, the retest date, and a note explaining what changed and why the retest was triggered. The original score is preserved below the current score so readers can see the history. If a bag improved, that's noted. If it got worse, that's noted too. The score history is part of the record.

Brands Cannot Request a Retest: Retests are triggered by carry-relevant changes, not by brand preference. A brand that disagrees with a score cannot request a retest. A brand that sends an updated unit claiming changes will have that unit evaluated, and if the changes are carry-relevant, a retest will happen on RC's timeline, not theirs.

What Disqualifies a Bag

No score is still a verdict.

Not every bag that enters testing receives a score. Some bags disqualify themselves — through structural failure, protocol non-completion, or conditions that make honest scoring impossible. When a bag is disqualified, that's disclosed. The reason is published. No score is not the same as a passing grade.

01

Protocol Not Completed

The bag did not reach 20 hours of qualifying carry before the review window closed. This most often happens with loaner units returned before the protocol floor is hit, or bags that failed structurally before testing was complete. A bag listed as "in testing" has not been disqualified; it simply hasn't finished.

Status: Listed as In Testing, no score published

02

Structural Failure During Testing

A strap detached. A zipper failed catastrophically. A frame buckled. Hardware broke under normal load. If the bag cannot complete the protocol because it broke during testing, it is disqualified. The failure is documented and published. The bag may be re-evaluated if the brand addresses the specific failure.

Status: Disqualified, failure documented and published

03

Unit Recalled or Discontinued Mid-Test

If a bag is recalled by the brand, discontinued, or pulled from sale before testing is complete, the review is suspended. Partial findings may be published as a note — not as a scored review. A bag that no longer exists for purchase serves no one with a score.

Status: Review suspended, partial findings noted if relevant

04

Unit Could Not Be Tested at Required Load

If the bag's construction made it impossible to reach the required load tier without risk of damage — a bag so lightly built that mid-load testing would be destructive rather than representative — it is disqualified from full scoring. Individual category scores that were completed may be published with context.

Status: Disqualified from full score, partial findings may be published

05

Conflict of Interest Identified

If a conflict of interest is identified after testing begins — a financial relationship with the brand, a gifting arrangement that wasn't disclosed, or a situation where honest scoring is compromised — the review is pulled and the conflict is disclosed publicly. The integrity of the score is the only thing that makes it worth anything.

Status: Review pulled, conflict disclosed publicly

/ What Happens After Disqualification

Disclosed. Documented. Done.

Every disqualification is published on the review page with the specific reason. The bag is listed in the reviews index as disqualified — not missing, not forgotten. If the disqualification reason is resolved — a structural failure addressed by the brand, a conflict cleared, a recalled unit replaced — the bag may re-enter testing as a new unit. It starts the protocol from zero.

No Score Is Not a Safe Harbor: A disqualified bag is not a bag that escaped a bad score. The disqualification is the record. If a bag broke during testing, that information is more useful to a buyer than a score would have been. The carry tells the truth, and sometimes the truth is that the bag didn't make it through the protocol.

Common Questions

Questions about how we test.

  • How long does testing take?
    The minimum is 20 hours of qualifying carry, roughly two commute weeks or four full trail days. Most bags take longer. Complex bags, bags with borderline scores, and bags being retested after an update get more time. There is no maximum. A bag stays in testing until the protocol is complete and the notes are sufficient to score every category honestly.

  • Do brands send bags for review?
    Yes, and it's always disclosed. If a brand sent a bag for review, the review says so. If RC paid for the bag, the review says that too. The disclosure is in every review, every time. Receiving a bag for free does not change how it is tested or scored. The protocol is the same regardless of how the unit arrived.

  • Can a brand influence its score?
    No. Brands do not see the score before it is published. Brands cannot request a higher score, contest a score, or trigger a retest by disagreeing with the result. A retest is only triggered by a carry-relevant change to the bag, not by brand preference. No sponsorship changes the number. No brand gets a polite pass.

  • What happens if a bag breaks during testing?
    It is disqualified. The failure is documented (what broke, when, at what load, in what environment) and published on the review page. A disqualified bag is not a bag that escaped a bad score. The failure record is more useful to a buyer than a score would have been. If the brand addresses the specific failure with a new unit, the bag may re-enter testing from zero.

  • How is the CC Score different from other reviews?
    Most reviews tell you what a bag has. Carry Culture tells you how it carries. The CC Score measures one thing: how a bag actually feels to carry under real weight, over real miles, across real terrain. It does not score features, looks, price, or brand prestige. A bag that looks incredible and carries poorly will score accordingly. A bag from an unknown brand that carries exceptionally will score accordingly too.

  • Is every bag tested the same way?
    The protocol is the same: four environments, three load tiers, four observation windows, 20-hour minimum. The specific load targets and mandatory environments vary by carry category because a sling bag and a rucker should not be tested at the same weight. The framework is fixed. The calibration is category-specific. Every bag in the same category is tested against identical targets.

  • Can I suggest a bag for review?
    Yes. Bag suggestions are welcome through the contact page. RC selects the review queue based on carry category coverage, community interest, and bag availability. Suggesting a bag does not guarantee it will be reviewed, and it does not affect the score if it is. The queue is RC's call, not the brand's and not the community's.

Next

Now you know how we test. See what earned a score.

Every bag in the library went through everything on this page. Browse the reviews, understand the scoring system, or get the next verdict delivered the day it drops.
