How we actually test bags.
Every CC Score starts here. Before a number is assigned, a bag goes through a defined protocol — real weight, real terrain, real time. This page documents exactly what that means.
No bag receives a CC Score before it reaches the 20-hour minimum of active carry. No exceptions.
Time on body is the only truth test.
Any bag can feel good in a store. Any bag can feel good for ten minutes. The 20-hour minimum exists because that's when the real carry reveals itself — hotspots emerge, strap padding compresses, back panels stop feeling like they did on day one.
Active carry time only
Hours count when the bag is on your back under load — commuting, hiking, rucking, traveling. Time sitting in a chair with the bag on the floor does not count. Time in transit as checked luggage does not count. The clock runs when you're moving.
Session by session
Every test session is logged with start time, end time, load weight, terrain type, and duration window. Sessions under 30 minutes are not counted toward the total. The 20-hour floor is the sum of qualifying sessions across all test environments.
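The session rules above can be sketched in a few lines of Python. This is an illustration only, not the actual logging tool: the `Session` class, its field names, and the helper functions are hypothetical, and the only values taken from this page are the 30-minute session minimum and the 20-hour floor.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MIN_SESSION = timedelta(minutes=30)   # sessions under 30 minutes are not counted
PROTOCOL_FLOOR = timedelta(hours=20)  # minimum qualifying carry before scoring


@dataclass
class Session:
    """One logged test session (field names are illustrative assumptions)."""
    start: datetime
    end: datetime
    load_lb: float
    terrain: str

    @property
    def duration(self) -> timedelta:
        return self.end - self.start


def qualifying_time(sessions):
    """Sum the durations of sessions that meet the 30-minute minimum."""
    return sum(
        (s.duration for s in sessions if s.duration >= MIN_SESSION),
        timedelta(),
    )


def floor_reached(sessions):
    """True once qualifying sessions add up to at least 20 hours."""
    return qualifying_time(sessions) >= PROTOCOL_FLOOR
```

Under these assumptions, a 29-minute session adds nothing to the total, and ten two-hour sessions reach the floor exactly.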
Roughly two commute weeks
Twenty hours is approximately two full weeks of daily commuting, or four full trail days. It's long enough that first impressions have faded and real carry behavior has emerged. Some bags get more — complex bags, bags with borderline scores, or bags being retested after an update.
Four observation checkpoints
Notes are taken at four standardized windows within each session — 30 minutes, 2 hours, 4 hours, and full-day (8+ hours). Not every session hits every window. But every bag must be observed at each window at least once before scoring. The full-day window is mandatory for any bag claiming long-haul or travel credentials.
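As a rough sketch of the coverage rule, the check below reports which observation windows a bag has not yet reached. It assumes a session "hits" a window when its length reaches that mark and that notes were actually taken at that point; the function names and window labels are hypothetical.

```python
# Window marks in minutes, taken from the four windows described above.
WINDOWS_MIN = {"30 min": 30, "2 hr": 120, "4 hr": 240, "full day": 480}


def windows_hit(session_minutes):
    """Windows a session of this length is long enough to reach."""
    return {name for name, mark in WINDOWS_MIN.items() if session_minutes >= mark}


def missing_windows(session_lengths_minutes):
    """Windows not yet observed across all of a bag's sessions."""
    covered = set()
    for minutes in session_lengths_minutes:
        covered |= windows_hit(minutes)
    return set(WINDOWS_MIN) - covered
```

A bag with only a 45-minute and a 150-minute session would still be missing the 4-hour and full-day windows, so under this reading it could not be scored yet.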
Three tiers. Calibrated to the carry.
Not every bag gets tested at the same weight. A 50-pound test makes sense for a rucker — it makes no sense for a sling bag. Every bag is tested across three load tiers calibrated to its carry category. The tiers are fixed. The categories determine the actual pound targets.
Light Load
The minimum functional load for the bag's category. Establishes baseline carry comfort: how the bag sits and moves with realistic light use. Most commuters live here.
Mid Load
The most common real-world carry weight for the category. This is where most scoring happens. It's the weight that exposes strap system weaknesses, back panel compression, and load distribution failures.
Max Load
The upper end of realistic carry for the category. Reveals how a bag behaves under pressure. Load transfer failures, strap creep, hip belt engagement, and back panel collapse all show up here first.
Pound targets by carry category. Fixed per protocol.
These are the exact pound targets used in every review. They don't change between bags in the same category. A bag reviewed today uses the same targets as a bag reviewed six months from now.
| Carry Category | Light | Mid | Max |
|---|---|---|---|
| EDC · Daily Carry | Empty + 5 lb | 15 lb | 25 lb |
| Slings & Crossbody | Empty | 5 lb | 9 lb |
| Travel · One Bag | Empty | 25 lb | 40 lb |
| Hiking & Trail | 10 lb | 25 lb | 50 lb |
| Ruckers & Load Carriers | 20 lb | 35 lb | 50 lb |
| Totes & Carry-All | Empty | 10 lb | 20 lb |
| Duffels | Empty | 15 lb | 30 lb |
LD-4 (Long-Haul Wearability) is scored against mid load. LD-5 (Body Fit Range) is scored against max load. Light load is captured implicitly during normal protocol coverage and early session windows.
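The table above maps naturally onto a lookup, sketched here in Python. The short category keys and function names are hypothetical shorthand, and the sketch assumes "Empty" means zero pounds of added load and "Empty + 5 lb" means five; the numbers themselves come straight from the table.

```python
TIERS = ("light", "mid", "max")

# Added load in pounds per tier: (light, mid, max).
# Assumption: "Empty" is 0 lb of added load, "Empty + 5 lb" is 5 lb.
TARGETS_LB = {
    "edc": (5, 15, 25),
    "sling": (0, 5, 9),
    "travel": (0, 25, 40),
    "hiking": (10, 25, 50),
    "rucker": (20, 35, 50),
    "tote": (0, 10, 20),
    "duffel": (0, 15, 30),
}

# Which tier each load-dependent score is judged against, per the note above.
SCORED_AGAINST = {"LD-4": "mid", "LD-5": "max"}


def target_lb(category, tier):
    """Pound target for a carry category at a given load tier."""
    return TARGETS_LB[category][TIERS.index(tier)]


def scoring_target_lb(category, score_code):
    """Pound target that a score such as LD-4 is evaluated at."""
    return target_lb(category, SCORED_AGAINST[score_code])
```

For example, `scoring_target_lb("travel", "LD-5")` resolves to the 40 lb max load for one-bag travel.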
Four environments. One protocol.
The protocol defines four distinct test environments. Not every bag is tested in every environment at every load, but the protocol sets the minimum coverage required per carry category. A commuter bag doesn't need a trail day. A rucker does.
Commute Testing
The bread and butter. Sidewalks, stairs, transit, revolving doors, overhead bins, coffee shops, offices. This environment tests how a bag performs in the stop-and-go of real daily carry — not a continuous walk, but constant transitions between moving and stationary.
- Terrain: Pavement, stairs, transit, indoor
- Duration: 30 min to full work day
- Load tier: Light and mid
- Mandatory for: All bags
Ruck Testing
Sustained weighted carry over distance, with no stops or breaks longer than five minutes. This is where strap systems fail, load distribution collapses, and back panels stop performing. It's also where the bags that earn Elite scores separate themselves from everything else.
- Terrain: Pavement, hills, mixed surface
- Duration: 2 hr minimum per session
- Load tier: Mid and max
- Mandatory for: All bags 20 L and above
Travel Testing
Airport, train, hotel, street. Full packing load across a real travel day — not simulated. This tests how the bag handles packed weight for 8+ hours of intermittent carry. Overhead access, security lanes, and carry-on compliance are all observed and noted.
- Terrain: Airport, transit, hotel, street
- Duration: Full travel day (8+ hrs)
- Load tier: Mid and max
- Mandatory for: Travel and one-bag bags
Trail & Urban Terrain
Rough terrain, elevation change, and uneven surfaces. This environment reveals how a bag handles body movement — lateral shifts, incline loading, and step-up transitions. Urban hike routes (uneven blocks, staircase climbs) and trail sections are both used depending on bag category.
- Terrain: Trail, uneven urban, elevation
- Duration: 2 hr to full day
- Load tier: Mid and max
- Mandatory for: Hiking, trail, and rucker bags
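The "Mandatory for" rules across the four environments can be read as a small decision function. The sketch below is one interpretation, not the official rule set: the category keys and the `required_environments` name are hypothetical, and it assumes bag volume in liters is the only input needed for the ruck rule.

```python
def required_environments(category, volume_l):
    """Minimum environments a bag must be tested in, per the rules above."""
    envs = ["commute"]                    # mandatory for all bags
    if volume_l >= 20:
        envs.append("ruck")               # all bags 20 L and above
    if category == "travel":
        envs.append("travel")             # travel and one-bag bags
    if category in ("hiking", "rucker"):
        envs.append("trail")              # hiking, trail, and rucker bags
    return envs
```

Under this reading, a 6 L sling needs only commute testing, while a 30 L rucker needs commute, ruck, and trail coverage. A bag may still be tested in more environments than the minimum.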
Real gear. Real weight.
The load isn't random and it isn't simulated with sandbags. Every test uses real gear packed the way a real person would pack it — because how weight distributes inside a bag affects how it carries outside. A misloaded bag is a bad data point.
Light Load
- Laptop or tablet: ~3 lb
- Water bottle (empty or half): ~0.5 lb
- Jacket or layer: ~1 lb
- EDC items (wallet, keys, cables): ~0.5 lb
Mid Load
- Laptop + charger: ~4 lb
- Water bottle (full): ~2 lb
- Jacket + base layer: ~2 lb
- Food / snacks: ~2 lb
- EDC + accessories: ~1 lb
- Weight filler to target: variable
Max Load
- Laptop + charger + accessories: ~5 lb
- Water (full, 2L where applicable): ~4.5 lb
- Full clothing layer set: ~3 lb
- Food + meal items: ~3 lb
- Weight plates or filler to target: variable
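The "filler to target" line is simple arithmetic: the category's pound target minus the weight of the real gear. A minimal Python sketch follows, using the approximate item weights listed above. It assumes the target refers to packed contents rather than contents plus the bag itself, and the dictionary and function names are hypothetical.

```python
# Approximate gear weights in pounds, copied from the lists above.
MID_GEAR_LB = {
    "laptop + charger": 4.0,
    "water bottle (full)": 2.0,
    "jacket + base layer": 2.0,
    "food / snacks": 2.0,
    "EDC + accessories": 1.0,
}
MAX_GEAR_LB = {
    "laptop + charger + accessories": 5.0,
    "water (full, 2L)": 4.5,
    "full clothing layer set": 3.0,
    "food + meal items": 3.0,
}


def filler_needed(gear_lb, target_lb):
    """Pounds of filler to add so the packed gear reaches the target.

    Returns 0.0 when the gear alone already meets or exceeds the target.
    """
    return max(0.0, target_lb - sum(gear_lb.values()))
```

With these figures, an EDC bag at its 15 lb mid target needs about 4 lb of filler, and about 9.5 lb at its 25 lb max target. Where the standard gear already exceeds the target, as with a sling's 5 lb mid load, the function returns zero and items would have to be left out instead.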
Packed the way a real person packs it
Gear is packed according to the bag's intended use — laptop in the laptop sleeve, clothes in the main compartment, water in the water bottle pocket where one exists. Weight is not concentrated or artificially distributed to stress-test a specific area. The goal is to replicate how the bag's actual user would load it on a real day. If the bag has no organization, it's packed loose. If it has structure, the structure is used. Packing method is noted in every review.
Observations logged in the field.
Notes aren't written after the fact from memory. They're logged during testing at four standardized time windows — so the observation captures what the bag actually felt like at that moment, not a reconstruction of it hours later.
Four windows per session
30 minutes
First impression under load. How the bag sits, initial strap feel, load position, first contact points. This is where fit problems show up immediately, or don't.
2 hours
Early fatigue window. Strap padding compression, shoulder hotspots, back panel sweat, load shift. Most commute bags are fully evaluated by this window.
4 hours
Mid-carry assessment. Cumulative fatigue, adjustment frequency, strap creep, pressure point development. Where the difference between a good bag and a great bag starts to show.
Full day (8+ hours)
End-of-day verdict. What hurts, what held up, what failed. This window is mandatory for any bag claiming long-haul or travel credentials. A bag that falls apart at hour six doesn't earn a passing score in Long-Haul Wearability.
Per-session observation fields
Observations become category scores
After testing is complete, session notes are reviewed across all windows and environments. Each of the five scoring categories is rated on the accumulated observations: not a single session, but the pattern across all of them. A bag that performs well in the 30-minute window but collapses at hour four doesn't score well in Long-Haul Wearability, regardless of how good it felt early. The score reflects the full carry, not the first impression.
Scores can change. Here's when.
A CC Score is not permanent. If a bag changes in a meaningful way — materials, construction, strap system, back panel — it gets retested against the same protocol. The new score replaces the old one. The old score is archived in the review for reference.
Material or Construction Update
A new fabric, foam, or structural change that affects how the bag carries. A colorway change doesn't qualify. A new back panel system does. A new strap foam does. A new frame sheet does. If the carry could be different, it gets retested.
Version or Generation Change
A bag released as a V2, Gen 2, or updated model gets treated as a new bag. The prior version's score stays on its own review page. The new version starts fresh with a full protocol. Same brand, same name — different carry, different score.
Extended Use Finding
If a bag continues to be carried after scoring and a significant finding emerges — hardware failure, strap delamination, foam collapse — the review is updated with the finding and the score is reassessed. Long-term durability observations are noted even after a score is published.
What stays the same
- Colorway or color option change
- Price increase or decrease
- Limited edition or special release
- New size in the same line
- Buckle color or hardware finish change
- Brand rebrand or logo update
- Change in distributor or retail availability
- New marketing or brand story
Full transparency on every score change
When a bag is retested, the review page is updated with the new score, the retest date, and a note explaining what changed and why the retest was triggered. The original score is preserved below the current score so readers can see the history. If a bag improved, that's noted. If it got worse, that's noted too. The score history is part of the record.
No score is still a verdict.
Not every bag that enters testing receives a score. Some bags disqualify themselves — through structural failure, protocol non-completion, or conditions that make honest scoring impossible. When a bag is disqualified, that's disclosed. The reason is published. No score is not the same as a passing grade.
Protocol Not Completed
The bag did not reach 20 hours of qualifying carry before the review window closed. This most often happens with loaner units returned before the protocol floor is hit, or bags that failed structurally before testing was complete. A bag listed as "in testing" has not been disqualified; it simply hasn't finished.
Structural Failure During Testing
A strap detached. A zipper failed catastrophically. A frame buckled. Hardware broke under normal load. If the bag cannot complete the protocol because it broke during testing, it is disqualified. The failure is documented and published. The bag may be re-evaluated if the brand addresses the specific failure.
Unit Recalled or Discontinued Mid-Test
If a bag is recalled by the brand, discontinued, or pulled from sale before testing is complete, the review is suspended. Partial findings may be published as a note, not as a scored review. A score for a bag that can no longer be purchased serves no one.
Unit Could Not Be Tested at Required Load
If the bag's construction made it impossible to reach the required load tier without risk of damage — a bag so lightly built that mid-load testing would be destructive rather than representative — it is disqualified from full scoring. Individual category scores that were completed may be published with context.
Conflict of Interest Identified
If a conflict of interest is identified after testing begins — a financial relationship with the brand, a gifting arrangement that wasn't disclosed, or a situation where honest scoring is compromised — the review is pulled and the conflict is disclosed publicly. The integrity of the score is the only thing that makes it worth anything.
Disclosed. Documented. Done.
Every disqualification is published on the review page with the specific reason. The bag is listed in the reviews index as disqualified — not missing, not forgotten. If the disqualification reason is resolved — a structural failure addressed by the brand, a conflict cleared, a recalled unit replaced — the bag may re-enter testing as a new unit. It starts the protocol from zero.
Questions about how we test.
How long is each bag tested?
The minimum is 20 hours of qualifying carry — roughly two commute weeks or four full trail days. Most bags take longer. Complex bags, bags with borderline scores, and bags being retested after an update get more time. There is no maximum. A bag stays in testing until the protocol is complete and the notes are sufficient to score every category honestly.
Do you accept bags sent by brands?
Yes — and it's always disclosed. If a brand sent a bag for review, the review says so. If RC paid for the bag, the review says that too. The disclosure is in every review, every time. Receiving a bag for free does not change how it is tested or scored. The protocol is the same regardless of how the unit arrived.
Can a brand preview or influence its score?
No. Brands do not see the score before it is published. Brands cannot request a higher score, contest a score, or trigger a retest by disagreeing with the result. A retest is only triggered by a carry-relevant change to the bag — not by brand preference. No sponsorship changes the number. No brand gets a polite pass.
What happens if a bag breaks during testing?
It is disqualified. The failure is documented — what broke, when, at what load, in what environment — and published on the review page. A disqualified bag is not a bag that escaped a bad score. The failure record is more useful to a buyer than a score would have been. If the brand addresses the specific failure with a new unit, the bag may re-enter testing from zero.
How is the CC Score different from other bag reviews?
Most reviews tell you what a bag has. Carry Culture tells you how it carries. The CC Score measures one thing — how a bag actually feels to carry under real weight, over real miles, across real terrain. It does not score features, looks, price, or brand prestige. A bag that looks incredible and carries poorly will score accordingly. A bag from an unknown brand that carries exceptionally will score accordingly too.
Is every bag tested the same way?
The protocol is the same — four environments, three load tiers, four observation windows, 20-hour minimum. The specific load targets and mandatory environments vary by carry category because a sling bag and a rucker should not be tested at the same weight. The framework is fixed. The calibration is category-specific. Every bag in the same category is tested against identical targets.
Can I suggest a bag for review?
Yes. Bag suggestions are welcome through the contact page. RC selects the review queue based on carry category coverage, community interest, and bag availability. Suggesting a bag does not guarantee it will be reviewed, and it does not affect the score if it is. The queue is RC's call — not the brand's, not the community's.
Now you know how we test. See what earned a score.
Every bag in the library went through everything on this page. Browse the reviews, understand the scoring system, or get the next verdict delivered the day it drops.