“It Worked in Sandbox” Is Not a Test Result

The Most Dangerous Phrase in the Ecosystem

One of the most dangerous phrases in any Salesforce project is a sentence that sounds deceptively reassuring:

“But it worked in the sandbox.”

It feels like progress. It implies that the logic is sound and the feature is ready for deployment. But seasoned Salesforce testers know the truth: This statement is almost always meaningless.

In Salesforce QA, a passing test in a sandbox does not prove correctness. It only proves that this specific environment did not expose the problem.

We need to stop treating sandbox validation as the finish line. It isn't. It is simply a lack of immediate failure in a controlled, often sterile, environment.

The Comforting Lie of the Sandbox

Sandboxes are designed for development agility, not environmental realism. Unless you are working in a freshly refreshed Full Copy sandbox (and let’s be honest, how often does that happen exactly when you need it?), your testing ground is missing the chaos of reality.

Most of the time, Developer and Partial Copy sandboxes are missing:

  • Real data volume.
  • Real user roles and hierarchy.
  • Real ownership structures.
  • Real automation collisions.

Yet teams routinely treat a "green" test cycle in UAT as evidence that the feature is done. We hear "Tested on Dev," "Passed in Partial," or "UAT signed off," only to watch Production throw a System.LimitException or a Row Lock error two hours after deployment.

The failure usually isn't because the platform is unstable. It’s because the environment lied.

Why Sandbox Tests Pass When Production Fails

The success of a test case in a lower environment usually depends on what is not there. Let’s break down the three biggest liars in your Org.

1. Empty Data Hides Logic Defects

In a typical Developer Sandbox, your data landscape is pristine. You might have 10 Accounts, zero ownership skew, and no legacy records from a data migration done in 2018.

When you run a Flow that queries "all related records," it works perfectly because it only has to loop through five items. Your Apex trigger runs without bulkification because it never hits the Governor Limits.

Then you deploy to Production:

  • The Reality: 50,000 Accounts exist.
  • The Skew: One generic "System User" owns 70% of those records.
  • The Result: Your SOQL query times out, or your Flow hits an element limit.

The logic technically "worked" in the sandbox, but the logic was never the problem. The volume was.
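One practical countermeasure is to seed your sandbox with Production-like volume and skew before you test. Below is a minimal Python sketch of the kind of "simple data loader script" that helps here; the record counts, owner IDs, and filename are illustrative assumptions, not values from any real Org. It writes a Data Loader-ready CSV in which one generic owner holds 70% of the Accounts:

```python
import csv
import random

def generate_skewed_accounts(path, total=50_000, skew_owner="005SKEWOWNER000000",
                             skew_ratio=0.70):
    """Write a Data Loader-style CSV of Account rows where a single OwnerId
    holds `skew_ratio` of the records, mimicking Production ownership skew."""
    skewed_count = int(total * skew_ratio)
    rows = [
        {
            "Name": f"Bulk Test Account {i}",
            # The first `skewed_count` rows go to the generic "System User";
            # the remainder are spread across ten placeholder owners.
            "OwnerId": skew_owner if i < skewed_count else f"005OTHER{i % 10:010d}",
        }
        for i in range(total)
    ]
    random.shuffle(rows)  # avoid artificial ordering in the import file
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["Name", "OwnerId"])
        writer.writeheader()
        writer.writerows(rows)
    return rows

# Smaller run for a quick smoke test; use total=50_000 to match Production scale.
accounts = generate_skewed_accounts("skewed_accounts.csv", total=1_000)
```

Loading a file like this into a Developer sandbox is often enough to surface an unbulkified trigger or a Flow element limit long before deployment does it for you.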

2. Admin Users Invalidate Security Testing

Who ran the test? If the answer is "me," and you are logged in as a System Administrator, the test is likely invalid.

Testing as an Admin creates a false sense of security because you have "God Mode" enabled. You bypass Sharing Rules, ignore Field-Level Security (FLS), see all Record Types, and can update locked records without issue.

In Production, the actual end-user:

  • Does not have the Permission Set to see that specific field.
  • Is restricted by a Validation Rule that only applies to their Profile.
  • Cannot save the record because a Sharing Rule prevents them from seeing the parent object.

The feature didn't break during deployment. The test was invalid from the start.

3. The Automation Traffic Jam

Developer sandboxes are often isolated: they lack the full "Flow landscape," scheduled jobs are paused to save resources, and integration listeners are deactivated. Your new Record-Triggered Flow runs alone, producing a beautiful, clean transaction.

In Production, that Flow is not alone. It is trying to run alongside:

  • A legacy Process Builder from 2021 that no one wants to touch.
  • An Apex Trigger managed by an installed package.
  • A nightly batch job that kicks off at 2:00 AM sharp.

The Salesforce Order of Execution is unforgiving. In a sandbox, your automation finishes in 200ms. In Production, the collision of these automations causes a CPU Time Out, or a race condition in which one process overwrites data before another's transaction commits.

The QA Shift: From Environments to Risk

Strong Salesforce QA does not ask: "Did it work in sandbox?"

It asks: "What does this feature depend on that is missing here?"

We must shift from verifying functionality to verifying resilience. A sandbox test is only useful when QA explicitly states: "This environment does not represent X, Y, or Z — and we accept that risk."

QA Takeaways

  • Test as the User: Never sign off on a feature unless you have tested it logged in as the specific persona (Profile/Permission Set) who will use it. Login As is your best friend.
  • Simulate Volume: Don't test with one record. Use tools (or simple data loader scripts) to generate bulk data. If a Flow handles a collection, force it to handle 200 records, not just two.
  • Identify the Delta: When reporting a pass, explicitly state the environmental risk (e.g., "Passed in Dev; requires regression testing in Full Copy to verify integration impact").
  • Respect the Skew: If you know Production has massive data skew (e.g., one Account with 10k Contacts), recreate that skew in your test environment.
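On the "200 records, not just two" point: Salesforce splits a bulk DML operation into chunks of up to 200 records per trigger invocation, so a bulk test should push at least one full chunk through your automation. A small Python sketch of that batching behavior (the `chunk` helper is illustrative, not a Salesforce API):

```python
def chunk(records, size=200):
    """Yield successive batches of up to `size` records, mirroring how
    Salesforce splits a bulk DML operation into trigger invocations."""
    for start in range(0, len(records), size):
        yield records[start:start + size]

# 450 records means the platform would fire a trigger three times:
# two full chunks of 200, then a final chunk of 50.
records = [{"Name": f"Acct {i}"} for i in range(450)]
batches = list(chunk(records))
batch_sizes = [len(b) for b in batches]  # [200, 200, 50]
```

If your automation has only ever seen a chunk of one or two records in testing, its bulk behavior has never really been tested at all.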

The failure is not deploying broken code. The failure is treating sandbox success as proof of quality. In Salesforce, environments are tools—not proofs.

“It worked in sandbox” isn't a conclusion. It’s a warning.

Do you have a horror story where a 'perfect' sandbox test caused a Production meltdown? I’d love to hear it.
