How to A/B Test Package Inserts (And Prove ROI to Your Boss)

By Tom McGee
January 25, 2026

"Do our inserts actually work?"

It's a fair question. You're spending money on printed materials, inventory space, and fulfillment labor. Leadership wants to know the ROI.

The honest answer for most brands: "We don't know."

They add inserts because it "feels right" or because "competitors do it." They have no data on whether a thank you card outperforms a discount card, or whether samples convert better than promotional flyers.

This guide changes that. You'll learn how to run real A/B tests on your inserts, interpret results correctly, and present data that justifies (or cuts) your insert budget.

Why Most Brands Don't Test Inserts

Testing physical marketing is harder than testing digital:

  1. Manual processes make tests impractical. You can't easily split orders between different inserts without warehouse coordination.

  2. No tracking mechanism. Once an insert goes in a box, how do you know who received what?

  3. "We've always done it this way." Insert programs often run on autopilot without measurement.

  4. Small sample sizes. Some brands don't have enough orders to reach significance.

These are real obstacles. But they're solvable.

What You Can A/B Test

Before setting up tests, know what's worth testing:

High-Impact Tests

Test                          Hypothesis                                            Why It Matters
Insert vs. no insert          Inserts drive more repeat purchases than no insert    Proves the entire program has value
Discount vs. thank you card   Discounts drive more conversions                      Determines optimal insert type
Different discount amounts    15% converts better than 10%                          Optimizes cost vs. conversion
Sample A vs. Sample B         One sample converts better to purchase                Picks the best cross-sell sample

Lower-Impact Tests

These matter less but can still be tested:

  • Card size/format (postcard vs. folded card)
  • Design variations (same message, different look)
  • Paper stock (premium vs. standard)

Start with high-impact tests. Don't optimize card stock before proving inserts work at all.

Setting Up an A/B Test

Step 1: Define Your Hypothesis

Good hypothesis: "Discount cards drive more repeat purchases than thank you cards among first-time buyers."

Bad hypothesis: "Let's test some stuff."

Your hypothesis should be:

  • Specific: What are you comparing?
  • Measurable: What's the success metric?
  • Actionable: What will you do with the result?

Step 2: Create Variants

In Insertr, create an A/B test instead of a regular rule:

A/B Test: "First Order Insert Test"

Variants:

Variant          Percentage   Product
Discount Card    50%          $10 Off Next Order Card
Thank You Card   50%          Handwritten-Style Thank You

[Image: A/B test creation interface] Create an A/B test and define your variants with percentage allocations.

Conditions:

  • Customer Total Order Count = 1 (first orders only)
  • Only run once per customer = Yes

[Image: A/B test variant configuration] Configure each variant with its product and traffic percentage.

Step 3: Include a Control Group (Optional)

Want to test insert vs. no insert? Don't allocate 100% to variants.

Example:

  • Discount Card: 40%
  • Thank You Card: 40%
  • Control (no insert): 20%

The remaining percentage automatically becomes your control group. These customers receive nothing, giving you a baseline.

Step 4: Let It Run

Don't check results after 10 orders and declare a winner. You need enough data.

Minimum sample sizes:

  • 100 recipients per variant (bare minimum)
  • 200+ recipients per variant (recommended)
  • 500+ recipients per variant (high confidence)

At 100 orders/day with a 50/50 split, each variant gains 50 recipients per day: 2 days for the bare minimum, 4 days for the recommended 200, and 10 days for high confidence.
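If you want to estimate duration for your own volume, here's a minimal Python sketch of that arithmetic (the order volume and split below are placeholders; substitute your own):

```python
import math

def days_to_target(orders_per_day: float, variant_share: float, target_per_variant: int) -> int:
    """Days needed for one variant to accumulate a target number of recipients."""
    recipients_per_day = orders_per_day * variant_share
    return math.ceil(target_per_variant / recipients_per_day)

# Example: 100 orders/day, 50/50 split
for target in (100, 200, 500):
    print(f"{target} per variant: {days_to_target(100, 0.5, target)} days")
# 100 per variant: 2 days
# 200 per variant: 4 days
# 500 per variant: 10 days
```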

Reading Your Results

Key Metrics

Metric                  Definition                                What It Tells You
Recipients              Customers who received the variant        Sample size
Conversions             Recipients who placed a follow-up order   Did it work?
Conversion Rate         Conversions ÷ Recipients                  Which variant performs better
Revenue                 Total revenue from conversion orders      Revenue impact
Revenue per Recipient   Revenue ÷ Recipients                      Value created per insert
AOV                     Revenue ÷ Conversions                     Who's buying more?
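As a worked example of those formulas, here's a minimal Python sketch (the inputs are the discount card numbers from the next section):

```python
def variant_metrics(recipients: int, conversions: int, revenue: float) -> dict:
    """Compute the key A/B metrics from raw counts, per the definitions above."""
    return {
        "conversion_rate": conversions / recipients,
        "revenue_per_recipient": revenue / recipients,
        "aov": revenue / conversions if conversions else 0.0,
    }

print(variant_metrics(250, 45, 1575.00))
# {'conversion_rate': 0.18, 'revenue_per_recipient': 6.3, 'aov': 35.0}
```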

Example Results

After 500 first orders:

Variant          Recipients   Conversions   Rate   Revenue   Rev/Recipient
Discount Card    250          45            18%    $1,575    $6.30
Thank You Card   250          30            12%    $1,560    $6.24

Winner: Discount Card (18% vs 12% conversion rate)

But wait—look at revenue. Despite 50% more conversions, the discount card generated similar total revenue. Why?

Let's add AOV:

Variant          Conversions   AOV   Revenue
Discount Card    45            $35   $1,575
Thank You Card   30            $52   $1,560

The discount attracts more buyers, but they buy less (probably just hitting the discount threshold). The thank you card drives fewer but higher-value purchases.

The right answer depends on your goals:

  • Want more repeat customers? Discount card wins.
  • Want higher revenue? Roughly tied.
  • Factor in discount cost? Thank you card might win.

Statistical Significance

Just because one variant has a higher conversion rate doesn't mean it's actually better. You need statistical significance.

[Image: Statistical significance indicator] The system calculates significance automatically and tells you when results are reliable.

Insertr calculates significance using a 95% confidence level:

Results State                           What It Means                      What to Do
"Collecting data"                       < 30 recipients per variant        Keep waiting
"Early results (not significant)"       30-100 recipients, trend visible   Interesting but don't act yet
"Variant A winning (95% confidence)"    Significant result                 You can act on this
"No significant difference"             100+ recipients, p ≥ 0.05          Variants perform similarly

Don't stop tests early. A 20% vs 15% difference might look meaningful but could be random noise with small samples.
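Insertr reports these states for you, but if you want to sanity-check a result by hand, a pooled two-proportion z-test is the standard approximation at these sample sizes. A minimal sketch (my own illustration, not Insertr's internal implementation):

```python
import math

def two_proportion_p_value(conv_a: int, n_a: int, conv_b: int, n_b: int) -> float:
    """Two-sided p-value for a difference in conversion rates (pooled z-test)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Convert |z| to a two-sided p-value using the standard normal CDF
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

# 18% vs 12% with 500 recipients per variant: significant
print(two_proportion_p_value(90, 500, 60, 500))  # ~0.008, below 0.05
# The same rates with only 250 per variant: still short of the bar
print(two_proportion_p_value(45, 250, 30, 250))  # ~0.06, above 0.05
```

This is exactly why sample size matters: the same six-point gap that is decisive at 500 recipients per variant is still inconclusive at 250.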

Calculating ROI

Once you have results, translate them to business impact.

Step 1: Calculate Cost per Insert

Component                        Discount Card   Thank You Card
Printing                         $0.30           $0.40
Fulfillment                      $0.25           $0.25
Discount cost (avg redemption)   $7.00           $0.00
Total per insert                 $7.55           $0.65

Don't forget: discount cards have variable cost based on redemption rate.

Step 2: Calculate ROAS

Discount Card:

  • Cost: $7.55 × 250 = $1,887.50
  • Revenue: $1,575
  • ROAS: 0.83x (losing money)

Thank You Card:

  • Cost: $0.65 × 250 = $162.50
  • Revenue: $1,560
  • ROAS: 9.6x (strong return)

Even though the discount card had more conversions, the thank you card has dramatically better ROI because it doesn't give away margin.
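Here's the same ROAS arithmetic as a small Python sketch, handy for rerunning with your own printing, fulfillment, and discount costs:

```python
def roas(cost_per_insert: float, recipients: int, revenue: float) -> float:
    """Return on spend: revenue attributed to the variant divided by its total cost."""
    return revenue / (cost_per_insert * recipients)

print(f"Discount card:  {roas(7.55, 250, 1575):.2f}x")  # 0.83x
print(f"Thank you card: {roas(0.65, 250, 1560):.2f}x")  # 9.60x
```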

Presenting to Leadership

Leadership doesn't want a statistics lesson. They want to know:

  1. What did we test? "Discount cards vs thank you cards for first-time buyers."

  2. What did we learn? "Discount cards drive 50% more repeat purchases, but thank you cards have 10x better ROI."

  3. What should we do? "Switch to thank you cards and reinvest the savings into higher-quality cards."

The One-Slide Summary

First Order Insert Test Results
═══════════════════════════════════════════════════════

Test Period: Jan 1-31, 2026 (500 first orders)
Variants: Discount Card vs Thank You Card

                    Discount    Thank You    Winner
Conversion Rate       18%         12%       Discount
Revenue per Insert   $6.30       $6.24       Tie
ROAS                 0.83x       9.6x       Thank You

Recommendation: Switch to Thank You Cards
- Similar revenue impact at 1/10th the cost
- Savings of ~$1,700/month at current volume
- Reinvest in premium card stock for better impression

[Image: A/B test results comparison] The analytics dashboard shows variant comparison side-by-side.

Advanced Testing Strategies

Sequential Testing

Test multiple things over time:

Month 1: Insert vs. no insert (prove value)
Month 2: Discount vs. thank you card (optimize type)
Month 3: 10% vs. 15% vs. 20% discount (optimize offer)

Each test builds on the previous winner.

Segment-Specific Tests

Different customers might respond differently:

  • High-value customers: Maybe they prefer recognition over discounts
  • Subscription customers: Test subscription-specific messaging
  • Product-specific: Test category-relevant samples

Create separate A/B tests for different segments.

Multi-Variant Tests

Test more than two options:

  • Variant A: Discount card (30%)
  • Variant B: Thank you card (30%)
  • Variant C: Sample product (30%)
  • Control: No insert (10%)

The more variants you run, the more orders you need to reach significance.

Common Questions

Q: How long should I run a test? A: Until you have 100+ recipients per variant at minimum, and ideally 200+. At a 50/50 split and 100 orders/day, that's 2-4 days of orders. For high confidence, aim for 2 weeks.

Q: What if results are "not significant"? A: It means the variants perform similarly. That's still useful—you can choose the cheaper option without sacrificing performance.

Q: Can I test more than 2 variants? A: Yes, but you need more orders to reach significance. With 4 variants at 25% each, you'd need 400+ orders minimum.

Q: What about long-term impact? A: Set a longer attribution window (90-120 days) to capture delayed conversions. Some customers take time to return.

Q: Should I test everything? A: No. Test things that matter. "Discount vs no discount" matters more than "blue card vs green card."


Real Example: First Order Gift Test

A DTC food brand tested three first-order inserts:

Variants:

  • A: $5 off next order (40%)
  • B: Recipe card (40%)
  • C: Control/no insert (20%)

Results after 1,000 first orders:

Variant       Recipients   Conversions   Rate   AOV   Revenue   Insert Cost   ROAS
$5 discount   400          76            19%    $38   $2,888    $1,200*       2.4x
Recipe card   400          52            13%    $42   $2,184    $200          10.9x
No insert     200          20            10%    $41   $820      $0            n/a

*Includes $5 discount cost on redemptions

Key findings:

  1. Both inserts beat control (inserts work)
  2. Discount drives more conversions but lower AOV
  3. Recipe card has 4.5x better ROAS than discount
  4. Recipe card lifts conversion 30% vs control at minimal cost
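To double-check those findings, here's the same arithmetic in Python (all numbers from the results table above):

```python
# Relative conversion lift of the recipe card over the control group
recipe_rate, control_rate = 52 / 400, 20 / 200
print(f"Lift vs control: {recipe_rate / control_rate - 1:.0%}")  # 30%

# ROAS ratio: recipe card vs $5 discount
print(f"ROAS ratio: {(2184 / 200) / (2888 / 1200):.1f}x")  # 4.5x
```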

Decision: Switch to recipe cards, reinvest savings into more frequent product updates.


Get Started

Ready to prove (or disprove) your insert ROI?

  1. Install Insertr (14-day free trial)
  2. Pick one high-impact test to start (insert vs. no insert, or type A vs. type B)
  3. Set up your A/B test with even split
  4. Wait for significance (100+ per variant)
  5. Present results and optimize

Stop guessing what works. Start measuring.


Last updated: January 2026 | Author: Tom McGee, Founder of Insertr

About the Author: Tom McGee is the founder of Insertr and a former Senior Software Engineer at both Shopify and ShipBob. He built Insertr's A/B testing and analytics features specifically to solve the measurement problem he experienced running Cool Steeper Club, where he needed to prove which inserts actually drove subscriber retention.

