No Sticks, Just Carrots
Energy Efficiency Experiments using Carrot Rewards
Natural Resources Canada's Office of Energy Efficiency (OEE) has been working with Carrot Insights, a certified B Corp, for a few years now. Their healthy living and wellness app, Carrot Rewards, engages over 800K (and growing!) Canadians. Carrot users are rewarded with loyalty-program points of their choosing (e.g., Scene, Aeroplan, RBC Rewards) when they complete offers, like quizzes and missions. To date, our work together has focused on improving energy efficiency awareness, literacy and actions in the home and on the road. So, when we needed a platform and users for our Experimentation Works (EW) experiments, it was a logical step to see what was possible with Carrot.
We brought the idea to the Carrot team and they were, as always, game to play despite it being a new way to use the Carrot Rewards app. The OEE Social Innovation UnLab has worked in the space of unknowns that come with experimentation: unknown paths, unproven tools, unknown outcomes. Embarking on this journey of unknowns with Carrot was par for the experimentation course and we are grateful that the Carrot team was keen to explore the possibilities with us.
Our Experiments
Through a partnership between OEE’s Social Innovation UnLab, OEE Housing Division and NRCan’s Experimentation and Analytics Unit, we are in the process of delivering two experiments:
- The first involves testing different label designs to see how they might influence understanding of home energy efficiency. Users will receive different images and paired combinations of graphical scales (up to 36 variations; see the sketch after this list) to test how accurately they can identify and compare energy consumption and energy efficiency between fictional home energy labels.
- Our second involves testing different message frames to nudge homeowners towards engaging with a home energy advisor, a step towards getting a home energy evaluation. Carrot users will be given information on home energy evaluations and the role of home energy advisors, framed in one of three randomly assigned messages (e.g., cost savings, improved comfort).
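The post doesn't spell out how the 36 label variations are constructed, but as a purely illustrative sketch of the kind of structure involved: pairing nine hypothetical graphical scale designs two at a time happens to yield exactly 36 distinct comparisons.

```python
# Purely illustrative: the breakdown of the 36 variations is an assumption here.
# Pairing 9 hypothetical scale designs two at a time yields exactly 36 pairs.
from itertools import combinations

scale_designs = [f"scale_{letter}" for letter in "ABCDEFGHI"]  # 9 placeholder designs
label_pairs = list(combinations(scale_designs, 2))             # all unordered pairs

print(len(label_pairs))  # -> 36
```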
On Randomization
For both experiments, we will randomly assign Carrot users to different treatment groups to receive either different label graphics or different message frames. This level of randomization is easy for Carrot: they routinely use this functionality when serving offers to users. Check!
Next, we needed to figure out how to randomize within the offer itself, so that every combination of graphic pairings or message frames would reach enough users to give us the number of completions we need. This was a more challenging proposition, as the Carrot app cannot yet support in-offer randomization.
We settled on a solution: create an independent offer for each scenario, i.e., 36 label offers with common questions but different graphics for experiment #1, and 3 nudge offers with common information but different message frames for experiment #2. This meant more work, but it gave us a workable route to manual randomization.
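As a rough sketch of what this manual randomization amounts to (the user IDs, offer names and balanced round-robin split below are our own illustrative assumptions, not Carrot's actual mechanism):

```python
# A minimal sketch of manual randomization via independent offers, assuming a
# shuffled round-robin split to keep the 36 label offers (or 3 nudge offers)
# roughly balanced. User IDs and offer names are illustrative placeholders.
import random

label_offers = [f"label_offer_{i:02d}" for i in range(1, 37)]      # 36 label variants
nudge_offers = ["frame_cost_savings", "frame_comfort", "frame_3"]  # third frame not named in the post

def assign_offers(user_ids, offers, seed=42):
    """Shuffle users, then deal them round-robin into the offers."""
    rng = random.Random(seed)
    shuffled = list(user_ids)
    rng.shuffle(shuffled)
    return {user: offers[i % len(offers)] for i, user in enumerate(shuffled)}

# Example: assign 1,000 fictional users across the 36 label offers.
assignments = assign_offers([f"user_{n}" for n in range(1000)], label_offers)
```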
The Devil is in the Details
Next, we needed to ensure that the graphics we wanted to present in the label experiment would be legible on a mobile device. A smartphone screen is small, and we had both images and text that users needed to view easily. We had already selected the graphical scales and styles to test, based on existing home energy labels, so we worked with ColourCoding Media to create them.
Some early testing on the Carrot platform showed that the graphics we created were not easily readable on a smartphone screen. Graphical elements are common in existing Carrot offers, but data-rich images were a new endeavour for Carrot. We quickly realized that the size and shape of the real estate available for graphics on the platform required us to rework our graphics: we were trying to squeeze two comparative graphics plus text onto a single screen.
We went back to our designer and worked to reduce the size of each original image (to allow the most space for enlargement when loaded onto the platform), alter the font, and adjust colour contrast to improve on-screen readability. We also looked at different stacking options for the graphic pairings, and at a previously untried Carrot option to embed images directly into answer fields (bigger space, but programming limitations), all in an effort to optimize the visual experience and minimize any impact poor readability could have on our experiment. We finally settled on vertically stacking the simplified image pairs, which gives us the largest, clearest images possible. Pre-launch testing will help us gauge whether we have hit the mark on the visual quality of this design.
Where’s the Finish Line?
Given Carrot's large user base, our offers typically need to run for only a few days to reach and engage enough Canadians and to demonstrate the desired learning and actions. For our previous offers, "enough" was typically framed by budget and time. In an RCT context, the size of the sample determines the power (reliability) of the results. A power calculation (aided by gut sense and a few assumptions) let us determine early on what sample sizes we'd need to ensure we had enough statistical power to detect differences between our graphic or messaging treatments.
For the labelling experiment, we need 30 000 users in total to complete the experiment, spread across our 36 label-offer combinations, which breaks down to about 834 completions each, with a few extra for good measure. To reach that sample size, we will launch the label experiment and keep it open until we hit the completions mark. For the messaging experiment, we want to reach a total of 30 000 home or property owners, since we've assumed owners are more likely than renters to be nudged to invest in efficiency retrofits (the end game of a home evaluation). To meet this requirement, we needed to either create a qualifying offer to identify Carrot homeowners (added design cost and time) or leave the offer open until enough homeowners showed up in the pool of users completing the nudge experiment. Fortunately, we are running another home-related offer with Carrot that targets homeowners, and we will be able to link up to the sub-population created from that offer: a huge resource saver!
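For readers curious about what such a power calculation looks like, here is a minimal sketch using the standard two-proportion, normal-approximation formula. The baseline rate, detectable difference, alpha and power values are illustrative assumptions, not the parameters we actually used.

```python
# A hedged sketch of a two-proportion power calculation (normal approximation).
# All parameter values here are illustrative, not the team's actual assumptions.
from scipy.stats import norm

def per_arm_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Per-arm sample size needed to detect a difference between two
    proportions with a two-sided test at the given alpha and power."""
    z_alpha = norm.ppf(1 - alpha / 2)   # critical value for the two-sided test
    z_beta = norm.ppf(power)            # quantile matching the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2

# Detecting a 10-point difference in "correct identification" rates
# (50% vs. 60%) needs roughly 385 completions per label offer; differences
# closer to 7 points push the requirement toward ~834 completions per offer.
print(round(per_arm_sample_size(0.50, 0.60)))  # ~385
```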
Lessons Learned So Far
Above are a handful of unknowns that we have navigated and shed light on throughout our Experimentation Works journey. We have learned a ton from co-creating with Michael Kalin (NRCan Experimentation and Analytics Unit) and Carrot on this, and suspect they have learned a lot with us too.
As with other work we have done, we cannot emphasize enough how, each time, we are struck by the time and capacity it takes to do experimentation well. There's the time needed to build the experiment itself (co-create, test, iterate), but the collateral tasks needed to support, coordinate, validate, administer, analyze, manage and communicate the actual experiment are often far more time-intensive. As we practise experimentation more and more, we will get better and faster, provided we protect the time and resources it needs.
Also, if direct user testing and experimentation is the way of the future for the federal public service (we think it is!), then the right digital tools for engaging directly with Canadians, like user-testing platforms, will be essential. Of course, testing analog services in digital ways brings its own challenges. As we transition to digital service delivery, we expect it will open up opportunities for easier, ongoing alpha and beta testing.
Our label experiment will launch this month and we will have preliminary results soon after that. The messaging experiment will run in January along with our other home energy efficiency content. We will analyse and report findings in February, so stay tuned for a follow-up post then!
Post by NRCan’s EW Team
This article is also available in French here: https://medium.com/@exp_oeuvre