Experimentation Works Impact Report (2018–2019)


The purpose of an impact report is to take stock, in both analytical and introspective ways, of all the things a certain initiative has achieved. We already have blog posts documenting the journey and an independent review that examined various elements, so why ‘yet another piece’? This impact report should be read as a complement to our failure report. In it, aside from collating some stats (which do tell a certain story), we try to be as introspective as possible about what we think went right, and tell some stories that have left a mark on us (stories that might be representative or symbolic of larger themes).

First off, the statistics:

Alt-text: Infographic of a library with the following words - Participation: 3 departments running experiments; 37 cohort participants involved; 5 departments enabling experimentation. Engagement: 20+ blog posts published; 257 average monthly unique blog views; 11 learning events produced w/ 85% satisfaction rate. Outcomes: 4 experiments run; 100+ professional connections formed; 5 national & international acknowledgement and awards.

In no particular order, here are some of our most impactful stories about EW:

Learning how departments work (Pierre-Olivier Bédard). The initial idea behind expert matching was meant to benefit departments in need of experimentation expertise. This worked out great and was highlighted elsewhere. I think there was a clear impact on that front, as it really filled a capacity gap in departments. An indirect benefit from this process was the insights I gained by working with departments to figure out their experimentation pathway. Designing an experiment on paper is easy enough (for an expert, at least), but implementation is always more challenging, especially in public administration contexts.

My participation in EW essentially provided me with a front-row seat to observe how project team plans unraveled, what obstacles were faced, what practical considerations surfaced and how issues were resolved. This was immensely useful: it let me understand not only how specific projects operate, but also gain a higher-level view of how departments, as institutional sets of rules, procedures, practices and norms, support and/or hinder experimentation. It is particularly useful for illustrating how organisations operate and the channels through which experiments can be operationalized, and for thinking about how system-level dynamics can be changed. EW in itself was a demonstration that experiments are possible, although not always in the most straightforward ways. Some projects bounced in multiple directions, some went on a randomization roller-coaster ride (i.e. we were told one week that randomization was not feasible, potentially throwing the project off the rails, only to hear later that a workaround was indeed possible…) but nonetheless managed to get to the end stage. What project teams had in the end was evidence about their project, but also a wealth of insights on how they got there. As public servants, we like to think that we know how the public service operates, but we really get to learn by venturing into non-obvious spaces such as experimental projects.

There is no real training on implementation considerations for experiments in public administration settings (maybe there should be? I think so!); what we have instead is a collection of personal experiences, insights, blog posts and reports documenting all these ideas, obstacles, issues and solutions. This is something we should pay attention to the next time we start to scope out what an experiment could look like. No technical textbook on experimentation can substitute for this, and it’s definitely something I’ll keep in mind when advising on experimentation design and implementation.

Working in the open is no longer questioned but even celebrated (@danutfm). The open government movement really has made incredible strides in the Government of Canada over the past number of years. Experimentation is also about a mindset that normalizes failure — not in a way where we celebrate incompetence and not thinking things through, but by leaving space for hypotheses not to be validated, which is an essential part of learning and of moving any field forward. For that reason, we wanted EW to create a space where a null hypothesis would be seen as a very normal thing to have happen, and in fact, if arrived at using rigorous methods, something that should be seen as useful to the field itself: it would mean other possible solutions could be contemplated next. I was personally very pleased to see that absolutely no one questioned our initial design, which was very bold upfront about the fact that the whole EW cohort should take place completely in the open, from identifying hypotheses to running the analyses. I was equally thrilled that it never became a point of contention in any of the project teams, meaning that it had become normalized as a practice government-wide. If our counterparts in other governments internationally had a question about our EW work, they could simply look it up themselves. Success for me here was when the OECD did a write-up about us and we barely had to lift a finger — they were able to find enough materials themselves and did a great job summarizing even the most nuanced details.

Working creatively and effectively usually requires leaving office cubicles and boardrooms (@schancase). EW has clearly had a positive impact in showcasing different ways we might begin to weave more innovative and effective practices into government. This was demonstrated not only through EW’s focus on embedding deliberate experimentation (where possible and appropriate) into how we work across government, but also in showcasing new ways of working, regardless of the context. This includes many elements already featured in this impact report: working in the open, sharing internal talent across government, marrying diverse skill sets through cross-functional teams, and a cohort model that built a sense of community among 7 departments and one central agency. For me, all of this was only possible because it started with a small group of like-minded public servants who were passionate about building something to address a need, using any method possible and necessary. Just one small but not insignificant example of this is how and where we worked. Despite 90 Elgin being one of the snazziest government buildings in Ottawa, it didn’t have wifi in 2018. That meant (regularly) taking our tablets and heading down the street to a coffee shop (or someone’s living room) where we could set up a physical and digital space (including real-time collaboration platforms) that could actually allow us to co-work in real time. I don’t think we could have pulled it off in quite the same way without the freedom and self-propelled drive to work this way.

Getting a unique line of sight into different projects across government (@terhas_). Through my involvement in this initiative, I had the opportunity to learn about a variety of different projects and departmental contexts. Maybe even more valuable than that, I was able to learn from the project teams, experts and the EW community as they worked through their experiments.

For example, I learned a lot about what it takes to do something for the first time across different contexts. I have a deeper understanding of the types of people and expertise needed to do an A/B test in a government context where one has never been done before. Having been part of the process, and having documented the learnings, I can tell you that it requires a lot of collaboration. Similarly, I have a better sense of what it takes to deliver one of the first youth micro-grants in the Government of Canada or to leverage a third-party application to deliver a randomized controlled trial.
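To give a flavour of the analysis end of an A/B test like the ones mentioned above, here is a minimal sketch of comparing two variants’ conversion rates with a two-proportion z-test. The report does not describe the actual tests or data; the counts, function name and scenario below are invented for illustration.

```python
# Hypothetical A/B test analysis: did variant B of a web page get a
# different click-through rate than variant A? Counts are invented.
from math import erf, sqrt

def two_proportion_z(success_a, n_a, success_b, n_b):
    """Two-sided z-test for the difference between two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal CDF
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Variant A: 120 clicks out of 2,400 views; variant B: 150 out of 2,400
z, p = two_proportion_z(120, 2400, 150, 2400)
print(f"z = {z:.2f}, p = {p:.3f}")
```

The statistics here are the easy part; as the stories in this report note, the real work is in the implementation around them (randomizing traffic, getting approvals, and logging the outcomes cleanly).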

Many of the lessons had similar threads, but the varying contexts made the insights distinct. Ultimately, I had a really interesting vantage point that not many people get, much less new public servants.

The power of the cohort model (@kailiml). Having been both a participant in the first EW cohort, through my role with Natural Resources Canada’s Office of Energy Efficiency, and, later, the Executive Director at Treasury Board of Canada Secretariat responsible for EW, I got to experience first-hand the power of the cohort model. By bringing together people who are all working around the issues of experimentation or trying small-scale experiments on their own, EW helped us chart a clear path forward, and made it clear that we were not alone in trying and learning along the way. The sense of community that we built, and hope to continue to expand, is a strong element of the EW value proposition.

Running EW out of TBS gave departments a direct line into central agency / rule makers (@sean2pt0). In some instances, running EW out of TBS created pressure within TBS to change rules or processes that were obstructions to experiments. Having an active role in implementing departmental projects on the ground is not something that TBS is always used to doing — seeing how TBS rules play out, in real life, can be an enlightening experience for the department and there were several instances wherein the TBS EW team pushed up against other parts of TBS in ways that weren’t always comfortable (or resolvable). It might be an interesting path forward for TBS more generally, even outside of the experimentation commitment, to play a more active role in helping departments launch projects affected by its policies, such that TBS can see the actual manifestation of the rules and processes it creates.

When it comes to experimentation, learning by doing was an inspired choice over theoretical learning models (@danutfm). We knew we had to come up with guides on ‘how to experiment’ within a Government of Canada context, but we resisted the notion of creating a comprehensive guide — it just never seemed right, given that experiments can take so many different designs, and are complicated by the available data, implementation windows, approval processes, ethics, etc. So rather than focusing on the possible methods, we decided to focus attention on implementation, and on learning by doing. This seems to have been an inspired choice, and I would use it again in the future over more traditional learning tools such as classroom-based teaching. Not that we didn’t need to learn from bona fide experts, with real expertise and practice designing and running experiments — we did not want to simply go into a room of our peers, share our experiences to date and call it done. But we also knew that teaching in the abstract would likely not get at the true complications of experimentation, which almost always have to do with implementation. We typically (and sagely) note in government that incentives need to be aligned in order for anything to be a success — here we had a golden opportunity to test the experimental way on a small scale, knowing that we could control many of the incentive structures to see what exactly it takes for it to succeed. I don’t want to suggest that it was easy. But I count as a win the fact that individuals who were more than skeptical when first joining EW (but joined anyway) became true advocates for the value of experimentation in a public service context.

TBS and executives can be effective action levers (Pierre-Olivier Bédard). As an expert, I could have been involved (as I have been, from time to time) in providing ad hoc advice on project designs, without any formal connection to EW. Maybe that would have worked out well, but the fact that my involvement was part of a formal process to share expertise between departments made it much easier to sell my contribution to my management. Since I was initially working in an area not directly connected to experimentation, I had to argue that it wouldn’t affect my existing workload, and could also benefit the organisation.

The fact that the initial invitation came from an executive within my own organization, and that TBS was willing to sign a Memorandum of Understanding, made the case much easier to present: a vice-president/assistant deputy minister suggests I join a TBS-endorsed initiative, backed by a Memorandum of Understanding signed by both departments — what can a manager say? Some may say that I was voluntold, but this was in fact the opening I was looking/hoping for.

Following this, and some department-wide communication that the organisation was joining this TBS initiative with me as an expert, I quickly came to be seen within the organization as this nerd who knows a lot about experiments and stats. From there, I was quickly invited to meetings with various policy and evaluation groups to brainstorm experimental options within the organization. In short, the TBS brand, coupled with the push from executives, paved the way for a fruitful collaboration, one that even spilled over to include unplanned experimental work within my own organisation.

Using TBS’ name in vain really works! (@kailiml). It makes it a lot easier to advocate for the use of experimental approaches when a central agency is enabling you through resources, tools and in-person visits. While we all agree that experimentation is an effective tool for de-risking policy/program decisions, it’s sometimes a challenge to get approval to experiment in the first place. Signing the MOU with an ADM at TBS gave a level of permission and comfort that propelled our work forward.

Indirectly creating demand and putting the spotlight on capacity gaps (Pierre-Olivier Bédard). Maybe this doesn’t sound like a clear win, but beyond the projects generating experimental evidence, the lessons learned and the knowledge shared, I think EW also had a demonstration effect, showing what is required for experimentation to take hold.

In the process, I repeatedly heard people asking: Is there a tool for this? Is there training on that? Can you do a workshop on this? Do you have case studies on that? Is there a government-wide inventory on this? Is there a list of experts across government? (A lot of my answers started with: “That would be great, but…”)

To me, all these things demonstrate that there is an interest and appetite for experimentation. When we talk about the ‘lack of demand’ for experimentation, we typically mean lack of demand from the top, with a lack of experimentation requirements in government processes and so forth. When talking to working-level professionals, it’s a bit of a different story, as a lot of them are looking for resources to build their experimentation knowledge and skillset. I actually feel there is a fair amount of demand at that level.

Providing guidance and opportunities for experimentation, which is much of what EW did, sparked something and set a precedent that I hope will justify further development of experimentation tools and materials.

Where do we go from here?

Results from project teams have provided evidence that may ultimately inform decision-making. The evidence generated and the insights gained through the EW cohort experience should provide project teams (and their management) with a strong signal on what course of action could be adopted going forward. In some cases, the findings indicate that scaling up an intervention may indeed be a good option, while for others they highlight clear opportunities for replication in future experimental iterations.

The paths followed by the various project teams may not have been linear (and they faced some very practical constraints); however, they show that experimentation can be accomplished within government with a high degree of methodological rigour, even with limited time and resources.

The connecting thread between all projects — and to us a key success factor — is the value of partnerships. Partnerships took many forms and included a wide variety of folks from many departments and agencies, and from various parts of those departments and agencies: project leads, senior management, experimental design and statistical experts, IM/IT support and communications officers, to name just a few. While it may not always have been clear who would be needed at every step of the way, this is what learning-by-doing is all about. As challenges surfaced during the project design and implementation phases, more people were brought in to solve specific issues. Ideally, all of these issues would have been predicted in advance, but in practice this is not possible. This is another reason why one should always have a community to help sort out issues as they arise.

Maintaining and expanding experimental capacity within government should mean capitalizing on the added-value of relationships. As experimentation within a public service context becomes more institutionalized and even routine, networks of experimentation experts and project leads need to be strengthened and seen as permanent features of the experimentation ecosystem.

For this to happen, there needs to be sufficient demand within the public service. Experimentation always takes place in a system, which has its own dynamics, incentives and accountability structures. The demand for more evidence (arrived at through experimentation) should come from analysts themselves but also from senior management and the political sphere.

As we’ve noted above in our stories, buy-in from those in executive positions was a success factor. Strong signals from management about the value of experimentation, and the need to become proficient in its practice, act as validation. They should also translate into creating appropriate space (time and resources) for experimentation. It is vital that staff working on experimentation have championing voices that continually advocate for the practice. This advocacy often has a bandwagon effect: seeing practical examples of what others are doing, being transparent about specific challenges and uncertainties, and mainstreaming the practice into core areas rather than just peripheral files are, and will continue to be, integral to expanding this function within any public service.

Post by some of the TBS team involved in EW, past and present: Pierre-Olivier Bédard, Sarah Chan, Terhas Ghebretecle, Kaili Lévesque, Dan Monafu, Sean Turnbull.

This article is also available in French here: https://medium.com/@exp_oeuvre
