How can I make sure it's inclusive?

The title of this post is a question for me to answer and nobody else. This isn't a request for help or advice. I haven't been writing on the internet recently but something came up last week and I thought I could do some of that 'working out loud' like I used to. 

The session

About six weeks ago I started a weekly 'open session' about data architecture and data strategy with my colleague Josh (data architect). The idea was to start small and see if a regular 30-minute video call with an open invite would be a useful vehicle for sharing work and progress, and for testing our work with colleagues. Josh and I were alternating each week on who would take the lead, so data architecture one week and data strategy the next.

We recorded the sessions so that people who couldn't attend could watch them back when convenient [1]. For the first session I invited people who had expressed an interest in the 'data' Slack channel. Over the next few weeks the invite list started to grow organically. We had positive feedback and the quality of conversation and contributions was really good. I hadn't advertised the forum widely but was planning to because it felt like we had something credible and constructive going.

At Citizens Advice we care deeply about inclusive working practices and diverse teams. We need to make sure we represent the population and communities that we serve. Our data work must be inclusive, and I'm particularly motivated to embed this into the development of our data strategy.

The problem

At the session last week my colleague Phil made a closing remark along the lines of "could we get more people in the group who aren't white men?".

And Phil was right of course. It had occurred to me already, but I hadn't called it out. I should have called it out myself by then, six weeks into the work.

My boss had watched one of the videos and thought it was good, but when I mentioned Phil's corrective he said something along the lines of "yes it did look like six white men talking about data."

There were more than six people on the call (we were getting 15-20 attendees), but yes the white men were doing the talking and yes most of the people in the overall group were white men.

This isn't to diminish the thought and enthusiasm that my colleagues have put into this forum so far. It's made the work better, no question. But there is a diversity issue, and it's mine to resolve.

The working out loud

I need to work out how we got here. I am responsible for this, after all, and it's early enough to correct it. 

We have another regular forum where we talk about our data trends. That grew organically as well. I don't think there's an apparent lack of diversity on that call. I also ran an open session on data strategy before, and I don't recall a lack of diversity in that group either. And when I say 'lack of diversity' it doesn't mean it's good enough, just that it's not "six white men talking about data" bad.

Maybe there's a barrier to entry for the topics themselves. Perhaps the language is wrong. Are the words 'architecture' and 'strategy' creating a barrier to entry? That said, I haven't found either of these areas of data work to be the sole preserve of white men, either in my work experience or among the people I look up to in the sector. In fact, pretty much none of the people I look up to are white men.

So no, I don't think the topics themselves are excluding people. But we could work to make them more accessible.

I suspect the forum itself isn't quite right, with a format based on conversation and on having the confidence to ask questions. There's also my role in this forum. I know the subject matter and it's a situation where I'm particularly confident. If a question is asked of me I will come up with an answer for it there and then. It might not be a particularly good answer, but it will probably sound like it is.

I think when it's my turn to lead the session I put myself in the centre of it too much, rather than the substance of the work. I don't think Josh does that, to his credit.

So I think it would be good to try other ways to contribute. Questions in advance. Questions afterwards. Splitting into smaller groups for discussion. Not defaulting to answering questions myself, and listening more.

Then there's something about growing the audience. I should have acted sooner. So when I do start to publicise it more widely I need to watch how it develops. I could ask everybody on the invite list to invite one colleague who isn't a white man, for example.

Finally, while data architecture and data strategy are Josh's and my responsibility to lead in our organisation, I will encourage other people to present and lead the discussion. This could include people from outside of the organisation.

That was pretty useful. Thanks for reading.

Footnotes

[1] If you've met me in real life you know that I can speak really slowly so watching a video where I'm talking at 1.5x speed works pretty well.

What I learned from building a data product in a crisis

I work at Citizens Advice. The Covid-19 pandemic has had a dramatic impact across our services and seen an incredible response from staff in the organisation, for example:

  • Unprecedented demand for our website content
  • Creation of new, trusted expert advice content at speed
  • Stopping the provision of advice in person at the 250+ independent local Citizens Advice locations due to lockdown measures
  • A resulting shift to providing advice through our other channels, such as telephone
  • A pronounced change in the patterns of issues that our clients are coming to us with

    I lead the Data team. I work closely with my colleague Tom, who leads the Impact team. Broadly speaking, my team is responsible for making data available, and Tom's team are responsible for asking questions of it.

    On 19 March Tom and I were asked to draw together a wide variety of operational data into a single place, to help management and leadership navigate the crisis. It would include activity and demand data for various channels, and data on client demographics and breakdown by issue. It would also include a summary for a quick, headline view.

Citizens Advice colleagues have spoken about our data and what it is telling us on social media and in the news in the past couple of months. So rather than talking about our data myself, in this post I wanted to reflect on the process and experience of "making the thing", not what's in that thing or what it means.

    It's been a really rewarding experience and I have learned lessons from it that I thought would be worth sharing.

    Get something out there

    It was crisis time, and we received an understandably loose brief as a result. We brought together a group from both of our teams, and came up with a first iteration in 4 days.

    We made a spreadsheet. Spreadsheets are great.

It is a spreadsheet that's intended to be read by humans, rather than a spreadsheet that's a data source. We collectively agreed on making a spreadsheet, having been given a steer not to build something in the proprietary data analysis tool that's widely used at Citizens Advice. Initially we thought it could be a slide deck, and I had a strong view it should be in Excel, but the consensus was to go with Google Sheets. G Suite is what we use at Citizens Advice, and Sheets has the advantage of being easy to share and less of a daily overhead to maintain than a slide deck.

    Ideally we would have had a better understanding of user needs, and some clearer questions to be asked of the data. Regardless, our first version was well received, and put us in a position to improve it regularly. Me aside, the team we put together has a really good understanding of how Citizens Advice works, and I think this helped with our initial and continued success.

    Ask for expert input

    We have a wide range of expertise at Citizens Advice, including a really strong digital team in the form of Customer Journey. I was able to ask for input from a designer and content expert before we circulated the first version of the report. This really improved the product, helping it to be clearer and easier to navigate. Later, when we were trying to understand how the report was being used, I had input from a user researcher on survey questions.

    Even if you aren't working in a 'model' multidisciplinary team, it doesn't mean you shouldn't take a multidisciplinary approach. And if you don't have this kind of expertise to hand, just asking somebody for an objective view and opening yourself up to constructive criticism is always good.

    Work to understand your users

    Again, it wasn't ideal to get into this after the fact. But it's essential to try, and I thought what we ended up doing was neat and pragmatic.

    We were able to get a log of all the staff who had accessed the report, and when. From this, we were able to build a picture of when the report was being used - early in the morning, justifying the demand for an early update of the data. It also gave us a list of people to survey.
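To make that concrete, here's a rough sketch of the kind of analysis I mean. It's illustrative rather than what we actually ran - the file name and column names are assumptions, not our real log format:

```python
# Sketch: turn an access log into a picture of when the report is read.
# Assumes a CSV export with 'user' and 'timestamp' columns (illustrative).
import csv
from collections import Counter
from datetime import datetime

access_by_hour = Counter()
readers = set()

with open("report_access_log.csv", newline="") as f:
    for row in csv.DictReader(f):
        readers.add(row["user"])
        access_by_hour[datetime.fromisoformat(row["timestamp"]).hour] += 1

# A quick text histogram answers "when is the report being used?"
for hour in sorted(access_by_hour):
    print(f"{hour:02d}:00 {'#' * access_by_hour[hour]}")

# The set of readers doubles as the list of people to survey
print(f"{len(readers)} unique readers")
```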

I wrote a short survey in Google Forms and around 40% of our users responded. I was most interested in whether the report had resulted in people taking decisions, or doing things differently. For me, this is the benchmark for all reporting - if you're not taking decisions as a result, why go to the expense of doing it?

    So seeing this pie chart [1] showing that over 70% of people had taken decisions or actions as a result of this work was really gratifying:


The follow-up question asked for detail about what those decisions were. This gave us an understanding of the breadth of decisions, the variety of our users, and the relevance of the report across multiple parts of our organisation.

The next thing I was most interested in was whether users could understand what was in the report. I think the barrier to entry for data products needs to be as low as possible, and that data-focused teams can tend to take understanding for granted. This pie chart indicates that the report was pretty legible, but that further work could be done:

    Being able to do these two humble tests meant a great deal to me.

    Keep iterating

    The report has been constantly changed over the past couple of months, in response to new requests, changing circumstances, and a team attitude of continuous improvement. It's been good to be part of a group where it's understood from the outset that the thing will never be complete.

    I think this has been supported by the team practice, which is curious and conversational. Speaking of which...

    Have regular discussion

    I think the team practice has been the most valuable thing to emerge from this work. We have a weekly discussion about what the data in the report is telling us, facilitated by Tom. This encourages ownership and recognises expertise, because the people closest to the data are driving the conversation. Some really valuable lines of thought and investigation have come out of these meetings. We thought it was so good that we started inviting guests, and they've found it valuable too [2].

    We separated out the mechanics of developing the product from the discussion of the data. We have had a fortnightly meeting to do that, led by me. That's worked well, I think in particular because the team have a high degree of autonomy, with individuals trusted to work on their own parts of the report with some light oversight from the most experienced team members.

    Take things out and do less

    The first stage of the crisis called for daily updates. This is unsustainable over time, and developing the report helped us to understand the various states of the data we have. Some data is easy to automate, whereas some requires a large amount of manual intervention and also changes shape regularly, making it labour intensive to report on. This has been a helpful secondary outcome of the work, because it can help inform where we put effort to improve our underlying systems and practices.

Not everything we've done was useful, or used. So we've taken things out. In future, I will work to understand what's being used in a more methodical way. I missed an opportunity to ask a question in the first survey - "which tabs are you using?", with a pick list of the tabs. We also tried to track usage using Google Analytics, but it was unsatisfactory.

    Due to the iterative nature of the work and the regular discussion of patterns, it also became clear to the team that the time periods for significant changes in the data were longer than daily. If we hadn't kept developing and discussing, we might have been saddled with a daily reporting burden for longer. This also gives us a sense of the various cadences of the different operational data we have at Citizens Advice. Not everything behaves like the instantaneous spike we'd expect to see on our website after a government announcement, for example. Our service is broad and varied, and to my mind that variety helps us to meet the complex needs of our clients.

    Help people to understand the questions they need to ask

I think one of the most important areas of work for data maturity in public service is to help people to formulate the questions they want to ask of the data. It helps to focus the discussion and the build of any data product. Generally speaking, "give me all the data" isn't the best starting point and there's no intrinsic good in having large amounts of data, hoping that some incredible truth will emerge from it.

    In my experience over the past couple of months these questions have increasingly started coming from colleagues, which is great to see. A polite challenge of "what questions do you want to ask?" has been useful.

    You don't need to serve everybody with one thing

    I think we've resisted putting ever more measures in our report just because it's the first time so much of our data has been together in one place. The survey showed us that our users have different needs, and we recognised these might be better met by other products in future.

I said earlier that the thing will never be complete, but that isn't the same as it never being finished - 'finished' as in set aside in order to move on to the next thing, taking what you learned with you.


    Footnotes

    [1] Better charts are available
    [2] This is what they tell us at least

    Data: 20 years of hurt

    I went to the Institute for Government's 'data bites' event a couple of weeks ago - an evening of four short talks from folks working in public service with data. It was great, thank you to Gavin and the team.

    I was particularly struck by the fourth talk, by Yvonne Gallagher from the National Audit Office (NAO). Yvonne was talking about the challenges of using data in government, in advance of an NAO report coming out later this month. You can watch the talk for yourself. It was eight minutes long with eight minutes for questions.

    My main impression from Yvonne's talk was that 'data' has been a problem across government for over twenty years. I felt an overriding sensation of 'enough of this it's really time to sort it out now folks'. 

I heard Yvonne say that there had been multiple strategies launched over the last two decades to fix the problems and - I'm paraphrasing - that the time for launching strategies to fix the problem was over and it was time to actually do something. This made me uncomfortable, not because I don't do things but rather because I've been working on a data strategy [1].

    I also heard Yvonne say that thirty years ago, things had been better. There was an established practice of data modelling, data management, cataloguing and the like. I don't know if this is true, because I was busy watching 'Rude Dog and the Dweebs' and so on. Let's take it as truth. Yvonne was listing practices used back then that facilitated things working, and working together, through collaborative effort, documentation and widespread understanding. These practices are deeply uncool and unfashionable at the moment [2]. Forgotten, even.

    I am increasingly convinced that it's time to be deeply uncool and unfashionable. To be boring [3].

    Being boring

    I am trying to work through an idea, or collection of ideas, where 'data' is a distinct practice that sits alongside 'digital' and 'technology', complementing them both. I've written about it already. Nobody has said "oh yeah Dan you are right take my money" yet, so clearly it needs further work.

    I believe we [4] have problems with the language we use when we talk about 'data', and that it's too broad an umbrella to be meaningful. Leigh wrote a post about this (it's great). Leigh also proposed some data archetypes recently. Using Leigh's archetypes as a starting point, I think that the problems we face across government (or across public service, which I prefer) mostly come from:
    • The Register
    • The Database
    • The Description

    In my opinion, these are the types of data that facilitate the absolute basics. If they aren't done well then impenetrable silos result.

    Now, I believe that 'data' has this unwarranted mystique about it in general that people mistake for computer magic. At the root of the types of data I've listed above is a broad, human understanding and a set of human practices - like ownership for example. Not magic, just taking responsibility.

So, I suggest that over the past 20 years we outsourced too much of our collective responsibility to people who do (or claim to be able to do) computer magic. It's easy to point a finger at a Big IT company who've failed to deliver an outcome at great public expense, so let's do that.

    *points finger at Big IT company who've failed to deliver an outcome at great public expense*

BUT WAIT I'm actually advocating for retaining a greater degree of ownership of the fundamentals of our organisations and how we describe them, so that a supplier doesn't imperfectly represent these things on our behalf - a supplier who doesn't have the expertise or long-term incentive to do so, and who also has to do the heavy lifting of working out what those fundamentals are in the first place, probably from scratch.

    The same applies when you have an in-house team, or are working closely with smaller agency partners. Hopefully in a multidisciplinary way. Who does the data modelling in a multidisciplinary team? Don't leave it to computer magic - it needs to have an explicit representation somewhere beyond a database only one person came up with [5]. How do the descriptions get done, beyond the content? Do you have a librarian [6]?
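One lightweight answer to 'explicit representation' is to write the model down as shared, versioned code that anybody on the team can read and challenge. This is only a sketch - the domain (a library catalogue, fittingly) and the names are made up for illustration:

```python
# Sketch: a data model written down explicitly, in version control,
# rather than living implicitly in a database one person built.
# The domain is illustrative only.
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Book:
    """Something in the collection. Definitions agreed by the whole team."""
    isbn: str
    title: str
    subjects: list[str] = field(default_factory=list)

@dataclass
class Reader:
    """A person who borrows from the collection."""
    reader_id: str
    joined_on: date

@dataclass
class Loan:
    """One book, with one reader, due back on a fixed date."""
    book: Book
    reader: Reader
    due_back: date
```

The value isn't the code itself; it's that the model has a home where the whole multidisciplinary team can see it, question it, and change it.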

    Librarians, yeah?

    People joke about tech bros [7] coming up with ideas for things that already exist, like trains, and libraries. I think the libraries one often misses the point, because a library isn't just a quiet place where you can work that is free of charge (LOL). 

    A library is an open, dynamic repository of knowledge that's been organised in a way that is collectively understood, with expert help on hand to aid discovery, and no technical barrier to entry.

    The demise of libraries in our society is profoundly sad. And sorry but (as much as I love it) the internet / web hasn't satisfactorily replaced my definition above. There is a lack of balance and access for all, where 'access' means being able to contribute rather than just consume.

    Make the thing

    Everybody being able to contribute is central to my line of thought. I think that's where this 'data' practice will come from in a multidisciplinary delivery team. The opposite end of the pipe from developing a collective understanding of needs: developing and owning a collective understanding of your essential data. 

    Will there be resistance? Probably. I could imagine people saying "hey man we're agile we don't do any work upfront that's waterfall UR so lame" or similar, but I don't think there's anything wrong with being prepared before diving in on the build. Design disciplines advocate for developing a rich understanding. The same should apply for data.

Beyond 'the thing', I do believe we gave away too much responsibility, when really we should have retained and maintained our corporate-level data assets, which are more permanent than any individual system.

    But, as Steve and Adam made me realise, you need to sweat these assets (so to speak) by putting them to use and managing them like any other product so that they are as good as they can possibly be - so that people want to use them rather than inventing their own.

    Pull up your brain

There's work required here that isn't currently happening. It is unfashionable, as I've said three times now. The benefits of doing it aren't immediately apparent, so it's a hard sell. We are also working in a time when "technology will save us" is considered a legitimate argument, and arguing the contrary will require some political courage and will [8].

Personally I think that's an avoidance of responsibility and of putting the effort in, and a case could definitely be made for financial savings over time through reduced friction and increased re-use.

    I'll continue to work on this. Let me know your thoughts, I'd really appreciate them.


    Footnotes

    [1] Yvonne Gallagher didn't say this, that's me.

    [2] I was reminded of the scene in 'Black Dynamite' where Black Dynamite declares war on the people who deal drugs in the community and the drug dealer says "but Black Dynamite, I deal drugs in the community!"

    [3] As in dull. Not like Elon Musk making big holes. Have a link to a song by the Pet Shop Boys. Have a link to a song by Deftones.

[4] I'm using 'we' as a shorthand for people working in public service. Let's say all over the world, apart from Estonia.

[5] This isn't sour grapes at not being able to do the Jonny Lee Miller myself. I do know a bit about developing a shared understanding of things.

    [6] EDIT: here is a post from Silver about Librarians and the web that I saw just after I pressed 'publish'.

    [7] Not a robot version of Bros, sadly.

    [8] I work at the UK Parliament at the moment so 'political' here means organisational rather than party.

    What does 'data-driven' mean to me?

    Words are important. Have you heard this phrase 'data-driven'? I expect you have. I don't like it, but hey there are lots of things I don't like that are really popular nowadays.

One of the issues with buzzy management shorthand phrases is that their use actively inhibits a shared understanding. One person's 'data-driven' might not be the same as another's, but they're all there in the boardroom [1] going

    "Franck, our strategical approach is data-driven and leverages machine learning cloud AI capabilities"

    "Yes Wayne, 100% data-driven insightisation of our asset base"

    "Totally Sandra, leveragise our data-driven operating model"

    or similar.

For a while, I was working on a 'data-driven' website, where I think 'data-driven' was being used to mean "there is data from internal applications that automatically goes onto our public-facing website".

    I always found that a bit strange, because (to my mind) that's just a legitimate way to approach making a website [2] and I don't understand why you would make part of what’s going on behind the pixels on the screen a thing of note. I don't think everybody involved understood that was what the phrase was being used to mean either. 'Data-driven' had become meaningless, and that meaning vacuum got filled with negative connotations and disdain, like a sad golem.

    Say what you mean

    I always preferred 'data-driven' meaning "we make decisions based on evidence". However, as I've worked on data strategy and (particularly) measurement in the past year, "we make decisions based on evidence" is also problematic.

    Why are you making decisions?

Understanding an organisation's intent, direction, and focus is essential. If this is lacking in any way it can be disproportionately hard to develop goals and measures. Clear statements of intent really help to frame decisions.

    So now to me 'data-driven' means "we know where we're going, and we make decisions based on evidence". But it's still not right.

    What evidence?

Maybe your data is absolute trash. It's worth really getting into where the data has come from. Maybe you've taken a qualitative approach that's disproportionate to the task, or that's going to be expensive to establish as a repeated measure.

    So now to me 'data-driven' means "we know where we're going, and we make decisions based on sound evidence that is contextually appropriate".

    Sure, it's a bit of a mouthful. This is why I'll never have a career in marketing. 

    I believe it's always worth taking the time to make sure there's a common understanding though. What does 'data-driven' mean to you?

    Footnotes

    [1] This is a fictional scenario and Franck, Wayne, and Sandra are fictional characters

    [2] Not all websites

    Why can't we talk about data? (Part 3)

    This is a continuation of my line of thought from my previous two blog posts.

    I speculate that there's a gap to be filled with ways of working with data that aren't happening at the moment as far as I'm aware [1].

    Isn't it exciting when you notice that ideas have taken on a life of their own? Not my ideas, to be clear.

    Those moments when you catch a hint that similar conversations are happening somewhere else and you're hearing part of an underground transmission. Like a story being passed around from person to person before anybody actually writes the book.

    Blog posts. Talks. Snatches of something new. I reckon people are going to start to crack this 'talk about data' business fairly soon.

    I read the 'Rewired State: 10 years on' blog post by James Darling and Richard Pope. Two paragraphs stuck out for me in particular (emphasis mine):

    Legacy IT is still a major problem that the GDS movement hasn’t really addressed. The data layer is still unreformed. It remains an unanswered question if the UK’s ‘service-layer down’ approach, the ‘registry-layer up’ approach of Estonia, or some totally different model will ultimately prove to be the best route to a transformed government.

    Both legacy and data lack a place in the new orthodoxy, and in user centred design more broadly. That’s probably because they are difficult and require political capital. It’s hard to justify investment in digital infrastructure when the benefits come in the next electoral cycle or to another department’s budget.

    There's a scene in 'Velvet Goldmine' where Christian Bale's character (a young, awkward glam rock devotee at that point in the film) points at his hero on the television and says to his parents "that's me! that's me that is!". I think the data layer is still unreformed! I think data lacks a place in the new orthodoxy and in user centred design more broadly! [2]

    A new new orthodoxy?

    I met with Leigh from the Open Data Institute this week. We spoke about this broad topic, and the conversation helped me on with my thoughts. Claire joined us, and suggested that 'data' is a decade behind 'digital' in terms of developing and embedding a multidisciplinary working practice. This resonated with me, and I've heard others suggest similar things in recent months (the underground transmission!).

I swear there's something here. Something distinct from 'digital', but complementary and with porous boundaries.

Technology is about computers. 'Digital' isn't about computers but lots of people still think it is. Most people think 'data' is 100% all about computers but actually it's even less about computers than 'digital'.

    In this multidisciplinary data practice I imagine being good with computers is a secondary skill - a means to an end. The engineering piece for services can be done collaboratively with others, and I'd expect increasingly over time it will become less bespoke [3]. If data lacks a place in that new orthodoxy maybe it's time to revisit some unfashionable roles, define new ones, and hire some librarians. Where I work we've got a couple of libraries. I'm a big fan of librarians.

    So, maybe a practice featuring the full spectrum of data related roles, from the structural to the analytical.

    A sort of infrastructure

In my second post I described the data I am most interested in, recognising that 'data' is a broad term which would benefit from more detailed definition.

When we talk about 'infrastructure', those types of data are the things that I think are most important.

What is 'infrastructure' to me, though? I'm not thinking of platforms, let alone individual services (no matter how large). There is something here that's more fundamental. Note that when I say fundamental I don't mean 'more important'. I've seen enough unnecessary inter-disciplinary disagreement about relative pre-eminence, first- and second-hand, over the past few years.

    I just mean that there's something else there - something beyond the platform, or the service, or the contents of the database. Underneath, on top, all around it, in a mirror universe with twinkling stars and comets - you can draw the diagram however you like. It's there already, but it needs to be made more explicit.

    Leigh and I spoke about domain modelling. I've got into a habit of avoiding using the term, but it's a great example of a collaborative, human practice for working with data involving a variety of different types of people.

    Imagine these models were considered as a corporate-level asset [4]. This is the truth of your organisation [5]. You can use this asset to help build services. This asset is reflected in the platforms you build. It's not an academic exercise. This asset isn't static, and you would have feedback loops to and from your services and platforms because things change. In the public service context, outcomes for users traverse organisational boundaries, so your models would link out to those of other organisations.
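As a sketch of what 'corporate-level asset' could look like in practice - the names, fields and URL here are entirely made up:

```python
# Sketch: a domain model treated as a catalogued asset with an owner,
# a version, and links out to other organisations' models.
# Everything here is illustrative, not a real catalogue format.
from dataclasses import dataclass, field

@dataclass
class ModelAsset:
    name: str
    owner: str                 # a named team takes responsibility
    version: str               # models change, so feedback loops need versions
    entities: list[str]
    links_out: dict[str, str] = field(default_factory=dict)

case_model = ModelAsset(
    name="Casework",
    owner="Service domain team",
    version="2.3",
    entities=["Client", "Case", "Outcome"],
    links_out={
        # outcomes traverse organisational boundaries, so the model
        # points at other organisations' definitions (URL is made up)
        "benefit claim": "https://example.gov.uk/models/benefit-claim",
    },
)
```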

For the justifying investment point from James and Richard's post, I believe the case is there to be made. Not working to the map of your organisation's truth, and not maintaining that map, is one of the reasons the legacy issue builds up in the first place; it's a vicious cycle to be broken. Every system where the models are implicit, hidden in an undocumented database, introduces cost. Every local exception introduces duplication of effort and friction for teams and end users. What is the cost of not having this kind of infrastructure?

    I wonder where in an organisation this infrastructure would sit? If you don't have a library you should get one. My point being that the technology area definitely isn't the natural home for this work, and I suggest the digital area isn't either. There would be a collaborative effort between the three, of course. Doing the hard work to break down silos and what have you.

    Nonetheless, it would be a hard sell to build up a nationwide infrastructure before delivering any outcomes. I envisage a messy period of compromise and starting small, but hopefully there will be a start [6].


    Footnotes

[1] I'm almost certainly channelling Michael and Robert at best here, and plagiarising at worst. Sorry blokes.

    [2] As I recall, Christian Bale's parents ignore him. My mum and dad don't know what I do for a living either

    [3] I do think there is a place for specialist engineering where a comprehensive understanding of complex data domains is required

    [4] 'Corporate-level asset' was Leigh's term, as I recall

    [5] Or a truth. Doesn't cover organisational culture, for example

    [6] Or a restart, to be fair, with respect and recognition to colleagues who've been here before