Start the week with data

I am the Head of Data Science at Citizens Advice. In a job like mine, delivery and achievement tend to happen through the team. I don't have many things I can say I did all by myself. Thinking back on the last year or so, though, there is one thing that has been particularly successful and that I made happen. It's a weekly open forum called 'Start the week with data'.

I've written about data conversations before. In my experience these kinds of forums evolve. Sometimes you decide they need a change or don't need to happen anymore. Sometimes you think they need to happen but others just aren't feeling it for whatever reason. Generally I like to stick at things for a while before deciding it's time to stop. Start the week with data has been going since May 2022, but it was last year that it really came into its own. So yes, sticking at it.

Start the week with data is informed by the principles I set out in that data conversations blog post, especially "conversation not presentation" and "keep it frequent".

Every Monday morning we have a 30-minute video call with a speaker on something to do with data at Citizens Advice. Usually it's 15 minutes speaking to some slides, then the rest of the call for that all-important conversation. I chair, or perhaps conduct, the session. I sign off every session with "thank you for starting your week with data", which is neat and hasn't got old yet.

The invite list started as the Senior Leadership team but has grown organically to be wider than that. Every session is recorded and linked from a rolling log which is a Google doc. Slides are linked there too if they were used. All the presentations are good, but especially good ones are marked on the log with a fire emoji. Finally there's an email at the end of every week to the invite list telling them what's coming up on Monday and linking to the materials from that week's session.

The emails are especially repetitious. Also, my inbox is mostly people declining Start the week with data. But every week the numbers are pretty good: 25 to 40 people from all over this fairly big and diverse organisation.

The thing that made Start the week with data in 2023 so good was the variety in the programme. This forum isn't just the data experts talking about the expert data work they've done (although we have that too). We had 35 sessions with 24 different speakers from teams like

  • Operations
  • Finance
  • Business Development
  • Technology
  • Policy and Advocacy
  • Product and Delivery
  • Evaluation
  • Expert Advice and Content
  • and more

The point is that everybody is working with data and can tell an interesting story about it that will be relevant to others.

One of my favourite sessions last year was our Finance team talking about the most successful systems migration I've ever seen. I'd never known a systems migration go well. And there was a strong piece of practical data improvement at the heart of the work, taking the opportunity to change data structures so that they could provide more meaningful insight.

We also get brand new insights at the session, like investigating access gaps for our clients across England and Wales. And we see things that absolutely everybody in the organisation should know, like fundamentals about our client base or how we measure our financial and social impact.

Another benefit of sticking with a forum like this for a while is you get the opportunity to revisit topics and see how they've developed. Or you hear about the next steps in a piece of data work that have been facilitated by something delivered earlier.

Managing to do 35 sessions isn't bad, especially accounting for holidays. It is a fair amount of effort to curate a programme like this, but doing so builds connections and visibility for me across the organisation. There's also something rewarding about encouraging people that they have an interesting story to tell. Every week I say please volunteer a topic if you want to, and on the occasions people do come forward it makes me happy. But most of the effort is on me to keep that forward plan going. I think it's worth it though.

Recently a colleague said how much they appreciate the session because it's always relevant to them and their work and it gives them insight into what's happening at Citizens Advice in an effective way. I will take that as a win.

This is the work. It's fairly simple but it does require sustained effort to keep it going. If you're in a role where you're leading data improvement in a fairly large organisation I recommend giving something similar a try.

Thanks for reading.

Data conversations: some practical examples

We’re learning a great deal on our data journey at Citizens Advice. I think it’s worth sharing where we’re at.

18 months ago I would have recommended “talking about your data more”. I still recommend that of course!

With the experience of the intervening time, I can describe something more than that original intent, and with some structure to it as well.

Maybe you work in an organisation where everything I describe here happens already. That’s great, I’d love to hear about your experiences.

Maybe the data you work with is different to ours, perhaps larger and updated more frequently, or smaller and updated less frequently. I think that the things I’m describing here could still be useful.

The ‘data’ that I’m talking about here are things like:

  • Channel activity for example website and telephone
  • Client volumes and demographics
  • Topic trends at high- and more detailed levels
  • Client outcomes — experience and satisfaction from survey data

Principles

There are a few guiding principles behind what we’re doing. I think the most important one is

Conversation, not presentation

Everything I’m describing here should encourage everybody to talk about the data. It’s not a one-shot pitch — this is ongoing practice and there’s no end to it. Shared understanding builds over time and feedback is essential.

Talk about the whole picture

We have a complex organisation with many channels and services. We try to make sure that we are talking about the relationships between things rather than focusing on isolated data sets.

Use as few products as possible

We have a main data product called the ‘Service Dashboard’ which brings a wide variety of our data together in one place. We try to use this product to meet the needs of as many audiences as we can.

We have a preference for presenting from this product or other dynamic reports and dashboards over transposing data into a slide deck.

Keep it frequent

We have a weekly pattern. This is appropriate because of the wide variety of data that we have, rather than because of the cadence of our data (where trends play out over months rather than weeks). This frequency keeps what’s happening with our data at the front of people’s attention, and there is a wide variety of topics we can cover.

If you’re working with data and it doesn’t change that much I’d still recommend talking about it once a week.

Repeat and reuse

The forums are porous. We present the same material in multiple places. We use the forums to generate content that people can revisit and share.

Iterate

We learn by doing and regularly reflect on how things are progressing and make changes accordingly — both to our practices and to our data products.

Here are the four practices I recommend trying if you’re not doing them already. This isn’t an exhaustive list and it will develop further I’m sure.

1. The top team conversation

We have a rolling programme of short weekly data updates to our Executive and Directors team. They happen at the beginning of the week as part of an existing ‘start the week’ meeting.

When we started this James (who is part of the Executive team) gave these updates but increasingly we’ve brought in other voices. It’s a collective effort to achieve a weekly update and we have input from the Data Science team, our counterparts in the Impact team, and others particularly from Operations.

It is a conversation because this top team get to provide feedback and set priorities for questions to answer. This has been supported by a fairly regular retrospective.

The same material gets shared with all staff from the National Citizens Advice organisation on Workplace and we will be sharing more widely with the Network of 250+ local Citizens Advice across England and Wales too.

In order to keep to the weekly pattern we need a decent but flexible forward plan, and we try to keep a month ahead of ourselves on that. There can be significant lead time for some of the work required to answer the questions asked. Seemingly simple things can be complex and vice versa.

This practice has driven some of the most forward-thinking and fresh data work that I’ve been involved in since I started this role in late 2019. We know things that we didn’t know 6 months ago, which is itself a measure of improvement.

This practice also involves the most collective effort and preparation. That feels appropriate. This is the level of the organisation that can be supported through data to make the most significant decisions.

What are the benefits?

  • Builds a collective understanding
  • Drives improvement in our evidence base
  • An opportunity to prioritise based on the most important questions to answer
  • Should lead to more informed decision making

Examples from Citizens Advice

In recent weeks we’ve covered these topics:

  • High level client outcome and activity numbers from across our service for the past year.
  • New data that tells us about the impact of our online advice content.
  • New data that tells us about the variety of different telephone service models used by the Network of local Citizens Advice.
  • New analysis on depth of issues our clients experience and the strong relationships between different issues (for example housing and debt).

2. Data at the start of every meeting

Well, not every meeting. Let’s imagine you’ve got a regular team meeting, or an ‘all hands’ session. The practice here is to do a tight 5 minute update on data at the start of the meeting.

As an example, I do this at the weekly meeting for the leadership team I’m part of. 5 minutes translates to 3 or 4 talking points about our data. Committing to this regular practice means that I have to be engaged with the data that we’re working with — I have to look for patterns and trends to highlight. I can also bring in insights from the other forums.

I write up those 3 or 4 talking points and share the document with the team. This is a further commitment, but it’s worth it because it can be shared with the whole group. They can refer back to it and consider the points in their own time. Also it means that nobody in the group is left out if they aren’t at the meeting. Finally these documents are open, they can be shared more widely if my peers think there’s value in doing so.

What are the benefits?

  • Provides context and helps to break out of siloed thinking
  • Builds a shared understanding
  • Builds expertise in talking about data from a variety of sources and how it interrelates
  • Keeps you curious

Examples from Citizens Advice

Here’s an example Google Doc with 3 real data talking points I’ve covered recently.

3. The regular open forum

This is our most established practice. We began it soon after the start of the first pandemic lockdown in 2020. Tom (chief analyst) wrote about it. It is a fortnightly session that lasts around 45 minutes. It is open to all staff from the National organisation. We get around 30 people on the call each time, from a variety of teams and backgrounds.

We use the Service Dashboard, presenting this to the group and having the data specialists who are responsible for each category of data talking about the latest trends. For example Mankeet (senior data analyst) covers website trends.

We take questions from the group. One of the most valuable aspects of this session is that colleagues from Operations or Policy often provide valuable insight and context for what we’re seeing in the data. It’s very much a conversation.

Of the four practices this is the one that gives many people the opportunity to understand and describe the narrative of what’s happening across our service as trends play out over months and years. There’s an element of oral history to it, which could be seen as a weakness because some of the explanations for patterns that we’ve seen aren’t documented. However, we record the sessions and post them in a dedicated Workplace group so people who can’t attend can participate in their own time. And we see the narrative that gets developed reflected consistently in other work that we do, which is a strength.

What are the benefits?

  • Builds a collective understanding
  • A rich exploration of the data given the expertise involved
  • The narrative stays current and is reflected in other forums
  • Provides early sight of trends and issues that can be highlighted or escalated elsewhere

Examples from Citizens Advice

The Service Dashboard is updated weekly. This forum has established regular content. We look at client trends (numbers and demographics), advice topic trends, website trends (topics, top pages, volumes, search terms), and telephone and webchat trends (volumes). We can compare back to a variety of different time periods but we find comparing year on year to be most valuable because of strong seasonality in our data.
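
As a minimal sketch of why the like-for-like comparison matters, here is the year-on-year calculation in Python. The weeks and volumes are invented for illustration; the real Service Dashboard data is not public.

```python
# Illustrative weekly client volumes keyed by (ISO year, ISO week).
# These numbers are made up for the sake of the example.
volumes = {
    (2022, 1): 1200, (2022, 2): 1350,
    (2023, 1): 1100, (2023, 2): 1250,
}

def year_on_year_change(volumes, year, week):
    """Percentage change versus the same ISO week a year earlier.

    Comparing like-for-like weeks stops strong seasonality from
    masking the underlying trend, which a simple week-on-week
    comparison would not do.
    """
    current = volumes[(year, week)]
    previous = volumes[(year - 1, week)]
    return (current / previous - 1) * 100

print(round(year_on_year_change(volumes, 2023, 1), 1))  # -8.3
```

A week-on-week comparison on the same numbers would mostly reflect the seasonal shape of the year; the year-on-year figure isolates genuine change.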

4. The deeper dive

We run a roughly weekly 30 minute session with an agenda that covers a wide variety of data topics. I say roughly weekly because it’s 4 weeks on and 1 week off with the slot being at a different time each week to encourage attendance.

The session is open to all National staff. There’s a fairly large invite list and we tell everybody what we’re going to be covering and share materials in advance. We get around 15 people at each session.

This practice provides an opportunity to go deeper on new analyses and insight, presented by the specialists who have done the work. It also provides an opportunity to talk about our data work ‘behind the scenes’, for example developing new services and standards. It has been really valuable for developing our data strategy work in an open and collaborative way. Finally we’ve had guests from other organisations telling us about their experiences — that’s particularly valuable when you get to hear about shared challenges and how people have approached them.

We record the sessions and post them in a Slack channel. Necessarily this forum generates a fair few slide decks. We share those too, and make sure that they contain links to other resources.

What are the benefits?

  • Showing what really goes into the work to an audience who wouldn’t otherwise see it
  • Doing justice to data specialists’ effort by having time to go into greater detail
  • Bringing in fresh perspectives for shared problems
  • Developing in the open, particularly strategic work

Examples from Citizens Advice

In recent weeks we’ve covered these topics:

  • Data strategy principles framework — how we’re getting owners for these principles from across the organisation
  • A new ‘service’ approach to data, building initial versions of services that we can iterate. Examples include a service for data about our volunteers and a service for data about the Network of local Citizens Advice.
  • Collaboration between Data Science and Product to decommission a legacy system, and establish a new primary source of data for reuse by multiple systems as a result.
  • A new way to visualise client volume data, showing it across our entire service at a high level for the first time.

One day I’d like to put together a ‘playbook’ for data specialist work but it’s a daunting task. I can break it down into smaller pieces though. This post is a first attempt at that.

Please get in touch if you’ve got any thoughts — you can find me on Twitter and occasionally on LinkedIn.

Thank you for reading.


Creative Commons Licence
Data conversations: some practical examples by Dan Barrett is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Citizens Advice Data Science team month-notes #3 (October)

We are blogging about the work we do. We intend to write a post each month with contributions from various members of the team. Here’s our previous post.

We haven’t managed to post for a while because we’ve been especially busy. We’re working as a team to get around that in future. Dan’s going to try transcribing oral updates from the team so that we can all be contributing to writing in the open on the internet. 

As a team we believe in continuous improvement, building capability, and focusing on meeting data users’ needs. The opportunities we have for improving the way we work with data at Citizens Advice aren’t unique to our organisation. It’d be great if the practical approaches we take are useful to others.

Data Glossary

Josh (Data Architect)

Earlier this year we ran a survey for all staff to understand the experience of using data at Citizens Advice. This gave us valuable insights into opportunities for improvement, revealing that people want data to be quicker to find and easier to understand. 

One of the ways in which we’re trying to help is through the creation of our organisation wide Data Glossary. This is a hybrid of a traditional business glossary, data dictionary, and data catalog. It has a number of different components and aims to be a central hub to answer data related questions from across the organisation. We’ve created this in Google Sheets for now, as it was quick to create, change, and share widely across the organisation.

Here’s an outline of the sections we have so far and their purpose:

Terms

This section is what you’d expect to see from a business glossary but also includes some high-level technical information. It lists our main data entities and, for each one, describes what it means in the context of the organisation, its synonyms, which systems it can be found in, and the suggested way to uniquely identify it.

Its main purpose is to encourage consistent terminology and language, as well as promoting data standards. It’s also been an important collaboration tool for our team to discuss with people from across the organisation what their understanding of these terms is.
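
As a rough sketch, a single entry in the Terms section might be structured like this. The field names and the example entry are my guesses from the description above, not the real column headings or glossary content.

```python
from dataclasses import dataclass, field

@dataclass
class GlossaryTerm:
    """One row of a Terms tab; field names are illustrative only."""
    term: str                                     # main data entity
    definition: str                               # meaning in the organisation's context
    synonyms: list = field(default_factory=list)  # other names in use
    systems: list = field(default_factory=list)   # where the entity can be found
    identifier: str = ""                          # suggested unique identifier

# A hypothetical entry, not taken from the real glossary.
client = GlossaryTerm(
    term="Client",
    definition="A person who receives advice from the service.",
    synonyms=["service user"],
    systems=["case management", "telephony"],
    identifier="client_id",
)
print(client.synonyms)  # ['service user']
```

Keeping the shape this simple is part of why a Google Sheet works: each row is one term, and each field is one column.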

Data reports 

We created this section as a way to make it easier for people to find the reports they’re looking for. We’ve listed our main products, explaining what they can help with and which team can answer any questions. It’s particularly useful for new members of staff, as it gives them one place to look when trying to understand what data is widely used. It also provides a single place for anyone trying to find a report that has been mentioned but that they didn’t know about.

Data sets 

This part is the start of our journey towards building a shared metadata framework. It’s a collection of our data assets, describing what they are, where they can be found, who has responsibility for maintaining them, and links to supporting sources, such as the data asset information document we’ve recently launched (more to come on this in a later post!).

We think this section has the biggest potential for making data more accessible across the organisation, and it has opportunities for automation with our data platform.

The main purpose is for people to be able to find the dataset they’re looking for, reducing the time spent searching for a link or the right team to speak to.

We also have sections with training videos, team-specific supporting documentation and links to external data sources. 

We’re continually revising these to improve their usefulness. 

What’s next?

We need to measure the current usefulness of the Data Glossary. For this we will start research with users on their experience using it. The data collected from this will influence any future changes we make. 

One proposed change is to link out to data models for our main entities. The intent here is to raise the profile of these data models and provide a reference point when recreating existing data concepts in a new location.

We’ll also be monitoring the sustainability of keeping this data in Google Sheets, as well as exploring opportunities to automate the capturing and updating of metadata. 

Technology growth in this space has led to a number of emerging platforms offering great solutions for metadata management and we’ll explore whether we can make use of any of these in the future.

Though it’s early days for our Data Glossary, we’re seeing a consistent level of people using it and we think it’s making our data more accessible.


Learning from our regular open sessions

Dan (Head of Data Science)

Since April Josh and I have been running regular sessions over Zoom about data architecture and data strategy. These sessions are open to all our colleagues at National Citizens Advice. After a couple of months we reflected on how we could improve them. We thought the main issue was that they weren’t inclusive or diverse enough.

We had been running the session at the same time and day each week. We changed this so that the schedule is a 5 week cycle of Tuesday / Wednesday / Thursday / Friday on subsequent weeks and then a week off. The sessions are all at different times of day to accommodate different working patterns, and they're not all at lunch time. 

We also wanted to bring in external speakers so that it wasn’t just Josh and me doing the talking. We had 8 different speakers from other organisations on a broad and fascinating range of data topics. Several of these sessions resulted in follow-up calls to get into more detail, like a session on data governance with Jenny.

We recognised that not everybody is comfortable asking questions on a call, so we allow for that when it’s a session with Q+A but we’ve also run more creative workshop sessions using Google Jamboard. We put more effort into communications and publicity too. Ahead of every session we email the invite list to tell them what we’re going to be covering so people can choose if they’re interested, and we include the recording and any materials from the previous session too.

Our open sessions are giving us a regular opportunity to showcase the wide range of data work that we do at Citizens Advice. It also provides an opportunity to get wider input - the sessions have been really helpful to me in developing a data strategy for example. We’ll keep reflecting on how they’re going and making improvements.

We want the barriers to entry for working with data to be as low as possible and work hard as a team to achieve that. As one of our guest speakers, Adam, put it: we are normalising talking about data.


Interesting links

“Why data scientists shouldn’t need to know Kubernetes” via Jon

Citizens Advice Data Science team month-notes #2 (August 2021)

We wanted to start blogging about the work we do. We intend to write a post each month with contributions from various members of the team. Here’s our previous post from July.

As a team we believe in continuous improvement, building capability, and focusing on meeting data users’ needs. The opportunities we have for improving the way we work with data at Citizens Advice aren’t unique to our organisation. It’d be great if the practical approaches we take are useful to others.

Building better data products that meet users’ needs

Hamza (Senior Data Analyst)

The Citizens Advice service is made up of Citizens Advice - the national charity - and a network of over 260 local Citizens Advice members. Our network members are all independent charities delivering services to help people across England and Wales. At the national charity we have a responsibility to support each local charity to deliver their services in the best way possible.

There is a leadership self-assessment every year. This is where we ask each of the 260+ charities to rate themselves across organisational areas such as Governance, People Management, Equality, and Financial Management. At the same time, performance assessors from the national charity carry out the same assessment on each local charity. Where it’s evident that there are opportunities for improvements the national charity works closely with the local charity and suggests specific courses of action. 

The leadership self-assessment is vital because it embeds risk-based thinking, helping local charities to assure that their organisation is well run. It also accredits local charities to external quality standards that are recognised by funders.

In the national organisation we have a dedicated team that looks after all activities related to the leadership self-assessment. This team also includes a small number of performance assessors whose role is to assess each of our network members at least once a year. 

To help with monitoring these assessments, the team was using a tracker document in Google Sheets. The aim behind this tracker was to help plan resources, track how many assessments have been completed and for which charities, and analyse results from the assessments. 

As this tracker was also being used by several other teams, over time it became congested with multiple tabs, showing different pieces of information in different formats. In some cases, there were even multiple tabs that showed the same information in different formats. It became evident that the tracker was losing focus: there was just too much information in too many formats, not to mention broken formulas in some places! It sounded like a job for the Data Science team...

We collaborated extensively with the leadership self-assessment team and met with them to understand the problems they were facing and their data needs. Our task was to create a new, more consolidated and effective tracker tool that would enable its users to better plan resources and gain richer insights into the assessments. 

We discussed things such as

  • which is the best platform to use (Google Sheets or Tableau)?

  • which data is needed? 

  • which format should the data be presented in? 

  • how should the data be updated?

  • who should have edit access to the data?

Having these initial discussions really helped us to build the right tool for their needs. For example, the stakeholders had a stronger preference for Google Sheets, therefore we built the new tool in Google Sheets and chose not to use Tableau. 

We adopted an agile approach to building the tool. We didn’t just go away and build a complete tool in one go for them and then think ‘job done’. Instead, we built it in steps. We built a version 1, presented this to the team, received feedback and then built version 2, presented this to the team, received feedback and so forth. In each version, we kept on refining the tool based on the feedback, making it easier to use. 

We focused more on building clear visualisations as opposed to just building tables of data. We also focused on making the tool interactive, for example building features that allow users to extract specific data for their own needs. We try to encourage this kind of ‘self service’ as much as possible.

In the end, after several iterations and meetings with the stakeholders, we had built a tracker tool in Google Sheets that could better serve the needs of the leadership self-assessment team. The tool has been created in such a way that it requires little to no maintenance effort, as all the data is updated automatically. The tracker tool is linked to a Google Form that the performance assessors complete to record scores from each of their assessments. As each new response comes in, all the visuals and data summaries in the tracker update automatically (one great benefit of using Google Forms is being able to automatically analyse form data in any way you want).

The new tracker tool we created has received great feedback from the end users. They feel they have something more focused, and that it helps them to ask the questions of the data that they need answered.


Connecting data to give us new insights

Sarah (Channel Reporting Manager)

Josh (Data Architect)

Jon (Data Analyst)

How we identified the opportunity

There have been a few new hires recently in the Data Science team which has helped increase our bandwidth and allowed us to take on additional work, like solving long-standing problems. The formation of the Channel Reporting team has resulted in improved reporting, and we found that in order to take it to the next level and better meet the needs of users across the organisation we needed a greater level of detail in the data about advisors. 

Alongside this, the team also has data architecture skills now, which resulted in a detailed system context map being drawn of how data moves between products and teams at Citizens Advice. We call it ‘the data landscape’. This map allows us to home in on improvement opportunities; for example, we discovered a reliance on spreadsheets for managing advisor data.

What we discovered

We’ve historically had decentralised management of advisor data. This has worked fine for reporting and analysis, but we’re always looking to improve. We didn’t have a unique way to identify the same advisors across various systems, which created inconsistencies in how the data is structured. Having a common identifier would mean that our users could get a better and more joined-up view of the performance of their service. It would help show activity data across channels, rather than viewing each channel in relative isolation.

What we did

We needed to change how we viewed the relationship between an advisor and the systems they use. For this we created a conceptual data model for advisors, which helped show the commonality across systems and what we could use as a common identifier. We found a number of different systems in which advisor profiles existed. The one where most of our advisors could be found was in our user authentication product. Advisors use this product in order to access various systems and it provides the best opportunity to have a common identifier for an advisor. 

The majority of our channel systems are connected to the user authentication product. This allowed us to quickly map the common ID to the advisors in those systems. In a system that isn’t currently connected, we used our data skills to map identifiers for 80% of advisors by matching Google Sheets via various VLOOKUPs.

This left us with just 1,100 advisors in that system that didn’t have a direct login or name match to an ID. We knew that many of these advisors probably did have a central user authentication ID - but weren’t matching due to inconsistencies in naming conventions and data entry errors - for example an advisor with local login “London Jon L” might have an existing ID, but registered against “SELondon Jon L”, or “London JJon L”, or “London Jonathan L”. [1]

Rather than match all of these manually (which would have taken us about a minute per advisor, based on the advisors we did match by hand), we wrote a quick script to help us look for these loose matches that may have been caused by human error. Our script used local advisor metadata such as office location to narrow its search criteria to a few hundred possible IDs out of thousands, and then identified close matches using a fuzzy matching algorithm. 

Fuzzy matching is a simple technique for matching text that’s approximately but not quite the same. This is super useful for identifying close matches caused by typos and nickname variations, which is a pretty common problem faced by volunteering charities! We used a standard fuzzy matching package for Python, but the algorithm can be found implemented in most languages, and is even implemented as a standard formula in Excel.  For example, with the package we used, the typo “Josh Tedgett” matched up against “Josh Tredgett” with a similarity score of 0.97, and less immediately obvious mismatches caused by nicknames like “Becky Harlow” and “Rebecca Harlow” still returned a high similarity score of 0.77. [2]
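
As a sketch of the matching step, here is the idea in Python using the standard library’s SequenceMatcher rather than the package the team used, so the similarity scores won’t line up exactly with those quoted above. The names and threshold are invented for illustration.

```python
from difflib import SequenceMatcher

def best_fuzzy_match(name, candidates, threshold=0.8):
    """Return the candidate most similar to `name`, or None if nothing
    scores above the threshold.

    In the workflow described above, the candidate list would first be
    narrowed using metadata such as office location before scoring.
    """
    score, match = max(
        (SequenceMatcher(None, name, c).ratio(), c) for c in candidates
    )
    return match if score >= threshold else None

# Invented examples of the naming inconsistencies described above.
candidates = ["SELondon Jon L", "Manchester Jon L", "London Jonathan L"]
print(best_fuzzy_match("London Jon L", candidates))  # SELondon Jon L
```

A human still reviews each suggested match, as the post describes; the script’s job is only to shrink the list of plausible candidates.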

While this method returned a fair number of false positives, it was much faster to confirm or discard the script’s loose matches than it was to match all 1,100 advisors by hand. After writing the script (which took about an hour) and running it, it took us about an hour to match 600 of the remaining 1,100 advisors and discard the rest as having no likely matches. At the original rate of about a minute per advisor, it would have taken us about 18 hours to work through the same list manually.

Once we’d completed the matching process, we still had the challenge of maintaining this data set going forward. How were we going to make sure new advisors had the ID assigned to them? This is where colleagues in Technology have stepped in. They proposed an automated solution via our internal ticketing system. This will take the effort away from our Operations colleagues to maintain the data. Instead, an orchestration tool will pick up the new user requests and assign the ID in the consolidated dataset. It will also allow us to monitor the quality of the process as it can pick up any errors that have occurred.

How it will work and what it helps with

Once this work is complete, we will be able to drill down to a greater level of detail in our data products. We recently set up a new report that analyses performance for local Citizens Advice charities across all channels (phone, chat, and email). This report gives a more joined up view than before, and with our improved data the next iteration will give our network members deeper insight. 

Things to think about in the future

We think the new process will provide better insight for our users and more robust data management. There are still some aspects to work on. Part of the reason why we were able to progress on this work is that we weren’t aiming for perfection. We focused on making something better now and not trying to create anything that was difficult to upgrade or replace in the future.


Thanks for reading. Feel free to get in touch with any of us on Twitter or LinkedIn.


Footnotes

[1] This is not a real example

[2] This is not a real example

Citizens Advice Data Science team month-notes #1 (July 2021)

We wanted to start blogging about the work we do. We intend to write a post each month with contributions from various members of the team. 

As a team we believe in continuous improvement, building capability, and focusing on meeting data users’ needs. The opportunities we have for improving the way we work with data at Citizens Advice aren’t unique to our organisation. It’d be great if the practical approaches we take are useful to others.

Enabling self-service for users

Sarah (Channel Reporting Manager)

The Citizens Advice service in England and Wales is made up of the National organisation and over 260 independent local Citizens Advice charities. Our work as the National charity involves helping to coordinate the efforts of the local charities so they can provide national level services as well as pooling expertise, tools, and insight. 

One example of this collaboration can be seen through our efforts to help manage Adviceline, our national phone service for England. The Data Science team currently provides an Adviceline Tableau dashboard to help local charities understand call volume and other performance and demand data. However, over 230 local Citizens Advice charities provide the service and they have a wide variety of reporting requirements. It’s impractical for the Channel Reporting team (currently 2 people) to provide over 230 unique daily reports, and the single Adviceline dashboard can’t meet the individual needs of every local charity, even when we do our best to respond to user requests. 

So we decided to enable Tableau web editing, allowing all users to edit dashboards as they see fit. This will help to democratise data, putting additional power in local offices’ hands, whilst freeing up Data Science team capacity and allowing us to provide more advanced models and better quality data and reporting for other channels like web chat and email. 

Two potential barriers to successful implementation were the impact on report load times due to server capacity, and making the new features accessible to staff in local charities. To tackle the first problem we sent surveys to help estimate demand for the feature from local charities and the maximum number of concurrent Tableau users. We then used this information to simulate the potential impact on server load times. To tackle data skills we will work with our Facilities team to deliver training via live sessions. These sessions will be recorded and later published on our internal learning platform so participants can revisit the information. 

Tableau web editing gives us an opportunity to facilitate the local charities’ data needs like never before. There’s also the opportunity to build skills in the local Citizens Advice network so they can use our trusted data to provide high quality reporting that meets their individual needs. We’ll continue to monitor the implementation to measure impact and look for further ways to improve reporting across the network. 

Automating away the busywork - reducing a 1-3 hour daily process to 10 minutes of automation

Jon (Data Analyst)

I'm part of the Channel Reporting Team. Among other things, we provide data and reporting for remote help services across phone, webchat and email. By making this information readily available, the 260+ local offices in our network can access the data required for deploying their staff and volunteers as effectively as possible.

Sarah wrote about our Adviceline reporting, and we also provide reporting for the Welsh Advicelink service. There's always been a pressing need for this kind of daily reporting - especially with the pandemic causing a historic rise in people seeking help through remote services - so we can understand how well our organisation is meeting clients’ needs. 

We want to be as efficient as possible in this work, and make sure that our data is trusted. We recently made some improvements to how we provide this data.

Our reporting workflow was previously very complicated due to a number of factors:

  • Access to even basic data from our telephony system requires manual effort.

  • Operational adjustments were (and still are) made every day to try to adapt to changing conditions - and our reporting needed to reflect that. The way the team had to deliver their reporting changed multiple times during the pandemic, making it difficult to ratify and streamline a single process for daily reporting.

  • With so many different teams and offices having input on the reporting, our reports needed to pull in information maintained by many different stakeholders across several disparate sources. These sources included various SQL servers, a number of different Google Sheets, the aforementioned manual effort for data access, and daily call record CSVs sent to an email account.

As a result of these constantly shifting conditions and ad hoc adjustments, we ended up with a highly manual process that was difficult to run, and error prone if you didn’t have the requisite skills and knowledge.

When everything went right, the process would take about 45 minutes to complete every day. If there were complications the process would take anywhere from 1 to 3 hours to execute.

With almost a hundred different manual steps along the way the process was error prone. If you made a mistake at any point during the process, you had to catch it immediately or face restarting from scratch. If you didn’t catch an error it could take an entire day to roll back.

For a process producing a daily report with a noon deadline, this was clearly unacceptable. We knew we had to make it easier on ourselves, and reduce the possibility of errors along the way.

Using some simple Python scripting and Google Apps Script we managed to distil the sequence into a largely automated process that takes ten minutes a day. Here are some of the things we did to achieve this:

  • We built a Selenium browser automation script that crawled the telephone records we needed and dumped them straight into our Python scripts for ingest.

  • We used Google Apps Script to automate sourcing the data we needed from Google Sheets, and pandas to perform the necessary lookups and data manipulation.

  • We used pyodbc to give our Python scripts access to the SQL servers they needed to both draw data from and update.

  • We wrote a series of unit tests ensuring that the processes executed by the script all pass basic common-sense checks and don't contradict our knowledge about our telephone systems’ operational setup. In addition to these unit tests, the script still asks for a human to sign off on the changes it's about to make - while this means the script isn't fully automated, it means that we're aware of any recent changes made by the process, and can catch stuff that simple common-sense logic tests can't.
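
The checks-plus-human-sign-off pattern from the last point might look something like the sketch below. It’s a minimal illustration, not our production tests: the check names, record fields, and messages are all invented.

```python
def sanity_checks(rows):
    """Return a list of failure messages; an empty list means all checks passed."""
    failures = []
    for row in rows:
        # A couple of invented common-sense checks on telephony-style records.
        if row["calls_answered"] > row["calls_offered"]:
            failures.append(f"{row['date']}: answered exceeds offered")
        if row["calls_offered"] < 0:
            failures.append(f"{row['date']}: negative call volume")
    return failures

def apply_with_sign_off(rows, apply_fn, confirm=input):
    """Run the checks, then ask a human to confirm before applying changes."""
    failures = sanity_checks(rows)
    if failures:
        raise ValueError("Checks failed:\n" + "\n".join(failures))
    print(f"About to update {len(rows)} rows.")
    if confirm("Apply these changes? [y/N] ").strip().lower() == "y":
        apply_fn(rows)
        return True
    return False
```

The `confirm` hook defaults to a console prompt, so the script pauses for a person to look over the summary before anything is written - which is the "not fully automated, on purpose" trade-off described above.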

Pulling these changes into a Python prototype took about a day, but iterating on the script and getting it to a production-ready state took the better part of a week. The unit tests themselves were written iteratively over several weeks, and the scripts still get tweaks whenever we see room for improvement. The end result is that we no longer have to dread wrestling with our reporting processes in the morning, and we save hours of time every week which we can spend on higher value work.

User feedback is invaluable though not always easy to get - data survey

Josh (Data Architect)

Throughout this month the data team has been running a survey to understand what it’s like for staff to use data at Citizens Advice. We used Google Forms. 

The survey is a way to gain insight into how accessible our data is, the level of data capability across the organisation, and which data products are currently helping people with decision making. It can be a challenge to get people to participate in surveys. Thankfully we had support from executives and our internal communications team which gave us the response rate we aimed for.

We’ve now been able to calculate baseline metrics that give us an indication of how easy our data is to find, how confident people feel using data, and how often data is used in decision making. Of these three things, finding data seems to be the biggest pain point. Though this isn’t unique to Citizens Advice, it is a challenge for us to solve. The survey also has a lot of insightful qualitative data that we’ll be digging into next month and following up with more focused research on specific areas.

Overall it’s been a great way to collect thoughts about our data from across the organisation. It’s also providing an indication of the areas we should be focusing on, which will make the experience of using data better for all. We will repeat the survey quarterly so that we can measure the impact of the improvements we make. To encourage the same level of participation as in this round of the survey we’ll be demonstrating how we’ve acted on the feedback to make a meaningful difference. We intend to extend the survey to the network of local Citizens Advice as well.

Building the team

Dan (Head of Data Science)

I love recruitment. Having the opportunity to build and lead a team is a privilege. This month we’ve had a new starter, Rahul, who is a Web Analyst working in the multidisciplinary Content Platform team. This is a model that I want to establish when it’s the best fit for the work. Having a dedicated data specialist in a product team is much better for meeting that product’s and team’s data needs than being at arm’s length.

Rahul is being inducted remotely and I think that’s more challenging than being in person. On the other hand, I definitely prefer interviewing remotely. Maybe there’s something about it being a leveller, balancing out the dynamic and making it less intimidating than having to visit somebody’s office. Maybe a child will get locked out of somebody’s house in their pyjamas and it’ll be alright to rescue them midway through the call. 

Did I say somebody’s house? Ok I meant my house.

We had a successful interview campaign for a new Data Science Lead this month. I really enjoyed interviewing with Josh, and with Maz (Director of Content and Expert Advice). I have definitely learned from others in the past year and improved my interviewing practice. There have been small things, like introducing yourself with your pronouns, or pasting the text of each question into the video chat function. There have been larger things, like putting the effort into producing a scoring guide for each question so the panel is working from a shared, explicit understanding. Writing the scoring guide takes me considerably longer than the questions. Finally there are things that mean you’re being explicit about your organisation’s values with candidates. I’ve included a question about equity, diversity and inclusion in interviews for a while but I’ve never been completely happy with how it was phrased. For this campaign Maz introduced a new formulation that was a big improvement.


Thanks for reading. Feel free to get in touch with any of us on Twitter or LinkedIn.

Interesting links