Using data science to transform the social sector
Data science has infiltrated most enterprise organizations. But it’s also being used to improve the efficacy of organizations in the social sector.
Andrew Means is the founder of Data Analysts for Social Good and Big Elephant Studios, and the co-founder of BrightHive. Previously he was the founding executive director of Uptake.org and Associate Director at the University of Chicago’s Center for Data Science & Public Policy. Andrew is an internationally recognized speaker on issues of data and philanthropy. In this conversation we talk about how smart non-profits are taking advantage of data science to improve transparency and impact, how they’re overcoming issues around lack of data through unique collaborative models, and how organizations can grow a data competency in-house.
Manifold: I know you had an interesting path. How did you get into the world of data science in the first place? How did you end up doing what you do now?
Andrew: My path is quite unique in the sense that I'm part of the "data for good" community, whatever you want to label that. And I come from the "for good" side. So I was always really interested in nonprofits, really interested in social change, and I came to data because I thought data science is actually a really good way to create change in the world.
Manifold: Why is that?
Andrew: Because a lot of organizations in the social sector, I think, struggle to know if they're actually achieving the goals that they want. They struggle to know whether they're making the impact that they set out to make. And data was a really nice way to validate that.
Part of the way I think about how the social sector even works is that in some sense, nonprofits are selling their ability to create change in the world.
Nonprofits exist because there's something about the world as it is today that we want to see different for tomorrow. And funders, whether it's me, just an individual donor, or the Gates Foundation, exist to buy that change.
When you give your money to a non-profit, what you're doing is saying "I like the change that you're creating in the world and want to see more of it." I'm buying that.
The issue is that far too often that transaction relies on stories alone. It's an organization saying, "Well, I'm doing this thing. I promise." And they have no proof.
Data is the thing that can actually validate whether the change that we think is occurring is actually occurring or not.
Manifold: It seems like data could help even on the problem side, in asking whether or not this is a problem worth solving. Is it able to help there as well?
Andrew: Absolutely. This is the thing with data, and data science in particular. Just like it can transform the ways that we get around, and that we do commerce, and the ways we do logistics, and the ways we consume and create entertainment, data has the power to actually transform the way the social sector operates.
Whether that is identifying which problems we want to solve, changing how we solve those problems, or evaluating whether we solve those problems or not, data has a role to play across the entire value spectrum.
Manifold: Is that something you feel like at this stage of the game most change-based organizations have realized, or is this still early days? They don't totally get it, and you have to kind of convince them of the need for this stuff.
Andrew: I think it's early days. It's not as early as it was five years ago. I mean, five years ago there was very little work being done around the use of data science in the social sector. I think there was a group of us that were trying to demonstrate that it was possible.
Today we have organizations that have Chief Data Officers and data scientists on their staff, and very little of that existed even five or six years ago. But when you think about the breadth and depth of the social sector, it's still at a pretty nascent stage.
And I think part of that is that many nonprofit organizations and change-based organizations, especially established ones that have existed for some period of time, are run by people that come from a certain background. Often it's a human services or social services kind of background.
They're not technologists. They got to their position running large human services organizations because they started as a social worker, working with homeless populations for example, and they have built their careers and gone up in this direction.
And so convincing some of these kinds of organizations that data and technology can help them solve the problems that they care about in new ways is a challenging endeavor sometimes.
Manifold: Is there a fear component to it, where if data gives me visibility into the efficacy of my program - or not - is there a piece of this where maybe they realize what they're doing isn't working as well as they would like it to?
Andrew: I absolutely think so. Data is transforming the way the social sector operates.
If we think about the private sector, what data essentially did was make more efficient businesses. But businesses were still competing on the same thing. They were competing on their ability to create profits. Data just changed the way they created those profits.
In the social sector, the organizations that win are the ones that tell really compelling stories, and have well-connected boards, and convince donors really well to give them money. They have great marketing machines. They're not necessarily the most effective.
So data comes in and it begins to actually change the way that donors operate. It begins to shed light onto what's working, and what's not.
I think there is a fear among many organizations that they're going to look less effective. No one ever looks as good as their marketing campaign. And today largely the only data we have about nonprofits is based on the stories they choose to tell us.
So as you see this shift to a more data-driven social sector, it's going to radically change the winners and losers.
Manifold: Is donor awareness what's going to ultimately drive that shift? As they become aware, it's almost like being a more informed consumer. Are they going to force that level of transparency on these organizations, where even if they don't want to be held accountable that way, they don't really have a choice?
Andrew: I think to some extent. There's a challenge here - when it comes to individual giving, the research actually shows we're driven by stories most of the time. It's the story that really compels us. That money I think will largely still be determined by reputable brands in the social sector, or by personal ties: I have a friend who started this organization and I care about my friend, so I'll support this organization.
If you step up a level and look at institutional donors - foundations, whether it be the Kellogg Foundation, or the Gates Foundation, etc. - they are becoming more and more data-driven in their philanthropy and requiring more organizations to share data. But even there you have a disincentive to some extent.
The analogy here is finance. 20 years ago, you had your investment broker who was supposedly smarter than the market, and was going to make all of these decisions because they knew better than you.
A program officer in a foundation acts in much the same way. They'll say "I've been giving away money to homelessness organizations or human rights organizations or whatever for my entire career. I can do this really well."
If I come in as a data guy, and say I can actually set up the technology where we can get all of the workforce development organizations in the state to share data, and I can tell you which ones are most effective, that program officer's job is fundamentally changed.
So there are some that are starting to see the importance of data. But I think there are still some disincentives there.
I think, surprisingly, we're actually seeing some of the most change at the federal level. Federal and state government are the largest donors to the nonprofit sector. In many ways, the nonprofit sector exists to provide state-supported services outside of the state. And they are increasingly requiring more transparency from organizations and requiring that they share data. It's not always the right data. It's not always done in the right way. There are a lot of problems there. But they're actually seeing the needle move.
Manifold: You mentioned something that I think sounds quite a bit different from how enterprise environments think about data. Maybe because of a different philosophy around competition, it sounds like these organizations realize that they each have a very small piece of the pie, and that their ability to have the kind of visibility they would need to make effective change is somewhat limited. By coordinating and actually sharing all of their data with each other, they can all end up being more effective. Is that accurate?
Andrew: Yeah. One of the challenges facing the social sector, the nonprofit sector in particular, is that it's very fractured. Since there's no incentive for a merger or acquisition in the social sector, it's very rare that you have super large institutions.
In every city, you have dozens of afterschool programs that are all essentially doing the same thing. You have dozens of homeless agencies or food pantries or workforce development organizations - many organizations doing the same thing.
We don't have a Netflix that has that kind of market penetration. And the reason Netflix and Amazon and Google and all of these services work is because they have massive market penetration, massive amounts of information that they build their systems on.
Manifold: Because you need a ton of data for an algorithm to be useful.
Andrew: Exactly. For Netflix to know that I actually really want to watch The Great British Bake Off, it needs lots of people that look like me that watch The Great British Bake Off. In the social sector, you don't have that kind of user base.
So the only way to get there and gain the kinds of insights that you can gain using that kind of data is to share data.
I think there's a moral imperative to share data. Nonprofits are public institutions. They're part of a public trust. We need to know if they're effective or not. And the only way to do that is to require some kind of transparency when it comes to the impact that they're actually creating.
Manifold: Who practically does that? Is it one of these organizations that says "This is important to us and we have the resources to do this on the data side. Will you partner with us and help?" Or is this coming more from the federal or state level? Who's driving the push to get all of these organizations to share their data? And I would imagine there's a normalization part to get all of it to match up, etc.
Andrew: Yeah, you have to normalize the data. You have to make sure that when you're talking about graduation rates, you're talking about the same thing.
But there are two main places where I'm seeing this happen. One is that some funder is requiring it - whether that be the federal or state government or a large foundation - saying "You all need to play nice. To have access to this kind of funding, this is now going to be required of you." And that's one kind of lever that we can pull.
The other is the institutions themselves.
The nice thing about the nonprofit sector is that while there are human incentives that sometimes hold us back, these are people who actually care about the work that they're doing. Some of them will say "Look, if I'm not the most effective maybe I should go out of business and somebody else should get the dollar."
If you can find enough of those kinds of leaders and organizations, I often see them driving these conversations. This is really a time for great leaders to step up in the social sector and say there is something more important than my institution lasting another year. I look to be successful not just by raising more money next year, but by being more impactful next year. And for me to do that, I need to know how I compare to my peers and know how well I'm doing. And I think there are leaders that are leading the sector in that direction.
Manifold: What are some examples of initiatives that have been successful? Where, by sharing data, an organization can now do something it wasn't previously able to do?
Andrew: So I'm the co-founder of this organization called BrightHive, and we help facilitate some of this large data sharing work. And we do a lot of initiatives around workforce development. One of the challenges of workforce development is that I train you for a job, and then you go out and get a job, and then I lose track of you. I might have helped you get the first job. But did you lose it in two months? Where are you a year from now? How much money are you actually earning?
So we're doing some work in the state of Colorado, where we're working with the state and we're able to get wage and employment data down to the individual level. And we're able to connect that to training providers, and say who are you serving? Let's find them and find their wage and employment data, and then give you some of that aggregated data back so you can begin to understand what kinds of jobs are we helping people get, and what kind of wage bump are we seeing.
What's great is that then we can begin to say who's serving what kind of populations well? Who's actually doing a good job increasing family income? And then funders can come in and say we want to support organizations that are effective according to these criteria, whatever those criteria might be.
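Editor's note: the linkage Andrew describes, connecting training-provider rosters to individual wage records and handing back only aggregates, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the record layouts, the `person_id` join key, the provider names, and the small-group suppression threshold are hypothetical and do not describe the actual Colorado data or BrightHive's systems.

```python
from collections import defaultdict
from statistics import median

# Hypothetical participant rosters submitted by training providers.
participants = [
    {"person_id": "p1", "provider": "Provider A"},
    {"person_id": "p2", "provider": "Provider A"},
    {"person_id": "p3", "provider": "Provider A"},
    {"person_id": "p4", "provider": "Provider B"},
]

# Hypothetical state wage records: quarterly earnings before and after training.
wages = {
    "p1": {"before": 5000, "after": 6500},
    "p2": {"before": 4000, "after": 4500},
    "p3": {"before": 6000, "after": 8000},
    "p4": {"before": 5500, "after": 5000},
}

# Assumed privacy rule: do not report groups smaller than this.
MIN_GROUP_SIZE = 3


def wage_change_by_provider(participants, wages, min_group_size=MIN_GROUP_SIZE):
    """Link participants to wage records and return per-provider aggregates."""
    changes = defaultdict(list)
    for person in participants:
        record = wages.get(person["person_id"])
        if record is None:
            continue  # no matching wage record; skip rather than guess
        changes[person["provider"]].append(record["after"] - record["before"])

    report = {}
    for provider, deltas in changes.items():
        if len(deltas) < min_group_size:
            report[provider] = None  # suppressed: too few people to share safely
        else:
            report[provider] = {
                "matched": len(deltas),
                "median_change": median(deltas),
            }
    return report


report = wage_change_by_provider(participants, wages)
for provider, summary in report.items():
    print(provider, summary)
```

Only the aggregate leaves the function; individual earnings never appear in the report, which mirrors the "give you some of that aggregated data back" step in the interview.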
Manifold: You said earlier that a lot of these organizations are fundamentally driven by story-based marketing, and there are obviously holes with that. But I would imagine the reverse of that is true.
I'm reading Steven Pinker's book right now, and he was saying one of the most surprising things to him was that he thought by sharing 500 examples of how this is objectively the best time we've ever lived - by far - the data would be so compelling that it would sort of do its own job, and he was very surprised to see that wasn't the case.
Do you run into that? Is there still a need for story? And if so, how do you connect the rational part of the brain with the emotional part of the brain, and do it in a way where I'm able to turn the data into a compelling story that is more effective?
Andrew: Absolutely. There's constantly this kind of question, "data vs. story". And they aren't necessarily on opposing sides of the field. There are ways for them to work together.
Data is a raw resource. It's like a block of stone. And there's this great Michelangelo quote that says "inside every block of stone is a statue. And it's the job of the sculptor to uncover it."
For those of us working with data, our job is to take this raw resource and use our insight and experience and tools and methods to uncover something really valuable, and compelling, and true.
I think there are ways where this kind of scientific methodology can meet with our ability to tell stories. Stories are how we make sense of the world. Data is actually not how we make sense of the world. Data plays a role in the stories we oftentimes tell. But we don't make pure analytical decisions most of the time.
I think there are ways where we can use data to validate whether our stories are true or not. There are things that are true and there are things that are untrue. And I think the role of data in helping us identify what's true and what's not is important. And then I think we can find ways to tell stories that resonate with that truth.
I also think sometimes we think of data as just giving us something. Like the data will tell us the answer. And most of the time that's not true. Data should inform the answer, but data doesn't oftentimes tell you the answer.
There are certainly places where data is automating decisions, and I think those are really exciting opportunities. But I think where data often adds the most value is in assisting our decisions.
It's the doctor standing in front of a patient, looking at the results of a bunch of different algorithms, and then using their own experience and intuition to interpret and make a call for a patient. Or it's the organization that's trying to help kids graduate from high school getting a list of the 20 kids that we think are most likely to drop out, and then using their own mind and creativity to intervene with them.
I don't think it's about computers just automating all of our decisions. It's about augmenting our ability to make really smart decisions with data.
Manifold: Organizations see a lack of bodies to do the work, or at least there's the perception of a lack of bodies. But for organizations that want to start making better decisions informed by data and don't necessarily know where to start, what kind of advice do you give?
Andrew: So I think there's oftentimes this idea that the only way to do something with data is to hire some really expensive nerd. But any organization that does data science has a team. It's almost always a team. Because there's such a breadth of skills that are necessary to turn raw data into a valuable product. So one "data person" for your organization is rarely the answer.
I also think that there's this kind of reaction that we need to hire somebody and invest in technology, when I think often the best place to begin is culture. If you don't have a culture that is driven by evidence, you're gonna hire a data person or invest in technology and then never use it. If you're not a culture that's concerned about your performance, or is rigorous about how you understand whether you're making progress or not, you're never going to utilize the data that you have.
Manifold: If you're not that way now, but you want to be that - if you're a leader who wants to create that culture - how do you change the culture to get it to be more evidence-based?
Andrew: There are a couple of things I've seen that I think work. One is that it's important to have some champion of this kind of way of thinking who's at the table, who's senior enough to sit at the executive table, but junior enough to still get along with "the people", and who can really kind of be the nagging voice in the corner. Why are you making this decision? What's the evidence?
And I also think thinking in terms of "evidence" rather than just data is really valuable. Because evidence is a term we use that encompasses a much broader array of things.
When we talk about data, we tend to simplify it into numbers and ones and zeros on the computer. And there are a lot of organizations that don't have a lot of data, that don't have a lot of ones and zeros on servers somewhere. But every organization can make an evidence-based decision. The evidence might look different if you're a one- or two-person small nonprofit vs. a five billion dollar company. But everyone can make decisions based on evidence. And so I think having an evidence champion is really helpful.
Another thing that I sometimes do with my nonprofit partners is - because everyone in the nonprofit sector is really nice and we just want to get along and don't want conflict - I'll sometimes assign somebody in meetings to be the "skeptic".
The tendency is that somebody throws out an idea, and we all agree with it and think it's great and think of all the reasons why it will work. But I'll tell them I want you to be the person to poke holes in all of our ideas. I think that frees everyone up to think a little bit more critically.
And then the third thing is setting up rhythms. In my first role as director of research analytics at the YMCA Chicago, we would sit down every quarter with our different business leaders, whether they were running programs or running facilities for us. And we would do these "planning with data" sessions.
Literally all I would do is sit down and have a dashboard of some important metrics that we collected. And I'd say, "Why did this spike happen here? And why did this happen here? Which of these numbers do we want to see changed and improved in three months? And what are three actions you can take to change this number?"
And then three months later we come back and see whether that number went up or down like we thought it would. And it gets people in a rhythm of realizing that the data is not some ethereal, out-of-body thing that you have no control over. It's actually just a mirror. It's a representation of your work. And the decisions that you make can actually change the metrics. So getting people in that rhythm of looking at some information, interpreting that information, making some actionable decisions about it, then evaluating whether they worked or not, that rhythm can be really helpful.
Manifold: So let's say that I've done that. I do have a culture that is more execution-oriented, and that's not my hurdle. What do I do next? If I want to get into legit data science stuff, and I don't know where to start, where do I start?
Andrew: So I think the first level is what I call "data analysis". And that's about how do we get an insight? How do I learn something that I didn't know before that I can make decisions on? And if your culture is humming around all of those things, the next kind of thing is "how do I build products?"
How do I move from a mental thought to a technological product? Where data is actually not just giving me more information, but where you move into that decision-support or automated-decision space.
Manifold: And by "products", you don't necessarily mean something that I then turn around and sell to external people. It's an internal tool that does something on a repetitive basis.
Andrew: Yeah, exactly. So for example, a lot of nonprofit programs are oversubscribed. More people want to get into them than you're able to fit. And for a lot of organizations the way that they make that decision is "who signed up first?" We're just gonna let who signed up first into the program.
If you're thinking about it from an impact perspective, that's a really dumb way to decide who gets into your program or not. You could make the decision based upon "who do I think I'm going to have the most impact on, or who needs my services the most?"
So by "product", I mean you could actually build an application engine that has people who applied, and you rank order them. And you then decide based on that rank ordering who to let in your program. That's a product. It's something that plugs into your operational process and helps you do the work that you do in a slightly different way.
So there are a few ways of beginning to move in that direction. There's a guy named Jake Garcia at the Foundation Center in New York who I think runs the best kind of data science team in a non-profit I've seen. And what he talks about a lot is "skilling up."
There are some people in the social sector who think "If I could just get somebody who used to work at Google, all my problems will be solved." And I don't think that's actually true.
We have a tremendously skilled workforce that we can begin to skill up, and give them new abilities and new opportunities. As you have natural turnover, your organization can up the technical skills that are required to fill that role. Don't just fill it at the level it was at. Say "We want you to be able to do this, but we also want you to know a little bit of coding." Skill up those people.
And with the people you already have, give them opportunities to learn new technologies and test those out. Give them projects and time to stretch themselves. This can actually be really challenging at nonprofits because you're under-resourced, everyone is strapped and wearing 13 different hats. But as much as possible carve out a little bit of time for them to go and learn.
If you go to any enterprise company using data science, they carve out time for their people just to learn, because these technologies are changing so quickly. We also need to do that in the social sector.
And then the other kind of aspect to this is really starting to ensure that the tech folks or the data folks are much more attached to the mission of the organization. They're not just in service of IT, but maybe more connected to the program staff or the executive office, where they have a broader mandate and they're seen not just as technology folks, but as folks that are there to help us achieve our mission.
I've seen many organizations who have actually moved a data science team out of IT and made them their own standalone team, reporting to the CTO or a program person. And I think that's also a good next step.
When it comes to the challenge of staffing, I actually think to some extent social sector organizations have the ability to give people more meaningful work. I know a lot of people in the data science field that are really driven by the problems that they're solving. And there is the opportunity to say "You could go in and improve click-through rates at a big tech company and make $250,000 a year. Or you could come and help us eradicate malaria in Eastern Africa. And we're going to pay you less, but you'll save millions of kids' lives. Up to you."
I do think there's a portion of the workforce that's highly motivated by those kinds of causes. Now if you've been in graduate school for 12 years, you're not going to give up two hundred thousand dollars of income. But you might give up a portion of that to do something that's more meaningful.
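Editor's note: the "application engine" Andrew describes earlier in this answer, rank ordering applicants by need and expected impact instead of by sign-up time, might look something like the sketch below. The `need_score` and `expected_impact` fields, the equal weights, and the sample applicants are all hypothetical; a real program would have to decide how those scores are produced and whether the weighting is fair for the people it serves.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    name: str
    signup_order: int       # 1 = signed up first
    need_score: float       # hypothetical score in [0, 1]; higher = greater need
    expected_impact: float  # hypothetical score in [0, 1]; higher = more benefit


def rank_applicants(applicants, need_weight=0.5, impact_weight=0.5):
    """Rank by weighted priority, highest first; earlier sign-up breaks ties."""
    def priority(a):
        return need_weight * a.need_score + impact_weight * a.expected_impact

    return sorted(applicants, key=lambda a: (-priority(a), a.signup_order))


def admit(applicants, capacity, **weights):
    """Admit the top `capacity` applicants from the ranked list."""
    return rank_applicants(applicants, **weights)[:capacity]


applicants = [
    Applicant("A", signup_order=1, need_score=0.2, expected_impact=0.3),
    Applicant("B", signup_order=2, need_score=0.9, expected_impact=0.7),
    Applicant("C", signup_order=3, need_score=0.6, expected_impact=0.8),
    Applicant("D", signup_order=4, need_score=0.9, expected_impact=0.7),
]

# First come, first served would admit A and B.
first_come = [a.name for a in sorted(applicants, key=lambda a: a.signup_order)[:2]]

# Ranking by priority admits B and D instead.
admitted = [a.name for a in admit(applicants, capacity=2)]

print("first come:", first_come)
print("ranked:", admitted)
```

Keeping sign-up order as the tie-breaker is one possible design choice: it preserves the old rule where the scores cannot distinguish two applicants, so the ranked process only departs from it when there is evidence to do so.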
Manifold: You've been in very different worlds: on the consulting side, in pure nonprofit, and in corporate social responsibility. Have you seen differences in those worlds in terms of how they're approaching the problems they're trying to solve?
Andrew: Absolutely. They're definitely different approaches. To some extent though I think we over-focus on organizational form. The only difference between a non-profit and a for-profit is that a non-profit doesn't have owners. It's a public trust. So any revenues that it generates beyond its expenses go back into the organization. There are sometimes nonprofits and for-profits that look almost identical in their work, they just have this different legal structure.
But there are some differences. Given the management structure of many nonprofits, they sometimes move at a different pace. There's sometimes a lot of board ownership, and so things move at the pace of board meetings. There's a lot of effort coming up to a board meeting and then it gets maybe a little bit quiet.
But I also know nonprofits that are actually very agile and very fast-moving, just like any tech startup. There are some fascinating nonprofit organizations where if you walked into their office, you wouldn't know if they were a venture-backed startup or a nonprofit. There are places that are doing really interesting work. And I think it comes much more down to the leadership and mission of the organization.
Manifold: What are you most excited about over the next five years? What do you think the world of data science is going to look like?
Andrew: I'm jazzed for the hype bubble to burst, and to actually get to the real value. I'm excited for data literacy to increase, where a broader range of people know what data science can and can't do, and leverage it really well for what it can do.
From the social sector perspective, I'm very excited at the kinds and scale of some of these data sharing initiatives. For the first time we're beginning to see some pretty large-scale work being done there. And what that will unlock and enable is very exciting.
On the technology side, as we start to see a more instrumented world, I'm very interested to see how that affects the social sector - the ways that we take advantage of the internet of things and connected technologies, all the emerging tech. I'm interested in how that begins to trickle down to the social sector.
There are some great organizations doing cool stuff with drones on anti-poaching or disaster relief efforts. There's really exciting stuff in the medical field.
I was talking with a very large foundation that funds tens of millions or hundreds of millions of dollars' worth of studies every year. And we're in talks about how data science can predict the results of a clinical trial based on other existing trials you've already done. And this can radically speed up the way in which we discover what works. That's an area that I'm very excited about.
In general I think we're going to get much more efficient. We're gonna be able to say "You don't need all of these different medicines. You've eradicated these two diseases, but you can actually eradicate these five others because those branch off of the other two." Or "If we train the future of our workforce in this kind of way, we'll actually have better employment outcomes or wage outcomes."
So I'm excited for this kind of optimization of the social sector to occur.