Data-Driven Open Source: Why You Should Care About Metrics (Part 1)
Published
July 24, 2023
This article was originally posted on Hackernoon.
In the open source world, there seems to be a divide when it comes to using and valuing data. I've spoken with some big, well-respected names in the open source community who often downplay, or even overlook, the importance of keeping track of metrics. Instead, they're all about building relationships, using open source to make the world a better place, or growing their own companies (all of which are noble and worthwhile outcomes). The idea of using data to make decisions often ends up on the back burner or isn't considered at all.
But here's the thing. This "let's not bother with data" mindset isn't just in companies; it's also in projects and non-profit foundations that are all about growing the open source community. They're focused on creating positive outcomes and expanding the ecosystem, which is fantastic. That said, some people I talk with in the community have an open disdain for metrics, as they believe community and relationships cannot be measured and that attempts to do so lead to false conclusions. However, I can't help but think that without the appropriate data available, they are leaving themselves with a giant blind spot when it comes to decision-making.
As a big fan of data, I believe that any open source project, just like any business, stands to gain a lot from a data-focused approach. I've always lived by the principle: if you can measure it, you can improve it. This belief has guided my work managing databases and technology and pushing for better performance.

Without Data, You Are Feeling for a Light Switch in a Dark Room:
Many open source projects and companies struggle with growth and getting users on board. They might feel like something's off, like they're not getting enough GitHub stars, or not many people are contributing code, or there aren't enough active users. But without real data, the efforts to grow the user base and attract more interest are often based on previous experience, gut feel, or copying what others have done and hoping it works. Interestingly enough, many of the people I have talked with who manage these efforts are left scratching their heads when those efforts don’t seem to have the impact they expected or that others have seen.
It is smart to learn from, copy, and try to replicate others' success, but it does not always lead to the same results. For instance, a previous company I worked with had massive amounts of blog traffic, much of it coming from extremely technical posts. I talked with people who were struggling to copy some of the same practices and get the same level of traction. When I asked for data on which of their blogs were working, which content channels might be doing better than written blogs (e.g., videos), where they ranked in search, and so on, I was left with blank stares or best guesses. Without the data to understand where the bottleneck was or what was or wasn’t working, fixing the problem is relegated to trying things out and hoping for results.

But even if you are running experiments that seem successful, you will often have false positives when it comes to growth. For instance, we recently saw a massive uptick in the number of users signing up for our services. It was around the same time we were testing out some new content and promoting specific content to certain audiences. When I start a new tactic and then see a 300-400% increase, that is a good indication that the tactic may be working. However, in this case, a company in the tech sector made a change to their pricing and terms, and someone in that community suggested us in a well-read thread, which drove almost all of the increase. The initial experiment and tactics we tried were, in fact, not working as well as we had hoped; we just happened to have an organic jump at the same time. Without being able to measure the specifics and reasons behind our growth, I would have assumed that spending money on this specific promoted content was worth it, and I would have kept investing in it.
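To make that concrete, here is a minimal sketch of the kind of breakdown that exposes a false positive like this one. It assumes each signup is tagged with a referral source; the source labels, the `growth_by_source` helper, and every number below are made up for illustration and are not our actual data.

```python
from collections import Counter


def growth_by_source(baseline, current):
    """Return {source: (baseline_count, current_count, share_of_increase)}.

    `baseline` and `current` are lists holding one referral-source label
    per signup, for the period before and after the experiment started.
    """
    before = Counter(baseline)
    after = Counter(current)
    total_increase = sum(after.values()) - sum(before.values())
    report = {}
    for source in sorted(set(before) | set(after)):
        delta = after[source] - before[source]
        # Share of the overall increase that this source accounts for.
        share = delta / total_increase if total_increase > 0 else 0.0
        report[source] = (before[source], after[source], share)
    return report


# Hypothetical, made-up signup records (illustration only).
baseline_signups = ["promoted_content"] * 10 + ["search"] * 30
current_signups = (
    ["promoted_content"] * 14 + ["search"] * 32 + ["community_thread"] * 114
)

report = growth_by_source(baseline_signups, current_signups)
for source, (before, after, share) in report.items():
    print(f"{source:20s} {before:4d} -> {after:4d}  {share:6.1%} of the increase")
```

In this made-up scenario total signups rise by 300%, but the promoted content accounts for only a small slice of the increase while the community thread accounts for nearly all of it. The headline number alone would have pointed to the wrong conclusion.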
Whether you want to grow your user base, sustain your open source projects, increase your sales, or make the world a better place, I believe we need to be serious about using data (along with other qualitative indicators).
Using Metrics & Data as Part of Your Sustainability Efforts:
Keeping open source projects alive can be done in many ways. You can attract contributors, ask for time donations, drum up financial support, or offer up commercial services. But while these are all great strategies, more projects and companies struggle to make them work than reach a high level of success with them. Why do some projects thrive and others struggle?

There's a big debate about how to keep open source projects alive and thriving, especially when it comes to paying the people who maintain them. Sure, big companies donate money, individuals donate time and small amounts of money when possible, and some projects sell commercial offerings to offset project costs. All too often, however, these funds are a drop in the ocean compared to what the project and community need. This gap is one of the reasons many projects that turn into commercial entities seek outside funding. Often, to accelerate the project (and build a reasonable revenue stream), more engineers are needed to fill gaps in features and quality and to address other technical challenges. Many of these funded entities still spend multiple years and hundreds of thousands of engineering hours before they become commercially viable or trusted in their space (if they ever do). Efficiency and focus on the “right” activities across engineering, growth, and community are needed whether you are funded or not, but with limited funds you have to know that what you are working on matters and is helping the community get better. That's why we need data to help us focus on the most important stuff and get the best results.
The challenge now is to build a set of metrics that can help project maintainers and developers boost their growth. And by growth, I don't mean just making more money. I'm talking about growing users, getting more contributors, and hitting the goals you've set.
Successful open source projects follow a simple rule: decide what outcome you want the most, work towards achieving it, and measure how well you're doing along the way. So why isn't everyone focusing on metrics? The answer, as we'll explore in the next part of this blog, has a lot to do with the unique culture of the open-source community.
Stick around for the next part where we dive into the challenges of measuring success in open source and how we can navigate them.
I hope you enjoyed this blog and learned something new about the importance of data for open source success. If you want to learn more about which metrics you should be tracking in your open source business, I have a video for you. In this video, I share some of the key metrics that can help you measure and improve your open source performance. You can watch it here:
If you want to read the second part of this blog, you can do it here.