Scarf Blog - The Most Neglected and Overlooked Open Source Metric: Production Users

After speaking with hundreds of people in various roles working on large open source projects or at commercial open source software (COSS) companies, I have found that their perspectives on measuring growth and success vary a great deal. In fact, I am surprised that some people zero in on certain metrics and ignore many others. The one commonality is that they often fall short of grasping a complete picture. For instance, one executive may be hyper focused on the number of contributors, while another at a similar-sized company may be hyper focused on the number of customers.

Managers, executives, maintainers, and VCs tend to gravitate towards community and revenue metrics but most frequently overlook numbers concerning their actual user base, the fundamental building block that ties everything together and completes all other open source metrics. Production usage supplies the central lens through which we should view all other metrics. Without it, we’d simply have standalone metrics that don’t tell us anything concrete about the actual business or business potential. Ironically, production usage is the metric that open source maintainers and business owners miss the most. Below, I’ll cover how you can avoid this prevalent pitfall and instead take measures toward optimizing your metrics based on production users.

4 types of open source projects or companies

Certain folks are more prone to a certain outlook on metrics, so it’s best to first identify which group applies to you. People in open source tend to fall into four main categories.

1.) Results, not metrics

About 10% of the people I talk to say that their user base is growing, but they don’t track metrics regularly or specifically. The philosophy of “we do good for the community, and they do good for us” or “we trust that it will work out” is alive and well. This group is mostly made up of projects (some small and some very large), but, surprisingly, a couple of companies hold this philosophy as well.

I personally am worried about this group. While many good people and projects fall into this category, without looking at metrics you can’t measure the effectiveness of your activities, make adjustments to improve results, or review whether your activities achieved the desired results.

2.) Community above all else

A little more than 50% of those I talk to are focused almost exclusively on the community. A lot of pre-seed or early seed companies fall into this category. Additionally, most pure open source project teams focus their energy on these metrics. A small but not insignificant number of investors also are here.

This group cares most about metrics such as:

Slack users
GitHub stars
Overall contributors

Note, these metrics are the most common, but this group values other popular metrics too (PRs opened and closed, issues opened, number of users on their forum, number of times a project is cloned, etc.), just on an inconsistent basis.

3.) Community plus revenue

About 33–35% of those I have chatted with fall into this group. This group is all about commercial open source. Their number one goal typically is revenue. The larger the company, the less their executives tend to focus on the community and more on the top-line numbers. Still, there are usually teams at these companies who do care greatly about the community metrics.

This group focuses on metrics such as:

Slack users
GitHub stars
Overall contributors
Number of customers
Annual recurring revenue (ARR)

Note that COSS companies also look at churn, customer acquisition costs, etc.

4.) All metrics count

Fewer than I would have hoped—about 5%—emphasize a more holistic metrics focus. VCs, COSS companies, and a few maintainers talk about these metrics regularly.

This group builds on community and commercial metrics but adds usage data:

Slack users
GitHub stars
Overall contributors
Downloads and Docker pulls
Number of users (enterprise vs. not)
Number of customers (enterprise vs. not)
ARR

Note that people will flip these metrics in and out. Maintainers, for example, don’t usually focus on customers but some do look at the growing user base.

Which group do you think applies to you?

Missing the full picture?

If you’re part of the 95% who don’t identify with the “all metrics count” group, then it’s likely that you’ve yet to capture key insights that could be game changing. All metrics show you something different, but they build on top of one another, meaning that seeing an uptick or slowdown in one area could lead to a chain reaction.

Normally, COSS customers follow this type of path:

Step 1: A healthy and happy community (contributors, repo activity, and an active Slack or forums)

Leads to

Step 2: A healthy user base (people who download, install, and are ongoing users)

Which leads to

Step 3: Paying customers (better ARR and more enterprise penetration)

The spread across the types of open source projects and companies out there reveals that people seem most focused on step 1 (the healthy community) and step 3 (paying customers) but ignore or minimize step 2 (a healthy open source user base). Companies with a SaaS offering can track users on their free tier easily and with few resources, so the problem of tracking the user base looks a little different in their case. So why aren’t there more people who care about community, commercial, and usage metrics together, particularly for on-premises software?

Even if you are focused exclusively on either the community or sales, there is a direct correlation between a healthy, growing user base and a growing community or growing customer base.

Why should you care about measuring your open source user base?

Growing the user base not only enables more customers but also generates more potential contributors. In my experience, many projects’ most valuable contributors were first users of the product who either loved the project so much that they wanted to help or found something that the software simply could not do and wanted to contribute it.

A massively awesome community with very little in the way of users is a recipe for slow customer growth. To counter that, let’s examine why user growth deserves to be watched more closely.

Get ahead of problems

Whether you are a startup or a Fortune 500 company, unexpected fluctuations in revenue are problematic. If you see fewer deals in your pipeline, you send out an alarm. In the community space, if you see fewer contributions or less engagement in Slack, then you send out an alarm and strategize how to fix it. Similarly, if you see fewer installs, downloads, or active production installs of your software, that is a huge red flag. The worst time to fix a revenue problem or a customer problem is when you have already lost a customer. You want to get ahead of that, and monitoring user growth gives you a much faster pulse on trajectory, serving as a leading rather than a lagging indicator.

Optimize your sales process and customer journey

How many open source users become paying customers? Do you know where they come from? Why did they first download? What industries most use your software? How many installs happen at each company?

These are questions that can help shape the sales process, onboarding, and your customer’s overall journey. Having access to details like these enables you to build a plan, optimize, and improve conversion rates.
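
As one way to start on the first of these questions, the sketch below computes a user-to-customer conversion rate per industry. It assumes you already have a list of organizations known to use the open source version, each with a flag for whether it is a paying customer. The record layout, the field names (`industry`, `is_customer`), and the sample data are all hypothetical and only for illustration.

```python
from collections import defaultdict


def conversion_by_segment(orgs, key="industry"):
    """Return {segment: (users, customers, conversion_rate)}.

    `orgs` is a list of dicts; `key` names the field to group by.
    Both the layout and the field names are assumptions for this sketch.
    """
    totals = defaultdict(lambda: [0, 0])  # segment -> [users, customers]
    for org in orgs:
        segment = org[key]
        totals[segment][0] += 1
        if org["is_customer"]:
            totals[segment][1] += 1
    return {seg: (u, c, c / u) for seg, (u, c) in totals.items()}


# Made-up sample data, not real measurements.
sample_orgs = [
    {"name": "org-a", "industry": "finance", "is_customer": True},
    {"name": "org-b", "industry": "finance", "is_customer": False},
    {"name": "org-c", "industry": "retail", "is_customer": False},
    {"name": "org-d", "industry": "retail", "is_customer": False},
]

rates = conversion_by_segment(sample_orgs)
for segment, (users, customers, rate) in sorted(rates.items()):
    print(f"{segment}: {customers}/{users} users are customers ({rate:.0%})")
```

Grouping by a different field (for example, the channel a user first downloaded from) only requires passing another `key`, provided that field exists in your own data.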

Make your DevRel and community activities more efficient

Being able to understand and help your user base can make all of your community and DevRel initiatives run smoother and more efficiently. For example, take a map showing where your contributors, users, and customers are concentrated:

People tend to focus on contributors or customers when looking at decisions about where to have events. This may overlook a massive user base elsewhere that just needs a bit of attention to grow faster.

If we are going to do a hackathon or a contributor-focused meetup, either China or the western U.S. seems ideal.
We have a lot of users but not a lot of customers in Australia. Maybe a conference there could help? We would get more attendance there than at a user conference in China.
Maybe we should look to hire or provide more local language support to contributors in China or users in Germany by translating all of our docs into Chinese and German.
Given that most of our customers are in the UK or Germany and we also have a large number of users in those locations, those places seem prime for field marketing events.

User data allows you to draw these conclusions, but the benefits go beyond events. With the proper data setup, you can see which pages on the website lead to downloads, which package managers are most popular, where third-party tools embed your software (partner activities, anyone?), or even tie downloads to actual content! All of this information points to who your users are and how they use your product. Such data prevents you from making the wrong assumptions.

If you really want to take user and usage data a step further, correlating those metrics with content metrics gives you a sense of what users are interested in or looking for moving forward.

For example, suppose blogs 2 and 3 receive more traffic than blog 1, but blog 1 drives more people to download and try your software. Depending on your goal (in this scenario, conversion vs. awareness), you may want to drill down into the habits and demographic data of those who consumed certain blog content in an effort to better cater to their needs. You might treat blog 1 as content for warmer leads and look at how that group interacted with it to gain more insight into what they respond to and how to meet them where they’re at. Without that connection, you can spend a lot of energy on activities that do not move the needle.
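
The blog comparison can be sketched in a few lines. This assumes you can attribute downloads to the page a visitor came from; the page names and numbers below are invented for illustration and are not real measurements.

```python
def rank_content(stats):
    """Rank pages by download conversion rate (downloads / views), highest first.

    `stats` maps a page name to a (views, attributed_downloads) tuple.
    """
    ranked = []
    for page, (views, downloads) in stats.items():
        rate = downloads / views if views else 0.0
        ranked.append((page, views, downloads, rate))
    return sorted(ranked, key=lambda row: row[3], reverse=True)


# Hypothetical numbers: blogs 2 and 3 get more traffic, blog 1 converts better.
content_stats = {
    "blog-1": (1_000, 80),
    "blog-2": (5_000, 50),
    "blog-3": (4_000, 20),
}

ranking = rank_content(content_stats)
by_conversion = [row[0] for row in ranking]
by_traffic = sorted(content_stats, key=lambda p: content_stats[p][0], reverse=True)

print("By conversion:", by_conversion)
print("By traffic:   ", by_traffic)
```

The two orderings disagree, which is the point: a page that looks weak on awareness can be the strongest one for conversion.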

Encounter more investment and funding options

Of all reasons, this one is a matter of survival. In today’s market, more data is a must for investors. Recently, Cowboy Ventures surveyed investors on what they look for in early seed rounds, and the top metric was production users (not customers, not Slack, not GitHub). The second was users in larger companies and then community metrics (Slack, GitHub issues, and stars).

*If you were investing in a new company, which set of metrics would you prefer to see?*

Why do we minimize or in some cases outright ignore tracking real open source user base metrics?

I have a few theories.

No one owns your open source user growth targets

The first theory is simple. There are separate and dedicated teams focused on community growth and revenue growth but not user base growth.

On the revenue growth side, sales and marketing are focused on closing deals, adding new customers, and increasing revenue for the company. They are driven by their ARR and customer growth goals. Marketing seeks to funnel potential customers into the sales pipeline and turn a user into a customer as quickly as possible, though it can take months if not longer in some cases. Marketing is all about awareness (top of the funnel) and lead generation (middle of the funnel), while sales is tasked with converting leads into customers.

On the community growth side, the DevRel, community, and even marketing teams are all trying to bring awareness and usher people into the community. They work or interface with code contributors to improve the project by adding features, eliminating bugs, and extending the capabilities of the underlying software. Many community teams measure their performance based on contributor growth, Slack activity increases, etc.

DevRel folks are usually charged with driving awareness as part of their role. It’s their job to make people aware of the product, that it is awesome, and to get them to try it out! They do this by releasing a continual stream of content, including talks, docs, and tutorials. Although DevRel gives users the tools to be successful, I have not seen a DevRel person in the open source community specifically charged with (or evaluated on) a specific user growth target. If the responsibility for user base growth were to live anywhere, however, it would fit best under this team.

Part of owning the user base is ensuring that people stay users and don’t churn. While great content and docs can help keep users in the ecosystem, DevRel historically has not assumed responsibility for maintaining the user base. That said, some developer relations engineers end up acting as free community support. As commercialization efforts ramp up, however, support-type activities shift over to a paid-for offering, so community involvement in that regard gets neglected or understaffed. The product team could pick up the slack, but in my experience most product teams end up focused on the commercial offerings as well.

Consequently, few look at, plan for, and measure actual open source users.

Measuring the growth of the open source user base is hard

The second theory is that it’s hard to measure. In the last five years, we have developed tons of tools to help measure the community, marketing, and sales cycles but very few that focus on the actual open source user base. Measuring downloads, installs, and actual usage of the product requires a degree of data sharing that companies have not been willing to explore or involves technical hurdles that are very difficult to overcome. I’ve broken down some of these roadblocks below.

Downloads

When looking at who downloads your software, the first challenge is understanding where they originate. Users can grab software from dozens of places:

Container repos such as Docker Hub, Container Registry from Google Cloud, Red Hat Quay, Amazon Elastic Container Registry, and Azure Container Registry
Package managers such as Nix, Homebrew, RPM, and APT
Language-specific package managers such as pip and npm
Direct downloads
Source (GitHub or GitLab)

Consolidating these metrics in an organized way requires planning and some sort of central control plane or gateway (like we have at Scarf) to aggregate and track.
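
To make the consolidation step concrete, here is a minimal sketch that rolls per-source download counts up into channel totals. It assumes the counts have already been exported from each source into one common record shape. In practice every registry reports differently, so that normalization is the hard part and is not shown here. The `channel` labels and all numbers are hypothetical.

```python
from collections import Counter


def aggregate_downloads(events):
    """Sum download counts per channel and overall.

    Each event is a dict with hypothetical keys: source, channel, count.
    """
    per_channel = Counter()
    for event in events:
        per_channel[event["channel"]] += event["count"]
    return dict(per_channel), sum(per_channel.values())


# Invented sample counts for a single reporting period.
events = [
    {"source": "Docker Hub", "channel": "container", "count": 1200},
    {"source": "Red Hat Quay", "channel": "container", "count": 300},
    {"source": "Homebrew", "channel": "system-package", "count": 150},
    {"source": "pip", "channel": "language-package", "count": 900},
    {"source": "direct", "channel": "direct", "count": 75},
]

per_channel, total = aggregate_downloads(events)
print(per_channel, total)
```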

Installs

If you can get download numbers, you can infer some number of installs, but a certain percentage of downloads are going to be upgrades or automation running as part of CI/CD pipelines. This is where call-home metrics or install telemetry would be the most helpful. See more details in the following section on active usage.
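
As a rough illustration of separating likely new installs from upgrades and automation, the sketch below classifies download records with two simple heuristics. Everything here is an assumption: that each record carries a user agent string and some anonymized `client_id`, that records arrive in chronological order, and that the marker list catches your CI traffic. Real traffic is messier, so treat the result as an estimate only.

```python
# Hypothetical substrings that suggest an automated client.
AUTOMATION_MARKERS = ("jenkins", "github-actions", "gitlab-runner")


def classify_downloads(downloads):
    """Split downloads into automation, upgrades, and likely new installs.

    A download counts as an upgrade when the same (hypothetical) client_id
    has already been seen earlier in the list.
    """
    seen = set()
    counts = {"automation": 0, "upgrade": 0, "new_install": 0}
    for record in downloads:
        agent = record["user_agent"].lower()
        if any(marker in agent for marker in AUTOMATION_MARKERS):
            counts["automation"] += 1
        elif record["client_id"] in seen:
            counts["upgrade"] += 1
        else:
            counts["new_install"] += 1
            seen.add(record["client_id"])
    return counts


# Invented records, in chronological order.
downloads = [
    {"client_id": "a", "user_agent": "curl/8.0"},
    {"client_id": "b", "user_agent": "Jenkins/2.4"},
    {"client_id": "a", "user_agent": "curl/8.1"},
    {"client_id": "c", "user_agent": "docker/24.0"},
]

counts = classify_downloads(downloads)
print(counts)
```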

Active usage

To understand who actively uses your project over an extended period of time, either people need to volunteer the information or you need to provide some sort of call-home or ping-back functionality. A few projects over the years have tried this and failed when concerns over privacy and transparency emerged. You can infer users by looking at aspects such as distinct companies opening tickets and engaging in the community, but in my experience, these active community users are about 5–10% at most of the real user base. The number of companies downloading updates on a regular basis gets you a more realistic count, but it is still underreported (about 80% accurate). The best solutions here are opt-in, and at the very least they should offer something that the user can benefit from (e.g., checks for common vulnerabilities and exposures).
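
Counting organizations that download updates on a regular basis can be approximated as below. This is a sketch under stated assumptions: that each download can be mapped to an organization at all, and that "regular" means activity in at least three distinct months. The threshold and the sample data are arbitrary and hypothetical.

```python
from collections import defaultdict
from datetime import date


def active_orgs(downloads, min_months=3):
    """Return orgs seen downloading in at least `min_months` distinct months.

    `downloads` is an iterable of (org, date) pairs.
    """
    months = defaultdict(set)
    for org, day in downloads:
        months[org].add((day.year, day.month))
    return sorted(org for org, seen in months.items() if len(seen) >= min_months)


# Invented download log.
sample = [
    ("org-a", date(2023, 1, 5)),
    ("org-a", date(2023, 2, 9)),
    ("org-a", date(2023, 3, 2)),
    ("org-b", date(2023, 1, 5)),
    ("org-b", date(2023, 1, 20)),
    ("org-c", date(2023, 2, 1)),
]

regulars = active_orgs(sample)
print(regulars)
```

Here only `org-a` qualifies, because `org-b` downloaded twice within the same month and `org-c` only once.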

It’s boring

Another challenge that makes measuring growth of the user base hard comes from the nature of the beast. Sometimes things that are hard to measure seem uninteresting or unnecessary when in reality they are just hard. After all, which story sounds better?

Over 10,000 users worldwide love and trust our software
Thousands of contributors worldwide use our software and make it better, plus we have over 5,000 active members in the community and over 15,000 stars on GitHub

Let’s be honest: having people who love and care about your product enough not only to use it but also to contribute to it is a powerful and wonderful story to tell. It is also necessary. You need to understand how many of your users are contributors, and how many of your customers are contributors. You need all three groups (users, contributors, and customers) to be healthy.

It’s a blind spot

In our digital age, it can be hard to believe that not enough information is accessible and out there. In truth, there is a lot of information about working with the open source community and about how projects run themselves. There is also a great deal about how to track and grow revenue, but there is, in fact, very little on growing the free user base without tying it to revenue or contributor growth.

As you can see, these theories highlight obstacles that may explain why the user base is not measured as often as it ought to be. I’ve always been of the mind that the degree of skill required to address the challenges involved reflects the degree to which this metric is worth pursuing.

Final thoughts

Given how many reasons there are to measure production usage and gather data on our user base, it’s a wonder we don’t see it happening more. The majority of us already track community engagement and/or revenue, but a third, equally important layer exists, and that is the user base. Obtaining those numbers is certainly no small task, but doing so is the common thread across successful open source projects and COSS companies. When you are looking for meaningful growth in your open source project, make sure you are looking at your open source user base.

If you don’t know who owns the growth of your open source user base and is responsible for reporting it every month, then you may want to rethink your metrics strategy.

If you’d rather not do that alone, sign up for Scarf and we’ll be more than happy to help you get started.