The Most Neglected and Overlooked Open Source Metric: Production Users

After speaking with hundreds of people in various roles working on large open source projects or at commercial open source software (COSS) companies, I have found that their perspectives on measuring growth and success vary a great deal. In fact, I am surprised by how often people zero in on a few metrics and ignore many others. The one commonality is that they often fall short of grasping the complete picture. For instance, one executive may be hyper-focused on the number of contributors, while another at a similar-sized company may be hyper-focused on the number of customers.

Managers, executives, maintainers, and VCs tend to gravitate towards community and revenue metrics but most frequently overlook numbers concerning their proper user base—the fundamental building block that ties everything together and completes all other open source metrics. Production usage supplies the central lens by which we should view all other metrics. Without it, we’d simply have standalone metrics that don’t tell us anything concrete about the actual business or business potential. Ironically, production usage is the metric that open source maintainers and business owners miss the most. I’ll proceed to cover how you can avoid this prevalent pitfall and instead take measures toward optimizing your metrics based on production users.

4 types of open source projects or companies

Certain folks are more prone to a certain outlook on metrics, so it’s best to first identify which group applies to you. People in open source tend to fall into four main categories.  

1.) Results, not metrics 

About 10% of the people I talk to say that their user base is growing, but they don’t track metrics regularly or specifically. The philosophy of “we do good for the community, and they do good for us” or “we trust that it will work out” is alive and well. This group is mostly made up of projects (some small and some very large), but, surprisingly, a couple of companies hold this philosophy as well.

I personally am worried about this group. While many good people and projects fall into this category, without looking at metrics you can’t measure the effectiveness of your activities, make adjustments to improve results, or review whether those activities achieved the desired results.

2.) Community above all else 

A little more than 50% of those I talk to are focused almost exclusively on the community.  A lot of pre-seed or early seed companies fall into this category. Additionally, most pure open source project teams focus their energy on these metrics. A small but not insignificant number of investors also are here.

This group cares most about metrics such as:

  • Slack users
  • GitHub stars 
  • Overall contributors

Note, these metrics are the most common, but this group values other popular metrics too (PRs opened and closed, issues opened, number of users on their forum, number of times a project is cloned, etc.), just on an inconsistent basis.  

3.) Community plus revenue

About 33–35% of those I have chatted with fall into this group. This group is all about commercial open source. Their number one goal typically is revenue. The larger the company, the less their executives tend to focus on the community and more on the top-line numbers. Still, there are usually teams at these companies who do care greatly about the community metrics.

This group focuses on metrics such as:

  • Slack users
  • GitHub stars
  • Overall contributors
  • Number of customers
  • Annual recurring revenue (ARR)

Note that COSS companies also look at churn, customer acquisition costs, etc.

4.) All metrics count

Fewer than I would have hoped—about 5%—emphasize a more holistic metrics focus. VCs, COSS companies, and a few maintainers talk about these metrics regularly. 

This group builds on community and commercial metrics but adds usage data:

  • Slack users
  • GitHub stars 
  • Overall contributors
  • Downloads and Docker pulls
  • Number of users (enterprise vs. not)
  • Number of customers (enterprise vs. not)
  • ARR

Note that people will flip these metrics in and out. Maintainers, for example, don’t usually focus on customers but some do look at the growing user base. 

Which group do you think applies to you?

Missing the full picture?

If you’re part of the 95% who don’t identify with the “all metrics count” group, then it’s likely that you’ve yet to capture key insights that could be game changing. All metrics show you something different, but they build on top of one another, meaning that seeing an uptick or slowdown in one area could lead to a chain reaction.

Normally, COSS customers follow this type of path:

Step 1: A healthy and happy community (contributors, repo activity, and an active Slack or forums)

Leads to

Step 2: A healthy user base (people who download, install, and are ongoing users)

Which leads to

Step 3: Paying customers (better ARR and more enterprise penetration)

The spread across the types of open source projects and companies out there reveals that people seem most focused on step 1 (the healthy community) and step 3 (paying customers) but ignore or minimize step 2 (a healthy open source user base). Companies with a SaaS offering can track users on their free tier easily and with few resources, so the problem of tracking the user base looks a little different in their case. So why aren’t there more people who care about community, commercial, and usage metrics, particularly for on-premises deployments?

Even if you are focused exclusively on either the community or sales, there is a direct correlation between a healthy, growing user base and a growing community or growing customer base.

Why should you care about measuring your open source user base?


Growing the user base not only enables more customers but also generates more potential contributors. In my experience, many projects’ most valuable contributors were first users of the product who either loved the project so much that they wanted to help or found something that the software simply could not do and wanted to contribute.

A massively awesome community with very little in the way of users is a recipe for slow customer growth. To counter that, let’s examine why user growth deserves to be watched more closely.

Get ahead of problems

Whether you are a startup or a Fortune 500 company, unexpected fluctuations in revenue are problematic. If you see fewer deals in your pipeline, you send out an alarm. In the community space, if you see fewer contributions or less engagement in Slack, then you send out an alarm and strategize how to fix it. Similarly, if you see fewer downloads, installs, or active production installs of your software, that is a huge red flag. The worst time to fix a revenue problem or a customer problem is when you have already lost a customer. You want to get ahead of that, and monitoring user growth gives you a much faster pulse on trajectory, serving as a leading rather than a lagging indicator.
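To make the "leading indicator" idea concrete, here is a minimal sketch of an alarm on usage numbers. It assumes you already collect a weekly count of production installs; the function name, the four-week window, and the 10% threshold are illustrative choices, not a recommendation from any particular tool.

```python
from statistics import mean


def usage_alert(weekly_counts, window=4, drop_threshold=0.10):
    """Return True when the most recent window's average is more than
    drop_threshold (as a fraction) below the previous window's average."""
    if len(weekly_counts) < 2 * window:
        return False  # not enough history to compare two full windows
    previous = mean(weekly_counts[-2 * window:-window])
    recent = mean(weekly_counts[-window:])
    if previous == 0:
        return False  # avoid dividing by zero on an empty baseline
    return (previous - recent) / previous > drop_threshold


# Hypothetical weekly production-install counts, oldest first.
installs = [980, 1010, 1005, 990, 900, 860, 820, 790]

if usage_alert(installs):
    print("Production installs are trending down; investigate before revenue follows.")
```

Comparing two rolling windows rather than two single weeks keeps one noisy week from tripping the alarm.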

Optimize your sales process and customer journey

How many open source users become paying customers? Do you know where they come from? Why did they first download? What industries most use your software? How many installs happen at each company? 

These are questions that can help shape the sales process, onboarding, and your customer’s overall journey. Having access to details like these enables you to build a plan, optimize, and improve conversion rates.

Make your DevRel and community activities more efficient

Being able to understand and help your user base can make all of your community and DevRel initiatives run smoother and more efficiently. For example, take the following map:

People tend to focus on contributors or customers when looking at decisions about where to have events. This may overlook a massive user base elsewhere that just needs a bit of attention to grow faster.
  • If we are going to do a hackathon or a contributor-focused meetup, either China or the western U.S. seems ideal. 
  • We have a lot of users but not a lot of customers in Australia. Maybe a conference there could help? We would get more attendance there than at a user conference in China.  
  • Maybe we should look to hire or provide more local language support to contributors in China or users in Germany by translating all of our docs into Chinese and German.
  • Given that most of our customers are in the UK or Germany and we also have a large number of users in those locations, those places seem prime for field marketing events.  

User data allows you to draw these conclusions, but the benefits go beyond events. With the proper data setup, you can see which pages on the website lead to downloads, which package managers are most popular, and where third-party tools embed your software (partner activities, anyone?), or even tie downloads to actual content! All of this information points to who your users are and how they use your product. Such data prevents you from making the wrong assumptions.

If you really want to take user and usage data a step further, correlating those metrics with content metrics gives you a sense of what users are interested in or looking for moving forward.

In this example, blogs 2 and 3 receive more traffic than blog 1, but blog 1 drives more people to download and try your software. Depending on your goal (in this scenario, conversion vs. awareness), you may want to drill down into the habits and demographic data of those who consumed certain blog content in an effort to better cater to their needs. You might treat blog 1 as content for warmer leads and look at how that group interacted with it to gain more insight into what they respond to and how to meet them where they are. Without that connection, you can spend a lot of energy on activities that do not move the needle.
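The traffic-versus-conversion comparison can be sketched in a few lines. Everything here is hypothetical: the `ContentStats` type and the visit and download numbers are made up for illustration and only mirror the shape of the example, where blog 1 gets less traffic but drives more downloads.

```python
from dataclasses import dataclass


@dataclass
class ContentStats:
    name: str
    visits: int      # page views attributed to this piece of content
    downloads: int   # downloads attributed to visitors of this content

    @property
    def conversion_rate(self) -> float:
        # Fraction of visits that led to a download; 0.0 if there were no visits.
        return self.downloads / self.visits if self.visits else 0.0


# Made-up numbers for illustration only.
posts = [
    ContentStats("blog 1", visits=2_000, downloads=300),
    ContentStats("blog 2", visits=9_000, downloads=90),
    ContentStats("blog 3", visits=7_500, downloads=60),
]

by_traffic = sorted(posts, key=lambda p: p.visits, reverse=True)
by_conversion = sorted(posts, key=lambda p: p.conversion_rate, reverse=True)

print("Best for awareness:", by_traffic[0].name)
print("Best for conversion:", by_conversion[0].name)
```

Ranking the same content two ways makes the trade-off explicit: the post you would promote for awareness is not necessarily the one you would promote for conversion.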

Encounter more investment and funding options

Of all the reasons, this one is a matter of survival. In today’s market, more data is a must for investors. Recently, Cowboy Ventures surveyed investors on what they look for in early seed rounds, and the top metric was production users (not customers, not Slack, not GitHub). The second was users in larger companies, followed by community metrics (Slack, GitHub issues, and stars).

If you were investing in a new company, which set of metrics would you prefer to see?

Why do we minimize or in some cases outright ignore tracking real open source user base metrics?

I have a few theories. 

No one owns your open source user growth targets

 The first theory is simple. There are separate and dedicated teams focused on community growth and revenue growth but not user base growth.

On the revenue growth side, sales and marketing are focused on closing deals, adding new customers, and increasing revenue for the company. They are driven by their ARR and customer growth goals. Marketing seeks to funnel potential customers into the sales pipeline and turn a user into a customer as quickly as possible, though it can take months if not longer in some cases. Marketing is all about awareness (top of the funnel) and lead generation (middle of the funnel), while sales is tasked with converting leads into customers.

On the community growth side, the DevRel, community, and even marketing teams are all trying to bring awareness and usher people into the community. They work or interface with code contributors to improve the project by adding features, eliminating bugs, and extending the capabilities of the underlying software. Many community teams measure their performance based on contributor growth, Slack activity increases, etc.  

DevRel folks are usually charged with driving awareness as part of their role. It’s their job to make people aware of the product, that it is awesome, and to get them to try it out! They do this by releasing a continual stream of content, including talks, docs, and tutorials. Although DevRel gives users the tools to be successful, I have not seen a DevRel person in the open source community specifically charged with (or evaluated on) a specific user growth target. If the responsibility for user base growth were to live anywhere, however, it would fit best under this team.   

Part of owning the user base is ensuring that people stay users and don’t churn. While great content and docs can help keep users in the ecosystem, DevRel historically has not assumed responsibility for maintaining the user base. That said, some people who are developer relations engineers end up acting as free community support, but as commercialization efforts ramp up, support-type activities shift over to a paid-for offering, so community involvement in that regard gets neglected or understaffed. The product team could pick up the slack, but in my experience most product teams end up focused on the commercial offerings as well.  

Consequently, few look at, plan for, and measure actual open source users. 

Measuring the growth of the open source user base is hard

The second theory is that it's hard to measure. In the last five years, we have developed tons of tools to help measure the community, marketing, and sales cycles but very few to focus on the actual open source user base. Measuring downloads, installs, and actual usage of the product requires a degree of data sharing that companies have not been willing to explore or involves technicalities very difficult to overcome. I’ve broken down some of these roadblocks below.

Downloads

When looking at who downloads your software, the first challenge is understanding where they originate. Users can grab software from dozens of places:

  • Container repos such as Docker Hub, Container Registry from Google Cloud, Red Hat Quay, Amazon Elastic Container Registry, and Azure Container Registry  
  • Package managers such as Nix, Homebrew, RPM, and APT
  • Language specific package managers such as pip and npm 
  • Direct downloads
  • Source (GitHub or GitLab) 

Consolidating these metrics in an organized way requires planning and some sort of central control plane or gateway (like we have at Scarf) to aggregate and track.   
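As a rough sketch of what consolidation means in practice, assume each channel can give you a record of (source, package, download count). The record shape, source labels, package name, and numbers below are all hypothetical; real registries expose this data in different forms, and some do not expose it at all.

```python
from collections import Counter


def consolidate(records):
    """Roll per-channel download records up into totals per package and per source.

    records: iterable of (source, package, count) tuples.
    """
    per_package = Counter()
    per_source = Counter()
    for source, package, count in records:
        per_package[package] += count
        per_source[source] += count
    return per_package, per_source


# Hypothetical export, one row per channel and package.
downloads = [
    ("docker", "myproject", 12_400),
    ("pip", "myproject", 8_150),
    ("homebrew", "myproject", 930),
    ("pip", "myproject-cli", 2_200),
]

per_package, per_source = consolidate(downloads)
print(per_package.most_common())
print(per_source.most_common())
```

The hard part is not the arithmetic but getting every channel to report in a comparable format, which is why a central gateway helps.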

Installs

If you can get download numbers, you can infer some number of installs, but a certain percentage of downloads are going to be upgrades or automation as part of CI/CD pipelines. This is where call home metrics or install telemetry would be the most helpful. See more details in the following section on active usage.
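One hedged way to move from downloads toward installs is to drop requests that look automated and count each remaining client once, so that repeat downloads for upgrades are not counted as new installs. This sketch assumes your download logs carry a user agent and some stable client identifier, which is not a given; the marker strings are illustrative placeholders, not a vetted list of real CI user agents.

```python
# Illustrative substrings only; a real list would need to be built from your own logs.
AUTOMATION_MARKERS = ("jenkins", "gitlab-runner", "github-actions")


def estimate_installs(events):
    """Count distinct clients whose downloads do not look like CI/CD automation.

    events: iterable of dicts with "client_id" and "user_agent" keys (assumed schema).
    """
    seen_clients = set()
    for event in events:
        user_agent = event["user_agent"].lower()
        if any(marker in user_agent for marker in AUTOMATION_MARKERS):
            continue  # likely a pipeline, not a person or a production host
        seen_clients.add(event["client_id"])  # repeat downloads (upgrades) count once
    return len(seen_clients)


# Hypothetical download log entries.
events = [
    {"client_id": "a", "user_agent": "pip/23.0"},
    {"client_id": "a", "user_agent": "pip/23.1"},           # same client upgrading
    {"client_id": "b", "user_agent": "GitHub-Actions/2.3"},  # filtered as automation
    {"client_id": "c", "user_agent": "Homebrew/4.0"},
]

print(estimate_installs(events))
```

This is still an inference: it will miss automation that does not identify itself and cannot tell whether a download was ever installed.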

Active usage

To understand who actively uses your project over an extended period of time, either people need to volunteer the information or you need to provide some sort of call-home or ping-back functionality. A few projects over the years have tried this and failed when concerns over privacy and transparency emerged. You can infer users by looking at signals such as distinct companies opening tickets and engaging in the community, but in my experience, these active community users are about 5–10% at most of the real user base. The number of companies downloading updates on a regular basis gets you a more realistic count, but it still underreports the user base (about 80% accurate). The best solutions here are opt-in; at the very least, they should offer something that the user can benefit from (e.g., checks for common vulnerabilities and exposures).
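A minimal sketch of an opt-in call-home could look like the following. It only builds the payload and sends nothing, and it does so only when the user has explicitly enabled it. The `MYPROJECT_TELEMETRY` variable name, the payload fields, and the function itself are assumptions made for illustration, not an existing API.

```python
import json
import os
import uuid


def build_ping(version, environ=None, install_id=None):
    """Return a JSON ping body if the user opted in, otherwise None.

    Telemetry is off by default: the (hypothetical) MYPROJECT_TELEMETRY
    variable must be set to 1, true, or yes to enable it.
    """
    env = os.environ if environ is None else environ
    if env.get("MYPROJECT_TELEMETRY", "").lower() not in ("1", "true", "yes"):
        return None  # no opt-in, so nothing is collected
    payload = {
        # A random identifier, not derived from hostnames or user data.
        # A real implementation would persist it so repeat pings are not double counted.
        "install_id": install_id or str(uuid.uuid4()),
        "version": version,
    }
    return json.dumps(payload, sort_keys=True)


print(build_ping("1.2.3", environ={}))                             # opted out
print(build_ping("1.2.3", environ={"MYPROJECT_TELEMETRY": "1"}))   # opted in
```

Keeping the payload this small, and documenting exactly what it contains, is one way to address the privacy and transparency concerns that have sunk earlier attempts.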

It’s boring

Another challenge that makes measuring growth of the user base hard comes from the nature of the beast. Sometimes things that are hard to measure come to seem uninteresting or unnecessary, when in reality they are just hard. After all, which story sounds better?

  1. Over 10,000 users worldwide love and trust our software
  2. Thousands of contributors worldwide use our software and make it better, plus we have over 5,000 active members in the community and over 15,000 stars on GitHub

Let’s be honest—having people who love and care about your product enough not only to use but also contribute to it is a powerful and wonderful story to tell. It is also necessary. You need to understand how many of your users are contributors, and how many of your customers are contributors. You need all three to be healthy.

It’s a blind spot

In our digital age, it can be hard to believe that not enough information is accessible and out there. In truth, there is a lot of information about working with the open source community and projects running themselves. There is also a great deal about how to track and grow revenue, but there is very little in fact on growing the free user base without tying it to revenue or contributor growth.

As you can see, these theories highlight obstacles that may explain why the user base is not measured as often as it ought to be. I’ve always been of the mind that the degree of skill required to address the challenges involved reflects the degree to which this metric is worth pursuing.

Final thoughts

Given how many reasons there are to measure production usage and gather data on our user base, it’s a wonder we don’t see it happening more. The majority of us already track community engagement and/or revenue, but a third, equally important layer exists, and that is the user base. Obtaining those numbers is certainly no small task, but doing so is the common thread across successful open source projects and COSS companies. When you are looking for meaningful growth in your open source project, make sure you are looking at your open source user base.

If you don’t know who owns the growth of your open source user base and is responsible for reporting it every month, then you may want to rethink your metrics strategy.

If you’d rather not do that alone, sign up for Scarf and we’ll be more than happy to help you get started.

Published March 21, 2023

This article was originally posted on Hackernoon

After speaking with hundreds of people in various roles working on large open source projects or at commercial open source software (COSS) companies, I have found that their perspectives on measuring growth and success vary a great deal. In fact, I am surprised that some people zero in on certain metrics and ignore many others. The one commonality is that they often fall short of grasping a complete picture. For instance, one executive may be hyper focused on the number of contributors, while another at a similar-sized company may be hyper focused on the number of customers. 

Managers, executives, maintainers, and VCs tend to gravitate towards community and revenue metrics but most frequently overlook numbers concerning their proper user base—the fundamental building block that ties everything together and completes all other open source metrics. Production usage supplies the central lens by which we should view all other metrics. Without it, we’d simply have standalone metrics that don’t tell us anything concrete about the actual business or business potential. Ironically, production usage is the metric that open source maintainers and business owners miss the most. I’ll proceed to cover how you can avoid this prevalent pitfall and instead take measures toward optimizing your metrics based on production users.

4 types of open source projects or companies

Certain folks are more prone to a certain outlook on metrics, so it’s best to first identify which group applies to you. People in open source tend to fall into four main categories.  

1.) Results, not metrics 

About 10% of the people I talk to say that their user base is growing, but they don’t track metrics regularly or specifically. The philosophy of “we do good for the community, and they do good for us” or “we trust that it will work out” is alive and well. This group is mostly made up of projects (some small and some very large), but a surprisingly couple of companies hold this philosophy as well.

I personally am worried about this group. While many good people and projects fall into this category, you can’t measure your activities' effectiveness, make adjustments to improve results, and review if your activities achieved the desired results without looking at metrics. 

2.) Community above all else 

A little more than 50% of those I talk to are focused almost exclusively on the community.  A lot of pre-seed or early seed companies fall into this category. Additionally, most pure open source project teams focus their energy on these metrics. A small but not insignificant number of investors also are here.

This group cares most about metrics such as:

  • Slack users
  • GitHub stars 
  • Overall contributors

Note, these metrics are the most common, but this group values other popular metrics too (PRs opened and closed, issues opened, number of users on their forum, number of times a project is cloned, etc.), just on an inconsistent basis.  

3.) Community plus revenue

About 33–35% of those I have chatted with fall into this group. This group is all about commercial open source. Their number one goal typically is revenue. The larger the company, the less their executives tend to focus on the community and more on the top-line numbers. Still, there are usually teams at these companies who do care greatly about the community metrics.

This group focuses on metrics such as:

  • Slack users
  • GitHub stars
  • Overall contributors
  • Number of customers
  • Annual recurring revenue (ARR)

Note that COSS companies also look at churn, customer acquisition costs, etc.

4.) All metrics count

Fewer than I would have hoped—about 5%—emphasize a more holistic metrics focus. VCs, COSS companies, and a few maintainers talk about these metrics regularly. 

This group builds on community and commercial metrics but adds usage data:

  • Slack users
  • GitHub stars 
  • Overall contributors
  • Downloads and Docker pulls
  • Number of users (enterprise vs. not)
  • Number of customers (enterprise vs. not)
  • ARR

Note that people will flip these metrics in and out. Maintainers, for example, don’t usually focus on customers but some do look at the growing user base. 

Which group do you think applies to you?

Missing the full picture?

If you’re part of the 95% who don’t identify with the “all metrics count” group, then it’s likely that you’ve yet to capture key insights that could be game changing. All metrics show you something different, but they build on top of one another, meaning that seeing an uptick or slowdown in one area could lead to a chain reaction.

Normally, COSS customers follow this type of path:

Step 1: A healthy and happy community (contributors, repo activity, and an active Slack or forums)

Leads to

Step 2: A healthy user base (people who download, install, and are ongoing users)

Which leads to

Step 3: Paying customers (better ARR and more enterprise penetration)

The spread across the types of open source projects and companies out there reveals that people seem most focused on step 1 (the healthy community) and step 3 (paying customers) but ignore or minimize step 2 (a healthy open source user base). Companies make the trackability of users on the free tier of their SaaS offering(s) easy and resource light, so the problem of tracking the user base will look a little different in their case. So why aren’t there more people who care about community, commercial, and usage metrics, particularly for on premises?

Even if you are focused exclusively focused on either the community or sales, there is a direct correlation between a healthy growing user based and a growing community or growing customer base:

Why should you care about measuring your open source user base?


Growing the user base enables not only more customers, but also generates more potential contributors. In my experience, many projects’ most valuable contributors were first users of the product who either loved the project so much that they wanted to help, or they found something that the software simply could not do and wanted to contribute.

A massively awesome community, but very little in the way of users is a recipe for slow customer growth. To counter that, let’s examine why user growth deserves to be watched more closely. 

Get ahead of problems

No matter if you are a startup or a Fortune 500 company, unexpected fluctuations in revenue are problematic. If you see fewer deals in your pipeline, you send out an alarm. In the community space, if you see fewer contributions or less engagement in Slack, then you send out an alarm and strategize how to fix it. Similarly, if you can see fewer installs, downloads, or  active production installs of your software, that is a huge red flag. The worst time to fix a revenue problem or a customer problem is when you have lost a customer. You want to get ahead of that, and monitoring user growth helps you get a much faster pulse on trajectory, serving as a lead rather than a lag measure.

Optimize your sales process and customer journey:

How many open source users become paying customers? Do you know where they come from? Why did they first download? What industries most use your software? How many installs happen at each company? 

These are questions that can help shape the sales process, onboarding, and your customer’s overall journey. Having access to details like these enables you to build a plan, optimize, and improve conversion rates.

Make your DevRel and community activities more efficient

Being able to understand and help your user base can make all of your community and DevRel initiatives run smoother and more efficiently. For example, take the following map:

People tend to focus on contributors or customers when looking at decisions about where to have events. This may overlook a massive user base elsewhere that just needs a bit of attention to grow faster.
  • If we are going to do a hackathon or a contributor-focused meetup, either China or the western U.S. seems ideal. 
  • We have a lot of users but not a lot of customers in Australia. Maybe a conference there could help? We would get more attendance there than at a user conference in China.  
  • Maybe we should look to hire or provide more local language support to contributors in China or users in Germany by translating all of our docs into Chinese and German.
  • Given that most of our customers are in the UK or Germany and we also have a large number of users in those locations, those places seem prime for field marketing events.  

User data allows you to make these conclusions, but the benefits go beyond events. With the proper data setup, you can see which pages on the website lead to downloads, the most popular package managers, embeds by third-party tools (partner activities, anyone?), or even tying downloads to actual content! All of this information points to who your users are and how they use your product. Such data prevents you from making the wrong assumptions. 

If you really want to take user and usage data a step further, correlating those metrics with content metrics gives you a sense of what users are interested in or looking for moving forward.

In this example, blog 2 and 3 would receive more traffic than blog 1, but blog 1 drives more people to download and try your software.  Depending on your goal (in this scenario, conversion vs. awareness), you may want to drill down into the habits and demographic data of those who consumed certain blog content in an effort to better cater to their needs. You might deem blog 1 as one for warmer leads and look at how that group interacted with the content to gain more insight into what they respond to and how to meet them where they’re at. Without that connection, you can spend a lot of energy on activities that do not move the needle.

Encounter more investment and funding options

Of all reasons, this one is a matter of survival. In today’s market, more data is a must for investors. Recently, Cowboy Ventures surveyed investors on what they look for in early seed rounds, and the top metric was production users (not customers, not Slack, not GitHub). The second was users in larger companies and then community metrics (Slack, GitHub issues, and stars).

If you were investing in a new company which set of metrics would you prefer to see?

Why do we minimize or in some cases outright ignore tracking real open source user base metrics?

I have a few theories. 

No one owns your open source user growth targets

 The first theory is simple. There are separate and dedicated teams focused on community growth and revenue growth but not user base growth.

On the revenue growth side, sales and marketing are focused on closing deals, adding new customers, and increasing revenue for the company. They are driven by their ARR and customer growth goals. Marketing seeks to funnel potential customers into the sales pipeline and turn a user into a customer as quickly as possible, though it can take months if not longer in some cases. Marketing is all about awareness (top of the funnel) and lead generation (middle of the funnel), while sales is tasked with converting leads into customers.

On the community growth side, the DevRel, community, and even marketing teams are all trying to bring awareness and usher people into the community. They work or interface with code contributors to improve the project by adding features, eliminating bugs, and extending the capabilities of the underlying software. Many community teams measure their performance based on contributor growth, Slack activity increases, etc.  

DevRel folks are usually charged with driving awareness as part of their role. It’s their job to make people aware of the product, that it is awesome, and to get them to try it out! They do this by releasing a continual stream of content, including talks, docs, and tutorials. Although DevRel gives users the tools to be successful, I have not seen a DevRel person in the open source community specifically charged with (or evaluated on) a specific user growth target. If the responsibility for user base growth were to live anywhere, however, it would fit best under this team.   

Part of owning the user base is ensuring that people stay users and don’t churn. While great content and docs can help keep users in the ecosystem, DevRel historically has not assumed responsibility for maintaining the user base. That said, some people who are developer relations engineers end up acting as free community support, but as commercialization efforts ramp up, support-type activities shift over to a paid-for offering, so community involvement in that regard gets neglected or understaffed. The product team could pick up the slack, but in my experience most product teams end up focused on the commercial offerings as well.  

Consequently, few look at, plan for, and measure actual open source users. 

Measuring the growth of the open source user base is hard

The second theory is that it's hard to measure. In the last five years, we have developed tons of tools to help measure the community, marketing, and sales cycles but very few to focus on the actual open source user base. Measuring downloads, installs, and actual usage of the product requires a degree of data sharing that companies have not been willing to explore or involves technicalities very difficult to overcome. I’ve broken down some of these roadblocks below.
Downloads

When looking at who downloads your software, the first challenge is understanding where they originate. Users can grab software from dozens of places:

  • Container repos such as Docker Hub, Container Registry from Google Cloud, Red Hat Quay, Amazon Elastic Container Registry, and Azure Container Registry  
  • Package managers such as Nix, Homebrew, RPM, and APT
  • Language specific package managers such as pip and npm 
  • Direct downloads
  • Source (GitHub or GitLab) 

Consolidating these metrics in an organized way requires planning and some sort of central control plane or gateway (like we have at Scarf) to aggregate and track.   

Installs

If you can get download numbers, you can infer some number of installs, but a certain percentage of downloads are going to be upgrades or automated pulls from CI/CD pipelines. This is where call-home metrics or install telemetry would be the most helpful. See more details in the following section on active usage.
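To make the inference concrete, here is a small Python sketch of one possible heuristic: skip downloads that look automated, and count each origin only once so that upgrades are not counted as new installs. The user-agent markers, the `origin` field, and the sample data are assumptions made for illustration, and real traffic is far messier than a substring match.

```python
# Hypothetical markers for automated traffic; not an exhaustive or official list.
AUTOMATION_MARKERS = ("github-actions", "jenkins", "gitlab-runner", "bot")

# Hypothetical download log; "origin" stands in for an anonymized source identifier.
downloads = [
    {"origin": "org-a", "user_agent": "pip/23.0"},
    {"origin": "org-a", "user_agent": "pip/23.1"},  # same origin again: likely an upgrade
    {"origin": "org-b", "user_agent": "docker/24.0"},
    {"origin": "org-c", "user_agent": "GitHub-Actions runner"},
    {"origin": "org-d", "user_agent": "Jenkins/2.4"},
]


def estimate_new_installs(download_log):
    """Estimate installs by skipping automated and repeat downloads."""
    seen_origins = set()
    new_installs = 0
    for download in download_log:
        agent = download["user_agent"].lower()
        if any(marker in agent for marker in AUTOMATION_MARKERS):
            continue  # likely CI/CD or other automation
        if download["origin"] in seen_origins:
            continue  # likely an upgrade or repeat download
        seen_origins.add(download["origin"])
        new_installs += 1
    return new_installs


print(estimate_new_installs(downloads))
```

A heuristic like this only narrows the gap between downloads and installs; it does not replace install telemetry.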

Active usage

To understand who actively uses your project over an extended period of time, either people need to volunteer the information or you need to provide some sort of call-home or ping-back functionality. A few projects over the years have tried this and failed when concerns over privacy and transparency emerged. You can infer users by looking at aspects such as distinct companies opening tickets and engaging in the community, but in my experience, these active community users are about 5–10% at most of the real user base. The number of companies downloading updates on a regular basis gets you a more realistic count, but it is still underreported (about 80% accurate). The best solutions here are opt-in and, at the very least, offer something that the user can benefit from in return (e.g., checks for common vulnerabilities and exposures).
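An opt-in call-home can be very small. The Python sketch below only builds the URL for a version-check ping, and only when the user has explicitly opted in. The endpoint, the environment variable name, and the payload are hypothetical placeholders, not any particular vendor's API.

```python
import os
from urllib.parse import urlencode

# Hypothetical endpoint and opt-in switch, named here for illustration only.
PING_BASE_URL = "https://telemetry.example.com/ping"
OPT_IN_ENV_VAR = "MYPROJECT_TELEMETRY"


def build_ping_url(version, env=None):
    """Return the version-check URL, or None unless the user has opted in."""
    env = os.environ if env is None else env
    if env.get(OPT_IN_ENV_VAR, "").lower() not in ("1", "true", "yes"):
        return None  # telemetry stays off by default
    # Send only the running version: no hostnames, paths, or user identifiers.
    return f"{PING_BASE_URL}?{urlencode({'version': version})}"


print(build_ping_url("2.1.0", env={}))
print(build_ping_url("2.1.0", env={OPT_IN_ENV_VAR: "1"}))
```

Keeping the payload to a version number makes it easy to document exactly what is sent, and the response can carry something useful back to the user, such as a notice that their version has a known vulnerability.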

It’s boring

Another challenge in measuring the growth of the user base comes from the nature of the beast. Sometimes things that are hard to measure come to seem uninteresting or unnecessary, when in reality they are just hard. After all, which story sounds better?

  1. Over 10,000 users worldwide love and trust our software
  2. Thousands of contributors worldwide use our software and make it better, plus we have over 5,000 active members in the community and over 15,000 stars on GitHub

Let’s be honest: having people who love and care about your product enough not only to use it but also to contribute to it is a powerful and wonderful story to tell. It is also necessary. You need to understand how many of your users are contributors, and how many of your customers are contributors. You need all three groups (users, contributors, and customers) to be healthy.

It’s a blind spot

In our digital age, it can be hard to believe that information on any topic is scarce. In truth, there is a lot of information about working with the open source community and about how projects run themselves. There is also a great deal about how to track and grow revenue, but there is very little on growing the free user base without tying it to revenue or contributor growth.

As you can see, these theories highlight obstacles that may explain why the user base is not measured as often as it ought to be. I’ve always been of the mind that the degree of skill required to address the challenges involved reflects the degree to which this metric is worth pursuing.

Final thoughts

Given how many reasons we have to measure production usage and gather data on our user base, it’s a wonder we don’t see it happening more. The majority of us already track community engagement and/or revenue, but a third, equally important layer exists, and that is the user base. Obtaining those numbers is certainly no small task, but doing so is the common thread across successful open source projects and COSS companies. When you are looking for meaningful growth in your open source project, make sure you are looking at your open source user base.

If you don’t know who owns the growth of your open source user base and is responsible for reporting it every month, then you may want to rethink your metrics strategy.

If you’d rather not do that alone, sign up for Scarf and we’ll be more than happy to help you get started.


Learn how Nestybox used Scarf to gather better project insights and provide accurate data during their recent acquisition.
A Different Approach to Measuring Open Source Community Health

A Different Approach to Measuring Open Source Community Health

Community is important to the success of open source software. To understand and grow a community, project founders and maintainers need visibility into various technical, social, and even financial metrics. But what metrics should we be using?
Scarf Tech Stack: Relude

Scarf Tech Stack: Relude

This blog post will talk about Relude, a project we use in the majority of our Scarf tech stack
Python Wheels vs Eggs (And How Data-Driven Decisions Must Become The Norm in Open-Source)

Python Wheels vs Eggs (And How Data-Driven Decisions Must Become The Norm in Open-Source)

Should Python eggs be deprecated in favor of wheels? What does the data show? This post explores how the right data can make decisions like this easier for maintainers and Open Source organizations.
Changelog: Company Identification Change

Changelog: Company Identification Change

Announcing a new change to the way we identify companies.
Announcing Python Support

Announcing Python Support

Advanced registry analytics are now available for Python package maintainers
Project Spotlight: Scarf Gateway Stats

Project Spotlight: Scarf Gateway Stats

This Project Spotlight will focus on another exciting open source project, Scarf Gateway Stats.
Scarf Will Block Package Downloads from the Russian Government

Scarf Will Block Package Downloads from the Russian Government

In solidarity with Ukraine, Scarf Gateway will no longer service package downloads from Russian Government sources.
Changelog: New Pixel Snippet

Changelog: New Pixel Snippet

A notice to our Documentation Insights users.
Community Spotlight: nix-community

Community Spotlight: nix-community

This is the second post in a new series from Scarf: Spotlights where we highlight awesome projects and communities.
Changelog: Registry Validation for Auto-package Creation

Changelog: Registry Validation for Auto-package Creation

A summary of the new registry validation feature for auto-package creation.
Three Ways to Build Better Products Through Analytics

Three Ways to Build Better Products Through Analytics

A special guest post from open-source analytics company PostHog
New Year, New Scarf Features

New Year, New Scarf Features

Today, we're launching some of the most frequently asked for features since we launched Scarf Gateway back in March.
The Scarf Tech Stack

The Scarf Tech Stack

How Scarf is built
OSS Project Spotlight: IHP

OSS Project Spotlight: IHP

In a new blog post series, we'll highlight great OSS projects that are using Scarf. Today, we are featuring IHP, a modern batteries-included Haskell web framework
Measuring Downloads of Anything You Distribute

Measuring Downloads of Anything You Distribute

Scarf's core registry infrastructure has leveled up to support any kind of direct file download
Announcing Nomia and the Scarf Environment Manager

Announcing Nomia and the Scarf Environment Manager

Our mission here at Scarf centers around enhancing the connections between open source software maintainers and end users. Learn how Scarf + Nomia can reduce the complexity and increase the efficiency of the end-user open source integration experience.
Announcing The Scarf Gateway

Announcing The Scarf Gateway

Understand how your containers are downloaded and decouple your project from your registry
Composition with Semantically Rich Names

Composition with Semantically Rich Names

Insights from recent developments in name-based composition
Shea Levy, Composition Fanatic

Shea Levy, Composition Fanatic

Introducing Shea, Scarf's new VP of Engineering
Are Package Registries Holding Open-Source Hostage?

Are Package Registries Holding Open-Source Hostage?

Package registries are a central piece of infrastructure for software development. How aligned are they with the developers who make all of the packages being hosted?
Analytics and Open Source Sustainability

Analytics and Open Source Sustainability

Analytics will be an important part of improving sustainability for open-source maintainers
Scarf Insights Page: Understand Your OSS Project Usage with Scarf Metrics

Scarf Insights Page: Understand Your OSS Project Usage with Scarf Metrics

Discover the importance of key metrics in assessing the health and growth of your open source project
Understanding Open Source User Adoption Funnel Stages with Scarf

Understanding Open Source User Adoption Funnel Stages with Scarf

Scarf open source adoption funnel stages allow you to better understand and qualify the user journey with open source software.
3 Methods to Collect Data with Scarf
September 10, 2024

3 Methods to Collect Data with Scarf

Scarf helps you unlock the full potential of your open source project by collecting valuable usage data in three key ways: Scarf Packages, in-app telemetry, and tracking pixels. In this post, we’ll break down each of these powerful tools and show you how to use them to optimize your open source strategy.
Sara Dornsife
Sara Dornsife
Scarf Newsletter - August 2024
August 28, 2024

Scarf Newsletter - August 2024

Stay up to date with the latest updates from Scarf. Discover upcoming features, industry news, partnerships, and events. August 2024 Newsletter.
Scarf
Scarf
How Apache Superset Implemented Scarf
August 22, 2024

How Apache Superset Implemented Scarf

In this playbook, you’ll learn how to integrate Scarf into an Apache Software Foundation project. It details how the Preset team implemented Scarf in their Apache Superset project, as shared during our first-ever Scarf Summit on July 16th, 2024.
Scarf
Scarf

The Most Neglected and Overlooked Open Source Metric: Production Users

After speaking with hundreds of people in various roles working on large open source projects or at commercial open source software (COSS) companies, I have found that their perspectives on measuring growth and success vary a great deal. In fact, I am surprised that some people zero in on certain metrics and ignore many others. The one commonality is that they often fall short of grasping a complete picture. For instance, one executive may be hyper focused on the number of contributors, while another at a similar-sized company may be hyper focused on the number of customers. 

Managers, executives, maintainers, and VCs tend to gravitate towards community and revenue metrics but most frequently overlook numbers concerning their proper user base—the fundamental building block that ties everything together and completes all other open source metrics. Production usage supplies the central lens by which we should view all other metrics. Without it, we’d simply have standalone metrics that don’t tell us anything concrete about the actual business or business potential. Ironically, production usage is the metric that open source maintainers and business owners miss the most. I’ll proceed to cover how you can avoid this prevalent pitfall and instead take measures toward optimizing your metrics based on production users.

4 types of open source projects or companies

Certain folks are more prone to a certain outlook on metrics, so it’s best to first identify which group applies to you. People in open source tend to fall into four main categories.  

1.) Results, not metrics 

About 10% of the people I talk to say that their user base is growing, but they don’t track metrics regularly or specifically. The philosophy of “we do good for the community, and they do good for us” or “we trust that it will work out” is alive and well. This group is mostly made up of projects (some small and some very large), but, surprisingly, a couple of companies hold this philosophy as well.

I personally am worried about this group. While many good people and projects fall into this category, without looking at metrics you can’t measure the effectiveness of your activities, make adjustments to improve results, or review whether those activities achieved the desired results.

2.) Community above all else 

A little more than 50% of those I talk to are focused almost exclusively on the community. A lot of pre-seed or early seed companies fall into this category. Additionally, most pure open source project teams focus their energy on these metrics. A small but not insignificant number of investors also fall into this group.

This group cares most about metrics such as:

  • Slack users
  • GitHub stars 
  • Overall contributors

Note, these metrics are the most common, but this group values other popular metrics too (PRs opened and closed, issues opened, number of users on their forum, number of times a project is cloned, etc.), just on an inconsistent basis.  

3.) Community plus revenue

About 33–35% of those I have chatted with fall into this group. This group is all about commercial open source. Their number one goal typically is revenue. The larger the company, the less their executives tend to focus on the community and more on the top-line numbers. Still, there are usually teams at these companies who do care greatly about the community metrics.

This group focuses on metrics such as:

  • Slack users
  • GitHub stars
  • Overall contributors
  • Number of customers
  • Annual recurring revenue (ARR)

Note that COSS companies also look at churn, customer acquisition costs, etc.

4.) All metrics count

Fewer than I would have hoped—about 5%—emphasize a more holistic metrics focus. VCs, COSS companies, and a few maintainers talk about these metrics regularly. 

This group builds on community and commercial metrics but adds usage data:

  • Slack users
  • GitHub stars 
  • Overall contributors
  • Downloads and Docker pulls
  • Number of users (enterprise vs. not)
  • Number of customers (enterprise vs. not)
  • ARR

Note that people will flip these metrics in and out. Maintainers, for example, don’t usually focus on customers but some do look at the growing user base. 

Which group do you think applies to you?

Missing the full picture?

If you’re part of the 95% who don’t identify with the “all metrics count” group, then it’s likely that you’ve yet to capture key insights that could be game changing. All metrics show you something different, but they build on top of one another, meaning that seeing an uptick or slowdown in one area could lead to a chain reaction.

Normally, COSS customers follow this type of path:

Step 1: A healthy and happy community (contributors, repo activity, and an active Slack or forums)

Leads to

Step 2: A healthy user base (people who download, install, and are ongoing users)

Which leads to

Step 3: Paying customers (better ARR and more enterprise penetration)

The spread across the types of open source projects and companies out there reveals that people seem most focused on step 1 (the healthy community) and step 3 (paying customers) but ignore or minimize step 2 (a healthy open source user base). Companies with a SaaS offering can track users on their free tier easily and with few resources, so the problem of tracking the user base looks a little different in their case. So why aren’t there more people who care about community, commercial, and usage metrics together, particularly for on-premises software?

Even if you are focused exclusively on either the community or sales, there is a direct correlation between a healthy, growing user base and a growing community or growing customer base.

Why should you care about measuring your open source user base?


Growing the user base enables not only more customers, but also generates more potential contributors. In my experience, many projects’ most valuable contributors were first users of the product who either loved the project so much that they wanted to help, or they found something that the software simply could not do and wanted to contribute.

A massively awesome community with very little in the way of users is a recipe for slow customer growth. To counter that, let’s examine why user growth deserves to be watched more closely.

Get ahead of problems

No matter if you are a startup or a Fortune 500 company, unexpected fluctuations in revenue are problematic. If you see fewer deals in your pipeline, you sound the alarm. In the community space, if you see fewer contributions or less engagement in Slack, you sound the alarm and strategize how to fix it. Similarly, if you see fewer downloads, installs, or active production installs of your software, that is a huge red flag. The worst time to fix a revenue problem or a customer problem is after you have lost a customer. You want to get ahead of that, and monitoring user growth gives you a much faster read on your trajectory, serving as a leading rather than a lagging indicator.
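
As a rough illustration of treating downloads as an early warning signal, here is a minimal Python sketch. It assumes you already have weekly download counts in a list; the function name, the four-week window, the 20% threshold, and the sample numbers are all hypothetical choices made for this example, not recommendations.

```python
from statistics import mean


def download_drop_alert(weekly_downloads, window=4, threshold=0.2):
    """Return True if the latest week is more than `threshold` (as a
    fraction) below the average of the preceding `window` weeks."""
    if len(weekly_downloads) < window + 1:
        return False  # not enough history to compare against
    baseline = mean(weekly_downloads[-(window + 1):-1])
    latest = weekly_downloads[-1]
    if baseline == 0:
        return False  # avoid dividing by zero on a brand-new project
    return (baseline - latest) / baseline > threshold


# Hypothetical weekly download counts, invented for illustration only.
history = [1200, 1250, 1180, 1220, 900]
print(download_drop_alert(history))
```

A check like this could run alongside pipeline and community reports, so that a dip in usage gets the same attention as a dip in deals.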

Optimize your sales process and customer journey

How many open source users become paying customers? Do you know where they come from? Why did they first download? What industries most use your software? How many installs happen at each company? 

These are questions that can help shape the sales process, onboarding, and your customer’s overall journey. Having access to details like these enables you to build a plan, optimize, and improve conversion rates.

Make your DevRel and community activities more efficient

Being able to understand and help your user base can make all of your community and DevRel initiatives run smoother and more efficiently. For example, take the following map:

People tend to focus on contributors or customers when looking at decisions about where to have events. This may overlook a massive user base elsewhere that just needs a bit of attention to grow faster.
  • If we are going to do a hackathon or a contributor-focused meetup, either China or the western U.S. seems ideal. 
  • We have a lot of users but not a lot of customers in Australia. Maybe a conference there could help? We would get more attendance there than at a user conference in China.  
  • Maybe we should look to hire or provide more local language support to contributors in China or users in Germany by translating all of our docs into Chinese and German.
  • Given that most of our customers are in the UK or Germany and we also have a large number of users in those locations, those places seem prime for field marketing events.  

User data allows you to draw these conclusions, but the benefits go beyond events. With the proper data setup, you can see which pages on the website lead to downloads, the most popular package managers, and embeds by third-party tools (partner activities, anyone?), or even tie downloads to actual content! All of this information points to who your users are and how they use your product. Such data prevents you from making the wrong assumptions.

If you really want to take user and usage data a step further, correlating those metrics with content metrics gives you a sense of what users are interested in or looking for moving forward.

In this example, blogs 2 and 3 would receive more traffic than blog 1, but blog 1 drives more people to download and try your software. Depending on your goal (in this scenario, conversion vs. awareness), you may want to drill down into the habits and demographic data of those who consumed certain blog content in an effort to better cater to their needs. You might treat blog 1 as content for warmer leads and look at how that group interacted with it to gain more insight into what they respond to and how to meet them where they are. Without that connection, you can spend a lot of energy on activities that do not move the needle.
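
To make the traffic-versus-conversion comparison concrete, here is a small Python sketch. The view and download counts are invented purely for illustration and the function and key names are hypothetical; the only assumption is that you can attribute downloads back to the post that referred them.

```python
def download_conversion(posts):
    """Map each post name to downloads per view (0.0 when there are no views)."""
    return {
        name: (stats["downloads"] / stats["views"] if stats["views"] else 0.0)
        for name, stats in posts.items()
    }


# Hypothetical numbers, invented for illustration only.
posts = {
    "blog 1": {"views": 1000, "downloads": 150},
    "blog 2": {"views": 5000, "downloads": 100},
    "blog 3": {"views": 4000, "downloads": 60},
}

rates = download_conversion(posts)

# Ranking by traffic and ranking by conversion can disagree.
by_traffic = sorted(posts, key=lambda name: posts[name]["views"], reverse=True)
by_conversion = sorted(rates, key=rates.get, reverse=True)

print("By traffic:   ", by_traffic)
print("By conversion:", by_conversion)
```

With these made-up figures, the post with the least traffic ranks first on conversion, which is the kind of mismatch that only shows up once content and download data are joined.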

Encounter more investment and funding options

Of all reasons, this one is a matter of survival. In today’s market, more data is a must for investors. Recently, Cowboy Ventures surveyed investors on what they look for in early seed rounds, and the top metric was production users (not customers, not Slack, not GitHub). The second was users in larger companies and then community metrics (Slack, GitHub issues, and stars).

If you were investing in a new company, which set of metrics would you prefer to see?

Why do we minimize or in some cases outright ignore tracking real open source user base metrics?

I have a few theories. 

No one owns your open source user growth targets

The first theory is simple. There are separate and dedicated teams focused on community growth and revenue growth but not user base growth.

On the revenue growth side, sales and marketing are focused on closing deals, adding new customers, and increasing revenue for the company. They are driven by their ARR and customer growth goals. Marketing seeks to funnel potential customers into the sales pipeline and turn a user into a customer as quickly as possible, though it can take months if not longer in some cases. Marketing is all about awareness (top of the funnel) and lead generation (middle of the funnel), while sales is tasked with converting leads into customers.

On the community growth side, the DevRel, community, and even marketing teams are all trying to bring awareness and usher people into the community. They work or interface with code contributors to improve the project by adding features, eliminating bugs, and extending the capabilities of the underlying software. Many community teams measure their performance based on contributor growth, Slack activity increases, etc.  

DevRel folks are usually charged with driving awareness as part of their role. It’s their job to make people aware of the product, that it is awesome, and to get them to try it out! They do this by releasing a continual stream of content, including talks, docs, and tutorials. Although DevRel gives users the tools to be successful, I have not seen a DevRel person in the open source community specifically charged with (or evaluated on) a specific user growth target. If the responsibility for user base growth were to live anywhere, however, it would fit best under this team.   

Part of owning the user base is ensuring that people stay users and don’t churn. While great content and docs can help keep users in the ecosystem, DevRel historically has not assumed responsibility for maintaining the user base. That said, some people who are developer relations engineers end up acting as free community support, but as commercialization efforts ramp up, support-type activities shift over to a paid-for offering, so community involvement in that regard gets neglected or understaffed. The product team could pick up the slack, but in my experience most product teams end up focused on the commercial offerings as well.  

Consequently, few look at, plan for, and measure actual open source users. 

Measuring the growth of the open source user base is hard

The second theory is that it's hard to measure. In the last five years, we have developed tons of tools to help measure the community, marketing, and sales cycles but very few that focus on the actual open source user base. Measuring downloads, installs, and actual usage of the product either requires a degree of data sharing that companies have not been willing to explore or involves technical hurdles that are very difficult to overcome. I’ve broken down some of these roadblocks below.

Downloads

When looking at who downloads your software, the first challenge is understanding where they originate. Users can grab software from dozens of places:

  • Container repos such as Docker Hub, Container Registry from Google Cloud, Red Hat Quay, Amazon Elastic Container Registry, and Azure Container Registry  
  • Package managers such as Nix, Homebrew, RPM, and APT
  • Language specific package managers such as pip and npm 
  • Direct downloads
  • Source (GitHub or GitLab) 

Consolidating these metrics in an organized way requires planning and some sort of central control plane or gateway (like we have at Scarf) to aggregate and track.   
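
As a minimal sketch of what consolidation means in practice, the Python snippet below sums download counts that have already been exported from several channels. The record layout, channel labels, package names, and counts are all hypothetical; pulling the numbers out of each registry is the hard part and is not shown here.

```python
from collections import Counter


def consolidate_downloads(records):
    """Sum download counts per channel and overall.

    Each record is a (channel, package, count) tuple.
    """
    per_channel = Counter()
    for channel, _package, count in records:
        per_channel[channel] += count
    return dict(per_channel), sum(per_channel.values())


# Hypothetical export rows, invented for illustration only.
records = [
    ("docker", "myproject", 5300),
    ("pip", "myproject", 2100),
    ("homebrew", "myproject", 400),
    ("direct", "myproject.tar.gz", 250),
    ("docker", "myproject-worker", 700),
]

per_channel, total = consolidate_downloads(records)
print(per_channel)
print(total)
```

Keeping the per-channel breakdown next to the total matters, because a shift from one channel to another can hide inside a flat overall number.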

Installs

If you can get download numbers, you can infer some number of installs, but a certain percentage of downloads are going to be upgrades or automation as part of CI/CD pipelines. This is where call home metrics or install telemetry would be the most helpful. See more details in the following section on active usage.
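
One rough way to move from downloads toward installs is to discard repeat pulls and anything that looks automated. The Python sketch below does that with a deliberately naive heuristic. The event fields, the client identifiers, and the list of automation markers are assumptions made for this example, and a real pipeline would need far more careful classification.

```python
# Substrings treated as signs of automation. This is an assumed,
# incomplete list used only for illustration.
AUTOMATION_MARKERS = ("github-actions", "jenkins", "gitlab-runner")


def estimate_new_installs(events):
    """Count distinct clients whose first download did not look automated.

    `events` is an iterable of dicts with hypothetical keys "client" and
    "user_agent", in chronological order. Later downloads by a client that
    was already seen are treated as upgrades or repeat pulls.
    """
    seen = set()
    installs = 0
    for event in events:
        client = event["client"]
        if client in seen:
            continue  # likely an upgrade or repeat pull
        seen.add(client)
        agent = event["user_agent"].lower()
        if any(marker in agent for marker in AUTOMATION_MARKERS):
            continue  # likely a CI/CD pipeline
        installs += 1
    return installs


# Hypothetical download events, invented for illustration only.
events = [
    {"client": "a", "user_agent": "docker/24.0"},
    {"client": "a", "user_agent": "docker/24.0"},
    {"client": "b", "user_agent": "GitHub-Actions runner"},
    {"client": "c", "user_agent": "pip/23.1"},
    {"client": "b", "user_agent": "GitHub-Actions runner"},
]
print(estimate_new_installs(events))
```

An estimate like this is still an inference from download logs, which is why install telemetry gives a more direct answer.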

Active usage

To understand who actively uses your project over an extended period of time, either people need to volunteer the information or you need to provide some sort of call home or ping back functionality. A few projects over the years have tried this and failed when concerns over privacy and transparency emerged. You can infer users by looking at signals such as distinct companies opening up tickets and engaging in the community, but in my experience, these active community users are about 5–10% at most of the real user base. The number of companies downloading updates on a regular basis gets you a more realistic count, but it still underreports (about 80% accurate). The best solutions here are opt-in and, at the very least, offer something that the user can benefit from (e.g., checks for common vulnerabilities and exposures).
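
To show what an opt-in ping could look like, here is a minimal Python sketch that only builds the payload and never sends it. The environment variable name `MYPROJECT_TELEMETRY`, the payload fields, and the function name are hypothetical; this is an illustration of the opt-in principle, not a description of how any particular project or Scarf implements telemetry.

```python
import json
import os
import platform


def build_ping(version, install_id, env=None):
    """Return a JSON ping string if the user opted in, otherwise None."""
    env = os.environ if env is None else env
    if env.get("MYPROJECT_TELEMETRY", "").lower() != "on":
        return None  # opt-in: nothing is reported by default
    payload = {
        "install_id": install_id,  # a random ID, not tied to a person
        "version": version,
        "os": platform.system(),
    }
    return json.dumps(payload, sort_keys=True)


# Without the opt-in flag, no payload is produced at all.
print(build_ping("1.2.0", "example-install-id", env={}))

# With the flag set, a small and inspectable payload is built.
print(build_ping("1.2.0", "example-install-id", env={"MYPROJECT_TELEMETRY": "on"}))
```

Keeping the payload small and easy to inspect speaks directly to the privacy and transparency concerns that have sunk earlier attempts.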

It’s boring

Another challenge that makes measuring growth of the user base hard comes from the nature of the beast. Sometimes things that are hard to do come to seem uninteresting or unnecessary, when in reality they are just hard. After all, which story sounds better?

  1. Over 10,000 users worldwide love and trust our software
  2. Thousands of contributors worldwide use our software and make it better, plus we have over 5,000 active members in the community and over 15,000 stars on GitHub

Let’s be honest—having people who love and care about your product enough not only to use but also contribute to it is a powerful and wonderful story to tell. It is also necessary. You need to understand how many of your users are contributors, and how many of your customers are contributors. You need all three to be healthy.

It’s a blind spot

In our digital age, it can be hard to believe that there are topics without enough accessible information out there. In truth, there is a lot of information about working with the open source community and about projects running themselves. There is also a great deal about how to track and grow revenue, but there is in fact very little on growing the free user base without tying it to revenue or contributor growth.

As you can see, these theories highlight obstacles that may explain why the user base is not measured as often as it ought to be. I’ve always been of the mind that the degree of skill required to address the challenges involved reflects the degree to which this metric is worth pursuing.

Final thoughts

Given how many reasons there are to measure production usage and gather data on our user base, it’s a wonder that we don’t see it happening more. The majority of us already track community engagement and/or revenue, but a third, equally important layer exists, and that is the user base. Obtaining those numbers is certainly no small task, but doing so is a common thread across successful open source projects and COSS companies. When you are looking for meaningful growth in your open source project, make sure you are looking at your open source user base.

If you don’t know who owns the growth of your open source user base and is responsible for reporting it every month, then you may want to rethink your metrics strategy.

If you’d rather not do that alone, sign up for Scarf and we’ll be more than happy to help you get started.