
How to: Using anonymous downloads, website traffic, and documentation views to generate leads


Navigating the world of open source software can often feel like trying to find a needle in a haystack when it comes to identifying potential leads and customers. We're up against several unique challenges that aren't typically seen in other industries. Firstly, we're competing against 'free' - a tough proposition in any business context. Secondly, the open-source nature of our software means that many of our users and their respective companies stay hidden behind the veil of anonymity, turning customer identification into a high-stakes game of hide and seek. And there are countless other hurdles that add to the complexity of this landscape. Despite these challenges, there's a wealth of untapped potential buried within anonymous download data, web traffic, and documentation views. Stick around as we unravel the mystery of transforming this sea of anonymous data into valuable company profiles, turning seemingly anonymous interactions into meaningful business opportunities.

Lack of Leads: This is not a new problem

Some time ago, I worked for a company called MySQL AB, the company that brought us MySQL.

The company, like many others offering commercial open source solutions, grappled with several challenges. A big one was figuring out who was using their software. After all, as an open source company, we were not only competing with 'free', we also had limited visibility into who was using our products since they were freely distributed through various anonymous channels.

The way MySQL handled this was by tapping into download and traffic data. It worked like this - our sales and marketing team would approach a company about the value of our support, to which they would respond, "We know MySQL, but we don't use your database (or any open source, for that matter)". That’s when we'd drop the bombshell: "Well, that's funny because our records show you downloaded it 1000 times last year." You can imagine the surprise on their faces! This led to some serious internal discussions and often revealed a significant usage of our software within their tech stack, unbeknownst to the decision-makers.

Fast forward 20 years, and we're still dealing with the same problems, but at an even faster pace. Open source usage has exploded, but managers are still not fully aware of the components and dependencies in their tech stack. And with so much software coming from anonymous channels, it's still a struggle for commercial open source companies to figure out who's using their software.

But hey, it's not all doom and gloom. Challenges also present opportunities, right? We just need to learn to engage with our user base, not just for sales conversations but to ensure they're using our open source software effectively and can deploy it successfully in production. So, let's explore together how to make the most of this situation in this data-driven world.

Important Activities You Should Track in Open Source:

There are 3 events I would suggest everyone track.  

  1. Downloads/Pull events
  2. Views of documentation
  3. Views of content/website (pages, blogs, tutorials)

The first is downloads, whether they come directly from your website, via a container registry, or via public repositories (Scarf allows you to track and aggregate downloads across all these different channels). This is probably the most valuable action. A download means someone has not only some interest in your product but enough interest to try it out.

There are three aspects to downloads which you should be paying attention to:

  • The number of downloads from unique sources at a company: more than one machine/source downloading is good.
  • The volume of downloads over a time period at the company: you want to see continued downloads over time, which implies ongoing usage.
  • Whether the company is downloading newer versions of the software over time: this is gold, as it implies that they are not only downloading but also trying to keep things up to date, and that the software is critical enough to have maintenance procedures around it.
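
As a rough illustration, the three aspects above can be summarized per company from raw download events. This is a minimal sketch, assuming you can export events as (company, source, version, day) records; the field layout and the sample data are hypothetical and not a Scarf export format.

```python
from collections import defaultdict
from datetime import date

# Hypothetical download events: (company, source_id, version, day).
# The layout and values are made up for illustration only.
events = [
    ("acme.example", "host-a", "1.0", date(2023, 1, 3)),
    ("acme.example", "host-b", "1.0", date(2023, 1, 20)),
    ("acme.example", "host-a", "1.1", date(2023, 3, 2)),
    ("solo.example", "host-z", "1.0", date(2023, 1, 5)),
]

def download_signals(events):
    """Summarize unique sources, volume over time, and versions per company."""
    by_company = defaultdict(list)
    for company, source, version, day in events:
        by_company[company].append((source, version, day))

    summary = {}
    for company, rows in by_company.items():
        days = [day for _, _, day in rows]
        summary[company] = {
            "unique_sources": len({source for source, _, _ in rows}),
            "total_downloads": len(rows),
            # Days between the first and last download seen.
            "active_span_days": (max(days) - min(days)).days,
            "distinct_versions": len({version for _, version, _ in rows}),
        }
    return summary

signals = download_signals(events)
```

A company with several sources, a long active span, and more than one version is the pattern the list above describes as most promising.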

The second on the list is documentation views. People using your software will often have questions about how to install, use, and upgrade the software. You will see patterns evolve over time in the usage of the software docs depending on the software. Initially you will see more traffic to installation and setup sections. This, coupled with download events, is a great indicator of testing or trying things out. Then users will evolve more into troubleshooting or optimization views; seeing more page views shift to these sections is normal. Then you should see views of readmes or upgrade pages as they settle into maintenance and sustain mode. Ultimately, I would be looking for views over an extended period of time to ensure they are invested and not just kicking the tires.

The third on the list is content/website views. Not all views will come from docs. In fact, for commercial purposes, certain pages on your website may be highly predictive of potential interest in becoming a customer (e.g. the pricing pages). I also recommend looking for ongoing views and traffic hitting blogs and other news on the product and upcoming releases.

For each of the events, I would recommend breaking down all the activities into either good/better/best or low/medium/high impact events. Here is a suggested list of criteria when it comes to classifying events:

| Event | GOOD | BETTER | BEST |
| --- | --- | --- | --- |
| Downloads | 1 or more downloads in a week. | More than 1 download over a 30-day period. | Multiple downloads over a 90-day period, including incremental downloads of new versions. |
| Documentation Views | Repeated views on installation and setup instructions. | Documentation views spanning more than 30 days from multiple sources. More than just install page views. | Documentation views spanning more than 90 days from multiple sources. Doc views on upgrades and maintenance procedures. |
| Website Traffic | Multiple pages visited and viewed by 1 company over a week period. | Multiple pages visited and viewed by 1 company over a 30-day period. Page views to medium-value content, e.g. reading technical blogs, visiting forum pages, product feature pages. | Multiple pages visited and viewed by 1 company over a 30-day period. Page views to high-value content, e.g. visiting the pricing pages, visiting but not signing up on the signup page, etc. |
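
The download criteria above could be encoded along these lines. This is only a sketch of one possible reading: it assumes each period is a trailing window ending on a chosen date, and that "new versions" means more than one distinct version in the window. The function name and input shape are hypothetical.

```python
from datetime import date, timedelta

def download_tier(downloads, as_of):
    """Return 'best', 'better', 'good', or None for one company.

    downloads: list of (day, version) tuples for that company.
    Windows are trailing windows ending at as_of (an assumption).
    """
    def within(days):
        start = as_of - timedelta(days=days)
        return [(d, v) for d, v in downloads if start < d <= as_of]

    last_90 = within(90)
    # Best: multiple downloads in 90 days that include more than one version.
    if len(last_90) > 1 and len({v for _, v in last_90}) > 1:
        return "best"
    # Better: more than one download in 30 days.
    if len(within(30)) > 1:
        return "better"
    # Good: at least one download in the last week.
    if len(within(7)) >= 1:
        return "good"
    return None
```

The same pattern extends to documentation views and website traffic by swapping in the relevant events and thresholds.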

The Riskiest But Most Valuable Metric: Ongoing Usage

While the three activities above are straightforward and generally not viewed with too much concern, there is a fourth activity or metric you can (and probably should) track: the use of 'call-home' functionality, also known as ongoing usage tracking. It is an essential, albeit controversial, activity that serves as a highly valuable metric for any organization seeking to understand the usage patterns of its software. The call-home functionality is a mechanism within your software that sends a signal, or a 'ping', back to a designated server or gateway. This signal provides you with real-time information about your software's usage in live production environments, surpassing the insight level gained from just tracking downloads.

While download data can indicate interest and repeated use of your software, the ongoing, consistent 'ping' or call-home activity serves as a much stronger predictor of your software's actual usage. Consider this the 'Nirvana' of metrics for your projects, the gold standard that allows you to measure the magnitude of your active install base and the frequency of software usage and deployment.

However, implementing this mechanism requires a degree of technical adaptation. Platforms like Scarf, for instance, offer this capability out-of-the-box. But to make full use of it, you'll need to adjust your application accordingly. There are different ways to accomplish this; for JavaScript applications, a package called 'Scarf-JS' can be used. Alternatively, your application can send a lightweight background 'ping' to a Scarf Gateway endpoint. This ping can be triggered when your application starts up, is used, or at any other specified event.

In essence, your application would asynchronously call back to the gateway website, which doesn't return any data but rather tracks that the application was active. If you can successfully implement this, you can then monitor the number of unique pings over a certain period from various sources. This is incredibly valuable for lead scoring as it provides consistent, ongoing proof of life from these systems, making it the most valuable event or activity you could track.
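
A fire-and-forget ping of this kind might look like the sketch below. It assumes a plain HTTP GET is enough to register activity; the URL is a placeholder, and `call_home` and its parameters are hypothetical names rather than part of any Scarf library.

```python
import threading
import urllib.request

# Hypothetical endpoint: replace with the URL you create in your own gateway.
PING_URL = "https://example.invalid/ping"

def call_home(url=PING_URL, opener=urllib.request.urlopen, timeout=2.0):
    """Send a best-effort ping on a daemon thread and return the thread."""
    def _ping():
        try:
            # The response body is irrelevant; the request itself is the signal.
            opener(url, timeout=timeout).close()
        except Exception:
            # Telemetry failures must never affect the application.
            pass

    thread = threading.Thread(target=_ping, daemon=True)
    thread.start()
    return thread
```

Calling `call_home()` once at startup returns immediately; the daemon thread and the short timeout keep a slow or unreachable endpoint from delaying the application. The `opener` parameter exists only so the function can be exercised without network access.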

Lead Scoring or User Scoring is Still Needed:

Not all people visiting your website and downloading your software are equally likely to become customers. In fact, you will find 3x, 5x, or even 10x more drive-by traffic than traffic from those interested in commercial offerings. To become efficient at finding which companies and users you should focus on, let's explore the concept of "lead scoring".

Lead scoring is a methodology used by sales and marketing departments to determine the worthiness of leads, or potential customers, by assigning values to them based on their behavior relating to their interest level in products or services. These values, or scores, are derived from a variety of factors like the professional information they've submitted, how they've engaged with the company's website, or their response to marketing efforts. The purpose of lead scoring is to prioritize leads who are more likely to convert into customers, allowing teams to focus their time and resources effectively. It's a vital part of creating an efficient sales and marketing strategy.

If you've already established a lead scoring system and are utilizing marketing software, consider events in open-source channels as additional data points to further qualify or uncover leads. For instance, a software download could be treated as a high-value (or high-score) activity, whereas a documentation view might be scored similarly to other website visits. It could be beneficial to categorize documentation and page views into high, medium, and low scoring pages, as certain pages (like pricing or install pages) can be more predictive and valuable than others.
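
To make this concrete, here is a minimal sketch of adding open-source events to an existing score. The point values and event names are hypothetical; they only illustrate treating a download as a high-score activity and splitting page views into high, medium, and low categories.

```python
# Hypothetical point values; tune them to your own funnel.
EVENT_POINTS = {
    "download": 20,
    "page_view_high": 10,    # e.g. pricing or install pages
    "page_view_medium": 5,   # e.g. technical blogs, feature pages
    "page_view_low": 1,      # everything else
}

def lead_score(existing_score, events):
    """Add open-source activity points to a score from your marketing tool."""
    # Unknown event kinds contribute nothing rather than raising an error.
    return existing_score + sum(EVENT_POINTS.get(kind, 0) for kind in events)

score = lead_score(15, ["download", "page_view_high", "page_view_low", "unknown"])
```

Here an existing score of 15 gains 31 points from one download, one high-value page view, and one low-value page view.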

The key distinction between traditional lead scoring and the incorporation of open-source download and traffic data lies in the summarization of data at the company level, requiring decisions on scoring criteria. Most marketing lead management tools track users based on sign-ups, cookies, or other mechanisms, capturing specifics such as Matt from Scarf signing up for a webinar. With data from anonymous sources, the best we can do is infer that someone from Scarf has downloaded your software.

The question then becomes: if you know Matt attended a webinar and works at Scarf, does the Scarf download make Matt a more qualified lead? Or should you shift your focus to other individuals at Scarf, possibly higher up in the management hierarchy? There's no absolute right or wrong answer, but my inclination would be to enrich the data of the known user who has already shown interest.
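
One way to act on that inclination is to match company-level download counts to known leads by email domain. The sketch below assumes you already have a mapping from company domain to download count; the `Lead` type, the bonus size, and the cap are all hypothetical choices.

```python
from dataclasses import dataclass

@dataclass
class Lead:
    name: str
    email: str
    score: int

def enrich_leads(leads, company_activity, bonus_per_download=5, cap=25):
    """Raise scores of known leads whose email domain shows download activity.

    company_activity: dict mapping a company domain to its download count.
    The bonus is capped so a burst of downloads cannot dominate the score.
    """
    enriched = []
    for lead in leads:
        domain = lead.email.rsplit("@", 1)[-1].lower()
        downloads = company_activity.get(domain, 0)
        bonus = min(downloads * bonus_per_download, cap)
        enriched.append(Lead(lead.name, lead.email, lead.score + bonus))
    return enriched
```

For example, a known lead at a domain with 12 recorded downloads would gain the capped bonus of 25 points, while a lead at a domain with no activity keeps the original score.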

Additionally, it's important to note that software downloads can often be automated. Seeing ten downloads a day doesn't necessarily equate to thousands of servers or the potential for a massive deal. This data needs to be scrutinized, at the very least, by examining the unique systems or origins from where these downloads originate.

Lastly, when incorporating open-source downloads and traffic data, the timeline of events becomes critical. A single download could mean anything, but consistent downloads over several months, especially with each new version release, suggest a real, potentially highly qualified user.

| Different Phases of Interest | Description | Events | Action |
| --- | --- | --- | --- |
| Passive interest: "Hello World" | Someone discovered or visited your website. They may or may not have any interest in your software or projects. | Web traffic to docs or websites over the course of 1 or 2 days. | I would not take any action here. |
| Intrigued in your software: "This looks interesting" | Someone takes more than a drive-by interest in your software. They are truly interested in what you have. | Documentation views. Looking at install docs and/or feature lists. Typically this is over multiple days. | I would consider promoting content to that company's target audience (engineers?) on other external channels. |
| Trial & Exploration: "Let me try this out" | They move from just learning about the software to actually downloading it. | Documentation and website views of high value pages along with at least 1 download event. You still see this traffic over multiple days but typically over a week or two. | I would recommend promoting blogs or how-tos that are interesting to this group of customers. You could even promote this content directly on your website when these visitors appear. |
| Testing & Evaluation: "I wonder if I can use this for this project" | Now someone is looking deeper into this and is starting to either use it or seriously consider it. | Sustained page views and multiple downloads over a month period. | Here is where additional content promotion is still a good idea, but where there is a strong commercial offering targeting these customers can be effective. |
| Implementation & Reliance: "This is cool, let's use this in production" | Someone is using this over a longer period of time and looks to be beyond merely testing/trying out. | If you see activities (both downloads and traffic) spread over a 90 day period, there is a high confidence in their usage in a critical space. | This is the best time to seek out conversations: cold outreach, targeted ads, seek out devs at conferences. |
| Maintenance & Ongoing Upkeep: "Keeping things updated and safe" | Someone has been using your software for months and is grabbing new versions of your software and reading readmes or regular updates (like blogs). | Look for activities over months (3-12 months), with downloads of multiple versions. Also look for views on readmes or product specific content (blogs, etc). | This is the best time to seek out conversations: cold outreach, targeted ads, seek out devs at conferences. |
| Waning Interest & Potential Churn: "Uh oh… this user is at risk" | Usage is dropping and there is risk that this user may turn from an active user to a former user. | If you see massive drop offs in traffic and downloads over a 30 day period this sends up red flags. | |
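
The phases above can be approximated with a few thresholds. The following is a sketch of one reading of the table, not a fixed rule set: the input summary (activity span in days, download count, distinct versions, and a drop-off flag) and every cutoff are assumptions you would tune.

```python
def interest_phase(span_days, downloads, versions, recent_drop=False):
    """Map a company's activity summary to a phase name.

    span_days: days between first and last observed activity.
    downloads: total download events; versions: distinct versions downloaded.
    recent_drop: True if activity fell sharply in the last 30 days.
    """
    # A sharp drop only matters for companies that were active for a while.
    if recent_drop and span_days >= 30:
        return "waning interest"
    if span_days >= 90 and versions > 1:
        return "maintenance"
    if span_days >= 90:
        return "implementation"
    if span_days >= 30 and downloads > 1:
        return "testing"
    if downloads >= 1:
        return "trial"
    if span_days > 2:
        return "intrigued"
    return "passive interest"
```

The order of the checks matters: later phases are tested first so that a long-running, multi-version user is not classified as a trial.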

So we identified interesting companies; now, what do you do with this data?

This section of the guide provides recommendations on how to utilize the data obtained from downloads, website traffic, and documentation usage to enhance product adoption and discover potential leads. Different strategies are outlined for integrating these insights into existing sales/marketing activities, developing a product-led growth strategy, and for startups or new sales/marketing initiatives.

Becoming a Customer is a Journey:

Becoming a customer is indeed a journey that mirrors the transformation of a budding interest into a commercial relationship. This journey begins with a spark of curiosity, driving an individual to explore and try out the software. As they interact with the software, they begin to craft something unique, leading to deployment in a production environment. This stage often uncovers additional needs that may call for a commercial relationship, such as expert support, advanced features, or scale-up capacities.

This entire process can undoubtedly unfold organically, but it can be significantly enhanced, made more fruitful, or even accelerated by tailoring the right activities towards a user or company at the appropriate time. Key players in facilitating this journey include Product Development, Marketing, Developer Relations (DevRel), Sales, and even the Community. They collectively orchestrate a symphony of support and guidance for the user, with each instrument playing a vital role at the right moment.

An aptly timed article, a resonant message, a well-crafted tutorial, or a stimulating community discussion can serve as powerful catalysts in this journey, greatly influencing the user's progression. However, it's crucial to maintain a delicate balance. Overzealous pushing or rushing can result in adverse consequences, creating resistance or disengagement rather than fostering advancement.

As such, it becomes paramount to possess a deep understanding of where a company or user is in their journey. The more detailed your insights into their progress, the more effectively you can tailor your efforts. Similarly, having robust metrics around what strategies are fruitful and which ones fall short is equally beneficial. These insights not only inform your current strategies but also help shape your future approaches, ensuring you continuously enhance your user's journey towards becoming a valuable customer.

When they face challenges or opportunities in their production environment, show them how you can help them succeed with your solutions.

General Advice:

In today's data-driven world, harnessing and leveraging the power of download data and website traffic information can yield impressive results for organizations of all sizes, from startups to established enterprises. However, effectively employing these data requires a strategic and tailored approach to meet the unique needs and goals of each organization. Below are some general recommendations based on the discussions above that can apply across the board:

  • Understand Your Audience: Use download data and website traffic information to build a deeper understanding of your audience. This involves analyzing who is downloading your software, viewing your documentation, and browsing your website. With this information, you can enrich your existing leads, score potential ones, and build a well-informed customer profile.
  • Customize Your Approach: Once you've gathered and analyzed your data, tailor your marketing and sales processes to align with your findings. Whether you're focusing on sales/marketing or product, align your strategies and activities with the preferences and behaviors of your users. This could involve adjusting lead scoring based on the activity level or nurturing potential users to become ongoing ones.
  • Integrate Data with Existing Processes: Integrate your new data with your existing sales, marketing, and customer success processes. For instance, using download patterns to assess the churn potential can help you anticipate and mitigate customer attrition.
  • Adopt a Nurturing Approach: When it comes to new or startup sales/marketing processes, take a nurturing approach. This means guiding users through a lifecycle where they are initially familiarized with your software, then nurtured to become regular users, and eventually led to become paid customers.
  • Leverage Social Media: Social media platforms offer targeted marketing opportunities. Platforms like LinkedIn allow you to aim your promoted content towards specific companies and job titles.
  • Optimize Content: Make use of your existing content and create new content based on where your users and companies are spending the most time. Calls-to-action (CTAs) on these pages can effectively guide users through your marketing funnel.
  • Community Engagement: Encourage users to join your community, participate in events, and engage in discussions. Community engagement can serve as a powerful tool for user retention and organic growth.
  • Monitor and Adapt: Regularly assess the effectiveness of your strategies and be willing to make necessary adjustments. The digital landscape is ever-evolving, and your strategies should be adaptable to accommodate these changes.

Remember, the overarching aim should be to use this data to deliver value to your users, nurture relationships, and ultimately drive the growth of your organization.

Integrating within Existing Sales/Marketing Activities:

Existing sales and marketing activities can be significantly enriched by smartly integrating download data and website traffic information. By revising your lead scoring methodology to include new data points such as software downloads and page visits, you can ensure that you are incorporating the latest indicators of interest from your audience. The enhanced lead scoring will provide a more nuanced understanding of your prospective customers, paving the way for more targeted and effective outreach.

Use the company lists generated from this data in your cold outreach activities. By focusing your outreach efforts on these companies, you are targeting organizations already demonstrating interest, thereby increasing your chances of gaining a receptive audience. These lists can also serve as a valuable resource for your Business Development Representatives (BDRs), equipping them with a list of vetted leads, saving time and improving their efficiency.

Additionally, using this data, you can strategically plan meetings at conferences, events, and similar networking platforms with representatives from companies using or showing interest in your product. This targeted networking can lead to higher-value interactions and ultimately result in stronger leads.

Incorporating the pattern of downloads into your customer success and renewal operations can provide a more comprehensive customer overview. Such insights into customer behavior can inform your renewal strategies, equipping you with necessary foresight to address potential issues and ensure customer satisfaction. Moreover, the data can be a key indicator of potential churn risks, allowing you to proactively manage customer retention by identifying and addressing their concerns before they choose to discontinue your service.
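
As an illustration of the churn signal, the sketch below compares a company's activity in the most recent window with the window before it. The 30-day window and the 50% threshold are assumptions, and `activity_drop` is a hypothetical helper.

```python
from datetime import date, timedelta

def activity_drop(event_days, as_of, window_days=30, threshold=0.5):
    """Return True if the latest window fell by at least `threshold`
    relative to the window immediately before it.

    event_days: dates of downloads or page views for one company.
    """
    recent_start = as_of - timedelta(days=window_days)
    prior_start = recent_start - timedelta(days=window_days)
    recent = sum(1 for d in event_days if recent_start < d <= as_of)
    prior = sum(1 for d in event_days if prior_start < d <= recent_start)
    if prior == 0:
        return False  # nothing to compare against
    return (prior - recent) / prior >= threshold
```

A customer whose events fall from ten in one window to two in the next would be flagged, which gives customer success a prompt to reach out before renewal.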

TLDR:

  • Use the data to enrich your existing set of leads. You can add additional events to your lead scoring process.
  • Use the data to build a highly qualified list for outreach activities. Target companies that are using your software or are interested in your software.
  • Use this data to inform your marketing strategies. For example, prioritize individuals from companies that have shown interest in your software at meetings, conferences, and events.
  • If you have a fully fleshed out sales, marketing, and customer success process, use the data to assess churn risk.

Startup or New Marketing/Sales Activities (Active Prospecting):

For startups or companies initiating new sales and marketing initiatives, creating a lightweight growth engine that nurtures potential users can be the key to driving growth. Setting up a lifecycle or nurture campaign can guide potential users through your marketing funnel, providing them with the right content at the right time to foster interest and engagement.

Promoted content can be a powerful tool in these campaigns. Aimed at users in the early stages of engagement, this content can educate users about your software, showcasing its features and benefits and encouraging them to explore it further. As these potential users turn into ongoing users, you can begin to introduce promoted content, offers, and cold outreach to convert them into paying customers.

Understanding the customer journey is crucial in a startup or new marketing environment. By mapping out this journey and identifying combinations of events and thresholds, you can strategize when to increase or decrease marketing activities for optimal effect. This dynamic approach can keep your marketing efforts agile and responsive to user behavior.

Social media platforms like LinkedIn offer a targeted way to reach specific companies.

TLDR:

  • Use this data to build a lightweight marketing and growth engine.
  • Approach the process as a life cycle or nurture type campaign. Nurture potential users until they become productive users.
  • Use promoted content targeted towards companies that are downloading or have looked at your documentation.
  • Once users are actively using your software, shift the focus to ongoing maintenance and new releases. Then, start introducing your paid offerings or services.
  • Use social media to engage potential users.
  • Integrate the Scarf platform into your existing community activity to help nurture and guide potential users.

Integrating into a Product-Led Growth Strategy:

In a product-led growth strategy, the primary focus is on expanding product usage, and insights from website traffic and download data can play a crucial role in driving this growth. You can target specific companies with promoted content on various channels, catching the attention of potential or current users and stimulating their interest in your software.

Educational resources such as blogs, tutorials, and videos offer a non-intrusive way to engage companies that are exploring your software. These resources can help prospective users understand the value your product offers and how it can address their needs, fostering trust and driving product adoption.

Networking can also play a pivotal role in a product-led growth strategy. You can seek out speakers and attendees from targeted companies at industry conferences and events, fostering relationships that can lead to future collaborations or customers.

To gain a holistic picture of user engagement and behavior, consider merging this download and website usage data into your community tools, such as Common Room. This integration will allow you to monitor how users interact with your product and community, providing insights that can help shape your product development and marketing strategies.

TLDR:

  • Use download data to understand product adoption and usage patterns.
  • Monitor decreasing downloads or decreasing activity as a potential indicator of churn.
  • Use the data to understand which stage of the company's life cycle the users are in. This will help inform product development and roadmap decisions.
  • Integrate with existing community tools to build a complete picture of potential users -> users -> community members.
  • Use CTAs (calls to action) such as "join our community" to convert anonymous users into known ones.

Using Scarf:

Introducing Scarf to your Community:

When adding Scarf to your website or as part of your deployment strategy you may get questions from users.

Here is some basic information about Scarf that others have found useful in discussing with their users when asked about using Scarf:

  • Scarf is used by thousands of projects to collect analytics for package downloads, documentation views, and website traffic.
  • Scarf is fully GDPR compliant and ensures PII is protected.
  • Scarf has passed the privacy, compliance, and legal requirements to be approved by open source foundations like the Apache Foundation:
    - https://privacy.apache.org/policies/privacy-policy-public.html
    - https://privacy.apache.org/faq/committers.html
  • Scarf provides cookie-less, privacy-focused website and documentation analytics.
  • Scarf stores only the bare minimum metadata needed to collect and aggregate analytics data for our users.

Scarf also provides your users with other benefits:

  • Your downloads are no longer locked to a single hosting provider or service. As services (such as container registries or package managers) change their terms of service or make changes to their offerings, you can adjust your hosting without changing your docs or impacting your users in the future.
  • Scarf can be used to determine how exposed your user base is to old or insecure software, enabling your project to take a proactive approach to informing and educating your user base of potential issues 
  • Improves the sustainability of your project by providing data on the real user base to investors (without exposing PII).  

Setting up Scarf:

Scarf is very straightforward to get started with.  

Overall the process is:

1. Sign up for a free account at: http://app.scarf.sh/register

2. For downloads,

     a. Set up a new package URL via the Scarf Gateway within your Scarf Dashboard.

     b. Point this URL to your current download endpoints.

     c. Update installation and setup documentation to direct users to use the gateway.

3. For Documentation or website tracking:

     a. Create a Scarf Tracking Pixel and add it to the pages you want analytics for (whether on your site or on third party sites).

4. For Link Tracking and social monitoring:

     a. Create a new URL in the Scarf Gateway as a redirect/link shortener to your website, Youtube, Hacker News, or other sites.  

     b. When posting links on social media use the new URL instead of the main link.  Data will then be available in the Scarf dashboard.

5. For Basic Call Home functionality:

     a. Create a basic URL in Scarf Gateway that will act as an endpoint for your applications to ping.

     b. Point the URL to a blank page.

     c. In your software, issue an async web call, ping, or page load using your favorite tool (e.g. curl or libcurl). You can call this on start, daily, or every time something runs; it is up to you. You can throw away the result, since the mere background call to open the URL is enough.
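
If you choose the daily option in step 5, the application needs some way to remember when it last pinged. The sketch below keeps a timestamp in a small file; the file location, the function name, and the 24-hour interval are assumptions for illustration.

```python
import time
from pathlib import Path

def should_ping(stamp_file, now=None, min_interval=24 * 60 * 60):
    """Return True at most once per min_interval seconds, recording the time."""
    now = time.time() if now is None else now
    path = Path(stamp_file)
    try:
        last = float(path.read_text())
    except (OSError, ValueError):
        last = None  # first run, or the stamp is missing or unreadable
    if last is not None and now - last < min_interval:
        return False
    try:
        path.write_text(str(now))
    except OSError:
        pass  # a read-only location should not break the application
    return True
```

The application would call `should_ping(...)` before issuing its background request and skip the request when it returns False.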

You can see our 3 minute tutorial on Youtube here: 

If you are looking for documentation on tracking links to your website or posts via social media we produced a tutorial for this as well:

You can read our documentation here:  https://docs.scarf.sh/

Scarf Tracking Recommendations:

There are lots of different things you can track using Scarf. Here is a list of recommendations from our users.

Basic tracking:

  • Tracking package downloads via the Scarf Gateway with a custom URL
    - Create custom variables for each version of your software - enabling version tracking
    - If you are an OSS project that’s supported by multiple vendors and/or an open source foundation, it may be easier to use Scarf URLs for your gateway packages rather than a custom domain, e.g. apacheproject.gateway.scarf.sh rather than apacheproject.org
    - In file package routes, you can add more variables to the incoming path for tracking purposes even if they are not used in the outgoing URL, and this can be used for attribution. e.g. download.com/v1.0/referal_source or similar.
    - File package route variables are very robust, so you can even put entire websites or paths behind them, e.g. website.com/{+path}. You can probably achieve most tasks with only a couple of routes.
    - You can use GitHub Actions’ cron functionality to run scheduled export jobs of your Scarf data for free!
    - Include referring domain where possible:
    scarf.gateway.scarf.sh/abc.com/{referer_domain}
  • Tracking website and documentation views with a Scarf Tracking Pixel
    - Add a different pixel for each category of page view, i.e. high value, medium value, low value.
    - You can add multiple tracking pixels to a single page if need be.
    - Include the referring page where need be.
    - Cross-site tracking

Advanced tracking:

  • Call home functionality via gateway and/or scarf-js
  • Link sharing tracking via the gateway using a custom URL
    • Use variables to allow for custom pages…
      • I.e. Youtube
        1. Redirect /youtube/{videoname} to abc.com/youtube/{videoname}
        2. This allows you to use the same gateway for multiple videos on youtube
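
To show the idea behind route variables, here is a small sketch that matches an incoming path against a template and fills in an outgoing template. It mimics the concept only and is not Scarf's implementation; `apply_route` is a hypothetical helper, and it handles simple `{name}` variables that each match one path segment.

```python
import re

def apply_route(incoming_template, outgoing_template, path):
    """Match path against a template with {name} variables and fill the
    outgoing template. Returns None if the path does not match.
    """
    pattern = re.escape(incoming_template)
    # Turn each escaped "{name}" into a named group matching one path segment.
    pattern = re.sub(r"\\\{(\w+)\\\}", r"(?P<\1>[^/]+)", pattern)
    match = re.fullmatch(pattern, path)
    if match is None:
        return None
    # Variables captured but unused in the outgoing template are ignored,
    # so extra incoming segments can exist purely for attribution.
    return outgoing_template.format(**match.groupdict())
```

With the incoming template `/youtube/{videoname}` and the outgoing template `abc.com/youtube/{videoname}`, one route serves every video, and an incoming path can carry an extra variable such as a referral source even when the outgoing URL does not use it.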

Published June 23, 2023. This article was originally posted on Hackernoon.


  1. Downloads/Pull events
  2. Views of documentation
  3. Views of content/website (pages, blogs, tutorials)

The first is downloads (no matter if it's direct on your website, via a container registry, or via public repositories. Scarf allows you to track and aggregate downloads across all these different channels). This is probably the most valuable action. A download means someone has not only some interest in your product but enough interest to try it out.

There are three aspects to downloads which you should be paying attention to:

  • The number of downloads from unique sources at a company - more than one machine/source downloading is good.
  • The volume of downloads over a time period at the company - you want to see continued downloads over time, this implies ongoing usage.
  • Is the company downloading newer versions of the software over time - this is gold as it implies not only are they downloading but they are trying to keep things up to date and implies the software is critical enough to have maintenance procedures around it.

The second on the list is documentation views. People using your software will often have questions about how to install, use, and upgrade the software. You will see patterns evolve over time in the usage of the software docs depending on the software. Initially you will see more traffic to installation and setup sections. This coupled with download events is a great indicator or testing or trying things out. Then users will evolve more into troubleshooting or optimization views. See more page views shift to this is normal. Then you should see views to readmes or upgrade pages as they settle into maintenance and sustain mode. Ultimately I would be looking for views over an extended period of time to ensure they are invested and not just kicking the tires.  

The third on the list is content/website views. Not all views will be coming from docs, in fact for commercial purposes there are certain pages on your website that may be highly predictive of potential interest in becoming a customer (i.e. the pricing pages). But I recommend looking for ongoing views and traffic hitting blogs and other news on the product and upcoming releases.  

For each of the events, I would recommend breaking down all the activities into either good/better/best or low/medium/high impact events. Here is a suggested list of criteria when it comes to classifying events:

GOOD BETTER BEST
Downloads 1 or more downloads in a week. More than 1 download over a
30-day period.
Multiple downloads over a 90-day period,
including incremental downloads
of new versions.
Documentation Views Repeated views on installation
and setup instructions.
Documentation views spanning
more than 30 days from multiple sources.

More than just install page views.
Documentation views spanning more
than 90 days from multiple sources.

Doc views on upgrades and maintenance
procedures.
Website Traffic Multiple pages visited and viewed
by 1 company over a week period.
Multiple pages visited and viewed by 1
company over a 30-day period.

Page views to medium value content.
I.e. Reading technical blogs, visiting forum pages,
product feature pages.
Multiple pages visited and viewed by 1
company over a 30-day period.

Page views to high-value content.
I.e. Visiting the pricing pages, visiting but
not signing up on the signup page, etc.

The Riskiest But Most Valuable Metric: Ongoing Usage

While the three activities above are straightforward and generally not viewed with too much concern, there is a fourth activity or metric you can (and probably should) track.  An essential, albeit controversial, activity that serves as a highly valuable metric for any organization seeking to understand the usage patterns of its software - the use of 'call-home' functionality, also known as ongoing usage tracking. The call-home functionality is a mechanism within your software that sends a signal, or a 'ping', back to a designated server or gateway. This signal provides you with real-time information about your software's usage in live production environments, surpassing the insight level gained from just tracking downloads.

While download data can indicate interest and repeated use of your software, the ongoing, consistent 'ping' or call-home activity serves as a definitive predictor of your software's actual usage. Consider this the 'Nirvana' of metrics for your projects, the golden standard that allows you to measure the exact magnitude of your active install base and the frequency of software usage and deployment.

However, implementing this mechanism requires a degree of technical adaptation. Platforms like Scarf, for instance, offer this capability out-of-the-box. But to make full use of it, you'll need to adjust your application accordingly. There are different ways to accomplish this; for JavaScript applications, a package called 'Scarf-JS' can be used. Alternatively, a lightweight, background 'ping' or activity back to a Scarf gateway event can be employed. This ping can be triggered when your application starts up, is used, or at any other specified event.

In essence, your application would asynchronously call back to the gateway website, which doesn't return any data but rather tracks that the application was active. If you can successfully implement this, you can then monitor the number of unique pings over a certain period from various sources. This is incredibly valuable for lead scoring as it provides consistent, ongoing proof of life from these systems, making it the most valuable event or activity you could track.

Lead Scoring or User Scoring is Still Needed:

Not all people visiting your website and downloading your software are equally likely to become customers. In fact you will find 3x, 5x, or even 10x  more drive by traffic as you will find those interested in commercial offerings. To become efficient at finding which companies and users you should focus on, let's explore the concept of “lead scoring”. 

Lead scoring is a methodology used by sales and marketing departments to determine the worthiness of leads, or potential customers, by assigning values to them based on their behavior relating to their interest level in products or services. These values, or scores, are derived from a variety of factors like the professional information they've submitted, how they've engaged with the company's website, or their response to marketing efforts. The purpose of lead scoring is to prioritize leads who are more likely to convert into customers, allowing teams to focus their time and resources effectively. It's a vital part of creating an efficient sales and marketing strategy.

If you've already established a lead scoring system and are utilizing marketing software, consider events in open-source channels as additional data points to further qualify or uncover leads. For instance, a software download could be treated as a high-value (or high-score) activity, whereas a documentation view might be scored similarly to other website visits. It could be beneficial to categorize documentation and page views into high, medium, and low scoring pages, as certain pages (like pricing or install pages) can be more predictive and valuable than others.

The purpose of lead scoring is to prioritize leads who are more likely to convert into customers, allowing teams to focus their time and resources effectively.

The key distinction between traditional lead scoring and the incorporation of open-source download and traffic data lies in the summarization of data at the company level, requiring decisions on scoring criteria. Most marketing lead management tools track users based on sign-ups, cookies, or other mechanisms, capturing specifics such as Matt from Scarf signing up for a webinar. With data from anonymous sources, the best we can do is infer that someone from Scarf has downloaded your software.

The question then becomes: if you know Matt attended a webinar and works at Scarf, does the Scarf download make Matt a more qualified lead? Or should you shift your focus to other individuals at Scarf, possibly higher up in the management hierarchy? There's no absolute right or wrong answer, but my inclination would be to enrich the data of the known user who has already shown interest.

Additionally, it's important to note that software downloads can often be automated. Seeing ten downloads a day doesn't necessarily equate to thousands of servers or the potential for a massive deal. This data needs to be scrutinized, at the very least, by examining the unique systems or origins from where these downloads originate.

Lastly, when incorporating open-source downloads and traffic data, the timeline of events becomes critical. A single download could mean anything, but consistent downloads over several months, especially with each new version release, suggests a real, potentially highly qualified user.

Different Phases of Interest Description Events Action
Passive interest:
Hello World
Someone discovered or visited your website.
They may or may not have any interest in your
software or projects.
Web traffic to docs or websites over the
course of 1 or 2 days.
I would not take any action here.
Intrigued in your software:
This looks interesting
Someone takes more than a drive by interest in
your software. They are truly interested in what you have.
Documentation views.
Looking at install docs and/or feature lists.
Typically this is over multiple days.
I would consider promoting content to
that company's target audience (engineers?)
on other external channels.
Trial & Exploration:
Let me try this out
They move from just learning about the software
to actually downloading it.
Documentation and website views of high value
pages along with at least 1 download event.
You still see this traffic over multiple days but
typically over a week or two.
I would recommend promoting blogs or
how-tos that are interesting to this group of customers.
You could even promote this content directly on your
website when these visitors appear.
Testing & Evaluation:
I wonder if I can use
this for this project
Now someone is looking deeper into this and is starting
to either use it or seriously consider it.
Sustained page views and multiple downloads
over a month period.
Here is where additional content promotion is still a
good idea, but where there is a strong commercial offering
targeting these customers can be effective.
Implementation & Reliance:
This is cool, let's use this in production
Someone is using this over a longer period of time and looks to be beyond merely testing/trying out. If you see activities (both downloads and traffic) spread over a 90 day period, there is a high confidence in their usage in a critical space. This is the best time to seek out conversations.

- Cold outreach
- Targeted ads
- Seek out devs at conferences
Maintenance & Ongoing Upkeep:
Keeping things updated and safe
Someone has been using your software for months and is grabbing new versions of your software and reading readmes or regular updates (like blogs). Look for activities over months (3-12 months), with downloads of multiple versions. Also look for views on readmes or product specific content (blogs, etc). This is the best time to seek out conversations.

- Cold outreach
- Targeted ads
- Seek out devs at conferences
Waning Interest & Potential Churn:
Uh oh… this user is at risk
Usage is dropping and there is risk that this user may turn from an active user to a former user. If you see massive drop offs in traffic and downloads over a 30 day period this sends up red flags.

So we identified interesting companies; now, what do you do with this data?

This section of the guide provides recommendations on how to utilize the data obtained from downloads, website traffic, and documentation usage to enhance product adoption and discover potential leads. Different strategies are outlined for integrating these insights into existing sales/marketing activities, developing a product-led growth strategy, and for startups or new sales/marketing initiatives.

Becoming a Customer is a Journey:

Becoming a customer is indeed a journey that mirrors the transformation of a budding interest into a commercial relationship. This journey begins with a spark of curiosity, driving an individual to explore and try out the software. As they interact with the software, they begin to craft something unique, leading to deployment in a production environment. This stage often uncovers additional needs that may call for a commercial relationship, such as expert support, advanced features, or scale-up capacities.

This entire process can undoubtedly unfold organically, but it can be significantly enhanced, made more fruitful, or even accelerated by tailoring the right activities towards a user or company at the appropriate time. Key players in facilitating this journey include Product Development, Marketing, Developer Relations (DevRel), Sales, and even the Community. They collectively orchestrate a symphony of support and guidance for the user, with each instrument playing a vital role at the right moment.

An aptly timed article, a resonant message, a well-crafted tutorial, or a stimulating community discussion can serve as powerful catalysts in this journey, greatly influencing the user's progression. However, it's crucial to maintain a delicate balance. Overzealous pushing or rushing can result in adverse consequences, creating resistance or disengagement rather than fostering advancement.

As such, it becomes paramount to possess a deep understanding of where a company or user is in their journey. The more detailed your insights into their progress, the more effectively you can tailor your efforts. Similarly, having robust metrics around what strategies are fruitful and which ones fall short is equally beneficial. These insights not only inform your current strategies but also help shape your future approaches, ensuring you continuously enhance your user's journey towards becoming a valuable customer.

When they face challenges or opportunities in their production environment, show them how you can help them succeed with your solutions.

General Advice:

In today's data-driven world, harnessing and leveraging the power of download data and website traffic information can yield impressive results for organizations of all sizes, from startups to established enterprises. However, effectively employing these data requires a strategic and tailored approach to meet the unique needs and goals of each organization. Below are some general recommendations based on the discussions above that can apply across the board:

  • Understand Your Audience: Use download data and website traffic information to build a deeper understanding of your audience. This involves analyzing who is downloading your software, viewing your documentation, and browsing your website. With this information, you can enrich your existing leads, score potential ones, and build a well-informed customer profile.
  • Customize Your Approach: Once you've gathered and analyzed your data, tailor your marketing and sales processes to align with your findings. Whether you're focusing on sales/marketing or product, align your strategies and activities with the preferences and behaviors of your users. This could involve adjusting lead scoring based on the activity level or nurturing potential users to become ongoing ones.
  • Integrate Data with Existing Processes: Integrate your new data with your existing sales, marketing, and customer success processes. For instance, using download patterns to assess the churn potential can help you anticipate and mitigate customer attrition.
  • Adopt a Nurturing Approach: When it comes to new or startup sales/marketing processes, take a nurturing approach. This means guiding users through a lifecycle where they are initially familiarized with your software, then nurtured to become regular users, and eventually led to become paid customers.
  • Leverage Social Media: Social media platforms offer targeted marketing opportunities. Platforms like LinkedIn allow you to aim your promoted content towards specific companies and job titles.
  • Optimize Content: Make use of your existing content and create new content based on where your users and companies are spending the most time. Calls-to-action (CTAs) on these pages can effectively guide users through your marketing funnel.
  • Community Engagement: Encourage users to join your community, participate in events, and engage in discussions. Community engagement can serve as a powerful tool for user retention and organic growth.
  • Monitor and Adapt: Regularly assess the effectiveness of your strategies and be willing to make necessary adjustments. The digital landscape is ever-evolving, and your strategies should be adaptable to accommodate these changes.

Remember, the overarching aim should be to use this data to deliver value to your users, nurture relationships, and ultimately drive the growth of your organization.

Integrating within Existing Sales/Marketing Activities:

Existing sales and marketing activities can be significantly enriched by smartly integrating download data and website traffic information. By revising your lead scoring methodology to include new data points such as software downloads and page visits, you can ensure that you are incorporating the latest indicators of interest from your audience. The enhanced lead scoring will provide a more nuanced understanding of your prospective customers, paving the way for more targeted and effective outreach.

Use the company lists generated from this data in your cold outreach activities. By focusing your outreach efforts on these companies, you are targeting organizations already demonstrating interest, thereby increasing your chances of gaining a receptive audience. These lists can also serve as a valuable resource for your Business Development Representatives (BDRs), equipping them with a list of vetted leads, saving time and improving their efficiency.

Additionally, using this data, you can strategically plan meetings at conferences, events, and similar networking platforms with representatives from companies using or showing interest in your product. This targeted networking can lead to higher-value interactions and ultimately result in stronger leads.

Incorporating the pattern of downloads into your customer success and renewal operations can provide a more comprehensive customer overview. Such insights into customer behavior can inform your renewal strategies, equipping you with necessary foresight to address potential issues and ensure customer satisfaction. Moreover, the data can be a key indicator of potential churn risks, allowing you to proactively manage customer retention by identifying and addressing their concerns before they choose to discontinue your service.

TLDR:

  • Use the data to enrich your existing set of leads. You can add additional events to your lead scoring process.
  • Use the data to build a highly qualified list for outreach activities. Target companies that are using your software or are interested in your software.
  • Use this data to inform your marketing strategies. For example, prioritize individuals from companies that have shown interest in your software at meetings, conferences, and events.
  • If you have a fully fleshed out sales, marketing, and customer success process, use the data to assess churn risk.

Startup or New Marketing/Sales Activities (Active Prospecting):

For startups or companies initiating new sales and marketing initiatives, creating a lightweight growth engine that nurtures potential users can be the key to driving growth. Setting up a lifecycle or nurture campaign can guide potential users through your marketing funnel, providing them with the right content at the right time to foster interest and engagement.

Promoted content can be a powerful tool in these campaigns. Aimed at users in the early stages of engagement, this content can educate users about your software, showcasing its features and benefits and encouraging them to explore it further. As these potential users turn into ongoing users, you can begin to introduce promoted content, offers, and cold outreach to convert them into paying customers.

Understanding the customer journey is crucial in a startup or new marketing environment. By mapping out this journey and identifying combinations of events and thresholds, you can strategize when to increase or decrease marketing activities for optimal effect. This dynamic approach can keep your marketing efforts agile and responsive to user behavior.

Social media platforms like LinkedIn offer a targeted way to reach specific companies

TLDR:

  • Use this data to build a lightweight marketing and growth engine.
  • Approach the process as a life cycle or nurture type campaign. Nurture potential users until they become productive users.
  • Use promoted content targeted towards companies that are downloading or have looked at your documentation.
  • Once users are actively using your software, shift the focus to ongoing maintenance and new releases. Then, start introducing your paid offerings or services.
  • Use social media to engage potential users.
  • Integrate the scarf platform into your existing community activity to help nurture and guide potential users.

Integrating into a Product-Led Growth Strategy:

In a product-led growth strategy, the primary focus is on expanding product usage, and insights from website traffic and download data can play a crucial role in driving this growth. You can target specific companies with promoted content on various channels, catching the attention of potential or current users and stimulating their interest in your software.

Educational resources such as blogs, tutorials, and videos offer a non-intrusive way to engage companies that are exploring your software. These resources can help prospective users understand the value your product offers and how it can address their needs, fostering trust and driving product adoption.

Networking can also play a pivotal role in a product-led growth strategy. You can seek out speakers and attendees from targeted companies at industry conferences and events, fostering relationships that can lead to future collaborations or customers.

To gain a holistic picture of user engagement and behavior, consider merging this download and website usage data into your community tools, such as Common Room. This integration will allow you to monitor how users interact with your product and community, providing insights that can help shape your product development and marketing strategies.

TLDR:

  • Use download data to understand product adoption and usage patterns.
  • Monitor decreasing downloads or decreasing activity as a potential indicator of churn.
  • Use the data to understand which stage of the company's life cycle the users are in. This will help inform product development and roadmap decisions.
  • Integrate with existing community tools to build a complete picture of potential users -> users -> community members.
  • Use CTA’s (Call to Action) for events like join our community where you can convert anonymous users to known.

Using Scarf:

Introducing Scarf to your Community:

When adding Scarf to your website or as part of your deployment strategy you may get questions from users.

Here is some basic information about Scarf that others have found useful in discussing with their users when asked about using Scarf:

  • Scarf is used by 1000’s of projects to collect analytics for package downloads, documentation views, and website traffic
  • Scarf is fully GDPR compliant and ensure PII is protected
  • Scarf has passed the privacy, compliance, and legal requirements to be approved by open source foundations like the Apache Foundation
  • https://privacy.apache.org/policies/privacy-policy-public.html
  • https://privacy.apache.org/faq/committers.html
  • Scarf provides cookie-less and privacy conscious documentation and privacy focused website and documentation analytics
  • Scarf stores only the bare minimum metadata needed to collect and aggregate analytics data for our users.

Scarf also provides your users with other benefits:

  • Your downloads no longer are locked to a single hosting provider or service.  As services (such as container registers or package managers) change their terms of service or make changes to their offerings, you can adjust your hosting without changing your docs or impacting your users in the future. 
  • Scarf can be used to determine how exposed your user base is to old or insecure software, enabling your project to take a proactive approach to informing and educating your user base of potential issues 
  • Improves the sustainability of your project by providing data on the real user base to investors (without exposing PII).  

Setting up Scarf:

Scarf is very straightforward to get started with.  

Overall the process is:

1. Signup for a free account at: http://app.scarf.sh/register

2. For downloads,

     a. Setup a new package URL via the Scarf Gateway within your Scarf Dashboard.

     b. Point this URL to your current download endpoints.

     c. Update installation and setup documentation to direct users to use the gateway.

3. For Documentation or website tracking:

     a. Create a Scarf Tracking Pixel and add it to the pages you want analytics for (whether on your site or on third party sites).

4. For Link Tracking and social monitoring:

     a. Create a new URL in the Scarf Gateway as a redirect/link shortener to your website, Youtube, Hacker News, or other sites.  

     b. When posting links on social media use the new URL instead of the main link.  Data will then be available in the Scarf dashboard.

5. For Basic Call Home functionality:

     a. Create a basic URL in Scarf Gateway that will act as an endpoint for your applications to ping.

     b. Point the URL to a blank page.

     c. In your software issue an async web call/ping/or page load using (your favorite tool i.e. curl/libcurl, etc).  Note you can call this on start, daily, every time something runs, up to you.  You can throw away the result, the mere background call to open the URL is enough.

You can see our 3 minute tutorial on Youtube here: 

If you are looking for documentation on tracking links to your website or posts via social media we produced a tutorial for this as well:

You can read our documentation here:  https://docs.scarf.sh/

Scarf Tracking Recommendations:

There are lots of different things you can track using Scarf, here is a list of recommendations from our users.

Basic tracking:

  • Tracking package downloads via the Scarf Gateway with a custom URL
    - Create custom variables for each version of your software - enabling version tracking
    - If you are an OSS project that’s supported by multiple vendors and/or an open source foundation, it may be easier to use Scarf URLs for your gateway packages rather than a custom domain, e.g. apacheproject.gateway.scarf.sh rather than apacheproject.org
    - In file package routes, you can add more variables to the incoming path for tracking purposes even if they are not used in the outgoing URL, and this can be used for attribution. e.g. download.com/v1.0/referal_source or similar.
    - File package route variables are very robust, so you can even put entire websites or paths behind it, ie website.com/{+path} . You can probably achieve most tasks with only a couple of routes.
    - You can use GitHub Actions’ cron functionality to run scheduled export jobs of your Scarf data for free!
    - Include referring domain where possible:
    scarf.gateway.scarf.sh/abc.com/{referer_domain}
  • Tracking website and documentation tracking with a Scarf Tracking Pixel
    - Add a different pixel for each category of page view i.e. high value, medium value, low value.  
    - You can add multiple tracking pixels to a single page if need be.
    - Including the referring page where need be.
    - Cross-site tracking

Advanced tracking:

  • Call home functionality via gateway and/or scarf-js
  • Link sharing tracking via the gateway using a Customer URL
    • Use variables to allow for custom pages…
      • I.e. Youtube
        1. Redirect /youtube/{videoname} to abc.com/youtube/{videoname}
        2. This allows you to use the same gateway for multiple videos on youtube

What's New at Scarf: Key Takeaways from the Scarf Summit

What's New at Scarf: Key Takeaways from the Scarf Summit

On July 16th, we hosted our first-ever Scarf Summit, celebrating analytics for open source and the significant improvements we’ve made to the Scarf platform. In case you missed it, here’s a recap of all the key updates shared by our Engineering Leader, Aaron Porter.
Building Scarf: Avi Press on Haskell, Telemetry, and Open Source Challenges

Building Scarf: Avi Press on Haskell, Telemetry, and Open Source Challenges

In this episode of the Haskell Interlude Podcast, Joachim Breitner and Andreas Löh sit down with Avi Press, the founder of Scarf, to discuss his journey with Haskell, the telemetry landscape in open source software, and the technical as well as operational challenges of building a startup with Haskell at its core.
Boost Your Outreach with Scarf Filtering

Boost Your Outreach with Scarf Filtering

Scarf Basic and Premium tiers have long had the ability to sort their open source usage data by company, domain, events, last seen, and funnel stage. But our customers have been wanting more. Now you can hyper target by combining region, tech stack, and funnel stage, making outreach as refined and low friction as possible. 
Below the Surface: Why Open Source Needs Analytics

Below the Surface: Why Open Source Needs Analytics

Understanding open source user engagements and usage is obscured by a lack of actionable data, a result of its inherent openness and anonymity. Embracing a data-driven approach to open source projects helps them not only grow, but also understand the keys to their success, benefiting everyone involved.
How Garden Leverages Scarf to Understand and Grow Their User Base

How Garden Leverages Scarf to Understand and Grow Their User Base

As an open source company, Garden knew how hard it was going to be to get usage data. Adding Scarf for analytics on open source downloads turned anonymous numbers into company names. Using Scarf’s privacy-first analytics also helped Garden to know what kind of companies were using their OSS and where they were located.
OSS Privacy & OSS Analytics, How Heroic Labs Struck a Balance

OSS Privacy & OSS Analytics, How Heroic Labs Struck a Balance

Once Heroic started using Scarf, they learned that they were even more popular than they thought they were. Using Scarf, they were able to determine where, by country, their users were downloading from, and how many per day.
Unlimited Free Seats and Data Retention for All Linux Foundation Projects

Unlimited Free Seats and Data Retention for All Linux Foundation Projects

Any LF project maintainer can use Scarf without needing any further approval from the foundation. Scarf is offering all LF projects free accounts with a few additional features over our base free version. LF projects will get usage data like docs, downloads, and page views with unlimited free seat licenses and data retention.
Union.ai and Flyte: Privacy, Open Source, and Building a Commercial Business

Union.ai and Flyte: Privacy, Open Source, and Building a Commercial Business

Union is an open source first company. It uses Scarf to drive their DevRel strategy and improve their open source project. It also uses Scarf to power its consultative sales approach to help customers where it makes sense. Union has been successfully leveraging Scarf funnel analysis to shape the product to better fit the market so that they can focus on ensuring that companies can get value from Flyte sooner.
Navigating the Complexities of Open Source Commercialization: Insights from Adam Jacob

Navigating the Complexities of Open Source Commercialization: Insights from Adam Jacob

In this latest episode of "Hacking Open Source Business," Avi Press and Matt Yonkovit sit down with Adam Jacob, the co-founder of Chef and current CEO of System Initiative. With a rich history in the open-source world and numerous thought-provoking opinions, Adam delves into the intricacies of open-source commercialization, offering valuable insights and alternative strategies to the commonly held Open Core model.
Scarf Newsletter - May 2024

Scarf Newsletter - May 2024

Stay up to date with the latest updates from Scarf. Discover upcoming features, industry news, partnerships, and events. May 2024 Newsletter.
Smallstep Labs: Leveraging Open Source Data for Enterprise Growth

Smallstep Labs: Leveraging Open Source Data for Enterprise Growth

Smallstep wanted to understand the impact of their open-source project on enterprise adoption of their commercial security solutions. Smallstep uses Scarf to better understand user interactions and software usage, providing insights into its user base and potential customer segments as an important signal for commercial use.
Diagrid and Dapr: How to Balance Open Source and Business Through Data

Diagrid and Dapr: How to Balance Open Source and Business Through Data

Diagrid was founded in 2022 by the creators of the popular Dapr open source project. Making data-driven decisions for a commercial company built on an open source project that had no real concrete data, was a real challenge. Diagrid translated Scarf data into valuable insights for marketing and product development of their commercial product.
12 Reasons Why Haskell is a Terrible Choice for Startups (and why we picked it anyway)

12 Reasons Why Haskell is a Terrible Choice for Startups (and why we picked it anyway)

When we approached the project of building Scarf, we turned to our favorite language: Haskell. Little did we know, this decision would shape our story in more ways than one.
Unstructured: Understanding an Open Source Project’s Impact on Commercial Success

Unstructured: Understanding an Open Source Project’s Impact on Commercial Success

Unstructured had so much usage of their open source, but so little data. Prior to Scarf, they mostly had GitHub information for things like downloads and stars. It was difficult to separate the good signal from the noise without any specific information that would help them to better target this large and growing open source user base or data to influence their product roadmap. 
New Integration: Scarf + Common Room = Supercharged Insights for Open Source Projects

New Integration: Scarf + Common Room = Supercharged Insights for Open Source Projects

It’s happening! Scarf is part of the Common Room Signal Partners program. Soon, you will be able to integrate your Scarf data into your Common Room platform for a more complete view of all of your user signals.
Scarf Newsletter - March 2024

Scarf Newsletter - March 2024

Stay up to date with the latest updates from Scarf. Discover upcoming features, industry news, partnerships, and events. March 2024 Newsletter.
State of Open Source Usage: The Scarf Report 2023

State of Open Source Usage: The Scarf Report 2023

In 2023, the open source software (OSS) landscape showed significant growth and shifts in various aspects. Here are the key findings:
Scarf Successfully Completes Type 1 SOC 2 Examination with an Unqualified Opinion

Scarf Successfully Completes Type 1 SOC 2 Examination with an Unqualified Opinion

We are thrilled to announce that we have successfully completed a Type 1 System and Organization Controls 2 (SOC 2) examination for our Scarf Platform service as of January 31, 2024.
Analytics are Starting to Win in Open Source

Analytics are Starting to Win in Open Source

When Scarf emerged back in 2019, many people expressed skepticism that usage analytics would ever be tolerated in the open source world. 5 years later, Scarf has shown this once solidified cultural norm can indeed change. Learn how Scarf's journey mirrors a broader shift in open source culture and why embracing usage analytics could shape the future of open software development.
Scarf Newsletter - February 2024

Scarf Newsletter - February 2024

Stay up to date with the latest updates from Scarf. Discover upcoming features, industry news, partnerships, and events. February 2024 Newsletter.
Scarf Case Study: Apache Superset

Scarf Case Study: Apache Superset

Apache Superset is an open-source modern data exploration and visualization platform that makes it easy for users of all skill sets to explore and visualize their data. We spoke with Maxime Beauchemin, founder & CEO of Preset, and the original creator of both Apache Superset and Apache Airflow, who shared with us Superset's experience using Scarf.
Haskell.org: Bridging the Gap Between Language Innovation and Community Understanding

Haskell.org: Bridging the Gap Between Language Innovation and Community Understanding

Haskell, a cutting-edge programming language rooted in pure functionality, boasts static typing, type inference, and lazy evaluation. The language's ongoing evolution is bolstered by a diverse array of organizations, including the Haskell.org committee. This committee strategically leveraged the Scarf solution for testing purposes.
Scarf Newsletter - December 2023

Scarf Newsletter - December 2023

We’re pleased to share a final recap of the latest Scarf updates for December and 2023 as a whole. Join us in this last edition of our 2023 newsletters.
Introducing OQLs: A New Way for Businesses to Quantify Open Source Adoption

Introducing OQLs: A New Way for Businesses to Quantify Open Source Adoption

In the open source ecosystem, user behaviors are diverse and conversion tracking poses unique challenges frequently leaving traditional marketing strategies insufficient. Recognizing this gap, we are excited to introduce a brand new way for businesses to make sense of this opaque and noisy signal – Open Source Qualified Leads (OQLs).
Scarf Newsletter - November 2023

Scarf Newsletter - November 2023

Stay up to date with the latest updates from Scarf. Discover upcoming features, industry news, partnerships, and events. November 2023 Newsletter.
The BSL Phenomenon: Balancing Sustainability and Open Source Principles

The BSL Phenomenon: Balancing Sustainability and Open Source Principles

In recent years, a notable development in the open source landscape is the growing number of large corporations considering the transition from open source licenses to more restrictive models like the Business Source License (BSL). This trend raises further questions about the sustainability and future of open source projects, particularly when large players alter their approach.
State of Open Source Usage Q3 2023: The Scarf Report

State of Open Source Usage Q3 2023: The Scarf Report

In Q3 2023, the open source software (OSS) landscape showed significant growth and shifts in various aspects. Here are the key findings:
Unlocking the Power of Custom URL Parameters with Scarf: A Comprehensive Guide

Unlocking the Power of Custom URL Parameters with Scarf: A Comprehensive Guide

A recent release of Scarf added the ability to track and report on custom URL parameters. If you are looking to gain more intelligence around how you open source users interact with your project and download your software using link parameters in key situations can reveal interesting and helpful trends that can help you grow your user base and unlock open source qualified leads.
Building Trust: How to Collect Data Responsibly as an Open Source Project

Building Trust: How to Collect Data Responsibly as an Open Source Project

In the ever-evolving landscape of open source software, data collection has become a hot-button issue. As the open source community grows and software becomes increasingly integral to our daily lives, concerns about data collection ethics have emerged.
Scarf Newsletter - September 2023

Scarf Newsletter - September 2023

Stay up to date with the latest updates from Scarf. Discover upcoming features, industry news, partnerships, and events. September 2023 Newsletter.
 Measuring the Commercial ROI of DEVREL

Measuring the Commercial ROI of DEVREL

In today's fast-paced tech world, the Developer Relations (DevRel) role has moved from the periphery to the center stage. Companies, irrespective of their size, are now seriously considering the worth of having a dedicated DevRel team. But, how do you quantify the success or failure of such an effort? What metrics should companies use? This post dives deep into understanding the commercial Return on Investment (ROI) of DevRel.
Selling Open Source: 101 - Guide for Sales and Marketing Teams

Selling Open Source: 101 - Guide for Sales and Marketing Teams

Monetizing open source software is a challenging task, but it can also be highly rewarding. Unlike traditional software, you're essentially competing against a free version of your product. So, how do you sell something that is inherently free?
Beyond the Surface: How to Engage with the Quiet Members of your Open Source Community

Beyond the Surface: How to Engage with the Quiet Members of your Open Source Community

In the dynamic realm of community management, marketing, and developer relations, success depends upon more than just attracting attention. It's about fostering meaningful relationships, nurturing engagement, and amplifying your community's impact. 
Mastering Telemetry in Open Source: A Simple Guide to Building Lightweight Call Home Functionality

Mastering Telemetry in Open Source: A Simple Guide to Building Lightweight Call Home Functionality

This guidebook shows you how to implement a call-home functionality or telemetry within your open-source software while at the same time being transparent and respectful of your users data. Let's explore how to build a minimal, privacy-focused call home functionality using a simple version check and Scarf.
Scarf Newsletter - July 2023

Scarf Newsletter - July 2023

Stay up to date with the latest updates from Scarf. Discover upcoming features, industry news, partnerships, and events. July 2023 Newsletter.
Open Source Metrics: Fear and Loathing (Part 2)

Open Source Metrics: Fear and Loathing (Part 2)

Many open source contributors are reluctant or skeptical about metrics. They think metrics are overrated, irrelevant, or even harmful to their projects and communities. But in this blog post, we argue that metrics are essential for making better decisions, improving the experience for users and contributors, and demonstrating the impact and value of your open source work. We also share some tips and examples from OSPOs and DevRel teams on how to choose and use metrics effectively.
Why GitHub Repos Are Not Enough for Your Docs: The Benefits of Creating a Dedicated Doc Site

Why GitHub Repos Are Not Enough for Your Docs: The Benefits of Creating a Dedicated Doc Site

Many open-source developers rely on GitHub as their primary documentation source. But this can be a costly mistake that can affect your project’s success and adoption. In this blog, we’ll explain why you need to build your own docs site and how to do it easily and effectively.
Data-Driven Open Source: Why You Should Care About Metrics (Part 1)

Data-Driven Open Source: Why You Should Care About Metrics (Part 1)

Open source projects and companies need data to grow and enhance their performance. However, many open source leaders and communities overlook or reject metrics and depend on intuition, relationships, or imitation. Data can help you spot problems, opportunities, and false positives in growth strategies. In this blog post, Matt Yonkovit shows you why data is important for open source success and how it can offer insights and guidance for open source projects to reach their goals and make better decisions.
State of Open Source Usage Q2 2023: The Scarf Report

State of Open Source Usage Q2 2023: The Scarf Report

Open source software continues to be a vital part of enterprise operations in Q2 2023, as more and more companies adopt open source solutions for their business needs. In this blog post, we will examine the state of open source usage in Q2 2023 and the trends that are shaping the future of open source.
Developer Relations (DevRel): Where Should It Reside in Your Organization

Developer Relations (DevRel): Where Should It Reside in Your Organization

DevRel is a vital function for any organization that wants to engage with the developer community and grow its user base. However, there is no one-size-fits-all solution for where to place DevRel within the organizational structure. In this blog post, we explore three common strategies for DevRel placement: marketing, product, and hybrid. We discuss the advantages and challenges of each strategy, and provide some tips on how to decide which one is best for your organization and goals.
The Gating Debate: Striking a Balance Between Open Source and Marketing Insights

The Gating Debate: Striking a Balance Between Open Source and Marketing Insights

In the open source industry, identifying and engaging users is a major challenge. Many users download software from third-party platforms that do not share user data with the software company. Gating content behind a login or an email form can help, but it can also alienate potential users who value their privacy and convenience. In this blog post, we explore the pros and cons of gating content in the open source industry, and we offer an alternative solution that can help you identify and connect with your users without compromising your content.
How to Use Metrics to Track and Evaluate Your Open Source Community’s Success

How to Use Metrics to Track and Evaluate Your Open Source Community’s Success

Open source software depends on the power of its community. But how do you know if your community is healthy and thriving? In this blog, you will learn how to use metrics to track and evaluate your community’s activity, engagement, growth, diversity, quality, and impact. You will hear from founders, DevRel experts, and investors who share their best practices and tips on how to measure and improve your community’s performance and value.
How to: Using anonymous downloads, website traffic, and documentation views to generate leads

How to: Using anonymous downloads, website traffic, and documentation views to generate leads

Learn how to overcome the challenges of open source software marketing and turn anonymous data into qualified leads. In this blog post, we’ll show you how to use download data, web traffic, and documentation views to identify potential customers and grow your sales pipeline. Discover how to track downloads, website traffic and documentation views with Scarf Gateway and the Scarf Tracking Pixel.
Why Your Open Source Startup Is Going To Fail (And What You Can Do About It)

Why Your Open Source Startup Is Going To Fail (And What You Can Do About It)

This blog post outlines ten common mistakes made by founders of open source startups, from failing to ask the right questions to neglecting the standardization of key metrics. By offering guidance on how to avoid these pitfalls, it provides a roadmap to successfully commercializing open source projects.
Open Source Monetization 101: A Step-by-Step Guide

Open Source Monetization 101: A Step-by-Step Guide

Many people believe that making money from open source projects is an arduous or even impossible task. However, with the right strategies it is possible to build a sustainable business while keeping the spirit of open source intact. By evaluating the market fit and commercial viability of an open source project before considering funding and monetization, one can realistically begin to explore the financial potential of an open source project. Here's how to do it.
The Open Source Sales & Marketing Funnel: Navigating the Challenges of Anonymous Downloads and Activity Tracking

The Open Source Sales & Marketing Funnel: Navigating the Challenges of Anonymous Downloads and Activity Tracking

This blog emphasizes the importance of a comprehensive approach to lead generation in the open source software space. Amid the challenges of anonymous usage and privacy regulations, strategies focusing on download activity, community engagement, and web traffic can maximize lead identification. Employing lead scoring and maintaining a list of active software users can further enhance sales outcomes in this unique market.
Scarf Newsletter - May 2023

Scarf Newsletter - May 2023

Stay up to date with the latest updates from Scarf. Discover upcoming features, industry news, partnerships, and events. May 2023 Newsletter.
Harnessing Software Download Patterns: Using Open Source Download Metrics to Uncover New Users and Potential Customers

Harnessing Software Download Patterns: Using Open Source Download Metrics to Uncover New Users and Potential Customers

Here at Scarf, we've developed a solution to help open source projects and businesses gain more insight into their users and their download traffic - Scarf Gateway. Here's how it works.
Unlocking Growth Potential: Scarf Users Benefit from Clearbit Integration for Improved User Intelligence

Unlocking Growth Potential: Scarf Users Benefit from Clearbit Integration for Improved User Intelligence

We are thrilled to announce our latest partnership with Clearbit (https://clearbit.com/). This collaboration will offer Scarf users and customers an enriched array of data about their user base, significantly enhancing the quality of information you already value from Scarf.
State of Open Source Usage Q1 2023: The Scarf Report

State of Open Source Usage Q1 2023: The Scarf Report

The popularity of open source software is not in doubt, but little concrete public data exists beyond human-generated surveys on adoption usage. In this blog post, we will explore the state of open source usage in Q1 2023 and the data illustrating how open source is becoming an increasingly important part of enterprise operations.
Connecting Community Efforts in Open Source to Business Success

Connecting Community Efforts in Open Source to Business Success

The success of DevRel (Developer Relations) and community efforts in open source can be challenging to measure, as there is often a disconnect between the goals and expectations of the community and the business. This blog post discusses the challenges of measuring the success of DevRel and community efforts in open source.
3 Keys to Growing the Adoption of an Open Source Project

3 Keys to Growing the Adoption of an Open Source Project

Successful open source projects don't always translate into successful open source businesses. However, by focusing on building a kick-ass product, raising awareness, making the product easier to use, and fostering a strong open source community, you can set the stage for converting users into paying customers.
The Most Neglected and Overlooked Open Source Metric: Production Users

The Most Neglected and Overlooked Open Source Metric: Production Users

Everyone wants a larger open source user base, but very few people effectively measure its growth. Let’s discuss why.
Switching Container Registries With Zero Downtime

Switching Container Registries With Zero Downtime

You can use the open source Scarf Gateway to switch hosting providers, container registries, or repositories without impacting end users in the future.
Understanding Tech Layoffs and the Economy’s Impact on Open Source

Understanding Tech Layoffs and the Economy’s Impact on Open Source

What is driving all this tech layoffs? , What is their impact on the open source software industry? We will walk through all the potential reasons from an economic downturn, herd mentality, excessive borrowing and spending due to low interest rates, and growth at all costs as the main reasons behind the layoffs. Companies can continue to grow in this tight economic market if they are focused on optimizing efficiency and sustaining the right growth.
Why Downloads are an Essential Metric for Open Source Software Projects

Why Downloads are an Essential Metric for Open Source Software Projects

If you're only going to track one thing for your OSS project, track your downloads.
The Open Source Business Metrics Guide

The Open Source Business Metrics Guide

How to Build, Grow, and Measure the Success of an Open Source Business
Messaging and Positioning Considerations for Introducing an Open Source Product

Messaging and Positioning Considerations for Introducing an Open Source Product

At the All Things Open conference, Emily Omier, a seasoned positioning consultant, sat down with Avi Press (Founder and CEO, Scarf) and Matt Yonkovit (The HOSS, Scarf) to discuss how to message, position, and validate your open source product on The Hacking Open Source Business Podcast. You can watch the full episode below or continue reading for a recap.
How to Get the Attention of an Open Source Software Investor

How to Get the Attention of an Open Source Software Investor

On the Hacking Open Source Business podcast, Joseph Jacks aka JJ (Founder, OSS Capital) joins Avi Press (Founder and CEO, Scarf) and Matt Yonkovit (The HOSS, Scarf) to share what you need to know before starting a commercial open source software (COSS) company and how you can set yourself and your project apart in a way that attracts investor funding. As an investor who exclusively focuses on open source startups, JJ provides a VC perspective on what he looks for when evaluating investment opportunities.
Heroic Labs' Journey to Open Source and 5.3M Docker Downloads

Heroic Labs' Journey to Open Source and 5.3M Docker Downloads

On The Hacking Open Source Business podcast, CEO Chris Molozian and Head of Developer Relations Gabriel Pene at Heroic Labs elaborate on their usage and shift to open source and how it accelerated their adoption.
How to Keep Open Source Projects Open Source

How to Keep Open Source Projects Open Source

In this recap of the first episode of the Hacking Open Source Business Podcast, co-hosts Matt Yonkovit and Avi Press, Scarf Founder and CEO, dig into a recent controversy that highlights the challenges open source projects face trying to create sustainable revenue streams to support a business or a non-profit that funds the project’s growth.
How Buoyant Drives Open-Source-Led Growth with Linkerd

How Buoyant Drives Open-Source-Led Growth with Linkerd

Building a business around an open-source project is hard. Learn more about how Buoyant drives product-led growth with Linkerd.
Alex Biehl: Open Sourcing a Tool to Generate Haskell Server Stubs

Alex Biehl: Open Sourcing a Tool to Generate Haskell Server Stubs

Alex is a software engineer at Scarf who recently open sourced a tool to generate Haskell server stubs called Tie.
Tanner Linsley: Building Sustainable Open Source Projects

Tanner Linsley: Building Sustainable Open Source Projects

Tanner Linsley joined us to explain how he got started in open source and how he has made working in open source sustainable.
Stefano Maffulli: An Exploration on Standards for Open Source Packaging and Distribution

Stefano Maffulli: An Exploration on Standards for Open Source Packaging and Distribution

Scarf Sessions is a new stream where we have conversations with people shaping the landscape in open source and open source sustainability. This post will give a recap of the conversation Scarf CEO, Avi Press and I had with our guest Stefano Maffulli.
Using OSS Usage Data to Sell your Company

Using OSS Usage Data to Sell your Company

Learn how Nestybox used Scarf to gather better project insights and provide accurate data during their recent acquisition.
A Different Approach to Measuring Open Source Community Health

A Different Approach to Measuring Open Source Community Health

Community is important to the success of open source software. To understand and grow a community, project founders and maintainers need visibility into various technical, social, and even financial metrics. But what metrics should we be using?
Scarf Tech Stack: Relude

Scarf Tech Stack: Relude


How to: Using anonymous downloads, website traffic, and documentation views to generate leads

Navigating the world of open source software can often feel like trying to find a needle in a haystack when it comes to identifying potential leads and customers. We're up against several unique challenges that aren't typically seen in other industries. Firstly, we're competing against 'free' - a tough proposition in any business context. Secondly, the open-source nature of our software means that many of our users and their respective companies stay hidden behind the veil of anonymity, turning customer identification into a high-stakes game of hide and seek. And there are countless other hurdles that add to the complexity of this landscape. Despite these challenges, there's a wealth of untapped potential buried within anonymous download data, web traffic, and documentation views. Stick around as we unravel the mystery of transforming this sea of anonymous data into valuable company profiles, turning seemingly anonymous interactions into meaningful business opportunities.

Lack of Leads: This is not a new problem

Some time ago, I worked for a company called MySQL AB, the company that brought us MySQL.

The company, like many others offering commercial open source solutions, grappled with several challenges. A big one was figuring out who was using our software. After all, as an open source company, we were not only competing with 'free'; we also had limited visibility into who was using our products, since they were freely distributed through various anonymous channels.

The way MySQL handled this was by tapping into download and traffic data. It worked like this - our sales and marketing team would approach a company about the value of our support, to which they would respond, "We know MySQL, but we don't use your database (or any open source, for that matter)". That’s when we'd drop the bombshell: "Well, that's funny because our records show you downloaded it 1000 times last year." You can imagine the surprise on their faces! This led to some serious internal discussions and often revealed a significant usage of our software within their tech stack, unbeknownst to the decision-makers.

Fast forward 20 years, and we're still dealing with the same problems, but at an even faster pace. Open source usage has exploded, but managers are still not fully aware of the components and dependencies in their tech stack. And with so much software coming from anonymous channels, it's still a struggle for commercial open source companies to figure out who's using their software.

But hey, it's not all doom and gloom. Challenges also present opportunities, right? We just need to learn to engage with our user base, not just for sales conversations but to ensure they're using our open source software effectively and can deploy it successfully in production. So, let's explore together how to make the most of this situation in this data-driven world.

Important Activities You Should Track in Open Source:

There are 3 events I would suggest everyone track.  

  1. Downloads/Pull events
  2. Views of documentation
  3. Views of content/website (pages, blogs, tutorials)

The first is downloads, whether they happen directly on your website, via a container registry, or via public repositories (Scarf allows you to track and aggregate downloads across all these different channels). This is probably the most valuable action. A download means someone is not just interested in your product but interested enough to try it out.

There are three aspects to downloads which you should be paying attention to:

  • The number of downloads from unique sources at a company - more than one machine/source downloading is good.
  • The volume of downloads over a time period at the company - you want to see continued downloads over time, which implies ongoing usage.
  • Whether the company is downloading newer versions of the software over time - this is gold: it implies they are not only downloading but also trying to keep things up to date, which suggests the software is critical enough to have maintenance procedures around it.
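
The three download signals above could be computed from a list of download events roughly as follows. This is a minimal sketch: the `DownloadEvent` fields and the `download_signals` helper are hypothetical names chosen for illustration, not a Scarf export format or API.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DownloadEvent:
    company: str   # company the download was attributed to
    source: str    # hypothetical identifier for the machine/origin
    version: str   # software version that was downloaded
    day: date      # date of the download


def download_signals(events, company):
    """Summarize the three download signals for one company."""
    own = [e for e in events if e.company == company]
    if not own:
        return {"unique_sources": 0, "active_days": 0, "span_days": 0, "versions": []}
    days = sorted({e.day for e in own})
    # Record versions in the order in which each was first downloaded.
    first_seen = {}
    for e in sorted(own, key=lambda e: e.day):
        first_seen.setdefault(e.version, e.day)
    return {
        "unique_sources": len({e.source for e in own}),  # signal 1
        "active_days": len(days),                        # signal 2: volume over time
        "span_days": (days[-1] - days[0]).days,          # signal 2: how long it lasted
        "versions": list(first_seen),                    # signal 3: newer versions
    }
```

A company with several unique sources, a long span, and more than one entry in `versions` matches all three aspects at once.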

The second on the list is documentation views. People using your software will often have questions about how to install, use, and upgrade it. Patterns in documentation usage will evolve over time, depending on the software. Initially you will see more traffic to the installation and setup sections. Coupled with download events, this is a great indicator of testing or trying things out. Users will then move on to troubleshooting or optimization pages; it is normal to see more page views shift in that direction. After that, you should see views of readmes or upgrade pages as they settle into maintenance and sustain mode. Ultimately, I would look for views over an extended period of time to confirm they are invested and not just kicking the tires.

The third on the list is content/website views. Not all views will come from docs. In fact, for commercial purposes, certain pages on your website may be highly predictive of potential interest in becoming a customer (e.g. the pricing pages). Beyond those, I recommend looking for ongoing views and traffic hitting blogs and other news about the product and upcoming releases.

For each of the events, I would recommend breaking down all the activities into either good/better/best or low/medium/high impact events. Here is a suggested list of criteria when it comes to classifying events:

Downloads
  • Good: 1 or more downloads in a week.
  • Better: More than 1 download over a 30-day period.
  • Best: Multiple downloads over a 90-day period, including incremental downloads of new versions.

Documentation Views
  • Good: Repeated views on installation and setup instructions.
  • Better: Documentation views spanning more than 30 days from multiple sources. More than just install page views.
  • Best: Documentation views spanning more than 90 days from multiple sources. Doc views on upgrades and maintenance procedures.

Website Traffic
  • Good: Multiple pages visited and viewed by 1 company over a week period.
  • Better: Multiple pages visited and viewed by 1 company over a 30-day period. Page views to medium value content, i.e. reading technical blogs, visiting forum pages, product feature pages.
  • Best: Multiple pages visited and viewed by 1 company over a 30-day period. Page views to high-value content, i.e. visiting the pricing pages, visiting but not signing up on the signup page, etc.
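
As one possible reading of the Downloads criteria, a small function can map a company's download history to a tier. This is a sketch under stated assumptions: the `download_tier` name, the `(day, version)` input shape, and treating "including new versions" as "more than one distinct version" are all choices made for illustration.

```python
from datetime import date, timedelta


def download_tier(downloads, today):
    """Return "best", "better", "good", or None for (day, version) pairs."""
    def within(days):
        cutoff = today - timedelta(days=days)
        return [(d, v) for d, v in downloads if cutoff <= d <= today]

    last_90 = within(90)
    # Best: multiple downloads over 90 days that include more than one version.
    if len(last_90) > 1 and len({v for _, v in last_90}) > 1:
        return "best"
    # Better: more than one download over a 30-day period.
    if len(within(30)) > 1:
        return "better"
    # Good: one or more downloads in a week.
    if len(within(7)) >= 1:
        return "good"
    return None
```

The same pattern extends to documentation views and website traffic by swapping in the thresholds from those rows.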

The Riskiest But Most Valuable Metric: Ongoing Usage

While the three activities above are straightforward and generally not viewed with too much concern, there is a fourth activity or metric you can (and probably should) track. It is an essential, albeit controversial, metric for any organization seeking to understand the usage patterns of its software: 'call-home' functionality, also known as ongoing usage tracking. Call-home functionality is a mechanism within your software that sends a signal, or 'ping', back to a designated server or gateway. This signal provides you with real-time information about your software's usage in live production environments, surpassing the insight gained from just tracking downloads.

While download data can indicate interest and repeated use of your software, ongoing, consistent 'ping' or call-home activity serves as a definitive predictor of your software's actual usage. Consider this the 'Nirvana' of metrics for your projects, the gold standard that allows you to measure the exact magnitude of your active install base and the frequency of software usage and deployment.

However, implementing this mechanism requires some technical work. Platforms like Scarf offer this capability out of the box, but to make full use of it you'll need to adjust your application accordingly. There are different ways to accomplish this. For JavaScript applications, a package called 'Scarf-JS' can be used. Alternatively, your application can send a lightweight background 'ping' to a Scarf Gateway endpoint. This ping can be triggered when your application starts up, when it is used, or at any other specified event.

In essence, your application would asynchronously call back to the gateway website, which doesn't return any data but rather tracks that the application was active. If you can successfully implement this, you can then monitor the number of unique pings over a certain period from various sources. This is incredibly valuable for lead scoring as it provides consistent, ongoing proof of life from these systems, making it the most valuable event or activity you could track.
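
Monitoring unique pings over a period might look like the sketch below. The `(source, day)` record shape and the `unique_sources_per_week` name are assumptions for illustration; real call-home data would come from whatever export or log your gateway provides.

```python
from collections import defaultdict
from datetime import date, timedelta


def unique_sources_per_week(pings):
    """Count distinct call-home sources per week from (source, day) records.

    Weeks are keyed by their Monday, so repeated pings from one
    source in the same week are counted once.
    """
    sources_by_week = defaultdict(set)
    for source, day in pings:
        week_start = day - timedelta(days=day.weekday())
        sources_by_week[week_start].add(source)
    return {week: len(sources) for week, sources in sorted(sources_by_week.items())}
```

A count that holds steady or grows week over week is the "proof of life" signal described above, and it can feed directly into a lead score.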

Lead Scoring or User Scoring is Still Needed:

Not all people visiting your website and downloading your software are equally likely to become customers. In fact, you will find 3x, 5x, or even 10x more drive-by traffic than visitors interested in commercial offerings. To become efficient at finding which companies and users you should focus on, let's explore the concept of “lead scoring”.

Lead scoring is a methodology used by sales and marketing departments to determine the worthiness of leads, or potential customers, by assigning values to them based on their behavior relating to their interest level in products or services. These values, or scores, are derived from a variety of factors like the professional information they've submitted, how they've engaged with the company's website, or their response to marketing efforts. The purpose of lead scoring is to prioritize leads who are more likely to convert into customers, allowing teams to focus their time and resources effectively. It's a vital part of creating an efficient sales and marketing strategy.

If you've already established a lead scoring system and are utilizing marketing software, consider events in open-source channels as additional data points to further qualify or uncover leads. For instance, a software download could be treated as a high-value (or high-score) activity, whereas a documentation view might be scored similarly to other website visits. It could be beneficial to categorize documentation and page views into high, medium, and low scoring pages, as certain pages (like pricing or install pages) can be more predictive and valuable than others.

The key distinction between traditional lead scoring and the incorporation of open-source download and traffic data lies in the summarization of data at the company level, requiring decisions on scoring criteria. Most marketing lead management tools track users based on sign-ups, cookies, or other mechanisms, capturing specifics such as Matt from Scarf signing up for a webinar. With data from anonymous sources, the best we can do is infer that someone from Scarf has downloaded your software.

The question then becomes: if you know Matt attended a webinar and works at Scarf, does the Scarf download make Matt a more qualified lead? Or should you shift your focus to other individuals at Scarf, possibly higher up in the management hierarchy? There's no absolute right or wrong answer, but my inclination would be to enrich the data of the known user who has already shown interest.

Additionally, it's important to note that software downloads can often be automated. Seeing ten downloads a day doesn't necessarily equate to thousands of servers or the potential for a massive deal. This data needs to be scrutinized, at the very least, by examining the unique systems or origins from where these downloads originate.

Lastly, when incorporating open-source downloads and traffic data, the timeline of events becomes critical. A single download could mean anything, but consistent downloads over several months, especially with each new version release, suggest a real, potentially highly qualified user.
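
To make company-level scoring concrete, here is a minimal sketch. The event type names, the weights in `EVENT_WEIGHTS`, and the `(company, event_type, source)` record shape are all illustrative assumptions, not values from any marketing tool; tune them to your own funnel.

```python
from collections import Counter

# Illustrative weights only; these numbers are assumptions.
EVENT_WEIGHTS = {
    "download": 10,
    "pricing_page_view": 8,
    "install_doc_view": 5,
    "doc_view": 2,
    "page_view": 1,
}


def company_scores(events):
    """Score companies from (company, event_type, source) records.

    Each (event_type, source) pair is counted once per company, so an
    automated job that downloads repeatedly from one machine does not
    inflate the score.
    """
    seen = set()
    scores = Counter()
    for company, event_type, source in events:
        key = (company, event_type, source)
        if key in seen:
            continue
        seen.add(key)
        scores[company] += EVENT_WEIGHTS.get(event_type, 0)
    return dict(scores)
```

Deduplicating by source is one simple way to account for automated downloads; a fuller model would also weigh how the events are spread over time.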

Different Phases of Interest

Passive interest: "Hello World"
  • Description: Someone discovered or visited your website. They may or may not have any interest in your software or projects.
  • Events: Web traffic to docs or websites over the course of 1 or 2 days.
  • Action: I would not take any action here.

Intrigued in your software: "This looks interesting"
  • Description: Someone takes more than a drive-by interest in your software. They are truly interested in what you have.
  • Events: Documentation views. Looking at install docs and/or feature lists. Typically this is over multiple days.
  • Action: I would consider promoting content to that company's target audience (engineers?) on other external channels.

Trial & Exploration: "Let me try this out"
  • Description: They move from just learning about the software to actually downloading it.
  • Events: Documentation and website views of high value pages along with at least 1 download event. You still see this traffic over multiple days but typically over a week or two.
  • Action: I would recommend promoting blogs or how-tos that are interesting to this group of customers. You could even promote this content directly on your website when these visitors appear.

Testing & Evaluation: "I wonder if I can use this for this project"
  • Description: Now someone is looking deeper into this and is starting to either use it or seriously consider it.
  • Events: Sustained page views and multiple downloads over a month period.
  • Action: Additional content promotion is still a good idea here, but where there is a strong commercial offering, targeting these customers with it can be effective.

Implementation & Reliance: "This is cool, let's use this in production"
  • Description: Someone is using this over a longer period of time and looks to be beyond merely testing/trying out.
  • Events: If you see activities (both downloads and traffic) spread over a 90 day period, there is a high confidence in their usage in a critical space.
  • Action: This is the best time to seek out conversations.
    - Cold outreach
    - Targeted ads
    - Seek out devs at conferences

Maintenance & Ongoing Upkeep: "Keeping things updated and safe"
  • Description: Someone has been using your software for months and is grabbing new versions of your software and reading readmes or regular updates (like blogs).
  • Events: Look for activities over months (3-12 months), with downloads of multiple versions. Also look for views on readmes or product specific content (blogs, etc).
  • Action: This is the best time to seek out conversations.
    - Cold outreach
    - Targeted ads
    - Seek out devs at conferences

Waning Interest & Potential Churn: "Uh oh… this user is at risk"
  • Description: Usage is dropping and there is risk that this user may turn from an active user to a former user.
  • Events: If you see massive drop offs in traffic and downloads over a 30 day period this sends up red flags.
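
One way to turn the phases above into something a script can apply is a rule cascade over a per-company activity summary. The `Activity` fields and every threshold below are a loose reading of the phase descriptions and should be treated as assumptions, not as fixed definitions.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    span_days: int             # days between first and last observed event
    downloads: int             # download events in that span
    versions: int              # distinct versions downloaded
    doc_views: int             # documentation page views
    recent_drop: bool = False  # large drop in traffic/downloads over 30 days


def phase(a):
    """Map a company's activity summary to a phase of interest."""
    if a.recent_drop:
        return "Waning Interest & Potential Churn"
    if a.span_days >= 90 and a.versions > 1:
        return "Maintenance & Ongoing Upkeep"
    if a.span_days >= 90 and a.downloads > 0:
        return "Implementation & Reliance"
    if a.span_days >= 30 and a.downloads > 1:
        return "Testing & Evaluation"
    if a.downloads >= 1:
        return "Trial & Exploration"
    if a.doc_views > 0 and a.span_days >= 2:
        return "Intrigued in your software"
    return "Passive interest"
```

The rules are checked from the latest phase to the earliest, so a company is always placed in the most advanced phase its activity supports.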

So we identified interesting companies; now, what do you do with this data?

This section of the guide provides recommendations on how to utilize the data obtained from downloads, website traffic, and documentation usage to enhance product adoption and discover potential leads. Different strategies are outlined for integrating these insights into existing sales/marketing activities, developing a product-led growth strategy, and for startups or new sales/marketing initiatives.

Becoming a Customer is a Journey:

Becoming a customer is indeed a journey that mirrors the transformation of a budding interest into a commercial relationship. This journey begins with a spark of curiosity, driving an individual to explore and try out the software. As they interact with the software, they begin to craft something unique, leading to deployment in a production environment. This stage often uncovers additional needs that may call for a commercial relationship, such as expert support, advanced features, or scale-up capacities.

This entire process can undoubtedly unfold organically, but it can be significantly enhanced, made more fruitful, or even accelerated by tailoring the right activities towards a user or company at the appropriate time. Key players in facilitating this journey include Product Development, Marketing, Developer Relations (DevRel), Sales, and even the Community. They collectively orchestrate a symphony of support and guidance for the user, with each instrument playing a vital role at the right moment.

An aptly timed article, a resonant message, a well-crafted tutorial, or a stimulating community discussion can serve as powerful catalysts in this journey, greatly influencing the user's progression. However, it's crucial to maintain a delicate balance. Overzealous pushing or rushing can result in adverse consequences, creating resistance or disengagement rather than fostering advancement.

As such, it becomes paramount to possess a deep understanding of where a company or user is in their journey. The more detailed your insights into their progress, the more effectively you can tailor your efforts. Similarly, having robust metrics around what strategies are fruitful and which ones fall short is equally beneficial. These insights not only inform your current strategies but also help shape your future approaches, ensuring you continuously enhance your user's journey towards becoming a valuable customer.

When they face challenges or opportunities in their production environment, show them how you can help them succeed with your solutions.

General Advice:

In today's data-driven world, harnessing and leveraging the power of download data and website traffic information can yield impressive results for organizations of all sizes, from startups to established enterprises. However, effectively employing these data requires a strategic and tailored approach to meet the unique needs and goals of each organization. Below are some general recommendations based on the discussions above that can apply across the board:

  • Understand Your Audience: Use download data and website traffic information to build a deeper understanding of your audience. This involves analyzing who is downloading your software, viewing your documentation, and browsing your website. With this information, you can enrich your existing leads, score potential ones, and build a well-informed customer profile.
  • Customize Your Approach: Once you've gathered and analyzed your data, tailor your marketing and sales processes to align with your findings. Whether you're focusing on sales/marketing or product, align your strategies and activities with the preferences and behaviors of your users. This could involve adjusting lead scoring based on the activity level or nurturing potential users to become ongoing ones.
  • Integrate Data with Existing Processes: Integrate your new data with your existing sales, marketing, and customer success processes. For instance, using download patterns to assess the churn potential can help you anticipate and mitigate customer attrition.
  • Adopt a Nurturing Approach: When it comes to new or startup sales/marketing processes, take a nurturing approach. This means guiding users through a lifecycle where they are initially familiarized with your software, then nurtured to become regular users, and eventually led to become paid customers.
  • Leverage Social Media: Social media platforms offer targeted marketing opportunities. Platforms like LinkedIn allow you to aim your promoted content towards specific companies and job titles.
  • Optimize Content: Make use of your existing content and create new content based on where your users and companies are spending the most time. Calls-to-action (CTAs) on these pages can effectively guide users through your marketing funnel.
  • Community Engagement: Encourage users to join your community, participate in events, and engage in discussions. Community engagement can serve as a powerful tool for user retention and organic growth.
  • Monitor and Adapt: Regularly assess the effectiveness of your strategies and be willing to make necessary adjustments. The digital landscape is ever-evolving, and your strategies should be adaptable to accommodate these changes.

Remember, the overarching aim should be to use this data to deliver value to your users, nurture relationships, and ultimately drive the growth of your organization.

Integrating within Existing Sales/Marketing Activities:

Existing sales and marketing activities can be significantly enriched by smartly integrating download data and website traffic information. By revising your lead scoring methodology to include new data points such as software downloads and page visits, you can ensure that you are incorporating the latest indicators of interest from your audience. The enhanced lead scoring will provide a more nuanced understanding of your prospective customers, paving the way for more targeted and effective outreach.

Use the company lists generated from this data in your cold outreach activities. By focusing your outreach efforts on these companies, you are targeting organizations already demonstrating interest, thereby increasing your chances of gaining a receptive audience. These lists can also serve as a valuable resource for your Business Development Representatives (BDRs), equipping them with a list of vetted leads, saving time and improving their efficiency.

Additionally, using this data, you can strategically plan meetings at conferences, events, and similar networking platforms with representatives from companies using or showing interest in your product. This targeted networking can lead to higher-value interactions and ultimately result in stronger leads.

Incorporating the pattern of downloads into your customer success and renewal operations can provide a more comprehensive customer overview. Such insights into customer behavior can inform your renewal strategies, equipping you with necessary foresight to address potential issues and ensure customer satisfaction. Moreover, the data can be a key indicator of potential churn risks, allowing you to proactively manage customer retention by identifying and addressing their concerns before they choose to discontinue your service.

TLDR:

  • Use the data to enrich your existing set of leads. You can add additional events to your lead scoring process.
  • Use the data to build a highly qualified list for outreach activities. Target companies that are using your software or are interested in your software.
  • Use this data to inform your marketing strategies. For example, prioritize individuals from companies that have shown interest in your software at meetings, conferences, and events.
  • If you have a fully fleshed out sales, marketing, and customer success process, use the data to assess churn risk.

Startup or New Marketing/Sales Activities (Active Prospecting):

For startups or companies initiating new sales and marketing initiatives, creating a lightweight growth engine that nurtures potential users can be the key to driving growth. Setting up a lifecycle or nurture campaign can guide potential users through your marketing funnel, providing them with the right content at the right time to foster interest and engagement.

Promoted content can be a powerful tool in these campaigns. Aimed at users in the early stages of engagement, this content can educate users about your software, showcasing its features and benefits and encouraging them to explore it further. As these potential users turn into ongoing users, you can begin to introduce promoted content, offers, and cold outreach to convert them into paying customers.

Understanding the customer journey is crucial in a startup or new marketing environment. By mapping out this journey and identifying combinations of events and thresholds, you can strategize when to increase or decrease marketing activities for optimal effect. This dynamic approach can keep your marketing efforts agile and responsive to user behavior.

Social media platforms like LinkedIn offer a targeted way to reach specific companies.

TLDR:

  • Use this data to build a lightweight marketing and growth engine.
  • Approach the process as a life cycle or nurture type campaign. Nurture potential users until they become productive users.
  • Use promoted content targeted towards companies that are downloading or have looked at your documentation.
  • Once users are actively using your software, shift the focus to ongoing maintenance and new releases. Then, start introducing your paid offerings or services.
  • Use social media to engage potential users.
  • Integrate the Scarf platform into your existing community activity to help nurture and guide potential users.

Integrating into a Product-Led Growth Strategy:

In a product-led growth strategy, the primary focus is on expanding product usage, and insights from website traffic and download data can play a crucial role in driving this growth. You can target specific companies with promoted content on various channels, catching the attention of potential or current users and stimulating their interest in your software.

Educational resources such as blogs, tutorials, and videos offer a non-intrusive way to engage companies that are exploring your software. These resources can help prospective users understand the value your product offers and how it can address their needs, fostering trust and driving product adoption.

Networking can also play a pivotal role in a product-led growth strategy. You can seek out speakers and attendees from targeted companies at industry conferences and events, fostering relationships that can lead to future collaborations or customers.

To gain a holistic picture of user engagement and behavior, consider merging this download and website usage data into your community tools, such as Common Room. This integration will allow you to monitor how users interact with your product and community, providing insights that can help shape your product development and marketing strategies.

TLDR:

  • Use download data to understand product adoption and usage patterns.
  • Monitor decreasing downloads or decreasing activity as a potential indicator of churn.
  • Use the data to understand which stage of the company's life cycle the users are in. This will help inform product development and roadmap decisions.
  • Integrate with existing community tools to build a complete picture of potential users -> users -> community members.
  • Use CTAs (calls to action) for events such as "join our community", where you can convert anonymous users into known ones.
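
Monitoring for decreasing activity can be as simple as comparing the latest window with the one before it. The sketch below assumes you have a list of event dates (downloads or page views) for one company; the `churn_flag` name, the 30-day window, and the 50% drop threshold are illustrative defaults, not recommendations from Scarf.

```python
from datetime import date, timedelta


def churn_flag(event_days, today, window=30, drop_ratio=0.5):
    """Flag a company whose activity fell sharply in the latest window.

    event_days: dates of downloads or page views for one company.
    Returns True when the latest `window` days saw fewer than
    `drop_ratio` times the events of the window before it.
    """
    recent_start = today - timedelta(days=window)
    previous_start = today - timedelta(days=2 * window)
    recent = sum(1 for d in event_days if recent_start < d <= today)
    previous = sum(1 for d in event_days if previous_start < d <= recent_start)
    if previous == 0:
        return False  # no baseline to compare against
    return recent < previous * drop_ratio
```

Companies that trip this flag are candidates for a check-in from customer success or community before they become former users.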

Using Scarf:

Introducing Scarf to your Community:

When adding Scarf to your website or as part of your deployment strategy, you may get questions from users.

Here is some basic information about Scarf that others have found useful in discussing with their users when asked about using Scarf:

  • Scarf is used by thousands of projects to collect analytics for package downloads, documentation views, and website traffic
  • Scarf is fully GDPR compliant and ensures PII is protected
  • Scarf has passed the privacy, compliance, and legal requirements to be approved by open source foundations like the Apache Software Foundation
  • https://privacy.apache.org/policies/privacy-policy-public.html
  • https://privacy.apache.org/faq/committers.html
  • Scarf provides cookie-less, privacy-focused website and documentation analytics
  • Scarf stores only the bare minimum metadata needed to collect and aggregate analytics data for our users.

Scarf also provides your users with other benefits:

  • Your downloads are no longer locked to a single hosting provider or service. As services (such as container registries or package managers) change their terms of service or make changes to their offerings, you can adjust your hosting without changing your docs or impacting your users in the future.
  • Scarf can be used to determine how exposed your user base is to old or insecure software, enabling your project to take a proactive approach to informing and educating your user base of potential issues 
  • Improves the sustainability of your project by providing data on the real user base to investors (without exposing PII).  

Setting up Scarf:

Scarf is very straightforward to get started with.  

Overall the process is:

1. Sign up for a free account at: http://app.scarf.sh/register

2. For downloads,

     a. Set up a new package URL via the Scarf Gateway within your Scarf Dashboard.

     b. Point this URL to your current download endpoints.

     c. Update installation and setup documentation to direct users to use the gateway.

3. For Documentation or website tracking:

     a. Create a Scarf Tracking Pixel and add it to the pages you want analytics for (whether on your site or on third party sites).

4. For Link Tracking and social monitoring:

     a. Create a new URL in the Scarf Gateway as a redirect/link shortener to your website, YouTube, Hacker News, or other sites.

     b. When posting links on social media use the new URL instead of the main link.  Data will then be available in the Scarf dashboard.

5. For Basic Call Home functionality:

     a. Create a basic URL in Scarf Gateway that will act as an endpoint for your applications to ping.

     b. Point the URL to a blank page.

     c. In your software, issue an asynchronous web call, ping, or page load using your favorite tool (e.g. curl or libcurl). You can make this call on start, daily, or every time something runs; it is up to you. You can throw away the result, since the background call that opens the URL is enough.
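
To make step 5 concrete, here is a minimal sketch of a call home ping in Python, using only the standard library. The endpoint URL and the function names are hypothetical; swap in the URL you created in the Scarf Gateway. The ping runs in a background thread and swallows every error, so a network problem never slows down or crashes your application.

```python
import threading
import urllib.request

# Hypothetical endpoint: replace with the URL you created in the Scarf Gateway.
CALL_HOME_URL = "https://example.gateway.scarf.sh/call-home"


def ping(url: str, timeout: float = 2.0) -> bool:
    """Open the URL once and discard the response. Returns True on success."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            pass  # The result is thrown away; the request itself is the signal.
        return True
    except Exception:
        # A call home must never break the host application.
        return False


def call_home(url: str = CALL_HOME_URL) -> threading.Thread:
    """Send the ping from a daemon thread so the caller is never blocked."""
    thread = threading.Thread(target=ping, args=(url,), daemon=True)
    thread.start()
    return thread
```

Calling call_home() once at startup is the simplest option; a daily or per-run schedule works the same way. If you ship this in your software, consider documenting it and giving users a way to opt out.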

You can see our 3-minute tutorial on YouTube here: 

If you are looking for documentation on tracking links to your website or posts via social media, we produced a tutorial for this as well:

You can read our documentation here:  https://docs.scarf.sh/

Scarf Tracking Recommendations:

There are lots of different things you can track using Scarf. Here is a list of recommendations from our users.

Basic tracking:

  • Tracking package downloads via the Scarf Gateway with a custom URL
    - Create custom variables for each version of your software - enabling version tracking
    - If you are an OSS project that’s supported by multiple vendors and/or an open source foundation, it may be easier to use Scarf URLs for your gateway packages rather than a custom domain, e.g. apacheproject.gateway.scarf.sh rather than apacheproject.org
    - In file package routes, you can add more variables to the incoming path for tracking purposes even if they are not used in the outgoing URL, and this can be used for attribution. e.g. download.com/v1.0/referal_source or similar.
    - File package route variables are very robust, so you can even put entire websites or paths behind them, e.g. website.com/{+path}. You can probably achieve most tasks with only a couple of routes.
    - You can use GitHub Actions’ cron functionality to run scheduled export jobs of your Scarf data for free!
    - Include referring domain where possible:
    scarf.gateway.scarf.sh/abc.com/{referer_domain}
  • Tracking website and documentation views with a Scarf Tracking Pixel
    - Add a different pixel for each category of page view, e.g. high value, medium value, low value.
    - You can add multiple tracking pixels to a single page if need be.
    - Include the referring page where needed.
    - Cross-site tracking
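
As a rough illustration of the "one pixel per category" recommendation, the sketch below picks a pixel based on the page path and renders the image tag for it. Everything here is an assumption to adapt: the pixel IDs are placeholders, the path prefixes are made up, and the pixel URL and referrerpolicy attribute follow the shape of the snippet the Scarf dashboard generates as we understand it, so copy the exact snippet from your own dashboard.

```python
from html import escape

# Placeholder pixel IDs: create one pixel per category in the Scarf dashboard.
PIXEL_IDS = {
    "high": "PIXEL-ID-HIGH",
    "medium": "PIXEL-ID-MEDIUM",
    "low": "PIXEL-ID-LOW",
}

# Assumed pixel URL shape; verify against the snippet from your dashboard.
PIXEL_URL = "https://static.scarf.sh/a.png?x-pxid={pixel_id}"


def categorize(path: str) -> str:
    """Map a page path to a value category (hypothetical prefixes)."""
    if path.startswith(("/pricing", "/contact")):
        return "high"
    if path.startswith("/docs"):
        return "medium"
    return "low"


def pixel_tag(path: str) -> str:
    """Return the <img> tag for the pixel that matches this page's category."""
    src = PIXEL_URL.format(pixel_id=PIXEL_IDS[categorize(path)])
    # The referrer policy lets the pixel request carry the referring page.
    return (
        '<img referrerpolicy="no-referrer-when-downgrade" '
        f'src="{escape(src)}" alt="" />'
    )
```

In a static site generator or template, you would call pixel_tag(page_path) once per page and drop the result into the page footer.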

Advanced tracking:

  • Call home functionality via gateway and/or scarf-js
  • Link sharing tracking via the gateway using a custom URL
    • Use variables to allow for custom pages…
      • e.g. YouTube
        1. Redirect /youtube/{videoname} to abc.com/youtube/{videoname}
        2. This allows you to use the same gateway for multiple videos on YouTube
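
If route variables are new to you, the sketch below shows the idea in plain Python. It is not Scarf's implementation, just a small model of how a single route with a {variable} can match many incoming paths and fill in the outgoing URL. The function names are hypothetical, the https:// scheme is added for illustration, and the {+path} form mentioned earlier is not handled here.

```python
import re
from typing import Optional


def compile_route(template: str) -> "re.Pattern[str]":
    """Turn '/youtube/{videoname}' into a regex with one named group per variable."""
    # re.split with a capture group alternates literal text and variable names.
    parts = re.split(r"\{(\w+)\}", template)
    pattern = ""
    for index, part in enumerate(parts):
        if index % 2 == 0:
            pattern += re.escape(part)          # literal path text
        else:
            pattern += f"(?P<{part}>[^/]+)"     # one path segment per variable
    return re.compile(f"^{pattern}$")


def redirect_target(incoming: str, outgoing: str, path: str) -> Optional[str]:
    """Return the outgoing URL for a path, or None if the route does not match."""
    match = compile_route(incoming).match(path)
    if match is None:
        return None
    # Variables that the outgoing template does not use are simply ignored.
    return outgoing.format(**match.groupdict())
```

For example, redirect_target("/youtube/{videoname}", "https://abc.com/youtube/{videoname}", "/youtube/intro-demo") yields https://abc.com/youtube/intro-demo. An incoming route such as /{version}/{referral_source} can point at an outgoing URL that only uses {version}; the extra variable still shows up on the incoming path, which is what makes it useful for attribution.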