Navigating the world of open source software can often feel like trying to find a needle in a haystack when it comes to identifying potential leads and customers. We're up against several unique challenges that aren't typically seen in other industries. Firstly, we're competing against 'free' - a tough proposition in any business context. Secondly, the open-source nature of our software means that many of our users and their respective companies stay hidden behind the veil of anonymity, turning customer identification into a high-stakes game of hide and seek. And there are countless other hurdles that add to the complexity of this landscape. Despite these challenges, there's a wealth of untapped potential buried within anonymous download data, web traffic, and documentation views. Stick around as we unravel the mystery of transforming this sea of anonymous data into valuable company profiles, turning seemingly anonymous interactions into meaningful business opportunities.
Lack of Leads: This is not a new problem
Some time ago, I worked for a company called MySQL AB, the company that brought us MySQL.
The company, like many others offering commercial open source solutions, grappled with several challenges. A big one was figuring out who was using their software. After all, as an open source company, we were not only competing with 'free', we also had limited visibility into who was using our products since they were freely distributed through various anonymous channels.
The way MySQL handled this was by tapping into download and traffic data. It worked like this - our sales and marketing team would approach a company about the value of our support, to which they would respond, "We know MySQL, but we don't use your database (or any open source, for that matter)". That’s when we'd drop the bombshell: "Well, that's funny because our records show you downloaded it 1000 times last year." You can imagine the surprise on their faces! This led to some serious internal discussions and often revealed a significant usage of our software within their tech stack, unbeknownst to the decision-makers.
Fast forward 20 years, and we're still dealing with the same problems, but at an even faster pace. Open source usage has exploded, but managers are still not fully aware of the components and dependencies in their tech stack. And with so much software coming from anonymous channels, it's still a struggle for commercial open source companies to figure out who's using their software.
But hey, it's not all doom and gloom. Challenges also present opportunities, right? We just need to learn to engage with our user base, not just for sales conversations but to ensure they're using our open source software effectively and can deploy it successfully in production. So, let's explore together how to make the most of this situation in this data-driven world.
Important Activities You Should Track in Open Source:
There are 3 events I would suggest everyone track.
- Downloads/Pull events
- Views of documentation
- Views of content/website (pages, blogs, tutorials)
The first is downloads, whether they happen directly on your website, via a container registry, or via public repositories (Scarf allows you to track and aggregate downloads across all these different channels). This is probably the most valuable action. A download means someone has not only some interest in your product but enough interest to try it out.
There are three aspects to downloads which you should be paying attention to:
- The number of downloads from unique sources at a company - more than one machine/source downloading is good.
- The volume of downloads over a time period at the company - you want to see continued downloads over time, since this implies ongoing usage.
- Whether the company is downloading newer versions of the software over time - this is gold, as it implies they are not only downloading but also trying to keep things up to date, and that the software is critical enough to have maintenance procedures around it.
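The three download signals above can be summarized per company with a few lines of code. This is a minimal sketch, assuming you already have download events attributed to a company. The `Download` record and its field names are hypothetical and do not reflect any particular export format.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Download:
    company: str   # company inferred from the download's origin
    source: str    # hypothetical identifier for the machine or network origin
    version: str   # version that was downloaded, e.g. "1.2.0"
    day: date      # date of the download


def download_signals(events):
    """Summarize the three download signals for each company."""
    sources = defaultdict(set)
    months = defaultdict(set)
    versions = defaultdict(set)
    for e in events:
        sources[e.company].add(e.source)
        months[e.company].add((e.day.year, e.day.month))
        versions[e.company].add(e.version)
    return {
        company: {
            "unique_sources": len(sources[company]),   # more than one is good
            "active_months": len(months[company]),     # continued downloads over time
            "versions_seen": len(versions[company]),   # keeping up with releases
        }
        for company in sources
    }


# Made-up example data.
events = [
    Download("acme.example", "host-a", "1.0.0", date(2023, 1, 5)),
    Download("acme.example", "host-b", "1.1.0", date(2023, 2, 9)),
    Download("acme.example", "host-a", "1.2.0", date(2023, 3, 14)),
    Download("solo.example", "host-z", "1.0.0", date(2023, 1, 20)),
]
signals = download_signals(events)
```

In this made-up data, `acme.example` shows two sources, three active months, and three versions, which is the pattern to look for. `solo.example` shows a single one-off download.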
The second on the list is documentation views. People using your software will often have questions about how to install, use, and upgrade the software. You will see patterns evolve over time in the usage of the software docs depending on the software. Initially you will see more traffic to installation and setup sections. This, coupled with download events, is a great indicator of testing or trying things out. Then users will move on to troubleshooting or optimization pages. Seeing more page views shift to these sections is normal. Then you should see views of readmes or upgrade pages as they settle into maintenance and sustain mode. Ultimately I would be looking for views over an extended period of time to ensure they are invested and not just kicking the tires.
The third on the list is content/website views. Not all views will be coming from docs. In fact, for commercial purposes there are certain pages on your website that may be highly predictive of potential interest in becoming a customer (e.g. the pricing pages). But I recommend looking for ongoing views and traffic hitting blogs and other news on the product and upcoming releases.
For each of the events, I would recommend breaking down all the activities into either good/better/best or low/medium/high impact events. Here is a suggested list of criteria when it comes to classifying events:
The Riskiest But Most Valuable Metric: Ongoing Usage
While the three activities above are straightforward and generally not viewed with too much concern, there is a fourth activity or metric you can (and probably should) track: 'call-home' functionality, also known as ongoing usage tracking. It is an essential, albeit controversial, activity that serves as a highly valuable metric for any organization seeking to understand the usage patterns of its software. The call-home functionality is a mechanism within your software that sends a signal, or a 'ping', back to a designated server or gateway. This signal provides you with real-time information about your software's usage in live production environments, surpassing the insight level gained from just tracking downloads.
While download data can indicate interest and repeated use of your software, the ongoing, consistent 'ping' or call-home activity serves as a definitive predictor of your software's actual usage. Consider this the 'Nirvana' of metrics for your projects, the gold standard that allows you to measure the size of your active install base and the frequency of software usage and deployment.
However, implementing this mechanism requires a degree of technical adaptation. Platforms like Scarf, for instance, offer this capability out-of-the-box. But to make full use of it, you'll need to adjust your application accordingly. There are different ways to accomplish this; for JavaScript applications, a package called 'Scarf-JS' can be used. Alternatively, a lightweight background 'ping' to a Scarf Gateway endpoint can be employed. This ping can be triggered when your application starts up, is used, or at any other specified event.
In essence, your application would asynchronously call back to the gateway website, which doesn't return any data but rather tracks that the application was active. If you can successfully implement this, you can then monitor the number of unique pings over a certain period from various sources. This is incredibly valuable for lead scoring as it provides consistent, ongoing proof of life from these systems, making it the most valuable event or activity you could track.
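A minimal sketch of such a ping in Python, using only the standard library. `PING_URL` is a placeholder for an endpoint you would create yourself, not a real Scarf address, and the function names are illustrative. The call runs on a background thread and the response is discarded, so a slow or unreachable network does not affect the application.

```python
import threading
import urllib.request

# Hypothetical endpoint: replace with the URL you create for your own project.
PING_URL = "https://example.gateway.invalid/ping"


def send_ping(url=PING_URL, opener=urllib.request.urlopen, timeout=2.0):
    """Open the URL once and discard the response; never raise."""
    try:
        with opener(url, timeout=timeout):
            pass  # the request itself is the signal; the body is ignored
        return True
    except Exception:
        # A failed ping must never break the application.
        return False


def ping_in_background(url=PING_URL, opener=urllib.request.urlopen):
    """Fire the ping on a daemon thread so startup is not delayed."""
    thread = threading.Thread(target=send_ping, args=(url, opener), daemon=True)
    thread.start()
    return thread
```

Calling `ping_in_background()` once at startup would be enough for a basic proof-of-life signal. The `opener` parameter exists only so the network call can be swapped out in tests.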
Lead Scoring or User Scoring is Still Needed:
Not all people visiting your website and downloading your software are equally likely to become customers. In fact, you will find 3x, 5x, or even 10x more drive-by traffic than traffic from those interested in commercial offerings. To become efficient at finding which companies and users you should focus on, let's explore the concept of “lead scoring”.
Lead scoring is a methodology used by sales and marketing departments to determine the worthiness of leads, or potential customers, by assigning values to them based on their behavior relating to their interest level in products or services. These values, or scores, are derived from a variety of factors like the professional information they've submitted, how they've engaged with the company's website, or their response to marketing efforts. The purpose of lead scoring is to prioritize leads who are more likely to convert into customers, allowing teams to focus their time and resources effectively. It's a vital part of creating an efficient sales and marketing strategy.
If you've already established a lead scoring system and are utilizing marketing software, consider events in open-source channels as additional data points to further qualify or uncover leads. For instance, a software download could be treated as a high-value (or high-score) activity, whereas a documentation view might be scored similarly to other website visits. It could be beneficial to categorize documentation and page views into high, medium, and low scoring pages, as certain pages (like pricing or install pages) can be more predictive and valuable than others.
The key distinction between traditional lead scoring and the incorporation of open-source download and traffic data lies in the summarization of data at the company level, requiring decisions on scoring criteria. Most marketing lead management tools track users based on sign-ups, cookies, or other mechanisms, capturing specifics such as Matt from Scarf signing up for a webinar. With data from anonymous sources, the best we can do is infer that someone from Scarf has downloaded your software.
The question then becomes: if you know Matt attended a webinar and works at Scarf, does the Scarf download make Matt a more qualified lead? Or should you shift your focus to other individuals at Scarf, possibly higher up in the management hierarchy? There's no absolute right or wrong answer, but my inclination would be to enrich the data of the known user who has already shown interest.
Additionally, it's important to note that software downloads can often be automated. Seeing ten downloads a day doesn't necessarily equate to thousands of servers or the potential for a massive deal. This data needs to be scrutinized, at the very least, by examining the unique systems or origins from which these downloads come.
Lastly, when incorporating open-source downloads and traffic data, the timeline of events becomes critical. A single download could mean anything, but consistent downloads over several months, especially with each new version release, suggest a real, potentially highly qualified user.
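The company-level scoring described in this section can be sketched as follows. The event names and weights are illustrative assumptions only and should be tuned to your own funnel. Repeated events of the same type from the same source are counted once, which is one simple way to keep an automated job from inflating a company's score.

```python
from collections import defaultdict

# Illustrative weights only; these are assumptions, not recommended values.
EVENT_WEIGHTS = {
    "download": 10,
    "pricing_view": 8,
    "install_docs_view": 5,
    "docs_view": 2,
    "blog_view": 1,
}


def score_companies(events):
    """Score companies from (company, source, event_type) tuples.

    Each (company, source, event_type) combination counts once, so a CI job
    that downloads every day adds the download weight a single time.
    """
    seen = set()
    scores = defaultdict(int)
    for company, source, event_type in events:
        key = (company, source, event_type)
        if key in seen:
            continue
        seen.add(key)
        scores[company] += EVENT_WEIGHTS.get(event_type, 0)
    return dict(scores)


# Made-up example data.
events = [
    ("acme.example", "ci-runner", "download"),
    ("acme.example", "ci-runner", "download"),   # repeat from the same source
    ("acme.example", "laptop-1", "download"),
    ("acme.example", "laptop-1", "pricing_view"),
    ("solo.example", "laptop-9", "blog_view"),
]
scores = score_companies(events)
```

Here `acme.example` scores 28 (two distinct download sources plus a pricing view) and `solo.example` scores 1. A fuller version might also weight how many months the activity spans.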
So we identified interesting companies; now, what do you do with this data?
This section of the guide provides recommendations on how to utilize the data obtained from downloads, website traffic, and documentation usage to enhance product adoption and discover potential leads. Different strategies are outlined for integrating these insights into existing sales/marketing activities, developing a product-led growth strategy, and for startups or new sales/marketing initiatives.
Becoming a Customer is a Journey:
Becoming a customer is indeed a journey that mirrors the transformation of a budding interest into a commercial relationship. This journey begins with a spark of curiosity, driving an individual to explore and try out the software. As they interact with the software, they begin to craft something unique, leading to deployment in a production environment. This stage often uncovers additional needs that may call for a commercial relationship, such as expert support, advanced features, or scale-up capacities.
This entire process can undoubtedly unfold organically, but it can be significantly enhanced, made more fruitful, or even accelerated by tailoring the right activities towards a user or company at the appropriate time. Key players in facilitating this journey include Product Development, Marketing, Developer Relations (DevRel), Sales, and even the Community. They collectively orchestrate a symphony of support and guidance for the user, with each instrument playing a vital role at the right moment.
An aptly timed article, a resonant message, a well-crafted tutorial, or a stimulating community discussion can serve as powerful catalysts in this journey, greatly influencing the user's progression. However, it's crucial to maintain a delicate balance. Overzealous pushing or rushing can result in adverse consequences, creating resistance or disengagement rather than fostering advancement.
As such, it becomes paramount to possess a deep understanding of where a company or user is in their journey. The more detailed your insights into their progress, the more effectively you can tailor your efforts. Similarly, having robust metrics around what strategies are fruitful and which ones fall short is equally beneficial. These insights not only inform your current strategies but also help shape your future approaches, ensuring you continuously enhance your user's journey towards becoming a valuable customer.
General Advice:
In today's data-driven world, harnessing and leveraging the power of download data and website traffic information can yield impressive results for organizations of all sizes, from startups to established enterprises. However, effectively employing these data requires a strategic and tailored approach to meet the unique needs and goals of each organization. Below are some general recommendations based on the discussions above that can apply across the board:
- Understand Your Audience: Use download data and website traffic information to build a deeper understanding of your audience. This involves analyzing who is downloading your software, viewing your documentation, and browsing your website. With this information, you can enrich your existing leads, score potential ones, and build a well-informed customer profile.
- Customize Your Approach: Once you've gathered and analyzed your data, tailor your marketing and sales processes to align with your findings. Whether you're focusing on sales/marketing or product, align your strategies and activities with the preferences and behaviors of your users. This could involve adjusting lead scoring based on the activity level or nurturing potential users to become ongoing ones.
- Integrate Data with Existing Processes: Integrate your new data with your existing sales, marketing, and customer success processes. For instance, using download patterns to assess the churn potential can help you anticipate and mitigate customer attrition.
- Adopt a Nurturing Approach: When it comes to new or startup sales/marketing processes, take a nurturing approach. This means guiding users through a lifecycle where they are initially familiarized with your software, then nurtured to become regular users, and eventually led to become paid customers.
- Leverage Social Media: Social media platforms offer targeted marketing opportunities. Platforms like LinkedIn allow you to aim your promoted content towards specific companies and job titles.
- Optimize Content: Make use of your existing content and create new content based on where your users and companies are spending the most time. Calls-to-action (CTAs) on these pages can effectively guide users through your marketing funnel.
- Community Engagement: Encourage users to join your community, participate in events, and engage in discussions. Community engagement can serve as a powerful tool for user retention and organic growth.
- Monitor and Adapt: Regularly assess the effectiveness of your strategies and be willing to make necessary adjustments. The digital landscape is ever-evolving, and your strategies should be adaptable to accommodate these changes.
Remember, the overarching aim should be to use this data to deliver value to your users, nurture relationships, and ultimately drive the growth of your organization.
Integrating within Existing Sales/Marketing Activities:
Existing sales and marketing activities can be significantly enriched by smartly integrating download data and website traffic information. By revising your lead scoring methodology to include new data points such as software downloads and page visits, you can ensure that you are incorporating the latest indicators of interest from your audience. The enhanced lead scoring will provide a more nuanced understanding of your prospective customers, paving the way for more targeted and effective outreach.
Use the company lists generated from this data in your cold outreach activities. By focusing your outreach efforts on these companies, you are targeting organizations already demonstrating interest, thereby increasing your chances of gaining a receptive audience. These lists can also serve as a valuable resource for your Business Development Representatives (BDRs), equipping them with a list of vetted leads, saving time and improving their efficiency.
Additionally, using this data, you can strategically plan meetings at conferences, events, and similar networking platforms with representatives from companies using or showing interest in your product. This targeted networking can lead to higher-value interactions and ultimately result in stronger leads.
Incorporating the pattern of downloads into your customer success and renewal operations can provide a more comprehensive customer overview. Such insights into customer behavior can inform your renewal strategies, equipping you with necessary foresight to address potential issues and ensure customer satisfaction. Moreover, the data can be a key indicator of potential churn risks, allowing you to proactively manage customer retention by identifying and addressing their concerns before they choose to discontinue your service.
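One simple way to turn download patterns into a churn signal is to compare a customer's recent download count with the period before it. This is a sketch under stated assumptions: the 90-day window and the 50% drop threshold are arbitrary illustrative values, and `churn_risk` is a hypothetical helper.

```python
from datetime import date, timedelta


def churn_risk(download_days, today, window_days=90, drop_ratio=0.5):
    """Flag a customer whose downloads in the latest window fell sharply.

    download_days: dates on which the customer's downloads were observed.
    Returns True when the recent count is below drop_ratio times the count
    in the window immediately before it.
    """
    recent_start = today - timedelta(days=window_days)
    previous_start = recent_start - timedelta(days=window_days)
    recent = sum(1 for d in download_days if recent_start < d <= today)
    previous = sum(1 for d in download_days if previous_start < d <= recent_start)
    if previous == 0:
        return False  # no baseline to compare against
    return recent < drop_ratio * previous


# Made-up example data.
today = date(2023, 7, 1)
declining = [date(2023, 1, 10), date(2023, 2, 10), date(2023, 3, 10),
             date(2023, 3, 20), date(2023, 5, 5)]
steady = [date(2023, 2, 10), date(2023, 3, 10), date(2023, 5, 10), date(2023, 6, 10)]

declining_flag = churn_risk(declining, today)
steady_flag = churn_risk(steady, today)
```

In this example the declining customer is flagged and the steady one is not. A flag like this is only a prompt for a customer success conversation, not proof of churn.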
TLDR:
- Use the data to enrich your existing set of leads. You can add additional events to your lead scoring process.
- Use the data to build a highly qualified list for outreach activities. Target companies that are using your software or are interested in your software.
- Use this data to inform your marketing strategies. For example, prioritize individuals from companies that have shown interest in your software at meetings, conferences, and events.
- If you have a fully fleshed out sales, marketing, and customer success process, use the data to assess churn risk.
Startup or New Marketing/Sales Activities (Active Prospecting):
For startups or companies initiating new sales and marketing initiatives, creating a lightweight growth engine that nurtures potential users can be the key to driving growth. Setting up a lifecycle or nurture campaign can guide potential users through your marketing funnel, providing them with the right content at the right time to foster interest and engagement.
Promoted content can be a powerful tool in these campaigns. Aimed at users in the early stages of engagement, this content can educate users about your software, showcasing its features and benefits and encouraging them to explore it further. As these potential users turn into ongoing users, you can begin to introduce promoted content, offers, and cold outreach to convert them into paying customers.
Understanding the customer journey is crucial in a startup or new marketing environment. By mapping out this journey and identifying combinations of events and thresholds, you can strategize when to increase or decrease marketing activities for optimal effect. This dynamic approach can keep your marketing efforts agile and responsive to user behavior.
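As a rough illustration of combining events and thresholds, the sketch below maps a company's activity counts to a journey stage. The stage names, inputs, and thresholds are all hypothetical; the point is only that the mapping can be made explicit and then adjusted as you learn what works.

```python
def journey_stage(downloads, install_doc_views, upgrade_doc_views, active_months):
    """Map a company's activity counts to a rough journey stage.

    All thresholds are illustrative assumptions, not recommended values.
    """
    if downloads == 0:
        return "aware"          # browsing only, nothing downloaded yet
    if active_months >= 3 and upgrade_doc_views > 0:
        return "maintaining"    # long-running usage with upgrade activity
    if active_months >= 2:
        return "ongoing"        # continued downloads over time
    if install_doc_views > 0:
        return "evaluating"     # downloaded and reading the install docs
    return "trying"             # a download with no other signal yet
```

Each stage can then be tied to a level of marketing activity, for example educational content for early stages and paid offerings only once a company reaches the later ones.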
Social media platforms like LinkedIn offer a targeted way to reach specific companies.
TLDR:
- Use this data to build a lightweight marketing and growth engine.
- Approach the process as a life cycle or nurture type campaign. Nurture potential users until they become productive users.
- Use promoted content targeted towards companies that are downloading or have looked at your documentation.
- Once users are actively using your software, shift the focus to ongoing maintenance and new releases. Then, start introducing your paid offerings or services.
- Use social media to engage potential users.
- Integrate the Scarf platform into your existing community activity to help nurture and guide potential users.
Integrating into a Product-Led Growth Strategy:
In a product-led growth strategy, the primary focus is on expanding product usage, and insights from website traffic and download data can play a crucial role in driving this growth. You can target specific companies with promoted content on various channels, catching the attention of potential or current users and stimulating their interest in your software.
Educational resources such as blogs, tutorials, and videos offer a non-intrusive way to engage companies that are exploring your software. These resources can help prospective users understand the value your product offers and how it can address their needs, fostering trust and driving product adoption.
Networking can also play a pivotal role in a product-led growth strategy. You can seek out speakers and attendees from targeted companies at industry conferences and events, fostering relationships that can lead to future collaborations or customers.
To gain a holistic picture of user engagement and behavior, consider merging this download and website usage data into your community tools, such as Common Room. This integration will allow you to monitor how users interact with your product and community, providing insights that can help shape your product development and marketing strategies.
TLDR:
- Use download data to understand product adoption and usage patterns.
- Monitor decreasing downloads or decreasing activity as a potential indicator of churn.
- Use the data to understand which stage of the company's life cycle the users are in. This will help inform product development and roadmap decisions.
- Integrate with existing community tools to build a complete picture of potential users -> users -> community members.
- Use CTAs (calls to action) such as "join our community" to convert anonymous users into known ones.
Using Scarf:
Introducing Scarf to your Community:
When adding Scarf to your website or as part of your deployment strategy you may get questions from users.
Here is some basic information about Scarf that others have found useful in discussing with their users when asked about using Scarf:
- Scarf is used by thousands of projects to collect analytics for package downloads, documentation views, and website traffic
- Scarf is fully GDPR compliant and ensures PII is protected
- Scarf has passed the privacy, compliance, and legal requirements to be approved by open source foundations like the Apache Software Foundation
- https://privacy.apache.org/policies/privacy-policy-public.html
- https://privacy.apache.org/faq/committers.html
- Scarf provides cookie-less, privacy-focused website and documentation analytics
- Scarf stores only the bare minimum metadata needed to collect and aggregate analytics data for our users.
Scarf also provides your users with other benefits:
- Your downloads are no longer locked to a single hosting provider or service. As services (such as container registries or package managers) change their terms of service or make changes to their offerings, you can adjust your hosting without changing your docs or impacting your users in the future.
- Scarf can be used to determine how exposed your user base is to old or insecure software, enabling your project to take a proactive approach to informing and educating your user base of potential issues.
- Improves the sustainability of your project by providing data on the real user base to investors (without exposing PII).
Setting up Scarf:
Scarf is very straightforward to get started with.
Overall the process is:
1. Sign up for a free account at: http://app.scarf.sh/register
2. For downloads:
a. Set up a new package URL via the Scarf Gateway within your Scarf Dashboard.
b. Point this URL to your current download endpoints.
c. Update installation and setup documentation to direct users to use the gateway.
3. For Documentation or website tracking:
a. Create a Scarf Tracking Pixel and add it to the pages you want analytics for (whether on your site or on third party sites).
4. For Link Tracking and social monitoring:
a. Create a new URL in the Scarf Gateway as a redirect/link shortener to your website, YouTube, Hacker News, or other sites.
b. When posting links on social media use the new URL instead of the main link. Data will then be available in the Scarf dashboard.
5. For Basic Call Home functionality:
a. Create a basic URL in Scarf Gateway that will act as an endpoint for your applications to ping.
b. Point the URL to a blank page.
c. In your software, issue an async web call, ping, or page load using your favorite tool (e.g. curl or libcurl). You can call this on start, daily, or every time something runs; it's up to you. You can throw away the result, as the mere background call to open the URL is enough.
You can see our 3-minute tutorial on YouTube here:
If you are looking for documentation on tracking links to your website or posts via social media we produced a tutorial for this as well:
You can read our documentation here: https://docs.scarf.sh/
Scarf Tracking Recommendations:
There are lots of different things you can track using Scarf; here is a list of recommendations from our users.
Basic tracking:
- Tracking package downloads via the Scarf Gateway with a custom URL
- Create custom variables for each version of your software - enabling version tracking
- If you are an OSS project that’s supported by multiple vendors and/or an open source foundation, it may be easier to use Scarf URLs for your gateway packages rather than a custom domain, e.g. apacheproject.gateway.scarf.sh rather than apacheproject.org
- In file package routes, you can add more variables to the incoming path for tracking purposes even if they are not used in the outgoing URL, and this can be used for attribution. e.g. download.com/v1.0/referal_source or similar.
- File package route variables are very robust, so you can even put entire websites or paths behind them, e.g. website.com/{+path}. You can probably achieve most tasks with only a couple of routes.
- You can use GitHub Actions’ cron functionality to run scheduled export jobs of your Scarf data for free!
- Include referring domain where possible: scarf.gateway.scarf.sh/abc.com/{referer_domain}
- Tracking website and documentation views with a Scarf Tracking Pixel
- Add a different pixel for each category of page view i.e. high value, medium value, low value.
- You can add multiple tracking pixels to a single page if need be.
- Including the referring page where need be.
- Cross-site tracking
Advanced tracking: