How To Select Open Source Software

We are often asked by our clients to help select the best open source software for a given use case. For example, we might be asked to select a data warehouse for analytics queries, a time series database for marketing events, or an ML library for a recommendation engine. Through a combination of research, data analysis, and prototyping, we work with our clients to make the decision for their business.

For each use case, we first try to understand the client's business requirements in terms of functionality, performance, and cost. We then look at the open source solutions (and cloud services, if appropriate) that might solve the underlying problem. Given a subset of the most promising candidates, we dig deeper to understand best practices around each solution, how well the architecture fits the intended use, and how well the software fits the customer’s current stack (e.g., integrations, infrastructure, and team skills). We also look at licenses and make sure the customer picks a project whose license is compatible with their business model.

We then evaluate the open source project itself. Using data we have gathered, we look at the history of the project. Is activity on the project growing or declining? Who is contributing to the project? Are the founders still with the project? Does the project have a large corporate or institutional backer? We map the project to a stage of the open source software life cycle: Niche, Mainstream, or Inactive, as described in this blog post. We also look at the quality of the documentation, the clarity of the public APIs, and the availability of help, whether through an issue tracker, public forums, or commercial support.
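To make the "growing or declining" question concrete, here is a minimal sketch of one way to compare recent activity against earlier activity. It is an illustration, not our actual analysis: the function name activity_trend, the six-month window, and the 20% threshold are all assumptions chosen for the example, and it does not attempt to assign the Niche, Mainstream, or Inactive stages. The input is assumed to be a list of monthly commit counts, oldest first, gathered from whatever source you trust.

```python
from statistics import mean


def activity_trend(monthly_commits, window=6, threshold=0.2):
    """Label commit activity as 'growing', 'declining', or 'stable'.

    Compares the mean of the most recent `window` months with the mean of
    the `window` months before that. `window` and `threshold` are
    illustrative defaults, not recommended values.
    """
    if len(monthly_commits) < 2 * window:
        raise ValueError("need at least 2 * window months of data")

    recent = mean(monthly_commits[-window:])
    previous = mean(monthly_commits[-2 * window:-window])

    # Avoid dividing by zero when the earlier period had no commits.
    if previous == 0:
        return "growing" if recent > 0 else "stable"

    change = (recent - previous) / previous
    if change > threshold:
        return "growing"
    if change < -threshold:
        return "declining"
    return "stable"


if __name__ == "__main__":
    # Hypothetical counts: twelve months, oldest first.
    print(activity_trend([40, 42, 38, 41, 39, 40, 30, 28, 25, 22, 20, 18]))
```

A single number like this is only a starting point. It says nothing about who is contributing or whether the founders are still involved, so it should be read alongside the other questions above.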

Finally, we may prototype a small subset of the use case to validate functionality, or create use-case-specific benchmarks to evaluate performance. For example, we recently benchmarked a time series database for use as a cache of marketing events. We found that a mismatch between the design of the database and our access patterns resulted in poor performance. Ultimately, we found a solution for the customer that was a better match for their use case.
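A use-case-specific benchmark does not need to be elaborate. The sketch below shows the general shape of a small latency harness: it times a caller-supplied operation many times and reports the median, 95th percentile, and maximum latency. The benchmark function and its parameters are hypothetical names for this example, and the dictionary lookup in the usage section is only a stand-in for a real query issued with your own access pattern. It is not the harness used in the engagement described above.

```python
import statistics
import time


def benchmark(operation, repeats=100, warmup=10):
    """Time `operation` and return latency statistics in seconds.

    `operation` is any zero-argument callable, for example a function that
    issues one representative query against the candidate system.
    """
    # Warm-up runs are discarded so one-time setup costs do not skew results.
    for _ in range(warmup):
        operation()

    latencies = []
    for _ in range(repeats):
        start = time.perf_counter()
        operation()
        latencies.append(time.perf_counter() - start)

    latencies.sort()
    # Simple nearest-rank style index for the 95th percentile.
    p95_index = min(len(latencies) - 1, int(0.95 * len(latencies)))
    return {
        "median_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
        "max_s": latencies[-1],
    }


if __name__ == "__main__":
    # Stand-in workload: look up an event in an in-memory dict.
    events = {i: {"campaign": i % 7} for i in range(10_000)}
    stats = benchmark(lambda: events[4_321], repeats=1_000)
    print(stats)
```

The important design choice is that the operation mirrors the real access pattern. A benchmark built on generic queries could easily have missed the mismatch we found.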

There is a lot to consider when picking an open source project, but, with a systematic approach, you can be confident in your decisions.