Splunk and Sumo Logic compete for a similar target market within the log management and analytics space. On the surface, their offerings differ mainly in deployment model: Splunk offers an on-premises solution, while Sumo Logic’s product is cloud-based. Yet each solution brings its own value propositions and considerations, and each affects customers differently. The following is a detailed comparison of Splunk and Sumo Logic.
Network endpoints in a technology-driven enterprise generate terabytes of log data that contain invaluable information on infrastructure performance, security and more. In the big data cosmos, an intuitive operational intelligence platform is required to index, search and analyze the deluge of machine-generated big data coming your way. Splunk does that for you. It aims to be like Google for log files.
Splunk Enterprise contains two main services: the Splunk Daemon (splunkd) and Splunk Web Services (splunkweb). The Splunk daemon is written in C++ and offers a solid internal architecture for fast and effective data collection, storage, indexing and search. Splunk Web Services is built with AJAX, Python and XML, among other technologies, to provide an intuitive and easy-to-use graphical user interface. The proprietary data store that powers the Splunk Enterprise platform was built on the Google File System concept, providing revolutionary storage and search capabilities for unstructured log data generated in various formats.
Data moves through Splunk in three processing tiers: data input, indexing and search management. A system of Splunk forwarders consumes raw machine data and preprocesses it before forwarding it to indexers. The indexers convert the raw machine data into events and store the resulting information. They also search the indexed data held in this repository. (This Splunk resource describes the process in detail.)
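The flow described above can be sketched as a toy model in Python. This is only an illustration of the forwarder, indexer and search roles, not Splunk code: the names `Event`, `forward` and `Indexer`, the sample log lines and the simple token index are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: str
    host: str
    raw: str


def forward(host, lines):
    """Forwarder role (hypothetical): drop blank lines and tag each line with its host."""
    for line in lines:
        line = line.strip()
        if line:
            yield host, line


class Indexer:
    """Indexer role (hypothetical): turn lines into events, store them, answer searches."""

    def __init__(self):
        self.events = []
        self.by_token = defaultdict(set)  # token -> positions in self.events

    def index(self, host, line):
        # Assumes each line starts with a timestamp followed by a space.
        timestamp, _, message = line.partition(" ")
        self.events.append(Event(timestamp, host, message))
        position = len(self.events) - 1
        for token in message.lower().split():
            self.by_token[token].add(position)

    def search(self, term):
        positions = self.by_token.get(term.lower(), ())
        return [self.events[i] for i in sorted(positions)]


indexer = Indexer()
sample_lines = [
    "2016-03-01T10:00:00Z GET /index.html 200",
    "",
    "2016-03-01T10:00:05Z GET /missing 404",
]
for host, line in forward("web01", sample_lines):
    indexer.index(host, line)

hits = indexer.search("404")
print(hits)
```

A real deployment spreads these roles across many machines; the sketch only shows why preprocessing happens before indexing and why search runs against the indexed store rather than the raw files.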
Core products include:
- Splunk® Enterprise: On-premises solution for search, monitoring and analysis of machine data.
- Splunk Cloud™: Similar functionality to the Enterprise version, delivered as a SaaS offering.
- Splunk Light: Limited log search and analysis features for startups and small IT shops.
- Hunk®: Analytics and visualization tool for data in Hadoop.
Strong Search Mechanism
Splunk is widely regarded as the best search engine for log files in the market today. The platform features the Splunk Processing Language (SPL), which is modeled on the Unix pipeline and SQL. SPL allows users to modify, filter and manipulate indexed data and perform contextual and conditional searches using a range of search commands and functions. Users can add custom scripts to search queries, and Splunk supports SDKs and Web frameworks for log management activities including reporting, alerts, and investigations. If specific data is not available from the log files, users can create customized search scripts and program the platform to ingest the output. The platform can search both historical indexed data and log files generated in real time. The search system uses MapReduce for statistical analysis.
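To make the "Unix pipeline plus SQL" idea concrete, here is a minimal Python sketch that chains a filter stage into an aggregation stage. It is only an analogy for how an SPL search such as `status=500 | stats count by host` is composed; the helper names `search` and `stats_count_by` and the sample events are assumptions for illustration, not part of Splunk.

```python
from collections import Counter


def search(events, **criteria):
    """Filter stage: keep events whose fields match every key=value criterion."""
    return (e for e in events if all(e.get(k) == v for k, v in criteria.items()))


def stats_count_by(events, field):
    """Aggregation stage: count events grouped by one field."""
    return Counter(e[field] for e in events)


EVENTS = [
    {"host": "web01", "status": "500"},
    {"host": "web01", "status": "200"},
    {"host": "web02", "status": "500"},
    {"host": "web01", "status": "500"},
]

# Each stage consumes the output of the previous one, like commands joined by "|".
errors_by_host = stats_count_by(search(EVENTS, status="500"), "host")
print(errors_by_host.most_common())  # [('web01', 2), ('web02', 1)]
```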
Store Now, Parse Later
Splunk stores log data without requiring a predetermined schema, which avoids many support and parsing issues. The platform indexes new logs based on previously known configurations, or stores them for later use. The stream of characters is divided into recognizable events based on timestamps and other event properties that may have been predefined. Users can apply Field Extraction rules to create customized field parsing. Because data is ingested in real time, the time and effort required to set up a data ingestion environment is reduced significantly. Unlike traditional databases, Splunk does not always have a fixed schema for log files, so field extraction mostly takes place at search time.
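A small Python sketch can illustrate what extracting fields at search time means: raw events are stored untouched, and a rule is applied only when a search runs. The key=value regular expression, the sample events and the function names are assumptions for this example and do not reflect Splunk's actual extraction rules.

```python
import re

# Raw events are kept exactly as received; no schema is applied when storing them.
RAW_EVENTS = [
    "2016-03-01T10:00:00Z host=web01 status=200 bytes=512",
    "2016-03-01T10:00:05Z host=web02 status=404 bytes=0",
]

# Hypothetical extraction rule: pull out every key=value pair.
KV_RULE = re.compile(r"(\w+)=(\S+)")


def extract_fields(raw, rule=KV_RULE):
    """Apply the extraction rule to one raw event and return its fields."""
    return dict(rule.findall(raw))


def search(raw_events, **criteria):
    """Extract fields lazily, at search time, and keep the matching events."""
    for raw in raw_events:
        fields = extract_fields(raw)
        if all(fields.get(k) == v for k, v in criteria.items()):
            yield fields


matches = list(search(RAW_EVENTS, status="404"))
print(matches)  # [{'host': 'web02', 'status': '404', 'bytes': '0'}]
```

Because the rule is applied when reading, it can be changed later without re-ingesting the stored data, which is the main appeal of the "store now, parse later" approach.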
Dashboard and Usage
Although the quality of a dashboard and UI is subjective, many users consider Splunk a convenient and easy-to-use log management tool with several intuitive features. The platform offers a wide range of visualization capabilities that make log management, monitoring and analysis a simple and straightforward experience. Users can add their own dashboards, app widgets, extensions, charts, tables and other listings using the Splunk dashboard editor or by writing XML.
As a pioneer in the log management solutions segment, Splunk has evolved to a point where there is an entire app ecosystem to its name. The Splunkbase app store offers close to 600 apps and plugins that add unprecedented functionality to the Splunk platform. The apps can be used to analyze virtually every format of log data, create dashboards, or simply act as data input agents. Competing products will have a long way to go before they can come up with such a feature-rich app ecosystem.
Splunk’s big data architecture offers significant scalability at the log management layer. It is built on a distributed architecture of masters (search heads) and slaves (indexers). This supports intensive search workloads, allowing up to 48 concurrent users per search head at 200-300 GB per day. (See more on performance recommendations here.) End users don’t have to set up and manage a backend database; the platform keeps ingesting data and storing it to the file system. Additional Splunk servers can be added easily, without extensive configuration changes, and the platform optimizes performance across all available server instances. Redundancy features ensure there is no single point of failure, and Splunk can retain very large volumes of data without losing the granularity of past events. Because everything is set up on-premises, users can transfer vast volumes of data across the network faster. High security is inherent in the deployment model: sensitive log data never leaves your corporate network, keeping you in full control of data security. Of course, rising performance and capacity requirements necessitate investments in additional hardware resources.
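As a rough planning aid, the figure of 48 concurrent users per search head quoted above can be turned into a back-of-the-envelope estimate. The sketch below assumes capacity scales linearly with the number of search heads, which is a simplification made for this example; real sizing depends on hardware, search types and daily volume.

```python
import math

USERS_PER_SEARCH_HEAD = 48  # figure quoted in the text for 200-300 GB per day


def search_heads_needed(concurrent_users, users_per_head=USERS_PER_SEARCH_HEAD):
    """Estimate search heads required, assuming linear scaling (an assumption)."""
    if concurrent_users < 0:
        raise ValueError("concurrent_users must not be negative")
    # At least one search head is needed even for a handful of users.
    return max(1, math.ceil(concurrent_users / users_per_head))


print(search_heads_needed(100))  # 3
```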
Documentation, Community and Support
This is one of the strongest points for the platform, and plenty of time will pass before a competitor can replicate the community, documentation and support standards of Splunk. Documentation is available for all Splunk versions and the community actively participates on forums to address challenges faced by end-users.
- Cloud: The Splunk Cloud offers limited features.
- Cost: Splunk is one of the most expensive platforms available in the market. The cost of building the infrastructure and resources needed to support the platform makes its cost effectiveness questionable for SMB firms.
- Need to Go Beyond Log Management and Analytics: Technology and data-driven organizations require more than just log management and analytics capabilities. Advanced application analytics tools may be required for accurate investigation and analysis at the machine level. Additional automation tools that can extract metrics from applications without much manual input or dependence on logs are also important. (Splunk is not a full-on SIEM tool, either.)
- Limited Correlation: Splunk offers limited correlation capabilities. Strong knowledge of SPL is required to perform the manual correlation that Splunk does offer.
- Lack of Aggregation: Splunk initially stores log data without aggregating it. As a result, you can run out of storage capacity fairly quickly.
- Low Compression Rate: Raw data is compressed at around 50 percent, which is only average in comparison with compression ratios of competing SIEM tools.
- No API Search: Limited API capabilities. Users cannot search or manage data sources via API.
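The storage points in the list above (no initial aggregation, compression of around 50 percent) can be put into numbers with a short sketch. It assumes that "compressed at around 50 percent" means stored data takes half the size of the raw data, and it ignores index overhead and replication; both are simplifications for illustration.

```python
def disk_needed_gb(daily_raw_gb, retention_days, compression_ratio=0.5):
    """Disk required to retain raw logs, where compression_ratio = stored size / raw size."""
    if not 0 < compression_ratio <= 1:
        raise ValueError("compression_ratio must be in (0, 1]")
    return daily_raw_gb * retention_days * compression_ratio


def days_until_full(disk_gb, daily_raw_gb, compression_ratio=0.5):
    """How many days of ingestion fit on a disk of the given size."""
    return disk_gb / (daily_raw_gb * compression_ratio)


# 200 GB of raw logs per day kept for 90 days, at the 50 percent figure from the text.
print(disk_needed_gb(200, 90))     # 9000.0
print(days_until_full(1000, 200))  # 10.0
```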
Sumo Logic is essentially designed as a cloud-based version of Splunk Enterprise with similar functionality and features. Sumo Logic is a relatively new entrant in the log management and analytics market segment, but has quickly evolved into a key competitor to Splunk. While Sumo Logic is not as mature as Splunk, it does offer innovation and pricing that are particularly attractive to SMBs and startup firms. Sumo Logic has focused its development efforts on enabling real-time, continuous intelligence across the delivery pipeline and organization-wide infrastructure for DevOps, security and compliance use cases. A focus on automation and the ability to create customized scripts for virtually all sorts of alerts give Sumo Logic a slight edge over Splunk. Sumo Logic leverages machine learning algorithms to go beyond limited predefined SIEM rules and accelerate the process of log management and analytics.
Under the hood, Sumo Logic follows a data flow architecture similar to Splunk’s. The data is collected from multiple host sources and sent to the backend cloud network. Tags are added to the data during collection to create metadata fields. Like Splunk, Sumo Logic does not require a predefined schema or parsing to store and index data. The ingested data can be stored in its native format and analyzed in real time. Sumo Logic is a single, unified log management and analytics platform that features elastic processing capabilities and is based on a globally distributed data retention architecture.
Unlike Splunk, Sumo Logic is a cloud-native software solution that does not require additional hardware infrastructure resources from customers. Implementation is as simple as signing up for any cloud product. Sumo Logic offers two options to set up data collectors. Customers can either choose to host collectors on Sumo Logic’s AWS network or install them on Sumo machines deployed on-premises as a hybrid cloud system. Sumo Logic offers out-of-the-box support and integration for AWS and VMware, among other cloud services.
To maximize performance, Sumo Logic offers compute capacity of 3,000 times and analyzes 600 percent of the contracted daily volume as part of its Service License Agreement.
Sumo Logic Continuous Intelligence
The platform uses a continuous intelligence model that keeps learning about your log data. Machine learning algorithms help users identify anomalies and insightful patterns across their network infrastructure. The model keeps refining itself to improve prediction accuracy for network performance, availability, anomalies, issues, capacity requirements, and resource saturation, among others. In Splunk, the same process requires more supervision and analysis using SPL.
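The general idea of learning a baseline from log data and flagging deviations can be sketched with a simple z-score check. This is not Sumo Logic's algorithm, which the text does not describe; the function name, the threshold, the warm-up length and the sample counts are all assumptions chosen for illustration.

```python
from statistics import mean, stdev


def find_anomalies(counts, threshold=3.0, warmup=5):
    """Flag positions whose value deviates strongly from everything seen before it."""
    anomalies = []
    for i in range(warmup, len(counts)):
        history = counts[:i]  # the baseline keeps growing as more data arrives
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(counts[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical error counts per minute; the spike at index 6 stands out.
errors_per_minute = [100, 102, 98, 101, 99, 100, 500, 101]
print(find_anomalies(errors_per_minute))  # [6]
```

A production system would use more robust statistics and account for seasonality, but the sketch shows why such a model needs less hand-written rule logic than a fixed threshold.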
Dashboard and Usage
Sumo Logic offers a feature-rich and intuitive dashboard that allows users to get on board fast. Users can toggle between light and dark dashboard themes, which is especially beneficial for users looking at the dashboard and visualized reports for prolonged periods of time. A high level of customization across tables, charts, dashboards and the wider interface is also offered.
Apps and API Integration
Sumo Logic’s continuous intelligence capability is particularly useful in generating reports automatically based on real-time data. Several apps are available to monitor infrastructure and create insightful reports to aid compliance. Strong encryption is used to maintain security as the platform accesses sensitive information from multiple sources. Customers can use APIs not only to access data, but also to configure data sources and return reports. Users can call the REST/HTTP-based API from any third-party application or programming language. Splunk does not offer the same level of extensibility.
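As a hedged sketch of what calling a REST/HTTP-based log search API from a program might look like, the example below only builds a request with Python's standard library and does not send it. The base URL, the `/search/jobs` path, the JSON field names and the use of HTTP Basic authentication are assumptions for illustration and should be checked against Sumo Logic's API documentation before use.

```python
import base64
import json
from urllib.request import Request


def build_search_request(base_url, access_id, access_key, query, start, end):
    """Build (but do not send) a POST request that would start a log search."""
    credentials = f"{access_id}:{access_key}".encode()
    token = base64.b64encode(credentials).decode()
    # Payload field names are assumed for this sketch.
    body = json.dumps(
        {"query": query, "from": start, "to": end, "timeZone": "UTC"}
    ).encode()
    return Request(
        url=f"{base_url}/search/jobs",  # assumed endpoint path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


request = build_search_request(
    "https://api.example.invalid/api/v1",  # placeholder host, not a real endpoint
    "my-access-id",
    "my-access-key",
    "error | count by _sourceHost",
    "2016-03-01T00:00:00",
    "2016-03-01T01:00:00",
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here so the example runs without network access or real credentials.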
The Sumo Logic apps ecosystem is nowhere near as rich as that of Splunk. The handful of available apps are all incredibly useful and cover most typical sources, including web services, operating systems, cloud offerings and compliance and security use cases. The cloud version of Splunk doesn’t offer apps integration, making Sumo Logic one of the most feature-rich log management solutions delivered from the cloud.
Sumo Logic is a cost-effective alternative to Splunk for organizations that require advanced log management and analytics capabilities, but cannot afford to build an enterprise-grade infrastructure in-house to support tools such as Splunk Enterprise. Pricing starts at $90 per month for 1GB/day of log data, and there’s also a free option to test the waters. However, pricing rises fast, because many data-driven organizations need to manage and analyze a deluge of log data on a daily basis in the context of a diverse range of metrics and KPIs.
- Security Concerns: Cybercrime is a reality. Despite the precautions and layers of security implemented by Sumo Logic, many customers are just not comfortable with handing over sensitive log data to third-party datacenters outside of their corporate network.
- High Bandwidth Resources: Although this isn’t a concern for many large enterprises, startup firms and SMBs around the world may not have solid and secure Internet connectivity to transmit terabytes of log data from their network on a daily basis.
- Lack of an App Ecosystem: There aren’t necessarily enough apps to satisfy the functionality requirements of Sumo Logic’s growing userbase.
- Lack of Vendor Support: While app integration and APIs support many large vendors, the same level of support is not available for smaller vendors that may be popular outside of the U.S.
- High Pricing: Although the cost is low during the initial phase, pricing rises fast enough for SMBs to reconsider their investments.
- Insufficient Documentation and Community Support: Since the platform is not as popular as Splunk, the documentation and community support lacks depth.
- Not the Most Feature-rich: Splunk offers a lot more features and functionality out-of-the-box.
There is no clear winner in the Splunk vs. Sumo Logic comparison. Each platform offers its own value propositions and constraints, and these will determine which of the two is the better investment. Because both solutions are costly, customers should evaluate their business and technology requirements and carefully review both solutions before finalizing their decision.
About the Author
Ali Raza is a DevOps consultant who analyzes IT solutions, practices, trends and challenges for large enterprises and promising new startup firms.