The role of Machine Learning and Artificial Intelligence in Observability is one of our research focus areas at Rishidot Research. We strongly believe that applying ML and AI to Observability data is necessary for the next generation of observability products because:
- Cloud native makes enterprise IT more distributed, adding multiple dimensions to the Observability of the application lifecycle
- Traditional SRE approaches based on a few clearly defined failure domains cannot help in the cloud native world, where grey failures due to unknowns are going to be a fact of life
We are still in the early days when it comes to taking advantage of ML and AI, but the train has already left the station. In a previous post, we highlighted some vendors applying ML/AI to monitoring, logging and tracing data, and we discussed how Splunk is well positioned to take advantage of ML and AI.
This week, Splunk announced that they are bringing ML and AI capabilities to Splunk Cloud and Splunk Enterprise in their newest release. The product now supports data from a wide range of sources, from data center operations to servers to IoT and edge devices. To better support larger datasets, Splunk is also adding support for Apache Kafka. In addition, Splunk has beefed up the Machine Learning Toolkit to better support machine learning experiments and to add newer algorithms that can identify previously unknown patterns. This can help train the models needed for their platform. Splunk IT Service Intelligence now predicts outages and service health, and Splunk User Behavior Analytics has been updated to accelerate threat identification using machine learning.
The key to using machine learning in Observability is the training data. Getting training data is a difficult problem because the needs of every organization are different. Making machine learning relevant to an organization depends entirely on how relevant the training data is in the context of the end user. Splunk has a tremendous advantage in terms of data from all their customers. The generic patterns from their user data, when coupled with additional learning from customer data, are a powerful starting point to marry machine learning with observability.
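To make the idea of coupling generic patterns with customer-specific learning a little more concrete, here is a minimal sketch in Python. It is purely illustrative and is not how Splunk or any other vendor implements this. The function names, the `prior_weight` knob, and the simple z-score threshold are all assumptions made for the example. The generic baseline dominates when a customer has little data, and the customer's own data gradually takes over as more observations arrive.

```python
from statistics import mean, pstdev


def blended_baseline(generic_mean, generic_std, customer_values, prior_weight=20):
    """Blend a generic (cross-customer) baseline with one customer's own data.

    prior_weight is a hypothetical knob: roughly how many customer
    observations the generic baseline is "worth".
    """
    n = len(customer_values)
    if n == 0:
        # No customer data yet: fall back to the generic baseline.
        return generic_mean, generic_std
    # Weight on the customer's data grows with the number of observations.
    w = n / (n + prior_weight)
    blended_mean = w * mean(customer_values) + (1 - w) * generic_mean
    blended_std = w * pstdev(customer_values) + (1 - w) * generic_std
    return blended_mean, blended_std


def is_anomalous(value, baseline_mean, baseline_std, threshold=3.0):
    """Flag a value whose z-score against the baseline exceeds the threshold."""
    if baseline_std == 0:
        return value != baseline_mean
    return abs(value - baseline_mean) / baseline_std > threshold


# Example: a generic latency baseline (in ms) versus a customer whose
# normal latency is much higher than the generic pattern suggests.
generic_mean, generic_std = 100.0, 10.0
customer_latencies = [200] * 20

b_mean, b_std = blended_baseline(generic_mean, generic_std, customer_latencies)
print(is_anomalous(200, generic_mean, generic_std))  # judged by the generic baseline alone
print(is_anomalous(200, b_mean, b_std))              # judged by the blended baseline
```

A real system would use far richer models than a mean and a standard deviation, but the design choice is the same: start from what is known across many users, then let each organization's own data reshape the baseline.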
IT’s role has been in flux ever since cloud computing gained traction. The traditional idea of the perimeter changed, processes changed and, now with cloud native architectures, even the traditional idea of an application has changed to loosely coupled modular components that may be distributed across regions or even cloud providers. When you add IoT and edge to the mix (which will also come under the purview of IT), traditional domain knowledge about failures or even grey failures will fall short. Eventually, machine learning and artificial intelligence are going to be a critical part of operations, and Observability data is going to be the underlying fabric for this future.
Splunk has taken a step in this direction. Vendors like Honeycomb, New Relic, Instana, CloudFabrix, Swim.ai, CoreStack, etc. are positioning themselves for a future of operations driven by machine learning and AI. As an enterprise decision maker, it is important for you to consider this future and prepare the data architecture needed for Observability in tune with this future.
We recently hosted a virtual panel on the role of ML and AI in Observability, with Andi Mann from Splunk among the panelists. You can watch the recording below.
Disclosure: CoreStack and CloudFabrix are StackSense sponsors