Tingyun NeurAlert

AI-driven alert convergence and scenario-based alerting to avoid alert storms

Tingyun NeurAlert is a unified alert management platform that uses AI to converge alerts from monitoring platforms such as Zabbix and Prometheus. Using machine learning, it triggers alerts by scenario, which helps avoid alert storms and alert fatigue.

Unified monitoring

O&M data is collected from different monitoring tools and platforms, then standardized and enriched, and finally visualized in one place. This makes correlation analysis across data sources possible and supports global monitoring, overall analysis, and accurate decision-making.

Intelligent noise reduction

Intelligent noise reduction of events uses a dual “rule + AI” mode. It cuts down frequent interruptions and also identifies important alarms among low-level events, automatically escalating them and notifying users so that alarms are not missed.

Event correlation

AI and big data analysis uncover the underlying correlations between events. Supplemented by data models such as the CMDB resource topology and application call chains, these correlations are built into a correlation knowledge base that is used to aggregate related events.

Fault location

An event causality graph model is built and trained on historical event data, domain knowledge, and related information. Root cause analysis and fault location are then performed based on event causality.

AI enhancements

Users can tell the AI, in visual and user-friendly terms, how to better learn the logic and patterns behind the data, which speeds up model training and further strengthens AI capabilities.

Efficient team collaboration

Faults are handled as soon as they are found. Handling actions are recorded through event comments and replies, so team members stay up to date and can communicate efficiently, work together, and respond quickly.

Multiple data access

Multi-source data acquisition

Organizations often run many monitoring systems, and the data each one generates is isolated from the others, so it cannot be effectively correlated and delivers little value. For O&M data (metrics, logs, events, and topologies), you can collect data in real time from open source monitoring tools, commercial monitoring software, APIs, message queues, email, documents, and other data sources, then cleanse, process, calculate, and analyze it, and finally visualize it in a centralized, unified view.
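As a rough illustration of the standardization step, the sketch below maps alerts from two sources into one common event record. It is not NeurAlert's actual schema or API: the `Event` fields, the severity mapping, and the Zabbix payload keys (`host`, `trigger`, `severity`, `clock`) are all assumptions made for the example. The Prometheus shape follows the `labels` and `startsAt` fields of an Alertmanager webhook alert, simplified to timestamps without fractional seconds.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """Common event record (hypothetical field names)."""
    source: str
    host: str
    name: str
    severity: str
    timestamp: datetime


# Assumed mapping from Zabbix numeric severities (0-5) to three common levels.
ZABBIX_SEVERITY = {
    0: "info", 1: "info",
    2: "warning", 3: "warning",
    4: "critical", 5: "critical",
}


def from_prometheus(alert: dict) -> Event:
    """Convert one Alertmanager-style alert dict into an Event."""
    labels = alert["labels"]
    return Event(
        source="prometheus",
        host=labels.get("instance", "unknown"),
        name=labels["alertname"],
        severity=labels.get("severity", "warning"),
        # "Z" is rewritten so older Python versions can parse the timestamp.
        timestamp=datetime.fromisoformat(alert["startsAt"].replace("Z", "+00:00")),
    )


def from_zabbix(alert: dict) -> Event:
    """Convert one Zabbix-style alert dict (assumed keys) into an Event."""
    return Event(
        source="zabbix",
        host=alert["host"],
        name=alert["trigger"],
        severity=ZABBIX_SEVERITY.get(int(alert["severity"]), "warning"),
        timestamp=datetime.fromtimestamp(int(alert["clock"]), tz=timezone.utc),
    )
```

Once every source is reduced to the same record, events from different tools can be compared, correlated, and displayed together.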

Anomaly detection

Metric anomaly detection

A metric alert threshold that is set too high leads to missed alerts and complaints, while one that is set too low produces so much noise that real anomalies get lost. To move past the inaccuracy of traditional fixed and baseline thresholds, the system judges metric fluctuations comprehensively, taking into account factors such as periodicity, trend, and time patterns. It automatically selects appropriate anomaly detection algorithms to monitor the metric data as it changes in real time, identifies genuinely abnormal behavior, triggers alarms, and improves alarm accuracy.
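To make the contrast with a fixed threshold concrete, here is a minimal sketch of one simple dynamic approach: a rolling z-score, where each new value is compared against the mean and standard deviation of the most recent points. This is a stand-in technique chosen for illustration, not the algorithm NeurAlert selects; the function name and the `window` and `threshold` parameters are hypothetical.

```python
from collections import deque
from statistics import mean, stdev


def detect_anomalies(values, window=5, threshold=3.0):
    """Return the indices of values that deviate from the rolling baseline.

    A value is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` values.
    """
    history = deque(maxlen=window)
    anomalies = []
    for i, value in enumerate(values):
        # Only judge a point once a full window of history is available.
        if len(history) == window:
            baseline = mean(history)
            spread = stdev(history)
            if abs(value - baseline) > threshold * spread:
                anomalies.append(i)
        # Flagged values still enter the history in this simplified version.
        history.append(value)
    return anomalies
```

Because the baseline moves with the data, the same code adapts to a metric that normally sits at 10 and one that normally sits at 10,000. A production detector would also need to handle seasonality and keep outliers from distorting the baseline.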

Alarm convergence

Alarm Storm Suppression

When managing large-scale service architectures, a single system failure can set off an alarm storm: a large number of repeated, useless alarms that burden O&M personnel. The Tingyun NeurAlert platform intelligently and automatically filters, compresses, merges, and deduplicates alarm events, and finally aggregates them into a higher-level event, that is, a fault, which is sent to the user for handling. This reduces alarm noise and information interference, and eases the pressure on O&M personnel who handle alarms.
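The merge and deduplicate steps can be sketched with a simple time-window rule: alarms that arrive close together are folded into one incident, and repeats of the same alarm are counted rather than stored again. This is a rule-based illustration only, with hypothetical names (`Incident`, `converge`, `window`); it does not represent the platform's AI-based convergence.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """A group of alarms treated as one fault (hypothetical structure)."""
    start: float
    end: float
    alarms: dict = field(default_factory=dict)  # (host, name) -> repeat count


def converge(alarms, window=60.0):
    """Group alarms into incidents by arrival time.

    `alarms` is an iterable of (timestamp_seconds, host, name) tuples.
    An alarm joins the current incident if it arrives within `window`
    seconds of that incident's latest alarm; otherwise it starts a new one.
    """
    incidents = []
    for ts, host, name in sorted(alarms):
        if not incidents or ts - incidents[-1].end > window:
            incidents.append(Incident(start=ts, end=ts))
        current = incidents[-1]
        current.end = ts
        key = (host, name)
        # Deduplicate: count repeats of the same alarm instead of storing them.
        current.alarms[key] = current.alarms.get(key, 0) + 1
    return incidents
```

With this rule, a burst of repeated alarms from a few hosts collapses into a single incident with per-alarm counts, so the on-call engineer receives one notification instead of one per alarm.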

Root cause analysis

Failure root cause analysis

In today’s virtualized and highly redundant IT environments, how can you quickly determine the cause of a failure? The Tingyun NeurAlert platform focuses on finding the root causes that affect business services. It uses machine learning to analyze the contextual information provided by big data, learns how events relate through correlation, dependency, and causality, and infers possible root causes. It can also improve the accuracy of its root cause analysis algorithm based on user feedback, which improves O&M resolution efficiency and reduces the impact of service interruptions.
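The role that dependency information plays in root cause inference can be shown with a deliberately simple heuristic: among the services that are currently alarming, the likely root causes are those whose own dependencies are all healthy. This is not the machine learning approach described above, only a sketch of the dependency idea; the function name and the `depends_on` mapping are assumptions for the example.

```python
def root_cause_candidates(depends_on, alarming):
    """Return alarming services whose dependencies are all healthy.

    `depends_on` maps a service name to the services it calls.
    `alarming` is the collection of services that currently have alarms.
    """
    alarming = set(alarming)
    return sorted(
        service
        for service in alarming
        # A service with an alarming dependency is treated as a symptom.
        if not any(dep in alarming for dep in depends_on.get(service, ()))
    )
```

For example, if `web` calls `api`, `api` calls `db`, and all three are alarming, only `db` is returned, because the other two alarms can be explained by a failure further down the call chain.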

Multi-management

Integration with peripheral systems

Integrating the CMDB enriches alarm events, and the CMDB’s resource relationships strengthen event correlation, which widens the scope of aggregation and improves its accuracy. Aggregated faults are connected to the ITSM ticketing system to form closed-loop management of the full fault lifecycle, and integration with call centers enables voice-call fault notifications.