Course Code: aiopsact
Duration: 14 hours
Prerequisites:
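  • Experience with monitoring systems such as Prometheus or ELK
  • Working knowledge of Python and basic machine learning skills
  • Familiarity with incident management workflows

Audience

  • Senior site reliability engineers (SREs)
  • IT automation architects
  • DevOps and observability platform leads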
Overview:

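AIOps (Artificial Intelligence for IT Operations) is increasingly used to predict incidents before they occur and to automate root cause analysis (RCA), minimizing downtime and speeding up resolution.

This instructor-led, live training (online or onsite) is aimed at advanced-level IT professionals who wish to implement predictive analytics, automate remediation, and design intelligent RCA workflows using AIOps tools and machine learning models.

By the end of this training, participants will be able to:

  • Build and train machine learning models that detect the patterns leading up to system failures.
  • Automate RCA workflows based on correlation of logs and metrics from multiple sources.
  • Integrate alerting and remediation flows into existing platforms.
  • Deploy and scale intelligent AIOps pipelines in production environments.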

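Format of the Course

  • Interactive lecture and discussion.
  • Lots of exercises and practice.
  • Hands-on implementation in a live-lab environment.

Course Customization Options

  • To request a customized training for this course, please contact us to arrange.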
Course Outline:

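Introduction to Predictive AIOps

  • Overview of predictive analytics in IT operations
  • Data sources for prediction (logs, metrics, events)
  • Key concepts in time series forecasting and anomaly patterns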
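
As a rough illustration of an anomaly pattern in a metric time series, the sketch below flags points that deviate strongly from a trailing window. It is a minimal example only: the function name, the window size, the threshold, and the sample values are all assumptions made for illustration, not material taken from the course.

```python
from statistics import mean, pstdev


def rolling_zscore_anomalies(values, window=5, threshold=3.0):
    """Return indices whose value is far from the mean of the preceding window."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = pstdev(history)
        if sigma == 0:
            # A flat history gives no scale to compare against; skip it.
            continue
        if abs(values[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies


# Hypothetical latency samples (ms) with one obvious spike at index 7.
latency_ms = [10, 11, 10, 12, 11, 10, 11, 40, 11, 10]
spikes = rolling_zscore_anomalies(latency_ms)
```

A trailing window keeps the check cheap, but a single spike inflates the statistics of the windows that follow it, which is one reason production detectors often use more robust baselines.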

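Designing Incident Prediction Models

  • Labeling historical incidents and system behavior
  • Selecting and training models (e.g., LSTM, Random Forest, AutoML)
  • Evaluating model performance and handling false positives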
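
To make the evaluation step concrete, the following sketch computes precision and recall for an incident predictor from labeled history. The label lists are hypothetical, and the helper names are illustrative assumptions. Precision is the figure that tracks false positives most directly: low precision means many alerts that were not followed by an incident.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for binary incident labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a and p)
    fp = sum(1 for a, p in pairs if not a and p)
    fn = sum(1 for a, p in pairs if a and not p)
    tn = sum(1 for a, p in pairs if not a and not p)
    return tp, fp, fn, tn


def precision_recall(actual, predicted):
    """Return (precision, recall); each falls back to 0.0 when undefined."""
    tp, fp, fn, _ = confusion_counts(actual, predicted)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical labels: 1 = an incident occurred (or was predicted) in the window.
actual = [1, 0, 0, 1, 0, 1, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 1]
precision, recall = precision_recall(actual, predicted)
```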

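Data Collection and Feature Engineering

  • Ingesting and aligning log and metric data for model input
  • Extracting features from structured and unstructured data
  • Handling noise and missing data in operational pipelines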
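
One simple way to align logs and metrics is to bucket both into fixed time windows and carry the last known metric value forward when a window has no samples. The sketch below assumes metrics arrive as (timestamp, value) pairs and logs as (timestamp, level) pairs with timestamps in seconds; these shapes, the field names, and the sample data are assumptions for illustration.

```python
from collections import defaultdict


def bucket(ts, width=60):
    """Round a timestamp (seconds) down to the start of its window."""
    return ts - ts % width


def build_features(metrics, logs, start, end, width=60):
    """Join metric means and ERROR log counts per window, forward-filling gaps."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    errors = defaultdict(int)
    for ts, value in metrics:
        b = bucket(ts, width)
        sums[b] += value
        counts[b] += 1
    for ts, level in logs:
        if level == "ERROR":
            errors[bucket(ts, width)] += 1

    rows = []
    last_mean = None  # stays None until the first window that has metric data
    for b in range(bucket(start, width), end, width):
        if counts[b]:
            last_mean = sums[b] / counts[b]
        rows.append({"window": b, "cpu_mean": last_mean, "error_count": errors[b]})
    return rows


# Hypothetical samples: CPU percent readings and log lines over three minutes.
metrics = [(0, 20.0), (30, 40.0), (125, 90.0)]
logs = [(10, "INFO"), (70, "ERROR"), (80, "ERROR"), (130, "ERROR")]
rows = build_features(metrics, logs, start=0, end=180)
```

Forward-filling is only one option for missing data; depending on the model, dropping the window or adding an explicit "missing" indicator feature may be safer.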

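Automating Root Cause Analysis (RCA)

  • Graph-based correlation of services and infrastructure
  • Using machine learning to infer probable root causes from event chains
  • Visualizing RCA with topology-aware dashboards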
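
The idea behind graph-based correlation can be sketched with a plain dependency map: among the services that are currently alerting, those with no alerting dependency of their own are the most likely origin. This is a simple heuristic rather than a machine learning method, and the service names and topology below are hypothetical.

```python
def root_cause_candidates(depends_on, alerting):
    """Return alerting services none of whose direct dependencies are alerting."""
    candidates = []
    for service in sorted(alerting):
        deps = depends_on.get(service, [])
        if not any(dep in alerting for dep in deps):
            candidates.append(service)
    return candidates


# Hypothetical topology: each service maps to the services it calls.
depends_on = {
    "frontend": ["checkout", "search"],
    "checkout": ["payments-db"],
    "search": [],
    "payments-db": [],
}
alerting = {"frontend", "checkout", "payments-db"}
candidates = root_cause_candidates(depends_on, alerting)
```

Here the alerts on "frontend" and "checkout" are explained by a failing dependency, so only "payments-db" remains as a candidate. A learned model could then rank candidates when several remain.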

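Remediation and Workflow Automation

  • Integrating with automation platforms (e.g., Ansible, Rundeck)
  • Triggering rollbacks, restarts, or traffic redirection
  • Auditing and documenting automated interventions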

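Scaling Intelligent AIOps Pipelines

  • MLOps for observability: retraining and model versioning
  • Running predictions in real time across distributed nodes
  • Best practices for deploying AIOps in production environments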

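Case Studies and Practical Applications

  • Analyzing real incident data with predictive AIOps models
  • Deploying RCA pipelines with synthetic and production data
  • Review of industry use cases: cloud outages, microservice instability, network degradation

Summary and Next Steps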

Sites Published:

United Arab Emirates - AIOps in Action: Incident Prediction and Root Cause Automation

Qatar - AIOps in Action: Incident Prediction and Root Cause Automation

Egypt - AIOps in Action: Incident Prediction and Root Cause Automation

Saudi Arabia - AIOps in Action: Incident Prediction and Root Cause Automation

South Africa - AIOps in Action: Incident Prediction and Root Cause Automation

Brasil - AIOps in Action: Incident Prediction and Root Cause Automation

Canada - AIOps in Action: Incident Prediction and Root Cause Automation

中国 - AIOps in Action: Incident Prediction and Root Cause Automation

香港 - AIOps in Action: Incident Prediction and Root Cause Automation

澳門 - AIOps in Action: Incident Prediction and Root Cause Automation

台灣 - AIOps in Action: Incident Prediction and Root Cause Automation

USA - AIOps in Action: Incident Prediction and Root Cause Automation

Österreich - AIOps in Action: Incident Prediction and Root Cause Automation

Schweiz - AIOps in Action: Incident Prediction and Root Cause Automation

Deutschland - AIOps in Action: Incident Prediction and Root Cause Automation

Czech Republic - AIOps in Action: Incident Prediction and Root Cause Automation

Denmark - AIOps in Action: Incident Prediction and Root Cause Automation

Estonia - AIOps in Action: Incident Prediction and Root Cause Automation

Finland - AIOps in Action: Incident Prediction and Root Cause Automation

Greece - AIOps in Action: Incident Prediction and Root Cause Automation

Magyarország - AIOps in Action: Incident Prediction and Root Cause Automation

Ireland - AIOps in Action: Incident Prediction and Root Cause Automation

Luxembourg - AIOps in Action: Incident Prediction and Root Cause Automation

Latvia - AIOps in Action: Incident Prediction and Root Cause Automation

España - AIOps in Action: Incident Prediction and Root Cause Automation

Italia - AIOps in Action: Incident Prediction and Root Cause Automation

Lithuania - AIOps in Action: Incident Prediction and Root Cause Automation

Nederland - AIOps in Action: Incident Prediction and Root Cause Automation

Norway - AIOps in Action: Incident Prediction and Root Cause Automation

Portugal - AIOps in Action: Incident Prediction and Root Cause Automation

România - AIOps in Action: Incident Prediction and Root Cause Automation

Sverige - AIOps in Action: Incident Prediction and Root Cause Automation

Türkiye - AIOps in Action: Incident Prediction and Root Cause Automation

Malta - AIOps in Action: Incident Prediction and Root Cause Automation

Belgique - AIOps in Action: Incident Prediction and Root Cause Automation

France - AIOps in Action: Incident Prediction and Root Cause Automation

日本 - AIOps in Action: Incident Prediction and Root Cause Automation

Australia - AIOps in Action: Incident Prediction and Root Cause Automation

Malaysia - AIOps in Action: Incident Prediction and Root Cause Automation

New Zealand - AIOps in Action: Incident Prediction and Root Cause Automation

Philippines - AIOps in Action: Incident Prediction and Root Cause Automation

Singapore - AIOps in Action: Incident Prediction and Root Cause Automation

Thailand - AIOps in Action: Incident Prediction and Root Cause Automation

Vietnam - AIOps in Action: Incident Prediction and Root Cause Automation

India - AIOps in Action: Incident Prediction and Root Cause Automation

Argentina - AIOps in Action: Incident Prediction and Root Cause Automation

Chile - AIOps in Action: Incident Prediction and Root Cause Automation

Costa Rica - AIOps in Action: Incident Prediction and Root Cause Automation

Ecuador - AIOps in Action: Incident Prediction and Root Cause Automation

Guatemala - AIOps in Action: Incident Prediction and Root Cause Automation

Colombia - AIOps in Action: Incident Prediction and Root Cause Automation

México - AIOps in Action: Incident Prediction and Root Cause Automation

Panama - AIOps in Action: Incident Prediction and Root Cause Automation

Peru - AIOps in Action: Incident Prediction and Root Cause Automation

Uruguay - AIOps in Action: Incident Prediction and Root Cause Automation

Venezuela - AIOps in Action: Incident Prediction and Root Cause Automation

Polska - AIOps in Action: Incident Prediction and Root Cause Automation

United Kingdom - AIOps in Action: Incident Prediction and Root Cause Automation

South Korea - AIOps in Action: Incident Prediction and Root Cause Automation

Pakistan - AIOps in Action: Incident Prediction and Root Cause Automation

Sri Lanka - AIOps in Action: Incident Prediction and Root Cause Automation

Bulgaria - AIOps in Action: Incident Prediction and Root Cause Automation

Bolivia - AIOps in Action: Incident Prediction and Root Cause Automation

Indonesia - AIOps in Action: Incident Prediction and Root Cause Automation

Kazakhstan - AIOps in Action: Incident Prediction and Root Cause Automation

Moldova - AIOps in Action: Incident Prediction and Root Cause Automation

Morocco - AIOps in Action: Incident Prediction and Root Cause Automation

Tunisia - AIOps in Action: Incident Prediction and Root Cause Automation

Kuwait - AIOps in Action: Incident Prediction and Root Cause Automation

Oman - AIOps in Action: Incident Prediction and Root Cause Automation

Slovakia - AIOps in Action: Incident Prediction and Root Cause Automation

Kenya - AIOps in Action: Incident Prediction and Root Cause Automation

Nigeria - AIOps in Action: Incident Prediction and Root Cause Automation

Botswana - AIOps in Action: Incident Prediction and Root Cause Automation

Slovenia - AIOps in Action: Incident Prediction and Root Cause Automation

Croatia - AIOps in Action: Incident Prediction and Root Cause Automation

Serbia - AIOps in Action: Incident Prediction and Root Cause Automation

Bhutan - AIOps in Action: Incident Prediction and Root Cause Automation

Nepal - AIOps in Action: Incident Prediction and Root Cause Automation

Uzbekistan - AIOps in Action: Incident Prediction and Root Cause Automation