Overview
The Operations & Infrastructure Manager (AIOps) at Con Edison is responsible for maintaining the reliability, resiliency, and operational performance of the company's enterprise IT and telecommunications infrastructure. This role supports Con Edison's mission to deliver safe, reliable, and clean energy by ensuring that the systems supporting field operations, grid modernization, customer platforms, and corporate functions operate with maximum availability. The position combines traditional infrastructure management with next-generation AIOps capabilities, using automation, analytics, and machine learning to proactively predict, prevent, and resolve operational issues. The ideal candidate understands utility operations, NERC-CIP influence areas, and mission-critical infrastructure requirements.
Responsibilities
Core Responsibilities
- Oversee 24/7 monitoring of Con Edison's IT and telecom infrastructure, including data centers, substation connectivity, control system interfaces, cloud platforms, and enterprise applications.
- Maintain real-time visibility across network, compute, storage, and operational technology (OT) supporting energy distribution and field operations.
- Lead optimization of monitoring and observability platforms.
- Deploy and administer AIOps solutions to detect anomalies, correlate events, predict failures, and drive automated remediation.
- Reduce operational noise and false-positive alerts through machine learning models and intelligent triage.
- Integrate AIOps with existing NOC workflows, ITSM platforms, and enterprise automation tools.
- Drive reduction of Mean Time to Detect (MTTD) and Mean Time to Resolve (MTTR) using predictive analytics and automated root-cause analysis.
- Apply predictive AIOps capabilities to forecast resource needs and prevent outages.
- Develop automation workflows to streamline troubleshooting, service restarts, patch validations, and configuration drift detection.
- Implement auto-remediation for recurring issues across IT and telecom systems.
- Produce real-time dashboards, operational scorecards, and reliability insights for leadership.
- Drive continuous enhancement of monitoring, automation, and operational stability.
- Lead and manage a team of direct reports.
Qualifications
Required Education/Experience
- Bachelor's Degree and 8 years of related work experience
Preferred Education/Experience
- Bachelor's Degree and 10 years of related work experience
- Experience working in customer communications, back-office program management, billing and case management related field work
- Experience working in the Clean Energy Marketplace
Relevant Work Experience
- 5+ years of experience in IT Operations, Infrastructure Management, Network Operations, or Telecom Operations, required.
- Experience with monitoring/observability tools and ITSM systems (ServiceNow preferred), required.
- Experience supporting critical infrastructure environments or industries with high reliability requirements (utilities, telecom, transportation, finance, public safety), required.
- Hands-on automation and scripting skills (Python, PowerShell, Ansible, Terraform), required.
- Familiarity with hybrid cloud environments (Azure, AWS) and data center operations, required.
- Understanding of networking, servers, virtualization, firewalls, and enterprise telecom infrastructure, required.
- Strong communication skills and experience leading major incident responses, required.
- Experience managing a team of direct reports, required.
- Exposure to utility systems or energy operations (EMS/DMS, field communications, SCADA, OT networks), preferred.
- Knowledge of NERC-CIP, ICS security, or utility regulatory frameworks, preferred.
- Relevant certifications: ITIL, AWS/Azure, CCNA/CCNP, or SRE, preferred.
- Experience with AIOps tools such as BigPanda, Moogsoft, Dynatrace, ScienceLogic, or Datadog, preferred.
Skills and Abilities
- Proficient in written and verbal English communication
- Effective leadership skills
- Ability to simultaneously handle multiple priorities
- Ability to work within tight timeframes and meet strict deadlines
Licenses and Certifications
- Driver's License Required
Physical Demands
- Ability to push, pull, and lift up to 25 pounds
- Sit or stand to use a keyboard, mouse, and computer for the duration of the workday
- Possess manual dexterity and the ability to use hands for the duration of the workday
- Ability to stoop, bend, reach, and kneel throughout the workday
- Stand to use/operate office equipment for the duration of the workday
- Ability to read small print and symbols
Additional Physical Demands
- The selected candidate will be assigned a System Emergency Assignment (i.e., an emergency response role) and will be expected to work non-business hours during emergencies, which may include nights, weekends, and holidays.
- Ability to respond to emergencies during off-hours