Job: Technical Incident Manager

Job Category: Operations
Location: Bellevue, WA, US
Job ID: 848946-121863
Division: Cloud and Enterprise Engineering

Description

The CPOC Technical Duty Officer is a senior-level position in the Commerce Platform Operations Center (CPOC) that provides management-level support to the shift. The CPOC Technical Duty Officer is expected to manage Major Incidents, ensure that all documentation surrounding a Major Incident is accurate and complete, and ensure that communication with management is clear and complete.

Purpose of Job:
The Operations Center Technical Duty Officer is accountable for rapid identification of, and recovery from, service-impacting outages and incidents.

The Operations Center Technical Duty Officer coordinates technical personnel through the phases of triage and assessment, restoration, verification and confirmation, and repair planning. Candidates must be ready to take responsibility for the availability and performance of our global services.

The Operations Center Technical Duty Officer should also be an expert on which levers to pull to restore service and which teams to contact when they are unable to restore service themselves. The Technical Duty Officer is accountable for key activities during an incident or crisis, including ownership of the technical control bridge and communications with stakeholder functions.

The ideal candidate will have experience running or developing large online services, be able to make quick, data-driven decisions under pressure, and be able to work during non-core business hours.

Key responsibilities
• Manages the Technical Operations communications with key stakeholder groups, including Engineering Leadership, Customer Services, and PR. Facilitates and chairs the technical incident audio bridge/conference. Effectively coordinates these incidents across multiple organizations within Microsoft
• Define and drive efforts to improve our incident coordination and communication capabilities
• Ability to multi-task on several incidents and/or projects at once
• Responsible for developing and maintaining incident management processes, policies and procedures
• Is responsible for service restoration, acting as the leader during service outages or periods of service degradation within the Commerce Platform
• Drive service improvements through strong partnerships with change management, problem management and Engineering teams
• Problem Management: Supports the Problem Management Process. Tracks and investigates problems in his or her area of expertise by using standard and custom problem management tools and processes, and develops plans and recommendations for improvements
• Develop, communicate and drive Service Improvement Plans to maintain an environment of continuous improvement with a focus on growth, cost and quality
• Document troubleshooting guides to help the Operations Center team resolve lower priority incidents

Level of Authority 
Incident Management: Leads the Incident Management process and incident management response. Documents and investigates escalated incidents and incidents across multiple service dependencies. Analyzes incidents, identifies their severity, and proposes and implements fixes. Drives decision making for incident resolution and for minimizing impact to the business. Plans the appropriate resources to resolve critical incidents in the environment.

Qualifications and Capabilities
Technical
• Bachelor’s degree in Computer Science or related discipline preferred. 
• At least 1 year of Operations Center/monitoring experience at Shift Lead level.
• Solid working knowledge of Microsoft SQL, Web Store and Cloud services
• Demonstrated experience in Application and/or Infrastructure Support including Service, Problem or Incident Management.
• ITIL/MOF Certification or similar certification is preferred

Other
• Ability and willingness to work a flexible schedule providing 24x7x365 support. 
• Excellent customer service, strong interpersonal communication, and superior written/oral communication skills required. 
• Highly motivated, self-starting individual contributor, capable of working closely and effectively within a team.
• Previous experience working in a 24x7x365 production environment supporting critical real-time applications.
• Operations Support experience following procedures and monitoring operations.
• Proven leadership and cross-group collaboration skills.

ST:OCP

Nearest Major Market: Seattle 
Nearest Secondary Market: Bellevue 
Job Segments: IT Manager, Information Technology, Engineer, Computer Science, Database, Technology, Engineering
