Director – SRE & Cloud Ops

Website Mass Mutual

Objectives and Responsibilities

The Site Reliability Engineering (SRE) Team at Mass Mutual is part of the overall Cloud & DevOps organization and provides cloud systems engineering, DevOps tools management, cloud reliability engineering and Level 3 support for deploying and maintaining technical solutions in production and non-production environments. The SRE team handles complex cloud integrations and is accountable for improving the performance, resiliency, visibility, availability and reliability of the public and private cloud technology infrastructure at Mass Mutual.
The SRE team is looking to add a new technology director. This role will report directly to the ‘Head of SRE’ and will manage 24x7 Cloud Operations, leveraging support from Global Capability Centers. This role will also be accountable for supporting private cloud containerization efforts and application migrations to the public cloud.

 

Technical Design, Development & Problem Solving

  • Extensive experience with public cloud infrastructure (AWS and Azure) and CI/CD pipeline automation.
  • Be able to effectively translate design requirements into engineering tasks.
  • Build system components, integrating appropriate technologies as needed.
  • Responsible for the following: application migrations, DevOps tools maintenance & support, container orchestration, tagging, cloud cost management, AMI optimization.
  • Work hand-in-hand with enterprise architecture, application partners, cloud product management and cloud operations.
  • Deep understanding of and experience with Infrastructure as Code (IaC) to manage cloud resources, accounts and services.
  • Experience building scalable production systems from start to finish, adhering to business needs and incorporating security, scalability, high availability, telemetry and observability.

Critical Thinking & Problem Solving

  • Awareness of cloud services and supporting integrations.
  • Ability to think outside the box and help resolve critical technical issues.
  • Master problem-solving techniques by identifying the root cause and providing permanent and timely solutions to problems.
  • Respect for IT processes and organizational policies.
  • Good stakeholder relationship management skills.

Execution & Delivery

  • Be self-directed and self-motivated in following up on project tasks and maintenance.
  • Work with fellow engineers, product owners and enterprise stakeholders to help deliver innovative data integration & platform solutions.
  • Provide realistic estimates for assigned tasks, with clear assumptions & acceptance criteria.
  • Independently deliver on assigned tasks within the projected time frame.
  • Deliver with team spirit in mind (partner with team members, ensure they understand the work being done) to ensure success of the assigned project.
  • Adhere to the team’s delivery methodology (agile, kanban, etc.) and find/recommend ways to ensure efficiencies and improvements.
  • Look for opportunities to reduce costs, improve function and inject value.

Essential Skills:

  • See the big picture from a business perspective
  • Don’t fear complexity and scale
  • Have a software-centric mindset
  • Be comfortable with coding
  • Relish change and frequent releases
  • View problems as opportunities for improvements
  • Ability to communicate with technical and non-technical stakeholders
  • Provide technical leadership for complex, long-term initiatives
  • Provide communication to leadership on progress & gaps
  • Provide authoritative expertise to others by advising on prioritization, planning, and execution of projects within the subdomain

Leadership:

  • Typically oversees the day-to-day operation of a team/unit, which involves one to a few service requirements
  • Has decision-making authority
  • Implements short to medium range work plans or goals for specific work groups managed
  • Mentor all team members regardless of their level.
  • Develop collaborative partnerships with internal partners.
  • Contribute material time to developing junior and mid-level team members.
  • Lead the team on technical decisions and help drive other team members towards solutions.
  • Coach other members on creating efficient and high-performing frameworks.
  • Implement best practices for continuous delivery and continuous integration frameworks.

Basic Qualifications 

  • Bachelor’s degree in Computer Science or equivalent, and 12 years of experience architecting and implementing fully automated (IaC/Terraform), secure, reliable, scalable & resilient hybrid-cloud solutions.
  • Strong operational experience in supporting technology infrastructure solutions and Infrastructure-as-code in large organizations
  • 5+ years of experience in building and supporting private cloud / AWS / Azure infrastructure
  • Must have a thorough understanding of Kubernetes and microservices architecture
  • Experience with DevOps concepts, tools (containers, CI/CD – GitHub, Jenkins, Artifactory, Helm, Ansible, etc.) and emerging technologies
  • Experience with public cloud migrations
  • Experience with cloud security toolsets (Prisma, ZeroNorth, Wiz, JFrog Xray, etc.)
  • Exposure to network infrastructure (e.g., managing firewalls, WAFs, network segregation, VPNs and network ACLs)
  • Must have strong ITIL process expertise and/or ITIL Foundation Certification
  • Strong written and verbal communication skills
  • AWS/Azure associate certification
  • Able to thrive in collaborative and cross-functional environments

Preferred Qualifications: 

  • 5+ years of experience with Terraform
  • Experience with observability tools such as New Relic & Sumo Logic
  • Subject matter expert in Cloud Security and/or Cloud Networking
  • AWS/Azure certification, preferably at professional level

To apply for this job email your details to tmilanova37@massmutual.com
