Manager III Cloud Reliability
Address: USA-SC-Mauldin-211 BiLo Boulevard
Store Code: IT Executive & Administration (2760797)

Retail Business Services is the services company of leading grocery retail group Ahold Delhaize USA, providing services to five East Coast grocery brands: Food Lion, Giant Food, The GIANT Company, Hannaford and Stop & Shop. Retail Business Services leverages the size and scale of the local brands and provides industry-leading expertise, insights and analytics to support their strategies. We are committed to diversity, equity and inclusion, and we foster a community of belonging where everyone is valued. For more information, visit https://www.retailbusinessservices.com.

Primary Purpose:
The RBS Cloud Reliability group is looking for an experienced Engineering Manager to help lead, build, and coach the team responsible for platform reliability engineering for the Azure Cloud Platform. As the leader of this group, you will set the vision and lead reliability initiatives and governance functions with internal team members and managed service providers to provide best-in-class support for internal cloud services customers.

If you are an engineering leader who is passionate about reliability, has a consistent track record of building healthy, high-performing teams, has experience leading large-scale fault-tolerant systems, and cares about metrics and operational excellence, then this role is for you.

The Platform Reliability Engineering Team is responsible for providing incident management, observability, and reliability engineering consultation across the organization, as well as troubleshooting assistance on high-impact incidents that the application teams are unable to solve.

You will work with the engineering and product teams to ensure we have a long-term technical vision in place, support the team in developing and delivering on their objectives, and nurture a customer-centric culture that is inclusive both internally and externally.

Duties and Responsibilities:
- Build and run support for cloud solutions that include core Azure services, container platforms, networking, security, cost management, operating systems, web applications, and data services.
- Build and manage a team of engineers across many time zones who work to analyze and maintain service stability by documenting policies in a 24/7/365 operation.
- Manage the customer experience and oversee daily operations, including escalations, logistics, operations support, space usage, budget support, future-proofing, and guidelines.
- Develop, own, and execute on a roadmap that addresses our immediate challenges and maps an incremental approach to longer-term reliability, automation, and instrumentation goals.
- Partner with Cloud Platform Engineering to identify and implement automation opportunities and process efficiencies that improve reliability, observability, and operations.
- Design and implement tools that help product teams focus on shipping features, while making sure we build infrastructure that is cost-efficient, secure, and reliable.
- Provide consultation to development and product teams to help them build reliable and scalable services, and resolve any production issues as quickly as possible.
- Lead projects for disaster recovery, automated failure recovery, capacity planning, high availability, and scaling.
- Help shape a DevOps culture and foster its adoption.
- Stay abreast of the latest SRE methodologies and skillfully adopt the appropriate ones for the cloud platform.
- Foster innovation within the team, and join others in establishing the new SRE discipline for the cloud platform.
- Take an active role in driving and evolving the roadmap for the SRE organization, particularly in the areas of infrastructure automation, observability, and AIOps.
- Execute solution areas using the Cloud FinOps operating model, covering cloud governance, spend management, migrations, and modernizations.
- Provide input on and track cloud costs as part of overall financial budgets, forecasts, and actuals.
- Drive FinOps value by helping customers understand their cloud spend in terms of their business goals and budget.
- Conduct risk assessments of security controls as they pertain to enterprise IT assets and related potential business impact.
- Demonstrate excellent stakeholder management skills and a proven ability to build strong relationships and trust throughout the organization, including with senior leadership.
- Plan and manage the departmental budget, budget forecasting, chargeback, and performance reviews of associates.
- Contribute to team culture and recruiting by leading activities to attract and retain top talent and by mentoring and developing junior product associates.
- Collaborate with solution architecture, platform engineering, managed service providers, and product teams to deliver solutions.
- Be a highly collaborative leader who can formulate and advocate for a clear, impactful platform vision and strategy and work cross-functionally to deliver on that roadmap.

Qualifications:
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field
- 10+ years of experience in infrastructure technology solutions, DevOps, Agile development, architecture, consulting, and/or cloud/infrastructure technologies
- 5+ years of experience leading, managing, supporting, maintaining, and automating private and public cloud environments
- 3+ years in management roles, managing resources, projects, budgets, forecasts, and chargeback
- 3+ years of experience using IaC tools (ARM, Terraform, JSON, YAML, PowerShell, GitHub, etc.)
- Experience crafting, implementing, and operating highly scalable and reliable platform solutions on a public cloud such as Azure or AWS
- Deep understanding of cloud technologies, preferably Azure, including design and standard methodologies for securing cloud environments, and hands-on experience with IaC and SDLC models
- Capable of technical deep dives into code, networking, systems, and storage with very experienced engineers
- Hands-on experience managing Azure Enterprise-scale reference architecture implementations
- Deep and extensive experience building and landing DevOps/SRE practices in a global environment is required
- Exposure to enabling and managing cloud services, usage, and optimization, as well as automation and development of tools to support the DevOps model and improvements based on trends and data analysis
- Technical depth that allows you to develop and mentor others as well as build credibility with your team
- Experience in full-stack cloud infrastructure engineering and operations, with application knowledge
- Ability to work in an Extreme Programming environment and in a pair programming/engineering model
- Able to manage diverse teams, multitask, and work under pressure to meet aggressive schedule targets
- Hands-on experience with IaC tools like ADO, ARM, Terraform, Ansible, PowerShell, Python, Azure CLI, and GitHub
- Experience working with and automating enterprise-scale cloud infrastructure deployments
- Experience with security compliance programs such as ISO, PCI, and HIPAA is strongly preferred
- Negotiation skills, stakeholder management, and a strong ability to manage opposing viewpoints
- Asks questions to encourage others to think differently and enrich their analyses of complex situations

Preferred Qualifications:
- Certifications preferred: Azure Administrator, Azure DevOps, Azure Solutions Architect
- Prior experience working in or with DevOps, Agile, automation, and SRE teams
- Prior experience managing infrastructure and software development or DevOps teams with an automation focus

Retail Business Services is an equal opportunity employer. We comply with all applicable federal, state and local laws.
Qualified applicants are considered without regard to sex, race, color, ancestry, national origin, citizenship status, religion, age, marital status (including civil unions), military service, veteran status, pregnancy (including childbirth and related medical conditions), genetic information, sexual orientation, gender identity, legally recognized disability, domestic violence victim status or any other characteristic protected by law. We provide reasonable accommodations to applicants and employees with disabilities.

If you have a disability and require assistance in the application process, please contact our Talent Acquisition Department at tad@retailbusinessservices.com.

Job Requisition: 251862externalUSA-SC-Mauldin10312022