27 Jul
DevOps Manager
Vacancy expired!
- Assigns and monitors work of technical personnel, ensuring that application development and deployment is done in the best possible way, and implements quality control and review systems throughout the development and deployment processes
- Manages operational aspect of production and development servers including developing, training in, and validating compliance with procedures and checklists related to disk space usage, monitoring solutions, deployment, conventions, access to the production and development sources, source control access and usage, performance monitoring, code modifications validation, scheduling, and more
- Evaluates technological choices (network/hardware related and technology/code related) by querying providers and providing evaluations of each solution, including ROI evaluations, present and future implications, limitations, and opportunities
- Manages analysis and approval of new code through security and performance gates that you will design and develop for feature-complete software. Advocates for security and performance standards in the organization
- Works within IT, cross-functionally and with vendors, in order to successfully identify, prioritize, and resolve issues and provide subject matter expertise for enhancements, developments, and operational improvements to the website applications
- Identifies trending gaps or issues in day-to-day performance of all website applications and components and third party vendors by active monitoring, alert management, reporting, and process reviews
- Maintains issue tracking and documentation systems and provides reporting that ensures proper tracking and visibility of issues and projects
- Maintains accurate program estimates, timelines, project plans, and status reports
- Identifies technical and process improvement opportunities and socializes and advocates for them to get them implemented
- Possesses expert technical understanding of the intersection of development and operations (DevOps), monitoring and management tools, and deployment processes and tools
- Ensure proper security, monitoring, alerting and reporting for the infrastructure
- Work with the NOC and DC Tech teams, using Tivoli monitoring and other tools, to ensure the integrity and availability of our hardware and server resources, review system and application logs, and verify completion of scheduled jobs such as backups
- Work with the incident team to diagnose and recover from hardware or software failures, working with or as the Incident Commander to coordinate and communicate with our internal customers
- Assist project teams with technical issues in the planning phase of sites or project development efforts
- Gather system requirements and support several project teams in evolving, testing, and rolling out new products and services, then transitioning the site or product to post-launch operations activities throughout the life of the product or service
- Work with the R&D team and other systems engineers to make improvements
- Monitor and document processes and procedures and follow a formal change management procedure
- Give direction and set priorities for all Site Reliability Engineers, DevOps Engineers, and Linux Systems Administrators
- Will hire/fire, coach, and conduct annual performance evaluations of the Linux Systems Administration team
- Manages and appropriately escalates delivery impediments, risks, issues, and changes associated with the product development initiatives
- Bachelor's degree in Computer Science or equivalent work experience
- 10+ years experience administering Linux and/or DevOps, site reliability engineering, or equivalent work experience
- 3 solid years experience with Kubernetes, designing, building, and deploying Kubernetes clusters
- Experience with architecting large scale distributed systems
- Experience building, configuring, and maintaining source/byte code management systems
- Experience with end-to-end Jenkins pipeline automation for build, test, and deploy
- Experience with Apache, nginx, lighttpd, and high volume web servers
- Experience with high volume mail servers
- Experience with load balancers
- Experience in analysis and system performance tuning
- Experience in Perl, Python or Shell Scripting, with experience implementing automation and monitoring using shell scripting and other related tools
- Experience with proxies / reverse proxies, e.g. squid, varnish
- 5+ years demonstrated hands-on experience managing a high performance team of engineers and administrators
- Knowledge of protocols such as DNS, HTTP, SMTP, SNMP
- Knowledge of OSI model
- Deep knowledge of Kubernetes and large scale distributed systems
- Knowledge of the challenges in a very large multifaceted global environment
- Ability to implement automation and monitoring using shell scripting and other related tools
- Demonstrated best practices knowledge of managing teams and conflict management
- Has excellent verbal and written communication skills
- Must have critical thinking skills in a complex IT environment to analyze, troubleshoot, and resolve problems without direction
- Outstanding organizational skills and the ability to handle multiple projects simultaneously while meeting deadlines
- High degree of honesty and integrity
- Team-player, positive attitude and flexible
- Must be comfortable with adult content
- Must be at least 21 years old
- 401(k) with a 5% match on eligible earnings with no vesting period
- Medical (Kaiser HMO, Aetna PPO), Dental, and Vision
- Flexible Spending Account for Healthcare and Dependent Care
- Life Insurance, AD&D, LTD and Short and Long Term Disability
- Paid Time Off (20 days PTO) and Holiday Pay (12 company paid holidays off)
- Employee Assistance Program
- Commuter Benefits
- Tuition Reimbursement
- Health Club Reimbursement