At AccelByte, our mission is to empower game creators by providing them with the backend platform and tools required to make scalable, reliable AAA-quality games. The company was founded in 2016 by industry veterans who have engineered online systems for some of the largest game and distribution platforms in the world, including Fortnite, Epic Store, Xbox Live, PlayStation Network, and EA Origin. We are backed by top investors including SoftBank, Sony Interactive Entertainment, Galaxy Interactive, NetEase, and Krafton. Our latest Series B funding has solidified our place as a top player in the gaming industry, and our team brings decades of experience building and shipping platforms at that scale.
We believe that the best companies empower employees to make decisions, obsess about the best user experience, and are not afraid to make and learn from their mistakes. Our culture is based on humility, openness to feedback, drive, and collaboration, which we feel results in the best performing teams. As a company that values diversity, inclusion, and employee growth, our employees have opportunities to work with and learn from teams all over the world. We offer competitive salaries, a full range of health benefits, social activities, career growth opportunities, and an amazing team. Come join us!
**Position Summary**
As the Senior Site Reliability Engineer, you will contribute to designing, developing, and maintaining CI/CD pipelines for production environments. Your contributions will focus on refining and strengthening release procedures, enhancing automation, and optimizing tooling to improve the efficiency of service delivery. You will discover requirements, guide other engineers collaborating in an area, and do exemplary work on complicated problems.
**Essential Functions/Responsibilities**
The Senior Site Reliability Engineer is accountable for the following functions and responsibilities:
- Lead the design, review, and maintenance of CI/CD pipelines, ensuring reliability and efficiency while mentoring team members.
- Drive automation in release processes, streamlining workflows and reducing manual effort.
- Lead database administration efforts.
- Design and implement stable, scalable infrastructure using Kubernetes and CNCF projects.
- Direct the development of a secure, cost-effective, and scalable cloud platform, prioritizing operational excellence.
- Lead initiatives to identify and promote best release practices, optimizing infrastructure solutions.
- Proactively investigate operational incidents, designing resilient approaches for long-term prevention.
- Provide exceptional client support by understanding their needs and communicating effectively.
- Mentor less experienced engineers, setting standards for engineering excellence and fostering continuous improvement.
- Collaborate closely with PMs and stakeholders to address requirements effectively and align with project goals.
- Perform additional duties as assigned.
**Qualifications/Experience Required**
- Bachelor's degree, or equivalent work experience, certifications, or coursework.
- At least 6 years of experience developing and maintaining automated release processes or CI/CD pipelines is required.
- Strong database administration experience.
- Proven experience in infrastructure as code, configuration management, and package management, evidenced by a consistent record of successful implementations.
- Advanced experience with CI/CD and GitOps tooling and pipelines, particularly GitLab CI, Jenkins, and Flux, or similar tools.
- Advanced experience scripting in languages such as Python, Bash, and Go.
- Advanced experience performing cloud system operations in an AWS environment.
- Advanced experience in data design, including inter-service communication and artifact sharing.
- Experience with Kubernetes, including familiarity with tools such as kubectl and Flux for debugging and modifying cluster state, and an understanding of Kustomize and the infrastructure as code (IaC) structure and mechanisms within the 'deployments' repository.
- Experience with Terraform and Terragrunt syntax and with CloudFormation, including the ability to apply, modify, and delete modules.
- Experience in cloud monitoring, logging, and APM solutions, with exposure to monitoring tools such as Prometheus, Grafana, and Datadog.
- Basic experience in database design and management, including knowledge of how to scale up databases.
- Experience in handling software architecture, especially from the infrastructure point of view.
- Advanced experience in developing and handling system architecture from scratch.
- Experience communicating technical concepts effectively through documentation and specifications.
- Experience in m