We are looking to add a Site Reliability Engineer to our growing engineering team! The
SRE team owns the cloud-based infrastructure, security and scaling. Our ideal
candidate is an experienced SRE who has built and maintained cloud infrastructure at
scale, writes meticulous, high-quality code, and can own infrastructure projects end-to-end.
Responsibilities
• Design, build, and maintain critical cloud-based systems on platforms such as GCP, AWS, and Azure.
• Monitor site stability, performance, and security using common Site Reliability Engineering practices.
• Plan upgrades for scaling, capacity, and API performance in a complex multi-tenant environment.
• Improve deployment, management, and scalability of our services.
• Champion the implementation of processes to improve visibility across the entire technology stack.
• Document system design and procedures.
• Provide clear status updates on projects in a timely manner.
• Participate in monthly on-call duties.
• Participate in weekly meetings as required.
Requirements
• BS in Computer Science, or equivalent experience
• Strong programming and/or scripting skills in any of Python, Go, or Ruby
• Strong experience with Terraform or other Infrastructure as Code tools.
• Solid understanding of Linux containerization with Docker
• 4+ years production experience with one or more public Cloud providers (AWS/GCP/Azure)
• 2+ years production experience with Kubernetes (both operational and application design)
• Experience with Prometheus / New Relic for monitoring and dashboards.
• Proficiency with Linux system administration
• Strong networking skills as they pertain to cloud/Kubernetes infrastructure.
• Experience with test automation and CI/CD practices, such as GitOps
• Understanding of Kafka from an operational perspective
• Desire to automate everything.
• Knowledge of best practices related to security, performance, and disaster recovery.
• Intellectual curiosity that motivates you to keep on top of technical trends.
• Highly organized, with the ability to juggle many tasks without losing sight of the highest-priority items.
• Ability to stay focused under pressure, prioritizing and managing multiple projects simultaneously in a very fast-paced environment.
• Extremely detail-oriented, organized, and a self-starter
• Demonstrated strong ownership and ability to drive issues to resolution.
• Excellent communication skills, both written and verbal
• You are self-motivated with the ability to work independently and in globally distributed teams.
• You are service-oriented and enjoy working with engineers to make the software development process as painless as possible, providing continuous improvement.
APPLY FOR THIS JOB:
Company: Caddie Consigliere, LLC
Name: Lany Sol
Email: