Junior Site Reliability Engineer
Summary:
We’re looking for a passionate Junior Site Reliability Engineer to join our team and help build and maintain highly reliable, scalable, and performant systems. You’ll work closely with development and operations teams to ensure our infrastructure is robust and our applications remain highly available.
Responsibilities:
Infrastructure Management:
Assist in deploying and managing infrastructure using tools like Docker and Kubernetes, ensuring efficient resource utilization and scalability.
Help maintain infrastructure as code using tools like Terraform, promoting consistency and reproducibility.
Monitoring & Observability:
Implement and manage monitoring and observability tools to proactively identify and address potential issues.
Analyze system metrics and logs to gain insights into system performance and stability.
Incident Response:
Participate in on-call rotations and respond to incidents, working collaboratively to troubleshoot and resolve issues.
Contribute to post-incident reviews and identify areas for improvement.
Automation:
Develop and maintain scripts and automation tools to streamline routine tasks and improve operational efficiency.
Contribute to the development and maintenance of CI/CD pipelines to automate testing and deployment processes.
Collaboration:
Work closely with development teams to understand their needs and challenges.
Advocate for reliability and performance best practices throughout the software development lifecycle.
Learning & Growth:
Continuously learn and expand your knowledge of SRE principles, tools, and technologies.
Stay up to date with industry trends and best practices.
Qualifications:
Education: Bachelor’s degree in Computer Science, Engineering, or a related field.
Experience: 0-2 years of experience in SRE, DevOps, or a related field.
Technical Skills: Familiarity with Docker, Kubernetes, Terraform, and version control systems (e.g., Git).
Basic understanding of Linux/Unix systems, scripting languages (e.g., Bash, Python), and monitoring tools (e.g., Prometheus, Grafana, Datadog) is a plus.
Experience with MongoDB is a plus.
Experience with cloud providers (e.g., AWS, Azure, Google Cloud) is a plus.
Soft Skills: Excellent communication, problem-solving, and teamwork skills. Ability to work calmly under pressure during incidents. Eagerness to learn and adapt in a fast-paced environment.
Benefits:
Mentorship: Work closely with experienced SREs to learn and grow.
Impactful Work: Contribute to building and maintaining systems that are critical to the company’s success.
Collaborative Environment: Work in a supportive and collaborative team environment.