Qualification
B.E. / B.Tech. / MCA / M.Sc. / M.Tech.
Experience
2 to 14 years
Key Technologies / Stack
- LoadRunner, JMeter, Unix Administration, Ansible
- Docker, Kubernetes, DevOps (Jenkins, GitLab, infrastructure configuration)
- Azure/GCP Cloud
- Python, JavaScript, Git
- MySQL / MongoDB
Nice to have
- Redis, Elasticsearch, Prometheus, Grafana, Kibana
- Database tuning and administration
- Security testing, VA/PT tools, OWASP
Roles and Responsibilities
As a System Reliability Engineer (SRE), you will be responsible for keeping in-house and SaaS applications up and running, as well as improving the automation, scalability, and performance of systems.
Responsibilities
- Troubleshoot availability and performance issues, debug distributed applications by analyzing logs, metrics, and APM (application performance management) data, and perform front-line remediation
- Communicate with management and customers regarding aberrant system behavior
- Monitor and audit all aspects of a production application stack and create alerts and dashboards based on data received through monitoring
- Influence software and architecture design based on system and architecture observations related to performance and reliability
- Design, develop and maintain automation software, scripts, and tools
- Analyze and remove bottlenecks in the development workflow
- Manage and lead a small team, prepare shift plans, and handle other team management activities
Behavioral skills
- Excellent verbal and written communication skills
- Strong problem-solving skills
- Passion for technology as well as helping customers and team members
- Comfortable interacting with external customers and internal stakeholders, with good interpersonal skills
- Strong leadership traits and a self-learning attitude
- Proactive in taking on additional responsibilities
Technical Skills
- Good programming skills in at least one of C/C++, Java, JavaScript, shell scripting, Python, or Go, and the ability to pick up new languages
- Strong knowledge of the Linux environment, including its fundamentals, internals, and administration
- Strong knowledge of at least one widely adopted database platform (MySQL, PostgreSQL, or Elasticsearch is a plus)
- Develop automation tools and frameworks to automate operational tasks and the deployment of code, applications, services, and machines
- Represent SRE in design reviews and work cross-functionally with Engineering teams on operational readiness
- Knowledge of Docker, Kubernetes, and containers is a plus
- Knowledge of or experience with security standards such as ISMS or PCI DSS
Location: Pune
Company: ElasticRun
The job is closed.