Senior Site Reliability Engineer
Tendermint
San Francisco, United States / Berlin, Germany / Toronto
$100,000 to $150,000 a year
October 2018
This job posting is no longer available
Job Description
We're looking for someone who has:
- At least 5 years of software engineering experience with open source contributions.
- Written structured, high-quality programs and scripts for automation.
- Significant experience writing Golang or the ability and desire to become proficient in new languages.
- Experience developing, releasing, and maintaining production software and infrastructure tools like Elastic stack, InfluxDB stack, DataDog, PagerDuty, or VictorOps.
- Built solutions with a broad set of technologies in and around cloud solutions (AWS EC2, ECS, Route53, DynamoDB, RDS, Lambda, Docker, Google Container Engine, Kubernetes, or Docker Swarm).
- Implemented continuous deployment (Jenkins, CircleCI, Travis, Ansible, Chef, Puppet).
- Experience with SDLC tools (Git, GitHub, Atlassian Stash/Bitbucket, GitLab, JIRA).
- Experience with QA/SIT tools (Selenium).
- Experience in Linux system administration, including package management, network management, and security management.
- Familiarity with open source P2P networking protocols.
- Experience working in an agile development environment.
- The ability to take ownership and see initiatives through.
- Exceptional communication skills.
- Experience working with distributed teams.
What your primary responsibilities will be:
- Help scale software systems with automation, in an effort to improve reliability, velocity, and simplicity.
- Create, maintain, and improve the tooling for continuous integration and continuous delivery.
- Build and maintain tooling for deploying, monitoring, and maintaining clusters of Tendermint nodes on our testnets and mainnets.
- Build and maintain tooling to help shorten feedback cycles within teams and projects.
- Plan, build, and maintain public-facing services in line with business goals.
- Build tools to measure and monitor availability, latency, and overall system health.