Golang Distributed Systems Jobs Paying 50,000 USD a Year
Hand-Picked Golang jobs • Apply directly to companies • Clear salary ranges
Browse 500+ Golang Distributed Systems Jobs (1 new this week) in November 2024 at companies like Datadog, Castor EDC and Wallet Connect paying at least 50,000 USD per year working as an Open-Source Software Engineer, Site Reliability Engineer and Backend Golang Engineer.
10 of 545 Distributed Systems Jobs paying at least 50,000 USD per year
Open-Source Software Engineer Datadog New York City, United States / Paris, France / Remote $62,000 to $116,000 a year
August 2018
2 Applicants This Week
More Than 6 Months Old
This job posting is no longer available
Job Description
Datadog is building a world-class APM product that traces requests as they flow across complex systems. We are looking for an expert Go developer who can help push our tracing tools to the next level. Come and join us to build amazing open source software.
What you will do
Write open source code that instruments thousands of distributed applications written in Go around the world.
Drive our open source Go projects and engage with the community to find and address the most important challenges.
Join a great team building software the right way.
Who you must be
You’re a master Go programmer. You’ve written high-performance, concurrent applications and know your way around go tool pprof. You don’t reinvent the wheel but you prefer keeping your code concise and efficient.
You are a great community ambassador and can drive hard technical conversations towards a good solution.
You want to work in a fast, high growth startup environment.
You have a BS/MS/PhD in a scientific field.
Bonus Points
You have significant experience with Python, Java, JavaScript, Ruby or PHP.
You have experience with code telemetry and introspection.
Site Reliability Engineer Castor EDC Amsterdam, The Netherlands €60,000 to €80,000 a year
February 2020
2 Applicants This Week
More Than 6 Months Old
This job posting is no longer available
Job Description
Our true purpose at Castor
Castor is one of the leading platforms for data collection in medical research. We believe standardizing and reusing datasets is key to overcoming the healthcare challenges of the future.
How we operate
Our main Electronic Data Capture (EDC) application runs on a proven stack consisting of Ubuntu, Nginx, PHP and MySQL. For our cloud installations, we orchestrate these setups by using Terraform combined with Ansible for the server configuration management.
Due to the nature of processing medical data, we have clients in different regions across the globe, often with specific regulatory constraints around where and how their research data is stored. To meet these customer demands we combine both traditional as well as cloud-based hosting solutions.
Most of our clients prefer to run in Azure, but we’re using Google Cloud Platform for things like Kubernetes hosting of greenfield projects, blob storage for scalable file upload storage and their Key Management System (KMS) to further secure our data.
For our metrics we’ve begun standardizing on Prometheus and we’re moving towards Loki for log aggregation. We use PagerDuty for alerting, communicate via Slack and host our code on GitHub.
Why we’re growing our team
With our recent expansion have come new challenges, both in how we organize ourselves and in how we manage and scale our infrastructure in the future.
To further these efforts we have formed a Platform team consisting of SRE and Software Engineering, which we are now looking to grow by adding another SRE.
Due to the sensitive nature of medical data, Castor is certified for both ISO 9001 (quality) and ISO/IEC 27001 (information security). We also have to adhere to a number of other regulations, including Good Clinical Practice (GCP) guidelines.
Our goal is to unite these requirements with emerging SRE practices around infrastructure as code and other principles to create a well designed and documented system, while still allowing us to remain flexible to change.
How you will contribute
Our absolute commitment to patient data security and privacy informs our vendor selection with certified datacenter and cloud providers. To achieve real impact in medical research, Castor needs to operate securely around the world.
Historically, our production platform has run on top of managed hosting services. This model doesn’t scale well for our international footprint, which is why we are currently expanding our in-house knowledge and transitioning to Infrastructure-as-a-Service providers.
As a Site Reliability Engineer, you’ll have the ability to shape our operations and continuously deliver a working product. Working very closely with the development teams, you’ll collaborate in supporting and structuring our efforts around automation, observability and security. With your help we plan to scale the Castor platform to the next level.
Some things we worked on recently
Whilst there are many operational challenges as we continue to grow and scale at Castor, our Platform team has made great improvements to a variety of our systems already. To give you some examples of what we achieved last month:
Migrated our DNS to AWS Route53
Set up automatic documentation pipelines using MkDocs
Moved our CI/CD pipelines from Jenkins to CircleCI
Built a key-service on AWS Lambda to store disk encryption keys off-site for an otherwise region-local setup
Your background
You have helped run web-facing services under production workloads and have experienced the challenges that come with maintaining and scaling these systems. Making and owning decisions about systems architecture together with your team is something you enjoy and feel comfortable with.
Qualities we’re looking for include:
A good grasp on how *NIX systems operate
The ability to evaluate and implement best practices for IT operations
A working knowledge of both cloud-native and traditional systems architecture and the trade-offs between them
Experience with a configuration management framework such as Ansible, Chef, Puppet or SaltStack
The ability and desire to work with a wide range of open source technologies
A strong privacy- and security mindset
Experience with some aspects of Observability and distributed systems: from monitoring, logging and metrics instrumentation to resiliency to failure
A good understanding of how relational databases operate
Experience with at least one programming or scripting language, preferably Python or Go(lang)
Knowledge that a list of skills and requirements doesn’t mean you have to tick every single box to apply ;)
How we say thank you
At Castor we truly live our core values, believing we can achieve anything with a healthy and happy team. With this in mind, we offer the following benefits:
Our own ‘Castor Burrow’ - brand new offices by Amsterdam Amstelstation
A competitive salary plus an annual company bonus plan
Employee Stock Option Programme incentive
30 days annual leave plus 6 public holidays
An individual training and professional development budget
Flexible working with the opportunity to work from home 1 day per week
Meditation room with daily yoga, mindfulness and company subscription to Calm
Backend Golang Engineer Wallet Connect Remote / Berlin, Germany $85,000 to $100,000 a year
December 2021
3 Applicants This Week
More Than 6 Months Old
Job Description
WalletConnect is the open-source web3 standard to connect blockchain wallets to dapps. Started four years ago, our mission is to make web3 accessible to everyone. Every month, millions of people use WalletConnect in over 200 integrations.
We’re looking for a backend golang engineer to join our team to build and scale our network. To help grow web3, we recently launched WalletConnect 2.0 with new features, including multi-chain support, a decentralized back-end, faster connections, and 10x performance and scalability. You will help us expand and scale our backend messaging infrastructure.
You will be responsible for building Golang messaging services. A main challenge is growing our services to scale for our millions of users across billions of websocket connections every month, as well as ensuring security and resiliency.
To help with your role, you will have the support of our devops team to deploy and manage our infrastructure, will work closely with our protocol and SDK teams, and have exposure to the full WalletConnect stack.
The ideal candidate is immersed in the best practices of golang at scale, messaging systems and Websockets.
Responsibilities:
Building a microservice architecture based on Golang with scaling in mind
Work with protocols such as Websockets, gRPC
Help with monitoring by creating metrics with Prometheus and Grafana
Develop unit and integration tests for core business logic
Work closely with our devops team to manage and scale our infrastructure
Must have:
3+ years of professional experience in software development with at least one modern programming language, such as Golang, TypeScript, C++, Java, or Rust.
At least 1 year of professional Golang experience.
Experience using Postgres, AWS, with demonstrable experience with systems engineering and automation.
You have experience with network programming or distributed systems development
Experience working on products at scale
Nice to have:
Experience working on systems optimisation
Experience with k8s or Nomad a plus
Desire to learn more about Blockchain technologies or experience with PoS systems.
Familiarity with operations/SRE and the concept of infrastructure as code
Websocket experience
Benefits
What WalletConnect offers:
Fully remote position with flexible timezone (CET/EST preferred)
Software Engineer BlueLabs Europe (Remote) €58,000 to €76,000 a year
April 2021
1 Applicant This Week
More Than 6 Months Old
This job posting is no longer available
Job Description
At BlueLabs we are building a next-generation sports betting platform focused on performance, reliability, modularity and automation. After a period of experimentation, we are now excited to see our technology powering the launch of BetFox, a new B2C operator in Ghana.
To ensure the continuous enhancement of our platform while scaling up operations and entering additional African countries, we are now looking to grow our team. As a result, two of our teams (Betting and Account) are now on the lookout for seasoned Software Engineers who want to join our distributed team and help us execute our vision.
The Team
The Account Team is responsible for the development and daily operations of the core services powering business-critical functions such as player account management and wallets. Other focus areas include, but are not limited to: responsible gaming, integration with third-party payment providers, integration with Mobile Network Operators, and player acquisition and retention programs with a focus on personalisation and automation.
The Betting Team is responsible for designing, developing, and operating all services relating to the lifecycle of bets in our sports betting platform. This stretches from bet placement to bet settlement, including advanced features such as event- and player-based risk management, the ability to build complex bets from outcomes with dependent probabilities, and continuous calculation of early settlement offers (cash out).
The services built by our teams are to be concurrently used by thousands of users and are expected to be able to handle hundreds of thousands of daily transactions in a timely manner.
Sub-second latency is welcomed but high throughput has higher priority in the Betting domain. The goal is building a sports betting platform where no bet is rejected due to lack of capacity in the system. Bet settlement is worth a special mention as the platform needs to be able to quickly evaluate hundreds of thousands of bets upon the resulting of an underlying sporting event.
Raw performance isn't everything. The team must also ensure that the platform can be easily adapted to be compliant with the different and ever-changing regulatory demands our industry is facing all over the world. The ultimate goal is to ensure a fair and safe sports betting experience for all our players.
We are building a microservice architecture based on event sourcing using Pulsar. Our services are written in Golang and use PostgreSQL as an operational database. We use SemaphoreCI to deploy our services to a GKE cluster, which is provisioned using Terraform.
A good candidate should have high standards for themselves, a desire to build high-quality, well-tested, production-ready solutions and a drive to constantly improve their skills. We expect you to take ownership of some parts of the platform, be proactive over the entire development lifecycle and have the ability to work in a fast-paced environment. If this sounds scary, don’t worry: you won’t be alone in this. We value teamwork, trust, communication and a healthy working relationship, so you can always count on the team for support.
About You
You have good problem-solving skills, a tendency towards simple and effective solutions, and a “getting things done” mentality.
Analytical thinking, troubleshooting skills, attention to detail.
You are a reliable, trustworthy person that keeps their promises.
Interest in keeping yourself up to date and learning new technologies.
Product-oriented mindset and eagerness to take part in shaping the products we build.
Ability to work autonomously in a fully distributed team.
Good communication skills in verbal and written English.
Requirements
BS degree in Computer Science or similar technical field
1+ years of professional software development experience using Go
Experience building large-scale distributed systems, communicating asynchronously via message passing using RabbitMQ, Kafka or Pulsar
Deep understanding of DDD, CQRS, microservices architecture, and SQL/NoSQL data stores
Ability to write clean, efficient, maintainable, and well-tested code
Familiarity with test automation, cloud and containerization technologies, code instrumentation and CI/CD pipelines
Interest in taking full ownership of your services and managing them in a production environment including the troubleshooting of live incidents
Remote Work
We are hiring for talent, not for a specific location. You will find that members of our team are distributed all over Europe. Being a distributed team enables us to hire only the best, without being restricted to the talent pool available at a specific geographic location. However, to facilitate team communication and collaboration we currently require you to be located in Europe. You must also be able to travel to other European locations a few times a year for on-site meetings and workshops.
Compensation
The budgeted compensation range for this role is €58,000 to €76,000 annually, depending on your background and experience. As an independent contractor, you will be responsible for paying any taxes or applicable fees in your country of residence. In addition to that, we offer a number of perks to each of our team members as we truly believe in a healthy work-life balance and continuous learning.
Software Engineer WIN.com Remote €75,000 to €95,000 a year
February 2021
2 Applicants This Week
More Than 6 Months Old
Job Description
About win.com
We’re a remote-first, fast growing tech startup that brings together the excitement of gaming and the thrill of real money competitions encapsulated in an all-in-one bite-sized entertainment experience.
WIN helps developers tap into a global realm of game monetisation by enabling real money tournaments in any skill-based game.
About the role
They say good looks only take you so far - that’s why we need you to demonstrate that our products are not only good-looking but also highly functional. As a Software Engineer, you will have an opportunity to solve highly technical problems to shape WIN's backend systems, infrastructure, development and deployment practices while evangelising a strong engineering culture.
Your work will have a direct impact on the User Experience of all the Win.com players across the world and the internal systems.
This is a contract, per-project, as-needed or full-time role
What you’ll do
Play a key role working on the backend services and infrastructure that powers WIN and other products
Build platforms, services, and APIs
You’ll chiefly be using Go in our various backend and data engineering projects
Using a range of different data stores across our teams, including but not limited to PostgreSQL, Redis and Bleve
You'll be working with RabbitMQ for queues
Collaborate with our cross-functional teams
Superpowers you’ll need
4-6 years of experience architecting and maintaining backend systems
Proven experience with Go; and a great foundation with another programming language (e.g. Java, NodeJS, PHP or Python, etc)
Experience working with Protobufs, gRPC & HTTP/2
Understanding of modern software engineering practices in areas like CI/CD, test automation, micro services, distributed systems, and data management
Experience working in a cloud environment such as Google Cloud Platform
Technical vision, ability to understand abstract problems and architect systems that help solve them
A good understanding of application, information and infrastructure architectures, such as API / SDK development and integrations
Experience working in a cloud environment such as Google Cloud Platform or AWS
Experience with technologies such as Prometheus, Grafana, Kibana is a plus
Excellent English communication skills to collaborate with a service-oriented team
At Cloudflare, we have our eyes set on an ambitious goal: to help build a better Internet. Today the company runs one of the world’s largest networks that powers trillions of requests per month. Cloudflare protects and accelerates any Internet application online without adding hardware, installing software, or changing a line of code. Internet properties powered by Cloudflare have all web traffic routed through its intelligent global network, which gets smarter with every request. As a result, they see significant improvement in performance and a decrease in spam and other attacks. Cloudflare was recognized by the World Economic Forum as a Technology Pioneer and named to Entrepreneur Magazine’s Top Company Cultures list.
We realize people do not fit into neat boxes. We are looking for curious and empathetic individuals who are committed to developing themselves and learning new skills, and we are ready to help you do that. We cannot complete our mission without building a diverse and inclusive team. We hire the best people based on an evaluation of their potential and support them throughout their time at Cloudflare. Come join us!
In this role, you can expect to:
Work on highly distributed and scalable systems
Participate in the constant cycle of knowledge sharing and mentoring
Manage and develop some of the biggest clusters in the world
Research and introduce cutting-edge technologies
Contribute to open-source
We are still a small team, well-funded, growing quickly and focused on building an extraordinary company. This is a superb opportunity to join a high-performing team and scale our high-growth network as Cloudflare’s business grows. You will build tools to constantly improve availability, performance, uptime and response times. You will nurture a passion for an “automate everything” approach that makes systems failure-resistant and ready-to-scale.
You may be a good fit for our team if:
You have proven skills in designing, developing and delivering highly available (HA), scalable production systems.
You have deep knowledge of configuration management software, preferably Salt.
You have solid experience with cluster management systems (Kubernetes, Mesos)
You are comfortable with developing software in Go or Python
You know how network services (DNS, TLS/SSL, HTTP) and network fundamentals (DHCP, subnetting, routing, firewalls, IPv6, BGP) work
You have strong experience designing and managing multi-tenant database solutions (ClickHouse, PostgreSQL, CockroachDB)
You are confident in your knowledge of load balancers (nginx, HAProxy)
Bonus points if:
You have strong operational skills and are an expert in bash scripting
You have practical knowledge of web and systems performance and have extensively used tracing tools like eBPF and strace.
What Makes Cloudflare Special?
We’re not just a highly ambitious, large-scale technology company. We’re a highly ambitious, large-scale technology company with a soul. Fundamental to our mission to help build a better Internet is protecting the free and open Internet.
Project Galileo: We equip politically and artistically important organizations and journalists with powerful tools to defend themselves against attacks that would otherwise censor their work, technology already used by Cloudflare’s enterprise customers--at no cost.
Athenian Project: We created Athenian Project to ensure that state and local governments have the highest level of protection and reliability for free, so that their constituents have access to election information and voter registration.
Path Forward Partnership: Since 2016, we have partnered with Path Forward, a nonprofit organization, to create 16-week positions for mid-career professionals who want to get back to the workplace after taking time off to care for a child, parent, or loved one.
1.1.1.1: We released 1.1.1.1 to help fix the foundation of the Internet by building a faster, more secure and privacy-centric public DNS resolver. This is available publicly for everyone to use, and it is the first consumer-focused service Cloudflare has ever released. Here’s the deal: we never, ever store client IP addresses. We will continue to abide by our privacy policy and ensure that no user data is sold to advertisers or used to target consumers.
Sound like something you’d like to be a part of? We’d love to hear from you!
Have you ever tried to monitor your infrastructure? We have, and our experience using multiple monitoring SaaS products drove us to build Watchly - a monitoring solution that transforms the way you monitor your products and makes life better for engineers. No more waking up at 2 am and correlating incident data from three different websites, no more ugly & confusing charts and logs, no more maintaining 3 different agents on each VM. One system to rule them all.
At Axiom we are transforming the self-hosted software experience, building a product suite that encapsulates everything a business needs while ensuring a high-quality experience. Our focus on ease of use, security, and privacy ensures our customers get all the benefits of traditional SaaS products, right inside their infrastructure.
About the Engineering Team
Engineers at every level directly impact improvements across the product, from feature scoping through design to end polish. Building an outstanding experience for each of these user flows is made more complex by our goal of creating what is best for customers - rather than what is easiest to deploy.
About the Role
As a software engineer at Axiom, your breadth of skills paired with our bottom-up product process will give you as much autonomy and license as you can handle. If you can build it and it’s good for the company, do it! There's no limit to how valuable you can be or how much impact you can make here. We’re looking for people who want to make a mark on the world—who have the ambition to dream big and the talent to bring those dreams to fruition.
Responsibilities
Explore new systems and processes while also being able to discuss when (or when not) to use them.
Help further design and implement our distinct homegrown time-series database from an architectural and engineering viewpoint.
With a focus on performance and stability, take our time-series database to the next level.
Participate in a culture that values thoughtful code reviews and frequent deploys.
Must-Have Qualifications
Possess a deep understanding of software architecture, design, and testing
Comfortable around Database fundamentals such as:
(Probabilistic) Data Structures
Big O notation
File systems
SQL processing
Distributed systems
Concurrency control
Data replication & Consensus Algorithms
Caching
Be proficient with Go and shell scripting
Familiarity with Unix systems
Nice-to-Have Qualifications
Be familiar with, and comfortable contributing to, robust backend tooling to support our growing team.
Understand the ins and outs of debugging cloud systems, and have in-depth experience with tuning performance for massive datasets
Experience writing documentation and tests, appreciating their importance to the team and product
Open source contributions, projects, and working with communities
More About Us
The team at Axiom has been fortunate to work together for many years across multiple companies and multiple products.
Throughout our journey, we would come across services that we wanted to use for monitoring/data visualization/etc, and we would always have a tough choice to make: hand over our data to a third party to get a fully featured product, or use a half-baked solution that could run inside our infrastructure and allow us to keep our data safely in our hands.
When the previous company we worked for was acquired by Microsoft, we decided to take that opportunity to work on this problem. We decided to build polished, featureful, and easy to use products which didn’t sacrifice privacy and security.
A few months ago we started out with the vision of building a next generation sports betting platform focused on performance, reliability, modularity and automation. We believe that our experience paired with today’s technologies, great talent and the agility of a startup environment will enable us to deliver a best-in-class product that meets the demands of the market of tomorrow.
Our Account Team is now on the lookout for an experienced Software Engineer who wants to join our distributed team and help us execute our vision.
The Team
The Account Team is responsible for the development and daily operations of the core services powering business-critical functions such as player account management and wallets. Other focus areas include, but are not limited to: responsible gaming, integration with third-party payment providers, and player acquisition and retention programs with a focus on personalisation and automation.
The services owned by the team are to be simultaneously used by thousands of users around the globe and are expected to be able to handle hundreds of thousands of daily transactions in a timely manner.
Raw performance isn't everything. The team must also ensure that the platform can be easily adapted to be compliant with the different and ever-changing regulatory demands our industry is facing all over the world. The ultimate goal is to ensure a fair and safe sports betting experience for all our players.
Remote Work
We are hiring for talent, not for a specific location. You will find that members of our team are distributed all over Europe. Being a distributed team enables us to hire only the best, without being restricted to the talent pool available at a specific geographic location. However, to facilitate team communication and collaboration we currently require you to be located in a European time zone (between UTC-1 and UTC+3). You must also be able to travel to other European locations a few times a year for on-site meetings and workshops.
Compensation
The budgeted compensation range for this role is €55k-75k annually, depending on your background and experience. As an independent contractor you will be responsible for paying any taxes or applicable fees in your country of residence (unless you are based in Malta, in which case you will be employed). In addition to that, we offer a number of perks to each of our team members as we truly believe in a healthy work-life balance and continuous learning.
Requirements
BS degree in Computer Science or similar technical field
2+ years of professional software development experience using Go
Interest in or previous experience with Elixir will be considered an asset
Experience building large-scale distributed systems, communicating asynchronously via message passing using RabbitMQ or Kafka
Deep understanding of DDD, CQRS, microservices architecture, and SQL/NoSQL data stores
Interest in test automation, cloud and containerization technologies, code instrumentation and CI/CD pipelines
Interest and ability to keep yourself up to date and learn new languages, frameworks and technologies as required
Interest in taking full ownership of your services and managing them in a production environment including the troubleshooting of live incidents
Ability to work autonomously in a fully distributed team
Good communication skills in verbal and written English
Principal Software Engineer SendGrid Denver, Colorado, United States $130,000 to $170,000 a year
October 2018
3 Applicants This Week
More Than 6 Months Old
This job posting is no longer available
Job Description
The Principal Software Engineer opening is an exciting opportunity to join SendGrid’s Customer Growth Engineering team, developing features and software that impact all points of the customer lifecycle. You’ll make a tremendous impact with the team that spearheads microservice development and operability at SendGrid, using the latest distributed systems programming techniques and technologies such as rate limiting, circuit breakers and multi-datacenter deployments (including AWS). You’ll bring the ability and experience to write complex backend services, communicate effectively with cross-functional teams, and have a tremendous drive to hone your craft.
Denver is our global headquarters and home to the Customer Growth Engineering team, our revenue growth engine - which your efforts will directly impact.
What You’ll Do
Live by and champion our cultural values of Happy, Hungry, Honest, and Humble
Design entire systems from scratch, end-to-end, that can fit into the SendGrid architecture
Develop solutions for complex problems both independently and with team members
Work with other teams to troubleshoot/determine resolution for complex issues across team domains
Focus on designing and implementing systems for scalability, testability, supportability and maintainability
Use your foresight and experience to keep our systems effectively running now and in the future through profiling, load testing, failure testing, monitoring and much more to have confidence in the robustness of the systems we deploy
Lead team initiatives and implementations from conception to completion
Recommend and champion improvements to our software and product development process
Drive improvements in the quality of the team's work output
Site Reliability Engineer PubNative Berlin, Germany €40,000 to €65,000 a year
October 2018
4 Applicants This Week
More Than 6 Months Old
This job posting is no longer available
Job Description
PubNative is a mobile publisher platform that serves native ads via a scalable and flexible API for mobile apps and web. Our publisher-first approach focuses on the specific needs of each publisher across all verticals. Our ad serving technology is used by developers and publishers around the world.
Our system consists of a myriad of high load Golang-based APIs, iOS SDKs, Ruby/Rails 5 dashboard, Scala and Spark data- and ML pipelines, Druid OLAP system, running on a Mesos and Kubernetes cluster.
We're always on call to keep our networks up and running, ensuring our users have the best and fastest experience possible. We follow an “Infrastructure as Code” model and immutable deployment strategies.
We are looking for a Site Reliability Engineer (m/f) to help us build and operate infrastructure platforms, and provide technical consultancy to engineering teams on how to build reliable, scalable and efficient services.
Our Responsibilities:
- You help us build a hybrid, poly-cloud-provider environment
- You help to design, develop and operate monitoring and tracking platforms
- You drive scalability and operability of supported systems/infrastructure
- You participate in the on-call rotation and are on call for the services you build and support
- You work with other teams to provide consultations in systems architecture support for new and existing production systems
- You write code to automate tasks, support SLAs for production systems, and support other engineering teams on reliability, scalability and efficiency topics
- You manage OS image/templates via Packer, provision infrastructure via Terraform
- You support CI/CD and make new pipelines
- You engage in and improve the whole lifecycle of services—from inception and design, through deployment, operation, and refinement
- You support services before they go live through activities such as system design consulting
- You maintain services once they are live by measuring and monitoring availability, latency, and overall system health
Our Requirements:
- 3+ years of experience in a Site Reliability role/Full-stack developer
- Experience with public cloud providers (AWS, Google Cloud, Digital Ocean, etc.) and Infrastructure as Code (Terraform)
- Strong programming skills and familiarity with modern programming languages: Go, Ruby, Python, Shell etc.
- Knowledge of managing Docker containers and microservices via Kubernetes
- Experience building and monitoring systems and metric collection pipelines
- Track record of building automation and solving multi-datacenter/clouds infrastructure problems
- Knowledge of algorithms, data structures, complexity analysis, software design and reverse engineering
- Interest in designing, analyzing and troubleshooting large-scale distributed systems
- Experience working with source control - Git
- Experience with continuous integration platforms such as TeamCity, Jenkins, CircleCI etc.
- Understanding of Agile, DevOps practices such as CI/CD, automated testing etc.