Hand-Picked Golang jobs • Apply directly to companies • Clear salary ranges
Browse 100+ Golang Monitoring Jobs (1 new this week) in June 2025 at companies like Docker, Axiom Inc. and World Open Network, with salaries from $25,000 to $250,000, working as a Senior Software Engineer (Backend); Software Engineer, Core Database; or DevOps Engineer.
The Docker Hub team develops and maintains the largest and most popular container registry service in the world today, Docker Hub. Millions of users - community developers, open source projects and Independent Software Vendors - push and pull Docker container images billions of times through Docker Hub. If you are an experienced backend Software Engineer and want to play a critical role in the evolution of Docker Hub and Docker’s next chapter, then this role is for you.
As a Senior Software Engineer in the Docker Hub team, you will build features around the container registry that operate reliably at massive scale and deliver a differentiated experience for free and paid users of Docker Hub. You will develop microservices and serverless functions that offer new functionality to other services within Docker Hub’s service-oriented architecture, in addition to enhancing existing services. You will constantly seek ways to improve the monitoring and reliability of the various Docker Hub services, as well as the CI/CD around them, to ensure we maintain a high level of quality with a fast pace of delivery. Finally, you should be passionate about how developers’ lives could be made easier, and about Docker’s role in that.
Responsibilities
Develop, deploy and monitor microservices and serverless components in AWS
Scale the world’s largest repository of container images
Play an active role in product discussions, influence the roadmap and end user experience, take ownership and responsibility over new projects and features, and turn those ideas into reality
Deploy infrastructure for AWS using Terraform
Build and improve team automation tools including GitHub Actions, Slack integrations, and Grafana dashboards
Interact with other teams within Docker, as well as with upstream open source communities and our users
Be ready to tackle high performance engineering challenges
Play an active role in improving the way Hub services are tested and deployed
Qualifications
5+ years experience building SaaS products with modern languages like Golang, Python or Java
Understanding of the challenges of running a SaaS platform at global scale
Good written communication skills
Ability to work remotely across time zones
Solid API design skills (straightforward, unsurprising, defensible)
Direct experience developing applications at web scale
Proven ability to learn new technologies and languages, and to switch between them as necessary
Follow good software engineering practices such as code review, source control, continuous integration and testing
Ability to work in a team with other developers and to partner with User Experience experts, Product Management and Operations teams
Preferred qualifications
Experience with developing Microservices
Experience with Docker and Kubernetes
Experience with modern monitoring and logging platforms
Have you ever tried to monitor your infrastructure? We have, and our experience using multiple monitoring SaaS products drove us to build Watchly - a monitoring solution that transforms the way you monitor your products and makes life better for engineers. No more waking up at 2 am and correlating incident data from three different websites, no more ugly & confusing charts and logs, no more maintaining 3 different agents on each VM. One system to rule them all.
At Axiom we are transforming the self-hosted software experience, building a product suite that encapsulates everything a business needs while ensuring a high-quality experience. Our focus on ease of use, security, and privacy ensures our customers get all the benefits of traditional SaaS products, right inside their infrastructure.
**About the Engineering Team**
Engineers at every level directly impact improvements across the product, from feature scoping through design to end polish. Building an outstanding experience for each of these user flows is made more complex by our goal of creating what is best for customers - rather than what is easiest to deploy.
**About the Role**
As a software engineer at Axiom, your breadth of skills paired with our bottom-up product process will give you as much autonomy and license as you can handle. If you can build it and it’s good for the company, do it! There's no limit to how valuable you can be or how much impact you can make here. We’re looking for people who want to make a mark on the world—who have the ambition to dream big and the talent to bring those dreams to fruition.
Responsibilities
Explore new systems and processes, while also being able to discuss when (or when not) to use them.
Help further design and implement our distinct homegrown time-series database from an architectural and engineering viewpoint.
With a focus on performance and stability, take our time-series database to the next level.
Participate in a culture that values thoughtful code reviews and frequent deploys.
Must-Have Qualifications
Possess a deep understanding of software architecture, design, and testing
Comfortable with database fundamentals such as:
(Probabilistic) Data Structures
Big O notation
File systems
SQL processing
Distributed systems
Concurrency control
Data replication & Consensus Algorithms
Caching
Be proficient with Go and shell scripting
Familiarity with Unix systems
**Nice-to-Have Qualifications**
* Be familiar with, and comfortable contributing to, robust backend tooling to support our growing team.
* Understand the ins and outs of debugging cloud systems, and have in-depth experience with tuning performance for massive datasets
* Experience writing documentation and tests, appreciating their importance to the team and product
* Open source contributions, projects, and working with communities
**More About Us**
The team at Axiom has been fortunate to work together for many years across multiple companies and multiple products.
Throughout our journey, we would come across services that we wanted to use for monitoring/data visualization/etc, and we would always have a tough choice to make: hand over our data to a third party to get a fully featured product, or use a half-baked solution that could run inside our infrastructure and allow us to keep our data safely in our hands.
When the previous company we worked for was acquired by Microsoft, we decided to take that opportunity to work on this problem. We decided to build polished, featureful, and easy to use products which didn’t sacrifice privacy and security.
DevOps Engineer World Open Network Menlo Park, United States $110,000 to $140,000 a year
July 2019
1 Applicants This Week
More Than 6 Months Old
This job posting is no longer available
Job Description
We are an exciting start-up company founded by proven leaders with repeated success in the technology space. Our newest company is developing a cryptocurrency platform based on an open-source third generation blockchain that we’re creating. Our goal is to set a new standard in security and protection for our end users and community.
We’re looking for an experienced DevOps engineer responsible for automating and managing the technology processes around development, testing, release, and deployment. Working closely with developers, support and the product manager, the DevOps engineer uses continuous integration tools, scripts and manual processes to ensure that all teams have access to the systems and tools necessary to perform their jobs at all times.
The position reports directly to the VP of Infrastructure, Operations and CISO. The goal is to keep the department running in an efficient and profitable manner, to increase customer satisfaction, loyalty and retention, and to maintain standards and meet expectations of WON’s services, both internally and externally.
Responsibilities
- Pursue a rigorous, disciplined approach to software development process and automation.
- Develop, test and maintain build and deployment scripts in CI/CD framework/tools to automate and streamline deployment processes.
- Drive the Operations team toward automation and deployment best practices.
- Lead efforts in automation, continuous deployment, build, and configuration management.
- Actively participate in Engineering Scrum and design meetings to drive quality releases.
- Monitor applications with Application Performance Monitoring tools.
- Produce and maintain documentation on installations, procedures and requirements for systems.
- Participate in on-call rotations.
- Assist with the development and implementation of mission critical applications
- Assist with the development of robust, scalable, high performing, high-volume production applications with users across the globe
- Build internal systems and support business needs with your domain expertise
Qualifications
Minimum qualifications:
- Bachelor's Degree+ in engineering or computer science
- Expert skills with Linux, networking, storage, and virtualization
- Automation with tools like Ansible/Chef/Puppet.
- Experience with setting up and supporting CI/CD for Java/C++/C#, Go, Nginx, Ruby, MySQL, Redis, RabbitMQ, NodeJS development environments
- Experience with Docker containers and Kubernetes a definite plus.
- Experience with setting up full stack Monitoring and alerting with tools like Sensu, Splunk or Nagios
- Proficiency in a high-level scripting language like Ruby, Python, shell scripts etc.
- Ability to plan and execute software and infrastructure upgrades based on data-driven capacity planning
- Experience with build systems (Makefiles/Scons), and release management tools (Git, Jenkins, Jira)
Benefits
- Competitive Salary
- Awesome bonus
- 20 days annual leave
- 8 days personal leave
- 100% medical, dental and vision insurance
- Life insurance
- 401(k) and FSA
- Free shuttles between Caltrain Menlo Park and office
- Gym on site, accessible 24/7
- Located on corner of Marsh Road and 101, by the Dumbarton Bridge exit.
- Loads more!
Site Reliability Engineer Gtmhub Sofia, Bulgaria €30,000 to €35,000 a year
July 2019
1 Applicants This Week
More Than 6 Months Old
This job posting is no longer available
Job Description
Gtmhub is the world’s most beautiful and intuitive Objectives and Key Results (OKRs) management and employee experience solution. We build enterprise-scale software with a consumer-grade experience.
We help organizations amplify revenue growth by aligning every employee with their corporate purpose using the OKRs method. We are big believers in the power of employee experience to drive productivity, so our product facilitates best practice employee success features.
At heart, we are product people who love data so much that we built the only solution that integrates more than 150 data connectors to allow for true automation of progress and productivity management.
The Role
The term site reliability engineering is credited to Benjamin Treynor Sloss, Vice President of Engineering at Google. He said site reliability engineering is “what happens when a software engineer is tasked with what used to be called operations.”
To us, a Site Reliability Engineer (SRE) is responsible for the availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning of their services.
SREs design and implement automation with software to replace human labor. They want systems that are automatic, not just automated—such that their services are able to run and repair themselves.
Responsibilities
Engage in and improve the entire lifecycle of services—from inception and design, through to deployment, operation, and refinement/system tuning
Support services before they go live through activities like system design consulting, developing software platforms and frameworks, capacity planning and launch reviews
Maintain services once they are live by measuring and monitoring availability, latency, and overall system health
Identify performance bottlenecks and troubleshoot performance issues
Scale systems sustainably through mechanisms like automation, and evolve systems by advocating for changes that improve reliability and velocity
Practice sustainable incident response and postmortems
Basic Qualifications
Experience with algorithms, data structures, complexity analysis, and software design
Ability to work across teams (business and technical) to continuously analyze system performance in production, troubleshoot consumer reported issues, and proactively identify areas requiring optimization
Preferred Qualifications
Expertise in designing, analyzing and troubleshooting large-scale distributed systems
A systematic problem-solving approach, accompanied by effective communication skills, a sense of ownership, self-direction, and drive
Ability to debug and optimize code and to automate routine tasks
Practical experience in supporting application reliability practices for consumer-facing web and mobile experiences
We started in Sofia in 2015 with a mission to ship a world-class data management and analytics engine which allows companies to automatically track and visualize KPIs in real-time and create custom insights to inform goal setting, performance management, and long-term strategic decision making. Today we operate across offices in Sofia, London, Berlin, and San Francisco.
Apply today if our mission inspires you! Join us in developing yourself and others as our Site Reliability Engineer.
Principal Software Engineer SendGrid Denver, Colorado, United States $130,000 to $170,000 a year
October 2018
1 Applicants This Week
More Than 6 Months Old
This job posting is no longer available
Job Description
The Principal Software Engineer opening is an exciting opportunity to join SendGrid’s Customer Growth Engineering team, developing features and software that impact all points of the customer lifecycle. You’ll make a tremendous impact with the team that spearheads microservice development and operability at SendGrid, using the latest distributed systems programming techniques and technologies like rate limiting, circuit breakers and multi-datacenter deployments (including AWS). You’ll bring the ability and experience to write complex backend services, communicate effectively with cross-functional teams, and have a tremendous drive to hone your craft.
Denver is our global headquarters and home to the Customer Growth Engineering team, our revenue growth engine - which your efforts will directly impact.
What You’ll Do
Live by and champion our cultural values of Happy, Hungry, Honest, and Humble
Design entire systems from scratch, end-to-end, that can fit into the SendGrid architecture
Develop solutions for complex problems both independently and with team members
Work with other teams to troubleshoot/determine resolution for complex issues across team domains
Focus on designing and implementing systems for scalability, testability, supportability and maintainability
Use your foresight and experience to keep our systems effectively running now and in the future through profiling, load testing, failure testing, monitoring and much more to have confidence in the robustness of the systems we deploy
Lead team initiatives and implementations from conception to completion
Recommend and champion improvements to our software and product development process
Drive improvements in quality of team's work output
Verse is a venture-funded startup headquartered in Barcelona. We’re a global payments technology platform: our beautiful product lets people pay each other back instantly. And it’s working: people love sending money with Verse, and we’re evolving the way the world pays. From our founding in 2015 to our launches in over 15 countries today, Verse’s rapidly expanding global presence continues to revolutionize digital payments. If you want to have a huge impact on the world, this is the place to be!
Job Description:
Love Payments? Love software engineering? Join the Verse engineering team! We are looking for more great engineers who are passionate about building excellent user experiences to help us connect the universe through payments.
As a Verse Senior Backend Engineer you will create delightful software for all of our users around the world. You’ll own the back-end development for one or more projects and tackle tough design and product problems alongside other world-class engineers. Our back-end team uses the latest technologies (Go, Python, Kubernetes) and we’ll need your expertise to keep us at the cutting edge by rapidly developing fast and secure experiences for our users.
What you’ll do:
Write clean code to develop functional web APIs
Do code reviews and care about code quality
Write technical specifications, evaluating the trade-offs
Help maintain the infrastructure (monitoring, logging) and deployment pipelines
Have ownership of your work, from design to deployment and operation.
Build robust, lasting, and scalable products
Troubleshoot and debug applications
Collaborate with our app developers to integrate user-facing elements with server side logic
Build reusable code and libraries for future use
Write unit and integration tests for your code
Collaborate and work well with the team members
Who we're looking for:
BS/BA in Computer Science or similar experience in technical field
Excellent coder, who writes clean and maintainable code
Experience designing and developing REST APIs
Experience operating a Kubernetes cluster
Experience designing robust relational data models
Motivation when faced with tough technical challenges
Love of learning and passion when it comes to helping others
Nice to have:
Work experience programming in Go and/or Python (Django)
Good understanding of PostgreSQL
Monitoring with Prometheus/Grafana and Elasticsearch/Kibana
Experience writing distributed systems using queues (e.g. RabbitMQ)
Experience in Google Cloud Compute
Background in the fintech industry
Perks & Benefits
❤️ Excellent place to work
💎 Amazing offices in the city center of Barcelona
🌍 Being part of a talented, multicultural and dynamic team
⏰ Flexible working hours
🍎 Healthy lunch, fresh fruit, juices, and coffee whenever you want
📚 Training / Meetups / Events budget
💲 Stock Options
💲 Referral Bonus
As a Senior Software Developer at WebBeds, you'll play a pivotal role within our self-managed delivery team, driving the development of sophisticated software solutions that align with our technical vision and business goals, embracing our core values of Trust, Teamwork, and Technology. Leveraging your extensive experience and technical expertise, you'll contribute to architectural decisions, mentor junior team members, and provide critical insights to shape our technology landscape. You'll lead by example with a strong work ethic that values personal and professional well-being, and inspire the wider team to embrace these principles, fostering a culture of health and well-being.
Key elements to the role include:
Lead technical discussions within the self-managed delivery team, contributing insights during user story definition, planning sessions, and stakeholder interactions through collaboration, communication, and leading by example.
Design and develop complex software solutions that adhere to WebBeds' technical standards and meet business requirements, demonstrating mastery of software development principles and fostering innovation, and embracing accountability, challenges, and continuous development.
Provide mentorship to junior team members, nurturing their progress and enriching the team's overall skills, while empowering, guiding, and acknowledging their efforts.
Drive architectural decisions, contributing to the evolution of WebBeds' technology stack and ensuring scalability, maintainability, and performance.
Work closely with cross-functional teams, such as QA engineers, product owners, and business analysts, to ensure smooth integration of development work, embodying the anatomy of effective teamwork, professionalism, and thoughtful consideration.
Implement coding best practices, adhering to design patterns, SOLID principles, and security practices.
Participate in code reviews, promoting a culture of constructive feedback and continuous improvement, while embracing mistakes as opportunities for learning and growth.
Troubleshoot and resolve complex technical issues, optimizing software solutions for performance and reliability.
Stay informed about industry trends, emerging technologies, and advancements to continually enhance your skills and contribute to technical innovation.
Inspire and lead by example, fostering a security-first mindset and influencing the team to prioritize security in all development activities.
The skills we would love to see in your suitcase!
Strong proficiency in at least one of C#, Golang, or PHP, with a demonstrated ability to read, understand, and debug code. Adaptability to cross-train in the other languages as needed, with the mindset to use the right tool for the job.
Debugging complex issues in distributed systems (e.g., deadlocks, race conditions) and optimizing resource-heavy processes
Mastery of concurrency patterns:
C#: Task Parallel Library (TPL), asynchronous programming, and thread-safe collections.
Go: Goroutines, channels, and sync package primitives (e.g., mutexes, wait groups).
Experience designing and optimizing RESTful APIs for high traffic (10k+ concurrent users) and low-latency requirements is a plus.
Strong knowledge of distributed system architectures and design patterns.
Experience with microservices architecture and service-oriented design.
Experience with containerization and orchestration (Docker, Kubernetes) & understanding of infrastructure-as-code principles is a plus.
Familiarity with CI/CD pipelines and tools (e.g., Jenkins, GitHub Actions).
Experience with monitoring and logging tools for distributed systems (e.g., Grafana, Datadog).
Demonstrated ability to quickly learn new technologies and programming languages.
Willingness to work with and understand legacy code to be able to modernize it.
Curiosity and enthusiasm for staying updated with industry trends.
Ability to work independently and take initiative to solve problems and improve systems.
Demonstrated expertise with clean coding principles and a commitment to producing high-quality code.
Excellent written and verbal communication skills, with the ability to explain technical concepts to both technical and non-technical audiences.
A collaborative team player, adept at thriving within a self-organized team structure and embracing shared responsibilities.
Happy to follow our motto: Build it, Ship it, Support it.
We are looking for a Senior Security Engineer to join our Security Team. In this role, you will propose new ideas and improvements, collaborate with peers on the architecture, and implement new software solutions for container and network security.
Responsibilities:
As a Senior Software Engineer, you will work in a highly skilled team and collaborate throughout the full development lifecycle.
Participate in feature brainstorming, requirement gathering, and customer needs solving
Design and architect software, leveraging your experience to ensure scalability, maintainability and performance
Write clean, maintainable and documented code using best practices and coding standards
Ensure software reliability through test coverage and local testing
Contribute to maintaining CI/CD pipelines and the monitoring and alerting stack
Respond to incidents to resolve customer issues or service disruptions.
We are looking for a Senior Software Engineer with a passion for platform engineering to join our Wire Team. This role has a strong focus on improving internal Developers' Experience (DX) tools and platforms that are integral to the success of our development process.
Initially, a portion of your time will be spent working closely with the Wire Team. This will help you familiarize yourself with CAST AI’s engineering practices and gain a deep understanding of our product. In the Wire team, your main responsibility will be managing and improving the observability (o11y) stack. Long term, you will have the exciting opportunity to transition and play a key role as one of the founding members of our Tooling Team, where you will shape and influence our approach to DX tooling and automation at scale. In this position, you will:
Maintain and optimize the observability (o11y) stack: manage Prometheus, Grafana, Loki, Phlare, Tempo, and other relevant observability tools. Ensure our monitoring, alerting, and logging systems provide a frictionless way to define engineering team-related alerts.
Improve continuous integration and delivery: manage and optimize CI/CD pipelines using tools like GitLab Pipelines, GitHub Actions, ArgoCD, and Helm, ensuring efficient and reliable deployment processes.
Development environment management: enable other engineering teams by maintaining and extending the existing local development tooling managed by Tilt.
Oversee incident management systems: integrate with incident management and alerting tools such as Opsgenie, PagerDuty, or similar to enhance our response capabilities and reduce downtime.
Staff Golang Engineer Rialtic USA, Remote (EST, CST, MST) $200,000 to $250,000 a year
September 2024
12 Applicants This Week
More Than 6 Months Old
Job Description
*Please note that we can only consider candidates in the US within EST, CST, MST time zones.
About Rialtic
Rialtic is an enterprise software platform empowering health insurers and healthcare providers to run their most critical business functions. Founded in 2020 and backed by leading investors including Oak HC/FT, F-Prime Capital, Health Velocity Capital and Noro-Moseley Partners, Rialtic's best-in-class payment accuracy product brings programs in-house and helps health insurance companies gain total control over processes that have been managed by disparate and misaligned vendors. Currently working with leading healthcare insurers and providers, we are tackling a $1 trillion problem to reduce costs, increase efficiency and improve quality of care. For more information, please visit www.rialtic.io.
The Role
We seek a motivated and curious Staff Engineer with extensive background experience in cloud-native distributed systems who hates manual processes and feels compelled to build tools to automate them away. As a key contributor to our core healthcare claims processing platform team and senior member of the technical staff, you will play a vital role in building solutions to improve workflows across multiple engineering teams, supporting client evaluations and implementations, live system support, site reliability, system testing and monitoring, and logging/alerting integrations. This position requires a customer-first, quality-oriented mindset. We are a data-driven organization, so instrumentation and measurement are how we determine the success or failure of our engineering efforts.
We tackle challenges that are common to healthcare companies and healthcare data, but we do it using a modern, cloud-native stack. Our core processing platform and related services are written in Go, while our clinical and financial analytics components that run inside the platform are written in Python. This is a back-end systems focused role: we won’t ask you to write JavaScript (but being able to read it never hurts, and we have many APIs and interfaces between us, our clients, and our own systems). Our ability to parse, validate, process, write code against, and manage enormous volumes of data while performing complex analyses quickly and accurately is critical to our success.
If that sounds like a fun challenge, then you should apply for this position!
You will
During any given week in this role, you might:
Develop core platform features using Golang, Python, PostgreSQL, Kafka, and various cloud (AWS) services, with a particular focus on developer experience, tools, and testing;
Apply your experience with distributed systems to our architecture and services, drawing on your hard-won knowledge of the places where whole new classes of fun and exciting bugs lurk;
Collaborate with your engineering peers and build productive relationships with members of the Go-to-Market, Product Management, Clinical Content, and other teams that need our expertise to translate their requirements into coherent technical solutions;
Partner with our cloud/SRE team to understand the performance characteristics and storage needs for our Kubernetes clusters and the pods and containers that run there, which requires continual tuning as we dynamically scale throughout the day to meet client usage patterns and data flows while meeting sub-second SLA performance requirements;
Assist our infosec team in reviewing the findings of automated and manual security testing and audits, including both HITRUST and SOC 2 Type II, and work with the engineering team to implement and refactor code and services in a secure fashion;
Influence the whole Engineering organization to adopt best practices in software development and testing, helping us all develop high-quality, scalable, testable, and maintainable code;
Participate with internal and external stakeholders to understand the business logic and other requirements (such as refresh latency) for our Web-based payment integrity solution, client data warehouse exports, and one-time/ad-hoc analysis needs;
Write and help maintain specifications, documentation, diagrams, test plans, and other artifacts that represent the current and planned future state of our systems;
Serve as a peer reviewer for a colleague’s code, participate in an engineering architecture specification review, work with the product management team to refine a set of requirements or break a story down into concrete tasks for implementation; or
Mentor less-experienced developers as they grow in their own mastery of these topics and more.
Our systems and services tech stack includes (but is not limited to) Golang, Python, SQL, shell scripts, AWS EC2, Athena, Aurora / PostgreSQL, Kafka / MSK, Kubernetes, SQLite, Airflow, Spark, and more!