Golang Site Reliability Jobs Paying 200,000 USD a Year
Hand-Picked Golang jobs • Apply directly to companies • Clear salary ranges
Browse 10 Golang Site Reliability Jobs (8 new this week) in June 2023 at companies like TextNow, Rebellion Defense and Netflix paying at least 200,000 USD per year working as a Senior Site Reliability Engineer, Site Reliability Engineer and Senior Site Reliability Engineer, CORE.
TextNow is based around a simple idea: Communication belongs to everyone. We work hard to help people stay connected by offering a solution that makes phone service free. At TextNow, we work together to solve complex and interesting problems that have a positive impact on our customers' lives.
Join us in our mission to help people stay connected with technology that is free (or as close to free as possible.)
TextNow is looking for motivated Site Reliability Engineers (SREs) to own infrastructure, monitoring, logging, CI/CD, reliability and everything in between!
What You’ll Do:
Be responsible for maintaining and scaling production services and servers for complex, high-throughput workloads.
Improve scalability, service reliability, capacity, and performance.
Write automation code for provisioning and operating infrastructure at scale.
Build tools for internal use to support software engineering best practices.
You are not an operator; you’re an experienced software engineer focused on operations.
Work with development teams to make sure the applications fit nicely within the infrastructure and that scalability, reliability, and security are designed and implemented from the start.
Participate in on-call rotation, being responsible for uptime and support.
Roll up your sleeves to troubleshoot incidents, formulate and test your hypotheses, and narrow down possibilities to find the root cause.
Who You Are:
Creator of cool stuff with experience deploying web apps and distributed, service-oriented architectures.
Brilliant Collaborator with 8+ years of professional experience in an operationally focused role, preferably in DevOps or SRE, with a B.S., M.S., or PhD. in Computer Science (or equivalent).
Someone who takes action and ownership with proven ability to use automation tools.
Respectfully candid with the ability to motivate people to act and work on behalf of our customer.
A bold risk-taker and self-starter who loves to solve challenging problems.
Resourceful and scrappy with the ability to be strategic, roll up your sleeves and execute.
Strong knowledge of Linux and open source software
Understanding of modern web architecture (HTTPS, REST) and technology stacks
2+ years of experience with programming/scripting languages (Bash, Go, Python, Ruby, etc.)
Experience with deployment automation using Ansible, Puppet, and Terraform
Experience supporting various databases such as MariaDB, Redis, and various NoSQL engines
Experience deploying containers using Docker and Kubernetes
Experience working in the Amazon public cloud (AWS)
Experience supporting mobile applications (Android and iOS)
Experience in the telecommunications industry
· Strong work life blend
· Flexible work arrangements (wfh, remote)
· Employee Stock Options
· Unlimited vacation
· Competitive pay and benefits
· Parental leave
· Benefits for both physical and mental well being
Diversity and Inclusion:
At TextNow, our mission is built around inclusion and offering a service for EVERYONE, in an industry that traditionally only caters to the few who have the means to afford it. We believe that diversity of thought and inclusion of others promotes a greater feeling of belonging and higher levels of engagement. We know that if we work together, we can do amazing things, and that our differences are what make our product and company great.
For TextNow Candidates:
The People and Culture team is available to support you through the hiring process by providing reasonable accommodations to help enable a barrier-free interview experience. If you need assistance applying for a role due to a disability or special need, please let us know by completing this form. Once received, our Equity, Diversity and Inclusion Specialist will reach out to you and assist with accommodations that you may require.
We are looking for a Site Reliability Engineer (SRE). As an SRE, you will be tasked with the reliability and operation of our production environments. SREs are tasked with ensuring teams within the company receive help maintaining software at scale, as well as help designing and developing software for scale. SREs are expected to engage with the product teams to ensure the delivery of our software is as seamless as possible.
This position is based out of our Washington, D.C. or Chicago, Illinois office locations. An active clearance or ability to obtain TS/SCI clearance will be required.
We look for a track record of the following:
Coming alongside high-energy engineering teams to drive the adoption of best practices that enable the scalability and reliability of deployed software,
Defined architecture and built services at scale on public infrastructure such as AWS and Azure,
Experience designing, implementing, deploying, and operating high scale production services,
Experience facilitating the definition and implementation of SLIs and SLOs,
Understanding how to carefully spend error budget to handle regular deployment of large changes to production,
Deep experience in Linux operating systems, and systems engineering,
Comfort delivering critical software in Go and Python,
Willingness to debug problems across the stack,
Comfort working on underspecified problems and the ability to rapidly learn and iterate on solutions,
Experience building the wrong system enough times to avoid the common pitfalls, whether building something personally or advising others.
You might be a good fit if you have:
5+ years of relevant SRE experience in the tech industry,
demonstrable knowledge of TCP/IP, HTTP, web application security and experience supporting web application architecture,
experience working with a variety of storage systems, application architectures, compute infrastructure and network management systems,
experience designing, implementing, deploying, and operating high scale production service,
defined architecture and built services at scale on public infrastructure such as AWS and Azure,
proven knowledge of at least one higher-level language (e.g., Python or Golang),
the ability and desire to build and learn new systems with new technologies.
Rebellion is a well-capitalized technology start-up firm that is passionate about defining and delivering modern, life-changing software products to the US Department of Defense (DoD), the UK Ministry of Defence (MoD), and their allies. At Rebellion we believe in operating what we own: we deliver all of our products as managed services, which allows our product teams to maintain operational ownership across all deployments. Expect talented, motivated, intense, and interesting co-workers.
Compensation includes meaningful equity ownership, competitive salaries, full medical coverage, disability and life insurance, and transit reimbursement.
An Equal Opportunity Employer/Veterans/Disabled.
Rebellion Defense is an equal opportunity employer and makes employment decisions on the basis of merit and business needs. Rebellion Defense does not discriminate against applicants on the basis of race, color, religion, sex, sexual orientation, gender, gender identity, national origin, veteran status, disability, or any other protected characteristic in accordance with federal, state, and local law.
At Netflix, we strive to bring joy to people across the world through amazing stories. As we grow internationally, we are continually enhancing our cloud-based infrastructure to improve our performance, scalability, and reliability.
The SRE team's goal is to ensure customer joy by successfully managing risk and minimizing impact across Netflix. We do this through cross-functional engagement with other engineering teams, managing issues when they happen, as well as promoting reliability and resilience practices throughout the organization.
Improve our incident management lifecycle to identify, mitigate, and learn from reliability risks
Increase our reliability through establishing guidance and methods of improvement
Form and maintain relationships with internal and external partners
Develop deeper insights and analysis into the quality of experience for our customers
Curiosity about how complex sociotechnical systems successfully operate at scale when failure is inevitable
People who see influence as their preferred tool for cultivating relationships
Collaboration and continuous improvement
A desire to learn and readiness to teach
Iteration as the path forward
Drive incidents to resolution by coordinating with multiple engineering teams
Identify sources of instability in large-scale distributed systems and drive operational excellence
Analyze complex systems from a reliability and resilience perspective
Engage with product teams to diagnose operational surprises and carry forward improvements
Improve reliability and drive down the burden of toil with tooling and automation
Nice to Have
Experience with global, continuous delivery methods
Involvement with incident management and response
Knowledge of cloud platforms like AWS and microservices architecture
smlXL is a 'stealth' start-up building an Information retrieval service with Consumer and Enterprise applications. Our first focus is providing a far richer understanding of the semantics of blockchain activity, making data and information accessible and useful to all.
We aren't ready to talk broadly about what we are working on, but we might be a good place for you if:
You are highly technical; you care about your craft; you are constantly learning; you are used to working on bare-metal servers and running your own stack; you are fascinated by Information Systems, Semiotics, and Blockchain data; you get excited by turning black boxes transparent; and you love working on things that add a ton of value to consumers and prosumers alike; or you are into the EVM, decompilers, databases, and distributed systems.
Experience keeping production systems running smoothly, including experience working in private cloud/colo/bare-metal environments
Experience building software and systems to manage platform infrastructure and applications
Experience with and/or a desire to go deeper into blockchain technology and crypto protocols
HashiCorp or Nomad experience is a plus
You care about polish and adding value to our users but not perfectionism for perfectionism’s sake
You love working collaboratively with different disciplines and learning from others
You are an expert who stays curious with a beginner’s mindset
You are a thoughtful communicator and collaborator and work to gain consensus with your peers and stakeholders, but you’re not afraid to speak up
You want to win, but prefer to win as a team
You are proactive
You are thoughtful and open about your priorities, goals, and aspirations so we can help you achieve them
You have specific passions outside of work
We believe that on average it will take 5+ years of experience in an engineering role to get to the level we want, but don’t let that stop you
Benefits and Support
Comprehensive health benefits (Medical, Dental, Vision, Life)
Flexible working hours, flexible WFH policy and unlimited time off with approval
Gender-neutral parental leave program for primary and secondary caregivers
Competitive salary and equity compensation with 401K retirement plan options
Physical, Mental, and Financial Well-being applications are provided at little to no cost, including fertility benefits, fitness classes, mental health, physical therapy, and healthcare apps (One Medical)
We encourage, support, and make time for our team members to invest in side projects and community projects
We’re a small team of experienced engineers with diverse technical backgrounds. We’re passionate about driving our coworkers’ success and building the next generation of software tooling. If you want to work on distributed systems infrastructure and development practices or you have an entrepreneurial spirit and want to make something that your peers use every day, we’d love for you to join us.
Tooling handles many different areas, so we’re building a diverse team with a wide range of expertise.
What We Do
- We build shared infrastructure and tools to make engineering more productive, reliable, and cost effective.
- We maintain several Segment Open Source projects.
- We work in Go, Terraform and a bit of Node.js.
- Read more about Segment’s infrastructure and how we use: distributed logging and secure secrets. Or, read our code: conf, ksuid, cwlogs, go-prompt, ecs-logs, chamber.
- We manage the tooling and process around development environments, testing, CI, and deployment.
- Read more on our blog about how we use: CI and Make.
Who we are looking for:
You care about simple, practical, reliable, and secure software implementation and the kinds of process needed to produce it.
You can research a messy, complicated problem and design an approach that makes working in that area easy and consistent.
You empathize with the rest of your company, listen to them, and take pride in supporting their work.
Projects we’re working on:
Per-Engineer Dev Environments
Logging Pipeline Development
AWS Rate Limit Monitoring
Application Deployment Improvements
Incident Management Automation
Large Scale JSON Stream Data Manipulation Tools
Standardized Metrics and Alerting Infrastructure
Consistent Runbooks and Documentation
Minimum of 3 years experience as a software engineer, devops engineer, or site reliability engineer.
You have experience with AWS, Docker, Go, Node.js, or Terraform.
You are motivated to support your coworkers and make them productive.
You are a self-directed problem solver.
Building tooling for distributed systems development.
Working on or with a variety of engineering teams.
Location: Remote (EU, UK, US, Canada, South America)
At SlashID, we are rethinking the way companies manage identity and authentication, giving users a better experience while respecting their privacy and keeping their data safe.
At the core of our system are encrypted user identities, with API-based modules built on top, which accomplish tasks such as authentication, authorization, ID verification and many others.
SlashID’s products are on our customers’ critical path and most of them require 99.99% uptime, so reliability and security are key to our engineering culture.
Last but not least, we are a young startup. We work with tight deadlines, lean processes and ambitious roadmaps. We are a small, tight-knit team who strives to succeed in a competitive environment.
About the role
We’re looking for people with a strong technical background and a passion for building highly scalable and reliable systems. You’re a good fit if you are comfortable dealing with complex distributed systems, have exquisite attention to detail, and enjoy learning new technologies.
SlashID is remote-first and we offer flexible working arrangements to help our team manage their daily lives in the way that works best for them.
Please note: the exact level of the role (Senior or Principal) will depend on your experience and interview performance.
Design, build and maintain SlashID’s products, services and features
Be part of the engineering team working on our Authentication, Data Vault and User Management services
Use and adapt state-of-the-art cryptographic libraries and primitives
Build tooling to monitor and analyze SlashID’s services, both in terms of performance and security
Write technical documentation, blogs and guides
Work with other highly motivated engineers who all have an intrinsic drive to make things better
Use your passion for technology to ensure our platform operates flawlessly 24/7
Have broad exposure to our entire architecture
Hardware Security Modules (HSM)
Postgres and MySQL
You are a good fit if you:
Have a strong understanding of reliability practices, distributed systems, and cloud native architectures
Have experience as a cloud or backend engineer for a multi-tenant large scale mission critical system
Have a thorough understanding of engineering best practices, including appropriate testing paradigms, effective peer code reviews, resilient architecture
Have a good understanding of multi-threading, concurrency, and parallel processing technologies
Have experience producing high-quality technical documentation for the products you develop
Love building secure software, leveraging the latest cryptographic technology and methodology
Thrive in a fast-paced, test-driven, collaborative, and iterative environment
Have a passion for reliable and performant systems, and care deeply about user experience
Enjoy working with a diverse group of people with different backgrounds and expertise
What are we doing: we are a tech company, operating a thriving and growing broadcast platform, Alexa ranked in the top 100 sites internationally, and the top 25 in the United States, with approximately 10 million daily users, and a worldwide community of fans. Independent Broadcasters use our platform to create and share live streaming video, photographs, and similar content, generally adult in nature (but no adult content is required).
Our sophisticated system has multiple parts, including but not limited to payment gateways, live chats, and video streaming technology. Every contribution here is of high impact and affects the experience of millions of users using the site every day.
We are always exploring new ways to use a cutting-edge tech stack and are moving toward a modern microservices-based architecture.
How we build the product: The platform is built on Python/Django framework with TypeScript on the front-end. Some parts of the platform use Java, Golang, and Rust.
Top-3 reasons to join our team:
People first culture - many initiatives that support the well-being of our employees.
Impressive team members who joined us after working at Google, Imgur, etc.
An inclusive environment where everyone on the team can have a high impact.
What will you do: As part of our infrastructure team, you will work with the Lead Backend Engineer on a complex migration of our codebase to Python 3 and its further optimization. As a top-notch software engineer on our team, you will be accountable for building comprehensive technical solutions with high-quality standards and practicing clean coding styles.
This includes working on features that are supported across all major web browsers, mobile devices, smart TVs, and video consoles. As a backend engineer, you are expected to work with large data sets and be an expert in designing efficient algorithms, queries, and caching methods.
We value initiative and are willing to support you in making appropriate technical decisions.
A degree in STEM and/or relevant professional experience.
Solid knowledge of programming fundamentals - algorithms, data structures, design patterns, and paradigms.
7+ years of professional experience with Python in a fast-paced production environment.
Familiarity with Django or extensive experience with similar Python-based frameworks.
Expert knowledge of inner workings of Django is highly desirable.
Proven problem-solving and fast-learning skills.
Health and life insurance with dental and vision plans
Paid holidays, vacation and sick days
What does the recruiting process look like: we value a sense of urgency and aspire to build a smooth and transparent recruiting process. These are the stages in our recruiting process: phone screen with a recruiter, resume review by our hiring manager, first technical interview with a backend team lead to test your knowledge of Python/Django, 1-hour live coding session with the Head of Engineering, and a meet-and-greet with your potential team.
We reserve the right to add selection stages to the process depending on the specific skills of each candidate.
At GRAX, it’s all about data. We help our customers secure and drive value across their ever expanding enterprise SaaS data footprint. Initially, we're focused on Salesforce, the wildly popular CRM platform used by the world's most successful companies. We capture and retain every data change over time, so it can be stored, processed and analyzed using the full power of AWS, Azure and GCP.
GRAX is a well-funded Series A startup. We’re one of the fastest growing partners in the Salesforce ecosystem with revenues more than doubling year-over-year.
Who we are
GRAX was founded by serial entrepreneurs with a long history of success in the Salesforce ecosystem. The product and engineering organization is led by veterans in cloud platform development, including some of the key architects behind Heroku.
About the role
The Backend team builds and maintains the core distributed data pipeline that slurps data from SaaS APIs, secures it for safekeeping before transforming and routing it to its final destination. As an engineer on this team you can expect to:
Program mostly in Go (golang.org) within a group of experienced developers committed to learning, sharing and continual improvement.
Work closely with internal teams from PM through Customer Success - and occasionally directly with customers.
Own the full lifecycle of specific features and product areas from design to release
You may be a good fit if…
You have experience with large scale data processing
You have built or operated a large cloud service
You have prior experience working with distributed systems with a focus on reliability and resiliency.
You have extensive experience building on AWS, GCP and/or Azure.
What it’s like to work here
Founded in Boston, GRAX is a remote-first, distributed team. We value collaboration, communication and accountability. You’ll be offered a competitive salary, equity, full health benefits incl. dependents and unlimited PTO.
GRAX embraces diversity and equal opportunity. We are committed to building a team that represents a variety of backgrounds, perspectives, and skills. We believe the more inclusive we are, the better our company will be.
Perks & Benefits
Full health benefits, including dependents. Unlimited PTO. Equity. Competitive salary.
Senior Software Engineer Helix San Diego / San Mateo, United States $155,000 to $205,000 a year
It’s our mission to empower every person to improve their life through DNA. We believe DNA will be digitally accessible to each person so that it can be used—at any time—to improve health outcomes and accelerate research.
Helix powers life-changing population health programs. Our world-class clinical laboratory platform and proprietary Exome+ assay enable health systems to integrate genomic information into routine clinical care and enhance the accessibility of personalized healthcare. Additionally, Helix stores and protects participants’ DNA information, so that as the science evolves, health systems, patients and researchers are able to continuously benefit from a lifetime of DNA insights.
Our big vision comes with big responsibility. That’s why we’re building a diverse team of experts in the field of genetics, engineering, design, business development, and beyond to help bring actionable insights to our customers.
What’s important to us:
Curiosity — we are all passionate about the possibilities enabled by having access to your own genome
Responsibility — we have an obligation to people and our partners to operate with highly credible research guided by well respected advisors, with clear and effective communication about our products
Agility — flexibility and a desire to be nimble, smart, and effective are important to the Helix culture
Follow-through — we’re building a diverse team with amazing track records of achievement in multidisciplinary environments
As a Senior Software Engineer, you will:
Design and build high quality enterprise grade software solutions for genomics lab and data pipeline applications.
Collaborate with product managers, bioinformaticians, scientists, regulatory, and other engineers.
Synthesize requirements and author new engineering designs.
Build for reliability and scale across users and health systems.
Advance engineering best practices.
Mentor other engineers to reinforce a culture of learning and teaching.
A passion for improving people’s lives through access to better information about their DNA
5+ years development experience in one of the following: Go, Python, or a similar language
A proven track record of building software solutions for managing and processing large datasets
Strong written and verbal communication skills
Familiarity with developing software on cloud platforms — AWS, GCP, Azure
Affinity for an engineering culture that emphasizes Agile, DevOps, and continuous delivery
Familiarity with full-stack development
Familiarity with regulated software systems and entities
BS+ in Computer Science; coursework in genetics or bioinformatics
What Helix can offer you:
Competitive compensation, including meaningful equity
401(k) with employer matching
Health insurance, including medical, dental, and vision
As an early Platform Engineer at Watchtower, you’ll enable us to deliver our platform reliably, securely, and at massive scale. You’ll help architect low latency, real-time microservices that process & detect sensitive data at scale.
Building highly-available and secure authentication and API services
Maintaining and evolving mission-critical internal databases and services
Optimizing and operating high volume auto-scaling streaming data services
Instrumenting streaming data services for visibility into utilization per customer
Expertise in one or more systems or high-level programming languages (e.g. Go, Rust, Python, Java, C++) and the eagerness to learn more.
Eagerness to wear multiple hats in a startup environment
Experience running scalable (thousands of RPS) and reliable (five nines) systems.
Experience with developing complex software systems scaling to substantial data volumes or millions of users with production quality deployment, monitoring and reliability.
Experience with large-scale distributed storage and database systems (SQL or NoSQL, e.g. Cassandra, CockroachDB, Spanner)
Ability to decompose complex business problems and lead a team in solving them
Data Processing - experience with building and maintaining large scale and/or real-time complex data processing pipelines using Kafka, Hadoop, Hive, Storm, or Zookeeper
Watchtower is a cybersecurity startup dedicated to helping enterprises secure and manage their sensitive data. As a leading enterprise technology company, our product affects the personal data that people entrust businesses to store & process with care every day. Critical data in the modern enterprise is often sprayed across a broad set of cloud systems (e.g. SaaS & data infrastructure), and it’s a herculean task for security teams to monitor, manage, and protect this highly sensitive data. Via machine learning, our product makes it easy for companies to discover, classify, and protect this sensitive data across their cloud footprint - such as their corporate SaaS, data infrastructure, and APIs. In doing so, we prevent data leakage, provide unprecedented data visibility & protection across the cloud, and enable compliance. We're a technology startup based in San Francisco and Palo Alto, well-funded by leading institutional investors with deep expertise in the cybersecurity industry. Learn more at our website www.watchtower.ai.