The Demand Engineering team plays a critical role in ensuring the fast, secure, and reliable delivery of Slack to more than 14M daily active users worldwide. At the heart of our work is the design and operation of robust service-to-service networking, powered by advanced service mesh technologies and service discovery. This enables secure, scalable, and resilient communication between internal services, supporting high availability and enforcing strong security boundaries. In addition, we architect, build, and maintain the systems behind Ingress load balancing and intelligent traffic management, including a custom fleet of software load balancers, cloud-based load balancing infrastructure, DNS, and CDN services.
Slack's infrastructure is always evolving to support our fast-growing business. Demand Engineering's roadmap is aimed at improving the ease of use of our infrastructure by providing our developers with features such as blue-green deployments out of the box. We are a small team making a large impact. We iterate rapidly and work closely with other engineering teams to ensure resilient systems built to scale. We have a strong commitment to quality and understand that simplicity and reliability should be primary aspects of the systems that we build.
Reliability is Slack's most critical feature! Accordingly, Demand Engineering is responsible for systems vital to Slack's availability. We work to keep our systems scalable, efficient, and operating according to our high standards in production. We also partner with other engineering teams to find solutions that improve the end-to-end customer experience in Slack.
Slack has a positive, diverse, and supportive culture. We look for people who are curious and inventive, and who work to be a little better every single day. In our work together we aim to be smart, humble, hardworking and, above all, collaborative. If this sounds like a good fit for you, why not say hello?
About the Role
This is a full-time staff engineering position based in the U.S.
What you will be doing
Lead the design and development of scalable, reliable, and secure service mesh infrastructure across our platform, enabling seamless service-to-service communication.
Drive architectural decisions and provide technical leadership for initiatives related to service discovery, observability, security (mTLS, policy enforcement), and traffic management (circuit breaking, graceful failovers, blue/green routing).
Collaborate cross-functionally with the Compute, Webapp infrastructure, Security, and Monitoring teams to integrate service mesh capabilities into development and deployment workflows.
Contribute to and/or extend open-source projects such as Istio, Linkerd, or Envoy to meet the evolving needs of our infrastructure.
Mentor and guide engineers across teams, fostering knowledge sharing and elevating the overall technical capability of the organization.
Continuously evaluate emerging technologies in the service mesh and cloud-native space, identifying opportunities for innovation and improvement.
Take ownership of critical technical issues to maintain optimal service mesh operation, meeting or exceeding performance, reliability, and SLO targets.
What you should have
Must have lawful permanent residency in the U.S.
5+ years of experience in software engineering, with a strong focus on distributed systems, cloud-native applications, and microservices.
Deep understanding of service mesh technologies such as Istio, Linkerd, or other Envoy-based service meshes.
Hands-on experience with cloud providers such as GCP or AWS, with expertise in container orchestration using Kubernetes.
Enjoys troubleshooting in distributed Linux systems environments and is comfortable tracing issues across applications, systems, and networks.
Proven track record of building tools, automation, or services using one or more programming languages (e.g., Go, Ruby, Python, C/C++).
Strong interpersonal and communication skills; able to explain complex technical concepts to designers, support staff, and fellow engineers.
Qualifications
Experience with configuring and operating service mesh on larger-scale production operations, focusing on stability, scalability, and performance limits of web services
Experience with TCP/IP, DNS, and network-related protocols
Experience operating Linux/Unix on high-volume systems at scale
Experience running deployment automation/configuration management systems at scale - e.g., Chef, Puppet, Terraform, Ansible, CloudFormation or others
Certifications in Istio, Kubernetes, Google Cloud, and/or other technologies
Experience with algorithms, data structures, complexity analysis, distributed systems and software development
A BS, MS, or Ph.D. in engineering or related technical field (or equivalent work experience)