Site Reliability Engineer

Job Description

*Description:*
It is an exciting time to be part of our client's Platform Experience Group, where teams strive to make their Network Platform highly reliable, scalable, operable and secure. As part of the Platform Experience Group, the Platform Support team is a tight-knit group that operates and supports the core infrastructure foundation of the network. The team works directly with software engineering teams to deliver services and configurations that enable our company to deliver new experiences and functionality to millions of customers.
This SRE role will focus on providing direct, level one and two support to internal engineering teams. It will require collaborating with multiple global teams to ensure each customer request is addressed in a way that is reliable, secure, and supportable.
*Responsibilities*
* Build, deploy and operate a combination of open-source, custom-written, and vendor-provided software to support the Network Platform infrastructure
* Contribute to additional automation and testing for service deployments to improve deployment processes, working towards 100% automation
* Engage directly with engineering customers on troubleshooting requests and guiding them on solutions
* Identify opportunities for process improvement to reduce customer queue time
* Perform monthly service deployments for cloud platform services
* Perform on-call duties for general troubleshooting of core services
* Provide Tier 1/2 support for all foundational platform services
*Key Qualifications*
* Ability to design and provide operational and infrastructural requirements that promote uptime, speed and security at all phases of the software lifecycle on a global scale
* Excellent troubleshooting skills that span code, system, and network
* Hands-on experience working with distributed systems, with a focus on availability, reliability, and scalability
* Proven experience in building, deploying and operating services at scale in public cloud environments
*Required Technical Skills*
* Strong ability to troubleshoot complex issues ranging from system resources to application stack traces
* Technical certifications or other demonstrations of passion in security and technology (e.g., CISSP, AWS Associate, open-source projects, or equivalent)
* Experience in developing tools for system configuration, deployment, and monitoring
* Solid grounding in information security principles
* Experience building and operating core infrastructure services (experience with several of these or similar technologies preferred): Cloud Networking, Certificate Management, Software Delivery, Configuration Management, DNS, Traffic Management, Identity & Access Management, Network Access Management, Observability, Remote Access Solutions, Secure Images
* Experience in public cloud services and deployment (AWS experience preferred)
* Strong software development experience in Python, JavaScript, or Go (Python preferred)
* Experience operating in regulated environments such as SOX/PCI
*Required Soft Skills*
* Systems thinking with a security mindset
* Ability to empathize with developers in a way that drives tracing solutions
* Customer and peer relationship focused with strong interpersonal and communication skills
* Ability to learn new skills/technologies quickly and independently
* Ability to thrive in a fast-paced team environment
* Methodical and systematic problem-solving approach
* Basic writing skills that support an almost compulsive drive to document
* Strong belief in driving operational excellence, owning efficiency and automation at the core of operations
*Skills:*
AWS, troubleshooting, ticketing system, observability, automation, root cause analysis, Python, Akamai
*Top Skills Details:*
AWS, troubleshooting, ticketing system, observability, automation, root cause analysis, Python
*Experience Level:*
Intermediate Level
*Benefits:*

Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to specific elections, plan, or program terms. If eligible, the benefits available for this temporary role may include the following:
* Medical, dental & vision
* Critical Illness, Accident, and Hospital
* 401(k) Retirement Plan - Pre-tax and Roth post-tax contributions available
* Life Insurance (Voluntary Life & AD&D for the employee and dependents)
* Short and long-term disability
* Health Savings Account (HSA)
* Transportation benefits
* Employee Assistance Program
* Time Off/Leave (PTO, Vacation or Sick Leave)

About TEKsystems:

We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in Full-Stack Technology Services, Talent Services, and real-world application, we work with progressive leaders to drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.

The company is an equal opportunity employer and will consider all applications without regard to race, sex, age, color, religion, national origin, veteran status, disability, sexual orientation, gender identity, genetic information or any characteristic protected by law.

 
