SRE/DevOps Engineering Manager

East Bay

Posted: 02/25/2019
Employment Type: Direct Hire
Industry: IT
Job Number: JOS000008464

Key requirements:   

  • SRE (Site Reliability Engineer); the software to be supported includes on-call management, data analytics, etc.
  • Examples of SaaS: Salesforce, CRM, Big Biller, eBay, ...
  • Must be good at coding: able to write automation scripts
  • Experience with cloud (3+ years): deploying and maintaining SaaS; experience leading a team
  • Plus: ideally can also speak Mandarin

Questions to ask candidates: 

  • Do you support any SaaS application platform?  
  • What type of web service do you use?  
  • Can you code & automate applications? 

Engineering manager, SRE/DevOps (summary): 

Deploy and maintain cloud-based infrastructure. Develop and operate reliable, distributed software systems that require high availability to support mission-critical business tasks. Build and lead a team of SRE/DevOps engineers to provide cloud infrastructure support, own the end-to-end availability and performance of mission-critical services, and build automation to prevent problem recurrence.

Job Description: 

  • Build and lead a team of SRE or DevOps engineers to maintain and support large-scale cloud-native application infrastructure, storage, and networking systems.
  • Own and manage the product support and change management processes for backend systems.
  • Help develop systems and components that improve operational simplicity and system reliability, for example monitoring systems, deployment automation, and diagnostic tools.
  • Troubleshoot and fix reliability bugs in our software and/or report them to our software engineering teams.  
  • Help with infrastructure resource planning.  
  • Guide systems architecture (initial build-out and continuous improvements) by participating in design reviews and offering suggestions for reliability and optimization of production systems. 

Required Skills:  

  • Ability to establish and fine-tune the technology support process
  • Ability to think clearly and respond promptly in various operational scenarios  
  • Ability to attract, motivate, lead, and retain exceptional talent, and to clearly articulate vision and purpose
  • Proven track record of developing and mentoring direct reports
  • Ability to develop healthy working relationships quickly, build consensus and influence across the organization  
  • Proven ability to balance a long-range, strategic leadership approach with near-term, detailed, day-to-day operational imperatives (e.g., can maintain a big-picture orientation while remaining hands-on and tactical)
  • Self-motivated, self-directed team player 

Minimum Qualifications:  

  • BS degree in computer science or related technical field, or equivalent practical experience.  
  • 10+ years of experience in software development and deployment in one or more of the following: C, C++, Java, Go and/or Perl, Python, Ruby.  
  • Experience managing an engineering team on large-scale projects with technical deep-dives into code, networking, operating systems and/or storage. 

Desired Previous Job Experience:

  • Proficiency working with algorithms, data structures and production troubleshooting.  
  • Expertise in problem solving and analyzing global scale distributed systems.  
  • Experience with development and deployment in a hosted cloud environment like AWS or Azure.
  • Experience with e-commerce systems like supply chain management and payment processing systems.
  • Experience with big data processing pipelines and data analytics platforms.
