Duration: Direct Hire/Permanent
Location: REMOTE; Charlotte NC, Salt Lake City UT, Houston TX
*** U.S. Citizens and those authorized to work in the U.S. are encouraged to apply. We are unable to sponsor or transfer visas at this time.***
This role is expected to understand the product in depth, collect and analyze meaningful measurements, and provide feedback to the business, Software Engineering, and Product teams. The SRE will work closely with key stakeholders to help drive changes that increase customer satisfaction, product availability, and reliability, and that advance the completion of strategic technical initiatives.
- Perform application-specific production support, incident management, problem management, RCAs, and service restoration as needed to quickly respond to and resolve production issues.
- Free up developer resources to focus on building new product features by handling most aspects of operating the products effectively and by proactively managing the customer experience.
- Plan and achieve high availability and performance of the product service.
- Establish observability of business system health by using automation to integrate with the observability platform.
- 5+ years of experience in a Site Reliability Engineering or Software Engineering role
- 3-5 years of working experience with Windows Server OS administration, SQL Server and Entity Framework ORM, writing and tuning SQL queries, IIS configuration and scalability, VMware vSphere, and ASP.NET MVC
- 2-4 years of working experience with Dynatrace, Azure Monitor, Application Insights, and Log Analytics
- Strong understanding of web hosting infrastructure and high availability architecture
- Experience measuring and monitoring .NET applications, SQL Server databases, and serverless cloud resources, or equivalent Java-based experience
- PowerShell or Linux scripting to create automated routines that ensure site availability