# of Openings: 2
Job Description:
Work closely with both Cloud and VMware architects and engineers across the company to ensure platform-related outages are resolved. Analyze service and system issues, determine the root cause of failures, and develop corrective actions. Contribute to ongoing process improvement reviews using Key Performance Indicator (KPI) metrics to eliminate errors, maximize efficiency, and increase service up-time, with the goal of a top-tier customer experience.
Responsibilities:
- Provide technical troubleshooting expertise in resolving escalated incidents.
- Be the subject matter expert on the PaaS (VMware & AWS) domain.
- Handle outages and service degradations, perform initial root cause analysis, and coordinate with vendors as well as engineering teams.
- Lead troubleshooting bridges with internal and vendor teams as needed.
- Engage and escalate issues with various internal teams and vendors when critical, time-sensitive support and resolutions are needed.
- Work with technical teams to document troubleshooting steps and methods to improve processes.
- Develop key metrics for performance monitoring and service assurance.
- Contribute to ongoing process improvement reviews, identifying areas for automation and overall efficiency improvements to increase service up-time and deliver a top-tier customer experience.
- Maintain a detailed working knowledge of network technologies and understand how data moves between Cloud, PaaS solutions, and legacy TDM/IP environments.
- Analyze issues before Return Merchandise Authorization (RMA) and sign off on hardware replacements and upgrades per best practices.
- Take on other duties as assigned.
- This position may provide on-call support on a rotational basis.
Requirements:
- 8+ years of experience working in the PaaS domain supporting a major operator/vendor.
- Knowledge of OpenStack and its various flavors, infrastructure virtualization, system management, cloud environments, NFV/SDN functions, and PaaS.
- Cloud / Virtualization: Helm, Docker, Kubernetes, AWS, Azure, Google Cloud, OpenStack, OpenShift, VMware vSphere / Tanzu.
- In-depth knowledge of cloud storage solutions on top of AWS, GCP, Azure and/or on-prem private cloud, such as Ceph, CephFS, GlusterFS.
- DevOps: Jenkins, Git, Gerrit, Azure DevOps, Ansible, Terraform.
- Backend knowledge: Bash, Python, Go.
- PaaS-level solutions such as Keycloak for IAM, Prometheus, Grafana, ELK, and DBaaS (such as MySQL, Cassandra).
- Knowledge of Kubernetes and OpenNESS, and experience deploying cloud-native applications.
- Good understanding and knowledge of BMC, IPMI, PXE boot, Redfish, etc. for remote server management.
- Good understanding of mobile operator network and deployment architecture, and of the associated security requirements.
- Good understanding of 5G architecture and RAN concepts.
- Must be self-motivated and able to manage simultaneous events efficiently.
- Able to achieve goals by working professionally with other team members.
- Strong problem-solving, analytical, and time management skills.
- BS in Engineering or a technical degree, or an equivalent combination of highly related experience and education.