Site Reliability Engineering (SRE) applies software engineering techniques and discipline to production operations to attack major problems and fix them for good. Our customers count on us to provide extraordinary availability, scalability, and security for our services. Site Reliability Engineers should be comfortable taking on new engineering challenges, defining potential solutions, and implementing designs in a team environment. This position will play an important role in our organization's evolution toward contemporary application and infrastructure management practices and will be expected to both guide and support the team's growth and learning. SRE is new at Fiserv, and members of this team will have the chance to influence the direction of a critical, global SRE organization. You will also use your Infrastructure as Code expertise to address the urgent tactical and engineering issues affecting the ongoing integration activities within Fiserv Technology Services (FTS).
Build holistic visibility into SLIs, SLOs, SLAs, dependency graphs, and the past performance of software, networks, and systems so that we can continue to scale without increasing operational burden or toil.
Assess the current state of the environment and drive "SWAT" initiatives in collaboration with the rest of the organization to ensure transparency, resiliency, stability, and reliability across both the application and infrastructure stacks. Future-state SWAT initiatives can range from incident analysis leveraging ML and AI, to assisting with the datacenter stability and consolidation effort, to application transformation (monolithic to microservices, PaaS, etc.).
Enable the adoption and implementation of cloud-based application reliability, resiliency, observability, and deployment best practices for production and non-production environments, including the public cloud migration of our mission-critical applications from on-premises data centers.
Build infrastructure and drive projects that deliberately break things in order to improve the robustness of production systems.
Use the core Site Reliability Engineering principles of change management, monitoring, emergency response, capacity planning, and production readiness reviews to run the platform.
Step back to observe patterns and develop innovative tools and automation to minimize toil. Use those learnings to drive the best operational practices.
Monitor and report on service level objectives for a given application's services. Work with business and product owners to establish key performance indicators.
Partner with security engineers to develop plans and automation that respond aggressively and safely to new risks and vulnerabilities.
Partner with the broader Fiserv organization to build a culture of rigorously learning from incidents.
Share your knowledge by giving brown bags, tech talks, and evangelizing appropriate tech and engineering best practices.
Unblock, support, and effectively communicate across teams to achieve results.
Define the roadmap and architecture based on technology and business needs.
Ensure predictable, consistent, and successful program delivery of the data center site closures as defined in the program scope.
Exhibiting proactive behavior is key to being successful in this position.
Basic Qualifications for Consideration:
6+ years of experience supporting an Enterprise IT environment
Experience with high-level programming languages (Python, Go, Java, etc.)
Experience designing, debugging, and running fault-tolerant, large-scale distributed systems
Knowledge of public cloud platforms (e.g., AWS, Google Cloud Platform, Microsoft Azure)
Experience with creating and improving documented procedures and/or playbooks.
Knowledge of open-source configuration, orchestration, and CI/CD tools.
Knowledge of Kubernetes, PCF and/or Docker.
Understanding of Cloud Architecture and Operations
Strong troubleshooting and debugging skills
Experience with tools & technologies such as Prometheus, Grafana, AppDynamics, Dynatrace, Splunk and Moogsoft is a plus.
Experience handling large numbers of diverse systems with configuration management systems such as Puppet, Chef, Ansible, or Salt.
Understanding of standard networking protocols and components such as HTTP, DNS, ECMP, TCP/IP, ICMP, the OSI model, subnetting, and load balancing strategies.
Learn more about Fiserv: Life moves fast. And as it does, we know most people aren't thinking about "financial services." But we are. We help people and businesses move money and information every minute of every day. Our solutions connect financial institutions, corporations, merchants, and consumers to one another, millions of times a day, behind the scenes, reliably and securely. We're Fiserv, a global leader in Fintech and payments, enabling innovative financial services experiences that are in step with the way people live and work today. The company's approximately 44,000 associates proudly serve clients in more than 100 countries, so their customers, members, and consumers can move money when and where they need it, at the point of thought. Our Aspiration is to move money and information in a way that moves the world. As a FORTUNE™ 500 company and one of FORTUNE Magazine World's Most Admired Companies for the sixth consecutive year, we are committed to excellence and purposeful innovation. Explore the possibilities of a career with Fiserv and Find Your Forward with us.
We welcome and encourage diversity in our workforce. Fiserv is an equal opportunity employer/disability/vet.
In order to protect our Fiserv community, Fiserv requires all newly hired employees in the United States to be fully vaccinated before their start date. Proof of vaccination will be a condition to hiring. Fiserv complies with all applicable laws regarding the reasonable accommodation of individuals with disabilities and/or sincerely held religious beliefs.
Fiserv is a global leader in financial services technology solutions. We're helping more than 12,000 clients worldwide create and deliver experiences for a digital world that's always on. Solutions that enable today's consumer to move and manage money with ease, speed and convenience. At the point of thought.