Senior Site Reliability Engineer
Company: Disability Solutions
Location: Pennington
Posted on: February 15, 2025
Job Description:
At Bank of America, we are guided by a common
purpose to help make financial lives better through the power of
every connection. We do this by driving Responsible Growth and
delivering for our clients, teammates, communities and shareholders
every day.

Being a Great Place to Work is core to how we drive
Responsible Growth. This includes our commitment to being a diverse
and inclusive workplace, attracting and developing exceptional
talent, supporting our teammates' physical, emotional, and
financial wellness, recognizing and rewarding performance, and how
we make an impact in the communities we serve.

At Bank of America,
you can build a successful career with opportunities to learn,
grow, and make an impact. Join us!

Job Description:
This job is
responsible for partnering with leaders across engineering and
technology to define objective reliability goals for services. Key
responsibilities include composing observability designs through
instrumentation and dashboards, identifying root causes of
complex/impactful issues, partnering with cross-functional teams to
deliver sustainable design patterns, and driving early adoption of
non-functional production support requirements. Job expectations
include automating services to improve reliability and efficiency
and influencing a culture of innovation and continuous
improvement.

Position Summary:
This position enhances the stability of GIS Tech application systems
and drives multiple initiatives within GIS production services, such
as automation and enhanced monitoring, for effective proactive
support and efficiency.

Responsibilities:
- Responsible for reliability and support of Global Information
Security technology application systems along with related
infrastructure
- Designs solutions to visualize key production support metrics
enabling Operational Readiness and Site Reliability Engineer teams
to identify scenarios requiring intervention
- Leads initiatives to automate repetitive tasks and processes,
reducing manual intervention and increasing system
reliability.
- Develops software solutions and/or improved processes to
address work identified as 'toil' by collaborating with key
partners to identify, track and remediate processes to free time
allocated to reliability
- Partners with Development and Infrastructure teams to create
error budget policies prioritizing reliability stories that fall
below Service Level Objective (SLO) thresholds and suggests code
optimizations, additional instrumentation and/or logging structures
to gain service reliability visibility
- Designs and delivers software to improve the availability,
scalability, latency, and efficiency of complex large-scale
information security related systems.
- Develops and implements strategies to enhance the operational
efficiency and supportability of large-scale platforms, ensuring
seamless integration and minimal downtime.
- Identifies and plans for capacity bottlenecks, vulnerabilities
and opportunities for reliability improvement, such as low level
error rates and 'noise', and reduces manual support effort and/or
improves system reliability
- Assesses monitoring for new changes with development partners
and works with monitoring tools team to monitor dashboards and
enhance application and system monitoring designs
- Troubleshoots priority incidents, conducts blameless
post-mortems, and ensures permanent closure of incidents.
- Collaborates with Development and Infrastructure teams to
understand technical solutions and develop Service Level Indicators
and SLOs to measure/improve the reliability of the services they
support

Required Qualifications:
- 10-15+ years of applied experience in software engineering.
- 3-5 years of experience leading technologists to manage,
anticipate, and solve complex technical items within your domain of
expertise.
- Expert knowledge of software applications and technical
processes with considerable depth in more than one technical
discipline.
- Experience with detecting opportunities to automate, combine,
or simplify control points and executing solutions.
- Ability to lead ongoing assessments to improve
application.
- Deep understanding of techniques for creating innovative
solutions to roadblocks (e.g., design thinking, continuous
improvement, rapid prototyping).
- Deemed an innovator across functions and advises leadership on
new technologies and their potential to elevate the firm.
- Expertly executes complex and scalable coding frameworks using
appropriate software design frameworks.
- Experience working in a highly available multi-datacenter
environment.
- Proven ability to work independently with minimal supervision
and as part of a team with direct responsibilities.
- Ability to juggle competing priorities and adapt to changes in
project scope.

Desired Qualifications:
- Programming Languages - Proficiency in at least one programming
language (e.g., Python, Go, Java) for automation and development
tasks.
- System Administration - Expertise in Linux/Unix operating
systems, network administration, and cloud platforms (AWS, Azure,
GCP).
- Monitoring & Logging Tools - Experience with monitoring tools
like Dynatrace, Prometheus, Datadog, Splunk, and logging
frameworks.
- DevOps Practices - Deep understanding of Continuous Integration
and Delivery (CI/CD) pipelines, containerization technologies
(Docker, Kubernetes), and infrastructure as code.
- Problem-Solving - Excellent analytical and troubleshooting
skills to diagnose complex system issues.

Skills:
- Architecture
- Collaboration
- Innovative Thinking
- Result Orientation
- Solution Design
- Adaptability
- Analytical Thinking
- Influence
- Stakeholder Management
- Technical Strategy Development
- Automation
- DevOps Practices
- Production Support
- Project Management
- Risk Management

Shift: 1st shift (United States of America)
Hours Per Week: 40
Keywords: Disability Solutions, Trenton, Senior Site Reliability Engineer, Professions, Pennington, New Jersey