Observability & Site Reliability Engineer (SRE) Job at Artmac Soft LLC, Fort Worth, TX

  • Artmac Soft LLC
  • Fort Worth, TX

Job Description

Who we are:

Artmac Soft is a technology consulting and service-oriented IT company dedicated to providing innovative technology solutions and services to customers.

Job Description:

Job Title: Observability & Site Reliability Engineer (SRE)

Job Type: W2

Experience: 5-15 Years

Location: Fort Worth, Texas

Responsibilities:

  • Experience with Dynatrace, AppMon, Zabbix, SCOM, Datadog, CloudWatch, X-Ray, and Splunk.
  • Self-motivated and able to work in a 24x7 environment.
  • Experience managing critical system outages and interacting at all organizational levels.
  • On-call support availability.
  • Proficiency in monitoring and alerting tools (e.g., Dynatrace, Datadog, CloudWatch, Splunk).
  • Strong understanding of IT infrastructure, including servers, networks, databases, and cloud environments.
  • Some experience with incident, problem, and change management processes is a plus.
  • Ability to analyze complex systems and identify performance bottlenecks.
  • Excellent troubleshooting and problem-solving skills.
  • Effective communication and collaboration skills.
  • Familiarity with ITIL best practices and service management frameworks.
  • Operate in a 7-day/24-hour environment with after-hours support flexibility.
  • Collaborate with internal teams and suppliers to lead event resolution across all mission-critical IT and Telecom service levels.
  • Protect business system availability through integrated incident, problem, and change management.
  • Monitor systems for faults and optimization opportunities.
  • Assist the major incident response team and escalate critical events.
  • Evaluate and improve monitoring/alerting tools and processes.
  • Conduct technical root cause analysis and engage with management teams for internal issues.
  • Identify potential business-impacting events and manage incident processes.
  • Provide expert guidance during reviews and debriefs.
  • Analyze problem trends and monitor tools to identify chronic activity.
  • Communicate effectively with senior management.
  • System Monitoring: Implement and maintain monitoring solutions to track the performance, health, and availability of IT systems, applications, and networks.
  • Alert Management: Configure and manage alerting mechanisms to ensure timely notifications of any anomalies, failures, or performance degradations.
  • Incident Response: Collaborate with support and operations teams to analyze, resolve, and lead event resolution processes during incidents and outages.
  • Root Cause Analysis: Conduct thorough investigations to determine the root cause of incidents and implement corrective actions to prevent recurrence.
  • Optimization: Identify opportunities for system optimization and performance improvements through data analysis and trend identification.
  • Tool Evaluation and Integration: Evaluate, recommend, and integrate new monitoring and alerting tools and technologies to enhance the organization's monitoring capabilities.
  • Documentation and Reporting: Develop and maintain comprehensive documentation, including monitoring configurations, incident reports, and performance metrics.
  • Collaboration and Communication: Work closely with various IT teams, including application, infrastructure, and DevOps teams, to ensure seamless operations and effective communication during incidents.

Qualifications:

  • Bachelor's degree or equivalent combination of education and experience.
