Senior Networking Solution Test Engineer – AI Cluster Debugging in Switzerland at Jobgether

Job Function: Engineering
Jobgether
Switzerland, Switzerland

Job Description

Senior Networking Solution Test Engineer – AI Cluster Debugging

This position is posted by Jobgether on behalf of a partner company. We are currently looking for a Senior Networking Solution Test Engineer – AI Cluster Debugging in Switzerland.

This role sits at the forefront of large-scale AI infrastructure validation, where networking, systems engineering, and artificial intelligence workloads converge. You will be responsible for ensuring the reliability and performance of complex AI clusters built on high-speed interconnect technologies such as NVLink, Ethernet, and InfiniBand. Working in a highly technical and collaborative environment, you will investigate deep system-level issues spanning hardware, drivers, networking stacks, and AI frameworks. The position requires strong debugging intuition and the ability to reproduce and analyze real-world customer scenarios in advanced test environments. You will contribute directly to the stability and scalability of next-generation AI training and inference systems used at massive scale. This is a hands-on engineering role where your analysis and findings directly shape product quality and system performance.

Accountabilities:
  • Design and review test strategies and product requirements for NVLink, Ethernet, and InfiniBand-based AI cluster systems.
  • Build and maintain realistic, large-scale test environments replicating customer-like AI infrastructure, including heterogeneous hardware and software stacks.
  • Lead end-to-end system debugging across hardware, firmware, networking, and AI software layers to identify and resolve root causes.
  • Analyze logs, inspect source code, and validate fixes across components such as NICs, DPUs, switches, and AI communication libraries.
  • Collaborate closely with development teams to debug and optimize communication libraries and protocols such as NCCL, RoCE, and RDMA.
  • Define, design, and guide automation efforts for robust testing frameworks producing actionable logs, metrics, and traces.
  • Execute regression, performance, functional, and scalability testing, and deliver clear, data-driven technical reports.
  • Profile and benchmark AI training and inference workloads, correlating application behavior with system and network performance metrics.
Requirements:
  • Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, or equivalent hands-on experience in systems/network engineering.
  • 8+ years of experience in Linux-based networking, system testing, and complex debugging environments.
  • Strong expertise in Linux networking tools and debugging utilities (e.g., tcpdump, ethtool, iproute2, perf).
  • Proven experience in production-grade troubleshooting, hypothesis-driven debugging, and root cause analysis under pressure.
  • Solid understanding of NIC architecture, offloads, queue management, and driver/firmware interactions.
  • Deep knowledge of AI networking technologies such as NCCL, RoCE, and RDMA.
  • Ability to read, understand, and debug source code in C/C++, Python, or similar languages.
  • Strong scripting and automation skills using Bash, Python, and/or Ansible.
  • Experience working in fast-evolving technical environments with strong adaptability and learning ability.
  • Excellent analytical, communication, and collaboration skills with a strong ownership mindset.
Benefits:
  • Competitive compensation aligned with senior-level expertise and Swiss market standards.
  • Opportunity to work on cutting-edge AI cluster and high-performance networking technologies.
  • Exposure to large-scale systems powering advanced AI training and inference workloads.
  • Highly technical, research-driven engineering environment with a strong focus on innovation.
  • Collaborative international team working on next-generation infrastructure challenges.
  • Access to complex, large-scale test environments and advanced debugging tools.
  • Inclusive workplace culture supporting diversity, equity, and professional growth.
  • Relocation support and accommodation of accessibility needs, where applicable.
How Jobgether works:
We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team.
We appreciate your interest and wish you the best!
Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.
