Posted 20 May 26

AI Research Engineer (Kernel & Inference Optimization)

full time · engineering · AI · data · remote from 🇧🇷

Open to candidates in: Brazil

Jobgether

This position is posted by Jobgether on behalf of a partner company. We are currently looking for an AI Research Engineer (Kernel & Inference Optimization) in Brazil.

This is an exciting opportunity for a highly technical AI engineer to contribute to the next generation of scalable and high-performance inference systems powering real-world AI applications. In this role, you will work on optimizing model serving architectures, improving latency and throughput, and enhancing deployment efficiency across cloud, edge, and resource-constrained environments. You will collaborate with globally distributed engineering and research teams focused on advanced AI systems, multi-modal architectures, and infrastructure innovation. The position offers a research-driven environment where experimentation, benchmarking, and performance optimization are central to daily work. Ideal candidates are passionate about low-level optimization, inference scalability, and building robust AI systems that deliver measurable production impact at scale.

Accountabilities:

Design, develop, and optimize advanced model serving architectures focused on high throughput, low latency, and efficient memory utilization.
Build scalable inference pipelines capable of running across cloud, edge, and resource-constrained environments.
Conduct controlled inference experiments in simulated and production environments to evaluate system performance and reliability.
Monitor and analyze key performance metrics such as latency, throughput, memory consumption, token response time, and error rates.
Develop and maintain benchmarking methodologies and performance validation frameworks for AI inference systems.
Identify bottlenecks in serving pipelines, including batch processing inefficiencies, network overhead, and excessive memory usage.
Optimize inference frameworks and deployment strategies for scalability, resilience, and operational efficiency.
Collaborate with cross-functional engineering and research teams to integrate optimized inference solutions into production environments.
Create high-quality testing datasets and deployment scenarios that reflect real-world operational challenges.
Continuously improve inference infrastructure through experimentation, iteration, and adoption of cutting-edge AI serving techniques.

Requirements:

Strong experience in AI/ML engineering with a focus on inference optimization, model serving, or AI systems performance.
Deep understanding of model deployment architectures and inference frameworks for large-scale AI applications.
Expertise in optimizing latency, throughput, scalability, and memory footprint in production AI systems.
Hands-on experience with performance monitoring, benchmarking, profiling, and bottleneck analysis.
Strong knowledge of advanced AI model architectures, including multi-modal systems and resource-efficient models.
Experience building and deploying AI systems across cloud, edge, or low-resource hardware environments.
Proficiency in programming languages commonly used in AI infrastructure and optimization workflows.
Strong analytical and problem-solving abilities with a research-oriented mindset.
Ability to work independently in a highly distributed and fast-moving global environment.
Excellent English communication skills and ability to collaborate across technical and non-technical teams.
Passion for innovation, experimentation, and scalable AI infrastructure development.

Benefits:

Fully remote global work environment with flexible location options.
Opportunity to work on cutting-edge AI, blockchain, and fintech technologies.
Collaborative international team of highly skilled engineers and researchers.
Exposure to innovative projects involving AI infrastructure, digital finance, and decentralized technologies.
High-impact role with significant technical ownership and influence on product direction.
Fast-paced and innovation-driven culture focused on experimentation and growth.
Opportunities for continuous learning and professional development.
Work environment that values autonomy, creativity, and technical excellence.
Participation in projects with global reach and real-world scalability challenges.

How Jobgether works: We use an AI-powered matching process to ensure your application is reviewed quickly, objectively, and fairly against the role's core requirements. Our system identifies the top-fitting candidates, and this shortlist is then shared directly with the hiring company. The final decision and next steps (interviews, assessments) are managed by their internal team. We appreciate your interest and wish you the best!

Data Privacy Notice: By submitting your application, you acknowledge that Jobgether will process your personal data to evaluate your candidacy and share relevant information with the hiring employer. This processing is based on legitimate interest and pre-contractual measures under applicable data protection laws (including GDPR). You may exercise your rights (access, rectification, erasure, objection) at any time.