AI Engineer (Vision-Language-Action / Multimodal Systems)
- Location: San Francisco
- Expertise: Robotics
- Job Type: Permanent
- Salary: $150,000 per annum
A well-funded, early-stage robotics company is building next-generation autonomous systems designed to operate in complex, real-world environments.
Their focus is on developing general-purpose robotic platforms that combine cutting-edge AI with physical systems to tackle high-impact challenges across industrial and defense-adjacent applications.
As they scale, they’re investing heavily in multimodal AI and embodied intelligence to enable robots to understand, reason, and act in dynamic environments.
The Role
We are seeking an AI Engineer to develop and deploy advanced multimodal models that bridge perception, reasoning, and action in real-world robotic systems.
This role sits at the intersection of machine learning and robotics, with a focus on vision-language-action (VLA) and vision-language models (VLMs).
What You’ll Do
- Develop and optimize multimodal models (e.g. transformers, diffusion models, vision-language-action architectures)
- Build representations for perception, scene understanding, spatial reasoning, and affordances
- Integrate language-based reasoning with planning and control systems
- Design and curate large-scale multimodal datasets (video, teleoperation, synthetic data, instruction-based learning)
- Deploy models onto edge or onboard compute, optimizing for latency and reliability
- Build pipelines for training, evaluation, and scaling of ML systems
- Develop simulation-to-real (Sim2Real) workflows for robust real-world performance
- Collaborate closely with robotics, controls, and hardware teams to ensure models translate effectively into real-world behavior
- Participate in testing and iteration based on real-world system performance
What We’re Looking For
- Strong experience with multimodal machine learning (VLMs, VLAs, transformers, or similar)
- Deep expertise in PyTorch or JAX, including distributed training and GPU acceleration
- Experience building and scaling large training pipelines
- Strong software engineering skills in Python and modern ML tooling
- Experience with dataset creation, curation, and augmentation (including synthetic data)
- Understanding of deployment constraints on edge or embedded systems
- Degree (MSc/PhD preferred) in Computer Science, Machine Learning, Robotics, or related field, or equivalent experience
Nice to Have
- Experience with robotics, embodied AI, or real-world ML deployment
- Familiarity with simulation environments (e.g. MuJoCo, Isaac, or similar)
- Experience with reinforcement learning, imitation learning, or policy learning
- Exposure to real-time systems or safety-critical applications
Why This Role
- Work on cutting-edge embodied AI systems bridging language, vision, and action
- High ownership over model development and real-world deployment
- Opportunity to operate at the frontier of general-purpose robotics
- Fast-paced, highly technical team building from first principles
Additional Details
- Location: San Francisco (on-site)
- Compensation: Competitive base + equity
- Benefits: included
