Introduction

Problem: Modern warehouses still rely heavily on human labor for repetitive value-added services (VAS) tasks such as kitting, SKU reorientation, packing, and palletizing. Manual operations are time-consuming, error-prone, and difficult to scale, especially when product shapes and layouts change frequently. These workflows create throughput bottlenecks and raise labor costs, and they remain the weakest link even in partially automated facilities.

Solution: This project builds an adaptive robotic ecosystem for smart warehouses, moving from bottlenecked manual operations to a zero-touch, fully adaptive robotic workflow. Using a leader-follower SO-101 robotic arm platform, the system learns through imitation and adapts to changing scenarios on the fly. The architecture is driven by transformer-based policies: primarily Action Chunking with Transformers (ACT) for low-level imitation learning and smolVLA Vision-Language-Action models for high-level multimodal reasoning. ROS 2 and MoveIt 2 serve as a deterministic fallback that takes over when the confidence of the learned policy falls below a threshold.
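The confidence-gated handoff described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: the source does not say how confidence is computed or what threshold is used, so `PolicyOutput`, `choose_actions`, the `0.8` threshold, and the toy policy and planner are all hypothetical stand-ins. In a real system the planner callable would wrap a MoveIt 2 motion plan.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

JointTargets = List[float]


@dataclass
class PolicyOutput:
    """Hypothetical container for what a learned policy returns."""
    actions: List[JointTargets]  # a short sequence of joint targets
    confidence: float            # assumed score in [0, 1]; how it is computed is not specified


def choose_actions(
    policy: Callable[[Dict], PolicyOutput],
    fallback_planner: Callable[[Dict], List[JointTargets]],
    observation: Dict,
    threshold: float = 0.8,  # assumed value, for illustration only
) -> Tuple[str, List[JointTargets]]:
    """Use the learned policy when it is confident, else the deterministic planner."""
    output = policy(observation)
    if output.confidence >= threshold:
        return "learned", output.actions
    return "fallback", fallback_planner(observation)


# Toy stand-ins so the sketch runs without a robot or a trained model.
def toy_policy(observation: Dict) -> PolicyOutput:
    # The confidence is read from the observation purely for demonstration.
    return PolicyOutput(actions=[[0.1, 0.2]], confidence=observation["confidence"])


def toy_planner(observation: Dict) -> List[JointTargets]:
    # Stand-in for a deterministic plan (for example, one produced by MoveIt 2).
    return [[0.0, 0.0]]


source, actions = choose_actions(toy_policy, toy_planner, {"confidence": 0.5})
print(source, actions)
```

Keeping the gate in one small function means the threshold and the confidence definition can be changed without touching either controller.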

System Architecture

The complete system integrates hardware, software, and AI models into a unified adaptive pipeline:

Key Features

Benefits

Skills

How ACT Works (Action Chunking with Transformers)

ACT is trained on demonstration sequences to predict the next chunk of actions, a short sequence of future commands rather than a single step, and uses self-attention to focus on the most relevant past context. When the ACT or VLA model fails or its confidence falls below a threshold, control is handed off to ROS 2 and MoveIt 2 for deterministic execution.
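To make the idea of predicting a chunk of actions concrete, here is a minimal sketch of a chunked execution loop. It shows only the control flow: the policy is queried once, its predicted actions are executed in sequence, and the policy is queried again. The names `run_chunked` and `toy_policy`, the chunk size of 4, and the single-joint action format are assumptions for illustration. The toy policy is not a transformer and stands in for a trained ACT model.

```python
from typing import Callable, List, Tuple

Action = List[float]


def run_chunked(
    policy: Callable[[float], List[Action]],
    get_observation: Callable[[], float],
    send_action: Callable[[Action], None],
    chunk_size: int,
    total_steps: int,
) -> int:
    """Execute total_steps actions, querying the policy once per chunk.

    Returns the number of policy queries that were needed.
    """
    executed = 0
    queries = 0
    while executed < total_steps:
        chunk = policy(get_observation())[:chunk_size]
        if not chunk:
            raise ValueError("policy returned an empty action chunk")
        queries += 1
        for action in chunk:
            if executed >= total_steps:
                break
            send_action(action)
            executed += 1
    return queries


def toy_policy(observation: float) -> List[Action]:
    # Stand-in for a trained model: predicts four future targets for one joint,
    # each a small step beyond the current position.
    return [[observation + 0.1 * (i + 1)] for i in range(4)]


def demo() -> Tuple[int, List[Action]]:
    sent: List[Action] = []

    def get_observation() -> float:
        # The last commanded position serves as the observation here.
        return sent[-1][0] if sent else 0.0

    queries = run_chunked(toy_policy, get_observation, sent.append,
                          chunk_size=4, total_steps=10)
    return queries, sent


queries, sent = demo()
print(f"{len(sent)} actions executed with {queries} policy queries")
```

Executing several actions per query reduces how often the model must run, at the cost of reacting to new observations only at chunk boundaries.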

[Figure: ACT pipeline]
[Figure: How ACT works]

Media

Digital twin in Isaac Sim: simulation-to-real transfer

Final ACT model inference achieving stable motion

Learning Experience

This internship at CEVA Logistics provided deep insights into bridging academic research with industrial robotics. Key outcomes include:

Technical Skills

Research & Analytical Skills

Professional Growth

Credits

A special thank you to all those who contributed to the development of this project:

Conclusion & Future Work

This project established a strong foundation for adaptive robotic control in warehouse automation. The integration of ACT for imitation-based control, smolVLA for vision-language reasoning, and digital twin synchronization demonstrated a feasible pathway toward intelligent, real-world robot autonomy, progressing from simulation prototyping in Gazebo and Isaac Sim to teleoperation data collection and validation on the SO-101 hardware platform.

Next Phase:
