Deep Research run · 2026-04-28
**Query:** Research whether anyone has tried using Physical Intelligence-style VLA models or Pi-branded VLA models on affordable desktop or home robot arms. Focus on real-world experiments, demos, repos, startups, hobbyist projects, and research efforts using low-cost tabletop/desktop manipulators or home robot arms rather than industrial systems. Identify which arms/platforms are being used, approximate hardware cost, compute requirements, whether models are open or closed, what level of embodiment/adaptation is needed, and whether there are credible paths for an affordable home setup in 2026. Include concrete examples, links, and a bottom-line assessment of what is actually feasible today versus still aspirational.
**Model:** o4-mini-deep-research
**Date:** 2026-04-28
**Searches performed:** 29
**Sources cited:** 25
Physical Intelligence’s VLA models (π₀, π₀.₅, etc.) are showcased mostly on expensive hardware, but hobbyists and researchers have begun adapting VLA and imitation-learning policies to very low-cost arms. For example, Hugging Face’s open **SmolVLA** (450M parameters) runs in real time on consumer hardware and has been evaluated on SO-ARM100/101 kits (3D-printed arms, ~\$100 each) with ~78% pick-place success ([huggingface.co](https://huggingface.co/blog/smolvla#:~:text=SmolVLA%20initially%20achieves%2051.7,data%20regimes)). Physical Intelligence’s π₀.₅ model (a VLM backbone of roughly 3B parameters plus an action expert of about 300M) has also been trialed on an SO-101 arm by an independent experimenter: with 100 teleoperated demos and a single RTX 5090 GPU, it achieved ~90% success grasping a bottle ([medium.com](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Hardware%20setup)) ([medium.com](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Results%20%C2%B7%2015%20trials)). Other efforts (e.g. Alibaba DAMO’s 7B **RynnVLA-002** on the same SO-ARM hardware ([github.com](https://github.com/alibaba-damo-academy/WorldVLA#:~:text=,world%20LeRobot%20experiments)), or ACT imitation policies on the open-source ALOHA arms ([openreview.net](https://openreview.net/forum?id=e8Eu1lqLaf#:~:text=teleoperation%20interface,10%20minutes%20worth%20of%20demonstrations))) further show that cheap manipulators can run advanced policies. However, the larger models still rely on heavy compute (a flagship GPU for fine-tuning and inference, clusters for pretraining), and all of these results depend on careful hardware tuning. In summary: **today’s feasible home setup is likely a LeRobot/SO-ARM kit ($100–500) paired with an RTX-class PC running an open model (like SmolVLA or a fine-tuned policy)**.
Fully realizing a “Pi-level”, versatile, general-purpose VLA at home remains aspirational, requiring far more data, compute, and mechanical fidelity than a typical hobbyist system provides.
| Category / Example | Robot Platform & Cost | Compute Requirement | Model (Params, Open/Closed) | Tasks/Demo Outcomes | Feasible Today? |
|---|---|---|---|---|---|
| **Physical Intelligence VLA (π₀, π₀.₅)** | *Custom high-end arms used in PI demos* (each arm ~$5–9K ([www.linkedin.com](https://www.linkedin.com/posts/igor-kulakov-364a1553_we-are-about-to-launch-a-300-open-source-activity-7266857628314587137-hGjS#:~:text=We%20are%20about%20to%20launch,want%20to%20follow%20and%20participate))) | Multi-GPU or cluster training; inference on a flagship GPU (e.g. the RTX 5090 used in one independent test ([medium.com](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Hardware%20setup))) | π₀ and π₀.₅: flow-matching action expert (~300M parameters) on a VLM backbone. Checkpoints and code are published in PI’s openpi repository; the training data and tuning behind the home demos are not. | Demonstrated complex tasks (laundry folding, kitchen cleanup) in new home scenes ([www.physicalintelligence.company](https://www.physicalintelligence.company/blog/pi05?curius=1433#:~:text=We%20have%20been%20developing%20robotic,new%20home%20that%20was%20not)). In PI’s own tests, π₀.₅ generalized to new houses with hints of common-sense task reasoning. | **Not off-the-shelf**. The headline home demos have only been shown by PI’s own lab; reproducing them outside would require enormous compute/data. ([www.physicalintelligence.company](https://www.physicalintelligence.company/blog/pi05?curius=1433#:~:text=abstraction,about%20400%20hours))<br>*PI-level generality is not practical for hobbyists; narrow single-task fine-tuning has been reported (see the hobbyist row).* |
| **Open VLA models (SmolVLA, RynnVLA, etc.)** | SO-ARM100/101 kits (~$100 base; $300–500 assembled) ([techcrunch.com](https://techcrunch.com/2025/04/28/hugging-face-releases-a-3d-printed-robotic-arm-starting-at-100/#:~:text=Compared%20to%20the%20SO,placing%20it%20in%20a%20bin)); LeRobot arms (open-source, 3D-printed). | Single consumer GPU (e.g. RTX 30-series); SmolVLA even runs on CPU ([huggingface.co](https://huggingface.co/blog/smolvla#:~:text=Today%2C%20we%20introduce%20SmolVLA%2C%20a,that%20runs%20on%20consumer%20hardware)); RynnVLA (7B) needs more than one GPU. | **SmolVLA**: 450M-parameter VLA (compact VLM plus flow-matching action expert), open ([huggingface.co](https://huggingface.co/blog/smolvla#:~:text=Today%2C%20we%20introduce%20SmolVLA%2C%20a,that%20runs%20on%20consumer%20hardware)). **RynnVLA-002**: 7B VLA, open ([github.com](https://github.com/alibaba-damo-academy/WorldVLA#:~:text=,world%20LeRobot%20experiments)). | SmolVLA achieved ~78% success on SO-ARM100 pick-and-place after pretraining on community data ([huggingface.co](https://huggingface.co/blog/smolvla#:~:text=SmolVLA%20initially%20achieves%2051.7,data%20regimes)). Alibaba’s RynnVLA-002 (7B) reported 97% on the LIBERO simulation benchmark and provides code for real SO-ARM evaluation ([github.com](https://github.com/alibaba-damo-academy/WorldVLA#:~:text=,world%20LeRobot%20experiments)). | **Yes, accessible.** Example: SmolVLA runs on $100 arms and a consumer GPU ([huggingface.co](https://huggingface.co/blog/smolvla#:~:text=SmolVLA%20initially%20achieves%2051.7,data%20regimes)). Fine-tuning even a 450M model still takes hours of GPU time and some task-specific data, but running pre-trained or fine-tuned models on cheap arms is quite feasible. |
| **Imitation/RL on low-cost arms** | **ALOHA kit** (open-source dual-arm teleoperation setup, ~\$20K per the ALOHA paper), Niryo One ($499 classroom arm), Wonder Workshop ($200). | 1–2 consumer GPUs or even a laptop CPU. | E.g. **ACT** (Action Chunking with Transformers), an open imitation-learning model ([huggingface.co](https://huggingface.co/pavelsimo/SO-101-pick-and-place-act#:~:text=Action%20Chunking%20with%20Transformers%20,often%20achieves%20high%20success%20rates)); various diffusion or behavior-cloning policies (open). | ACT on ALOHA achieved ~80–90% success on fine manipulation tasks (opening cups, inserting batteries) with only ~10 min of demos per task ([openreview.net](https://openreview.net/forum?id=e8Eu1lqLaf#:~:text=teleoperation%20interface,10%20minutes%20worth%20of%20demonstrations)). Hugging Face’s SO-ARM has pick-place demos learned via RL/BC. | **Yes, practical.** Low-cost arms (Niryo, SO kits) are widely used in labs and by hobbyists (with ROS/Python). Imitation learning can produce robust policies on specific tasks with small datasets. It requires collecting teleoperated demos, but no giant models. |
| **Hobbyist DIY Arms** | Custom 6-DOF kits (~$200–300) like the arm Igor Kulakov built ([www.linkedin.com](https://www.linkedin.com/posts/igor-kulakov-364a1553_we-are-about-to-launch-a-300-open-source-activity-7266857628314587137-hGjS#:~:text=We%20are%20about%20to%20launch,want%20to%20follow%20and%20participate)), or kits like Hiwonder LeArm ($200). | Consumer PC + high-end GPU (RTX 30/40/50-series). | Plans to run VLAs (e.g. π₀ on Igor’s $300 arm); existing codebases (LeRobot, π₀.₅ code) may be ported. | Work in progress: Igor’s team is prototyping a $300 arm “compatible with Pi0” ([www.linkedin.com](https://www.linkedin.com/posts/igor-kulakov-364a1553_we-are-about-to-launch-a-300-open-source-activity-7266857628314587137-hGjS#:~:text=We%20are%20about%20to%20launch,want%20to%20follow%20and%20participate)). Alex’s Medium write-up reports π₀.₅ running on an SO-101 (a similar hobby arm) with 87–93% success ([medium.com](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Hardware%20setup)) ([medium.com](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Results%20%C2%B7%2015%20trials)). | **Experimental.** Parts are cheap, but building stable 6-DOF hardware is hard. Open-source LeRobot kits ($100–300) are easier. Enthusiasts have shown it can work (with careful tuning) ([medium.com](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Hardware%20setup)), but this isn’t plug-and-play for average users. |
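
The compute column can be sanity-checked with a weights-only memory estimate. The sketch below multiplies the parameter counts quoted in the table by the bytes stored per parameter. It is a rough lower bound under stated assumptions: 16-bit weights take 2 bytes each, and activations, optimizer state, and framework overhead are ignored, so real usage (especially during fine-tuning) will be higher. The function and dictionary names are illustrative only.

```python
# Weights-only memory estimate: a lower bound, not a measured footprint.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}

# Parameter counts as quoted in the table above.
MODELS = {"SmolVLA": 450e6, "RynnVLA-002": 7e9}


def weight_memory_gb(num_params: float, precision: str = "fp16") -> float:
    """Return the size of the raw weights in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


if __name__ == "__main__":
    for name, params in MODELS.items():
        print(f"{name}: ~{weight_memory_gb(params):.1f} GB of fp16 weights")
```

At 16-bit precision this works out to about 0.9 GB for SmolVLA and about 14 GB for RynnVLA-002, which helps explain why the smaller model fits easily on a consumer GPU while the 7B model leaves little headroom once activations and training state are added.
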
**Feasible today:** The most credible “affordable VLA” path (2026) is to buy an **SO-ARM kit ($100–500, depending on whether it is self-printed or bought assembled)** or an equivalent desktop arm, connect it to a PC with an RTX 40-series or newer GPU, and use open-source frameworks. For example, one could install the LeRobot stack, mount cameras, and fine-tune SmolVLA or an imitation policy on a desired task. Several repositories (on Hugging Face and GitHub) provide datasets and code for pick/place, stacking, etc., on SO-arms ([huggingface.co](https://huggingface.co/blog/smolvla#:~:text=SmolVLA%20initially%20achieves%2051.7,data%20regimes)) ([huggingface.co](https://huggingface.co/pavelsimo/SO-101-pick-and-place-act#:~:text=Action%20Chunking%20with%20Transformers%20,often%20achieves%20high%20success%20rates)). SmolVLA’s blog explicitly encourages users to “plug it into a robotics stack” and run it on affordable hardware ([huggingface.co](https://huggingface.co/blog/smolvla#:~:text=Today%2C%20we%20introduce%20SmolVLA%2C%20a,that%20runs%20on%20consumer%20hardware)). In summary: **an SO-ARM + RTX 4090/5090 + SmolVLA/ACT** is a working combination today for narrow tasks (with effort), with reported success rates of roughly 80–90% on individual pick-and-place or grasping tasks.
**Still aspirational:** By contrast, a single home robot that interprets general language and tidies any room (PI’s vision) remains out of reach. A novice will face low-precision servos, limited training data, and a gap between published checkpoints and PI’s in-house results: π₀ and π₀.₅ weights are available through PI’s openpi repository, but the data and tuning behind the home demos are not. Fully open VLA efforts are promising but need more time: larger models such as RynnVLA (7B), or future LLM-scale controllers, likely require still more data and hardware. At best, an ambitious user might experiment with π₀.₅ by replicating Alex’s setup: collecting demos and fine-tuning on a high-end GPU. That showed good results on one object ([medium.com](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Hardware%20setup)) ([medium.com](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Results%20%C2%B7%2015%20trials)), but scaling it up (many objects/tasks) will require major compute and custom coding.
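
The single-object result rests on a small evaluation, so it is worth asking how much a handful of trials can tell us. As a rough illustration, the sketch below computes a 95% Wilson score interval for a success rate. It assumes the reported 87–93% range corresponds to 13 and 14 successes out of 15 trials, which is an inference from the cited section title “Results · 15 trials” rather than a figure stated in this report. The function name is illustrative.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial success rate (z = 1.96 gives ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p_hat = successes / trials
    denom = 1 + z**2 / trials
    center = (p_hat + z**2 / (2 * trials)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / trials + z**2 / (4 * trials**2)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    for successes in (13, 14):  # assumed counts behind 87% and 93% of 15 trials
        low, high = wilson_interval(successes, 15)
        print(f"{successes}/15: {low:.2f} to {high:.2f}")
```

Under these assumptions the lower bounds fall between roughly 0.6 and 0.7, so a 15-trial result is encouraging but does not pin down the true success rate tightly.
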
**Recommendation:** For someone “trying this now,” the smart approach is **open, incremental steps**. Acquire a LeRobot-compatible arm (SO-101 or similar), install an RTX GPU, and leverage Hugging Face models. Start with imitation learning / behavior cloning (BC): record yourself teleoperating the arm for a few dozen trials of a simple task (e.g. picking up a known object). Then train a policy (ACT or plain behavior cloning) using the LeRobot tools. Meanwhile, experiment with SmolVLA or other open VLA models on the same data. This mirrors the successes above: 100 demos were enough to yield a working π₀.₅ policy ([medium.com](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Training%20configuration)) ([medium.com](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Results%20%C2%B7%2015%20trials)). Additional tips: use multiple camera views (wrist + overhead) for robustness ([medium.com](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=The%20camera%20configuration%20directly%20shapes,loose%20stereo%20analog%20during%20training)), and tune the arm’s control loop carefully. In short, focus on *task-specific training on a cheap arm*, rather than trying to run a full smart-home model. That will give real, repeatable results today, leaving full “general VLA” for future advances.
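
The "record demos, then train a policy" step can be illustrated with a deliberately tiny example. The sketch below is not the LeRobot API and not ACT. It is a toy 1-D behavior-cloning loop under assumed dynamics, where a scripted “expert” stands in for teleoperation and the policy is a single least-squares gain. All names (`collect_demos`, `fit_policy`, `rollout`) and the noise level are hypothetical.

```python
import random


def collect_demos(episodes: int, steps: int, seed: int = 0) -> list[tuple[float, float]]:
    """Synthetic stand-in for teleoperation: returns (observation, action) pairs.

    The observation is the signed distance to the target; the scripted expert
    moves half of that distance each step, plus small operator noise.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(episodes):
        error = rng.uniform(-1.0, 1.0)
        for _ in range(steps):
            action = 0.5 * error + rng.gauss(0.0, 0.01)
            pairs.append((error, action))
            error -= action  # moving toward the target shrinks the error
    return pairs


def fit_policy(pairs: list[tuple[float, float]]) -> float:
    """Behavior cloning as regression: least-squares gain for action = gain * obs."""
    sxx = sum(obs * obs for obs, _ in pairs)
    sxy = sum(obs * act for obs, act in pairs)
    return sxy / sxx


def rollout(gain: float, start_error: float, steps: int = 20) -> float:
    """Run the cloned policy without the expert and return the final error."""
    error = start_error
    for _ in range(steps):
        error -= gain * error
    return error


if __name__ == "__main__":
    demos = collect_demos(episodes=30, steps=10)
    gain = fit_policy(demos)
    print(f"learned gain: {gain:.3f}, final error: {rollout(gain, 0.8):.6f}")
```

Real policies such as ACT replace the single gain with a neural network and predict chunks of future actions, but the training signal is the same: match the demonstrated action for each recorded observation.
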
1. [SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data](https://huggingface.co/blog/smolvla#:~:text=SmolVLA%20initially%20achieves%2051.7,data%20regimes)
2. [Teaching a Robot Arm to Pick a Bottle: First Results with the Pi0.5 Imitation Learning model | by AlexRoboticsExperiments | Apr, 2026 | Medium](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Hardware%20setup)
3. [Teaching a Robot Arm to Pick a Bottle: First Results with the Pi0.5 Imitation Learning model | by AlexRoboticsExperiments | Apr, 2026 | Medium](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Results%20%C2%B7%2015%20trials)
4. [GitHub - alibaba-damo-academy/RynnVLA-002: RynnVLA-002: A Unified Vision-Language-Action and World Model](https://github.com/alibaba-damo-academy/WorldVLA#:~:text=,world%20LeRobot%20experiments)
5. [Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware | OpenReview](https://openreview.net/forum?id=e8Eu1lqLaf#:~:text=teleoperation%20interface,10%20minutes%20worth%20of%20demonstrations)
6. [We are about to launch a $300 open-source robot arm compatible with Pi0 from Physical Intelligence! This new foundational robotic model is super impressive and far ahead of previous works! However… | Igor Kulakov](https://www.linkedin.com/posts/igor-kulakov-364a1553_we-are-about-to-launch-a-300-open-source-activity-7266857628314587137-hGjS#:~:text=We%20are%20about%20to%20launch,want%20to%20follow%20and%20participate)
7. [A VLA with Open-World Generalization](https://www.physicalintelligence.company/blog/pi05?curius=1433#:~:text=We%20have%20been%20developing%20robotic,new%20home%20that%20was%20not)
8. [A VLA with Open-World Generalization](https://www.physicalintelligence.company/blog/pi05?curius=1433#:~:text=abstraction,about%20400%20hours)
9. [Hugging Face releases a 3D-printed robotic arm starting at $100 | TechCrunch](https://techcrunch.com/2025/04/28/hugging-face-releases-a-3d-printed-robotic-arm-starting-at-100/#:~:text=Compared%20to%20the%20SO,placing%20it%20in%20a%20bin)
10. [SmolVLA: Efficient Vision-Language-Action Model trained on Lerobot Community Data](https://huggingface.co/blog/smolvla#:~:text=Today%2C%20we%20introduce%20SmolVLA%2C%20a,that%20runs%20on%20consumer%20hardware)
11. [pavelsimo/SO-101-pick-and-place-act · Hugging Face](https://huggingface.co/pavelsimo/SO-101-pick-and-place-act#:~:text=Action%20Chunking%20with%20Transformers%20,often%20achieves%20high%20success%20rates)
12. [Teaching a Robot Arm to Pick a Bottle: First Results with the Pi0.5 Imitation Learning model | by AlexRoboticsExperiments | Apr, 2026 | Medium](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=The%20policy%20architecture%20is%20Pi0,dimensional%20action%20space)
13. [Teaching a Robot Arm to Pick a Bottle: First Results with the Pi0.5 Imitation Learning model | by AlexRoboticsExperiments | Apr, 2026 | Medium](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=A%20note%20on%20SmolVLA)
14. [Teaching a Robot Arm to Pick a Bottle: First Results with the Pi0.5 Imitation Learning model | by AlexRoboticsExperiments | Apr, 2026 | Medium](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Training%20configuration)
15. [GitHub - alibaba-damo-academy/RynnVLA-002: RynnVLA-002: A Unified Vision-Language-Action and World Model](https://github.com/alibaba-damo-academy/WorldVLA#:~:text=%F0%9F%97%9D%EF%B8%8F%20Training%20on%20LeRobot)
16. [GitHub - alibaba-damo-academy/RynnVLA-002: RynnVLA-002: A Unified Vision-Language-Action and World Model](https://github.com/alibaba-damo-academy/WorldVLA#:~:text=%60rynnvla,data%20and%20save%20them%20all)
17. [Mobile ALOHA: Learning Bimanual Mobile Manipulation using Low-Cost Whole-Body Teleoperation](https://proceedings.mlr.press/v270/fu25b#:~:text=develop%20a%20system%20for%20imitating,sauteing%20and%20serving%20a%20piece)
18. [Hiwonder LeRobot SO-ARM101 Open-Source 6-Axis Robotic Arm with AI Visi](https://www.hiwonder.com/products/lerobot-so-101#:~:text=The%20SO,source)
19. [Hiwonder LeRobot SO-ARM101 Open-Source 6-Axis Robotic Arm with AI Visi](https://www.hiwonder.com/products/lerobot-so-101#:~:text=models%20to%20help%20developers%2C%20students%2C,reinforcement%20learning%20in%20robot%20development)
20. [Hugging Face releases a 3D-printed robotic arm starting at $100 | TechCrunch](https://techcrunch.com/2025/04/28/hugging-face-releases-a-3d-printed-robotic-arm-starting-at-100/#:~:text=To%20be%20clear%2C%20%24100%20is,500%2C%20depending%20on%20the%20supplier)
21. [Building the Future of Robotics With Open Source Hardware - Robo9](https://robonine.com/building-the-future-of-robotics-with-open-source-hardware/#:~:text=ALOHA%202%20disrupts%20that%20paradigm,like%20Linux%20did%20for%20software)
22. [Hugging Face releases a 3D-printed robotic arm starting at $100 | TechCrunch](https://techcrunch.com/2025/04/28/hugging-face-releases-a-3d-printed-robotic-arm-starting-at-100/#:~:text=,ClementDelangue%29%20April%2028%2C%202025)
23. [Teaching a Robot Arm to Pick a Bottle: First Results with the Pi0.5 Imitation Learning model | by AlexRoboticsExperiments | Apr, 2026 | Medium](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Servo%20inconsistency%20under%20load%20is,tuning%20regime%20that%20finds%20a)
24. [Teaching a Robot Arm to Pick a Bottle: First Results with the Pi0.5 Imitation Learning model | by AlexRoboticsExperiments | Apr, 2026 | Medium](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=The%20camera%20configuration%20directly%20shapes,loose%20stereo%20analog%20during%20training)
25. [Teaching a Robot Arm to Pick a Bottle: First Results with the Pi0.5 Imitation Learning model | by AlexRoboticsExperiments | Apr, 2026 | Medium](https://medium.com/%40alexsirk5/teaching-a-robot-arm-to-pick-a-bottle-first-results-with-pi0-5-and-imitation-learning-ba8ffadb2c10#:~:text=Two%20distinct%20failure%20modes%20emerged,adjusting%20the%20execution%20horizon%20parameter)