What is PRObE?

Parallel Reconfigurable Observational Environment

PRObE is an NSF-sponsored project aimed at providing a large-scale, low-level systems research facility. It is a collaborative effort by the New Mexico Consortium (NMC), Los Alamos National Laboratory, Carnegie Mellon University, the University of Utah, and the University of New Mexico. It is housed at NMC in the Los Alamos Research Park.

PRObE will provide a highly reconfigurable, remotely accessible and controllable environment that researchers can use to perform experiments that are not possible at a smaller scale. At full production scale, PRObE will provide at least two 1024-node clusters, one 200-node cluster, and some smaller machines with extreme core counts and bleeding-edge technology. The machines are retired large clusters donated by DOE facilities.

The PRObE research environment will be based on the successful Emulab testbed-management software, developed over the past decade by the Flux Research Group, part of the School of Computing at the University of Utah. Emulab is a full-featured suite for testbed management, designed to provide the low-level access to testbeds that systems researchers require as well as higher-level tools that enhance researcher productivity. It is widely used in the systems research community: it powers over three dozen testbeds around the world, which are used by thousands of researchers and educators.

The bulk of the PRObE facilities are located at the New Mexico Consortium, with a smaller facility located at Carnegie Mellon University. Researchers will be able to access the machines remotely or visit Los Alamos to work on site.

PRObE is dedicated to systems research. The computer facility allows hands-on operation of very large computing resources. Researchers will have complete control of the hardware while they are running experiments, and they can inject both hardware and software failures while monitoring the system to see how it reacts. We envision that this unique system will support research in many systems-related fields, such as operating systems, storage, and high-end computing.

No other system at this scale in the world provides this ability.