Getting Started

This guide will help you get started with OptiReduce.

System Requirements

Hardware Requirements

  • Network Interface Card (NIC):
    • Recommended: Mellanox ConnectX NICs (supports flow bifurcation)
    • Alternative: Two NICs (one for TCP-based communication, one DPDK-compatible NIC)
  • CPU: At least 4 dedicated cores for OptiReduce
  • Memory: 16GB of hugepages

Software Requirements

  • Ubuntu 22.04 LTS
  • Python 3.9.19
  • CUDA 11.7 and cuDNN 8.5 (optional, for GPU training)
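
The hugepage and core requirements listed above can be sanity-checked on a node before installing. The following is a minimal sketch, not part of OptiReduce: the function names `hugepage_gib` and `check_requirements` are hypothetical, it assumes "16GB" means 16 GiB, it only counts hugepages of the default size reported in /proc/meminfo, and it checks the total core count rather than whether cores are actually dedicated.

```python
import os

REQUIRED_HUGEPAGE_GIB = 16  # from the requirements list (assumed to mean GiB)
REQUIRED_CORES = 4          # minimum dedicated cores from the requirements list


def hugepage_gib(meminfo_text):
    """Total default-size hugepage memory, in GiB, from /proc/meminfo text."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    # HugePages_Total is a page count; Hugepagesize is reported in kB (KiB).
    total_kib = fields.get("HugePages_Total", 0) * fields.get("Hugepagesize", 0)
    return total_kib / (1024 * 1024)


def check_requirements(meminfo_text, cpu_count):
    """Return a list of human-readable problems; empty if both checks pass."""
    problems = []
    gib = hugepage_gib(meminfo_text)
    if gib < REQUIRED_HUGEPAGE_GIB:
        problems.append(
            f"hugepages: {gib:.1f} GiB reserved, need {REQUIRED_HUGEPAGE_GIB}"
        )
    if cpu_count < REQUIRED_CORES:
        problems.append(f"CPU: {cpu_count} cores found, need {REQUIRED_CORES}")
    return problems


if __name__ == "__main__" and os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        text = f.read()
    for problem in check_requirements(text, os.cpu_count() or 0):
        print("WARNING:", problem)
```

Running the script on each node prints one warning per unmet requirement and nothing when both checks pass.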

Installation Options

You can install OptiReduce in two ways:

1. Automated Deployment with Ansible

For automated deployment across multiple nodes:

git clone https://github.com/OptiReduce/ansible.git
cd ansible
make optireduce-full

2. Manual Installation

Clone and install the core repository on each node:

git clone https://github.com/OptiReduce/setup.git
cd setup
make install

For detailed installation instructions, see our Installation Guide.

Quick Start

1. Configure Environment

export GLOO_ALGO=Optireduce
export GLOO_SOCKET_IFNAME="ens17"  # Your NIC name
export GLOO_DPDK_THREADS_OFFSET=11 # OptiReduce cores (will be 11-15 here)
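
A misspelled interface name or a missing variable is easy to overlook, so it can help to validate the environment before launching training. The sketch below is an illustration only: the helper `check_optireduce_env` is hypothetical, and it assumes that all three variables shown above must be set and that the thread offset is a non-negative integer.

```python
import os
import socket

# Variables from the Quick Start; treating all three as required is an assumption.
REQUIRED_VARS = ("GLOO_ALGO", "GLOO_SOCKET_IFNAME", "GLOO_DPDK_THREADS_OFFSET")


def check_optireduce_env(env, interfaces):
    """Return a list of problems found in `env`; empty if everything looks sane."""
    problems = []
    for name in REQUIRED_VARS:
        if not env.get(name):
            problems.append(f"{name} is not set")

    ifname = env.get("GLOO_SOCKET_IFNAME")
    if ifname and ifname not in interfaces:
        problems.append(f"interface {ifname!r} not found on this host")

    offset = env.get("GLOO_DPDK_THREADS_OFFSET")
    if offset and not offset.isdigit():
        problems.append(f"GLOO_DPDK_THREADS_OFFSET={offset!r} is not an integer")
    return problems


if __name__ == "__main__":
    # socket.if_nameindex() returns (index, name) pairs for local interfaces.
    local_interfaces = [name for _, name in socket.if_nameindex()]
    for problem in check_optireduce_env(os.environ, local_interfaces):
        print("WARNING:", problem)
```

Run it in the same shell where the variables were exported; no output means the three variables are set and the interface exists on the host.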

2. Setup Your Code

import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

# Initialize process group
dist.init_process_group(backend="gloo")

# Wrap your existing torch.nn.Module (`model`) in DDP
model = DDP(model, bucket_cap_mb=1350)  # This value is required
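
To make the `bucket_cap_mb` value concrete, the sketch below converts it into bytes and gradient elements. PyTorch's DDP interprets `bucket_cap_mb` in units of 1024 * 1024 bytes; the helper names here are hypothetical, and the guide does not state why 1350 is required, so this shows only what the number means. It compares total gradient size against the cap and does not reproduce DDP's actual bucket assignment.

```python
BYTES_PER_MIB = 1024 * 1024  # DDP converts bucket_cap_mb with this factor


def bucket_capacity_elements(bucket_cap_mb, bytes_per_element=4):
    """Number of gradient elements that fit under the bucket cap.

    bytes_per_element defaults to 4 (float32 gradients).
    """
    return int(bucket_cap_mb * BYTES_PER_MIB) // bytes_per_element


def within_bucket_cap(num_params, bucket_cap_mb=1350, bytes_per_element=4):
    """True if all gradients together are no larger than the bucket cap."""
    return num_params <= bucket_capacity_elements(bucket_cap_mb, bytes_per_element)


if __name__ == "__main__":
    print(bucket_capacity_elements(1350))  # 353894400 float32 elements
    print(within_bucket_cap(100_000_000))  # True for a 100M-parameter model
```

For float16 gradients, pass `bytes_per_element=2`, which doubles the number of elements under the same cap.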

3. Run Training

We provide ready-made training scripts for popular models (VGG19, BERT, BART, RoBERTa, GPT2) in our benchmark repository. See our benchmarking guide for running these scripts.

What's Next?

Understanding OptiReduce

Optimizing Performance

  • Follow our Benchmarking Guide to evaluate performance
  • Learn how to simulate different network environments
  • Compare OptiReduce with other communication schemes

Getting Help

If you encounter issues:

  1. Check the documentation pages linked above
  2. Review existing GitHub issues
  3. Open a new issue with a minimal example

Contributing

We welcome contributions to OptiReduce! Whether it's improving documentation, fixing bugs, optimizing performance, or adding new features, your help is appreciated. Please check our Contributing Guide for guidelines on how to get started.