Training

Prepare Required Checkpoints

We provide pretrained weights for the visual encoder and the LLM. Download them from the table and put them in the model_zoo directory.

Organize the pretrained weights as follows:

model_zoo
├── vicuna_ckpt
│   ├── 13b_v0
│   └── 7b_v0
└── epcl_vit-L_256tokens
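
As a quick sanity check before training, the layout above can be verified with a small shell helper. This is a minimal sketch, not part of the repository: the function name `check_model_zoo` is hypothetical, and it only tests that each expected path exists, not that the weights inside are complete or valid.

```shell
# Hypothetical helper (not shipped with the repository).
# Usage: check_model_zoo [path/to/model_zoo]
# Returns 0 if every expected path exists, 1 otherwise.
check_model_zoo() {
    root="${1:-model_zoo}"
    missing=0
    # Paths taken from the directory tree above.
    for d in vicuna_ckpt/13b_v0 vicuna_ckpt/7b_v0 epcl_vit-L_256tokens; do
        if [ -e "$root/$d" ]; then
            echo "ok: $root/$d"
        else
            echo "missing: $root/$d" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, running `check_model_zoo model_zoo` from the directory that contains model_zoo prints any missing paths to stderr and returns a non-zero status if something is absent.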

2D Models Training

cd src
sh tools/LAMM/train_lamm2d.sh lamm_2d
# or
sh tools/LAMM/train_lamm2d_slurm.sh <YOUR_PARTITION> lamm_2d

3D Models Training

cd src
sh tools/LAMM/train_lamm3d.sh lamm_3d
# or
sh tools/LAMM/train_lamm3d_slurm.sh <YOUR_PARTITION> lamm_3d

For reference, the GPU memory consumption of the different models is shown below.

Image modality only

cd src
sh tools/Octavius/train_octavius_slurm.sh <YOUR_PARTITION> <NUM_GPU> \
    config/Octavius/octavius_2d_e4_bs64.yaml octavius_2d_e4_bs64

Point cloud modality only

cd src
sh tools/Octavius/train_octavius_slurm.sh <YOUR_PARTITION> <NUM_GPU> \
    config/Octavius/octavius_3d_e3_bs64.yaml octavius_3d_e3_bs64

Image and point cloud modalities jointly

cd src
sh tools/Octavius/train_octavius_slurm.sh <YOUR_PARTITION> <NUM_GPU> \
    config/Octavius/octavius_2d+3d_e6_bs64.yaml octavius_2d+3d_e6_bs64
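
The three commands above follow one pattern: the experiment name matches the config file name without the .yaml extension. The sketch below captures that pattern in a hypothetical helper, `octavius_train_cmd`, which only prints the command so it can be inspected before launching. It assumes the same naming convention holds for any other config you use, which the instructions here do not state.

```shell
# Hypothetical helper (not shipped with the repository).
# Usage: octavius_train_cmd <PARTITION> <NUM_GPU> <EXPERIMENT_NAME>
# Prints the training command; it does not run it.
octavius_train_cmd() {
    partition="$1"
    num_gpu="$2"
    exp="$3"
    # Assumption: the config file is config/Octavius/<EXPERIMENT_NAME>.yaml
    echo "sh tools/Octavius/train_octavius_slurm.sh $partition $num_gpu config/Octavius/$exp.yaml $exp"
}
```

Run it from the src directory, for example `octavius_train_cmd <YOUR_PARTITION> <NUM_GPU> octavius_3d_e3_bs64`, and launch the printed command once it looks right.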

We provide several pretrained LAMM/Octavius checkpoints here:

# Training Samples	Vision Encoder	LLM	Training Data	LoRA Rank	Link
98K	CLIP-ViT-L	Vicuna_v0_7B	LAMM-2D daily dialogue & description	32	Checkpoints
186K	CLIP-ViT-L	Vicuna_v0_7B	LAMM-2D Instruction Data	32	Checkpoints
186K	CLIP-ViT-L	LLaMA2_chat_7B	LAMM-2D Instruction Data	32	Checkpoints
98K	CLIP-ViT-L	Vicuna_v0_13B	LAMM-2D daily dialogue & description	32	Checkpoints
186K	CLIP-ViT-L	Vicuna_v0_13B	LAMM-2D Instruction Data	32	Checkpoints
10K	EPCL-ViT-L	Vicuna_v0_13B	LAMM-3D Instruction Data	32	Checkpoints

# Training Samples	Vision Encoder	LLM	Training Data	Link
286K	CLIP-ViT-L	Vicuna_v0_13B	LAMM-2D Instruction Data & COCO-Detection	ckpt
90K	Obj-As-Scene	Vicuna_v0_13B	Scan2Inst	ckpt
376K	CLIP-ViT-L & Obj-As-Scene	LLaMA2_chat_13B	LAMM-2D Instruction Data & COCO-Detection & Scan2Inst	ckpt

You can download them and put them in the ckpt directory for fast evaluation.