
Training

Prepare Required Checkpoints

We provide pretrained weights for the visual encoders and the LLM. Download them from the table below and place them in the model_zoo directory.

Model Name | Link
Vicuna | Link
clip-vit-large-patch14-336 | Link
epcl_vit-L_256tokens | Link

Organize the pretrained weights as below:

model_zoo
├── vicuna_ckpt
│   ├── 13b_v0
│   └── 7b_v0
└── epcl_vit-L_256tokens
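
The layout above can be created ahead of time so that the downloaded weights have a place to go. The following is a minimal sketch, assuming it is run from the directory that should contain model_zoo (the tree does not say where that is, so the location is an assumption):

```shell
# Create the directory skeleton shown in the tree above.
# Assumption: model_zoo lives in the current working directory.
mkdir -p model_zoo/vicuna_ckpt/13b_v0 \
         model_zoo/vicuna_ckpt/7b_v0 \
         model_zoo/epcl_vit-L_256tokens

# List the directories to compare them against the expected layout.
find model_zoo -type d | sort
```

After this, copy each downloaded checkpoint into the matching directory.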

LAMM

  • 2D Models Training

    cd src
    sh tools/LAMM/train_lamm2d.sh lamm_2d
    # or
    sh tools/LAMM/train_lamm2d_slurm.sh <YOUR_PARTITION> lamm_2d
  • 3D Models Training

    cd src
    sh tools/LAMM/train_lamm3d.sh lamm_3d
    # or
    sh tools/LAMM/train_lamm3d_slurm.sh <YOUR_PARTITION> lamm_3d

For reference, the approximate GPU memory consumption for each model size is shown below:

Model Size | Sample Num/GPU | GPU Memory
Vicuna_v0_7B | 1 | ~30GB
Vicuna_v0_7B | 2 | ~46GB
Vicuna_v0_13B | 1 | ~53GB
Vicuna_v0_13B | 2 | ~70GB
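
These figures can be used to check whether a configuration is likely to fit on a given GPU before a job is launched. The sketch below hard-codes the approximate values listed above. The helper names approx_mem_gb and fits_on_gpu are hypothetical and not part of the LAMM tools, and because the figures are approximate, some headroom is advisable in practice.

```shell
# Look up the approximate per-GPU memory (in GB) listed above.
# approx_mem_gb is a hypothetical helper, not part of the LAMM tools.
# Usage: approx_mem_gb <7b|13b> <samples_per_gpu>
approx_mem_gb() {
    case "$1:$2" in
        7b:1)  echo 30 ;;
        7b:2)  echo 46 ;;
        13b:1) echo 53 ;;
        13b:2) echo 70 ;;
        *)     echo "unknown"; return 1 ;;
    esac
}

# Succeed if the configuration fits in the given GPU memory (in GB).
# Usage: fits_on_gpu <7b|13b> <samples_per_gpu> <gpu_mem_gb>
fits_on_gpu() {
    need=$(approx_mem_gb "$1" "$2") || return 2
    [ "$need" -le "$3" ]
}

# Example: Vicuna_v0_13B with 2 samples per GPU on a 48 GB card.
if fits_on_gpu 13b 2 48; then
    echo "fits"
else
    echo "does not fit"
fi
```

The total memory of the installed GPU can be passed as the third argument.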

Octavius

  • Image modality only

    cd src
    sh tools/Octavius/train_octavius_slurm.sh <YOUR_PARTITION> <NUM_GPU> \
    config/Octavius/octavius_2d_e4_bs64.yaml octavius_2d_e4_bs64
  • Point cloud modality only

    cd src
    sh tools/Octavius/train_octavius_slurm.sh <YOUR_PARTITION> <NUM_GPU> \
    config/Octavius/octavius_3d_e3_bs64.yaml octavius_3d_e3_bs64
  • Joint image and point cloud modalities

    cd src
    sh tools/Octavius/train_octavius_slurm.sh <YOUR_PARTITION> <NUM_GPU> \
    config/Octavius/octavius_2d+3d_e6_bs64.yaml octavius_2d+3d_e6_bs64
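
In all three commands above, the final argument matches the config file name without its .yaml extension. The source does not state whether this is required, so treat it as an observed convention rather than a rule. The sketch below derives the name from the config path; run_name_from_config is a hypothetical helper, not part of the LAMM tools.

```shell
# Derive the run name from a config path, mirroring the commands above,
# where the last argument equals the config file name without ".yaml".
# run_name_from_config is a hypothetical helper, not part of the LAMM tools.
run_name_from_config() {
    basename "$1" .yaml
}

CONFIG=config/Octavius/octavius_2d+3d_e6_bs64.yaml
RUN_NAME=$(run_name_from_config "$CONFIG")

# Print the command instead of running it, so it can be reviewed first.
echo sh tools/Octavius/train_octavius_slurm.sh "<YOUR_PARTITION>" "<NUM_GPU>" "$CONFIG" "$RUN_NAME"
```

Replace the placeholders with a real partition and GPU count, then drop the leading echo to launch the job.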

Model Zoo

We provide several pretrained LAMM/Octavius checkpoints here:

LAMM Model Zoo

# Training Samples | Vision Encoder | LLM | Training Data | LoRA Rank | Link
98K | CLIP-ViT-L | Vicuna_v0_7B | LAMM-2D daily dialogue & description | 32 | Checkpoints
186K | CLIP-ViT-L | Vicuna_v0_7B | LAMM-2D Instruction Data | 32 | Checkpoints
186K | CLIP-ViT-L | LLaMA2_chat_7B | LAMM-2D Instruction Data | 32 | Checkpoints
98K | CLIP-ViT-L | Vicuna_v0_13B | LAMM-2D daily dialogue & description | 32 | Checkpoints
186K | CLIP-ViT-L | Vicuna_v0_13B | LAMM-2D Instruction Data | 32 | Checkpoints
10K | EPCL-ViT-L | Vicuna_v0_13B | LAMM-3D Instruction Data | 32 | Checkpoints

Octavius Model Zoo

# Samples | Vision Encoder | LLM | Training Data | Link
286K | CLIP-ViT-L | Vicuna_v0_13B | LAMM-2D Instruction Data & COCO-Detection | ckpt
90K | Obj-As-Scene | Vicuna_v0_13B | Scan2Inst | ckpt
376K | CLIP-ViT-L & Obj-As-Scene | LLaMA2_chat_13B | LAMM-2D Instruction Data & COCO-Detection & Scan2Inst | ckpt

Download them and place them in the ckpt directory for quick evaluation.


Sign up to get email updates on LAMM, or email us at openlamm@gmail.com.
© 2024. LAMM