Lab Session 5 - Foundation Model Tutorial

Note

Before you begin, make sure you have downloaded the latest version of the course slides from here, and keep them at hand while doing the lab.

The goal of this lab is to regenerate the embeddings for the chosen modality (computer vision, text, or audio) from Labs 2 and 3, using a pre-trained foundation model and the original dataset.

We have reduced and simplified the datasets to facilitate easy manipulation on your local machine.

Additionally, depending on the modality, some modifications have been made to the original data before generating embeddings with the pre-trained model.

Steps:

  1. Load the dataset.
  2. Load the pre-trained foundation model.
  3. Preprocess the raw data as instructed, if necessary.
  4. Generate the embeddings.
  5. Confirm that the classification performance is similar to what you achieved in Lab 2 (supervised learning) and Lab 3 (unsupervised learning/visualization).
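The steps above can be sketched end to end as follows. This is a minimal offline illustration, not the lab solution: a frozen random projection stands in for the pre-trained foundation model (in the lab you would load an actual model for your modality), and scikit-learn's digits dataset stands in for the course dataset.

```python
# Sketch of the five lab steps. Placeholder choices (labeled below)
# let the script run offline without downloading model weights.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# 1. Load the dataset (placeholder: scikit-learn's digits).
X, y = load_digits(return_X_y=True)

# 2. "Load" the pre-trained model. Placeholder: a frozen random
#    projection; in the lab this is a real foundation model.
rng = np.random.default_rng(0)
W = rng.normal(size=(X.shape[1], 128)) / np.sqrt(X.shape[1])

# 3. Preprocess the raw data (here: scale pixel values to [0, 1]).
X = X / 16.0

# 4. Generate the embeddings with the (stand-in) encoder.
Z = np.maximum(X @ W, 0.0)

# 5. Check classification performance on the embeddings.
Z_tr, Z_te, y_tr, y_te = train_test_split(Z, y, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(Z_tr, y_tr)
print(f"test accuracy: {clf.score(Z_te, y_te):.3f}")
```

With a real foundation model, only step 2 (loading the model) and step 4 (calling its encoder) change; the surrounding load/preprocess/evaluate structure stays the same.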

Note

Feel free to copy, modify, and play around with the code snippets that we provide.

You can now refer to the application pages (see "Resources for Session 5"):

  1. Computer Vision

  2. Audio

  3. Text