Introduction to Artificial Intelligence⚓
Welcome!
All course slides are available on this GitHub repository.
The links to all lab sessions and resources can be found on the left panel.
Outline of the Labs⚓
The module is divided into five Labs, following the topics of the lectures.
- Introduction to key tools and concepts
- Supervised Learning
- Unsupervised Learning
- Deep Learning
- Foundation Models
Labs are designed with several objectives in mind:
- Provide an introduction to the main frameworks in AI (mainly sklearn, PyTorch, and HuggingFace).
- Present basic tutorials about "how to code an AI".
- Showcase the performance of open-source Foundation Models, and encourage you to use them rather than redeveloping your own models.
In that context, Labs will follow a path of increasing difficulty and required autonomy.
You are not expected to hand anything in to the instructors at the end of the Labs; we trust you to complete the assignments. You are free to use any external source for help (StackOverflow, Copilot, ChatGPT, ...), although we strongly recommend that you first try to solve each problem yourself and turn to these sources only for debugging. In particular, Copilot and ChatGPT can most likely solve the Labs easily, but using these tools without first attempting the problems yourself will bring you no pedagogical benefit.
Important Concepts⚓
In this module, we have chosen to focus on foundation models. They will be presented in detail in lesson 5. However, some important messages can be taken away:
- Foundation models are the largest and most recent AI tools (e.g. GPT for text, DINOv2 for images, wav2vec for audio).
- Foundation models are very powerful feature extractors. In other words, such models can extract rich representations from input data and benefit the final task (e.g. image classification).
- Foundation models can be open source and easily integrated into your own development process.
We decided to focus the Labs on using foundation models for these reasons.
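The "extract features, then solve the task" workflow can be sketched in miniature. In the toy below, a fixed linear map stands in for the pretrained foundation model and a nearest-centroid rule stands in for the downstream classifier; in the Labs you will use real models (e.g. from PyTorch or HuggingFace) and real classifiers (e.g. from sklearn) instead. All names and values here are illustrative, not part of any library.

```python
import math

# Toy sketch of "foundation model as feature extractor":
# 1. a (pretend) pretrained model maps raw inputs to embeddings;
# 2. a simple classifier operates on those embeddings.
# In the Labs, step 1 is a real pretrained network, not a fixed linear map.

def extract_features(x):
    """Stand-in for a foundation model: maps a raw 2-D input to a 2-D embedding."""
    return [x[0] + x[1], x[0] - x[1]]

def nearest_centroid(embedding, centroids):
    """Simple downstream classifier: pick the closest class centroid in the latent space."""
    return min(centroids, key=lambda label: math.dist(embedding, centroids[label]))

# Class centroids, here hard-coded directly in the embedding space.
centroids = {"cat": [2.0, 0.0], "dog": [-2.0, 0.0]}

sample = [1.5, 0.5]                      # raw input data
emb = extract_features(sample)           # its embedding: [2.0, 1.0]
print(nearest_centroid(emb, centroids))  # prints "cat"
```

The point of the sketch is the division of labour: the (frozen) feature extractor does the heavy lifting, so the final task only needs a very simple model on top.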
Key Definitions⚓
- Vectors, Matrices, Tensors: Representations of data. If you do not feel at ease with these linear algebra concepts, we invite you to study this guide.
- Latent space: A vector space obtained by applying a foundation model. Generally, we expect that data projected into this space will represent concepts in the data (e.g. the concept of a cat in an image) better than the original representation does. (*In general, any deep learning model admits a latent space, but in this module we restrict latent spaces to foundation models to make things easier to understand.)
- Embeddings: Projections of the data into a latent space, via a foundation model. (*In general, any deep learning model can generate embeddings, but in this module we restrict latent spaces and embeddings to foundation models to make things easier to understand.)
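As a toy illustration of these definitions (not a real foundation model, whose mapping is a large pretrained network rather than a fixed matrix), the sketch below projects a 3-dimensional data vector into a 2-dimensional "latent space" with a hand-written matrix-vector product; the resulting vector plays the role of an embedding:

```python
# Toy illustration: the "model" here is just a fixed 2x3 projection matrix.

def project(matrix, vector):
    """Matrix-vector product: project a vector through each row of the matrix."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# "Foundation model": maps 3-D input data into a 2-D latent space.
model = [
    [1.0, 0.0, 1.0],
    [0.0, 1.0, -1.0],
]

data_point = [2.0, 3.0, 1.0]            # a vector representing one input sample
embedding = project(model, data_point)  # its embedding in the latent space

print(embedding)  # prints [3.0, 2.0]
```

In practice the projection is learned and highly non-linear, but the shape of the operation is the same: input vector in, lower-level embedding vector out.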
Feel free to ask us any questions in the Discord group!
Ethical Considerations⚓
When discussing Artificial Intelligence (AI), several important ethical considerations must be taken into account:
- Environmental Impact: AI algorithms require computing resources to run, which can vary significantly depending on the model's complexity. While small models, such as Principal Component Analysis (PCA), have a negligible computational footprint, large foundation models demand extensive processing power. Training and deploying such models, especially in cloud-based environments, contribute to energy consumption and carbon emissions. Even personal usage of AI models can have a non-trivial impact, as highlighted in this paper, which analyzes the computing power required for different tasks and models.
- Data Ethics and Intellectual Property: Deep learning models, particularly foundation models, rely on vast amounts of training data. This data may be sourced with or without the explicit consent of its original creators, which raises significant ethical and legal questions: Should AI models be allowed to train on publicly available data without permission? Who holds the intellectual property rights to AI-generated content: its users, developers, or the original data contributors? Regulations and industry standards on these matters are still evolving, making this an active area of debate.
- Bias and Fairness: AI models inherit biases present in their training data, which can lead to unfair or discriminatory outcomes. If the data used to train a model reflects societal biases, whether related to gender, race, or socioeconomic status, the AI may perpetuate or even amplify these biases. Addressing this requires careful dataset curation, transparency in AI development, and robust fairness evaluations.
- Accountability and Decision-Making: As AI systems increasingly assist in or automate decision-making in critical areas such as healthcare, finance, and law enforcement, questions of accountability arise. Who is responsible when an AI-driven decision leads to harm? Ensuring that AI remains explainable, auditable, and subject to human oversight is crucial for maintaining trust and accountability.
Keeping these ethical dimensions in mind throughout your study of AI will contribute to a more responsible and informed approach to AI development and deployment.