---
layout: page
title: Assignment 3
mathjax: true
permalink: /assignments2025/assignment3/
---

<span style="color:red">This assignment is due on **Tuesday, May 28, 2024** at 11:59pm PST.</span>

Starter code containing Colab notebooks can be [downloaded here]({{site.hw_3_colab}}).

- [Setup](#setup)
- [Goals](#goals)
- [Q1: Image Captioning with Vanilla RNNs](#q1-image-captioning-with-vanilla-rnns)
- [Q2: Image Captioning with Transformers](#q2-image-captioning-with-transformers)
- [Q3: Generative Adversarial Networks](#q3-generative-adversarial-networks)
- [Q4: Self-Supervised Learning for Image Classification](#q4-self-supervised-learning-for-image-classification)
- [Extra Credit: Image Captioning with LSTMs](#extra-credit-image-captioning-with-lstms-5-points)
- [Submitting your work](#submitting-your-work)

### Setup

Please familiarize yourself with the [recommended workflow]({{site.baseurl}}/setup-instructions/#working-remotely-on-google-colaboratory) before starting the assignment. You should also watch the Colab walkthrough tutorial below.

<iframe style="display: block; margin: auto;" width="560" height="315" src="https://www.youtube.com/embed/DsGd2e9JNH4" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

**Note**. Ensure you periodically save your notebook (`File -> Save`) so that you don't lose your progress if you step away from the assignment and the Colab VM disconnects.

While we don't officially support local development, we've added a `requirements.txt` file that you can use to set up a virtual environment.

Once you have completed all Colab notebooks **except `collect_submission.ipynb`**, proceed to the [submission instructions](#submitting-your-work).

### Goals

In this assignment, you will implement recurrent and Transformer language networks and apply them to image captioning on the COCO dataset. Then you will train a Generative Adversarial Network to generate images that resemble samples from a training dataset. Finally, you will be introduced to self-supervised learning, which automatically learns visual representations from an unlabeled dataset.

The goals of this assignment are as follows:

- Understand and implement RNN and Transformer networks, and combine them with convolutional networks for image captioning.
- Understand how to train and implement a Generative Adversarial Network (GAN) to produce images that resemble samples from a dataset.
- Understand how to leverage self-supervised learning techniques to help with image classification tasks.

**You will use PyTorch for the majority of this homework.**

### Q1: Image Captioning with Vanilla RNNs

The notebook `RNN_Captioning.ipynb` will walk you through implementing vanilla recurrent neural networks and applying them to image captioning on COCO.
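
At the heart of this notebook is the recurrence that maps the previous hidden state and the current input to the next hidden state. As a rough preview, here is a minimal PyTorch sketch of a single vanilla RNN timestep (the function name and shapes are illustrative, not the notebook's exact API):

```python
import torch

def rnn_step(x, prev_h, Wx, Wh, b):
    """One vanilla RNN timestep: h_t = tanh(x @ Wx + h_{t-1} @ Wh + b).

    Illustrative shapes (not the assignment's API):
    x:      (N, D)  input features at this timestep
    prev_h: (N, H)  previous hidden state
    Wx:     (D, H)  input-to-hidden weights
    Wh:     (H, H)  hidden-to-hidden weights
    b:      (H,)    bias
    """
    return torch.tanh(x @ Wx + prev_h @ Wh + b)

# Toy usage with random inputs, just to show the shapes.
N, D, H = 4, 16, 32
x, h0 = torch.randn(N, D), torch.zeros(N, H)
Wx, Wh, b = torch.randn(D, H), torch.randn(H, H), torch.zeros(H)
h1 = rnn_step(x, h0, Wx, Wh, b)  # (N, H)
```

In the captioning model, a step like this is unrolled over the caption tokens, with CNN image features used to initialize the hidden state.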

### Q2: Image Captioning with Transformers

The notebook `Transformer_Captioning.ipynb` will walk you through implementing a Transformer model and applying it to image captioning on COCO.
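
The core building block here is scaled dot-product attention, $\mathrm{Attention}(Q, K, V) = \mathrm{softmax}(QK^T / \sqrt{d_k})\,V$. Below is a minimal self-contained sketch; the function name, mask convention, and shapes are illustrative assumptions, not the notebook's interface:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    """softmax(q @ k^T / sqrt(d_k)) @ v, with optional masking.

    q, k, v: (N, T, d_k); mask (illustrative convention): broadcastable
    to (N, T, T), with 0 marking positions that may not be attended to.
    """
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)         # (N, T, T)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return scores.softmax(dim=-1) @ v                         # (N, T, d_k)

# A causal (lower-triangular) mask, as used when decoding captions
# left to right so each position only attends to earlier tokens.
N, T, d = 2, 5, 8
q = k = v = torch.randn(N, T, d)
causal = torch.tril(torch.ones(T, T))
out = scaled_dot_product_attention(q, k, v, mask=causal)
```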

### Q3: Generative Adversarial Networks

In the notebook `Generative_Adversarial_Networks.ipynb`, you will learn how to generate images that match a training dataset, and then use such models to improve classifier performance when training on a large amount of unlabeled data and a small amount of labeled data. **When first opening the notebook, go to `Runtime > Change runtime type` and set `Hardware accelerator` to `GPU`.**
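
As a refresher on the setup: a GAN pits a discriminator, trained to tell real images from generated ones, against a generator trained to fool it. One common formulation writes both objectives as binary cross-entropy on the discriminator's logits; the sketch below is illustrative and may differ from the notebook's exact loss definitions:

```python
import torch
import torch.nn.functional as F

def discriminator_loss(logits_real, logits_fake):
    """Push D(real) toward 1 and D(G(z)) toward 0 (inputs are logits)."""
    return (F.binary_cross_entropy_with_logits(
                logits_real, torch.ones_like(logits_real))
            + F.binary_cross_entropy_with_logits(
                logits_fake, torch.zeros_like(logits_fake)))

def generator_loss(logits_fake):
    """Non-saturating generator loss: push D(G(z)) toward 1."""
    return F.binary_cross_entropy_with_logits(
        logits_fake, torch.ones_like(logits_fake))

# Toy usage with random "logits" standing in for discriminator outputs.
logits_real, logits_fake = torch.randn(8, 1), torch.randn(8, 1)
d_loss = discriminator_loss(logits_real, logits_fake)
g_loss = generator_loss(logits_fake)
```

In training, the two losses are minimized in alternation, typically with separate optimizers for the discriminator and the generator.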

### Q4: Self-Supervised Learning for Image Classification

In the notebook `Self_Supervised_Learning.ipynb`, you will learn how to leverage self-supervised pretraining to obtain better performance on image classification tasks. **When first opening the notebook, go to `Runtime > Change runtime type` and set `Hardware accelerator` to `GPU`.**
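
One widely used self-supervised recipe is contrastive learning in the style of SimCLR: an encoder is trained so that two augmented views of the same image map to nearby representations, while all other images in the batch are pushed away. Assuming a contrastive approach of this kind (the notebook's exact method may differ), a minimal sketch of the NT-Xent loss looks like this:

```python
import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.5):
    """SimCLR-style NT-Xent loss (illustrative implementation).

    z1, z2: (N, D) projections of two augmented views of the same N
    images; (z1[i], z2[i]) are positive pairs, and everything else in
    the batch serves as a negative.
    """
    N = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)  # (2N, D), unit norm
    sim = z @ z.t() / temperature                       # cosine similarities
    sim.fill_diagonal_(float("-inf"))                   # exclude self-pairs
    # Row i's positive sits at index (i + N) mod 2N.
    targets = torch.arange(2 * N, device=z.device).roll(N)
    return F.cross_entropy(sim, targets)

# Toy usage: in practice z1 and z2 come from encoding two random
# augmentations of the same image batch.
z1, z2 = torch.randn(8, 128), torch.randn(8, 128)
loss = nt_xent_loss(z1, z2)
```

After pretraining with such a loss on unlabeled data, the learned encoder features are reused for the downstream classification task.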

### Extra Credit: Image Captioning with LSTMs (5 points)

The notebook `LSTM_Captioning.ipynb` will walk you through implementing Long Short-Term Memory (LSTM) RNNs and applying them to image captioning on COCO.
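
Compared to the vanilla RNN step above, an LSTM step adds a cell state and input/forget/output gates that control what is written, kept, and exposed. Here is a minimal PyTorch sketch, assuming the common layout in which the four gate weight blocks are concatenated along one axis (an illustrative choice, not necessarily the notebook's):

```python
import torch

def lstm_step(x, prev_h, prev_c, Wx, Wh, b):
    """One LSTM timestep. Illustrative shapes:

    x:      (N, D)   input at this timestep
    prev_h: (N, H)   previous hidden state
    prev_c: (N, H)   previous cell state
    Wx:     (D, 4H)  input-to-hidden weights, four gate blocks stacked
    Wh:     (H, 4H)  hidden-to-hidden weights, same layout
    b:      (4H,)    bias
    """
    H = prev_h.size(1)
    a = x @ Wx + prev_h @ Wh + b          # (N, 4H) pre-activations
    i = torch.sigmoid(a[:, :H])           # input gate
    f = torch.sigmoid(a[:, H:2 * H])      # forget gate
    o = torch.sigmoid(a[:, 2 * H:3 * H])  # output gate
    g = torch.tanh(a[:, 3 * H:])          # candidate cell update
    next_c = f * prev_c + i * g           # keep some old state, write some new
    next_h = o * torch.tanh(next_c)       # expose a gated view of the cell
    return next_h, next_c
```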

### Submitting your work

**Important**. Please make sure that the submitted notebooks have been run and the cell outputs are visible.

Once you have completed all notebooks and filled out the necessary code, follow the instructions below to submit your work:

**1.** Open `collect_submission.ipynb` in Colab and execute the notebook cells.

This notebook/script will:

* Generate a zip file of your code (`.py` and `.ipynb`) called `a3_code_submission.zip`.
* Convert all notebooks into a single PDF file called `a3_inline_submission.pdf`.

If this step was successful, you should see the following display message:

`### Done! Please submit a3_code_submission.zip and a3_inline_submission.pdf to Gradescope. ###`

**2.** Submit the PDF and the zip file to Gradescope.

Remember to download `a3_code_submission.zip` and `a3_inline_submission.pdf` locally before submitting to Gradescope.