Bird recognition in the city of Peacetopia (case study) >> Structuring Machine Learning Projects
*Please Do Not Click On The Options.
* If You Click Mistakenly Then Please Refresh The Page To Get The Right Answers.
Bird recognition in the city of Peacetopia (case study)
1. Problem Statement
This example is adapted from a real production application, but with details disguised to protect confidentiality.
You are a famous researcher in the City of Peacetopia. The people of Peacetopia have a common characteristic: they are afraid of birds. To save them, you have to build an algorithm that will detect any bird flying over Peacetopia and alert the population.
The City Council gives you a dataset of 10,000,000 images of the sky above Peacetopia, taken from the city’s security cameras. They are labelled:
- y = 0: There is no bird on the image
- y = 1: There is a bird on the image
Your goal is to build an algorithm able to classify new images taken by security cameras from Peacetopia.
There are a lot of decisions to make:
- What is the evaluation metric?
- How do you structure your data into train/dev/test sets?
Metric of success
The City Council tells you that they want an algorithm that
- Has high accuracy
- Runs quickly and takes only a short time to classify a new image.
- Can fit in a small amount of memory, so that it can run in a small processor that the city will attach to many different security cameras.
Note: Having three evaluation metrics makes it harder for you to quickly choose between two different algorithms, and will slow down the speed with which your team can iterate. True/False?
- “We need an algorithm that can let us know a bird is flying over Peacetopia as accurately as possible.”
- “We want the trained model to take no more than 10sec to classify a new image.”
- “We want the model to fit in 10MB of memory.”
If you had the three following models, which one would you choose?
Test Accuracy | Runtime | Memory size |
97% | 1 sec | 3MB |
Test Accuracy | Runtime | Memory size |
99% | 13 sec | 9MB |
Test Accuracy | Runtime | Memory size |
97% | 3 sec | 2MB |
Test Accuracy | Runtime | Memory size |
98% | 9 sec | 9MB |
4. Structuring your data
Before implementing your algorithm, you need to split your data into train/dev/test sets. Which of these do you think is the best choice?
Train | Dev | Test |
6,000,000 | 3,000,000 | 1,000,000 |
Train | Dev | Test |
6,000,000 | 1,000,000 | 3,000,000 |
Train | Dev | Test |
3,333,334 | 3,333,333 | 3,333,333 |
Train | Dev | Test |
9,500,000 | 250,000 | 250,000 |
Notice that adding this additional data to the training set will make the distribution of the training set different from the distributions of the dev and test sets.
Is the following statement true or false?
“You should not add the citizens’ data to the training set, because if the training distribution is different from the dev and test sets, then this will not allow the model to perform well on the test set.”
Training set error | 4.0% |
Dev set error | 4.5% |
This suggests that one good avenue for improving performance is to train a bigger network so as to drive down the 4.0% training error. Do you agree?
Bird watching expert #1 | 0.3% error |
Bird watching expert #2 | 0.5% error |
Normal person #1 (not a bird watching expert) | 1.0% error |
Normal person #2 (not a bird watching expert) | 1.2% error |
If your goal is to have “human-level performance” be a proxy (or estimate) for Bayes error, how would you define “human-level performance”?
Human-level performance | 0.1% |
Training set error | 2.0% |
Dev set error | 2.1% |
Based on the evidence you have, which two of the following four options seem the most promising to try? (Check two options.)
Human-level performance | 0.1% |
Training set error | 2.0% |
Dev set error | 2.1% |
Test set error | 7.0% |
What does this mean? (Check the two best options.)
Human-level performance | 0.10% |
Training set error | 0.05% |
Dev set error | 0.05% |
What can you conclude? (Check all that apply.)
You have only 1,000 images of the new species of bird. The city expects a better system from you within the next 3 months. Which of these should you do first?