Introduction to Data Analytics for Business – Coursera
4.7 Stars (2,037 ratings)
Instructor: David Torgerson
Week-1 Quiz Answer
Week-2 Quiz Answer
Week-3 Quiz Answer
1.In this quiz, you’ll be writing queries based on the following database. Follow this link to access a larger picture of the database.
Note: your SQL code entries will not be saved between quiz attempts! Please copy paste them somewhere so you don’t have to retype the entire code when you take the quiz again.
How many aircrafts are there in the PLANES table?
Input |
SELECT count(aircraft_type) from planes |
Output |
+———————-+ | count(aircraft_type) | +———————-+ | 25 | +———————-+ |
Refer to the following video if you need a refresher: video 1.
- 25
- 50
- 562
- 10961
2. Write a query that provides a list of all planes that have a seat count of 100 or more, ordered from lowest to highest number of seats.
What is the TAIL_NUMBER of the plane with the second lowest number of seats in that list?
Input |
SELECT * FROM PLANES WHERE SEAT_COUNT >= 100 ORDER BY SEAT_COUNT |
Output |
+————-+———+—————-+————+————+ | TAIL_NUMBER | AIRLINE | AIRCRAFT_TYPE | FLEET_TYPE | SEAT_COUNT | +————-+———+—————-+————+————+ | N132AA | SKY | A319-114 | 319-100 | 120 | | N120EE | SKY | A319-132 | 319-100 | 124 | | N117BB | SKY | A319-112 | 319-100 | 128 | | N118CC | ALN | A319-115 | 319-100 | 128 | | N119DD | GLB | A319-115 WL | 319-100 | 128 | | N127AA | ALN | DC-9-82(MD-82) | MD-82 | 140 | | N128BB | GLB | DC-9-83(MD-83) | MD-83 | 140 | | N111AA | SKY | 737-823 | 737-800 | 150 | | N112BB | ALN | 737-823 WL | 737-800 | 150 | | N113CC | GLB | 737-832 | 737-800 | 150 | | N121AA | ALN | A320-214 | 320-200 | 150 | | N122BB | GLB | A320-231 | 320-200 | 150 | | N123CC | SKY | A320-232 | 320-200 | 150 | | N115EE | ALN | 767-223 | 767-200 | 168 | | N116AA | GLB | 767-223 ER | 767-200 | 168 | | N124DD | ALN | A321-211 | 321-200 | 187 | | N125EE | GLB | A321-231 | 321-200 | 187 | | N114DD | SKY | 757-223 | 757-200 | 188 | +————-+———+—————-+————+————+ |
Refer to the following video if you need a refresher: video 2.
- N114DD
- N132AA
- N120EE
- N126AA
3. Write a query that provides the number of flights flown by each aircraft.
Which aircraft flew the most flights?
Input |
SELECT TAIL_NUMBER, COUNT(*) AS NUM_FLIGHTS FROM FLIGHTS GROUP BY TAIL_NUMBER ORDER BY NUM_FLIGHTS DESC |
Output |
+————-+————-+ | TAIL_NUMBER | NUM_FLIGHTS | +————-+————-+ | N111AA | 5 | | N116AA | 4 | | N118CC | 4 | | N122BB | 4 | | N125EE | 4 | | N119DD | 3 | | N124DD | 3 | | N132AA | 3 | | N134CC | 3 | | N117BB | 2 | | N129CC | 2 | | N130DD | 2 | | N135DD | 2 | | N112BB | 1 | | N113CC | 1 | | N115EE | 1 | | N120EE | 1 | | N123CC | 1 | | N126AA | 1 | | N127AA | 1 | | N128BB | 1 | | N133BB | 1 | +————-+————-+ |
Refer to the following video if you need a refresher: video 2
-
N125EE
-
N135DD
-
N120EE
-
N111AA
4. Write a query that provides a list of planes that flew more than 600 passengers across all flights.
How many planes are in that list?
Refer to the following video if you need a refresher: video 2.
Input |
SELECT TAIL_NUMBER, SUM(PASSENGER_COUNT) AS TOT_PASSENGERS FROM FLIGHTS GROUP BY TAIL_NUMBER HAVING TOT_PASSENGERS > 600 |
Output |
+————-+—————-+ | TAIL_NUMBER | TOT_PASSENGERS | +————-+—————-+ | N111AA | 675 | | N116AA | 608 | | N125EE | 659 | +————-+—————-+ |
-
5
-
22
-
3
-
7
5. Write a query that provides the total number of flights by country.
How many flights originated in the United States (Country US)?
Refer to the following video if you need a refresher: video 3.
Input |
SELECT COUNTRY, COUNT(*) AS NUM_FLIGHTS FROM FLIGHTS a LEFT JOIN AIRPORTS b ON a.DEPARTURE_AIRPORT = b.AIRPORT |
Output |
+———+————-+ | COUNTRY | NUM_FLIGHTS | +———+————-+ | IT | 50 | +———+————-+ |
-
34
-
23
-
12
-
2
6. Write a query that provides the total number of flights by regionality.
Which regionality has the second highest number of flights?
Refer to the following video if you need a refresher: video 3.
Input |
SELECT REGIONALITY, COUNT(*) AS NUM_FLIGHTS FROM FLIGHTS a LEFT JOIN CITY_PAIRS b ON a.DEPARTURE_AIRPORT = b.DEPARTURE_AIRPORT AND a.ARRIVAL_AIRPORT = b.ARRIVAL_AIRPORT GROUP BY REGIONALITY |
Output |
+————-+————-+ | REGIONALITY | NUM_FLIGHTS | +————-+————-+ | NA-NA | 1 | | NA-ROW | 1 | | NA-US | 1 | | ROW-ROW | 10 | | ROW-US | 3 | | US-NA | 1 | | US-ROW | 4 | | US-US | 29 | +————-+————-+ |
-
US-US
-
US-ROW
-
ROW-NA
-
ROW-ROW
7. How many CITY_PAIRS are there which depart from one of the following airports?
KLAX, KDEN, KORD, KDET, KLGA
Hint: use the IN operator!
Refer to the following video if you need a refresher: video 5.
Input |
SELECT COUNT(*) FROM CITY_PAIRS WHERE DEPARTURE_AIRPORT IN (‘KLAX’,’KDEN’,’KORD’,’KDET’,’KLGA’) |
Output |
+———-+ | COUNT(*) | +———-+ | 722 | +———-+ |
-
722
-
803
-
245
-
104
8. How many airports are missing elevation values?
Refer to the following video if you need a refresher: video 5.
Input |
SELECT COUNT(*) FROM AIRPORTS WHERE ELEVATION IS NULL |
Output |
+———-+ | COUNT(*) | +———-+ | 6 | +———-+ |
-
10
-
8
-
6
-
4
9. What flight number had the lowest passenger count (try using a subquery if you can!)?
Refer to the following video if you need a refresher: video 6.
Input |
SELECT FLIGHT_NUMBER FROM FLIGHTS WHERE PASSENGER_COUNT IN (SELECT MIN(PASSENGER_COUNT) FROM FLIGHTS) |
Output |
+—————+ | FLIGHT_NUMBER | +—————+ | ALN626 | +—————+ |
-
ALN626
-
ALN745
-
GLB805
-
SKY996
10. What is the average distance flown by SKY Airline flights (nearest mile)?
Input |
SELECT AVG (DISTANCE) AS AVG_DISTANCE FROM FLIGHTS a LEFT JOIN PLANES b ON a.TAIL_NUMBER = b.TAIL_NUMBER LEFT JOIN CITY_PAIRS c ON a.DEPARTURE_AIRPORT = c.DEPARTURE_AIRPORT AND a.ARRIVAL_AIRPORT = c.ARRIVAL_AIRPORT WHERE AIRLINE = ‘SKY’ |
Output |
+—————+ | AVG_DISTANCE | +—————+ | 1767.96389687 | +—————+ |
Refer to the following video if you need a refresher: video 6.
-
1474
-
1570
-
1577
-
1768
Week-4 Quiz Answers
-
IT Infrastructure Resource
-
Application Developer
-
Data Architect
-
Database Administrator
-
ETL Developer
-
BI Developer
-
Database Analyst
-
Data Analyst
-
Business Analyst
-
Modeler
-
Data Scientist
-
IT Infrastructure Resource
-
Application Developer
-
Data Architect
-
Database Administrator
-
ETL Developer
-
BI Developer
-
Database Analyst
-
Data Analyst
-
Business Analyst
-
Modeler
-
Data Scientist
-
IT Infrastructure Resource
-
Application Developer
-
Data Architect
-
Database Administrator
-
ETL Developer
-
BI Developer
-
Database Analyst
-
Data Analyst
-
Business Analyst
-
Modeler
-
Data Scientist
-
IT Infrastructure Resource
-
Application Developer
-
Data Architect
-
Database Administrator
-
ETL Developer
-
BI Developer
-
Database Analyst
-
Data Analyst
-
Business Analyst
-
Modeler
-
Data Scientist
-
IT Infrastructure Resource
-
Application Developer
-
Data Architect
-
Database Administrator
-
ETL Developer
-
BI Developer
-
Database Analyst
-
Data Analyst
-
Business Analyst
-
Modeler
-
Data Scientist
-
IT Infrastructure Resource
-
Application Developer
-
Data Architect
-
Database Administrator
-
ETL Developer
-
BI Developer
-
Database Analyst
-
Data Analyst
-
Business Analyst
-
Modeler
-
Data Scientist
-
Allocated Model
-
Centralized Model
-
Distributed Model
-
Coordinated Model
-
Allocated Model
-
Centralized Model
-
Distributed Model
-
Coordinated Model
-
Allocated Model
-
Centralized Model
-
Distributed Model
-
Coordinated Model
-
SOX Compliance
-
Metadata Management
-
Master Data Management
-
Data Governance
-
Data Stewardship
-
Data Steward
-
Data Architect
-
Data Governance Council
-
SOX Compliance Auditor
-
ETL Developer
-
Policy Standards
-
Legal Standards
-
Good Judgement
-
Ethical Standards
-
Policy Standards
-
Legal Standards
-
Good Judgement
-
Ethical Standards
-
Completeness / Uniqueness
-
Accuracy / Consistency
-
Conformance / Validity
-
Timeliness
-
Provenance
-
Completeness / Uniqueness
-
Accuracy / Consistency
-
Conformance / Validity
-
Timeliness
-
Provenance
-
Completeness / Uniqueness
-
Accuracy / Consistency
-
Conformance / Validity
-
Timeliness
-
Provenance
-
Completeness / Uniqueness
-
Accuracy / Consistency
-
Conformance / Validity
-
Timeliness
-
Provenance
Peer-graded Assignment: Final Course Assignment
Part 1: Conceptual business model. Construct a conceptual business model for an industry or business that you are familiar with or have interest in. Visually it should be similar to the one we illustrated in Module 1, Video 2.
- Your model should represent at least 10 ideas
- It should visually represent one to one, many to one, or many to many relationships among ideas
Part 2: Relational data model. Take a subset of the ideas from the conceptual model you constructed in Part 1 and design a simple relationship model similar to the ones we discussed in Module 2, Video 4
- Your model should have at least 5 tables
- You should include at least 20 attributes, or fields, in your model (20 total across all tables, not per table)
- Your model should be normalized
- Identify the primary key in each table, and state whether it is a natural or surrogate key
- For each relationship between tables, identify any foreign keys needed to define the relationship
- For each table, identify what type of system or systems you think the data might come from, like those we discussed in Module 1, Video 6.
Part 3: SQL queries. Using the data model you constructed in Part 2, come up with two data extracts you think would be interesting, then write SQL queries to provide each one.
- For each query, state what data you are trying to get and why it would be interesting
- Provide the SQL query using the commands and syntax we learned in Module 3
- For maximum credit, at least one of your queries should involve a join across two or more tables.
– Query to find the distribution of services used by customers in different regions. It is interesting because this info could lead to changes in service offerings in some regions if any of the services are found to be severely underused in those regions , and more business can be generated as the result.
Answer:
SELECT r.reg_name , p.service_id , COUNT(p.purch_id) as cnt FROM Customers c LEFT JOIN Purchases p ON c.cust_id = p.cust_id LEFT JOIN Regions r ON c.region_id = r.region_id GROUP BY r.reg_name, p.service_id ORDER BY cnt; — Query to find out year-to-date revenue by customer for service #2. It is interesting because this info can be used for a targeted marketing campaign to generate more business from the customers currently under-using service #2. SELECT t.cust_id , SUM(t.amount) as revenue FROM Transactions t LEFT JOIN Purchases p ON t.purch_id = p.purch_id WHERE p.purch_date BETWEEN ‘2017-01-01’ AND DATE.NOW () AND p.service_id = 2 GROUP BY t.cust_id ORDER BY revenue; |
Part 4 : Sensitive data and data quality issues. Consider the data privacy and data quality implications of the data model you constructed in Part 2
- Identify any fields you think might be PII, CFI, CPNI, or PHI as we defined in Module 4, Video 4
- What data elements in your model will present the most significant data quality challenges? Explain your reasoning.
CREATE TABLE customers ( cust_id INT PRIMARY KEY AUTO_INCREMENT , last_name VARCHAR(30) , first_name VARCHAR(30) , region_id FOREIGN KEY );
CREATE TABLE purchases ( purch_id INT PRIMARY KEY AUTO_INCREMENT , purch_date DATE , num_items INT , service_id FOREIGN KEY , cust_id FOREIGN KEY , seller , status_id FOREIGN KEY , transaction_id FOREIGN KEY , package_id FOREIGN KEY ); CREATE TABLE items ( item_id INT PRIMARY KEY , purch_id FOREIGN KEY , cust_id FOREIGN KEY );
CREATE TABLE package ( package_id , item_id , carrier , PRIMARY KEY (package_id, item_id) );
CREATE TABLE transactions ( transaction_id PRIMARY KEY , t_date DATE , type , amount , cust_id , payment_method , purch_id ); |
Related Questions & Answers:
- A data analyst finishes analyzing data for a marketing project. The results are clear, so they present findings and recommendations to the client. What should they have done before that presentation? 7.A data analyst finishes analyzing data for a marketing project. The results are clear, so they present findings and recommendations ... Read more...
- An airport wants to make its luggage-handling process faster and simpler for travelers. A data analyst examines and evaluates how the process works currently in order to achieve the goal of a more efficient process. What methodology do they use? 7. An airport wants to make its luggage-handling process faster and simpler for travelers. A data analyst examines and evaluates ... Read more...
- When writing a query, what word does a data analyst use to indicate the table from which the data will be retrieved? 5.When writing a query, what word does a data analyst use to indicate the table from which the data will ... Read more...
- A magazine wants to understand why its subscribers have been increasing. A data analyst could help answer that question with a report that predicts the result of a half-price sale on future subscription rates. 1.A magazine wants to understand why its subscribers have been increasing. A data analyst could help answer that question with ... Read more...
- You’ve reached the share phase of the data analysis process. It involves creating a data visualization to highlight the Splashtastic sales insights you’ve discovered. 5.Scenario 1 continued You’ve reached the share phase of the data analysis process. It involves creating a data visualization to ... Read more...
- The share phase of the data analysis process typically involves which of the following activities? Select all that apply. 2. The share phase of the data analysis process typically involves which of the following activities? Select all that apply. ... Read more...