代做CSE 414 - Data Management Homework 2 | Basic SQL Queries代写留学生SQL语言

Homework 2 | Basic SQL Queries

CSE 414 - Data Management

Objectives: To create and import databases and to practice simple SQL queries using SQLite.

Assignment tools: SQLite 3, the flights dataset hosted here.

What to turn in: create-tables.sql, import-tables.sql, hw2-q1.sql, hw2-q2.sql, etc (see below). You should compose these files in a code editor like Sublime Text (or your favorite IDE).

Where to turn in: Gradescope

Assignment Details

In this homework, you will write several SQL queries on a relational flights database. The data in this database is abridged from the Bureau of Transportation Statistics The database consists of  four tables regarding a subset of flights that took place in 2015. The schema you should use is as follows. Be sure to use exactly these column names in this order.

FLIGHTS (fid int,

month_idint,        -- 1-12

day of month int,    -- 1-31

day of week_idint,  -- 1-7, 1 = Monday, 2 = Tuesday, etc

carrier_id varchar(7),

flight_num int,

origin_city varchar(34),

origin_state varchar(47),

dest_city varchar(34),

dest_state varchar(46),

departure_delay int, -- in mins

taxi_out int,        -- in mins

arrival_delay int,   -- in mins

canceled int,        -- 1 means canceled

actual_time int,     -- in mins

distance int,        -- in miles

capacity int,

price int            -- in $

)

CARRIERS (cid varchar(7), name varchar(83))

MONTHS (mid int, month varchar(9))

WEEKDAYS (did int,day of week varchar(9))

In addition, make sure you impose the following constraints to the tables above:

The primary key of the FLIGHTS table is fid.

●   The primary keys for the other tables are cid, mid, and did respectively. Other than these,do not assume any other attribute(s) is a key/unique across tuples.

●    Flights.carrier_id references Carriers.cid

●    Flights.month_id references Months.mid

●    Flights.day_of_week_id references Weekdays.did

We provide the flights database as a set of plain-textdata files in the linked .zip archive. Each file in this archive contains all the rows for the named table, one row per line.

In this homework, you need to do two things:

1.   import the flights dataset into SQLite

2.   run SQL queries to answer a set of questions about the data.

IMPORTING THE FLIGHTS DATABASE (20 points)

Currently, SQLite does not enforce foreign keys by default. To enable foreign keys use the following as the first command in your create-tables.sql file.

PRAGMA foreign_keys=ON;

To import the flights database into SQLite, you will need to run sqlite3 with a new database file.  For example sqlite3 hw2.db. Then you can run CREATE TABLE statements to create the tables while specifying all key constraints as described above:

CREATE TABLE table_name ( ... );

Then, you can use the SQLite .import command to read data from each text file into itstable after setting the input data to be in CSV (comma separated value) form.

.mode csv

.import filename tablename

See examples of .import statements in the SQLite documentation or sqlite3's help online for details. Depending on where you downloaded and extracted the data files, your import

statement might look something like ".import /Users/maas/Downloads/filename.csv tablename". On most operating systems you can find the file path by right-clicking the file and looking at the  properties.

Put all the code for creating your tables into a file called create-tables.sql and all the code for importing the data into these tables into a separate file called import-tables.sql . If done correctly, you should be able to open up a new db file in sqlite and setup the database using these two commands:

.read create-tables.sql

.read import-tables.sql

WRITING SQL QUERIES (80 points, 10 points each)

For each question below, write a single SQL query to answer that question. Put each of your queries in a separate .sql file, i.e., hw2-q1.sql, hw2-q2.sql, etc.

Important points before starting:

●    Like in HW 1, the code in your .sql files must be valid SQL. If running the file causes errors we will subtract points.

●   Your answer should NOT contain any subqueries. In HW 3 we will use subqueries, but for this homework you shouldn’t use them.

●    Make sure you name the output columns as indicated. Do not change the output column names/return more or fewer columns.

●    If a query uses a GROUP BY clause, make sure that all attributes in your SELECT

clause for that query are either group by attributes or contained in an aggregate function. SQLite will let you select other attributes, but that is wrong as we discussed in lecture. Other database systems would reject the query and we will subtract points for this mistake.

Generally the boolean filters in your queries should correspond to the English

descriptions. For example if a question asks you to find flights on a Tuesday, your query should test day_of_week = ‘Tuesday’ . It is not correct to instead test did = 2, as this isn’t the description in the problem statement. The reasoning is that a database user doesn’t know that Tuesday has did = 2, they need to join Weekdays to Flights to filter on particular weekday strings. This rule also applies for filters over carrier names, months, etc.

●   A tip for solving these problems is to think about FROM clause first. Which tables do you need to join, and what attributes do you need to compute? If you think of the acronym FWGHOS we learned in class, it might help you compose your query.

In the following questions below flights include canceled flights as well, unless otherwise noted. Also, when asked to output times you can report them in minutes.

1.   (10 points) List the distinct flight numbers of all flights from Seattle to Boston by Alaska Airlines Inc. on Mondays. Also notice that, in the database, the city names include the   state, so Seattle appears as ‘Seattle WA’ . Please use the flight_num column instead of fid. Name the output column flight_num.

[Hint: Output relation cardinality: 3 rows]

2.   (10 points) Find all itineraries from Seattle to Boston on July 15th. Search only for

itineraries that have one stop (i.e., flight 1: Seattle -> [somewhere], flight2: [somewhere] - > Boston). Both flights must depart on the same date and must be with the same carrier.

It's fine if the landing date is different from the departing date (in the case of an overnight flight).

The total flight time (actual_time) of the entire itinerary should be fewer than 7 hours (but notice that actual_time is in minutes). For each itinerary, the query should return the name of the carrier,the first flight number, the origin and destination of that first flight, the flight time, the second flight number, the origin and destination of the second flight, the second flight time, and finally the total flight time. Only count flight times here; do not include any layover time.

Name the output columns name (as in the name of the carrier), f1_flight_num,

f1_origin_city, f1_dest_city, f1_actual_time, f2_flight_num, f2_origin_city,

f2_dest_city, f2_actual_time, and actual_time as the total flight time. List the output columns in this order. [Output relation cardinality: 1472 rows]

3.   (10 points) Find the day of the week with the longest average arrival delay. Return the name of the day and the average delay.

Name the output columns day_of_week and delay, in that order. (Hint: consider using LIMIT. Look up what it does!)

[Output relation cardinality: 1 row]

4.   (10 points) Find the names of all airlines that ever flew more than 1000 flights in one day (i.e., a specific day/month, but not any 24-hour period). Return only the names of the airlines. Do not return any duplicates (i.e., airlines with the exact same name).

Name the output column name.

[Output relation cardinality: 12 rows]

5.   (10 points) Find all airlines that had more than 0.5% (= 0.005) of their flights out of

Seattle canceled. Return the name of the airline and the percentage of canceled flights out of Seattle. Percentages should be outputted in percent format (3.5% as 3.5 not 0.035). Order the results by the percentage of canceled flights in ascending order. Name the output columns name and percentage, in that order.

[Output relation cardinality: 6 rows]

6.   (10 points) Find the maximum price of tickets between Seattle and New York, NY (i.e. Seattle to NY or NY to Seattle). Show the maximum price for each airline separately.

Name the output columns carrier and max_price, in that order. [Output relation cardinality: 3 rows]

7.   (10 points) Find the total capacity of all direct flights that fly between Seattle and San Francisco, CA on July 10th (i.e. Seattle to SF or SF to Seattle).

       Name the output column capacity. [Output relation cardinality: 1 row]

8.   (10 points) Compute the total departure delay of each airline across all flights. Some   departure delays may be negative (indicating an early departure); they should reduce  the total, so you don't need to handle them specially. Name the output columns name and delay, in that order. [Output relation cardinality: 22 rows]




热门主题

课程名

mktg2509 csci 2600 38170 lng302 csse3010 phas3226 77938 arch1162 engn4536/engn6536 acx5903 comp151101 phl245 cse12 comp9312 stat3016/6016 phas0038 comp2140 6qqmb312 xjco3011 rest0005 ematm0051 5qqmn219 lubs5062m eee8155 cege0100 eap033 artd1109 mat246 etc3430 ecmm462 mis102 inft6800 ddes9903 comp6521 comp9517 comp3331/9331 comp4337 comp6008 comp9414 bu.231.790.81 man00150m csb352h math1041 eengm4100 isys1002 08 6057cem mktg3504 mthm036 mtrx1701 mth3241 eeee3086 cmp-7038b cmp-7000a ints4010 econ2151 infs5710 fins5516 fin3309 fins5510 gsoe9340 math2007 math2036 soee5010 mark3088 infs3605 elec9714 comp2271 ma214 comp2211 infs3604 600426 sit254 acct3091 bbt405 msin0116 com107/com113 mark5826 sit120 comp9021 eco2101 eeen40700 cs253 ece3114 ecmm447 chns3000 math377 itd102 comp9444 comp(2041|9044) econ0060 econ7230 mgt001371 ecs-323 cs6250 mgdi60012 mdia2012 comm221001 comm5000 ma1008 engl642 econ241 com333 math367 mis201 nbs-7041x meek16104 econ2003 comm1190 mbas902 comp-1027 dpst1091 comp7315 eppd1033 m06 ee3025 msci231 bb113/bbs1063 fc709 comp3425 comp9417 econ42915 cb9101 math1102e chme0017 fc307 mkt60104 5522usst litr1-uc6201.200 ee1102 cosc2803 math39512 omp9727 int2067/int5051 bsb151 mgt253 fc021 babs2202 mis2002s phya21 18-213 cege0012 mdia1002 math38032 mech5125 07 cisc102 mgx3110 cs240 11175 fin3020s eco3420 ictten622 comp9727 cpt111 de114102d mgm320h5s bafi1019 math21112 efim20036 mn-3503 fins5568 110.807 bcpm000028 info6030 bma0092 bcpm0054 math20212 ce335 cs365 cenv6141 ftec5580 math2010 ec3450 comm1170 ecmt1010 csci-ua.0480-003 econ12-200 ib3960 ectb60h3f cs247—assignment tk3163 ics3u ib3j80 comp20008 comp9334 eppd1063 acct2343 cct109 isys1055/3412 math350-real math2014 eec180 stat141b econ2101 msinm014/msing014/msing014b fit2004 comp643 bu1002 cm2030
联系我们
EMail: 99515681@qq.com
QQ: 99515681
留学生作业帮-留学生的知心伴侣!
工作时间:08:00-21:00
python代写
微信客服:codinghelp
站长地图