New Assignment 3 2025

Part 1: Select an appropriate model, train it on the dataset, and make predictions (3 Points)

The UCI Adult dataset, sometimes called the Census Income dataset, is a classic resource in machine learning for demonstrating classification tasks, particularly binary classification.

Dataset Description

·Number of Instances: About 48,842 rows (the exact count depends on how duplicates and rows with missing values are handled).

·Number of Attributes: 14 features (plus the target).

·Feature Types:

Numeric (e.g., age, hours-per-week, capital-gain).

Categorical (e.g., workclass, marital-status, occupation, sex).

·Target Column:

Labeled as income, with possible values >50K or <=50K.

·Common practice is to convert this to binary (1 for >50K, 0 for <=50K).

Feature List

·age (numeric)

·workclass (categorical: Private, Self-emp, Government, etc.)

·fnlwgt (numeric: “final weight,” the number of people in the US population that each record represents)

·education (categorical: Bachelors, HS-grad, etc.)

·education_num (numeric: 1-16, encoded years of education)

·marital_status (categorical)

·occupation (categorical)

·relationship (categorical: Husband, Wife, Not-in-family, etc.)

·race (categorical)

·sex (categorical: Male/Female)

·capital_gain (numeric)

·capital_loss (numeric)

·hours_per_week (numeric)

·native_country (categorical)

·income (target: >50K/<=50K)

Task Overview

Data Acquisition & Understanding (Code provided)

·Download the dataset (e.g., adult.data from the UCI Repository or Kaggle).

·Familiarize yourself with the 14 features and the target column (>50K/<=50K).

Data Cleaning

·Import the dataset into a DataFrame (Code provided).

·Identify and handle missing values (often represented by "?"). Decide whether to drop or impute those rows (0.25 points).

Feature Engineering & Encoding

·Convert the target (income) to a binary numeric: 1 if >50K, 0 if <=50K (0.25 points).

·Encode categorical columns appropriately (e.g., workclass, education, marital_status) (0.5 points):

One-hot encoding (dummy variables) or label encoding.

·Consider dropping high-cardinality or rarely occurring categories, or grouping them.
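The cleaning and encoding steps above can be sketched on a small stand-in table. This is a minimal illustration, not the required solution: the five rows are made up, the column names follow the feature list, and the trailing period on some income labels is an assumption about how certain copies of the data are labeled (checking with value_counts() on the real column is advisable). Dropping rows is shown here; imputing is the alternative the task mentions.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the Adult data (hypothetical rows, not real records).
df = pd.DataFrame({
    "age": [39, 50, 38, 53, 28],
    "workclass": ["State-gov", "?", "Private", "Private", np.nan],
    "education_num": [13, 13, 9, 7, 13],
    "sex": ["Male", "Male", "Male", "Female", "Female"],
    "income": ["<=50K", ">50K", "<=50K.", ">50K.", "<=50K"],
})

# 1. Treat "?" as missing, then drop incomplete rows.
df = df.replace("?", np.nan)
clean = df.dropna().copy()

# 2. Binary target: strip whitespace and any trailing period (assumed quirk)
#    before comparing, so ">50K" and ">50K." both map to 1.
label = clean["income"].str.strip().str.rstrip(".")
clean["income"] = (label == ">50K").astype(int)

# 3. One-hot encode the categorical columns (listed explicitly here).
cat_cols = ["workclass", "sex"]
encoded = pd.get_dummies(clean, columns=cat_cols, drop_first=True, dtype=int)
print(encoded)
```

Passing drop_first=True keeps one fewer dummy column per feature, which avoids perfectly redundant columns for linear models; tree-based models usually do not need it.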

Data Splitting: Split into train and test sets (0.5 points).
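One common way to do the split is scikit-learn's train_test_split. The sketch below uses synthetic stand-in data; the 80/20 ratio, the random_state value, and the use of stratify are illustrative choices rather than requirements of the assignment. Stratifying keeps the share of >50K rows the same in both sets, which helps because the classes are imbalanced.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the encoded features: 100 rows, 25% positive class.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "age": rng.integers(18, 80, size=100),
    "hours_per_week": rng.integers(10, 60, size=100),
})
y = pd.Series([1] * 25 + [0] * 75, name="income")

# Hold out 20% of the rows; stratify=y preserves the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)
print(X_train.shape, X_test.shape)
print(y_train.mean(), y_test.mean())
```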

Model Training: Select a suitable model and appropriate columns to train the model (0.5 points).
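As one possible choice, a logistic regression wrapped in a scikit-learn Pipeline handles scaling, encoding, and fitting in a single object. The rows below are invented, and both the model and the column selection are assumptions made for illustration: fnlwgt is left out because it is a survey weight rather than a trait of the person, and other models (decision trees, random forests) would fit the task equally well.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy training rows (hypothetical values, not real census records).
train = pd.DataFrame({
    "age": [25, 38, 52, 44, 23, 61, 35, 48],
    "fnlwgt": [226802, 89814, 336951, 160323, 103497, 198693, 227026, 104626],
    "education_num": [9, 13, 14, 10, 9, 16, 13, 14],
    "hours_per_week": [40, 45, 60, 40, 20, 50, 40, 55],
    "workclass": ["Private", "Private", "Self-emp", "Private",
                  "Private", "Government", "Private", "Self-emp"],
    "income": [0, 1, 1, 0, 0, 1, 0, 1],
})

# Assumed column choice: fnlwgt is deliberately excluded.
numeric_cols = ["age", "education_num", "hours_per_week"]
categorical_cols = ["workclass"]

# Scale numeric columns and one-hot encode categorical ones.
preprocess = ColumnTransformer([
    ("num", StandardScaler(), numeric_cols),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("classifier", LogisticRegression(max_iter=1000)),
])

X_train = train[numeric_cols + categorical_cols]
y_train = train["income"]
model.fit(X_train, y_train)
print(model.predict(X_train))
```

Setting handle_unknown="ignore" lets the encoder accept categories at prediction time that never appeared in training, which matters for rare values such as uncommon native_country entries.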

Evaluation (0.5 points):

·Generate predictions on the test set and compute classification metrics:

■ Accuracy

■ Precision, Recall, F1-score

■ Confusion matrix

Prediction: Make up an imaginary person and use the model to predict whether that person's income will be above 50K (0.5 points).
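The evaluation and prediction steps might look like the following sketch. Everything in it is illustrative: the twelve rows are invented, the last four simply stand in for a test set, the logistic regression is an assumed model choice, and the imaginary person's values are arbitrary. With so little data the metric values themselves carry no meaning; the point is the sequence of calls.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy data (hypothetical values); the last four rows act as the test set.
data = pd.DataFrame({
    "age": [25, 38, 52, 44, 23, 61, 35, 48, 29, 55, 41, 33],
    "education_num": [9, 13, 14, 10, 9, 16, 13, 14, 10, 15, 9, 13],
    "hours_per_week": [40, 45, 60, 40, 20, 50, 40, 55, 35, 45, 40, 50],
    "occupation": ["Sales", "Tech-support", "Exec-managerial", "Sales",
                   "Sales", "Exec-managerial", "Tech-support",
                   "Exec-managerial", "Sales", "Exec-managerial",
                   "Tech-support", "Tech-support"],
    "income": [0, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1],
})
features = ["age", "education_num", "hours_per_week", "occupation"]
train, test = data.iloc[:8], data.iloc[8:]

model = Pipeline([
    ("preprocess", ColumnTransformer([
        ("num", StandardScaler(), features[:3]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["occupation"]),
    ])),
    ("classifier", LogisticRegression(max_iter=1000)),
])
model.fit(train[features], train["income"])

# Classification metrics on the held-out rows.
y_test = test["income"]
y_pred = model.predict(test[features])
accuracy = accuracy_score(y_test, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_test, y_pred, average="binary", zero_division=0
)
# Rows are true classes, columns are predicted classes, in the order [0, 1].
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
print(cm)

# An imaginary person, described with the same feature columns.
person = pd.DataFrame([{
    "age": 45, "education_num": 14, "hours_per_week": 50,
    "occupation": "Exec-managerial",
}])
prediction = int(model.predict(person)[0])
probability = model.predict_proba(person)[0, 1]
print("predicted >50K" if prediction == 1 else "predicted <=50K",
      f"(estimated probability of >50K: {probability:.2f})")
```

The imaginary person has to be passed as a one-row DataFrame with the same column names used in training, so that the pipeline can apply the same scaling and encoding before predicting.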

# If you have not installed the UCI Machine Learning Repository module, uncomment the next line to install it.
# !pip install ucimlrepo

# This part downloads the dataset and converts it to a pandas DataFrame.
from ucimlrepo import fetch_ucirepo
import pandas as pd
import numpy as np

# id=2 selects the Adult dataset in the UCI repository.
adult = fetch_ucirepo(id=2)

A = adult.data.features  # the 14 feature columns
B = adult.data.targets   # the income column

# Put features and target side by side in one DataFrame.
df = pd.concat([A, B], axis=1)
df

