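dev_xulongjin 655911b748 chore(project): initialize project structure and configuration
- Add the .idea directory and related configuration files; set up project ignore files, encoding, module management, etc.
- Create the 商务大数据分析 (business big data analysis) directory and its subdirectories; prepare the data and task notebooks
- Add sample data file: 中国城市人口数据.csv
- Create task notebook files with data processing and analysis examples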
2025-04-14 16:06:13 +08:00

{
"cells": [
{
"cell_type": "code",
"id": "initial_id",
"metadata": {
"collapsed": true,
"ExecuteTime": {
"end_time": "2025-04-14T02:39:40.769558Z",
"start_time": "2025-04-14T02:39:40.456570Z"
}
},
"source": "import pandas as pd",
"outputs": [],
"execution_count": 1
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-04-14T02:41:58.436846Z",
"start_time": "2025-04-14T02:41:58.386566Z"
}
},
"cell_type": "code",
"source": [
"data1 = pd.read_excel('data/healthcare-dataset-stroke.xlsx')\n",
"data1.head(3)"
],
"id": "4b3c42b38f05d480",
"outputs": [
{
"data": {
"text/plain": [
" 编号 性别 高血压 是否结婚 工作类型 居住类型 体重指数 吸烟史 中风\n",
"0 9046 男 否 是 私人 城市 36.6 以前吸烟 是\n",
"1 51676 女 否 是 私营企业 农村 NaN 从不吸烟 是\n",
"2 31112 男 否 是 私人 农村 32.5 从不吸烟 是"
],
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>编号</th>\n",
" <th>性别</th>\n",
" <th>高血压</th>\n",
" <th>是否结婚</th>\n",
" <th>工作类型</th>\n",
" <th>居住类型</th>\n",
" <th>体重指数</th>\n",
" <th>吸烟史</th>\n",
" <th>中风</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>9046</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>是</td>\n",
" <td>私人</td>\n",
" <td>城市</td>\n",
" <td>36.6</td>\n",
" <td>以前吸烟</td>\n",
" <td>是</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>51676</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>是</td>\n",
" <td>私营企业</td>\n",
" <td>农村</td>\n",
" <td>NaN</td>\n",
" <td>从不吸烟</td>\n",
" <td>是</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>31112</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>是</td>\n",
" <td>私人</td>\n",
" <td>农村</td>\n",
" <td>32.5</td>\n",
" <td>从不吸烟</td>\n",
" <td>是</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"execution_count": 8
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-04-14T02:42:02.131783Z",
"start_time": "2025-04-14T02:42:02.114377Z"
}
},
"cell_type": "code",
"source": [
"data2 = pd.read_excel('data/healthcare-dataset-age_abs.xlsx')\n",
"data2.head(3)"
],
"id": "e72f2e11a9b2e88d",
"outputs": [
{
"data": {
"text/plain": [
" 编号 年龄 平均血糖\n",
"0 9046 67.0 228.69\n",
"1 51676 61.0 202.21\n",
"2 31112 80.0 105.92"
],
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>编号</th>\n",
" <th>年龄</th>\n",
" <th>平均血糖</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>9046</td>\n",
" <td>67.0</td>\n",
" <td>228.69</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>51676</td>\n",
" <td>61.0</td>\n",
" <td>202.21</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>31112</td>\n",
" <td>80.0</td>\n",
" <td>105.92</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"execution_count": 10
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-04-14T02:44:09.987977Z",
"start_time": "2025-04-14T02:44:09.985187Z"
}
},
"cell_type": "code",
"source": [
"print(data1.size)\n",
"data2.size"
],
"id": "40c26c71f24c511d",
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"15903\n"
]
},
{
"data": {
"text/plain": [
"5301"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"execution_count": 17
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-04-14T07:59:22.335960Z",
"start_time": "2025-04-14T07:59:22.326530Z"
}
},
"cell_type": "code",
"source": [
"merge_data = data1.merge(data2, on=['编号'], how='left')\n",
"merge_data.head(3)"
],
"id": "37f42c042c31af5e",
"outputs": [
{
"data": {
"text/plain": [
" 编号 性别 高血压 是否结婚 工作类型 居住类型 体重指数 吸烟史 中风 年龄 平均血糖\n",
"0 9046 男 否 是 私人 城市 36.6 以前吸烟 是 67.0 228.69\n",
"1 51676 女 否 是 私营企业 农村 NaN 从不吸烟 是 61.0 202.21\n",
"2 31112 男 否 是 私人 农村 32.5 从不吸烟 是 80.0 105.92"
],
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>编号</th>\n",
" <th>性别</th>\n",
" <th>高血压</th>\n",
" <th>是否结婚</th>\n",
" <th>工作类型</th>\n",
" <th>居住类型</th>\n",
" <th>体重指数</th>\n",
" <th>吸烟史</th>\n",
" <th>中风</th>\n",
" <th>年龄</th>\n",
" <th>平均血糖</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>9046</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>是</td>\n",
" <td>私人</td>\n",
" <td>城市</td>\n",
" <td>36.6</td>\n",
" <td>以前吸烟</td>\n",
" <td>是</td>\n",
" <td>67.0</td>\n",
" <td>228.69</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>51676</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>是</td>\n",
" <td>私营企业</td>\n",
" <td>农村</td>\n",
" <td>NaN</td>\n",
" <td>从不吸烟</td>\n",
" <td>是</td>\n",
" <td>61.0</td>\n",
" <td>202.21</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>31112</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>是</td>\n",
" <td>私人</td>\n",
" <td>农村</td>\n",
" <td>32.5</td>\n",
" <td>从不吸烟</td>\n",
" <td>是</td>\n",
" <td>80.0</td>\n",
" <td>105.92</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
]
},
"execution_count": 71,
"metadata": {},
"output_type": "execute_result"
}
],
"execution_count": 71
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-04-14T07:59:24.287769Z",
"start_time": "2025-04-14T07:59:24.284471Z"
}
},
"cell_type": "code",
"source": [
"def age_process(x):\n",
"    # Treat missing, fractional, or negative ages as invalid.\n",
"    if pd.isna(x) or x % 1 != 0 or x < 0:\n",
"        return None\n",
"    return int(x)"
],
"id": "d45e61b4e5c45d4a",
"outputs": [],
"execution_count": 72
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-04-14T07:59:26.832979Z",
"start_time": "2025-04-14T07:59:26.827710Z"
}
},
"cell_type": "code",
"source": "merge_data['年龄'] = merge_data['年龄'].apply(age_process)",
"id": "b81f4203662a2950",
"outputs": [],
"execution_count": 73
},
{
"metadata": {
"ExecuteTime": {
"end_time": "2025-04-14T07:59:30.620159Z",
"start_time": "2025-04-14T07:59:30.606700Z"
}
},
"cell_type": "code",
"source": "merge_data[merge_data['年龄'].isna()]",
"id": "da4b29e8f3d56bc6",
"outputs": [
{
"data": {
"text/plain": [
" 编号 性别 高血压 是否结婚 工作类型 居住类型 体重指数 吸烟史 中风 年龄 平均血糖\n",
"162 69768 女 否 否 学生 城市 NaN 未知 是 NaN 70.37\n",
"363 7559 女 否 否 学生 城市 24.9 未知 否 NaN 83.82\n",
"376 22706 女 否 否 学生 农村 15.5 未知 否 NaN 88.11\n",
"562 45238 女 否 否 学生 城市 16.5 未知 否 NaN 58.26\n",
"564 61511 女 否 否 学生 农村 16.2 未知 否 NaN 73.71\n",
"597 40639 女 否 否 学生 农村 17.5 未知 否 NaN 60.53\n",
"607 9906 女 否 否 学生 城市 17.0 未知 否 NaN 102.34\n",
"684 53016 女 否 否 学生 城市 14.4 未知 否 NaN 130.61\n",
"753 49529 女 否 否 学生 城市 17.2 未知 否 NaN 60.98\n",
"850 41615 女 否 否 学生 农村 18.1 未知 否 NaN 126.18\n",
"913 17733 女 否 否 学生 农村 19.5 未知 否 NaN 109.51\n",
"982 54747 男 否 否 学生 农村 19.2 未知 否 NaN 157.57\n",
"995 60211 男 否 否 学生 城市 18.9 未知 否 NaN 90.51\n",
"996 53279 男 否 否 学生 农村 16.3 未知 否 NaN 118.87\n",
"1093 66772 女 否 否 学生 农村 16.0 未知 否 NaN 55.86\n",
"1101 57854 男 否 否 学生 城市 19.7 未知 否 NaN 56.30\n",
"1134 47848 男 否 否 学生 农村 20.1 未知 否 NaN 93.74\n",
"1137 59734 男 否 否 学生 城市 17.6 未知 否 NaN 75.79\n",
"1206 68908 女 否 否 学生 城市 23.0 未知 否 NaN 66.36\n",
"1218 20282 男 否 否 学生 农村 21.8 未知 否 NaN 77.91\n",
"1244 45554 女 否 否 学生 城市 22.1 未知 否 NaN 62.40\n",
"1317 30084 男 否 否 学生 农村 17.5 未知 否 NaN 98.67\n",
"1366 35737 男 否 否 学生 城市 19.5 未知 否 NaN 86.09\n",
"1486 1405 男 否 否 学生 城市 16.3 未知 否 NaN 111.65\n",
"1499 45357 女 否 否 学生 农村 21.5 未知 否 NaN 113.96\n",
"1600 40544 男 否 否 学生 城市 14.3 未知 否 NaN 109.56\n",
"1609 38043 女 否 否 学生 农村 10.3 未知 否 NaN 122.04\n",
"1614 47350 女 否 否 学生 城市 14.1 未知 否 NaN 139.67\n",
"1632 57485 女 否 否 学生 农村 18.5 未知 否 NaN 55.51\n",
"1758 27279 男 否 否 学生 城市 22.5 未知 否 NaN 90.46"
],
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>编号</th>\n",
" <th>性别</th>\n",
" <th>高血压</th>\n",
" <th>是否结婚</th>\n",
" <th>工作类型</th>\n",
" <th>居住类型</th>\n",
" <th>体重指数</th>\n",
" <th>吸烟史</th>\n",
" <th>中风</th>\n",
" <th>年龄</th>\n",
" <th>平均血糖</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>162</th>\n",
" <td>69768</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>NaN</td>\n",
" <td>未知</td>\n",
" <td>是</td>\n",
" <td>NaN</td>\n",
" <td>70.37</td>\n",
" </tr>\n",
" <tr>\n",
" <th>363</th>\n",
" <td>7559</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>24.9</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>83.82</td>\n",
" </tr>\n",
" <tr>\n",
" <th>376</th>\n",
" <td>22706</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>15.5</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>88.11</td>\n",
" </tr>\n",
" <tr>\n",
" <th>562</th>\n",
" <td>45238</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>16.5</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>58.26</td>\n",
" </tr>\n",
" <tr>\n",
" <th>564</th>\n",
" <td>61511</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>16.2</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>73.71</td>\n",
" </tr>\n",
" <tr>\n",
" <th>597</th>\n",
" <td>40639</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>17.5</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>60.53</td>\n",
" </tr>\n",
" <tr>\n",
" <th>607</th>\n",
" <td>9906</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>17.0</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>102.34</td>\n",
" </tr>\n",
" <tr>\n",
" <th>684</th>\n",
" <td>53016</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>14.4</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>130.61</td>\n",
" </tr>\n",
" <tr>\n",
" <th>753</th>\n",
" <td>49529</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>17.2</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>60.98</td>\n",
" </tr>\n",
" <tr>\n",
" <th>850</th>\n",
" <td>41615</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>18.1</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>126.18</td>\n",
" </tr>\n",
" <tr>\n",
" <th>913</th>\n",
" <td>17733</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>19.5</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>109.51</td>\n",
" </tr>\n",
" <tr>\n",
" <th>982</th>\n",
" <td>54747</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>19.2</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>157.57</td>\n",
" </tr>\n",
" <tr>\n",
" <th>995</th>\n",
" <td>60211</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>18.9</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>90.51</td>\n",
" </tr>\n",
" <tr>\n",
" <th>996</th>\n",
" <td>53279</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>16.3</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>118.87</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1093</th>\n",
" <td>66772</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>16.0</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>55.86</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1101</th>\n",
" <td>57854</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>19.7</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>56.30</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1134</th>\n",
" <td>47848</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>20.1</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>93.74</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1137</th>\n",
" <td>59734</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>17.6</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>75.79</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1206</th>\n",
" <td>68908</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>23.0</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>66.36</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1218</th>\n",
" <td>20282</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>21.8</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>77.91</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1244</th>\n",
" <td>45554</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>22.1</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>62.40</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1317</th>\n",
" <td>30084</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>17.5</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>98.67</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1366</th>\n",
" <td>35737</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>19.5</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>86.09</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1486</th>\n",
" <td>1405</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>16.3</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>111.65</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1499</th>\n",
" <td>45357</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>21.5</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>113.96</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1600</th>\n",
" <td>40544</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>14.3</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>109.56</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1609</th>\n",
" <td>38043</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>10.3</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>122.04</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1614</th>\n",
" <td>47350</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>14.1</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>139.67</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1632</th>\n",
" <td>57485</td>\n",
" <td>女</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>农村</td>\n",
" <td>18.5</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>55.51</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1758</th>\n",
" <td>27279</td>\n",
" <td>男</td>\n",
" <td>否</td>\n",
" <td>否</td>\n",
" <td>学生</td>\n",
" <td>城市</td>\n",
" <td>22.5</td>\n",
" <td>未知</td>\n",
" <td>否</td>\n",
" <td>NaN</td>\n",
" <td>90.46</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
]
},
"execution_count": 74,
"metadata": {},
"output_type": "execute_result"
}
],
"execution_count": 74
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
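
The notebook above left-joins the two tables on `编号` and then blanks out ages that are fractional or negative. The following is a minimal sketch of those two steps on a tiny in-memory frame. It rests on several assumptions: the sample rows are made up rather than read from the Excel files, `clean_age` is a hypothetical vectorized stand-in for `age_process`, and the `validate="one_to_one"` check and the nullable `Int64` dtype are additions that the notebook itself does not use.

```python
import pandas as pd

# Made-up sample rows; the notebook reads the real tables from Excel files.
data1 = pd.DataFrame({"编号": [1, 2, 3], "性别": ["男", "女", "女"]})
data2 = pd.DataFrame({"编号": [1, 2, 3], "年龄": [67.0, 1.32, -5.0]})

# Left join on the ID column; validate raises MergeError if an ID repeats
# on either side, which would otherwise silently duplicate rows.
merged = data1.merge(data2, on="编号", how="left", validate="one_to_one")


def clean_age(ages: pd.Series) -> pd.Series:
    """Keep whole, non-negative ages; replace everything else with <NA>."""
    valid = ages.notna() & (ages % 1 == 0) & (ages >= 0)
    # Nullable Int64 stores integer ages while still allowing missing values.
    return ages.where(valid).astype("Int64")


merged["年龄"] = clean_age(merged["年龄"])
print(merged)
```

With plain `apply`, as in the notebook, the column stays `float64` because missing values force a float dtype; the nullable `Int64` dtype is one way to keep the valid ages as integers.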