{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# pd.DataFrame\n",
"\n",
"`pd.DataFrame` --- по сути дела таблица, на которую можно смотреть как на объединение столбцов `pd.Series` с выравниванием по общему индексу. \n",
"\n",
"\n",
"Можно изменять содержимое ячеек таблицы, добавление новых строк --- дорогостоящая операция (пересоздание таблицы), но добавление столбцов осуществляется на месте.\n",
"\n",
"## Создание таблицы\n",
"\n",
"Есть множество способов создать датафрейм из уже существующих объектов python. \n",
"\n",
"Один самых удобных --- используя словари. \n",
"\n",
"Если ключи в словаре соответствуют именам столбцов, а значения --- списки, `NumPy` массивы, `pandas` серии данных этих столбцов, то используется конструктор `pd.DataFrame`. При этом если в значениях присутствуют объекты `pd.Series`, то индекс созданного датафрейма будет объединением индексов серий, а данные всех столбцов выравниваются по меткам этого индекса."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" col1 | \n",
" col2 | \n",
" col3 | \n",
"
\n",
" \n",
" \n",
" \n",
" a | \n",
" 1.0 | \n",
" 4.0 | \n",
" 5 | \n",
"
\n",
" \n",
" b | \n",
" 2.0 | \n",
" 3.0 | \n",
" 6 | \n",
"
\n",
" \n",
" c | \n",
" NaN | \n",
" NaN | \n",
" 7 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" col1 col2 col3\n",
"a 1.0 4.0 5\n",
"b 2.0 3.0 6\n",
"c NaN NaN 7"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"col1 = pd.Series([1, 2], index=[\"a\", \"b\"])\n",
"col2 = pd.Series([3, 4], index=[\"b\", \"a\"])\n",
"col3 = pd.Series([5, 6, 7], index=[\"a\", \"b\", \"c\"])\n",
"\n",
"d = {\n",
" 'col1': col1, \n",
" 'col2': col2,\n",
" 'col3': col3,\n",
"}\n",
"\n",
"df = pd.DataFrame(d)\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Если значения в словаре списки или массивы, то они должны быть одной длинны. В созданной таблице данные в столбцах идут в том же порядке, что и в исходных списках/массивах (без выравнивания), что и в списках. Индекс созданной таблицы по умолчанию --- `RangeIndex`, но можно явно указать его при создании опциональным параметром `index`."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" col1 | \n",
" col2 | \n",
"
\n",
" \n",
" \n",
" \n",
" a | \n",
" 1 | \n",
" 3 | \n",
"
\n",
" \n",
" b | \n",
" 2 | \n",
" 4 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" col1 col2\n",
"a 1 3\n",
"b 2 4"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"col1 = np.array([1, 2])\n",
"col2 = [3, 4]\n",
"\n",
"d = {\n",
" \"col1\": col1,\n",
" \"col2\": col2\n",
"}\n",
"\n",
"df = pd.DataFrame(d, index=[\"a\", \"b\"])\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[pd.DataFrame.form_dict](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.from_dict.html#pandas.DataFrame.from_dict) --- более специализированный метод для создания датафрейма из словаря. В предыдущих двух примерах он сработал бы аналогично, но его имя длиннее. Его преимущество заключается, например, в том, что он позволяет создать датафрейм из словаря, в котором ключи --- метки строк (индекс), а значения --- строки будущей таблицы.\n",
"\n",
"Для этого необходимо указать в качестве параметра `orient` строку `index`. Далее все аналогично. Если строки представлены в виде `pd.Series`, то метки попадут в названия столбцов с выравниванием данных. Если это простые массивы, то имена столбцов --- `RangeIndex`, но можно явно указать имена столбцов по параметру `columns`."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" a | \n",
" b | \n",
"
\n",
" \n",
" \n",
" \n",
" row1 | \n",
" 1 | \n",
" 2 | \n",
"
\n",
" \n",
" row2 | \n",
" 4 | \n",
" 3 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" a b\n",
"row1 1 2\n",
"row2 4 3"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"row1 = pd.Series([1, 2], index=[\"a\", \"b\"])\n",
"row2 = pd.Series([3, 4], index=[\"b\", \"a\"])\n",
"\n",
"d = {\n",
" \"row1\": row1,\n",
" \"row2\": row2,\n",
"}\n",
"\n",
"pd.DataFrame.from_dict(d, orient=\"index\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Если данные представлены в виде какой-то двухмерной структуры, в которой отсутствуют метки и строк и столбцов, то можно воспользоваться методом `pd.DataFrame`, при этом индекс и имена столбцов можно указать соответствующими параметрами `index` и `columns`. "
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" col1 | \n",
" col2 | \n",
"
\n",
" \n",
" \n",
" \n",
" row1 | \n",
" 1 | \n",
" 2 | \n",
"
\n",
" \n",
" row2 | \n",
" 3 | \n",
" 4 | \n",
"
\n",
" \n",
" row3 | \n",
" 5 | \n",
" 6 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" col1 col2\n",
"row1 1 2\n",
"row2 3 4\n",
"row3 5 6"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"data = [\n",
" [1, 2],\n",
" [3, 4],\n",
" [5, 6]\n",
"]\n",
"\n",
"pd.DataFrame(data, index=[\"row1\", \"row2\", \"row3\"], columns=[\"col1\", \"col2\"])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Хранение и чтение таблиц \n",
"\n",
"Для хранения таблиц чаще всего используются \n",
"\n",
"Гораздо чаще датафрейм получается, как результат чтение таблицы из файла. Чаще всего такие таблицы хранятся в формате [csv](https://ru.wikipedia.org/wiki/CSV) или в форматах таблиц `excel`.\n",
"\n",
"\n",
"> Строка таблицы соответствует строке текста, которая содержит одно или несколько полей, разделенных запятыми.\n",
"\n",
"\n",
"```\n",
"brand,mpg,cyl,disp,hp,drat,wt,qsec,vs,am,gear,carb\n",
"Mazda RX4,21.0,6,160.0,110,3.9,2.62,16.46,0,1,4,4\n",
"Mazda RX4 Wag,21.0,6,160.0,110,3.9,2.875,17.02,0,1,4,4\n",
"Datsun 710,22.8,4,108.0,93,3.85,2.32,18.61,1,1,4,1\n",
"Hornet 4 Drive,21.4,6,258.0,110,3.08,3.215,19.44,1,0,3,1\n",
"Hornet Sportabout,18.7,8,360.0,175,3.15,3.44,17.02,0,0,3,2\n",
"Valiant,18.1,6,225.0,105,2.76,3.46,20.22,1,0,3,1\n",
"Duster 360,14.3,8,360.0,245,3.21,3.57,15.84,0,0,3,4\n",
"Merc 240D,24.4,4,146.7,62,3.69,3.19,20.0,1,0,4,2\n",
"Merc 230,22.8,4,140.8,95,3.92,3.15,22.9,1,0,4,2\n",
"Merc 280,19.2,6,167.6,123,3.92,3.44,18.3,1,0,4,4\n",
"Merc 280C,17.8,6,167.6,123,3.92,3.44,18.9,1,0,4,4\n",
"Merc 450SE,16.4,8,275.8,180,3.07,4.07,17.4,0,0,3,3\n",
"Merc 450SL,17.3,8,275.8,180,3.07,3.73,17.6,0,0,3,3\n",
"Merc 450SLC,15.2,8,275.8,180,3.07,3.78,18.0,0,0,3,3\n",
"Cadillac Fleetwood,10.4,8,472.0,205,2.93,5.25,17.98,0,0,3,4\n",
"Lincoln Continental,10.4,8,460.0,215,3.0,5.424,17.82,0,0,3,4\n",
"Chrysler Imperial,14.7,8,440.0,230,3.23,5.345,17.42,0,0,3,4\n",
"Fiat 128,32.4,4,78.7,66,4.08,2.2,19.47,1,1,4,1\n",
"Honda Civic,30.4,4,75.7,52,4.93,1.615,18.52,1,1,4,2\n",
"Toyota Corolla,33.9,4,71.1,65,4.22,1.835,19.9,1,1,4,1\n",
"Toyota Corona,21.5,4,120.1,97,3.7,2.465,20.01,1,0,3,1\n",
"Dodge Challenger,15.5,8,318.0,150,2.76,3.52,16.87,0,0,3,2\n",
"AMC Javelin,15.2,8,304.0,150,3.15,3.435,17.3,0,0,3,2\n",
"Camaro Z28,13.3,8,350.0,245,3.73,3.84,15.41,0,0,3,4\n",
"Pontiac Firebird,19.2,8,400.0,175,3.08,3.845,17.05,0,0,3,2\n",
"Fiat X1-9,27.3,4,79.0,66,4.08,1.935,18.9,1,1,4,1\n",
"Porsche 914-2,26.0,4,120.3,91,4.43,2.14,16.7,0,1,5,2\n",
"Lotus Europa,30.4,4,95.1,113,3.77,1.513,16.9,1,1,5,2\n",
"Ford Pantera L,15.8,8,351.0,264,4.22,3.17,14.5,0,1,5,4\n",
"Ferrari Dino,19.7,6,145.0,175,3.62,2.77,15.5,0,1,5,6\n",
"Maserati Bora,15.0,8,301.0,335,3.54,3.57,14.6,0,1,5,8\n",
"Volvo 142E,,,,,,,,,,,\n",
"\n",
"```\n",
"\n",
"Пример простейшего `csv` файла приведен выше. В этом примере первая строка выступает в качестве заголовка, т.е. в ней перечислены названия столбцов, разделенные запятыми. Далее в каждой строке приведены значения этих столбцов, разделенные запятыми. В последней строке пропущены значения всех столбцов, кроме первого (brand).\n",
"\n",
"\n",
"Предположим, что текстовый файл с таким содержимым хранится в папке `data` под именем `mtcars.csv`. Для чтения такой таблицы используется метод [read_csv](https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html). "
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"RangeIndex: 32 entries, 0 to 31\n",
"Data columns (total 12 columns):\n",
" # Column Non-Null Count Dtype \n",
"--- ------ -------------- ----- \n",
" 0 brand 32 non-null object \n",
" 1 mpg 31 non-null float64\n",
" 2 cyl 31 non-null float64\n",
" 3 disp 31 non-null float64\n",
" 4 hp 31 non-null float64\n",
" 5 drat 31 non-null float64\n",
" 6 wt 31 non-null float64\n",
" 7 qsec 31 non-null float64\n",
" 8 vs 31 non-null float64\n",
" 9 am 31 non-null float64\n",
" 10 gear 31 non-null float64\n",
" 11 carb 31 non-null float64\n",
"dtypes: float64(11), object(1)\n",
"memory usage: 3.1+ KB\n",
"None\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" brand | \n",
" mpg | \n",
" cyl | \n",
" disp | \n",
" hp | \n",
" drat | \n",
" wt | \n",
" qsec | \n",
" vs | \n",
" am | \n",
" gear | \n",
" carb | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" Mazda RX4 | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.620 | \n",
" 16.46 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
"
\n",
" \n",
" 1 | \n",
" Mazda RX4 Wag | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.875 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
"
\n",
" \n",
" 2 | \n",
" Datsun 710 | \n",
" 22.8 | \n",
" 4.0 | \n",
" 108.0 | \n",
" 93.0 | \n",
" 3.85 | \n",
" 2.320 | \n",
" 18.61 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 3 | \n",
" Hornet 4 Drive | \n",
" 21.4 | \n",
" 6.0 | \n",
" 258.0 | \n",
" 110.0 | \n",
" 3.08 | \n",
" 3.215 | \n",
" 19.44 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" 4 | \n",
" Hornet Sportabout | \n",
" 18.7 | \n",
" 8.0 | \n",
" 360.0 | \n",
" 175.0 | \n",
" 3.15 | \n",
" 3.440 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" brand mpg cyl disp hp drat wt qsec vs am \\\n",
"0 Mazda RX4 21.0 6.0 160.0 110.0 3.90 2.620 16.46 0.0 1.0 \n",
"1 Mazda RX4 Wag 21.0 6.0 160.0 110.0 3.90 2.875 17.02 0.0 1.0 \n",
"2 Datsun 710 22.8 4.0 108.0 93.0 3.85 2.320 18.61 1.0 1.0 \n",
"3 Hornet 4 Drive 21.4 6.0 258.0 110.0 3.08 3.215 19.44 1.0 0.0 \n",
"4 Hornet Sportabout 18.7 8.0 360.0 175.0 3.15 3.440 17.02 0.0 0.0 \n",
"\n",
" gear carb \n",
"0 4.0 4.0 \n",
"1 4.0 4.0 \n",
"2 4.0 1.0 \n",
"3 3.0 1.0 \n",
"4 3.0 2.0 "
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import os\n",
"path = os.path.join(\"data\", \"mtcars.csv\")\n",
"\n",
"df = pd.read_csv(path)\n",
"print(df.info())\n",
"df.head()\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"В самом простом варианте использования функции `read_csv` \n",
"- имена столбцов распознаются из первой строки файла (параметром `header` можно повлиять на это);\n",
"- в качестве индекса генерируется `RangeIndex` (параметром `index_col` можно выбрать столбец индекса);\n",
"- в качестве разделителя ожидается символ запятой (`,`) (параметром `sep` можно на это повлиять);\n",
"- пропущенные значения заполняются `np.nan`;\n",
"- столбцы с датами не распознаются.\n",
"\n",
"```{note}\n",
"Метод [DataFrame.head](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.head.html) возвращает первые `n` строк таблицы. По умолчанию `n` равно 5, но можно указать явно. Похожий по смыслу метод [DataFrame.tail](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.tail.html) возвращает последние `n` строк.\n",
"\n",
"Метод [DataFrame.info](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.info.html) печатает информацию о таблице. В частности, из вывода этой функции можно понять количество строк и столбцов, тип индекса таблицы, имя каждого столбца, тип данных и количество непропущенных значений в них.\n",
"```\n",
"\n",
"Считаем эту таблицу ещё раз, указав в этот раз в качестве индекса столбец `brand`."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" mpg | \n",
" cyl | \n",
" disp | \n",
" hp | \n",
" drat | \n",
" wt | \n",
" qsec | \n",
" vs | \n",
" am | \n",
" gear | \n",
" carb | \n",
"
\n",
" \n",
" brand | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" Mazda RX4 | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.620 | \n",
" 16.46 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
"
\n",
" \n",
" Mazda RX4 Wag | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.875 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
"
\n",
" \n",
" Datsun 710 | \n",
" 22.8 | \n",
" 4.0 | \n",
" 108.0 | \n",
" 93.0 | \n",
" 3.85 | \n",
" 2.320 | \n",
" 18.61 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" Hornet 4 Drive | \n",
" 21.4 | \n",
" 6.0 | \n",
" 258.0 | \n",
" 110.0 | \n",
" 3.08 | \n",
" 3.215 | \n",
" 19.44 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" Hornet Sportabout | \n",
" 18.7 | \n",
" 8.0 | \n",
" 360.0 | \n",
" 175.0 | \n",
" 3.15 | \n",
" 3.440 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
"
\n",
" \n",
" Valiant | \n",
" 18.1 | \n",
" 6.0 | \n",
" 225.0 | \n",
" 105.0 | \n",
" 2.76 | \n",
" 3.460 | \n",
" 20.22 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" Duster 360 | \n",
" 14.3 | \n",
" 8.0 | \n",
" 360.0 | \n",
" 245.0 | \n",
" 3.21 | \n",
" 3.570 | \n",
" 15.84 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
"
\n",
" \n",
" Merc 240D | \n",
" 24.4 | \n",
" 4.0 | \n",
" 146.7 | \n",
" 62.0 | \n",
" 3.69 | \n",
" 3.190 | \n",
" 20.00 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
"
\n",
" \n",
" Merc 230 | \n",
" 22.8 | \n",
" 4.0 | \n",
" 140.8 | \n",
" 95.0 | \n",
" 3.92 | \n",
" 3.150 | \n",
" 22.90 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
"
\n",
" \n",
" Merc 280 | \n",
" 19.2 | \n",
" 6.0 | \n",
" 167.6 | \n",
" 123.0 | \n",
" 3.92 | \n",
" 3.440 | \n",
" 18.30 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
"
\n",
" \n",
" Merc 280C | \n",
" 17.8 | \n",
" 6.0 | \n",
" 167.6 | \n",
" 123.0 | \n",
" 3.92 | \n",
" 3.440 | \n",
" 18.90 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
"
\n",
" \n",
" Merc 450SE | \n",
" 16.4 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 4.070 | \n",
" 17.40 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
"
\n",
" \n",
" Merc 450SL | \n",
" 17.3 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 3.730 | \n",
" 17.60 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
"
\n",
" \n",
" Merc 450SLC | \n",
" 15.2 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 3.780 | \n",
" 18.00 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
"
\n",
" \n",
" Cadillac Fleetwood | \n",
" 10.4 | \n",
" 8.0 | \n",
" 472.0 | \n",
" 205.0 | \n",
" 2.93 | \n",
" 5.250 | \n",
" 17.98 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
"
\n",
" \n",
" Lincoln Continental | \n",
" 10.4 | \n",
" 8.0 | \n",
" 460.0 | \n",
" 215.0 | \n",
" 3.00 | \n",
" 5.424 | \n",
" 17.82 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
"
\n",
" \n",
" Chrysler Imperial | \n",
" 14.7 | \n",
" 8.0 | \n",
" 440.0 | \n",
" 230.0 | \n",
" 3.23 | \n",
" 5.345 | \n",
" 17.42 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
"
\n",
" \n",
" Fiat 128 | \n",
" 32.4 | \n",
" 4.0 | \n",
" 78.7 | \n",
" 66.0 | \n",
" 4.08 | \n",
" 2.200 | \n",
" 19.47 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" Honda Civic | \n",
" 30.4 | \n",
" 4.0 | \n",
" 75.7 | \n",
" 52.0 | \n",
" 4.93 | \n",
" 1.615 | \n",
" 18.52 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
"
\n",
" \n",
" Toyota Corolla | \n",
" 33.9 | \n",
" 4.0 | \n",
" 71.1 | \n",
" 65.0 | \n",
" 4.22 | \n",
" 1.835 | \n",
" 19.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" Toyota Corona | \n",
" 21.5 | \n",
" 4.0 | \n",
" 120.1 | \n",
" 97.0 | \n",
" 3.70 | \n",
" 2.465 | \n",
" 20.01 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" Dodge Challenger | \n",
" 15.5 | \n",
" 8.0 | \n",
" 318.0 | \n",
" 150.0 | \n",
" 2.76 | \n",
" 3.520 | \n",
" 16.87 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
"
\n",
" \n",
" AMC Javelin | \n",
" 15.2 | \n",
" 8.0 | \n",
" 304.0 | \n",
" 150.0 | \n",
" 3.15 | \n",
" 3.435 | \n",
" 17.30 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
"
\n",
" \n",
" Camaro Z28 | \n",
" 13.3 | \n",
" 8.0 | \n",
" 350.0 | \n",
" 245.0 | \n",
" 3.73 | \n",
" 3.840 | \n",
" 15.41 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
"
\n",
" \n",
" Pontiac Firebird | \n",
" 19.2 | \n",
" 8.0 | \n",
" 400.0 | \n",
" 175.0 | \n",
" 3.08 | \n",
" 3.845 | \n",
" 17.05 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
"
\n",
" \n",
" Fiat X1-9 | \n",
" 27.3 | \n",
" 4.0 | \n",
" 79.0 | \n",
" 66.0 | \n",
" 4.08 | \n",
" 1.935 | \n",
" 18.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
"
\n",
" \n",
" Porsche 914-2 | \n",
" 26.0 | \n",
" 4.0 | \n",
" 120.3 | \n",
" 91.0 | \n",
" 4.43 | \n",
" 2.140 | \n",
" 16.70 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
"
\n",
" \n",
" Lotus Europa | \n",
" 30.4 | \n",
" 4.0 | \n",
" 95.1 | \n",
" 113.0 | \n",
" 3.77 | \n",
" 1.513 | \n",
" 16.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
"
\n",
" \n",
" Ford Pantera L | \n",
" 15.8 | \n",
" 8.0 | \n",
" 351.0 | \n",
" 264.0 | \n",
" 4.22 | \n",
" 3.170 | \n",
" 14.50 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 4.0 | \n",
"
\n",
" \n",
" Ferrari Dino | \n",
" 19.7 | \n",
" 6.0 | \n",
" 145.0 | \n",
" 175.0 | \n",
" 3.62 | \n",
" 2.770 | \n",
" 15.50 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 6.0 | \n",
"
\n",
" \n",
" Maserati Bora | \n",
" 15.0 | \n",
" 8.0 | \n",
" 301.0 | \n",
" 335.0 | \n",
" 3.54 | \n",
" 3.570 | \n",
" 14.60 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 8.0 | \n",
"
\n",
" \n",
" Volvo 142E | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" mpg cyl disp hp drat wt qsec vs am \\\n",
"brand \n",
"Mazda RX4 21.0 6.0 160.0 110.0 3.90 2.620 16.46 0.0 1.0 \n",
"Mazda RX4 Wag 21.0 6.0 160.0 110.0 3.90 2.875 17.02 0.0 1.0 \n",
"Datsun 710 22.8 4.0 108.0 93.0 3.85 2.320 18.61 1.0 1.0 \n",
"Hornet 4 Drive 21.4 6.0 258.0 110.0 3.08 3.215 19.44 1.0 0.0 \n",
"Hornet Sportabout 18.7 8.0 360.0 175.0 3.15 3.440 17.02 0.0 0.0 \n",
"Valiant 18.1 6.0 225.0 105.0 2.76 3.460 20.22 1.0 0.0 \n",
"Duster 360 14.3 8.0 360.0 245.0 3.21 3.570 15.84 0.0 0.0 \n",
"Merc 240D 24.4 4.0 146.7 62.0 3.69 3.190 20.00 1.0 0.0 \n",
"Merc 230 22.8 4.0 140.8 95.0 3.92 3.150 22.90 1.0 0.0 \n",
"Merc 280 19.2 6.0 167.6 123.0 3.92 3.440 18.30 1.0 0.0 \n",
"Merc 280C 17.8 6.0 167.6 123.0 3.92 3.440 18.90 1.0 0.0 \n",
"Merc 450SE 16.4 8.0 275.8 180.0 3.07 4.070 17.40 0.0 0.0 \n",
"Merc 450SL 17.3 8.0 275.8 180.0 3.07 3.730 17.60 0.0 0.0 \n",
"Merc 450SLC 15.2 8.0 275.8 180.0 3.07 3.780 18.00 0.0 0.0 \n",
"Cadillac Fleetwood 10.4 8.0 472.0 205.0 2.93 5.250 17.98 0.0 0.0 \n",
"Lincoln Continental 10.4 8.0 460.0 215.0 3.00 5.424 17.82 0.0 0.0 \n",
"Chrysler Imperial 14.7 8.0 440.0 230.0 3.23 5.345 17.42 0.0 0.0 \n",
"Fiat 128 32.4 4.0 78.7 66.0 4.08 2.200 19.47 1.0 1.0 \n",
"Honda Civic 30.4 4.0 75.7 52.0 4.93 1.615 18.52 1.0 1.0 \n",
"Toyota Corolla 33.9 4.0 71.1 65.0 4.22 1.835 19.90 1.0 1.0 \n",
"Toyota Corona 21.5 4.0 120.1 97.0 3.70 2.465 20.01 1.0 0.0 \n",
"Dodge Challenger 15.5 8.0 318.0 150.0 2.76 3.520 16.87 0.0 0.0 \n",
"AMC Javelin 15.2 8.0 304.0 150.0 3.15 3.435 17.30 0.0 0.0 \n",
"Camaro Z28 13.3 8.0 350.0 245.0 3.73 3.840 15.41 0.0 0.0 \n",
"Pontiac Firebird 19.2 8.0 400.0 175.0 3.08 3.845 17.05 0.0 0.0 \n",
"Fiat X1-9 27.3 4.0 79.0 66.0 4.08 1.935 18.90 1.0 1.0 \n",
"Porsche 914-2 26.0 4.0 120.3 91.0 4.43 2.140 16.70 0.0 1.0 \n",
"Lotus Europa 30.4 4.0 95.1 113.0 3.77 1.513 16.90 1.0 1.0 \n",
"Ford Pantera L 15.8 8.0 351.0 264.0 4.22 3.170 14.50 0.0 1.0 \n",
"Ferrari Dino 19.7 6.0 145.0 175.0 3.62 2.770 15.50 0.0 1.0 \n",
"Maserati Bora 15.0 8.0 301.0 335.0 3.54 3.570 14.60 0.0 1.0 \n",
"Volvo 142E NaN NaN NaN NaN NaN NaN NaN NaN NaN \n",
"\n",
" gear carb \n",
"brand \n",
"Mazda RX4 4.0 4.0 \n",
"Mazda RX4 Wag 4.0 4.0 \n",
"Datsun 710 4.0 1.0 \n",
"Hornet 4 Drive 3.0 1.0 \n",
"Hornet Sportabout 3.0 2.0 \n",
"Valiant 3.0 1.0 \n",
"Duster 360 3.0 4.0 \n",
"Merc 240D 4.0 2.0 \n",
"Merc 230 4.0 2.0 \n",
"Merc 280 4.0 4.0 \n",
"Merc 280C 4.0 4.0 \n",
"Merc 450SE 3.0 3.0 \n",
"Merc 450SL 3.0 3.0 \n",
"Merc 450SLC 3.0 3.0 \n",
"Cadillac Fleetwood 3.0 4.0 \n",
"Lincoln Continental 3.0 4.0 \n",
"Chrysler Imperial 3.0 4.0 \n",
"Fiat 128 4.0 1.0 \n",
"Honda Civic 4.0 2.0 \n",
"Toyota Corolla 4.0 1.0 \n",
"Toyota Corona 3.0 1.0 \n",
"Dodge Challenger 3.0 2.0 \n",
"AMC Javelin 3.0 2.0 \n",
"Camaro Z28 3.0 4.0 \n",
"Pontiac Firebird 3.0 2.0 \n",
"Fiat X1-9 4.0 1.0 \n",
"Porsche 914-2 5.0 2.0 \n",
"Lotus Europa 5.0 2.0 \n",
"Ford Pantera L 5.0 4.0 \n",
"Ferrari Dino 5.0 6.0 \n",
"Maserati Bora 5.0 8.0 \n",
"Volvo 142E NaN NaN "
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_csv(path, index_col=\"brand\", sep=\",\")\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"\n",
"\n",
"Аналогично можно считывать данные из таблиц `excel` методом [read_excel](https://pandas.pydata.org/docs/reference/api/pandas.read_excel.html). \n",
"\n",
"Методами [to_csv](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_csv.html#pandas.DataFrame.to_csv) и [to_excel](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.to_excel.html#pandas.DataFrame.to_excel) можно сохранить `DataFrame` в таблицу удобном формате (для сохранения в `excel` необходимо поставить библиотеку [openpyxl](https://openpyxl.readthedocs.io/en/stable/) или её аналоги)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Индексация DataFrame\n",
"\n",
"Доступ к строкам `DataFrame` осуществляется в точности, как осуществляется доступ к элементам `Series`, т.е. используются методы `loc` и `iloc`, которые сразу возвращают строку в виде объекта `pd.Series`, у которого в качестве индекса выступают имена столбцов."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"mpg NaN\n",
"cyl NaN\n",
"disp NaN\n",
"hp NaN\n",
"drat NaN\n",
"wt NaN\n",
"qsec NaN\n",
"vs NaN\n",
"am NaN\n",
"gear NaN\n",
"carb NaN\n",
"Name: Volvo 142E, dtype: float64"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc[\"Volvo 142E\"]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Оператор `[]` датафрейма предназначен для выбора столбцов. Передавать в него можно имя нужного столбца или список таких имен. Результат в первом случае --- столбец в виде `pd.Series`, а во втором случае --- таблица `pd.DataFrame` из указанных столбцов."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"brand\n",
"Mazda RX4 21.0\n",
"Mazda RX4 Wag 21.0\n",
"Datsun 710 22.8\n",
"Hornet 4 Drive 21.4\n",
"Hornet Sportabout 18.7\n",
"Name: mpg, dtype: float64"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[\"mpg\"].head()"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" cyl | \n",
" wt | \n",
"
\n",
" \n",
" brand | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" Lotus Europa | \n",
" 4.0 | \n",
" 1.513 | \n",
"
\n",
" \n",
" Ford Pantera L | \n",
" 8.0 | \n",
" 3.170 | \n",
"
\n",
" \n",
" Ferrari Dino | \n",
" 6.0 | \n",
" 2.770 | \n",
"
\n",
" \n",
" Maserati Bora | \n",
" 8.0 | \n",
" 3.570 | \n",
"
\n",
" \n",
" Volvo 142E | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" cyl wt\n",
"brand \n",
"Lotus Europa 4.0 1.513\n",
"Ford Pantera L 8.0 3.170\n",
"Ferrari Dino 6.0 2.770\n",
"Maserati Bora 8.0 3.570\n",
"Volvo 142E NaN NaN"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[[\"cyl\", \"wt\"]].tail()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Если в названии столбца нет пробелов, то можно обращаться к столбцу, как к атрибуту объекта. "
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"brand\n",
"Mazda RX4 4.0\n",
"Mazda RX4 Wag 4.0\n",
"Datsun 710 1.0\n",
"Name: carb, dtype: float64"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.carb.head(3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Методы [DataFrame.at](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.at.html) и [DataFrame.iat](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.iat.html#pandas.DataFrame.iat) позволяют получить одно значение на пересечении строки и столбца."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"21.0"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.iat[0, 0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Добавление столбцов \n",
"\n",
"Добавление и изменение столбцов в таблицу похоже на добавление элементов в словарь. При этом, если добавляется столбец с индексом (`pd.Series`), то данные автоматически выравниваются. \n",
"\n",
"Пусть поступили данные о стоимости автомобилей определенной марки."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" mpg | \n",
" cyl | \n",
" disp | \n",
" hp | \n",
" drat | \n",
" wt | \n",
" qsec | \n",
" vs | \n",
" am | \n",
" gear | \n",
" carb | \n",
" price | \n",
"
\n",
" \n",
" brand | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" Mazda RX4 | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.620 | \n",
" 16.46 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 100000.0 | \n",
"
\n",
" \n",
" Mazda RX4 Wag | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.875 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Datsun 710 | \n",
" 22.8 | \n",
" 4.0 | \n",
" 108.0 | \n",
" 93.0 | \n",
" 3.85 | \n",
" 2.320 | \n",
" 18.61 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Hornet 4 Drive | \n",
" 21.4 | \n",
" 6.0 | \n",
" 258.0 | \n",
" 110.0 | \n",
" 3.08 | \n",
" 3.215 | \n",
" 19.44 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Hornet Sportabout | \n",
" 18.7 | \n",
" 8.0 | \n",
" 360.0 | \n",
" 175.0 | \n",
" 3.15 | \n",
" 3.440 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Valiant | \n",
" 18.1 | \n",
" 6.0 | \n",
" 225.0 | \n",
" 105.0 | \n",
" 2.76 | \n",
" 3.460 | \n",
" 20.22 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Duster 360 | \n",
" 14.3 | \n",
" 8.0 | \n",
" 360.0 | \n",
" 245.0 | \n",
" 3.21 | \n",
" 3.570 | \n",
" 15.84 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Merc 240D | \n",
" 24.4 | \n",
" 4.0 | \n",
" 146.7 | \n",
" 62.0 | \n",
" 3.69 | \n",
" 3.190 | \n",
" 20.00 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Merc 230 | \n",
" 22.8 | \n",
" 4.0 | \n",
" 140.8 | \n",
" 95.0 | \n",
" 3.92 | \n",
" 3.150 | \n",
" 22.90 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Merc 280 | \n",
" 19.2 | \n",
" 6.0 | \n",
" 167.6 | \n",
" 123.0 | \n",
" 3.92 | \n",
" 3.440 | \n",
" 18.30 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Merc 280C | \n",
" 17.8 | \n",
" 6.0 | \n",
" 167.6 | \n",
" 123.0 | \n",
" 3.92 | \n",
" 3.440 | \n",
" 18.90 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Merc 450SE | \n",
" 16.4 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 4.070 | \n",
" 17.40 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Merc 450SL | \n",
" 17.3 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 3.730 | \n",
" 17.60 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Merc 450SLC | \n",
" 15.2 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 3.780 | \n",
" 18.00 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Cadillac Fleetwood | \n",
" 10.4 | \n",
" 8.0 | \n",
" 472.0 | \n",
" 205.0 | \n",
" 2.93 | \n",
" 5.250 | \n",
" 17.98 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Lincoln Continental | \n",
" 10.4 | \n",
" 8.0 | \n",
" 460.0 | \n",
" 215.0 | \n",
" 3.00 | \n",
" 5.424 | \n",
" 17.82 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Chrysler Imperial | \n",
" 14.7 | \n",
" 8.0 | \n",
" 440.0 | \n",
" 230.0 | \n",
" 3.23 | \n",
" 5.345 | \n",
" 17.42 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Fiat 128 | \n",
" 32.4 | \n",
" 4.0 | \n",
" 78.7 | \n",
" 66.0 | \n",
" 4.08 | \n",
" 2.200 | \n",
" 19.47 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Honda Civic | \n",
" 30.4 | \n",
" 4.0 | \n",
" 75.7 | \n",
" 52.0 | \n",
" 4.93 | \n",
" 1.615 | \n",
" 18.52 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Toyota Corolla | \n",
" 33.9 | \n",
" 4.0 | \n",
" 71.1 | \n",
" 65.0 | \n",
" 4.22 | \n",
" 1.835 | \n",
" 19.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Toyota Corona | \n",
" 21.5 | \n",
" 4.0 | \n",
" 120.1 | \n",
" 97.0 | \n",
" 3.70 | \n",
" 2.465 | \n",
" 20.01 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Dodge Challenger | \n",
" 15.5 | \n",
" 8.0 | \n",
" 318.0 | \n",
" 150.0 | \n",
" 2.76 | \n",
" 3.520 | \n",
" 16.87 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" NaN | \n",
"
\n",
" \n",
" AMC Javelin | \n",
" 15.2 | \n",
" 8.0 | \n",
" 304.0 | \n",
" 150.0 | \n",
" 3.15 | \n",
" 3.435 | \n",
" 17.30 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Camaro Z28 | \n",
" 13.3 | \n",
" 8.0 | \n",
" 350.0 | \n",
" 245.0 | \n",
" 3.73 | \n",
" 3.840 | \n",
" 15.41 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Pontiac Firebird | \n",
" 19.2 | \n",
" 8.0 | \n",
" 400.0 | \n",
" 175.0 | \n",
" 3.08 | \n",
" 3.845 | \n",
" 17.05 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Fiat X1-9 | \n",
" 27.3 | \n",
" 4.0 | \n",
" 79.0 | \n",
" 66.0 | \n",
" 4.08 | \n",
" 1.935 | \n",
" 18.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Porsche 914-2 | \n",
" 26.0 | \n",
" 4.0 | \n",
" 120.3 | \n",
" 91.0 | \n",
" 4.43 | \n",
" 2.140 | \n",
" 16.70 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Lotus Europa | \n",
" 30.4 | \n",
" 4.0 | \n",
" 95.1 | \n",
" 113.0 | \n",
" 3.77 | \n",
" 1.513 | \n",
" 16.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
" 120000.0 | \n",
"
\n",
" \n",
" Ford Pantera L | \n",
" 15.8 | \n",
" 8.0 | \n",
" 351.0 | \n",
" 264.0 | \n",
" 4.22 | \n",
" 3.170 | \n",
" 14.50 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 4.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Ferrari Dino | \n",
" 19.7 | \n",
" 6.0 | \n",
" 145.0 | \n",
" 175.0 | \n",
" 3.62 | \n",
" 2.770 | \n",
" 15.50 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 6.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Maserati Bora | \n",
" 15.0 | \n",
" 8.0 | \n",
" 301.0 | \n",
" 335.0 | \n",
" 3.54 | \n",
" 3.570 | \n",
" 14.60 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 8.0 | \n",
" NaN | \n",
"
\n",
" \n",
" Volvo 142E | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" mpg cyl disp hp drat wt qsec vs am \\\n",
"brand \n",
"Mazda RX4 21.0 6.0 160.0 110.0 3.90 2.620 16.46 0.0 1.0 \n",
"Mazda RX4 Wag 21.0 6.0 160.0 110.0 3.90 2.875 17.02 0.0 1.0 \n",
"Datsun 710 22.8 4.0 108.0 93.0 3.85 2.320 18.61 1.0 1.0 \n",
"Hornet 4 Drive 21.4 6.0 258.0 110.0 3.08 3.215 19.44 1.0 0.0 \n",
"Hornet Sportabout 18.7 8.0 360.0 175.0 3.15 3.440 17.02 0.0 0.0 \n",
"Valiant 18.1 6.0 225.0 105.0 2.76 3.460 20.22 1.0 0.0 \n",
"Duster 360 14.3 8.0 360.0 245.0 3.21 3.570 15.84 0.0 0.0 \n",
"Merc 240D 24.4 4.0 146.7 62.0 3.69 3.190 20.00 1.0 0.0 \n",
"Merc 230 22.8 4.0 140.8 95.0 3.92 3.150 22.90 1.0 0.0 \n",
"Merc 280 19.2 6.0 167.6 123.0 3.92 3.440 18.30 1.0 0.0 \n",
"Merc 280C 17.8 6.0 167.6 123.0 3.92 3.440 18.90 1.0 0.0 \n",
"Merc 450SE 16.4 8.0 275.8 180.0 3.07 4.070 17.40 0.0 0.0 \n",
"Merc 450SL 17.3 8.0 275.8 180.0 3.07 3.730 17.60 0.0 0.0 \n",
"Merc 450SLC 15.2 8.0 275.8 180.0 3.07 3.780 18.00 0.0 0.0 \n",
"Cadillac Fleetwood 10.4 8.0 472.0 205.0 2.93 5.250 17.98 0.0 0.0 \n",
"Lincoln Continental 10.4 8.0 460.0 215.0 3.00 5.424 17.82 0.0 0.0 \n",
"Chrysler Imperial 14.7 8.0 440.0 230.0 3.23 5.345 17.42 0.0 0.0 \n",
"Fiat 128 32.4 4.0 78.7 66.0 4.08 2.200 19.47 1.0 1.0 \n",
"Honda Civic 30.4 4.0 75.7 52.0 4.93 1.615 18.52 1.0 1.0 \n",
"Toyota Corolla 33.9 4.0 71.1 65.0 4.22 1.835 19.90 1.0 1.0 \n",
"Toyota Corona 21.5 4.0 120.1 97.0 3.70 2.465 20.01 1.0 0.0 \n",
"Dodge Challenger 15.5 8.0 318.0 150.0 2.76 3.520 16.87 0.0 0.0 \n",
"AMC Javelin 15.2 8.0 304.0 150.0 3.15 3.435 17.30 0.0 0.0 \n",
"Camaro Z28 13.3 8.0 350.0 245.0 3.73 3.840 15.41 0.0 0.0 \n",
"Pontiac Firebird 19.2 8.0 400.0 175.0 3.08 3.845 17.05 0.0 0.0 \n",
"Fiat X1-9 27.3 4.0 79.0 66.0 4.08 1.935 18.90 1.0 1.0 \n",
"Porsche 914-2 26.0 4.0 120.3 91.0 4.43 2.140 16.70 0.0 1.0 \n",
"Lotus Europa 30.4 4.0 95.1 113.0 3.77 1.513 16.90 1.0 1.0 \n",
"Ford Pantera L 15.8 8.0 351.0 264.0 4.22 3.170 14.50 0.0 1.0 \n",
"Ferrari Dino 19.7 6.0 145.0 175.0 3.62 2.770 15.50 0.0 1.0 \n",
"Maserati Bora 15.0 8.0 301.0 335.0 3.54 3.570 14.60 0.0 1.0 \n",
"Volvo 142E NaN NaN NaN NaN NaN NaN NaN NaN NaN \n",
"\n",
" gear carb price \n",
"brand \n",
"Mazda RX4 4.0 4.0 100000.0 \n",
"Mazda RX4 Wag 4.0 4.0 NaN \n",
"Datsun 710 4.0 1.0 NaN \n",
"Hornet 4 Drive 3.0 1.0 NaN \n",
"Hornet Sportabout 3.0 2.0 NaN \n",
"Valiant 3.0 1.0 NaN \n",
"Duster 360 3.0 4.0 NaN \n",
"Merc 240D 4.0 2.0 NaN \n",
"Merc 230 4.0 2.0 NaN \n",
"Merc 280 4.0 4.0 NaN \n",
"Merc 280C 4.0 4.0 NaN \n",
"Merc 450SE 3.0 3.0 NaN \n",
"Merc 450SL 3.0 3.0 NaN \n",
"Merc 450SLC 3.0 3.0 NaN \n",
"Cadillac Fleetwood 3.0 4.0 NaN \n",
"Lincoln Continental 3.0 4.0 NaN \n",
"Chrysler Imperial 3.0 4.0 NaN \n",
"Fiat 128 4.0 1.0 NaN \n",
"Honda Civic 4.0 2.0 NaN \n",
"Toyota Corolla 4.0 1.0 NaN \n",
"Toyota Corona 3.0 1.0 NaN \n",
"Dodge Challenger 3.0 2.0 NaN \n",
"AMC Javelin 3.0 2.0 NaN \n",
"Camaro Z28 3.0 4.0 NaN \n",
"Pontiac Firebird 3.0 2.0 NaN \n",
"Fiat X1-9 4.0 1.0 NaN \n",
"Porsche 914-2 5.0 2.0 NaN \n",
"Lotus Europa 5.0 2.0 120000.0 \n",
"Ford Pantera L 5.0 4.0 NaN \n",
"Ferrari Dino 5.0 6.0 NaN \n",
"Maserati Bora 5.0 8.0 NaN \n",
"Volvo 142E NaN NaN NaN "
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"price = pd.Series({\n",
" \"Mazda RX4\": 100_000, \n",
" \"Lotus Europa\": 120_000\n",
" })\n",
"df[\"price\"] = price\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Пропущенные значения\n",
"\n",
"Большинство методов `pandas` разработаны таким образом, чтобы обрабатывать пропущенные значения. "
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"mpg 20.048387\n",
"cyl 6.258065\n",
"disp 234.261290\n",
"hp 147.903226\n",
"drat 3.580000\n",
"wt 3.231355\n",
"qsec 17.824516\n",
"vs 0.419355\n",
"am 0.387097\n",
"gear 3.677419\n",
"carb 2.838710\n",
"price 110000.000000\n",
"dtype: float64"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.mean()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Тем не менее иногда необходимо избавиться от строк или столбцов с пропущенными данными. Для этого удобно использовать метод [DataFrame.dropna](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.dropna.html), который возвращает новую таблицу, состоящую из строк исходной таблицы, не содержащих `np.nan`."
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" mpg | \n",
" cyl | \n",
" disp | \n",
" hp | \n",
" drat | \n",
" wt | \n",
" qsec | \n",
" vs | \n",
" am | \n",
" gear | \n",
" carb | \n",
" price | \n",
"
\n",
" \n",
" brand | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" Mazda RX4 | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.620 | \n",
" 16.46 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 100000.0 | \n",
"
\n",
" \n",
" Lotus Europa | \n",
" 30.4 | \n",
" 4.0 | \n",
" 95.1 | \n",
" 113.0 | \n",
" 3.77 | \n",
" 1.513 | \n",
" 16.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
" 120000.0 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" mpg cyl disp hp drat wt qsec vs am gear \\\n",
"brand \n",
"Mazda RX4 21.0 6.0 160.0 110.0 3.90 2.620 16.46 0.0 1.0 4.0 \n",
"Lotus Europa 30.4 4.0 95.1 113.0 3.77 1.513 16.90 1.0 1.0 5.0 \n",
"\n",
" carb price \n",
"brand \n",
"Mazda RX4 4.0 100000.0 \n",
"Lotus Europa 2.0 120000.0 "
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.dropna()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Параметром `axis` можно выбрать, отбрасывать ли строки или столбцы. Параметр `how` может быть равен или `all` или `any` (по умолчанию `any`) и в зависимости от этого отбрасываются те строки таблицы (столбцы), в которых пропущены все или хотя бы одно значения. \n",
"\n",
"Вместо того чтобы отбросить пропущенные значения, методом [df.fillna](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.fillna.html) можно их заполнить. В самом простом варианте он заполняет `na` во всех столбцах одним и тем же значением (по умолчанию не на месте)."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" mpg | \n",
" cyl | \n",
" disp | \n",
" hp | \n",
" drat | \n",
" wt | \n",
" qsec | \n",
" vs | \n",
" am | \n",
" gear | \n",
" carb | \n",
" price | \n",
"
\n",
" \n",
" brand | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" Mazda RX4 | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.620 | \n",
" 16.46 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 100000.0 | \n",
"
\n",
" \n",
" Mazda RX4 Wag | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.875 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Datsun 710 | \n",
" 22.8 | \n",
" 4.0 | \n",
" 108.0 | \n",
" 93.0 | \n",
" 3.85 | \n",
" 2.320 | \n",
" 18.61 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Hornet 4 Drive | \n",
" 21.4 | \n",
" 6.0 | \n",
" 258.0 | \n",
" 110.0 | \n",
" 3.08 | \n",
" 3.215 | \n",
" 19.44 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Hornet Sportabout | \n",
" 18.7 | \n",
" 8.0 | \n",
" 360.0 | \n",
" 175.0 | \n",
" 3.15 | \n",
" 3.440 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Valiant | \n",
" 18.1 | \n",
" 6.0 | \n",
" 225.0 | \n",
" 105.0 | \n",
" 2.76 | \n",
" 3.460 | \n",
" 20.22 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Duster 360 | \n",
" 14.3 | \n",
" 8.0 | \n",
" 360.0 | \n",
" 245.0 | \n",
" 3.21 | \n",
" 3.570 | \n",
" 15.84 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Merc 240D | \n",
" 24.4 | \n",
" 4.0 | \n",
" 146.7 | \n",
" 62.0 | \n",
" 3.69 | \n",
" 3.190 | \n",
" 20.00 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Merc 230 | \n",
" 22.8 | \n",
" 4.0 | \n",
" 140.8 | \n",
" 95.0 | \n",
" 3.92 | \n",
" 3.150 | \n",
" 22.90 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Merc 280 | \n",
" 19.2 | \n",
" 6.0 | \n",
" 167.6 | \n",
" 123.0 | \n",
" 3.92 | \n",
" 3.440 | \n",
" 18.30 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Merc 280C | \n",
" 17.8 | \n",
" 6.0 | \n",
" 167.6 | \n",
" 123.0 | \n",
" 3.92 | \n",
" 3.440 | \n",
" 18.90 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Merc 450SE | \n",
" 16.4 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 4.070 | \n",
" 17.40 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Merc 450SL | \n",
" 17.3 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 3.730 | \n",
" 17.60 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Merc 450SLC | \n",
" 15.2 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 3.780 | \n",
" 18.00 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Cadillac Fleetwood | \n",
" 10.4 | \n",
" 8.0 | \n",
" 472.0 | \n",
" 205.0 | \n",
" 2.93 | \n",
" 5.250 | \n",
" 17.98 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Lincoln Continental | \n",
" 10.4 | \n",
" 8.0 | \n",
" 460.0 | \n",
" 215.0 | \n",
" 3.00 | \n",
" 5.424 | \n",
" 17.82 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Chrysler Imperial | \n",
" 14.7 | \n",
" 8.0 | \n",
" 440.0 | \n",
" 230.0 | \n",
" 3.23 | \n",
" 5.345 | \n",
" 17.42 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Fiat 128 | \n",
" 32.4 | \n",
" 4.0 | \n",
" 78.7 | \n",
" 66.0 | \n",
" 4.08 | \n",
" 2.200 | \n",
" 19.47 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Honda Civic | \n",
" 30.4 | \n",
" 4.0 | \n",
" 75.7 | \n",
" 52.0 | \n",
" 4.93 | \n",
" 1.615 | \n",
" 18.52 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Toyota Corolla | \n",
" 33.9 | \n",
" 4.0 | \n",
" 71.1 | \n",
" 65.0 | \n",
" 4.22 | \n",
" 1.835 | \n",
" 19.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Toyota Corona | \n",
" 21.5 | \n",
" 4.0 | \n",
" 120.1 | \n",
" 97.0 | \n",
" 3.70 | \n",
" 2.465 | \n",
" 20.01 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Dodge Challenger | \n",
" 15.5 | \n",
" 8.0 | \n",
" 318.0 | \n",
" 150.0 | \n",
" 2.76 | \n",
" 3.520 | \n",
" 16.87 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" AMC Javelin | \n",
" 15.2 | \n",
" 8.0 | \n",
" 304.0 | \n",
" 150.0 | \n",
" 3.15 | \n",
" 3.435 | \n",
" 17.30 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Camaro Z28 | \n",
" 13.3 | \n",
" 8.0 | \n",
" 350.0 | \n",
" 245.0 | \n",
" 3.73 | \n",
" 3.840 | \n",
" 15.41 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Pontiac Firebird | \n",
" 19.2 | \n",
" 8.0 | \n",
" 400.0 | \n",
" 175.0 | \n",
" 3.08 | \n",
" 3.845 | \n",
" 17.05 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Fiat X1-9 | \n",
" 27.3 | \n",
" 4.0 | \n",
" 79.0 | \n",
" 66.0 | \n",
" 4.08 | \n",
" 1.935 | \n",
" 18.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Porsche 914-2 | \n",
" 26.0 | \n",
" 4.0 | \n",
" 120.3 | \n",
" 91.0 | \n",
" 4.43 | \n",
" 2.140 | \n",
" 16.70 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Lotus Europa | \n",
" 30.4 | \n",
" 4.0 | \n",
" 95.1 | \n",
" 113.0 | \n",
" 3.77 | \n",
" 1.513 | \n",
" 16.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
" 120000.0 | \n",
"
\n",
" \n",
" Ford Pantera L | \n",
" 15.8 | \n",
" 8.0 | \n",
" 351.0 | \n",
" 264.0 | \n",
" 4.22 | \n",
" 3.170 | \n",
" 14.50 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 4.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Ferrari Dino | \n",
" 19.7 | \n",
" 6.0 | \n",
" 145.0 | \n",
" 175.0 | \n",
" 3.62 | \n",
" 2.770 | \n",
" 15.50 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 6.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Maserati Bora | \n",
" 15.0 | \n",
" 8.0 | \n",
" 301.0 | \n",
" 335.0 | \n",
" 3.54 | \n",
" 3.570 | \n",
" 14.60 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 8.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
" Volvo 142E | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.00 | \n",
" 0.000 | \n",
" 0.00 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" mpg cyl disp hp drat wt qsec vs am \\\n",
"brand \n",
"Mazda RX4 21.0 6.0 160.0 110.0 3.90 2.620 16.46 0.0 1.0 \n",
"Mazda RX4 Wag 21.0 6.0 160.0 110.0 3.90 2.875 17.02 0.0 1.0 \n",
"Datsun 710 22.8 4.0 108.0 93.0 3.85 2.320 18.61 1.0 1.0 \n",
"Hornet 4 Drive 21.4 6.0 258.0 110.0 3.08 3.215 19.44 1.0 0.0 \n",
"Hornet Sportabout 18.7 8.0 360.0 175.0 3.15 3.440 17.02 0.0 0.0 \n",
"Valiant 18.1 6.0 225.0 105.0 2.76 3.460 20.22 1.0 0.0 \n",
"Duster 360 14.3 8.0 360.0 245.0 3.21 3.570 15.84 0.0 0.0 \n",
"Merc 240D 24.4 4.0 146.7 62.0 3.69 3.190 20.00 1.0 0.0 \n",
"Merc 230 22.8 4.0 140.8 95.0 3.92 3.150 22.90 1.0 0.0 \n",
"Merc 280 19.2 6.0 167.6 123.0 3.92 3.440 18.30 1.0 0.0 \n",
"Merc 280C 17.8 6.0 167.6 123.0 3.92 3.440 18.90 1.0 0.0 \n",
"Merc 450SE 16.4 8.0 275.8 180.0 3.07 4.070 17.40 0.0 0.0 \n",
"Merc 450SL 17.3 8.0 275.8 180.0 3.07 3.730 17.60 0.0 0.0 \n",
"Merc 450SLC 15.2 8.0 275.8 180.0 3.07 3.780 18.00 0.0 0.0 \n",
"Cadillac Fleetwood 10.4 8.0 472.0 205.0 2.93 5.250 17.98 0.0 0.0 \n",
"Lincoln Continental 10.4 8.0 460.0 215.0 3.00 5.424 17.82 0.0 0.0 \n",
"Chrysler Imperial 14.7 8.0 440.0 230.0 3.23 5.345 17.42 0.0 0.0 \n",
"Fiat 128 32.4 4.0 78.7 66.0 4.08 2.200 19.47 1.0 1.0 \n",
"Honda Civic 30.4 4.0 75.7 52.0 4.93 1.615 18.52 1.0 1.0 \n",
"Toyota Corolla 33.9 4.0 71.1 65.0 4.22 1.835 19.90 1.0 1.0 \n",
"Toyota Corona 21.5 4.0 120.1 97.0 3.70 2.465 20.01 1.0 0.0 \n",
"Dodge Challenger 15.5 8.0 318.0 150.0 2.76 3.520 16.87 0.0 0.0 \n",
"AMC Javelin 15.2 8.0 304.0 150.0 3.15 3.435 17.30 0.0 0.0 \n",
"Camaro Z28 13.3 8.0 350.0 245.0 3.73 3.840 15.41 0.0 0.0 \n",
"Pontiac Firebird 19.2 8.0 400.0 175.0 3.08 3.845 17.05 0.0 0.0 \n",
"Fiat X1-9 27.3 4.0 79.0 66.0 4.08 1.935 18.90 1.0 1.0 \n",
"Porsche 914-2 26.0 4.0 120.3 91.0 4.43 2.140 16.70 0.0 1.0 \n",
"Lotus Europa 30.4 4.0 95.1 113.0 3.77 1.513 16.90 1.0 1.0 \n",
"Ford Pantera L 15.8 8.0 351.0 264.0 4.22 3.170 14.50 0.0 1.0 \n",
"Ferrari Dino 19.7 6.0 145.0 175.0 3.62 2.770 15.50 0.0 1.0 \n",
"Maserati Bora 15.0 8.0 301.0 335.0 3.54 3.570 14.60 0.0 1.0 \n",
"Volvo 142E 0.0 0.0 0.0 0.0 0.00 0.000 0.00 0.0 0.0 \n",
"\n",
" gear carb price \n",
"brand \n",
"Mazda RX4 4.0 4.0 100000.0 \n",
"Mazda RX4 Wag 4.0 4.0 0.0 \n",
"Datsun 710 4.0 1.0 0.0 \n",
"Hornet 4 Drive 3.0 1.0 0.0 \n",
"Hornet Sportabout 3.0 2.0 0.0 \n",
"Valiant 3.0 1.0 0.0 \n",
"Duster 360 3.0 4.0 0.0 \n",
"Merc 240D 4.0 2.0 0.0 \n",
"Merc 230 4.0 2.0 0.0 \n",
"Merc 280 4.0 4.0 0.0 \n",
"Merc 280C 4.0 4.0 0.0 \n",
"Merc 450SE 3.0 3.0 0.0 \n",
"Merc 450SL 3.0 3.0 0.0 \n",
"Merc 450SLC 3.0 3.0 0.0 \n",
"Cadillac Fleetwood 3.0 4.0 0.0 \n",
"Lincoln Continental 3.0 4.0 0.0 \n",
"Chrysler Imperial 3.0 4.0 0.0 \n",
"Fiat 128 4.0 1.0 0.0 \n",
"Honda Civic 4.0 2.0 0.0 \n",
"Toyota Corolla 4.0 1.0 0.0 \n",
"Toyota Corona 3.0 1.0 0.0 \n",
"Dodge Challenger 3.0 2.0 0.0 \n",
"AMC Javelin 3.0 2.0 0.0 \n",
"Camaro Z28 3.0 4.0 0.0 \n",
"Pontiac Firebird 3.0 2.0 0.0 \n",
"Fiat X1-9 4.0 1.0 0.0 \n",
"Porsche 914-2 5.0 2.0 0.0 \n",
"Lotus Europa 5.0 2.0 120000.0 \n",
"Ford Pantera L 5.0 4.0 0.0 \n",
"Ferrari Dino 5.0 6.0 0.0 \n",
"Maserati Bora 5.0 8.0 0.0 \n",
"Volvo 142E 0.0 0.0 0.0 "
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.fillna(0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Можно настроить более гибкое заполнение. Например, передавая словарь, можно выбрать свой заполнитель для каждого столбца. А параметром `method` можно заполнять пропущенные значения ближайшим непропущенным в строчках выше или ниже.\n",
"\n",
"Методом [DataFrame.interpolate](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.interpolate.html) можно заполнить пропущенные значения интерполируя по непропущенным."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" mpg | \n",
" cyl | \n",
" disp | \n",
" hp | \n",
" drat | \n",
" wt | \n",
" qsec | \n",
" vs | \n",
" am | \n",
" gear | \n",
" carb | \n",
" price | \n",
"
\n",
" \n",
" brand | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" Mazda RX4 | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.620 | \n",
" 16.46 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 100000.000000 | \n",
"
\n",
" \n",
" Mazda RX4 Wag | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.875 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 100740.740741 | \n",
"
\n",
" \n",
" Datsun 710 | \n",
" 22.8 | \n",
" 4.0 | \n",
" 108.0 | \n",
" 93.0 | \n",
" 3.85 | \n",
" 2.320 | \n",
" 18.61 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" 101481.481481 | \n",
"
\n",
" \n",
" Hornet 4 Drive | \n",
" 21.4 | \n",
" 6.0 | \n",
" 258.0 | \n",
" 110.0 | \n",
" 3.08 | \n",
" 3.215 | \n",
" 19.44 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" 102222.222222 | \n",
"
\n",
" \n",
" Hornet Sportabout | \n",
" 18.7 | \n",
" 8.0 | \n",
" 360.0 | \n",
" 175.0 | \n",
" 3.15 | \n",
" 3.440 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" 102962.962963 | \n",
"
\n",
" \n",
" Valiant | \n",
" 18.1 | \n",
" 6.0 | \n",
" 225.0 | \n",
" 105.0 | \n",
" 2.76 | \n",
" 3.460 | \n",
" 20.22 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" 103703.703704 | \n",
"
\n",
" \n",
" Duster 360 | \n",
" 14.3 | \n",
" 8.0 | \n",
" 360.0 | \n",
" 245.0 | \n",
" 3.21 | \n",
" 3.570 | \n",
" 15.84 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" 104444.444444 | \n",
"
\n",
" \n",
" Merc 240D | \n",
" 24.4 | \n",
" 4.0 | \n",
" 146.7 | \n",
" 62.0 | \n",
" 3.69 | \n",
" 3.190 | \n",
" 20.00 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" 105185.185185 | \n",
"
\n",
" \n",
" Merc 230 | \n",
" 22.8 | \n",
" 4.0 | \n",
" 140.8 | \n",
" 95.0 | \n",
" 3.92 | \n",
" 3.150 | \n",
" 22.90 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" 105925.925926 | \n",
"
\n",
" \n",
" Merc 280 | \n",
" 19.2 | \n",
" 6.0 | \n",
" 167.6 | \n",
" 123.0 | \n",
" 3.92 | \n",
" 3.440 | \n",
" 18.30 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 106666.666667 | \n",
"
\n",
" \n",
" Merc 280C | \n",
" 17.8 | \n",
" 6.0 | \n",
" 167.6 | \n",
" 123.0 | \n",
" 3.92 | \n",
" 3.440 | \n",
" 18.90 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 107407.407407 | \n",
"
\n",
" \n",
" Merc 450SE | \n",
" 16.4 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 4.070 | \n",
" 17.40 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" 108148.148148 | \n",
"
\n",
" \n",
" Merc 450SL | \n",
" 17.3 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 3.730 | \n",
" 17.60 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" 108888.888889 | \n",
"
\n",
" \n",
" Merc 450SLC | \n",
" 15.2 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 3.780 | \n",
" 18.00 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" 109629.629630 | \n",
"
\n",
" \n",
" Cadillac Fleetwood | \n",
" 10.4 | \n",
" 8.0 | \n",
" 472.0 | \n",
" 205.0 | \n",
" 2.93 | \n",
" 5.250 | \n",
" 17.98 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" 110370.370370 | \n",
"
\n",
" \n",
" Lincoln Continental | \n",
" 10.4 | \n",
" 8.0 | \n",
" 460.0 | \n",
" 215.0 | \n",
" 3.00 | \n",
" 5.424 | \n",
" 17.82 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" 111111.111111 | \n",
"
\n",
" \n",
" Chrysler Imperial | \n",
" 14.7 | \n",
" 8.0 | \n",
" 440.0 | \n",
" 230.0 | \n",
" 3.23 | \n",
" 5.345 | \n",
" 17.42 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" 111851.851852 | \n",
"
\n",
" \n",
" Fiat 128 | \n",
" 32.4 | \n",
" 4.0 | \n",
" 78.7 | \n",
" 66.0 | \n",
" 4.08 | \n",
" 2.200 | \n",
" 19.47 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" 112592.592593 | \n",
"
\n",
" \n",
" Honda Civic | \n",
" 30.4 | \n",
" 4.0 | \n",
" 75.7 | \n",
" 52.0 | \n",
" 4.93 | \n",
" 1.615 | \n",
" 18.52 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" 113333.333333 | \n",
"
\n",
" \n",
" Toyota Corolla | \n",
" 33.9 | \n",
" 4.0 | \n",
" 71.1 | \n",
" 65.0 | \n",
" 4.22 | \n",
" 1.835 | \n",
" 19.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" 114074.074074 | \n",
"
\n",
" \n",
" Toyota Corona | \n",
" 21.5 | \n",
" 4.0 | \n",
" 120.1 | \n",
" 97.0 | \n",
" 3.70 | \n",
" 2.465 | \n",
" 20.01 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" 114814.814815 | \n",
"
\n",
" \n",
" Dodge Challenger | \n",
" 15.5 | \n",
" 8.0 | \n",
" 318.0 | \n",
" 150.0 | \n",
" 2.76 | \n",
" 3.520 | \n",
" 16.87 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" 115555.555556 | \n",
"
\n",
" \n",
" AMC Javelin | \n",
" 15.2 | \n",
" 8.0 | \n",
" 304.0 | \n",
" 150.0 | \n",
" 3.15 | \n",
" 3.435 | \n",
" 17.30 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" 116296.296296 | \n",
"
\n",
" \n",
" Camaro Z28 | \n",
" 13.3 | \n",
" 8.0 | \n",
" 350.0 | \n",
" 245.0 | \n",
" 3.73 | \n",
" 3.840 | \n",
" 15.41 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" 117037.037037 | \n",
"
\n",
" \n",
" Pontiac Firebird | \n",
" 19.2 | \n",
" 8.0 | \n",
" 400.0 | \n",
" 175.0 | \n",
" 3.08 | \n",
" 3.845 | \n",
" 17.05 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" 117777.777778 | \n",
"
\n",
" \n",
" Fiat X1-9 | \n",
" 27.3 | \n",
" 4.0 | \n",
" 79.0 | \n",
" 66.0 | \n",
" 4.08 | \n",
" 1.935 | \n",
" 18.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" 118518.518519 | \n",
"
\n",
" \n",
" Porsche 914-2 | \n",
" 26.0 | \n",
" 4.0 | \n",
" 120.3 | \n",
" 91.0 | \n",
" 4.43 | \n",
" 2.140 | \n",
" 16.70 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
" 119259.259259 | \n",
"
\n",
" \n",
" Lotus Europa | \n",
" 30.4 | \n",
" 4.0 | \n",
" 95.1 | \n",
" 113.0 | \n",
" 3.77 | \n",
" 1.513 | \n",
" 16.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
" 120000.000000 | \n",
"
\n",
" \n",
" Ford Pantera L | \n",
" 15.8 | \n",
" 8.0 | \n",
" 351.0 | \n",
" 264.0 | \n",
" 4.22 | \n",
" 3.170 | \n",
" 14.50 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 4.0 | \n",
" 120000.000000 | \n",
"
\n",
" \n",
" Ferrari Dino | \n",
" 19.7 | \n",
" 6.0 | \n",
" 145.0 | \n",
" 175.0 | \n",
" 3.62 | \n",
" 2.770 | \n",
" 15.50 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 6.0 | \n",
" 120000.000000 | \n",
"
\n",
" \n",
" Maserati Bora | \n",
" 15.0 | \n",
" 8.0 | \n",
" 301.0 | \n",
" 335.0 | \n",
" 3.54 | \n",
" 3.570 | \n",
" 14.60 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 8.0 | \n",
" 120000.000000 | \n",
"
\n",
" \n",
" Volvo 142E | \n",
" 15.0 | \n",
" 8.0 | \n",
" 301.0 | \n",
" 335.0 | \n",
" 3.54 | \n",
" 3.570 | \n",
" 14.60 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 8.0 | \n",
" 120000.000000 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" mpg cyl disp hp drat wt qsec vs am \\\n",
"brand \n",
"Mazda RX4 21.0 6.0 160.0 110.0 3.90 2.620 16.46 0.0 1.0 \n",
"Mazda RX4 Wag 21.0 6.0 160.0 110.0 3.90 2.875 17.02 0.0 1.0 \n",
"Datsun 710 22.8 4.0 108.0 93.0 3.85 2.320 18.61 1.0 1.0 \n",
"Hornet 4 Drive 21.4 6.0 258.0 110.0 3.08 3.215 19.44 1.0 0.0 \n",
"Hornet Sportabout 18.7 8.0 360.0 175.0 3.15 3.440 17.02 0.0 0.0 \n",
"Valiant 18.1 6.0 225.0 105.0 2.76 3.460 20.22 1.0 0.0 \n",
"Duster 360 14.3 8.0 360.0 245.0 3.21 3.570 15.84 0.0 0.0 \n",
"Merc 240D 24.4 4.0 146.7 62.0 3.69 3.190 20.00 1.0 0.0 \n",
"Merc 230 22.8 4.0 140.8 95.0 3.92 3.150 22.90 1.0 0.0 \n",
"Merc 280 19.2 6.0 167.6 123.0 3.92 3.440 18.30 1.0 0.0 \n",
"Merc 280C 17.8 6.0 167.6 123.0 3.92 3.440 18.90 1.0 0.0 \n",
"Merc 450SE 16.4 8.0 275.8 180.0 3.07 4.070 17.40 0.0 0.0 \n",
"Merc 450SL 17.3 8.0 275.8 180.0 3.07 3.730 17.60 0.0 0.0 \n",
"Merc 450SLC 15.2 8.0 275.8 180.0 3.07 3.780 18.00 0.0 0.0 \n",
"Cadillac Fleetwood 10.4 8.0 472.0 205.0 2.93 5.250 17.98 0.0 0.0 \n",
"Lincoln Continental 10.4 8.0 460.0 215.0 3.00 5.424 17.82 0.0 0.0 \n",
"Chrysler Imperial 14.7 8.0 440.0 230.0 3.23 5.345 17.42 0.0 0.0 \n",
"Fiat 128 32.4 4.0 78.7 66.0 4.08 2.200 19.47 1.0 1.0 \n",
"Honda Civic 30.4 4.0 75.7 52.0 4.93 1.615 18.52 1.0 1.0 \n",
"Toyota Corolla 33.9 4.0 71.1 65.0 4.22 1.835 19.90 1.0 1.0 \n",
"Toyota Corona 21.5 4.0 120.1 97.0 3.70 2.465 20.01 1.0 0.0 \n",
"Dodge Challenger 15.5 8.0 318.0 150.0 2.76 3.520 16.87 0.0 0.0 \n",
"AMC Javelin 15.2 8.0 304.0 150.0 3.15 3.435 17.30 0.0 0.0 \n",
"Camaro Z28 13.3 8.0 350.0 245.0 3.73 3.840 15.41 0.0 0.0 \n",
"Pontiac Firebird 19.2 8.0 400.0 175.0 3.08 3.845 17.05 0.0 0.0 \n",
"Fiat X1-9 27.3 4.0 79.0 66.0 4.08 1.935 18.90 1.0 1.0 \n",
"Porsche 914-2 26.0 4.0 120.3 91.0 4.43 2.140 16.70 0.0 1.0 \n",
"Lotus Europa 30.4 4.0 95.1 113.0 3.77 1.513 16.90 1.0 1.0 \n",
"Ford Pantera L 15.8 8.0 351.0 264.0 4.22 3.170 14.50 0.0 1.0 \n",
"Ferrari Dino 19.7 6.0 145.0 175.0 3.62 2.770 15.50 0.0 1.0 \n",
"Maserati Bora 15.0 8.0 301.0 335.0 3.54 3.570 14.60 0.0 1.0 \n",
"Volvo 142E 15.0 8.0 301.0 335.0 3.54 3.570 14.60 0.0 1.0 \n",
"\n",
" gear carb price \n",
"brand \n",
"Mazda RX4 4.0 4.0 100000.000000 \n",
"Mazda RX4 Wag 4.0 4.0 100740.740741 \n",
"Datsun 710 4.0 1.0 101481.481481 \n",
"Hornet 4 Drive 3.0 1.0 102222.222222 \n",
"Hornet Sportabout 3.0 2.0 102962.962963 \n",
"Valiant 3.0 1.0 103703.703704 \n",
"Duster 360 3.0 4.0 104444.444444 \n",
"Merc 240D 4.0 2.0 105185.185185 \n",
"Merc 230 4.0 2.0 105925.925926 \n",
"Merc 280 4.0 4.0 106666.666667 \n",
"Merc 280C 4.0 4.0 107407.407407 \n",
"Merc 450SE 3.0 3.0 108148.148148 \n",
"Merc 450SL 3.0 3.0 108888.888889 \n",
"Merc 450SLC 3.0 3.0 109629.629630 \n",
"Cadillac Fleetwood 3.0 4.0 110370.370370 \n",
"Lincoln Continental 3.0 4.0 111111.111111 \n",
"Chrysler Imperial 3.0 4.0 111851.851852 \n",
"Fiat 128 4.0 1.0 112592.592593 \n",
"Honda Civic 4.0 2.0 113333.333333 \n",
"Toyota Corolla 4.0 1.0 114074.074074 \n",
"Toyota Corona 3.0 1.0 114814.814815 \n",
"Dodge Challenger 3.0 2.0 115555.555556 \n",
"AMC Javelin 3.0 2.0 116296.296296 \n",
"Camaro Z28 3.0 4.0 117037.037037 \n",
"Pontiac Firebird 3.0 2.0 117777.777778 \n",
"Fiat X1-9 4.0 1.0 118518.518519 \n",
"Porsche 914-2 5.0 2.0 119259.259259 \n",
"Lotus Europa 5.0 2.0 120000.000000 \n",
"Ford Pantera L 5.0 4.0 120000.000000 \n",
"Ferrari Dino 5.0 6.0 120000.000000 \n",
"Maserati Bora 5.0 8.0 120000.000000 \n",
"Volvo 142E 5.0 8.0 120000.000000 "
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.interpolate(method=\"linear\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## apply, map, applymap\n",
"\n",
"Применить некоторую функцию к всем строкам/столбцам таблицы можно методом [DataFrame.apply](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.apply.html). Передавать этому методу необходимо такую функцию, рассчитывая, что ей на вход будет подаваться объект `pd.series`, который в зависимости от значения параметра `axis` будет или строкой или столбцом таблицы."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"mpg 23.50\n",
"cyl 4.00\n",
"disp 400.90\n",
"hp 283.00\n",
"drat 2.17\n",
"dtype: float64"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"def f(x):\n",
" return x.max() - x.min() \n",
"\n",
"df.apply(f, axis=\"rows\").head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Метод [Series.map](https://pandas.pydata.org/docs/reference/api/pandas.Series.map.html) позволяет применить функцию к каждому значению столбца. Чтобы продемонстрировать принцип работы этого метода, извлечем из столбца `brand` марку и модель автомобиля в разные столбцы.\n",
"\n",
"Для этого определим функции, которые в качестве аргумента принимают строку из столбца `brand` и возвращают марку или модель автомобиля из неё. В данной таблице, марка автомобиля всегда встречается до первого пробела, а модель автомобиля --- все после первого пробела."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Mazda RX4 Wag -> Mazda and RX4 Wag\n"
]
}
],
"source": [
"def get_brand(x):\n",
" return x.split(\" \")[0]\n",
"\n",
"def get_model(x):\n",
" b, *m = x.split(\" \") # все после первого пробела попадает в переменную m\n",
" return \" \".join(m)\n",
"\n",
"x = \"Mazda RX4 Wag\"\n",
"print(f\"{x} -> {get_brand(x)} and {get_model(x)}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Определив эти функции применим их к столбцу `brand`. Т.к. `brand` --- индекс, то создадим из него `pd.Series`, применим определенные функции методом `map` и запишем результаты в таблицу в качестве новых столбцов."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" mpg | \n",
" cyl | \n",
" disp | \n",
" hp | \n",
" drat | \n",
" wt | \n",
" qsec | \n",
" vs | \n",
" am | \n",
" gear | \n",
" carb | \n",
" price | \n",
" brand | \n",
" model | \n",
"
\n",
" \n",
" car | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" Mazda RX4 | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.620 | \n",
" 16.46 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 100000.0 | \n",
" Mazda | \n",
" RX4 | \n",
"
\n",
" \n",
" Mazda RX4 Wag | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.875 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Mazda | \n",
" RX4 Wag | \n",
"
\n",
" \n",
" Datsun 710 | \n",
" 22.8 | \n",
" 4.0 | \n",
" 108.0 | \n",
" 93.0 | \n",
" 3.85 | \n",
" 2.320 | \n",
" 18.61 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Datsun | \n",
" 710 | \n",
"
\n",
" \n",
" Hornet 4 Drive | \n",
" 21.4 | \n",
" 6.0 | \n",
" 258.0 | \n",
" 110.0 | \n",
" 3.08 | \n",
" 3.215 | \n",
" 19.44 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Hornet | \n",
" 4 Drive | \n",
"
\n",
" \n",
" Hornet Sportabout | \n",
" 18.7 | \n",
" 8.0 | \n",
" 360.0 | \n",
" 175.0 | \n",
" 3.15 | \n",
" 3.440 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Hornet | \n",
" Sportabout | \n",
"
\n",
" \n",
" Valiant | \n",
" 18.1 | \n",
" 6.0 | \n",
" 225.0 | \n",
" 105.0 | \n",
" 2.76 | \n",
" 3.460 | \n",
" 20.22 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Valiant | \n",
" | \n",
"
\n",
" \n",
" Duster 360 | \n",
" 14.3 | \n",
" 8.0 | \n",
" 360.0 | \n",
" 245.0 | \n",
" 3.21 | \n",
" 3.570 | \n",
" 15.84 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Duster | \n",
" 360 | \n",
"
\n",
" \n",
" Merc 240D | \n",
" 24.4 | \n",
" 4.0 | \n",
" 146.7 | \n",
" 62.0 | \n",
" 3.69 | \n",
" 3.190 | \n",
" 20.00 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Merc | \n",
" 240D | \n",
"
\n",
" \n",
" Merc 230 | \n",
" 22.8 | \n",
" 4.0 | \n",
" 140.8 | \n",
" 95.0 | \n",
" 3.92 | \n",
" 3.150 | \n",
" 22.90 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Merc | \n",
" 230 | \n",
"
\n",
" \n",
" Merc 280 | \n",
" 19.2 | \n",
" 6.0 | \n",
" 167.6 | \n",
" 123.0 | \n",
" 3.92 | \n",
" 3.440 | \n",
" 18.30 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Merc | \n",
" 280 | \n",
"
\n",
" \n",
" Merc 280C | \n",
" 17.8 | \n",
" 6.0 | \n",
" 167.6 | \n",
" 123.0 | \n",
" 3.92 | \n",
" 3.440 | \n",
" 18.90 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Merc | \n",
" 280C | \n",
"
\n",
" \n",
" Merc 450SE | \n",
" 16.4 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 4.070 | \n",
" 17.40 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" NaN | \n",
" Merc | \n",
" 450SE | \n",
"
\n",
" \n",
" Merc 450SL | \n",
" 17.3 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 3.730 | \n",
" 17.60 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" NaN | \n",
" Merc | \n",
" 450SL | \n",
"
\n",
" \n",
" Merc 450SLC | \n",
" 15.2 | \n",
" 8.0 | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 3.780 | \n",
" 18.00 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" NaN | \n",
" Merc | \n",
" 450SLC | \n",
"
\n",
" \n",
" Cadillac Fleetwood | \n",
" 10.4 | \n",
" 8.0 | \n",
" 472.0 | \n",
" 205.0 | \n",
" 2.93 | \n",
" 5.250 | \n",
" 17.98 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Cadillac | \n",
" Fleetwood | \n",
"
\n",
" \n",
" Lincoln Continental | \n",
" 10.4 | \n",
" 8.0 | \n",
" 460.0 | \n",
" 215.0 | \n",
" 3.00 | \n",
" 5.424 | \n",
" 17.82 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Lincoln | \n",
" Continental | \n",
"
\n",
" \n",
" Chrysler Imperial | \n",
" 14.7 | \n",
" 8.0 | \n",
" 440.0 | \n",
" 230.0 | \n",
" 3.23 | \n",
" 5.345 | \n",
" 17.42 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Chrysler | \n",
" Imperial | \n",
"
\n",
" \n",
" Fiat 128 | \n",
" 32.4 | \n",
" 4.0 | \n",
" 78.7 | \n",
" 66.0 | \n",
" 4.08 | \n",
" 2.200 | \n",
" 19.47 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Fiat | \n",
" 128 | \n",
"
\n",
" \n",
" Honda Civic | \n",
" 30.4 | \n",
" 4.0 | \n",
" 75.7 | \n",
" 52.0 | \n",
" 4.93 | \n",
" 1.615 | \n",
" 18.52 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Honda | \n",
" Civic | \n",
"
\n",
" \n",
" Toyota Corolla | \n",
" 33.9 | \n",
" 4.0 | \n",
" 71.1 | \n",
" 65.0 | \n",
" 4.22 | \n",
" 1.835 | \n",
" 19.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Toyota | \n",
" Corolla | \n",
"
\n",
" \n",
" Toyota Corona | \n",
" 21.5 | \n",
" 4.0 | \n",
" 120.1 | \n",
" 97.0 | \n",
" 3.70 | \n",
" 2.465 | \n",
" 20.01 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Toyota | \n",
" Corona | \n",
"
\n",
" \n",
" Dodge Challenger | \n",
" 15.5 | \n",
" 8.0 | \n",
" 318.0 | \n",
" 150.0 | \n",
" 2.76 | \n",
" 3.520 | \n",
" 16.87 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Dodge | \n",
" Challenger | \n",
"
\n",
" \n",
" AMC Javelin | \n",
" 15.2 | \n",
" 8.0 | \n",
" 304.0 | \n",
" 150.0 | \n",
" 3.15 | \n",
" 3.435 | \n",
" 17.30 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" NaN | \n",
" AMC | \n",
" Javelin | \n",
"
\n",
" \n",
" Camaro Z28 | \n",
" 13.3 | \n",
" 8.0 | \n",
" 350.0 | \n",
" 245.0 | \n",
" 3.73 | \n",
" 3.840 | \n",
" 15.41 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Camaro | \n",
" Z28 | \n",
"
\n",
" \n",
" Pontiac Firebird | \n",
" 19.2 | \n",
" 8.0 | \n",
" 400.0 | \n",
" 175.0 | \n",
" 3.08 | \n",
" 3.845 | \n",
" 17.05 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Pontiac | \n",
" Firebird | \n",
"
\n",
" \n",
" Fiat X1-9 | \n",
" 27.3 | \n",
" 4.0 | \n",
" 79.0 | \n",
" 66.0 | \n",
" 4.08 | \n",
" 1.935 | \n",
" 18.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Fiat | \n",
" X1-9 | \n",
"
\n",
" \n",
" Porsche 914-2 | \n",
" 26.0 | \n",
" 4.0 | \n",
" 120.3 | \n",
" 91.0 | \n",
" 4.43 | \n",
" 2.140 | \n",
" 16.70 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Porsche | \n",
" 914-2 | \n",
"
\n",
" \n",
" Lotus Europa | \n",
" 30.4 | \n",
" 4.0 | \n",
" 95.1 | \n",
" 113.0 | \n",
" 3.77 | \n",
" 1.513 | \n",
" 16.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
" 120000.0 | \n",
" Lotus | \n",
" Europa | \n",
"
\n",
" \n",
" Ford Pantera L | \n",
" 15.8 | \n",
" 8.0 | \n",
" 351.0 | \n",
" 264.0 | \n",
" 4.22 | \n",
" 3.170 | \n",
" 14.50 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Ford | \n",
" Pantera L | \n",
"
\n",
" \n",
" Ferrari Dino | \n",
" 19.7 | \n",
" 6.0 | \n",
" 145.0 | \n",
" 175.0 | \n",
" 3.62 | \n",
" 2.770 | \n",
" 15.50 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 6.0 | \n",
" NaN | \n",
" Ferrari | \n",
" Dino | \n",
"
\n",
" \n",
" Maserati Bora | \n",
" 15.0 | \n",
" 8.0 | \n",
" 301.0 | \n",
" 335.0 | \n",
" 3.54 | \n",
" 3.570 | \n",
" 14.60 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 8.0 | \n",
" NaN | \n",
" Maserati | \n",
" Bora | \n",
"
\n",
" \n",
" Volvo 142E | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" Volvo | \n",
" 142E | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" mpg cyl disp hp drat wt qsec vs am \\\n",
"car \n",
"Mazda RX4 21.0 6.0 160.0 110.0 3.90 2.620 16.46 0.0 1.0 \n",
"Mazda RX4 Wag 21.0 6.0 160.0 110.0 3.90 2.875 17.02 0.0 1.0 \n",
"Datsun 710 22.8 4.0 108.0 93.0 3.85 2.320 18.61 1.0 1.0 \n",
"Hornet 4 Drive 21.4 6.0 258.0 110.0 3.08 3.215 19.44 1.0 0.0 \n",
"Hornet Sportabout 18.7 8.0 360.0 175.0 3.15 3.440 17.02 0.0 0.0 \n",
"Valiant 18.1 6.0 225.0 105.0 2.76 3.460 20.22 1.0 0.0 \n",
"Duster 360 14.3 8.0 360.0 245.0 3.21 3.570 15.84 0.0 0.0 \n",
"Merc 240D 24.4 4.0 146.7 62.0 3.69 3.190 20.00 1.0 0.0 \n",
"Merc 230 22.8 4.0 140.8 95.0 3.92 3.150 22.90 1.0 0.0 \n",
"Merc 280 19.2 6.0 167.6 123.0 3.92 3.440 18.30 1.0 0.0 \n",
"Merc 280C 17.8 6.0 167.6 123.0 3.92 3.440 18.90 1.0 0.0 \n",
"Merc 450SE 16.4 8.0 275.8 180.0 3.07 4.070 17.40 0.0 0.0 \n",
"Merc 450SL 17.3 8.0 275.8 180.0 3.07 3.730 17.60 0.0 0.0 \n",
"Merc 450SLC 15.2 8.0 275.8 180.0 3.07 3.780 18.00 0.0 0.0 \n",
"Cadillac Fleetwood 10.4 8.0 472.0 205.0 2.93 5.250 17.98 0.0 0.0 \n",
"Lincoln Continental 10.4 8.0 460.0 215.0 3.00 5.424 17.82 0.0 0.0 \n",
"Chrysler Imperial 14.7 8.0 440.0 230.0 3.23 5.345 17.42 0.0 0.0 \n",
"Fiat 128 32.4 4.0 78.7 66.0 4.08 2.200 19.47 1.0 1.0 \n",
"Honda Civic 30.4 4.0 75.7 52.0 4.93 1.615 18.52 1.0 1.0 \n",
"Toyota Corolla 33.9 4.0 71.1 65.0 4.22 1.835 19.90 1.0 1.0 \n",
"Toyota Corona 21.5 4.0 120.1 97.0 3.70 2.465 20.01 1.0 0.0 \n",
"Dodge Challenger 15.5 8.0 318.0 150.0 2.76 3.520 16.87 0.0 0.0 \n",
"AMC Javelin 15.2 8.0 304.0 150.0 3.15 3.435 17.30 0.0 0.0 \n",
"Camaro Z28 13.3 8.0 350.0 245.0 3.73 3.840 15.41 0.0 0.0 \n",
"Pontiac Firebird 19.2 8.0 400.0 175.0 3.08 3.845 17.05 0.0 0.0 \n",
"Fiat X1-9 27.3 4.0 79.0 66.0 4.08 1.935 18.90 1.0 1.0 \n",
"Porsche 914-2 26.0 4.0 120.3 91.0 4.43 2.140 16.70 0.0 1.0 \n",
"Lotus Europa 30.4 4.0 95.1 113.0 3.77 1.513 16.90 1.0 1.0 \n",
"Ford Pantera L 15.8 8.0 351.0 264.0 4.22 3.170 14.50 0.0 1.0 \n",
"Ferrari Dino 19.7 6.0 145.0 175.0 3.62 2.770 15.50 0.0 1.0 \n",
"Maserati Bora 15.0 8.0 301.0 335.0 3.54 3.570 14.60 0.0 1.0 \n",
"Volvo 142E NaN NaN NaN NaN NaN NaN NaN NaN NaN \n",
"\n",
" gear carb price brand model \n",
"car \n",
"Mazda RX4 4.0 4.0 100000.0 Mazda RX4 \n",
"Mazda RX4 Wag 4.0 4.0 NaN Mazda RX4 Wag \n",
"Datsun 710 4.0 1.0 NaN Datsun 710 \n",
"Hornet 4 Drive 3.0 1.0 NaN Hornet 4 Drive \n",
"Hornet Sportabout 3.0 2.0 NaN Hornet Sportabout \n",
"Valiant 3.0 1.0 NaN Valiant \n",
"Duster 360 3.0 4.0 NaN Duster 360 \n",
"Merc 240D 4.0 2.0 NaN Merc 240D \n",
"Merc 230 4.0 2.0 NaN Merc 230 \n",
"Merc 280 4.0 4.0 NaN Merc 280 \n",
"Merc 280C 4.0 4.0 NaN Merc 280C \n",
"Merc 450SE 3.0 3.0 NaN Merc 450SE \n",
"Merc 450SL 3.0 3.0 NaN Merc 450SL \n",
"Merc 450SLC 3.0 3.0 NaN Merc 450SLC \n",
"Cadillac Fleetwood 3.0 4.0 NaN Cadillac Fleetwood \n",
"Lincoln Continental 3.0 4.0 NaN Lincoln Continental \n",
"Chrysler Imperial 3.0 4.0 NaN Chrysler Imperial \n",
"Fiat 128 4.0 1.0 NaN Fiat 128 \n",
"Honda Civic 4.0 2.0 NaN Honda Civic \n",
"Toyota Corolla 4.0 1.0 NaN Toyota Corolla \n",
"Toyota Corona 3.0 1.0 NaN Toyota Corona \n",
"Dodge Challenger 3.0 2.0 NaN Dodge Challenger \n",
"AMC Javelin 3.0 2.0 NaN AMC Javelin \n",
"Camaro Z28 3.0 4.0 NaN Camaro Z28 \n",
"Pontiac Firebird 3.0 2.0 NaN Pontiac Firebird \n",
"Fiat X1-9 4.0 1.0 NaN Fiat X1-9 \n",
"Porsche 914-2 5.0 2.0 NaN Porsche 914-2 \n",
"Lotus Europa 5.0 2.0 120000.0 Lotus Europa \n",
"Ford Pantera L 5.0 4.0 NaN Ford Pantera L \n",
"Ferrari Dino 5.0 6.0 NaN Ferrari Dino \n",
"Maserati Bora 5.0 8.0 NaN Maserati Bora \n",
"Volvo 142E NaN NaN NaN Volvo 142E "
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"fullname = pd.Series(df.index, index=df.index)\n",
"df.index.rename(\"car\", inplace=True)\n",
"df[\"brand\"] = fullname.map(get_brand)\n",
"df[\"model\"] = fullname.map(get_model)\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Вместо функции, методу `map` можно передать словарь."
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" mpg | \n",
" cyl | \n",
" disp | \n",
" hp | \n",
" drat | \n",
" wt | \n",
" qsec | \n",
" vs | \n",
" am | \n",
" gear | \n",
" carb | \n",
" price | \n",
" brand | \n",
" model | \n",
"
\n",
" \n",
" car | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" Mazda RX4 | \n",
" 21.0 | \n",
" six | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.620 | \n",
" 16.46 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 100000.0 | \n",
" Mazda | \n",
" RX4 | \n",
"
\n",
" \n",
" Mazda RX4 Wag | \n",
" 21.0 | \n",
" six | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.90 | \n",
" 2.875 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Mazda | \n",
" RX4 Wag | \n",
"
\n",
" \n",
" Datsun 710 | \n",
" 22.8 | \n",
" four | \n",
" 108.0 | \n",
" 93.0 | \n",
" 3.85 | \n",
" 2.320 | \n",
" 18.61 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Datsun | \n",
" 710 | \n",
"
\n",
" \n",
" Hornet 4 Drive | \n",
" 21.4 | \n",
" six | \n",
" 258.0 | \n",
" 110.0 | \n",
" 3.08 | \n",
" 3.215 | \n",
" 19.44 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Hornet | \n",
" 4 Drive | \n",
"
\n",
" \n",
" Hornet Sportabout | \n",
" 18.7 | \n",
" eight | \n",
" 360.0 | \n",
" 175.0 | \n",
" 3.15 | \n",
" 3.440 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Hornet | \n",
" Sportabout | \n",
"
\n",
" \n",
" Valiant | \n",
" 18.1 | \n",
" six | \n",
" 225.0 | \n",
" 105.0 | \n",
" 2.76 | \n",
" 3.460 | \n",
" 20.22 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Valiant | \n",
" | \n",
"
\n",
" \n",
" Duster 360 | \n",
" 14.3 | \n",
" eight | \n",
" 360.0 | \n",
" 245.0 | \n",
" 3.21 | \n",
" 3.570 | \n",
" 15.84 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Duster | \n",
" 360 | \n",
"
\n",
" \n",
" Merc 240D | \n",
" 24.4 | \n",
" four | \n",
" 146.7 | \n",
" 62.0 | \n",
" 3.69 | \n",
" 3.190 | \n",
" 20.00 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Merc | \n",
" 240D | \n",
"
\n",
" \n",
" Merc 230 | \n",
" 22.8 | \n",
" four | \n",
" 140.8 | \n",
" 95.0 | \n",
" 3.92 | \n",
" 3.150 | \n",
" 22.90 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Merc | \n",
" 230 | \n",
"
\n",
" \n",
" Merc 280 | \n",
" 19.2 | \n",
" six | \n",
" 167.6 | \n",
" 123.0 | \n",
" 3.92 | \n",
" 3.440 | \n",
" 18.30 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Merc | \n",
" 280 | \n",
"
\n",
" \n",
" Merc 280C | \n",
" 17.8 | \n",
" six | \n",
" 167.6 | \n",
" 123.0 | \n",
" 3.92 | \n",
" 3.440 | \n",
" 18.90 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Merc | \n",
" 280C | \n",
"
\n",
" \n",
" Merc 450SE | \n",
" 16.4 | \n",
" eight | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 4.070 | \n",
" 17.40 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" NaN | \n",
" Merc | \n",
" 450SE | \n",
"
\n",
" \n",
" Merc 450SL | \n",
" 17.3 | \n",
" eight | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 3.730 | \n",
" 17.60 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" NaN | \n",
" Merc | \n",
" 450SL | \n",
"
\n",
" \n",
" Merc 450SLC | \n",
" 15.2 | \n",
" eight | \n",
" 275.8 | \n",
" 180.0 | \n",
" 3.07 | \n",
" 3.780 | \n",
" 18.00 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 3.0 | \n",
" NaN | \n",
" Merc | \n",
" 450SLC | \n",
"
\n",
" \n",
" Cadillac Fleetwood | \n",
" 10.4 | \n",
" eight | \n",
" 472.0 | \n",
" 205.0 | \n",
" 2.93 | \n",
" 5.250 | \n",
" 17.98 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Cadillac | \n",
" Fleetwood | \n",
"
\n",
" \n",
" Lincoln Continental | \n",
" 10.4 | \n",
" eight | \n",
" 460.0 | \n",
" 215.0 | \n",
" 3.00 | \n",
" 5.424 | \n",
" 17.82 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Lincoln | \n",
" Continental | \n",
"
\n",
" \n",
" Chrysler Imperial | \n",
" 14.7 | \n",
" eight | \n",
" 440.0 | \n",
" 230.0 | \n",
" 3.23 | \n",
" 5.345 | \n",
" 17.42 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Chrysler | \n",
" Imperial | \n",
"
\n",
" \n",
" Fiat 128 | \n",
" 32.4 | \n",
" four | \n",
" 78.7 | \n",
" 66.0 | \n",
" 4.08 | \n",
" 2.200 | \n",
" 19.47 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Fiat | \n",
" 128 | \n",
"
\n",
" \n",
" Honda Civic | \n",
" 30.4 | \n",
" four | \n",
" 75.7 | \n",
" 52.0 | \n",
" 4.93 | \n",
" 1.615 | \n",
" 18.52 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Honda | \n",
" Civic | \n",
"
\n",
" \n",
" Toyota Corolla | \n",
" 33.9 | \n",
" four | \n",
" 71.1 | \n",
" 65.0 | \n",
" 4.22 | \n",
" 1.835 | \n",
" 19.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Toyota | \n",
" Corolla | \n",
"
\n",
" \n",
" Toyota Corona | \n",
" 21.5 | \n",
" four | \n",
" 120.1 | \n",
" 97.0 | \n",
" 3.70 | \n",
" 2.465 | \n",
" 20.01 | \n",
" 1.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Toyota | \n",
" Corona | \n",
"
\n",
" \n",
" Dodge Challenger | \n",
" 15.5 | \n",
" eight | \n",
" 318.0 | \n",
" 150.0 | \n",
" 2.76 | \n",
" 3.520 | \n",
" 16.87 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Dodge | \n",
" Challenger | \n",
"
\n",
" \n",
" AMC Javelin | \n",
" 15.2 | \n",
" eight | \n",
" 304.0 | \n",
" 150.0 | \n",
" 3.15 | \n",
" 3.435 | \n",
" 17.30 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" NaN | \n",
" AMC | \n",
" Javelin | \n",
"
\n",
" \n",
" Camaro Z28 | \n",
" 13.3 | \n",
" eight | \n",
" 350.0 | \n",
" 245.0 | \n",
" 3.73 | \n",
" 3.840 | \n",
" 15.41 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Camaro | \n",
" Z28 | \n",
"
\n",
" \n",
" Pontiac Firebird | \n",
" 19.2 | \n",
" eight | \n",
" 400.0 | \n",
" 175.0 | \n",
" 3.08 | \n",
" 3.845 | \n",
" 17.05 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 3.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Pontiac | \n",
" Firebird | \n",
"
\n",
" \n",
" Fiat X1-9 | \n",
" 27.3 | \n",
" four | \n",
" 79.0 | \n",
" 66.0 | \n",
" 4.08 | \n",
" 1.935 | \n",
" 18.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 1.0 | \n",
" NaN | \n",
" Fiat | \n",
" X1-9 | \n",
"
\n",
" \n",
" Porsche 914-2 | \n",
" 26.0 | \n",
" four | \n",
" 120.3 | \n",
" 91.0 | \n",
" 4.43 | \n",
" 2.140 | \n",
" 16.70 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
" NaN | \n",
" Porsche | \n",
" 914-2 | \n",
"
\n",
" \n",
" Lotus Europa | \n",
" 30.4 | \n",
" four | \n",
" 95.1 | \n",
" 113.0 | \n",
" 3.77 | \n",
" 1.513 | \n",
" 16.90 | \n",
" 1.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 2.0 | \n",
" 120000.0 | \n",
" Lotus | \n",
" Europa | \n",
"
\n",
" \n",
" Ford Pantera L | \n",
" 15.8 | \n",
" eight | \n",
" 351.0 | \n",
" 264.0 | \n",
" 4.22 | \n",
" 3.170 | \n",
" 14.50 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Ford | \n",
" Pantera L | \n",
"
\n",
" \n",
" Ferrari Dino | \n",
" 19.7 | \n",
" six | \n",
" 145.0 | \n",
" 175.0 | \n",
" 3.62 | \n",
" 2.770 | \n",
" 15.50 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 6.0 | \n",
" NaN | \n",
" Ferrari | \n",
" Dino | \n",
"
\n",
" \n",
" Maserati Bora | \n",
" 15.0 | \n",
" eight | \n",
" 301.0 | \n",
" 335.0 | \n",
" 3.54 | \n",
" 3.570 | \n",
" 14.60 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 5.0 | \n",
" 8.0 | \n",
" NaN | \n",
" Maserati | \n",
" Bora | \n",
"
\n",
" \n",
" Volvo 142E | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" Volvo | \n",
" 142E | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" mpg cyl disp hp drat wt qsec vs am \\\n",
"car \n",
"Mazda RX4 21.0 six 160.0 110.0 3.90 2.620 16.46 0.0 1.0 \n",
"Mazda RX4 Wag 21.0 six 160.0 110.0 3.90 2.875 17.02 0.0 1.0 \n",
"Datsun 710 22.8 four 108.0 93.0 3.85 2.320 18.61 1.0 1.0 \n",
"Hornet 4 Drive 21.4 six 258.0 110.0 3.08 3.215 19.44 1.0 0.0 \n",
"Hornet Sportabout 18.7 eight 360.0 175.0 3.15 3.440 17.02 0.0 0.0 \n",
"Valiant 18.1 six 225.0 105.0 2.76 3.460 20.22 1.0 0.0 \n",
"Duster 360 14.3 eight 360.0 245.0 3.21 3.570 15.84 0.0 0.0 \n",
"Merc 240D 24.4 four 146.7 62.0 3.69 3.190 20.00 1.0 0.0 \n",
"Merc 230 22.8 four 140.8 95.0 3.92 3.150 22.90 1.0 0.0 \n",
"Merc 280 19.2 six 167.6 123.0 3.92 3.440 18.30 1.0 0.0 \n",
"Merc 280C 17.8 six 167.6 123.0 3.92 3.440 18.90 1.0 0.0 \n",
"Merc 450SE 16.4 eight 275.8 180.0 3.07 4.070 17.40 0.0 0.0 \n",
"Merc 450SL 17.3 eight 275.8 180.0 3.07 3.730 17.60 0.0 0.0 \n",
"Merc 450SLC 15.2 eight 275.8 180.0 3.07 3.780 18.00 0.0 0.0 \n",
"Cadillac Fleetwood 10.4 eight 472.0 205.0 2.93 5.250 17.98 0.0 0.0 \n",
"Lincoln Continental 10.4 eight 460.0 215.0 3.00 5.424 17.82 0.0 0.0 \n",
"Chrysler Imperial 14.7 eight 440.0 230.0 3.23 5.345 17.42 0.0 0.0 \n",
"Fiat 128 32.4 four 78.7 66.0 4.08 2.200 19.47 1.0 1.0 \n",
"Honda Civic 30.4 four 75.7 52.0 4.93 1.615 18.52 1.0 1.0 \n",
"Toyota Corolla 33.9 four 71.1 65.0 4.22 1.835 19.90 1.0 1.0 \n",
"Toyota Corona 21.5 four 120.1 97.0 3.70 2.465 20.01 1.0 0.0 \n",
"Dodge Challenger 15.5 eight 318.0 150.0 2.76 3.520 16.87 0.0 0.0 \n",
"AMC Javelin 15.2 eight 304.0 150.0 3.15 3.435 17.30 0.0 0.0 \n",
"Camaro Z28 13.3 eight 350.0 245.0 3.73 3.840 15.41 0.0 0.0 \n",
"Pontiac Firebird 19.2 eight 400.0 175.0 3.08 3.845 17.05 0.0 0.0 \n",
"Fiat X1-9 27.3 four 79.0 66.0 4.08 1.935 18.90 1.0 1.0 \n",
"Porsche 914-2 26.0 four 120.3 91.0 4.43 2.140 16.70 0.0 1.0 \n",
"Lotus Europa 30.4 four 95.1 113.0 3.77 1.513 16.90 1.0 1.0 \n",
"Ford Pantera L 15.8 eight 351.0 264.0 4.22 3.170 14.50 0.0 1.0 \n",
"Ferrari Dino 19.7 six 145.0 175.0 3.62 2.770 15.50 0.0 1.0 \n",
"Maserati Bora 15.0 eight 301.0 335.0 3.54 3.570 14.60 0.0 1.0 \n",
"Volvo 142E NaN NaN NaN NaN NaN NaN NaN NaN NaN \n",
"\n",
" gear carb price brand model \n",
"car \n",
"Mazda RX4 4.0 4.0 100000.0 Mazda RX4 \n",
"Mazda RX4 Wag 4.0 4.0 NaN Mazda RX4 Wag \n",
"Datsun 710 4.0 1.0 NaN Datsun 710 \n",
"Hornet 4 Drive 3.0 1.0 NaN Hornet 4 Drive \n",
"Hornet Sportabout 3.0 2.0 NaN Hornet Sportabout \n",
"Valiant 3.0 1.0 NaN Valiant \n",
"Duster 360 3.0 4.0 NaN Duster 360 \n",
"Merc 240D 4.0 2.0 NaN Merc 240D \n",
"Merc 230 4.0 2.0 NaN Merc 230 \n",
"Merc 280 4.0 4.0 NaN Merc 280 \n",
"Merc 280C 4.0 4.0 NaN Merc 280C \n",
"Merc 450SE 3.0 3.0 NaN Merc 450SE \n",
"Merc 450SL 3.0 3.0 NaN Merc 450SL \n",
"Merc 450SLC 3.0 3.0 NaN Merc 450SLC \n",
"Cadillac Fleetwood 3.0 4.0 NaN Cadillac Fleetwood \n",
"Lincoln Continental 3.0 4.0 NaN Lincoln Continental \n",
"Chrysler Imperial 3.0 4.0 NaN Chrysler Imperial \n",
"Fiat 128 4.0 1.0 NaN Fiat 128 \n",
"Honda Civic 4.0 2.0 NaN Honda Civic \n",
"Toyota Corolla 4.0 1.0 NaN Toyota Corolla \n",
"Toyota Corona 3.0 1.0 NaN Toyota Corona \n",
"Dodge Challenger 3.0 2.0 NaN Dodge Challenger \n",
"AMC Javelin 3.0 2.0 NaN AMC Javelin \n",
"Camaro Z28 3.0 4.0 NaN Camaro Z28 \n",
"Pontiac Firebird 3.0 2.0 NaN Pontiac Firebird \n",
"Fiat X1-9 4.0 1.0 NaN Fiat X1-9 \n",
"Porsche 914-2 5.0 2.0 NaN Porsche 914-2 \n",
"Lotus Europa 5.0 2.0 120000.0 Lotus Europa \n",
"Ford Pantera L 5.0 4.0 NaN Ford Pantera L \n",
"Ferrari Dino 5.0 6.0 NaN Ferrari Dino \n",
"Maserati Bora 5.0 8.0 NaN Maserati Bora \n",
"Volvo 142E NaN NaN NaN Volvo 142E "
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"n2str = {\n",
" 4.0: \"four\",\n",
" 6.0: \"six\",\n",
" 8.0: \"eight\"\n",
" }\n",
"df[\"cyl\"] = df[\"cyl\"].map(n2str)\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```{warning}\n",
"Использовать числа с плавающей точкой в качестве ключей словаря считается плохой практикой из-за особенностей их представления в компьютере: даже минимальное отклонение от точного значения ключа в словаре может привести (и почти наверняка приведет) к другому значению хэш-функции и получить искомое значение по \"испорченному ключу не выйдет\".\n",
"```\n",
"\n",
"Метод [DataFrame.applymap](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.applymap.html#pandas.DataFrame.applymap) применяет функцию к кажому значению в таблице."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Группировка по значениям столбца\n",
"\n",
"Часто необходимо сгруппировать строки таблицы по какому-то принципу и с каждой из групп проделать какие-то операции. За группировку в `pandas` отвечает метод [DataFrame.groupby](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.groupby.html). В качестве аргумента её можно передать имя столбца, по значениям которого необходима группировка (можно также передать несколько столбцов в списке)."
]
},
{
"cell_type": "code",
"execution_count": 113,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" mpg | \n",
" cyl | \n",
" disp | \n",
" hp | \n",
" drat | \n",
" wt | \n",
" qsec | \n",
" vs | \n",
" am | \n",
" gear | \n",
" carb | \n",
" price | \n",
" brand | \n",
" model | \n",
"
\n",
" \n",
" car | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" Mazda RX4 | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.9 | \n",
" 2.620 | \n",
" 16.46 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" 100000.0 | \n",
" Mazda | \n",
" RX4 | \n",
"
\n",
" \n",
" Mazda RX4 Wag | \n",
" 21.0 | \n",
" 6.0 | \n",
" 160.0 | \n",
" 110.0 | \n",
" 3.9 | \n",
" 2.875 | \n",
" 17.02 | \n",
" 0.0 | \n",
" 1.0 | \n",
" 4.0 | \n",
" 4.0 | \n",
" NaN | \n",
" Mazda | \n",
" RX4 Wag | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" mpg cyl disp hp drat wt qsec vs am gear \\\n",
"car \n",
"Mazda RX4 21.0 6.0 160.0 110.0 3.9 2.620 16.46 0.0 1.0 4.0 \n",
"Mazda RX4 Wag 21.0 6.0 160.0 110.0 3.9 2.875 17.02 0.0 1.0 4.0 \n",
"\n",
" carb price brand model \n",
"car \n",
"Mazda RX4 4.0 100000.0 Mazda RX4 \n",
"Mazda RX4 Wag 4.0 NaN Mazda RX4 Wag "
]
},
"execution_count": 113,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"grouped = df.groupby(\"brand\")\n",
"grouped.get_group(\"Mazda\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Каждую группу можно рассматривать как мини-таблицу. Часто таблицу группируют по значениям одного столбца, чтобы потом посчитать какие-нибудь агрегирующие функцию к значениям других столбцов. Сделать это можно методом [DataFrameGroupBy.aggregate](https://pandas.pydata.org/docs/reference/api/pandas.core.groupby.DataFrameGroupBy.aggregate.html). "
]
},
{
"cell_type": "code",
"execution_count": 114,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" min | \n",
" mean | \n",
" max | \n",
" count | \n",
"
\n",
" \n",
" brand | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" AMC | \n",
" 15.2 | \n",
" 15.200000 | \n",
" 15.2 | \n",
" 1 | \n",
"
\n",
" \n",
" Cadillac | \n",
" 10.4 | \n",
" 10.400000 | \n",
" 10.4 | \n",
" 1 | \n",
"
\n",
" \n",
" Camaro | \n",
" 13.3 | \n",
" 13.300000 | \n",
" 13.3 | \n",
" 1 | \n",
"
\n",
" \n",
" Chrysler | \n",
" 14.7 | \n",
" 14.700000 | \n",
" 14.7 | \n",
" 1 | \n",
"
\n",
" \n",
" Datsun | \n",
" 22.8 | \n",
" 22.800000 | \n",
" 22.8 | \n",
" 1 | \n",
"
\n",
" \n",
" Dodge | \n",
" 15.5 | \n",
" 15.500000 | \n",
" 15.5 | \n",
" 1 | \n",
"
\n",
" \n",
" Duster | \n",
" 14.3 | \n",
" 14.300000 | \n",
" 14.3 | \n",
" 1 | \n",
"
\n",
" \n",
" Ferrari | \n",
" 19.7 | \n",
" 19.700000 | \n",
" 19.7 | \n",
" 1 | \n",
"
\n",
" \n",
" Fiat | \n",
" 27.3 | \n",
" 29.850000 | \n",
" 32.4 | \n",
" 2 | \n",
"
\n",
" \n",
" Ford | \n",
" 15.8 | \n",
" 15.800000 | \n",
" 15.8 | \n",
" 1 | \n",
"
\n",
" \n",
" Honda | \n",
" 30.4 | \n",
" 30.400000 | \n",
" 30.4 | \n",
" 1 | \n",
"
\n",
" \n",
" Hornet | \n",
" 18.7 | \n",
" 20.050000 | \n",
" 21.4 | \n",
" 2 | \n",
"
\n",
" \n",
" Lincoln | \n",
" 10.4 | \n",
" 10.400000 | \n",
" 10.4 | \n",
" 1 | \n",
"
\n",
" \n",
" Lotus | \n",
" 30.4 | \n",
" 30.400000 | \n",
" 30.4 | \n",
" 1 | \n",
"
\n",
" \n",
" Maserati | \n",
" 15.0 | \n",
" 15.000000 | \n",
" 15.0 | \n",
" 1 | \n",
"
\n",
" \n",
" Mazda | \n",
" 21.0 | \n",
" 21.000000 | \n",
" 21.0 | \n",
" 2 | \n",
"
\n",
" \n",
" Merc | \n",
" 15.2 | \n",
" 19.014286 | \n",
" 24.4 | \n",
" 7 | \n",
"
\n",
" \n",
" Pontiac | \n",
" 19.2 | \n",
" 19.200000 | \n",
" 19.2 | \n",
" 1 | \n",
"
\n",
" \n",
" Porsche | \n",
" 26.0 | \n",
" 26.000000 | \n",
" 26.0 | \n",
" 1 | \n",
"
\n",
" \n",
" Toyota | \n",
" 21.5 | \n",
" 27.700000 | \n",
" 33.9 | \n",
" 2 | \n",
"
\n",
" \n",
" Valiant | \n",
" 18.1 | \n",
" 18.100000 | \n",
" 18.1 | \n",
" 1 | \n",
"
\n",
" \n",
" Volvo | \n",
" NaN | \n",
" NaN | \n",
" NaN | \n",
" 0 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" min mean max count\n",
"brand \n",
"AMC 15.2 15.200000 15.2 1\n",
"Cadillac 10.4 10.400000 10.4 1\n",
"Camaro 13.3 13.300000 13.3 1\n",
"Chrysler 14.7 14.700000 14.7 1\n",
"Datsun 22.8 22.800000 22.8 1\n",
"Dodge 15.5 15.500000 15.5 1\n",
"Duster 14.3 14.300000 14.3 1\n",
"Ferrari 19.7 19.700000 19.7 1\n",
"Fiat 27.3 29.850000 32.4 2\n",
"Ford 15.8 15.800000 15.8 1\n",
"Honda 30.4 30.400000 30.4 1\n",
"Hornet 18.7 20.050000 21.4 2\n",
"Lincoln 10.4 10.400000 10.4 1\n",
"Lotus 30.4 30.400000 30.4 1\n",
"Maserati 15.0 15.000000 15.0 1\n",
"Mazda 21.0 21.000000 21.0 2\n",
"Merc 15.2 19.014286 24.4 7\n",
"Pontiac 19.2 19.200000 19.2 1\n",
"Porsche 26.0 26.000000 26.0 1\n",
"Toyota 21.5 27.700000 33.9 2\n",
"Valiant 18.1 18.100000 18.1 1\n",
"Volvo NaN NaN NaN 0"
]
},
"execution_count": 114,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"grouped[\"mpg\"].aggregate([\"min\", \"mean\", \"max\", \"count\"])"
]
}
],
"metadata": {
"interpreter": {
"hash": "cd49b4596bae9c980ff74fdf93e8fe80e447435ae307c062fad6c4f9ef2eb47f"
},
"kernelspec": {
"display_name": "Python 3.8.10 64-bit ('venv': venv)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}