Pandas X SQL

SELECT

  • Select columns in SQL:
SELECT total_bill, tip, smoker, time
FROM tips;
Select columns in pandas:
tips[["total_bill", "tip", "smoker", "time"]]
Out[6]: 
     total_bill   tip smoker    time
0         16.99  1.01     No  Dinner
1         10.34  1.66     No  Dinner
2         21.01  3.50     No  Dinner
3         23.68  3.31     No  Dinner
4         24.59  3.61     No  Dinner

# Without a list of columns, the DataFrame shows every column, like SELECT * in SQL.
# (An empty list, tips[[]], would return a DataFrame with no columns.)
tips

First, import pandas and NumPy:

import pandas as pd
import numpy as np

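Most of the examples use the tips dataset from the pandas test suite. The data is read into a DataFrame called tips, and the SQL examples assume a database table with the same name and structure.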

url = (
    "https://raw.githubusercontent.com/pandas-dev/pandas/main/pandas/tests/io/data/csv/tips.csv"
)

tips = pd.read_csv(url)

tips
Out[5]: 
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4

Copies vs. in place operations

Most pandas operations return copies of the Series/DataFrame rather than modifying it in place. To make a change persist, either assign the result to a new variable:

sorted_df = df.sort_values("col1")

or overwrite the original one:

df = df.sort_values("col1")
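
The behavior described above can be checked with a minimal sketch. The DataFrame below is made up for illustration (only the column name col1 comes from the snippets above); it shows that sort_values leaves the original object untouched until the result is assigned back.

```python
import pandas as pd

# Hypothetical data, invented for this illustration.
df = pd.DataFrame({"col1": [3, 1, 2]})

# sort_values returns a new DataFrame; df keeps its original row order.
sorted_df = df.sort_values("col1")

original_order = df["col1"].tolist()
sorted_order = sorted_df["col1"].tolist()
print(original_order)
print(sorted_order)

# Rebinding the name to the result is what makes the change persist.
df = df.sort_values("col1")
reassigned_order = df["col1"].tolist()
print(reassigned_order)
```

Assigning to a new variable keeps both versions available, while overwriting the name is convenient when the unsorted data is no longer needed.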

SELECT

SELECT total_bill, tip, smoker, time
FROM tips;

With pandas, select columns by passing a list of column names to your DataFrame:

tips[["total_bill", "tip", "smoker", "time"]]
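
One way to confirm that the two forms are equivalent is to run both on the same rows. The sketch below is an assumption-laden illustration: it uses only the first three rows of tips shown earlier (under the hypothetical name tips_sample) and an in-memory SQLite database as a stand-in for whatever database actually holds the tips table.

```python
import sqlite3

import pandas as pd

# First three rows of the tips data shown earlier; the real dataset is larger.
tips_sample = pd.DataFrame(
    {
        "total_bill": [16.99, 10.34, 21.01],
        "tip": [1.01, 1.66, 3.50],
        "sex": ["Female", "Male", "Male"],
        "smoker": ["No", "No", "No"],
        "day": ["Sun", "Sun", "Sun"],
        "time": ["Dinner", "Dinner", "Dinner"],
        "size": [2, 3, 3],
    }
)

# pandas: pass a list of column names.
columns = ["total_bill", "tip", "smoker", "time"]
from_pandas = tips_sample[columns]

# SQL: load the same rows into an in-memory SQLite table named "tips"
# (assumption: SQLite stands in for the real database).
conn = sqlite3.connect(":memory:")
tips_sample.to_sql("tips", conn, index=False)
from_sql = pd.read_sql_query(
    "SELECT total_bill, tip, smoker, time FROM tips;", conn
)
conn.close()

# Compare column names and row values rather than dtypes.
same_columns = list(from_pandas.columns) == list(from_sql.columns)
same_rows = from_pandas.values.tolist() == from_sql.values.tolist()
print(same_columns, same_rows)
```

The comparison looks at column names and values only, since the dtypes a database driver reports can differ from those pandas infers when the DataFrame is built directly.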