Pandas vs. SQL

First, import pandas and NumPy:

import pandas as pd
import numpy as np

Most of the examples use the tips dataset from the pandas test suite.

url = (
    "https://raw.githubusercontent.com/pandas-dev/pandas/main/pandas/tests/io/data/csv/tips.csv"
)

tips = pd.read_csv(url)

tips.head()
Out[5]: 
     total_bill   tip     sex smoker   day    time  size
0         16.99  1.01  Female     No   Sun  Dinner     2
1         10.34  1.66    Male     No   Sun  Dinner     3
2         21.01  3.50    Male     No   Sun  Dinner     3
3         23.68  3.31    Male     No   Sun  Dinner     2
4         24.59  3.61  Female     No   Sun  Dinner     4

Copies vs. in-place operations

Most pandas operations return a copy of the Series/DataFrame rather than modifying it. To keep the result, either assign it to a new variable:

sorted_df = df.sort_values("col1")

or overwrite the original one:

df = df.sort_values("col1")
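
A minimal sketch of this behavior, assuming a hypothetical three-row DataFrame with a single column named col1 (the data is illustrative only): sorting produces a new object, and the original keeps its row order unless you reassign it.

```python
import pandas as pd

# Hypothetical frame for illustration; "col1" mirrors the column name used above.
df = pd.DataFrame({"col1": [3, 1, 2]})

# sort_values returns a new, sorted DataFrame and leaves df untouched.
sorted_df = df.sort_values("col1")

original_order = df["col1"].tolist()       # row order of df is unchanged
sorted_order = sorted_df["col1"].tolist()  # values in ascending order
print(original_order, sorted_order)
```

If the sorted result is the only version you need, reassigning to df as shown above avoids keeping two variables around.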

SELECT

In SQL, selection is done with a comma-separated list of the columns you want:

SELECT total_bill, tip, smoker, time
FROM tips;

With pandas, select columns by passing a list of column names to your DataFrame:

tips[["total_bill", "tip", "smoker", "time"]]
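
One way to check that the two forms select the same data is to run both against the same rows. The sketch below makes some assumptions: it rebuilds only the first five rows of tips from the output shown earlier (so it needs no network access), and it uses an in-memory SQLite database as a stand-in for whatever SQL engine you actually use.

```python
import sqlite3

import pandas as pd

# First five rows of the tips dataset, copied from the output shown earlier.
tips = pd.DataFrame(
    {
        "total_bill": [16.99, 10.34, 21.01, 23.68, 24.59],
        "tip": [1.01, 1.66, 3.50, 3.31, 3.61],
        "sex": ["Female", "Male", "Male", "Male", "Female"],
        "smoker": ["No"] * 5,
        "day": ["Sun"] * 5,
        "time": ["Dinner"] * 5,
        "size": [2, 3, 3, 2, 4],
    }
)
columns = ["total_bill", "tip", "smoker", "time"]

# Load the frame into an in-memory SQLite table (an assumption for this demo).
conn = sqlite3.connect(":memory:")
tips.to_sql("tips", conn, index=False)

# Run the SQL query and the pandas selection side by side.
from_sql = pd.read_sql_query("SELECT total_bill, tip, smoker, time FROM tips;", conn)
from_pandas = tips[columns]
conn.close()

# Compare column by column as plain Python lists.
same = all(from_sql[c].tolist() == from_pandas[c].tolist() for c in columns)
print(same)
```

Comparing plain lists rather than whole DataFrames keeps the check independent of how each path infers string dtypes.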