Sampling Data with Pandas

Python

import pandas as pd

# df is assumed to be an existing pandas DataFrame.

# Sample 1000 random rows from df (random_state makes the draw reproducible)
sample = df.sample(n=1000, random_state=101)

# Sample a random 10% of the rows from df
sample = df.sample(frac=0.1, random_state=101)

# Sample a random 10% of the rows from df, with replacement, so a row
# can be included more than once
sample = df.sample(frac=0.1, replace=True, random_state=101)

# Sample 1000 random rows from df, with each row's chance of being
# included weighted by its value in the 'row_weight' column of df
sample = df.sample(n=1000, weights='row_weight', random_state=101)
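
The calls above can be tried end to end on a small made-up DataFrame. This is a minimal sketch: the column names `value` and `row_weight`, the 10,000-row size, and the randomly generated weights are all assumptions for illustration, not part of the original snippet.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 10,000 rows with a non-negative weight per row.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "value": np.arange(10_000),
    "row_weight": rng.random(10_000),
})

# Fixed number of rows, drawn without replacement.
fixed = df.sample(n=1000, random_state=101)

# Fraction of rows, drawn without replacement.
fraction = df.sample(frac=0.1, random_state=101)

# Fraction of rows, drawn with replacement (rows may repeat).
with_replacement = df.sample(frac=0.1, replace=True, random_state=101)

# Fixed number of rows, weighted by the 'row_weight' column.
weighted = df.sample(n=1000, weights="row_weight", random_state=101)

# Each sample has 1000 rows for this 10,000-row frame.
print(len(fixed), len(fraction), len(with_replacement), len(weighted))

# Without replacement the sampled index has no duplicates.
print(fixed.index.is_unique)
```

Passing the same `random_state` to an identical call returns the same rows, which is useful for reproducible experiments. Weights do not have to sum to 1; pandas normalizes them before sampling.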