pyspark.pandas.groupby.GroupBy.nunique
GroupBy.nunique(dropna: bool = True) → FrameLike
Return a DataFrame with the number of distinct observations per group for each column.

Parameters
    dropna : bool, default True
        If True, NaN values are not included in the counts.

Returns
    nunique : DataFrame or Series

Examples

>>> import pyspark.pandas as ps
>>> df = ps.DataFrame({'id': ['spam', 'egg', 'egg', 'spam',
...                           'ham', 'ham'],
...                    'value1': [1, 5, 5, 2, 5, 5],
...                    'value2': list('abbaxy')}, columns=['id', 'value1', 'value2'])
>>> df
     id  value1 value2
0  spam       1      a
1   egg       5      b
2   egg       5      b
3  spam       2      a
4   ham       5      x
5   ham       5      y

>>> df.groupby('id').nunique().sort_index()
      value1  value2
id
egg        1       1
ham        1       2
spam       2       1

>>> df.groupby('id')['value1'].nunique().sort_index()
id
egg     1
ham     1
spam    2
Name: value1, dtype: int64
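
The effect of `dropna` can be illustrated with a small plain-Python sketch of the per-group distinct count. This is not the pandas-on-Spark implementation: `group_nunique` and `is_missing` are hypothetical helpers written for illustration, and the sketch assumes that with `dropna=False` all missing values in a group together count as one extra distinct value, as in pandas.

```python
import math


def is_missing(value):
    """Hypothetical helper: treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def group_nunique(keys, values, dropna=True):
    """Count distinct values per group key (illustrative sketch only)."""
    distinct = {}     # group key -> set of non-missing values seen
    has_missing = {}  # group key -> True if any missing value was seen
    for key, value in zip(keys, values):
        distinct.setdefault(key, set())
        has_missing.setdefault(key, False)
        if is_missing(value):
            has_missing[key] = True
        else:
            distinct[key].add(value)
    # Assumption: with dropna=False, missing values add one to the count.
    return {
        key: len(distinct[key]) + (0 if dropna else int(has_missing[key]))
        for key in sorted(distinct)
    }


ids = ['spam', 'egg', 'egg', 'spam', 'ham', 'ham']
nan = float('nan')

print(group_nunique(ids, [1, 5, 5, 2, 5, 5]))
print(group_nunique(ids, [1.0, nan, 5.0, 2.0, nan, nan], dropna=True))
print(group_nunique(ids, [1.0, nan, 5.0, 2.0, nan, nan], dropna=False))
```

The first call uses the same `id` and `value1` data as the examples above. The last two calls differ only in whether a group's missing values contribute to its count.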