Efficient matrix and DataFrame functions in Python for recommendation system data processing

np.setdiff1d, np.where, unstack, and more

Photo by Michal Matlon on Unsplash
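  1. If assume_unique=True, np.setdiff1d assumes that both input arrays already contain no duplicated elements and skips deduplication, which can speed up the calculation. If it is False (the default), the inputs are deduplicated first, so the result contains only unique values.
  2. With assume_unique=False the result is also sorted in ascending order. The NumPy documentation for np.setdiff1d notes this: the result is sorted when assume_unique=False, and otherwise only if the input is sorted.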
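The two behaviours above can be seen in a minimal sketch. The arrays and their names (`catalog`, `seen`, `clicks`) are hypothetical item IDs made up for illustration, not data from the article.

```python
import numpy as np

# Hypothetical item IDs: the full catalogue (unique, unsorted) and items a user has seen.
catalog = np.array([5, 3, 9, 1, 7])
seen = np.array([7, 3])

# Default (assume_unique=False): inputs are deduplicated and the result is sorted.
unseen_sorted = np.setdiff1d(catalog, seen)
print(unseen_sorted)   # [1 5 9]

# assume_unique=True: no deduplication step, and the input order is preserved.
# Only safe when both inputs really contain no duplicates.
unseen_ordered = np.setdiff1d(catalog, seen, assume_unique=True)
print(unseen_ordered)  # [5 9 1]

# With duplicates in the input, keep the default so the result is unique.
clicks = np.array([5, 3, 5, 9, 1, 1])
unseen_from_dupes = np.setdiff1d(clicks, seen)
print(unseen_from_dupes)  # [1 5 9]
```

Passing assume_unique=True on inputs that do contain duplicates is outside what the parameter promises, so it is best reserved for arrays that are known to be unique.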
  1. np.unique(np.concatenate([list1, list2], axis=0)): combines the two lists and returns their unique values in sorted order.
  2. np.where(condition)[0][0]: returns the index of the first element of a 1-D array that satisfies the condition.
  3. The scalar (dot) product of two arrays: can be used as a similarity measure or to apply weights to parameters.
  4. unstack(): pivots an index level into columns, which makes it easy to inspect item-item or item-user interactions as a matrix.
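The four points above can be sketched together on toy data. All values and names here (`list1`, `list2`, `user_a`, `user_b`, `ratings`) are hypothetical, and `np.dot` is an assumption for the scalar product, since the text does not name the function it used.

```python
import numpy as np
import pandas as pd

# Hypothetical item IDs from two sources.
list1 = [3, 1, 2]
list2 = [2, 5, 3]

# 1. Merge both lists and keep each ID once (np.unique also sorts).
all_ids = np.unique(np.concatenate([list1, list2], axis=0))
print(all_ids)  # [1 2 3 5]

# 2. Index of the first element that satisfies a condition.
first_idx = np.where(all_ids > 2)[0][0]
print(first_idx)  # 2

# 3. Dot product of two rating vectors as a simple similarity score
#    (np.dot is assumed here).
user_a = np.array([5, 0, 3])
user_b = np.array([4, 1, 0])
similarity = np.dot(user_a, user_b)
print(similarity)  # 20

# 4. Long-format ratings -> user-item matrix via unstack().
ratings = pd.DataFrame({
    "user": ["u1", "u1", "u2"],
    "item": ["i1", "i2", "i1"],
    "rating": [5, 3, 4],
})
# fill_value=0 marks "no interaction" for missing (user, item) pairs.
user_item = ratings.set_index(["user", "item"])["rating"].unstack(fill_value=0)
print(user_item)

# Item-item interaction: dot product between the item columns.
item_item = user_item.T.dot(user_item)
print(item_item)
```

Note that `set_index(...).unstack()` requires each (user, item) pair to appear only once; if a user can rate the same item several times, aggregating first (for example with `groupby`) is one way around that.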

