One advantage of a decorator in Python is that it extends what a function does without modifying the original function.
Today let’s walk through an example step by step.
Let’s say I have defined several original functions as below:
The problem: After several months, I want every calculation function (add, minus, and multiply) to print a message before and after the calculation to make the process clearer. What should I do?
Method 1: Change the original functions: add the printed words to the original…
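A decorator avoids editing each function by hand. Here is a minimal sketch of that idea. The function names add, minus, and multiply come from the problem above; the decorator name log_calculation and the wording of the printed messages are assumptions for illustration, not the original code.

```python
import functools


def log_calculation(func):
    """Wrap func so a message is printed before and after it runs."""

    @functools.wraps(func)  # keep the wrapped function's name and docstring
    def wrapper(*args, **kwargs):
        print(f"Start {func.__name__} with arguments {args}")
        result = func(*args, **kwargs)
        print(f"Finished {func.__name__}, result = {result}")
        return result

    return wrapper


# The function bodies stay untouched; only the @ line is added.
@log_calculation
def add(a, b):
    return a + b


@log_calculation
def minus(a, b):
    return a - b


@log_calculation
def multiply(a, b):
    return a * b


total = add(2, 3)
```

Removing the @log_calculation line restores the original behavior, which is what makes this less invasive than rewriting each function.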
Today I would like to cover the following parts:
Let’s get started.
FunkSVD: a brief intro
SVD has drawbacks for real-world prediction; for example, it cannot make a prediction if the dataset contains even a single NaN. But since it is the starting point of collaborative filtering, I want to reproduce the procedure with SVD to see how it works, accepting some compromises on the dataset.
This story focuses on implementing SVD in code, has no offline testing (no train/test split of the dataset), and includes some basic linear algebra terminology.
Linear algebra basics
SVD stands for singular value decomposition. A detailed explanation can be found on Wikipedia. Here…
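To make the decomposition concrete, here is a minimal sketch using NumPy's np.linalg.svd. The small rating matrix is hypothetical and has no missing values; it is not the dataset used in the story.

```python
import numpy as np

# Hypothetical user-by-item rating matrix with no missing values
ratings = np.array([
    [5.0, 4.0, 1.0],
    [4.0, 5.0, 1.0],
    [1.0, 1.0, 5.0],
    [2.0, 1.0, 4.0],
])

# Decompose so that ratings = U @ diag(s) @ Vt
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)

# Multiplying the three factors back together recovers the original matrix
reconstructed = U @ np.diag(s) @ Vt

# Keep only the k largest singular values for a low-rank approximation
k = 2
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
```

The singular values in s come back sorted from largest to smallest, so slicing the first k entries keeps the strongest components.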
Today I would like to discuss two examples of content-based recommendation systems and some efficient array functions I learned from them. The two examples are:
1: Recommendation based on item content
2: Recommendation based on weighted content
I use a simple movie set as an example and focus on the main process, ignoring other processes and special cases. Let’s get started.
Use the code below to generate two datasets: movie_df and review_df
The two tables are as follows:
In my previous story, I used some NumPy functions for recommendation system data processing. Because the emphasis there was on the content-based recommendation system, I did not get to show how to use these functions efficiently. In this story, I would like to explain them in detail.
For your information, for functions 3, 4, and 5, you might want to check the dataframe I used. It is here:
This function finds the difference of two arrays and returns the unique values in a1 that are not in a2. It can be used to compare lists and…
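The excerpt does not name the function, but the description matches NumPy's np.setdiff1d, so the sketch below assumes that is the one meant. The input arrays are made up for illustration.

```python
import numpy as np

a1 = np.array([1, 2, 3, 2, 4, 1])
a2 = np.array([3, 4, 5, 6])

# Unique values of a1 that do not appear in a2, returned in sorted order
only_in_a1 = np.setdiff1d(a1, a2)
print(only_in_a1)  # [1 2]
```

Duplicates in a1 are collapsed, so the result behaves like a set difference rather than an element-by-element filter.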
Flask is a tool for creating an API server. It is a micro-framework, which means that its core functionality is kept simple, while numerous extensions let developers add other functionality (such as authentication and database support).
Heroku is a cloud platform where developers host applications, databases, and other services in several languages. Developers can use Heroku to deploy, manage, and scale applications. At the time of writing, Heroku had a free plan alongside paid specialized memberships, and most services such as a database offered a free tier.
This story will focus on application deployment and database interaction without…
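To show how small the Flask core is, here is a minimal sketch of an API server with a single endpoint. The route /ping and its JSON payload are assumptions for illustration, not part of the application from the story, and no database is involved.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route("/ping", methods=["GET"])
def ping():
    # Return a small JSON payload so a client can check the server is up
    return jsonify({"status": "ok"})


# Exercise the endpoint with Flask's built-in test client,
# so no running server is needed to try it out
client = app.test_client()
response = client.get("/ping")
print(response.status_code, response.get_json())
```

On a hosting platform the app object would be served by a WSGI server rather than called through the test client.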
Now that big data has become a reality, people focus on how to use the data to create commercial value. One relatively mature area is profiling potential customers or predicting customer behavior, in order to target the market or customer more precisely.
Bertelsmann Arvato Financial Solutions provided a real-world challenge on Udacity. Arvato provided four demographic datasets. They are:
I am in a career transition phase: I have worked in a coding-related industry but have no formal education in computer science. So I am wondering whether I can find some hints in the survey from Stack Overflow, which has the largest developer community.
After reviewing several years of the survey, I found that the 2017 survey includes the relevant questions, while the surveys of the following years do not. But sometimes the hints stay the same, so I chose the 2017 survey. …
To simplify the problem, I take two columns from my working file, which is a Stack Overflow yearly survey file, as below:
Apache Cassandra, a popular NoSQL database, is introduced in the Udacity Data Engineering Nanodegree. For the second project, Udacity has created a workspace and set up the connection between Cassandra and Jupyter Notebook. As a student, you don’t need to do anything for the connection if you work in the workspace.
However, you might want to run it on your local computer, set up the connections yourself, and become more confident about what you have learned. But searching for a solution can be frustrating. …
passionate about data analysis and data science