We will build a complete project that predicts customer spending using linear regression in Python.

Image by rupixen.com

In this exercise, we have some historical transaction data from 2010 and 2011. For each transaction, we have a customer identifier (CustomerID), the number of units purchased (Quantity), the date of purchase (InvoiceDate) and the unit cost (UnitPrice), as well as some other information about the purchased item.

We want to prepare this data so that we can regress 2011 customer spending on 2010 transaction data. To do so, we will create features from the 2010 data and calculate the target (the amount of money spent) for 2011.
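As a rough illustration of that preparation step, here is a minimal sketch that uses only the Python standard library. The sample rows, the feature names (`revenue_2010`, `n_purchases_2010`), the target name (`revenue_2011`) and the `build_dataset` helper are all assumptions made for this example; only the column names come from the description above.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample rows; only the column names come from the text.
transactions = [
    {"CustomerID": 1, "Quantity": 2, "InvoiceDate": "2010-12-01", "UnitPrice": 3.50},
    {"CustomerID": 1, "Quantity": 1, "InvoiceDate": "2010-12-15", "UnitPrice": 10.00},
    {"CustomerID": 1, "Quantity": 4, "InvoiceDate": "2011-03-02", "UnitPrice": 2.25},
    {"CustomerID": 2, "Quantity": 5, "InvoiceDate": "2010-12-09", "UnitPrice": 1.00},
    {"CustomerID": 2, "Quantity": 3, "InvoiceDate": "2011-06-20", "UnitPrice": 4.00},
]


def build_dataset(rows):
    """Aggregate 2010 rows into features and 2011 rows into the target."""
    features = defaultdict(lambda: {"revenue_2010": 0.0, "n_purchases_2010": 0})
    target = defaultdict(float)
    for row in rows:
        year = datetime.strptime(row["InvoiceDate"], "%Y-%m-%d").year
        amount = row["Quantity"] * row["UnitPrice"]
        customer = row["CustomerID"]
        if year == 2010:
            features[customer]["revenue_2010"] += amount
            features[customer]["n_purchases_2010"] += 1
        elif year == 2011:
            target[customer] += amount
    # Keep customers seen in 2010; no 2011 purchases means a target of 0.0.
    return {
        customer: {**feats, "revenue_2011": target.get(customer, 0.0)}
        for customer, feats in features.items()
    }


dataset = build_dataset(transactions)
```

Each entry of `dataset` then holds one customer's 2010 features next to the 2011 amount a regression would try to predict.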

When we create this model, it should generalize to future years for which we do…


Open innovation is a term used to promote a different and open mindset towards innovation that goes against the secrecy and traditional mentality of corporate R&D labs

Image by Ameen Fahmy

Open innovation is a term used to promote a different and open mindset towards innovation that goes against the secrecy and traditional mentality of corporate R&D labs.

The use of the term “open innovation” refers to the growing acceptance of external cooperation in an increasingly complex world and environment.

It has been promoted in particular by Henry Chesbrough, associate professor and faculty director of the Center for Open Innovation at the Haas School of Business at the University of California, Berkeley.

The term was originally defined as “a new paradigm that assumes that companies can and should use both…


The Python collections module has different specialized data types that function as containers and can be used to replace the general purpose Python containers

Image by Daniel von Appen

The Python collections module has different specialized data types that function as containers and can be used to replace the general purpose Python containers (dict, tuple, list and set). We will study the following parts of this module:

  • ChainMap
  • defaultdict
  • deque

There is a submodule of collections called abc or Abstract Base Classes. These will not be covered in this post. Let’s start with the ChainMap container!

ChainMap

A ChainMap is a class that provides the ability to link multiple mappings together so that they end up as a single unit. If you look at the documentation, you will notice that…
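To make the idea of linking mappings concrete, here is a small sketch. The dictionaries `defaults` and `user_settings` and their keys are made up for this example; `ChainMap` itself is part of the standard `collections` module.

```python
from collections import ChainMap

# Hypothetical mappings: user settings should take priority over defaults.
defaults = {"theme": "light", "language": "en"}
user_settings = {"theme": "dark"}

# Lookups search the mappings from left to right.
settings = ChainMap(user_settings, defaults)

theme = settings["theme"]        # found in user_settings
language = settings["language"]  # falls through to defaults

# Writes only affect the first mapping in the chain.
settings["font_size"] = 14
```

Here `defaults` is never modified, and `user_settings` receives the new `font_size` key.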


Data is sometimes called the “new oil,” a newly discovered source of wealth that is extracted from the depths of corporate and government archives.

Image by Leonardodavinci.net

Data is sometimes called the “new oil,” a newly discovered source of wealth that is extracted from the depths of corporate and government archives. Some accountants are so excited about the potential value of data that they count it in the same way as a physical asset.

While it is true that data can enhance an organization’s value, this resource has no intrinsic value. Like oil, data needs to be extracted and refined to the right quality. Data needs to be transported across information networks before it can be used to create new value. …


Conversion rate analysis and building a machine learning model with the results

Image by Author

Before we start, I would like to give you an overview of the database and its context. It is not a very complex database, but it stores some important data, such as the type of work the client does, marital status, and education. This is something that could apply to any other business, since the important thing is the last column of the database: the Y column.

What is the Y column and why is it so important?

The Y column can be any column in your database that contains the final result of a customer interaction. In this case, the column says no for those who did not buy, and says yes
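As a quick illustration of how a yes/no Y column turns into a conversion rate, here is a minimal sketch. The values in `y_column` are invented for the example and are not taken from the actual database.

```python
# Hypothetical Y column: one yes/no entry per customer interaction.
y_column = ["no", "yes", "no", "no", "yes", "no", "no", "no"]

# The conversion rate is the share of interactions that ended in "yes".
conversions = sum(1 for value in y_column if value == "yes")
conversion_rate = conversions / len(y_column)

print(f"Conversion rate: {conversion_rate:.1%}")
```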


Startups, SMBs, company founders, managers, and decision makers often claim that they are “data rich but information poor”.

Figure 1.0 Image by Author (Data Science Workflow)

Startups, SMBs, company founders, managers, and decision makers often claim that they are “data rich but information poor”. This statement is in many cases only partially correct because it hides a misconception about the data life cycle. The claim of being data-rich but information-poor suggests that previously untapped data sources are waiting to be exploited and used.

It is very unlikely that any organization will collect data without a particular purpose. In most cases, data is collected to manage operational processes. Collecting data without a particular purpose is a waste of resources. …


Not all startups have the financial capacity to experiment with new technologies

Image by Matt Lee

Today, large companies have large R&D budgets that allow them to experiment and stay at the cutting edge of new technologies: always adopting the newest ones, trying to adapt them to their own needs, and trying to find the hidden value in each of them.

Naturally, a new technology does not always fit the needs of a particular company. However, as part of their R&D process, companies have an “innovation lab” where failure is allowed, and where amazing things also happen.

In recent years, the technology that has gained millions of followers (companies and engineers) is Artificial…


We will go deeper into building a product recommendation system with which we can better target customers

Image by Amazon

Hi there! I would like to share this tutorial with you, which I wrote to explain how recommendation algorithms work in a way that is easy for beginners to understand.

I will go deeper into building a product recommendation system with which we can better target customers, using product recommendations that are tailored to individual customers. Studies have shown that customized product recommendations improve conversion rates and customer retention rates.

A product recommendation system is a system whose objective is to predict and compile a list of items that a customer is likely to buy. …
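To give a feel for what "predict and compile a list of items" can mean in practice, here is a minimal sketch based on item co-occurrence: items that are often bought together with what a customer already owns are ranked first. This is only one simple technique and not necessarily the approach the tutorial takes. The purchase data and the helpers `build_cooccurrence` and `recommend` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical purchase history: customer -> set of items bought.
purchases = {
    "alice": {"laptop", "mouse", "keyboard"},
    "bob": {"laptop", "mouse"},
    "carol": {"mouse", "monitor"},
    "dave": {"laptop"},
}


def build_cooccurrence(purchases):
    """Count how often each ordered pair of items appears in the same basket."""
    cooccurrence = defaultdict(Counter)
    for basket in purchases.values():
        for item_a, item_b in permutations(basket, 2):
            cooccurrence[item_a][item_b] += 1
    return cooccurrence


def recommend(customer, purchases, cooccurrence, top_n=3):
    """Rank items the customer does not own by co-occurrence with owned items."""
    owned = purchases[customer]
    scores = Counter()
    for item in owned:
        for other, count in cooccurrence[item].items():
            if other not in owned:
                scores[other] += count
    # Sort by score (descending), then by name so the order is deterministic.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


cooccurrence = build_cooccurrence(purchases)
dave_recommendations = recommend("dave", purchases, cooccurrence)
carol_recommendations = recommend("carol", purchases, cooccurrence)
```

In this toy data, "dave" only owns a laptop, so the items most often bought alongside laptops come out on top of his list.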


How to identify linear relationships?

Linear models assume that the independent variables, X, have a linear relationship with the dependent variable, Y. This relationship can be described by the following equation (Equation of a Straight Line)
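In its usual form, the equation of a straight line can be written as below. The symbols β0 (intercept) and β1 (slope) are assumed names chosen for this sketch:

```latex
Y = \beta_0 + \beta_1 X
```

With several independent variables, the same idea extends to one slope per variable, each added to the intercept.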


Analyze Financial Data using Pandas Datareader

Image by Micheile Henderson

Finance and economics are becoming more and more interesting for all kinds of people, regardless of their career or profession. This is because we are all affected by economic data, or at least we are increasingly interested in being up-to-date, and we have a lot of information at hand.

Every day, billions of bytes of financial data are sent over the Internet, whether it is the price of a share, an e-commerce transaction, or even information on a country’s GDP. All this data, when properly organized and managed, can be used to build some amazing and insightful software applications.

We…

Daniel Morales

Data Scientist. ML Engineer. Co-founder at DataSource.ai. LinkedIn: https://www.linkedin.com/in/danielmorales1/, Twitter: https://twitter.com/daniel_moralesp
