In the first part of this tutorial (Forecasting Business KPIs With Python Using Prophet-Part I), we built a simple predictive model using Facebook’s Prophet library to forecast two daily KPIs (Sales Number & Sales Value GBP) 426 days into the future. The final forecasts were assigned to the empty future dates in the original dataframe to obtain the following result (GitHub):
df.loc[(df['observation_date'] > cutoff_date)].head()
df.loc[(df['observation_date'] > cutoff_date)].tail()
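As a minimal sketch of the pattern above, the following shows how forecast values can be assigned to the rows after a cutoff date and then filtered back out. The column names and values are illustrative stand-ins; the actual Prophet model and dataset come from Part I of the tutorial:

```python
import pandas as pd

# Hypothetical frame mixing historical values and empty future dates,
# mirroring the structure described above (names are illustrative).
df = pd.DataFrame({
    "observation_date": pd.date_range("2021-01-01", periods=6, freq="D"),
    "sales_value_gbp": [100.0, 110.0, 95.0, None, None, None],
})

cutoff_date = pd.Timestamp("2021-01-03")

# Dummy numbers standing in for Prophet's forecasts are assigned
# only to the rows after the cutoff date.
future_mask = df["observation_date"] > cutoff_date
df.loc[future_mask, "sales_value_gbp"] = [101.0, 102.0, 103.0]

# The same boolean mask used above also retrieves the forecast rows.
print(df.loc[future_mask])
```

The same boolean-mask expression drives both the assignment and the `.head()`/`.tail()` inspection shown above.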
Picture this: you are a data analyst or business intelligence analyst with experience building KPIs, reporting on them and extracting insights from these metrics, but with little to no experience working on predictive models. At the same time, your company no longer wants to track performance only retrospectively: it needs a strategic, forward-looking forecast, and it turns out there are no data scientists around the corner with a similar background.
Your manager approaches you, claiming you will be perfect for the job as you have just the right background and skillset to create a simple model to…
Picture this: you are in the process of gathering data sources to build a new report and realize that some datasets are still updated manually by your stakeholders and stored in Google spreadsheets… Sounds familiar?
In this case you have two options: either you run a crash course to teach your less technical colleagues to work with SQL and data warehouses, or you automate the process yourself with Python.
In this tutorial you will learn how to pull datasets from a Google spreadsheet with Python by connecting to the Google Drive API and then store them into a database table using…
csvkit is a command line tool, built as a Python library, that is optimized to explore, transform and move comma-separated datasets across systems.
While csvkit is often presented as a quick alternative to other programming languages for data science tasks, it really unleashes its true potential when used by data engineers who are comfortable working with the command line on a daily basis.
“It’s not uncommon for data engineers to support other teams in the business, transforming manually populated CSV files into static tables located in a database”
For instance, it’s not uncommon for data engineers to support other teams…
Suppose you had to analyze the table below, showing the yearly salary of the employees of a small company, divided into five groups (from lowest to highest salary):
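Since the original table is not reproduced here, the five-band grouping described above can be sketched with pandas using made-up salaries; `pd.qcut` splits the employees into equally sized groups ordered from the lowest to the highest salary:

```python
import pandas as pd

# Made-up yearly salaries for a small company (illustrative only).
salaries = pd.DataFrame({
    "employee": list("ABCDEFGHIJ"),
    "salary_gbp": [25000, 28000, 31000, 35000, 40000,
                   46000, 52000, 60000, 75000, 90000],
})

# pd.qcut assigns each employee to one of five quantile-based
# groups, from the lowest (G1) to the highest (G5) salary band.
salaries["group"] = pd.qcut(salaries["salary_gbp"], q=5,
                            labels=["G1", "G2", "G3", "G4", "G5"])

print(salaries)
```

With ten employees and five quantile bins, each group ends up with exactly two members.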
While interviewing for a role in data science, data engineering or software engineering with top tech companies, it’s very likely that the technical round will include one or more live coding sessions to test your knowledge of SQL and a programming language of your choice.
While practicing algorithms in Python or Java is often demanding and might take a good chunk of your time (because of the almost infinite number of problems out there!), you should try not to underestimate the difficulty of SQL problems in FAANG interviews.
Tech companies have large amounts of data stored in their…
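To keep all examples in one language, here is a sketch of a classic interview-style SQL problem (finding the second-highest salary) run against an in-memory SQLite database from Python; the table and values are made up for illustration:

```python
import sqlite3

# Build a throwaway in-memory table to query against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?)",
    [("Ann", 70000), ("Bob", 85000), ("Cleo", 85000), ("Dan", 60000)],
)

# A common interview twist: ties at the top (Bob and Cleo) must not
# make the "second-highest" equal to the highest, so we filter on
# salaries strictly below the maximum.
second_highest = conn.execute(
    """
    SELECT MAX(salary) FROM employees
    WHERE salary < (SELECT MAX(salary) FROM employees)
    """
).fetchone()[0]

print(second_highest)  # 70000
```

The problem choice is mine, not the article's; it is only meant to show the flavor of SQL question these interviews tend to include.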
If you have just started using Python to analyze historical stock prices with the aim of visualizing trends and building investment strategies, or if you are a more experienced coder tired of using loops, you should stick around and learn how to improve your scripts with the…
For instance, let’s suppose you selected a number of stock tickers and your task was to download the historical adjusted close prices for each company and merge them in a unique clean dataset similar to the one below:
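A hedged sketch of the loop-free style hinted at above: computing daily returns for several tickers at once with a single vectorized pandas call. The prices below are made up; in the article the merged dataset would come from downloaded historical adjusted close prices:

```python
import pandas as pd

# Made-up adjusted close prices, one column per ticker, similar in
# shape to the merged clean dataset described above.
prices = pd.DataFrame(
    {"AAPL": [100.0, 102.0, 101.0],
     "MSFT": [200.0, 198.0, 202.0],
     "GOOG": [150.0, 153.0, 156.0]},
    index=pd.date_range("2021-01-04", periods=3, freq="B"),
)

# One vectorized call replaces a nested loop over dates and tickers.
daily_returns = prices.pct_change()

print(daily_returns.round(4))
```

`pct_change` computes the day-over-day percentage change for every column simultaneously, which is both shorter and faster than iterating row by row.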
Almost no one really liked them while studying math back in school, as they seemed pretty boring, but if you are preparing for a Python coding screen or a whiteboard interview and you wish to nail it, you really have to learn about set operators and methods right now. Sets are a weapon you need to have in your arsenal, as common uses include removing duplicates, computing math operations on sets (like union, intersection, difference and symmetric difference) and membership testing.
“If you are preparing for a Python coding screen or whiteboard interview and you wish to nail it, you…
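The uses listed above (deduplication, set algebra, membership testing) take only a few lines; the sample data here is invented for illustration:

```python
# Removing duplicates: building a set keeps distinct values only.
visitors_mon = ["ann", "bob", "ann", "cleo"]
unique_mon = set(visitors_mon)               # {'ann', 'bob', 'cleo'}

visitors_tue = {"bob", "dan"}

# Math operations on sets.
both_days = unique_mon & visitors_tue        # intersection
any_day = unique_mon | visitors_tue          # union
mon_only = unique_mon - visitors_tue         # difference
one_day_only = unique_mon ^ visitors_tue     # symmetric difference

# Membership testing is O(1) on average, unlike scanning a list.
print("ann" in unique_mon)  # True
```

Each operator also has a method form (`intersection`, `union`, `difference`, `symmetric_difference`), which additionally accepts any iterable, not just another set.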
If you are relatively new to Python and plan to start interviewing for top companies (including FAANG), listen to this: you need to start practicing algorithms right now.
Don’t be naive like I was when I first started solving them. Although I thought that cracking a couple of algorithms every now and then was fun, I never spent much time practicing, and even less time implementing faster or more efficient solutions. …
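As an example of the "faster or more efficient solution" mentioned above, here is the classic two-sum problem solved twice, once naively in O(n²) and once in O(n) with a hash map; the problem choice is mine, not the author's:

```python
def two_sum_naive(nums, target):
    """O(n^2): check every pair of indices."""
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return (i, j)
    return None

def two_sum_fast(nums, target):
    """O(n): remember each value's index and look up the complement."""
    seen = {}
    for i, value in enumerate(nums):
        if target - value in seen:
            return (seen[target - value], i)
        seen[value] = i
    return None

print(two_sum_naive([2, 7, 11, 15], 9))  # (0, 1)
print(two_sum_fast([2, 7, 11, 15], 9))   # (0, 1)
```

Both return the same answer, but the hash-map version makes a single pass over the input, which is exactly the kind of improvement interviewers expect you to reach for unprompted.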
Apache Airflow is a workflow management platform that allows users to programmatically schedule jobs that run on a standalone basis and to monitor them through its user interface. Airflow’s main components are a Webserver (Airflow’s UI, built as a Flask app), a Scheduler, an Executor and a Metadata Database.
As a BI Developer I worked closely enough with data engineers to learn pretty early on what Airflow was very good at, but I never really put the theory into practice until I started working on side projects.