Python Archives - darrengidado.com

In Data Science 19 Mins Read

Starbucks Data Science Blog

September 11, 2021 No Comments

Introduction Customers receive a drink from a Starbucks barista millions of times each week, yet each interaction is unique. It is just a moment in time, but nevertheless a connection. How does each customer behave, and what prompts them to make a purchase? To find out, we can use a simulated dataset that mimics customer behaviour on the Starbucks rewards mobile app. Once every few days, Starbucks sends out an offer to users of the mobile app. An offer can be merely an advertisement for a drink or an actual offer such as a discount or BOGO (buy one get one free). Some users might not receive any offers during certain weeks. Not all users receive the same offer, and that is the challenge to solve with this dataset. Our task is to combine transaction, demographic and offer data to determine which demographic groups respond best to which offer…

In Data Science 8 Mins Read

Airbnb Data Science Blog

May 20, 2021 No Comments

Introduction Airbnb is a popular way for homeowners to make money by renting out their properties or even spare rooms in their own home. More people are considering joining Airbnb and investing in new properties to turn into listings. However, how will they know what makes a property an attractive proposition for customers? How will they identify which variables can increase their listing price and profit? There is a complication: hosts remove their listings for various reasons, such as a lack of bookings or because the property is currently occupied. This means we must find a way to predict the missing data and recommend a reasonable price so that hosts can attract more guests. Before we can answer those questions, we need to find relevant variables to use. Since 2008, guests and hosts have used Airbnb to travel in a more unique, personalized…

In SQL 11 Mins Read

SQL with Big Query and Python

May 3, 2021 No Comments

BigQuery requires a dedicated client package, bigquery, that lets us connect to the database through an API using our credentials. We can inspect any dataset we select and run SQL queries against it to extract information. This is a popular way of working with BigQuery datasets in Python, and the query results can then be analysed further. This page shows how to get started with the Cloud Client Libraries for the BigQuery API. Read more about the client libraries for Cloud APIs, including the older Google APIs Client Libraries, in Client Libraries Explained. Step 1: Installing the BigQuery API Client Library First, we need to create a Google BigQuery account, then install the library via Anaconda Prompt: pip install --upgrade google-cloud-bigquery Next, we must create a service account. Once we have done that and downloaded our API key we…
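To make the connection step concrete, here is a minimal sketch of how a query might be run once the service-account key has been downloaded. It is not taken from the post itself: the helper names build_query and run_query, the key path, and the choice of the public table bigquery-public-data.samples.shakespeare are all assumptions made for illustration. The library import sits inside run_query so the query-building part can be tried without credentials.

```python
def build_query(table: str, limit: int = 10) -> str:
    """Return a standard-SQL query that previews the first rows of `table`.

    `table` is a fully qualified name such as
    "bigquery-public-data.samples.shakespeare" (an assumed example table).
    """
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return f"SELECT * FROM `{table}` LIMIT {limit}"


def run_query(sql: str, key_path: str) -> list:
    """Run `sql` against BigQuery and return the rows as a list of dicts.

    `key_path` is the JSON key file downloaded for the service account
    (hypothetical path, supplied by the caller).
    """
    # Imported here so build_query works even without the library installed.
    from google.cloud import bigquery

    # Authenticate with the service-account key file.
    client = bigquery.Client.from_service_account_json(key_path)

    # Submit the query and wait for the result rows.
    rows = client.query(sql).result()
    return [dict(row.items()) for row in rows]


# Example usage (requires credentials, so it is left commented out):
# sql = build_query("bigquery-public-data.samples.shakespeare", limit=5)
# for row in run_query(sql, "path/to/service-account-key.json"):
#     print(row)
```

From there the rows can be loaded into whatever structure the follow-up analysis needs.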

In Data Analysis 5 Mins Read

We Rate Dogs Twitter Analysis

April 9, 2021 No Comments

Introduction The dataset that we will be wrangling (and analyzing and visualizing) is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs. WeRateDogs is a Twitter account that rates people’s dogs with a humorous comment about the dog. These ratings almost always have a denominator of 10. The numerators, though? Almost always greater than 10. 11/10, 12/10, 13/10, etc. Why? Because “they’re good dogs, Brent.” WeRateDogs has over 4 million followers and has received international media coverage. There are many insights we can get from this dataset on WeRateDogs but first, we have to do a final check and save our new dataframes to their master variables. Dictionary: pupper: Puppy, a small doggo and usually younger. puppo: A transitional phase between pupper and doggo. Easily understood as the dog equivalent of a teenager. doggo: Dog, usually older. floofer: Very fluffy dog or a dog with excess fur. Comical amounts of fur on a dog will certainly earn…

In Data Analysis 44 Mins Read

The Truth About Airline Statistics

April 8, 2021 No Comments

Table of Contents: Introduction; Step 1: Importing Libraries; Step 2: Gathering Data; Step 3: Univariate Exploration; Step 4: Bivariate Exploration; Step 5: Multivariate Exploration; Step 6: Random Exploration; Step 7: Conclusion. Introduction What is the ASA? ASA stands for ‘American Statistical Association’, the main professional organization for statisticians in the United States. The organization was formed in November 1839 and is the second oldest continuously operating professional society in the United States. Every other year, at the Joint Statistical Meetings, the Graphics Section and the Computing Section jointly sponsor a special Poster Session called The Data Exposition, more commonly known as The Data Expo. All of the papers presented in this Poster Session are reports of analyses of a common dataset provided for the occasion. In addition, all papers presented in the session are encouraged to report the use of graphical methods employed during the development of their analysis and…

In Data Extraction 22 Mins Read

Scraping the Premier League Website

April 5, 2021 No Comments

Data scraping, also known as web scraping, is the process of extracting data from a website programmatically. The destination of the extracted data varies: in some cases it is channelled to another website, but it is commonly saved to a spreadsheet or a local file on your computer. It’s one of the most efficient ways to get data from the web aside from directly querying a REST API. We are going to be extracting Premier League data such as: All-time Top Scorers; 2020-21 League Table; 2020/21 Top Scorers. Table of Contents: Importing Libraries; Method 1: HTML Table Scraping; Method 2: Beautiful Soup Scraping; Method 3: Using an API and JSONs. Importing Libraries We are going to be using the official Premier League website to extract the data we need. The libraries we are using are as follows: Pandas: This will be used to generate our dataframes. NumPy: NumPy will be used to calculate our numeric…
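As a rough illustration of what HTML table scraping involves, the sketch below pulls the rows out of a small inline table. It deliberately swaps in Python's built-in html.parser for the Pandas and Beautiful Soup approaches the post goes on to use, so that it runs without extra installs or network access. The TableParser class, the parse_table helper and the sample table (with placeholder club names and points) are assumptions made for this example, not real Premier League data.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Only keep text that appears inside a table cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_table(html: str) -> list:
    """Return the table in `html` as a list of rows (header row first)."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    return parser.rows


# Placeholder markup standing in for a downloaded league-table page.
SAMPLE_HTML = """
<table>
  <tr><th>Position</th><th>Club</th><th>Points</th></tr>
  <tr><td>1</td><td>Club A</td><td>30</td></tr>
  <tr><td>2</td><td>Club B</td><td>27</td></tr>
</table>
"""

rows = parse_table(SAMPLE_HTML)
header, body = rows[0], rows[1:]
for row in body:
    print(dict(zip(header, row)))
```

On a real page the HTML string would come from an HTTP request, and the resulting rows could be handed to Pandas to build a dataframe.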

In Data Analysis Udacity 24 Mins Read

Project 4 – Communicate Data Findings

August 28, 2019 No Comments

The Program for International Student Assessment (PISA) is a system of international assessments that allows countries to compare outcomes of learning as students near the end of compulsory schooling. PISA core assessments measure the performance of 15-year-old students in mathematics, science, and reading literacy every 3 years.

In Data Analysis Udacity 57 Mins Read

Project 3 – Wrangling and Analyze Data

August 28, 2019 No Comments

This is a data wrangling project that mainly focuses on fixing data quality and tidiness issues using Python. The dataset that I am wrangling is the tweet archive of Twitter user @dog_rates, also known as WeRateDogs.

In Data Analysis Udacity 14 Mins Read

Project 2 – Experiment Results

August 28, 2019 No Comments

A/B tests are very commonly performed by data analysts and data scientists. For this project, we will be working to understand the results of an A/B test run by an e-commerce website.
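One common way to read the result of an A/B test on conversion rates is a two-proportion z-test. The sketch below shows that calculation using only the standard library. It is an assumption about how such a test could be analysed, not necessarily the method used in the project, and the function name and the visitor and conversion counts in the usage lines are made up for illustration.

```python
import math


def two_proportion_ztest(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Compare conversion rates of page A (control) and page B (treatment).

    Returns (z, p_value), where z > 0 means B converted at a higher rate
    and p_value is the two-sided p-value under the normal approximation.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b

    # Pooled rate under the null hypothesis that both pages convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))

    z = (p_b - p_a) / std_err
    # Two-sided p-value: 2 * (1 - Phi(|z|)) written with erfc.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: 1000 visitors per page, 100 vs 150 conversions.
z, p_value = two_proportion_ztest(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
```

A small p-value suggests the difference in conversion rates is unlikely to be due to chance alone, although the practical size of the effect still needs to be judged separately.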

In Data Analysis Udacity 10 Mins Read

Project 1 – Investigate a Dataset

August 28, 2019 No Comments

Gapminder datasets are used to investigate the relationship between GDP/GNI per capita, standard of living, carbon footprint, healthcare and the importance of exports in each country’s economy.

