A client wants to run an online ad campaign and has chosen YouTube as their main advertising channel.
Before starting, they want answers to a few initial questions:
- How can videos be categorized based on their comments and statistics?
- What factors affect how popular a YouTube video will be?

| Goal | Description |
| --- | --- |
| Data Ingestion | Ingest data, both as one-offs and incrementally |
| ETL Design | Extract, transform and load data efficiently |
| Data Lake | Design and build a new Data Lake architecture |
| Scalability | The data architecture should scale efficiently |
| AWS Cloud | AWS as the cloud provider |
| Reporting | Build a Business Intelligence tier, incl. dashboards |

- Gather each channel's video data (title, tags, duration, like count, comment count) using the YouTube API.
- Host the database on AWS RDS.
aws s3 cp . s3://youtube-raw-us-east-1-avi/youtube/raw_statistics_reference_data/ --recursive --exclude "*" --include "*.json"
aws s3 cp CAvideos.csv s3://de-on-youtube-raw-useast1-dev/youtube/raw_statistics/region=ca/
aws s3 cp DEvideos.csv s3://de-on-youtube-raw-useast1-dev/youtube/raw_statistics/region=de/
aws s3 cp FRvideos.csv s3://de-on-youtube-raw-useast1-dev/youtube/raw_statistics/region=fr/
aws s3 cp GBvideos.csv s3://de-on-youtube-raw-useast1-dev/youtube/raw_statistics/region=gb/
aws s3 cp INvideos.csv s3://de-on-youtube-raw-useast1-dev/youtube/raw_statistics/region=in/
aws s3 cp JPvideos.csv s3://de-on-youtube-raw-useast1-dev/youtube/raw_statistics/region=jp/
aws s3 cp KRvideos.csv s3://de-on-youtube-raw-useast1-dev/youtube/raw_statistics/region=kr/
aws s3 cp MXvideos.csv s3://de-on-youtube-raw-useast1-dev/youtube/raw_statistics/region=mx/
aws s3 cp RUvideos.csv s3://de-on-youtube-raw-useast1-dev/youtube/raw_statistics/region=ru/
aws s3 cp USvideos.csv s3://de-on-youtube-raw-useast1-dev/youtube/raw_statistics/region=us/
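
The per-region uploads all follow one pattern: the upper-cased region code names the CSV file, and the lower-cased code names the `region=` prefix. A minimal sketch of the same uploads as a loop is shown below. The `BUCKET`, `REGIONS`, and `DRY_RUN` variables and the `upload_regions` function are illustrative names, not part of the original commands. By default the sketch only prints the commands it would run.

```shell
# Sketch: upload each regional CSV to its region=<code>/ prefix.
# BUCKET, REGIONS, DRY_RUN and upload_regions are hypothetical helper names.
BUCKET="de-on-youtube-raw-useast1-dev"
REGIONS="ca de fr gb in jp kr mx ru us"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the commands only; set to 0 to upload

upload_regions() {
  for region in $REGIONS; do
    # File names use the upper-case region code, e.g. CAvideos.csv
    file="$(printf '%s' "$region" | tr '[:lower:]' '[:upper:]')videos.csv"
    if [ "$DRY_RUN" = "1" ]; then
      echo "aws s3 cp $file s3://$BUCKET/youtube/raw_statistics/region=$region/"
    else
      aws s3 cp "$file" "s3://$BUCKET/youtube/raw_statistics/region=$region/"
    fi
  done
}

upload_regions
```

Run it from the directory that contains the CSV files. Review the printed commands first, then rerun with `DRY_RUN=0` to perform the uploads.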