A while ago I entered the challenging world of Big Data. As an engineer, I was not very impressed with the field at first. As time went by, I realised more and more that the technological challenges in this area are too great for one person to master. Just take a look at the picture in this article: it covers only a small fraction of the technologies in the Big Data industry…
Consequently, I created a meetup detailing the challenges of Big Data, especially in the AWS cloud ecosystem. I use AWS infrastructure to answer the basic questions of anyone starting out in the Big Data world:
How to transform data (TXT, CSV, TSV, JSON) into Parquet or ORC? With which technologies? Using SQL or code?
Which technology should we use to model the data? EMR? Athena? Redshift? Spectrum? Glue? Spark? SparkSQL?
How to handle streaming?
How to manage costs?
Security and network tips?
Cloud best-practice tips?
The first lecture video (spoken language: English):