Monday, September 21, 2020

Apache NiFi is open source software that automates and manages the flow of data between different systems. It provides a web-based UI for...

Wednesday, September 2, 2020

The Resilient Distributed Dataset (RDD) is the core of Apache Spark. It is the fundamental data structure on top of which all the Spark comp...

Tuesday, September 1, 2020

Spark is a scheduling, monitoring, and distribution engine; it can also act as a resource manager for its jobs. When Spark runs a job by its...

Saturday, August 29, 2020

Loading a CSV file and capturing all the bad records is a very common requirement in ETL projects. Most of the relational database loaders ...