• Charles Robert McCall
• Cristian Viorel Buda
4th semester, Computer Science (IT - International Track) (Master's programme)
Building an environment suitable for big data workloads requires combining multiple software packages. We define a software framework as a suite of software packages used together to form a reproducible environment on which to run big data tasks. Each choice of software is justified and its corresponding code is explained, and the resulting environment is demonstrated by running experimental big data tasks. The hardware infrastructure is provisioned on the Google Cloud Platform cloud computing provider. Terraform, an infrastructure manager, communicates with the Google Cloud Platform API to build the hardware infrastructure programmatically, while the Nix package manager downloads, sets up, and configures the software packages. The framework can be used to build similar environments, or the code presented in this paper can be adapted and expanded.
Language: English
Publication date: 14 Sep. 2019
Number of pages: 98
ID: 310919902