The Big Data ecosystem

For a beginner, the landscape can be utterly confusing. There is a vast arena of technologies and equally varied use cases. There is no single go-to solution; every use case has a custom solution, and this sprawling technology stack and lack of standardization make Big Data a difficult path for developers to tread. A multitude of technologies exist that can draw meaningful insight out of data of this magnitude.

Let's begin with the basics: the environment for building any data analytics application should provide for the following:

  • Storing data
  • Enriching or processing data
  • Data analysis and visualization
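To make these three responsibilities concrete, here is a minimal sketch in plain Python. It is only an illustration under stated assumptions: the sample events, the field names, and the functions `store`, `enrich`, and `analyze` are all hypothetical, and an in-memory list stands in for a real data store such as HDFS or a NoSQL database.

```python
import csv
import io
from collections import Counter

# Hypothetical raw input; a real system would read from a distributed store.
RAW_EVENTS = """user,action
alice,click
bob,view
alice,view
carol,click
"""


def store(raw_text):
    """Storage stage: parse raw text into records (stand-in for a data store)."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def enrich(records):
    """Processing stage: add a derived field to every record."""
    return [{**r, "is_click": r["action"] == "click"} for r in records]


def analyze(records):
    """Analysis stage: count click events per user."""
    return Counter(r["user"] for r in records if r["is_click"])


if __name__ == "__main__":
    clicks = analyze(enrich(store(RAW_EVENTS)))
    for user, count in sorted(clicks.items()):
        print(f"{user}: {count}")
```

Each function maps to one bullet above. In a real deployment, each stage would typically be handled by a dedicated tool rather than a single script.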

When we get to specialization, there are specific Big Data tools and technologies available: for instance, ETL tools such as Talend and Pentaho; batch processing with Pig, Hive, and MapReduce; real-time processing with Storm, Spark, and so on; and the list goes on. Here's the pictorial representation of the vast Big Data technology landscape, as per Forbes:

It clearly depicts the various segments and verticals within the Big Data technology canvas:

  • Platforms such as Hadoop and NoSQL
  • Analytics such as HDP, CDH, EMC, Greenplum, DataStax, and more
  • Infrastructure such as Teradata, VoltDB, MarkLogic, and more
  • Infrastructure as a Service (IaaS) such as AWS, Azure, and more
  • Structured databases such as Oracle, SQL Server, DB2, and more
  • Data as a Service (DaaS) such as INRIX, LexisNexis, Factual, and more

And, beyond that, we have a score of segments related to specific problem areas such as Business Intelligence (BI), analytics and visualization, advertisement and media, log data and vertical apps, and so on.
