Chapter 1

Introduction to Data Analysis and Decision Making

Abstract

This book, which discusses the main multivariate modeling and operational research techniques, is the result of several years of study and research, and it emphasizes the importance of data science in academic and business environments. It grew out of many discussions and reflections on the importance of applied modeling in decision making. This chapter introduces the book, encourages researchers and managers to use it whenever possible, and proposes a brief reflection on the hierarchy between data, information, and knowledge. Our humble expectation is that their decision-making processes will keep improving with the continued use of this book.

Keywords

Data; Information; Knowledge; Decision making

Everything in us is mortal, except the gifts of the spirit and of intelligence.

Ovid

1.1 Introduction: Hierarchy Between Data, Information, and Knowledge

In academic and business environments, the improved use of research techniques and modern software packages has been producing papers that are more consistent and rigorous from a methodological and scientific standpoint. So has the growing understanding, among researchers and managers in the most varied fields of knowledge, of the importance of statistics and data modeling for defining objectives and substantiating research hypotheses based on underlying theories.

Nevertheless, as the well-known Austrian-born philosopher Ludwig Joseph Johann Wittgenstein, who later became a British citizen, used to say, methodological rigor alone, combined with authors who keep researching more of the same topic, can generate a deep lack of oxygen in the academic world. Besides the availability of data, adequate software packages, and an adequate underlying theory, it is essential that researchers also use their intuition and experience when defining their objectives and constructing their hypotheses, even when deciding to study the behavior of new and sometimes unimaginable variables in their models. This, believe it or not, may also generate interesting and innovative information for the decision-making process!

The basic principle of this book is to explain, at every turn, the hierarchy between data, information, and knowledge in the new scenario we live in. Whenever data are treated and analyzed, they are transformed into information. Knowledge, in turn, is generated at the moment such information is recognized and applied to the decision-making process. The reverse hierarchy can also be applied, since knowledge, whenever disseminated or explained, becomes information that, when broken up, can generate a dataset. Fig. 1.1 shows this logic.

Fig. 1.1 Hierarchy between data, information, and knowledge.
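
The hierarchy in Fig. 1.1 can be illustrated with a minimal Python sketch. The monthly sales figures, the target of 100 units, the decision rule, and the function names below are all hypothetical, introduced only for illustration; they do not come from the book's datasets or methods.

```python
from statistics import mean

# Hypothetical raw data: units sold per month (illustrative values only).
monthly_sales = [90, 110, 95, 120, 130, 125]


def to_information(data):
    """Treat and analyze raw data to obtain information (summary measures)."""
    half = len(data) // 2
    return {
        "mean": mean(data),
        "first_half_mean": mean(data[:half]),
        "second_half_mean": mean(data[half:]),
    }


def to_knowledge(info, target=100):
    """Apply information to a decision; the default target is an assumption."""
    growing = info["second_half_mean"] > info["first_half_mean"]
    above_target = info["mean"] >= target
    if growing and above_target:
        return "expand production"
    return "hold production"


information = to_information(monthly_sales)
decision = to_knowledge(information)
print(information)
print(decision)
```

In this sketch, the list is the data, the summary measures are the information, and the decision obtained by applying them to a rule stands in for knowledge. A real analysis would replace the toy rule with one of the techniques discussed in later chapters.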

1.2 Overview of the Book

The book is divided into 23 chapters, which are structured into eight major parts, as follows:

Part I: Foundations of Business Data Analysis

  •  Chapter 1: Introduction to Data Analysis and Decision Making.
  •  Chapter 2: Types of Variables and Measurement and Accuracy Scales.

Part II: Descriptive Statistics

  •  Chapter 3: Univariate Descriptive Statistics.
  •  Chapter 4: Bivariate Descriptive Statistics.

Part III: Probabilistic Statistics

  •  Chapter 5: Introduction to Probability.
  •  Chapter 6: Random Variables and Probability Distributions.

Part IV: Statistical Inference

  •  Chapter 7: Sampling.
  •  Chapter 8: Estimation.
  •  Chapter 9: Hypotheses Tests.
  •  Chapter 10: Nonparametric Tests.

Part V: Multivariate Exploratory Data Analysis

  •  Chapter 11: Cluster Analysis.
  •  Chapter 12: Principal Component Factor Analysis.

Part VI: Generalized Linear Models

  •  Chapter 13: Simple and Multiple Regression Models.
  •  Chapter 14: Binary and Multinomial Logistic Regression Models.
  •  Chapter 15: Regression Models for Count Data: Poisson and Negative Binomial.

Part VII: Optimization Models and Simulation

  •  Chapter 16: Introduction to Optimization Models: General Formulations and Business Modeling.
  •  Chapter 17: Solution of Linear Programming Problems.
  •  Chapter 18: Network Programming.
  •  Chapter 19: Integer Programming.
  •  Chapter 20: Simulation and Risk Analysis.

Part VIII: Other Topics

  •  Chapter 21: Design and Analysis of Experiments.
  •  Chapter 22: Statistical Process Control.
  •  Chapter 23: Data Mining and Multilevel Modeling.

Each chapter follows the same didactic logic of presentation, which we believe favors learning. First, the concepts regarding each topic are introduced, always followed by the algebraic solution, often in Excel, of practical exercises based on datasets developed primarily with an educational focus. Next, in some cases, the same exercises are solved in Stata Statistical Software® and IBM SPSS Statistics Software®.

We believe that this logic facilitates the study and understanding of the correct use of each technique and of the analysis of the results. Moreover, the practical application of the models in Excel, Stata, and SPSS also benefits researchers, since the results can be compared, at every turn, with those already estimated or calculated algebraically in the previous sections of each chapter. It also provides an opportunity to use these important software packages.

At the end of each chapter, additional exercises are proposed, whose answers, presented through the outputs generated, are available at the end of the book. The datasets used are available at www.elsevier.com.

1.3 Final Remarks

Researchers and managers will feel all the benefits and potential of the techniques discussed here as they practice the procedures repeatedly. Since there are several methods, we must be very careful when choosing a technique, and choosing the best alternatives for treating the data depends fundamentally on this practice and these exercises.

The adequate use of the techniques presented in this book by professors, students, and business managers may more solidly underpin the research’s initial perceptions, which can in turn support the decision-making process. Generating knowledge from a phenomenon depends on a well-structured research plan that defines the variables to be collected, the size of the sample, the development of the dataset, and the choice of the technique to be used, which is extremely important.

Thus, we believe that this book is meant for researchers who, for different reasons, are specifically interested in data science and decision making, as well as for those who want to deepen their knowledge by using the Excel, SPSS, and Stata software packages.

This book is recommended to undergraduate and graduate students in the fields of Business Administration, Engineering, Economics, Accounting, Actuarial Science, Statistics, Psychology, Medicine and Health, and to students in other fields related to Human, Exact and Biomedical Sciences. It is also meant for students taking extension, lato sensu postgraduate, and MBA courses, as well as for company employees, consultants, and other researchers whose main objective is to treat and analyze data in order to prepare data models, generate information, and improve knowledge through decision-making processes.

To all the researchers and managers who use this book, we hope that adequate and ever more interesting research questions may arise, that analyses may be developed, and that reliable, robust, and useful models for decision-making processes may be constructed. We also hope that the interpretation of outputs may become friendlier and that the use of Excel, SPSS, and Stata may yield important and valuable results for new research and projects.

We would like to thank everyone who contributed to making this book a reality. We would also like to sincerely thank the professionals at Montvero Consulting and Training Ltd., at the International Business Machines Corporation (Armonk, New York), at StataCorp LP (College Station, Texas), and at Elsevier Publishing House, especially Andre Gerhard Wolff, J. Scott Bentley, and Susan E. Ikeda. Last but not least, we would like to thank the professors, students, and employees of the Economics, Business Administration and Accounting College of the University of Sao Paulo (FEA/USP) and of the Federal University of the ABC (UFABC).

Now it is time for you to get started! We would like to emphasize that any contributions, criticisms, and suggestions will always be welcome, so that, later on, they may be incorporated into this book and make it better.

Luiz Paulo Fávero

Patrícia Belfiore
