Chapter 3: Index of Graphs

3.1 Basic Charts

3.2 Thematic Maps

3.3 Control Charts, Pareto, Variability, and Overlay Plots

3.4 Graphs of One Column

3.5 Graphs Comparing Two Columns

3.6 Graphs Displaying Multiple Columns

3.7 Word Cloud

3.8 Summary

This chapter is a quick reference to commonly used graphs in JMP. The format of this chapter differs from other chapters for good reason. In this chapter, you can peruse the graphs much like you would look through a cookbook – find what you want and follow the recipe. A picture of the graph, brief description, required data conditions, usage description, and the steps required to generate the graph immediately follow. This chapter is not intended to be a complete index of graphs available in JMP, but we have tried to choose those that we see used most often. This chapter is for users who know what graph they want and can select it by how it looks or what it’s called.

You will find that many graph windows have additional options that enable you to further enhance your graphical result. However, we have focused on the steps to generate the base case of each graph, which is illustrated in the figure that accompanies each graph.

Some of the graphs illustrated in this chapter are accessed from the Graph menu, while others are accessed from the Analyze menu.

Statistical output is provided with graphs generated from the Analyze menu. For instructions on sharing or printing graphs and on surfacing graphs into other applications such as Microsoft Word or PowerPoint, interactive HTML, JMP Public, or JMP Live, see Chapter 7.

There is more than one way to generate many of the graphs in this chapter. A selection of graphing methods will be presented including methods available in the more intuitive Graph Builder and Control Chart Builder platforms. You can decide which method works best for you. The preferred method is presented first.

You can customize the appearance of any graph (including colors, markers, axes, legends, and fonts) by interacting with simple palettes and controls or by simply right-clicking on the area or item that you want to change. See Chapter 2, sections 5 and 6, for more details about customizing the appearance of your graphs.

3.1 Basic Charts

The Graph Builder platform produces dozens of graphs for general purposes. The platform responds differently depending on the modeling types of the data and the roles of the columns that you select. In the context of those modeling types, it visually alerts you to charting options that come with your selections. This section is designed to help you understand some of the more commonly used charts that can be produced with this multi-purpose platform and how to produce them.

Because the type of plot generated depends on the columns assigned to the drop zones, a clear understanding of those zones is helpful.

Figure 3.1 Graph Builder Drop Zones

Graph Builder has 12 distinct drop zones (see Figure 3.1):

Y: A column placed in this drop zone will be graphed on the Y axis.

Group Y: A column placed in the drop zone will be used to create separate graphs along the vertical axis. There will be one graph for each level of a nominal or ordinal column. A continuous column is grouped when used in this zone.

X: A column placed in this drop zone will be graphed on the X axis.

Group X: A column placed in the drop zone will be used to create separate graphs along the horizontal axis. There will be one graph for each level of a nominal or ordinal column. A continuous column is grouped when used in this zone.

Wrap: Has the same effect as a Group X or Group Y column except the resulting graphs are presented in a grid or trellis rather than strictly by the horizontal or vertical direction.

Freq: Enables you to assign a frequency column. This is used in instances where the data is summarized. That is, each row in the data table represents more than one observation.

Map Shape: Used to create a map, such as states or counties. This is discussed further in the section on maps.

Overlay: Similar to the Group X or Group Y, a column in this role creates separate graphs for each level of a nominal or ordinal column or for groups of levels for a continuous column. However, the graphs are all placed on the same set of axes with colors used to identify the different values of the overlay column.

Color: The color role is used to color any points on the graph according to the level of a nominal or ordinal variable assigned. If a continuous variable is assigned, a gradient of colors is used.

Size: Used to vary the size of graph elements.

Interval: One or more columns that supply the numeric values used to draw error bars on the graph.

Page: Used to specify a nominal or ordinal column whose levels are used to create separate graphs with the same features.

The Graph Builder platform is accessed from the Graph menu. (See Figure 3.2.)

Figure 3.2 Graph Builder Menu

Graphs appear in the center area framed by the X and Y drop zones. When columns are selected, dragged, and dropped into zones, graphs instantly appear depending on the data and modeling types of the columns, and additional options become enabled. If column modeling types such as continuous, nominal, and ordinal are unfamiliar to you, see Section 2.3.

By default, a point chart is generated for any column type when dragged to the X or Y drop zones. Charting choices are revealed in the element icon palette at the top of the Graph Builder window. (See Figure 3.3.)

Figure 3.3 Element Icon Palette

As columns are dragged to drop zones, graphing options become enabled and their icons are highlighted in the palette. Graphing options respond to the data types and modeling types of the columns and the drop zones selected. Unavailable graphing options for the column combinations selected are automatically disabled and appear to be grayed out. You can use the Shift key to apply multiple elements at once. If you are unfamiliar with the graphs depicted in the element type icon palette, experiment by clicking on them when they are enabled. You can also click Undo if you don’t like the result.

Pie Chart

A pie chart is a circular chart divided into areas proportional to the percentages of the whole or total.

Figure 3.4 Pie Chart

Data Table Access

To access the data table used for the above pie chart example, follow Help > Sample Data Library > Quality Control > Failure.jmp.

Usage

Used when representing proportions, percentages, or fractions of any measured quantity. Some examples are market share, customer preferences, and percent of any type of category or group. As shown, pie charts can display percentages of the different failure types. (See Figure 3.4.)

Required

One ordinal or nominal column for labels and a continuous column to define the size of the pie chart sections.

Other column combinations are supported. See Help > JMP Help. Type Pie Chart in the search field.

Select Graph > Graph Builder. Drag a continuous column with count data to the Y drop zone and drag a nominal variable with the group identifier to the X drop zone. From the element palette, choose the pie element. To add percentages to the pie slices as depicted in Figure 3.4, choose Label by Percent of Total Values from the Control Panel on the left.
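The percentages that label the pie slices are simply each category's share of the total. As a minimal sketch outside JMP (the Python function name and the counts below are hypothetical and are not taken from Failure.jmp), the calculation looks like this:

```python
def percent_of_total(counts):
    """Return each category's share of the total as a percentage."""
    total = sum(counts.values())
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical category counts for illustration only.
example_counts = {"Type A": 14, "Type B": 4, "Type C": 2}
shares = percent_of_total(example_counts)
for label, pct in shares.items():
    print(f"{label}: {pct:.1f}%")
```

The shares always sum to 100, which is why a pie chart suits data that partition a whole.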

Dot Plot

While the Pie Chart might be considered the simplest graph to depict the values of categorical data, a Dot Plot is often used to depict the values of continuous data when the data set is fairly small. As with the pie chart, a dot plot is used to visually see the distribution of the data.

Figure 3.5 Dot Plot

In a dot plot, data values are displayed along the axis. Each dot represents a single data point at the point’s value along the axis. From the graph shown in Figure 3.5, you can see that the youngest people in this data table are two individuals who are 38 years old. The oldest are 57 years old. No more than 3 people are the same age as one another.

Data Table Access

To access the data table used for the above dot plot example, follow Help > Sample Data Library > Fitness.jmp.

Usage

To view the properties of a continuous distribution such as shape and range.

Required

A numeric column for either the X or Y role. In the graph in Figure 3.5, the column is placed in the X role, yielding a dot plot that goes from low to high as you move from left to right. If the column were placed in the Y role, the plot would go from low to high as you move from bottom to top.

To generate the graph shown, select Graph > Graph Builder. Drag Age to the X drop zone at the bottom of the graphing area. In the Control Panel, change the Jitter setting to Positive Grid. (See Figure 3.6.) In the plot shown, the Control Panel was hidden by clicking the Done button, and the graph was resized by dragging the bottom right corner of the plot.

Figure 3.6 Jitter Setting for Dot Plot

Bar Chart, Line Chart, and Point (or Scatter) Chart

These charts use bars, lines, and points to show lengths or positions proportional to quantities.

Data Table Access

To access the data table used for the above chart examples, follow Help > Sample Data > Business and Demographic > Financial.jmp.

Usage

Similar to pie charts, these charts represent individual values, proportions, percentages, or any other measured quantity. Some examples are market share proportions and customer preferences expressed as percentages of any type of category or group. As shown, these charts display profits for six company types in three different chart styles. (See Figures 3.7a, 3.7b, and 3.7c.)

Required

One continuous column for the Y drop zone and one nominal or ordinal column for the X drop zone.

Select Graph > Graph Builder. Then drag a continuous column to the Y drop zone and drag a nominal or ordinal column to the X drop zone. For a bar chart, choose the bar element. To add value totals to the bars as depicted, choose Label Value from the Bar Element Properties Panel on the left. Additional elements like Summary Statistic, Error Bars, and labeling are available from the same panel. To generate a line chart, choose the line element. To generate a scatter chart with a smoother, choose both elements (scatter chart and smoother). Many other charts are supported. Experiment!

Note

The bar chart and line chart summarize your data based on the groups defined by the nominal column placed in the X drop zone. The point chart does not summarize the data but shows each row in the data table as a separate point.
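To make the distinction in this note concrete, here is a minimal Python sketch outside JMP. It assumes the mean as the summary statistic (an assumption for illustration; other summary statistics can be chosen), and the rows are hypothetical rather than values from Financial.jmp.

```python
from collections import defaultdict


def group_means(rows):
    """Average the numeric value within each category label."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for label, value in rows:
        totals[label] += value
        counts[label] += 1
    return {label: totals[label] / counts[label] for label in totals}


# Hypothetical (category, value) rows for illustration only.
rows = [("Type A", 10.0), ("Type B", 25.0), ("Type A", 30.0), ("Type B", 15.0)]

bar_heights = group_means(rows)               # one summarized value per category
point_values = [value for _, value in rows]   # one plotted value per row

print(bar_heights)
print(point_values)
```

A bar or line chart would draw the two summarized values, while a point chart would draw all four rows.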

3.2 Thematic Maps

The Graph Builder platform can be used to create interactive maps with boundaries such as states, provinces, or county boundaries and street-level mapping. These mapping tools are included in JMP and stored as shape files, background maps, or as a link to a street map server. Other sources of shape files (for example, ESRI) or map shapes that you create yourself can be used in JMP. See the JMP.com website for add-in utilities to make your own map shapes.

To create thematic maps such as those in Figure 3.8a or 3.8b, your data needs to contain boundary names or abbreviations in a column that match those that appear in the shape file (for example, “California”, “CA”, or “Calif”). That column is dragged to the Map Shape drop zone. (See Figure 3.8d.) If you have latitude and longitude columns, you can plot points on a background map like those in Figure 3.8c. (See Section 2.7 for more information about these file types.)

Data Table Access

To access the data table that uses a map shape file, select Help > Sample Data Library > SAT.jmp.

To access the data table that uses latitude and longitude, select Help > Sample Data Library > San Francisco Crime.jmp.

Usage

A method to visualize data in a spatial system. Some examples are socio-economic indicators overlaid on political boundaries, crime incidents overlaid on street maps, or any other data for which boundaries can be defined in two dimensions.

Required for thematic maps (with a shape file)

One column recognized as a map shape to provide map boundaries and one continuous column to provide a color range.

Optional

A continuous, nominal, or ordinal column for the additional visualizations. See Appendix B for descriptions of these terms. As shown in Figures 3.8a and 3.8b, the maps display US State boundaries for 2004 SAT Scores for Math and percentage of students taking the SAT in that year.

Select Graph > Graph Builder. Then, drag a map shape column from the Variables list (for example, State) to the Map Shape drop zone (Figure 3.8d). The states can be whole names or two-letter abbreviations. Drag a continuous column (for example, 2004 Math) to the Color drop zone.

Required for plotting points on a background map (with latitude and longitude)

One column with latitude coordinates in the Y drop zone and one column with longitude coordinates in the X drop zone. Optional: A continuous, nominal, or ordinal column for the additional visualizations. As shown in Figure 3.8c, San Francisco crime incidents are displayed by type of crime overlaid upon a street level map.

Select Graph > Graph Builder. Then, drag the longitude column from the Variables list to the X drop zone. Then, drag the latitude column from the Variables list to the Y drop zone. Assuming you have an internet connection, right-click in the map and select Graph > Background Map > Street Map Service > OK. Finally, remove the smoother from the graph.

3.3 Control Charts, Pareto, Variability, and Overlay Plots

This section groups typical graphs used to measure product or service quality. The graphs are common in quality improvement scenarios.

Control charts are graphical and analytic tools for deciding whether a process is in a state of statistical control and for monitoring an in-control process. Control charts help determine whether variations in measurement of a product are caused by small, normal variations that cannot be controlled or by some larger, special cause that can be controlled. The type of chart to use is based on the nature of the data.

Control charts are broadly classified into control charts for variables and control charts for attributes. Both classes come in several varieties with names or letters attached to them. Control charts can be generated from Analyze > Quality and Process > Control Chart Builder or from Analyze > Quality and Process > Control Chart. (See Figure 3.9.) This chapter focuses on use of the Control Chart menu item, which enables you to specify in advance the type of control chart you want to create. In Chapter 4, we will examine the Control Chart Builder menu choice, which provides a more interactive way to explore your data.

Figure 3.9 Quality and Process Control Chart Builder and Control Chart

Note

In a few of the graph instructions that follow, we use the term “numeric,” which refers to the nature of the data. This term is used in place of modeling type (continuous, nominal, ordinal).

Run Chart

A run chart displays a column of data as a connected series of points. (See Figure 3.10.)

Figure 3.10 Control Chart Builder Run Chart

Data Table Access

To access the data table used for this example, follow Help > Sample Data Library > Quality Control > Pickles.jmp.

Usage

Displays data from a column. Frequently used as a first visualization of quality data and to assess ranges of variability. Examples include delay times, process yield, or any other continuous measurement, generally over time. As shown, the run chart displays Acid measurements over 24 Pickle Vats.

Required

One or more continuous columns for the Y role.

Optional

Nominal, ordinal, or continuous columns for the By role.

If data in a column is sorted by ascending values of time, then the X axis will be displayed in the time-sorted order.

Select Analyze > Quality and Process > Control Chart > Run Chart. Assign a continuous column to the Y role.

Individual & Moving Range Chart

An individual measurements chart is a process behavior chart in which each point represents a single measurement. This type of chart is appropriate when there is no logical subgrouping of the data values collected. That is, only one measurement is available for each sample.

The accompanying moving range chart displays moving ranges of two or more successive measurements. Moving ranges are computed using the number of consecutive measurements entered in the Range Span box. The default range span is 2.
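As a rough illustration of how moving ranges are formed, the following Python sketch (not JMP's implementation) computes them for a chosen span and derives individuals-chart limits with the common textbook constant 2.66, which applies only to a span of 2. The function names and measurements are hypothetical.

```python
def moving_ranges(values, span=2):
    """Range (max minus min) of each window of `span` consecutive values."""
    if span < 2:
        raise ValueError("span must be at least 2")
    return [
        max(values[i - span + 1:i + 1]) - min(values[i - span + 1:i + 1])
        for i in range(span - 1, len(values))
    ]


def individuals_limits(values):
    """Textbook limits for a span of 2: mean +/- 2.66 * average moving range."""
    mr = moving_ranges(values, span=2)
    mr_bar = sum(mr) / len(mr)
    center = sum(values) / len(values)
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar


# Hypothetical measurements for illustration only.
measurements = [10, 12, 9, 11, 10]
print(moving_ranges(measurements))           # span of 2
print(moving_ranges(measurements, span=3))   # span of 3
print(individuals_limits(measurements))
```

A larger span smooths the moving ranges but leaves fewer points to plot, since the first span minus one observations have no complete window.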

Figure 3.11 Control Chart Builder IR Chart with Phases

Data Table Access

To access the data table used for the above chart examples, follow Help > Sample Data Library > Quality Control > Diameter.jmp.

Usage

Used when only one measurement is available for each subgroup. An example would include measurements of the diameter of one part from each of various lots. Individual & moving range charts are efficient at detecting relatively large shifts in the process average. As shown, the charts display the contrasting control limits and process performance before and after a quality improvement as phases 1 and 2. (See Figure 3.11.)

Required

One or more numeric columns for the Process or Y role.

Optional

Continuous, nominal, or ordinal columns for the Sample Label, Phase, and By roles.

Select Analyze > Quality and Process > Control Chart > I/MR Control Chart. Assign a continuous column to the Y role. To add phases as shown, assign a nominal or ordinal column to the Phase role. (See Figure 3.11.)

XBar & R Chart

An XBar and R chart displays quality characteristics measured on a continuous scale. A typical analysis shows both the process mean and its variability, with the chart of subgroup means aligned above a corresponding range (R) or standard deviation (S) chart.
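For readers who want to see where the limits on such a chart come from, here is a minimal Python sketch of the classic textbook calculation using the Shewhart constants A2, D3, and D4. It is an illustration under stated assumptions (equal subgroup sizes of 2 to 5, hypothetical data, hypothetical function name) and is not a description of JMP's internal method.

```python
# Standard Shewhart constants for subgroup sizes 2 through 5.
A2 = {2: 1.880, 3: 1.023, 4: 0.729, 5: 0.577}
D3 = {2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0}
D4 = {2: 3.267, 3: 2.574, 4: 2.282, 5: 2.114}


def xbar_r_limits(subgroups):
    """Return (LCL, center, UCL) for the mean chart and the range chart."""
    sizes = {len(group) for group in subgroups}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes equal-size subgroups")
    n = sizes.pop()
    if n not in A2:
        raise ValueError("constants are tabulated here only for sizes 2 to 5")
    means = [sum(group) / n for group in subgroups]
    ranges = [max(group) - min(group) for group in subgroups]
    grand_mean = sum(means) / len(means)
    r_bar = sum(ranges) / len(ranges)
    return {
        "xbar": (grand_mean - A2[n] * r_bar, grand_mean, grand_mean + A2[n] * r_bar),
        "r": (D3[n] * r_bar, r_bar, D4[n] * r_bar),
    }


# Hypothetical subgroups of three measurements each.
subgroups = [[10, 12, 11], [9, 11, 10], [11, 13, 12]]
limits = xbar_r_limits(subgroups)
print(limits["xbar"])
print(limits["r"])
```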

Figure 3.12 Control Chart Builder XBar and R Chart

Data Table Access

To access the data table used for these chart examples, follow Help > Sample Data Library > Quality Control > Diameter.jmp.

Usage

Normally used for numeric data that is recorded in subgroups in some logical manner (for example, three production parts measured every hour). A special cause, such as a broken tool, will then appear as an abnormal pattern of points on the chart. As shown, the chart displays several sample mean diameters outside of the control limits. (See Figure 3.12.)

Required

One or more numeric columns for the Process (or Y) role.

Optional

Nominal, ordinal, or continuous columns for the Sample Label and By roles.

Select Analyze > Quality and Process > Control Chart > XBar Control Chart. Assign a continuous column to the Y role. Assign a column to the Subgroup (or X) role. (See Figure 3.12.)

P Chart

A P chart is an attribute chart that displays the proportion of nonconforming (defective) items in subgroup samples, which can vary in size. Because each subgroup for a P chart consists of N items, and an item is judged as either conforming or nonconforming, the maximum number of nonconforming items in a subgroup is N. The binomial distribution is used to calculate P chart control limits.
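The binomial-based limits can be sketched in a few lines of Python. This is the common three-sigma textbook form, shown under the assumption that limits are clipped to the range 0 to 1. The function name and data are hypothetical, and JMP's own computation may differ in detail.

```python
import math


def p_chart_limits(defectives, sample_sizes):
    """Return the overall proportion and per-subgroup (LCL, UCL) pairs."""
    p_bar = sum(defectives) / sum(sample_sizes)
    limits = []
    for n in sample_sizes:
        half_width = 3 * math.sqrt(p_bar * (1 - p_bar) / n)
        limits.append((max(0.0, p_bar - half_width), min(1.0, p_bar + half_width)))
    return p_bar, limits


# Hypothetical counts of nonconforming items and subgroup sizes.
defectives = [5, 8, 2]
sample_sizes = [100, 200, 100]
p_bar, limits = p_chart_limits(defectives, sample_sizes)
print(p_bar)
for n, (lcl, ucl) in zip(sample_sizes, limits):
    print(n, round(lcl, 4), round(ucl, 4))
```

Because the half-width shrinks as the subgroup size grows, larger subgroups get tighter limits, which is why the limits on a P chart can step up and down.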

Figure 3.13 Control Chart Builder P Chart of Defective

Data Table Access

To access the data table used for these chart examples, follow Help > Sample Data > Control Charts > Washers.jmp.

Usage

Used when each item in a sample is assessed as good or bad, that is, it passes or fails. The proportion plotted is then the number of nonconforming (bad) items divided by the total number of items in the sample, where the sample sizes can vary. As shown, the chart displays the proportion of defective washers across many subgroups of washers within and outside the control limits. (See Figure 3.13.)

Required

One or more numeric columns for the Process (or Y) role. A constant or variable sample size can be specified and must be numeric.

Optional

Continuous, nominal, or ordinal columns for the Subgroup, Phase, and By roles.

Select Analyze > Quality and Process > Control Chart > P Control Chart. Assign a continuous, numeric column to the Y role and, optionally, a numeric column to the n Trials role.

NP Chart

An NP chart is an attribute chart that displays the number of nonconforming (defective) items in fixed-sized subgroup samples. Because each subgroup for an NP chart consists of Ni items, and an item is judged as either conforming or nonconforming, the maximum number of nonconforming items in subgroup i is Ni. The binomial distribution is used to calculate the control limits of an NP Chart.
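A comparable Python sketch for the NP chart uses the same binomial reasoning but works with counts rather than proportions. It assumes a single fixed subgroup size and the usual three-sigma limits. The function name and data are hypothetical.

```python
import math


def np_chart_limits(defectives, n):
    """Return (LCL, center, UCL) for counts of nonconforming items per subgroup."""
    p_bar = sum(defectives) / (n * len(defectives))
    center = n * p_bar
    half_width = 3 * math.sqrt(n * p_bar * (1 - p_bar))
    return max(0.0, center - half_width), center, min(float(n), center + half_width)


# Hypothetical nonconforming counts from four subgroups of 50 items each.
np_lcl, np_center, np_ucl = np_chart_limits([4, 6, 5, 5], n=50)
print(np_lcl, np_center, np_ucl)
```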

Figure 3.14 Control Chart Builder NP Chart of Defective

Data Table Access

To access the data table used for the above chart examples, follow Help > Sample Data Library > Quality Control > Washers.jmp.

Usage

A fixed sample is taken from an established number of transactions or manufactured items each month. From this sample, the number of transactions or items that had one or more errors is counted. The control chart then tracks the number of items with errors per group or lot. As shown, the chart displays the number of defective washers across many lots of washers within and outside the control limits. (See Figure 3.14.)

Required

One or more numeric columns for the Y (or Process) role. A numeric Sample Size must be specified for each subgroup in the n Trials role.

Optional

Continuous, nominal, or ordinal columns for the Subgroup, Phase, and By roles.

Select Analyze > Quality and Process > Control Chart > NP Control Chart. Assign a continuous column to the Y role and a column representing lot size to the n Trials role. Optionally, assign a column representing subgroups to the Subgroup role.

C Chart

A C chart is an attribute chart that displays the number of nonconformities (defects) in a subgroup. The Poisson distribution is used to calculate the control limits in a C chart.
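Because a Poisson count has a variance equal to its mean, the usual three-sigma limits depend only on the average count. A minimal Python sketch follows, with hypothetical data and a hypothetical function name. It is not JMP's implementation.

```python
import math


def c_chart_limits(defect_counts):
    """Return (LCL, center, UCL) using the average count and its square root."""
    c_bar = sum(defect_counts) / len(defect_counts)
    half_width = 3 * math.sqrt(c_bar)
    return max(0.0, c_bar - half_width), c_bar, c_bar + half_width


# Hypothetical numbers of flaws found in four inspected units.
c_lcl, c_center, c_ucl = c_chart_limits([4, 9, 2, 5])
print(c_lcl, c_center, c_ucl)
```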

Figure 3.15 Control Chart Builder C Chart of Flaws

Data Table Access

To access the data table used for the above chart examples, follow Help > Sample Data Library > Quality Control > Fabric.jmp.

Usage

Used when one or more errors might propagate within the same sample, such as flaws on a DVD. As shown, the chart displays the number of flaws in each bolt of fabric within and outside the control limits. (See Figure 3.15.)

Required

One or more numeric columns for the Y (or Process) role.

Optional

Continuous, nominal, or ordinal columns for the Subgroup, Phase, and By roles. A constant or variable sample size can be specified in the n Trials role and must be numeric.

Select Analyze > Quality and Process > Control Chart > C Control Chart. Assign a continuous column representing defects to the Y role. Assign a column representing the subgroups to the Subgroup role.

U Chart

A U chart is an attribute chart that displays the number of nonconformities (defects) per unit in subgroup samples that can have a varying number of inspection units. The Poisson distribution is used to calculate the control limits in a U chart.
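The per-unit version of the Poisson limits can be sketched in Python as follows. The limits vary with the number of inspection units in each subgroup. The function name and data are hypothetical, and the three-sigma form shown is the common textbook one.

```python
import math


def u_chart_limits(defects, units):
    """Return the average defects per unit and per-subgroup (LCL, UCL) pairs."""
    u_bar = sum(defects) / sum(units)
    limits = []
    for n in units:
        half_width = 3 * math.sqrt(u_bar / n)
        limits.append((max(0.0, u_bar - half_width), u_bar + half_width))
    return u_bar, limits


# Hypothetical defect counts and numbers of inspection units per subgroup.
u_bar, u_limits = u_chart_limits(defects=[4, 6, 2], units=[2, 4, 2])
print(u_bar)
print(u_limits)
```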

Figure 3.16 Control Chart Builder U Chart of Defects

Data Table Access

To access the data table used for the above chart examples, follow Help > Sample Data Library > Quality Control > Braces.jmp.

Usage

To count the number of defects in subgroups of varying numbers. As shown, the chart counts the number of defects on braces in groups of braces of varying size on specific dates and indicates whether the count is within or outside of the control limits. (See Figure 3.16.)

Required

One or more numeric columns for the Y (or Process) role. The column assigned to the n Trials role must be numeric.

Optional

Continuous, nominal, or ordinal columns for the Subgroup, Phase, and By roles.

Select Analyze > Quality and Process > Control Chart > U Control Chart. Assign a continuous column representing defects to the Y role and a continuous column representing the number of units to the n Trials role.

Variability Chart

A variability chart illustrates how numeric values vary across categories or subgroups. Along with the data, you can view the mean, range, and standard deviation of the data in each category. The analysis options assume that the primary interest is in how the mean, range, and variance change across the categories. One use of this graph is in assessing the variability caused by the measurement system itself. That is, how much of the variability is in the product and how much is caused by the way we measure it (for example, the instruments or operators).
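The per-category statistics mentioned above can be sketched in Python with the standard library. The operator labels and values are hypothetical and are not taken from any sample data table, and each group needs at least two values for a sample standard deviation.

```python
import statistics


def group_summary(groups):
    """Mean, range, and sample standard deviation for each category."""
    summary = {}
    for label, values in groups.items():
        summary[label] = {
            "mean": statistics.mean(values),
            "range": max(values) - min(values),
            "std_dev": statistics.stdev(values),  # requires two or more values
        }
    return summary


# Hypothetical measurements grouped by operator.
groups = {"Operator 1": [1.0, 2.0, 3.0], "Operator 2": [2.0, 4.0, 6.0]}
summary = group_summary(groups)
for label, stats in summary.items():
    print(label, stats)
```

Comparing these numbers across categories is the tabular counterpart of reading the chart: differing means suggest a shift between groups, while differing ranges or standard deviations suggest unequal spread.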

Figure 3.17a Graph Builder Variability Chart of Diameter

Data Table Access

To access the data table used for these chart examples, follow Help > Sample Data Library > Quality Control > Diameter.jmp.

Usage

For viewing the ranges, standard deviation, and means of a measured column across groups and subgroups. As shown, part diameter variability is displayed across different operators. (See Figure 3.17a.)

Required

At least one numeric column for the Y, Response role and at least one nominal, ordinal, or continuous column for the X, Grouping role.

Using the Graph Builder method, select Graph > Graph Builder. Drag a continuous column to the Y drop zone. Drag one or more nominal or ordinal columns to the X drop zone for groups or subgroups. Use selections on the element palette and line and point controls to modify the graph.

To produce a variability chart for a variability study, select Analyze > Quality and Process > Variability / Attribute Gauge Chart. (See Figure 3.17b.)

Figure 3.17b Classic Variability Chart

Select one column for the Y, Response role and one or more columns for the X, Grouping role. Multiple nominal columns for the X, Grouping role produce horizontally nested results by the subgroup, overlaid. Click the Help button for many additional options.

Overlay Plots

Like the variability chart presented earlier, the overlay plot enables you to visualize values over any specified times or groups. A key difference, however, is that this plot enables you to specify multiple Y columns and group values in a meaningful way. An example of an overlay plot with more than one Y value and more than one X value is shown for reference here. (See Figure 3.18a.)

Figure 3.18a Overlay Plot Example

Data Table Access

To access the data table used for Figure 3.18a, follow Help > Sample Data Library > Lipid Data.jmp.

The overlay plot pictured is a scatter plot of weight by age, with separate plots for each Gender and Alcohol Use combination. To generate an overlay plot like the one pictured:

1. Select Graph > Graph Builder.

2. Drag a continuous column of interest, in this case Weight, to the Y drop zone on the left side of the graph area.

3. Drag a second continuous column of interest, in this case Age, to the X drop zone at the bottom of the graph area. You now have a scatter plot of Weight by Age as shown in Figure 3.18b.

Figure 3.18b Scatter Plot of Weight by Age

4. To get separate, side-by-side scatter plots for each level of a categorical variable, in this case Gender, drag that column to the Group X drop zone at the top of the graphing area.

5. Finally, to get separate plots vertically for each level of a different categorical variable, in this case Alcohol Use, drag that column to the Group Y drop zone at the right side of the graphing area.

Alternate Overlay Plots

Another type of overlay plot produces overlays of columns or groups on a single bivariate plot.

Figure 3.19a Alternate Overlay Plot

Usage

You want to overlay multiple groups of data into one graph when there is a single X and a single Y axis. This graph shows a scatter plot of weight by age with contrasting colors for each gender (see Figure 3.19a).

Required

Two numeric columns, one for the Y role and one for the X role. A nominal or ordinal column for the Overlay role.

1. Select Graph > Graph Builder.

2. Drag a continuous column of interest, in this case Weight, to the Y drop zone on the left side of the graph area.

3. Drag a second continuous column of interest, in this case Age, to the X drop zone at the bottom of the graph area. As before, you now have a scatter plot of Weight by Age.

4. Drag a nominal or ordinal column, in this case Gender, to the Overlay drop zone at the top right corner.

A final type of overlay plot enables you to use multiple continuous columns in the Y role with a single X column.

Figure 3.19b Overlay Plot with Multiple Y Columns

Data Table Access

To access the data table used for this graph, follow Help > Sample Data Library > XYZ Stock Averages (Plots).jmp.

Usage

You want to overlay multiple groups of data into one graph when there is a single X axis, but multiple Y columns. The graph shows the Dow Jones High, Low, and Close index for a period of time with contrasting colors for High, Low, and Close (see Figure 3.19b).

Required

At least two numeric columns for the Y role and one numeric column for the X role. Optional: Continuous, nominal, or ordinal columns for the Grouping role.

1. Select Graph > Graph Builder.

2. Select at least two continuous columns simultaneously (in this case DJI High, DJI Low, and DJI Close) and drag them to the Y drop zone on the left side of the graph area.

3. Drag another continuous column of interest, in this case Date, to the X drop zone at the bottom of the graph area.

Note

When using multiple continuous Y columns in this manner, they should be columns that have magnitudes close to one another. Columns with magnitudes that differ widely will create a graph that is difficult to read.

Note

When the overlay chart appears, additional customizations are available on the element icon palette and with a right mouse click inside the graph frame.

Pareto Plot

Often included as a quality metric for processes and products, the Pareto plot displays the relative frequency of problems in a quality-related process or operation. A Pareto plot is a bar chart that displays the classification of problem occurrences, arranged in decreasing order. The column whose values identify the cause of a problem is displayed as X in the plot. An optional column that holds the frequency of each cause is assigned as Freq. An optional column that holds a weighting value is assigned as Weight.
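The ordering and the cumulative percent curve that usually accompanies a Pareto plot can be sketched in Python as follows. The cause labels and counts are hypothetical and are not values from Failure.jmp.

```python
def pareto_table(counts):
    """Sort causes by decreasing count and attach the cumulative percent."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    table = []
    running = 0
    for cause, count in ordered:
        running += count
        table.append((cause, count, 100.0 * running / total))
    return table


# Hypothetical defect counts by cause.
pareto = pareto_table({"Cause A": 3, "Cause B": 12, "Cause C": 5})
for cause, count, cum_pct in pareto:
    print(f"{cause}: {count} ({cum_pct:.0f}% cumulative)")
```

Reading down the table shows how few causes account for most of the problems, which is the point of targeting the leftmost bars first.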

Figure 3.20 Pareto Plot of Failure

Data Table Access

To access the data table used for the above chart example, follow Help > Sample Data Library > Quality Control > Failure.jmp.

Usage

For counts of defects by occurrence of defect causes. The plot can be used to target improvement efforts toward those failures that are most serious or common. As shown, the chart displays defect counts and cumulative percents of seven types of semiconductor defects. (See Figure 3.20.)

Required

At least one continuous or nominal column for the Cause role. Additional options for Grouping, Frequency and Weight roles.

To generate the plot, select Analyze > Quality and Process > Pareto Plot.

3.4 Graphs of One Column

Unlike the charts introduced so far, graphs in this section are accompanied by statistical results. These graphs depict the distribution of values for one column of data (also known as univariate) and provide appropriate tools to assess their properties.

These graphs help you understand the nature of a column, such as how widely the values vary or whether there are any curious qualities to the data.

Most of these graphs are found within the Distribution platform from the Analyze menu. We also briefly cover time series graphs in this section.

Many JMP graphs can be saved as interactive HTML and retain their interactivity when opened in a web browser.

Note

You can choose more than one column with these graphs, but each column will be graphed and independently analyzed side-by-side. When you are looking at more than one column, the graphs are linked, which enables you to click on any part of the graph to see and explore those values represented in the graphs of other selected columns. See Chapter 5 for more information about how graphs and data are linked.

Histograms and Bar Charts

The Distribution platform examines properties of a continuous, nominal, or ordinal column individually, or in a univariate fashion.

Data Table Access

To access the data table used for the above distribution chart examples, follow Help > Sample Data Library > Financial.jmp.

Continuous Usage

To view the properties of a continuous distribution such as shape, range, and data density. As shown in Figure 3.21a, the chart displays the profits (or losses) of a selection of technology companies from the late 1990s.

Continuous Distribution Requires

One or more continuous columns for the Y, Columns role.

Nominal, Ordinal Usage

The default plot is similar to a bar chart and enables you to view the properties of a frequency distribution such as the relative counts or percentages of fixed groups. As shown in Figure 3.21b, the chart displays the frequency of company types as bars for a selection of technology companies from the late 1990s. The nominal and ordinal distribution plots are related to the mosaic plot in the next section.

Frequency Distribution Requires

One or more nominal or ordinal columns for the Y, Columns role.

Select Analyze > Distribution. Select a column and place it in the Y, Columns role, and click OK.

Optional

To generate a distribution plot like the ones pictured using Graph Builder, select Graph > Graph Builder. Drag a continuous or nominal column to the Y drop zone. Select the histogram element from the elements palette.
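
The counts behind these two kinds of distribution plots can be sketched outside JMP. The Python example below is illustrative only: the function names, the fixed bin width, and the data values are assumptions, and JMP chooses its own bin boundaries.

```python
import math
from collections import Counter

def histogram_counts(values, bin_width):
    """Count values per bin; a bin labeled b covers [b, b + bin_width)."""
    counts = Counter(math.floor(v / bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))

def level_counts(levels):
    """Count how often each level of a nominal or ordinal column occurs."""
    return dict(Counter(levels))

# Hypothetical values, not taken from Financial.jmp.
profits = [-120.0, 35.5, 80.0, 410.0, 95.0, 150.0, -20.0]
types = ["Computer", "Pharmaceutical", "Computer"]

print(histogram_counts(profits, 100))  # bar heights for a continuous column
print(level_counts(types))             # bar heights for a nominal column
```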

Outlier Box Plot

An outlier box plot, sometimes called a Tukey box plot, is a chart for detecting extreme values and viewing the properties of a distribution. See Appendix B for a description of this term.

Figure 3.22 Tukey Box Plot of Profit

Data Table Access

To access the data table used for the above chart example, follow Help > Sample Data Library > Financial.jmp.

Usage

To view the properties of a continuous distribution such as quartiles, moments, and outliers. See Appendix B for a description of these terms.

As shown, the plot displays a few very profitable companies as points that are well beyond the main body of companies, the middle half of which are contained in the box. (See Figure 3.22.)

Required

One or more continuous columns for the Y, Columns role.

Select Analyze > Distribution. Select a continuous column and place it in the Y, Columns role, and click OK. By default, the outlier box plot is displayed with the histogram. If it is not, click the red triangle and select Outlier Box Plot.

Optional

To generate an outlier box plot like the one pictured using Graph Builder, select Graph > Graph Builder. Drag a continuous column to the Y drop zone. Select the Box Plot element from the Elements palette.
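
To see which points a box plot of this kind would mark as outliers, the conventional Tukey rule (points more than 1.5 times the interquartile range beyond the quartiles) can be sketched in Python. This is an illustration under stated assumptions: the quartile method used here may differ from the one JMP applies, and the profit values are hypothetical.

```python
import statistics

def tukey_outliers(values):
    """Return (q1, q3, outliers) using the 1.5 * IQR rule."""
    # The "inclusive" quartile method is an assumption of this sketch.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return q1, q3, outliers

# Hypothetical profits: most companies cluster together, two do not.
profits = [-5, 2, 3, 4, 5, 6, 7, 8, 9, 60]
q1, q3, outliers = tukey_outliers(profits)
print(q1, q3, outliers)
```

The box spans q1 to q3, so it contains the middle half of the values, and the flagged values correspond to the points drawn beyond the whiskers.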

Normal Quantile Plot

A normal quantile plot is a chart for visualizing the extent to which a column is consistent with a normal distribution. If the data are normally distributed, the points fall close to the solid red line in the display and stay within the confidence curves.

Figure 3.23 Normal Quantile Plot of Profits with Box Plot

Data Table Access

To access the data table used for the above chart example, follow Help > Sample Data Library > Financial.jmp.

Usage

To view the properties of a continuous column and visually assess the extent to which the data are normally distributed. (See Figure 3.23.)

In this example, the plot displays the profits from a sample of companies. The data do not follow the solid red line; some fall beyond the dotted red confidence bands. The data are not consistent with a normal distribution.

Required

One or more continuous columns for the Y, Columns role.

Select Analyze > Distribution. Drag a continuous column into the Y, Columns role, and click OK. Click the red triangle and select Normal Quantile Plot.
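
The idea behind a normal quantile plot can be sketched as pairing each sorted data value with the quantile that a standard normal distribution would have at the same rank. The Python example below is a minimal illustration. The plotting position rank / (n + 1) is an assumption of this sketch and may not match the scores JMP uses, and the data values are hypothetical.

```python
from statistics import NormalDist

def normal_quantile_pairs(values):
    """Pair each sorted value with a standard normal quantile."""
    ordered = sorted(values)
    n = len(ordered)
    std_normal = NormalDist()
    # Plotting position rank / (n + 1) is an assumption of this sketch.
    return [(std_normal.inv_cdf(rank / (n + 1)), y)
            for rank, y in enumerate(ordered, start=1)]

# Hypothetical profits with one large value in the right tail.
profits = [12.0, -3.0, 5.0, 7.5, 40.0]
pairs = normal_quantile_pairs(profits)
for z, y in pairs:
    print(f"{z:7.3f} {y:7.1f}")
```

If the column were normally distributed, plotting these pairs would give points that lie close to a straight line. Values that bend away from the line at either end suggest skewness or heavy tails.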

Mosaic Plot

A mosaic plot is a stacked bar chart where each segment is proportional to its group’s frequency count.

Figure 3.24 Mosaic Plot of Type

Data Table Access

To access the data table used for the above chart examples, follow Help > Sample Data Library > Financial.jmp.

Usage

To view the properties of a nominal or ordinal distribution, or to visually assess the proportions of data that fall within each group. As shown, the chart displays the proportions or counts of each type of company from a stock portfolio. (See Figure 3.24.)

Required

One or more nominal or ordinal columns for the Y, Columns role.

Select Analyze > Distribution. Drag a nominal or ordinal column into the Y, Columns role, and click OK. Click the red triangle and select Mosaic Plot.

Optional

To generate a mosaic plot like the one pictured using Graph Builder, select Graph > Graph Builder. Drag a nominal column to the Y drop zone. Select the Mosaic element from the Elements palette.

Time Series

Time Series is a separate platform that generates a graph of a numeric value over time. It also serves as a platform to employ forecasting techniques and produces statistical results. For more information about these techniques, see the JMP Statistics and Graphics Guide (Help > JMP Documentation Library > Predictive and Specialized Modeling). The Time Series platform is available from the Analyze menu and the Modeling submenu. (See Figure 3.25a.)

Figure 3.25a Analyze > Modeling > Time Series

Data Table Access

To access the data table used for these chart examples, follow Help > Sample Data Library > Time Series > Raleigh Temps.jmp.

Time Series Plot

A graph of numeric values, Y, over a time order, X.

Usage

To view and fit the variability and potential seasonality of a measured value over time. For example, the chart displays the average monthly temperatures with a clear seasonal trend in Raleigh, North Carolina, over a 130-month period. (See Figure 3.25b.)

Figure 3.25b Time Series

Required

One numeric column for the Y, Time Series role. Options include a numeric Time column (X, Time ID) with corresponding values and an input column. If an X, Time ID column is not specified, JMP orders the data over the rows sequentially. It is assumed that the time interval is constant between every pair of time points.

Select Analyze > Modeling > Time Series and drag a continuous column to the Y, Time Series role. Optionally, select a numeric column for the X, Time ID role. Click OK.

3.5 Graphs Comparing Two Columns

The Fit Y by X command studies the relationship between two columns of data. This command is available from the Analyze menu and shows graphs with statistical results for each pair of x and y columns. The type of graph generated by JMP is determined by the modeling types (continuous, nominal, or ordinal) of the columns that are cast into the X and Y roles. When choosing a platform from the Analyze menu, JMP always creates the right graphs based on the modeling type. In important ways, Fit Y by X is four sets of graphs and analyses in one!

The matrix circled in the Fit Y by X window (see Figure 3.26) provides a visual preview of the graphs that will be generated depending on the modeling type of the Y (the vertical axis) and the X (the horizontal axis), which can be altered to obtain the desired analysis or plot.

Note

If the column modeling types nominal, ordinal, and continuous are unfamiliar to you, see Section 2.3.

Figure 3.26 Fit Y by X Contextual

Scatterplot

A scatterplot is a graph of the continuous-by-continuous personality within the Fit Y by X command. The analysis begins as a scatterplot of points to which you can interactively add a linear fit and confidence curves.

Data Table Access

To access the data table used for the above chart examples, follow Help > Sample Data Library > Financial.jmp.

Usage

To view the relationship of a continuous column to another continuous column. An example might be graphing the relationship of profits to assets for a selection of Fortune 500 companies and then fitting a regression line with 95% confidence curves, as shown. (See Figures 3.27a, 3.27b, 3.27c, and 3.27d.)

Required

One continuous column for the Y, Response role and one continuous column for the X, Factor role.

Select Analyze > Fit Y by X, select a continuous Y, Response column and a continuous X, Factor column, and click OK. (See Figure 3.27a.)

To add the simple linear least squares fit: From the red triangle next to Bivariate Fit, select Fit Line. (See Figure 3.27b.)

To add the confidence shaded curves to the fit: From the red triangle on the Linear Fit item (see Figure 3.27c), select Confid Shaded Fit. (See Figure 3.27d.)

Optional

Using the Graph Builder method, select Graph > Graph Builder and drag a continuous column to the Y drop zone and a continuous column to the X drop zone. Select the Line Of Fit element from the elements palette.

Note

Scatterplots and bivariate fits are also supported in the Graph Builder Platform. These graphs can show colored markers. For more information about how to color or mark rows, see Section 2.6.
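
The simple linear least squares fit described in this recipe can be sketched in a few lines of Python. This is an illustration of ordinary least squares in general, not of JMP's internals. The function name fit_line and the data values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squares and cross products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Hypothetical assets (X) and profits (Y) that lie exactly on a line.
assets = [1.0, 2.0, 3.0, 4.0]
profits = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(assets, profits)
print(intercept, slope)
```

With real data the points scatter around the line, and the confidence curves in the JMP report show how precisely the line is estimated.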

Scatterplot (with Polynomial Fit)

A scatterplot with polynomial fit is a graph that fits a polynomial curve of the degree you select from the Fit Polynomial submenu. After you select the polynomial degree, the curve is fit to the data points using least squares regression.

Figure 3.28 Bivariate 3rd Order Fit

Data Table Access

To access the data table used for the above chart example, follow Help > Sample Data Library > Financial.jmp.

Usage

To view the relationship of a continuous column to another continuous column using a polynomial fit, which is useful when a curve describes the data better than a straight line. The chart displays the third-order polynomial fit showing the relationship of profits to number of employees for a selection of companies. (See Figure 3.28.)

Required

One continuous column for the Y, Response role and one continuous column for the X, Factor role.

Select Analyze > Fit Y by X, select a continuous Y, Response column and a continuous X, Factor column, and click OK. From the red triangle, select Fit Polynomial and select a degree number from the submenu.

Optional

Using the Graph Builder method, select Graph > Graph Builder and drag a continuous column to the Y drop zone and a continuous column to the X drop zone. Select the Line Of Fit element from the elements palette, then adjust the degree of fit in the Line Of Fit panel.

Scatterplot (with Spline Fit)

A scatterplot with spline fit is a chart that fits a smoothing spline that varies in smoothness (or flexibility) according to a tuning parameter in the spline formula. You can use a spline of varying smoothness to highlight the overall trends in the data without using a linear function to describe the relationship. Larger values of the tuning parameter (lambda) smooth the relationship more, while smaller values fit the data more closely.

Figure 3.29 Bivariate Spline Fit

Data Table Access

To access the data table used for the above chart example, follow Help > Sample Data Library > Financial.jmp.

Usage

To view the relationship of a continuous column to another continuous column. For example, the chart illustrates that limited profit variation is present in companies with lower numbers of employees. The plot also shows that fewer companies have higher numbers of employees and higher profit variation. (See Figure 3.29.)

Required

One continuous column for the Y, Response role and one continuous column for the X, Factor role.

Select Analyze > Fit Y by X. Select a continuous column and place it in the Y, Response role. Select a continuous column, place it in the X, Factor role, and click OK. From the red triangle, select Flexible > Fit Spline, and from the submenu, select the degree of flexibility that you want in the spline fit by choosing the lambda value.

Optional

Using the Graph Builder method, select Graph > Graph Builder and drag a continuous column to the Y drop zone and a continuous column to the X drop zone. Adjust lambda for the smoother element to the value desired.

Oneway Plots

The Oneway platform analyzes how the distribution of a continuous Y column differs across groups defined by a categorical X column. Group means, as well as other statistics and tests, can be calculated and tested. The Oneway platform is the continuous (placed as Y) by nominal/ordinal (placed as X) personality of the Fit Y by X command.

Data Table Access

To access the data table used for these chart examples, follow Help > Sample Data Library > Financial.jmp.

Usage

To compare the distribution of a continuous column across groups in a nominal or ordinal column. For example, the chart displays the difference in means and variation in profits per employee across six company types. (See Figures 3.30a, 3.30b, and 3.30c.)

Required

One continuous column for the Y, Response role and one nominal or ordinal column for the X, Factor role.

Select Analyze > Fit Y by X. Select a continuous column for the Y, Response role and a nominal or ordinal column for the X, Factor role, then click OK. (See Figure 3.30a.)

From the red triangle, select Means/Anova. (See Figure 3.30b.) From the red triangle, select Quantiles. (See Figure 3.30c.)

Logistic Fit

The Logistic platform analyzes the probability of occurrence of one level of a categorical Y column across a continuous X column. Odds ratios and ROC curves are often reported from this type of analysis. The Logistic platform is the nominal/ordinal (placed as Y) by the continuous (placed as X) personality of the Fit Y by X command.

The Logistic platform displays a chart that estimates the probability of choosing one of the Y response levels as a smooth function of the X factor. The fitted probabilities will be between 0 and 1 and will sum to 1 across the response levels for a given factor value.

In a logistic probability plot, the Y axis represents probability.

Figure 3.31 Fit Y by X Logistic

Data Table Access

To access the data table used for Figure 3.31, follow Help > Sample Data Library > Companies.jmp.

Usage

To predict a group or groups by some continuous column. The chart displays a prediction (as a probability/percentage) of the type of company (Computer or Pharmaceutical) by its profits per employee. (See Figure 3.31.)

Required

One nominal or ordinal column for the Y, Response role and one continuous column for the X, Factor role.

Select Analyze > Fit Y by X. Select a nominal or ordinal column for the Y, Response role and a continuous column for the X, Factor role, then click OK.
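
For a response with two levels, the smooth probability curve in a logistic plot can be sketched with the logistic function. The Python example below is a minimal illustration. The intercept and slope are hypothetical and are not estimates from Companies.jmp, and the level names are placeholders.

```python
import math

def logistic_probabilities(x, intercept, slope):
    """Return (P(level A), P(level B)) for a two-level response."""
    p_a = 1.0 / (1.0 + math.exp(-(intercept + slope * x)))
    return p_a, 1.0 - p_a

# Hypothetical coefficients, chosen only to show the shape of the curve.
INTERCEPT, SLOPE = -1.0, 0.05

for x in (-20.0, 0.0, 20.0, 60.0):
    p_a, p_b = logistic_probabilities(x, INTERCEPT, SLOPE)
    print(f"x={x:6.1f}  P(A)={p_a:.3f}  P(B)={p_b:.3f}")
```

Each pair of probabilities lies between 0 and 1 and sums to 1, which is why the logistic plot can divide the vertical axis among the response levels at every value of X.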

Mosaic Plot

A mosaic plot is a chart that is divided into small rectangles such that the area of each rectangle is proportional to a frequency count of interest.

The mosaic plot appears in the Contingency Platform and is the personality of the Fit Y by X command when both the Y and X columns are nominal or ordinal. Mosaic examines the distribution of a categorical Y column by the values of a categorical X column.

Figure 3.32 Fit Y by X Contingency Mosaic Plot

Data Table Access

To access the data table used for Figure 3.32, follow Help > Sample Data Library > Consumer Preferences.jmp.

Usage

Group-by-group counts are shown as proportional colored rectangles in a two-way arrangement. As shown, the graph displays the proportion of seven age groups within each of four employee tenure groups in a large company. Older employees are represented in red shades, while younger employees are represented in blue shades. (See Figure 3.32.) From this graph, you can see that half of the employees with less than 5 years of tenure are between 25 and 29 years old, and half of those with more than 20 years of tenure are older than 54.

Required

One nominal or ordinal column for the Y, Response role and at least one nominal or ordinal column for the X, Factor role.

Select Analyze > Fit Y by X. Select a nominal or ordinal column for the Y, Response role and a nominal or ordinal column for the X, Factor role, and click OK.

Optional

Using the Graph Builder method, select Graph > Graph Builder and drag a nominal or ordinal column to the Y drop zone and a nominal or ordinal column to the X drop zone. Select the mosaic icon from the Elements palette.
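
The heights of the rectangles within each X group of a mosaic plot are the shares of the Y levels in that group. A minimal Python sketch of that calculation follows. The function name and the tenure and age labels are hypothetical and are not rows from Consumer Preferences.jmp.

```python
from collections import Counter, defaultdict

def proportions_within_x(pairs):
    """For each X level, return the share of rows in each Y level."""
    counts = defaultdict(Counter)
    for x_level, y_level in pairs:
        counts[x_level][y_level] += 1
    return {
        x_level: {y: c / sum(y_counts.values()) for y, c in y_counts.items()}
        for x_level, y_counts in counts.items()
    }

# Hypothetical (tenure group, age group) pairs, one per employee.
rows = [
    ("<5 years", "25-29"), ("<5 years", "25-29"),
    ("<5 years", "30-34"), ("<5 years", "35-39"),
    (">20 years", ">54"), (">20 years", "50-54"),
]
shares = proportions_within_x(rows)
print(shares["<5 years"])
```

In a mosaic plot, the width of each X group is also proportional to the number of rows in that group, so the area of every rectangle reflects its count.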

3.6 Graphs Displaying Multiple Columns

Sometimes it is valuable to see a problem in more than two dimensions. This section uses JMP to visualize three or more columns at once. These graphs appear under the Graph menu and contain only a few built-in analytic procedures. (See Figure 3.33.) The Profiler, discussed at the end of this section, also appears in the results of the Fit Model platform in the Analyze menu.

Figure 3.33 Graph Menu

Like most JMP graphs, these multi-dimensional (or multi-column) graphs are interactive and enable you to select, rotate, and animate them. You can copy and paste these into other documents. At this book’s release, many graphs can be saved as interactive HTML and retain their interactivity when opened in a web browser. Sharing these graphs is discussed in detail in Chapter 7.

Scatterplot 3D

Accessible from the Graph menu, this chart displays a three-dimensional scatterplot that can be rotated with your mouse. The Scatterplot 3D platform displays three columns at a time from the columns that you select.

Figure 3.34 Scatterplot 3D

Data Table Access

To access the data table used for Figure 3.34, follow Help > Sample Data Library > Financial.jmp.

Usage

To view patterns among any two or three columns of data. This plot is very useful for exploring data in three dimensions. The chart displays sales (Y axis) by number of employees (X axis) by profits (Z axis) with colored 3D density contours by company type. (See Figure 3.34.) For example, note the two points with a very high employee count also have very high profits. The eye can detect possible differences among company types across the three columns. This is an interactive plot and can be rotated on any axis. To rotate the graph, click and hold the graph and move the mouse.

Required

Two or more columns of any modeling type (can be continuous, nominal, or ordinal). Three columns are required for a three-dimensional plot.

Select Graph > Scatterplot 3D. Select at least two columns (three are recommended), place them in the Y, Columns role, and click OK. To include surfaces as displayed, from the red triangle, select Nonpar Density Contour, grouped by Type.

Treemap

A treemap is a graphical technique for observing patterns among groups that have many levels. Treemaps are especially useful in cases where histograms are ineffective because there are so many bars.

Figure 3.35 Treemap Airline Delays

Data Table Access

To access the data table used for Figure 3.35, follow Help > Sample Data Library > Airline Delays.jmp.

Usage

For example, the chart displays airline arrival delays by month, where hot colors represent longer delays and cold colors represent shorter delays by airline. Larger squares represent longer delays as well. (See Figure 3.35.) These maps produce convenient visual rankings or groups within groups.

Required

At least one nominal or ordinal column for the X role.

Select Graph > Graph Builder and drag Airline and Month to the X drop zone. Select the treemap icon from the Elements palette. Drag Arrival Delay to the Color drop zone. Click Done. Experiment!

Packed Bar Chart

A packed bar chart is the synthesis of an ordered bar chart and a treemap. Like treemaps, packed bars are especially useful in cases where there are dozens or even hundreds of levels of a categorical variable. Packed bar charts display these levels in ranked order, highlighting the most frequently occurring values.

Figure 3.36 Packed Bars from Billion Dollar Events

Data Table Access

To access the data table used for the example, follow Help > Sample Data Library > Billion Dollar Events.jmp.

Usage

For example, in Figure 3.36, the chart displays the top categories as a bar chart with blue bars. Secondary categories are labeled and in gray. In this example, you can clearly see that Katrina in 2005 had the largest impact, followed by Harvey and Maria in 2017.

Required

One continuous column for the X drop zone and one nominal for the Y drop zone.

Select Graph > Graph Builder. Select Unique Event and drag it to the Y zone. Select Cost and drag it to the X zone. Select the Bar element. In the Bar options panel:

• For the Bar Style, select Packed.

• Change Packed Primaries to 10.

• Move the Packed Labeling slider down until it’s about halfway.

Parallel Plot

A parallel plot is used when visualizing more than three dimensions in one graph. It is useful for seeing how the values of the many variables separate or stay together as you scan the plot from left to right. The parallel plot is scalable to any number of variables.

Figure 3.37 Parallel Plot of Passenger Information on the Titanic

Data Table Access

To access the data table used for the example, follow Help > Sample Data Library > Titanic.jmp.

Usage

For example, in Figure 3.37, the chart displays the three top predictors of survival among the Titanic passengers. The thickness of the curved lines is reflective of the number of passengers in each of the groups. In this example, when clicking on the class you can see the relative passenger frequencies across the other category variables and whether this group survived or not. Try it!

Required

Continuous or nominal columns for the X drop zone.

Select Graph > Graph Builder. Select Class, Age, Sex, and Survived and drag the columns to the X zone. Select the Parallel element. Select Class and drag it to the Color drop zone.

Bubble Plot

A bubble plot is an interactive scatterplot that represents its points as circles (bubbles). Optionally, the bubbles can be sized according to another column, colored by yet another column, and dynamically indexed by a time column. With the opportunity to see up to five dimensions at once (x position, y position, size, color, and time), bubble plots can produce dramatic animated visualizations and are effective at communicating complex relationships.

Figure 3.38 Bubble Plot Window with Assigned Variables

Data Table Access

To access the data table used for these chart examples, see the author page for this book. The original data comes from the World Bank: https://datacatalog.worldbank.org/dataset/world-development-indicators

Usage

Summarizing multi-column data in an interactive two-dimensional display. Frequently used where time is one of the columns. The bubble plot in this example displays the relationship between life expectancy at birth and health expenditure in 2010 U.S. dollars. (See Figure 3.39.)

Figure 3.39 Bubble Plot Life Expectancy Versus Health Expenditure

Required

One X column and one Y column of any type (continuous, nominal, or ordinal). For time animation, select a column for the Time role.

Select Graph > Bubble Plot. Select Life expectancy at birth, total (years) and place it in the Y role. Select Health expenditure (2010 int.-$) and place it in the X role. For the animated time-based plot, select the Time column as Year. The ID column produces a label for each bubble. For this plot, select Entity as the ID role. The Sizes and Coloring columns can be specified to increase information density. For this plot, select Total population (Gapminder) as the Sizes role and select Entity for the Coloring role. (See Figure 3.38.)

To generate trail lines (as shown), select the Trail Lines and Trail Bubbles options from the red triangle. To save an interactive view of this plot for viewing in a web browser, select File > Save As and choose Interactive HTML with Data.

Note

Bubble plots are particularly compelling when shared in a JMP Live environment. For more information about sharing results with JMP Live, see Section 7.6 “Publishing Reports in JMP Public (and JMP Live).”

To turn off the legend, select the red triangle, then select Legend.

Select the play button to start the animation. You can choose to filter the graph to include only data since 1970 as is done in the data filter shown. Use the Year, Speed, and Bubble Size sliders to modify the data display and animation. Experiment!

Scatterplot Matrix

A scatterplot matrix is a chart that provides quick and orderly production of many bivariate graphs. These are assembled so that comparisons among many columns can be conducted visually and so that correlations and data patterns can be easily detected. In addition, the plots can be customized and can use advanced features (such as density ellipses) to provide for further analysis.

Figure 3.40a Scatterplot Matrix with Density Ellipses

Figure 3.40b Scatterplot Matrix with Density Ellipses and Groups

Data Table Access

To access the data table used for the above examples, follow Help > Sample Data Library > US Demographics.jmp.

Usage

In this example, the scatterplot matrix provides every bivariate combination of nine columns of US state demographic data. (See Figures 3.40a and 3.40b.) This chart quickly produces many correlation plots of all variables specified for easy identification of interesting groups and patterns.

Required

Two or more columns of any type (continuous, nominal, or ordinal) for the Y, Columns role. More than two columns are recommended.

Select Graph > Scatterplot Matrix.

Select two or more columns for the Y, Columns role to produce a matrix of graphs. To plot one set of columns against another, also select columns for the X role. Optionally, select a nominal or an ordinal column and place it in the Group role.

To include grouped ellipses as shown, include a column for the Group role in the window. Then for the plot, select Density Ellipses from the red triangle.

Profiler

The profiler is an interactive graph that provides a simple way to view complex relationships within a model. It lets you visualize what-if scenarios quickly and easily by enabling you to see the effect that changes in one column have on the remaining columns. This tool is especially useful when describing multiple variable models by demonstrating the sensitivity of changes in one or more X columns on the predicted Y.

The profiler displays profile traces for each X column. A profile trace is the predicted response as one column is changed (by dragging the vertical red dotted line in the graphs) while the others are held constant at the current values. The profiler recalculates the predicted responses (in real time) as you vary the value of an X column.
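
A profile trace can be sketched outside JMP as repeated evaluation of a prediction formula while one input changes and the others stay fixed. The Python example below is illustrative only. The formula, its coefficients, the column names, and the current settings are all hypothetical and are not the model fitted to Financial.jmp.

```python
def predicted_profit(sales, employees, assets):
    """Hypothetical prediction formula; the coefficients are made up."""
    return 10.0 + 0.08 * sales + 0.5 * employees - 0.03 * assets

def profile_trace(formula, current, column, grid):
    """Vary one input over grid while holding the others at current values."""
    trace = []
    for value in grid:
        settings = dict(current, **{column: value})
        trace.append((value, formula(**settings)))
    return trace

# Hypothetical current settings for the X columns.
current = {"sales": 4000.0, "employees": 20.0, "assets": 3000.0}

sales_trace = profile_trace(predicted_profit, current, "sales",
                            [2000.0, 4000.0, 6000.0])
assets_trace = profile_trace(predicted_profit, current, "assets",
                             [1000.0, 3000.0, 5000.0])
print(sales_trace)
print(assets_trace)
```

Changing a value in current and recomputing every trace is the sketch's counterpart of dragging one of the vertical lines in the profiler.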

Figure 3.41 Prediction Profiler

Data Table Access

To access the data table used for Figure 3.41, follow Help > Sample Data Library > Financial.jmp.

Usage

In this example, the chart displays the relationship between a continuous column Profits($M) and four predictors: sales [Sales($M)], number of employees (#emp), assets [Assets($Mil.)], and stockholder’s equity [Stockholder’s Eq($Mil.)]. (See Figure 3.41.)

The profiler shows that as sales increase, predicted profits increase, and as assets increase, predicted profits decrease. To generate interactive, real-time predictions for profits, drag the vertical red trace lines in the profiler in JMP. To save the profiler as an interactive web page, select File > Save As, then choose Interactive HTML with Data as the file type.

Required

A prediction formula, which can be obtained in one of two ways: generated within a Fit Model report, or stored in a formula column that was either saved from Fit Model or entered in a data table by hand.

To create a formula in a column manually, see Section 2.4.

Create a Profiler from Fit Model

Select Analyze > Fit Model, select one or more columns for the Y role, and add one or more columns to Construct Model Effects. From the Emphasis drop-down menu, select Effect Screening, and click Run Model. The profiler appears at the bottom of the report.

Create a Profiler for Use Outside of JMP

From the Fit Model report window from the previous step, select the red triangle and select Save Columns > Prediction Formula. This action saves a prediction formula to a JMP data table column as the last column in the table.

Once you have a prediction formula in a column, select Graph > Profiler from the top menu. Select the prediction formula column that has appeared in the data table, place it in the Y, Prediction Formula role, and click OK. This implementation enables you to save the profiler as an interactive HTML file. Select File > Save As and choose Interactive HTML with Data.

3.7 Word Cloud

Collected data is often free-form text. For example, you might have comments on a product, reports about accidents or incidents, or open-ended questions on a questionnaire. The Text Explorer in JMP enables you to analyze this unstructured text. The process can include excluding specified words, combining similar terms by stemming or recoding, and analysis based on term counts. A word cloud is often an excellent way to visualize this analysis of text data. (See Figure 3.42.)

Figure 3.42 Word Cloud for Narrative Cause of Aircraft Incidents

Data Table Access

To access the data table used for the above example, follow Help > Sample Data Library > Aircraft Incidents.jmp.

Usage

To illustrate, we will use the Aircraft Incidents sample data table. This has information about over 1900 aircraft incidents and accidents that occurred in 2001. Included in the data table is a narrative cause column containing comments on the cause of the incident. Examples of these entries include:

● Improper weather evaluation by both the pilot and pilot/passenger, and the pilot’s inadvertent VFR flight into IMC resulting in his spatial disorientation. Factors were the pilot rated passenger’s spatial disorientation, fog, and night conditions.

● The flight instructor’s failure to ensure (supervision) the student had an adequate supply of fuel available, and the student’s failure to refuel the aircraft sufficiently resulting in a fuel starvation/exhaustion condition and total power loss. Contributing factors were the trees and unsuitable terrain at the forced landing site.

● A loss of right engine power due to chaffed and frayed wiring on the right and left magnetos on the right engine resulting in a forced landing, an inadvertent stall, and subsequent impact with the ground.

Begin by opening the sample data table Aircraft Incidents.jmp.

1. Select Analyze > Text Explorer.

2. Assign the column Narrative Cause to the Text Columns role (Figure 3.43).

3. Select OK.

Figure 3.43 Text Explorer for Narrative Cause of Aircraft Incidents

The results show there are 2574 terms found in the 1906 rows in the data table. The terms and phrases that appear most frequently are found at the top of their respective lists.
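
To make the idea of term counts concrete, here is a minimal Python sketch of how terms could be tallied across rows of text. It is only an illustration of the concept, not how the Text Explorer works internally. The `tokenize` and `term_counts` functions and the short `STOP_WORDS` list are our own assumptions, and the three strings are excerpts from the narrative examples quoted earlier rather than the full data table.

```python
import re
from collections import Counter

# Short excerpts from the narrative examples quoted above (not the full table).
narratives = [
    "Improper weather evaluation by both the pilot and pilot/passenger",
    "the student's failure to refuel the aircraft sufficiently",
    "A loss of right engine power due to chaffed and frayed wiring",
]

# Hypothetical stop-word list for this sketch; JMP's built-in list differs.
STOP_WORDS = {"a", "and", "by", "the", "to", "of", "due", "both"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (apostrophes kept)."""
    return re.findall(r"[a-z']+", text.lower())


def term_counts(documents, stop_words=STOP_WORDS):
    """Count every token across all documents, skipping the stop words."""
    counts = Counter()
    for doc in documents:
        counts.update(t for t in tokenize(doc) if t not in stop_words)
    return counts


counts = term_counts(narratives)
# The most frequent terms come first, as in the Text Explorer term list.
print(counts.most_common(3))
```

Excluding the stop words here plays the same role as excluding specified words in the Text Explorer: common words that carry little meaning do not crowd out the informative terms.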

4. To visualize the terms and the frequency of their appearance, click on the red triangle and select Display Options > Show Word Cloud (Figure 3.44).

Figure 3.44 Word Cloud for Narrative Cause of Aircraft Incidents

The words that appear largest in the word cloud are those that appear most often in the text being analyzed. A few things that you might note in these results:

● Pilot’s and pilot are different words. Resulting and resulted also appear separately in the Word Cloud. Words like these could be combined by stemming the words. This is the process of reducing the terms to their roots so that all are treated as the same term in the analysis.

● Airplane and aircraft are both in the word cloud. These words might be substitutes for one another, so you might want to recode aircraft to read airplane (or the other way around if you prefer).

5. To recode aircraft to airplane, click on the red triangle at the top of the results and select Term Options > Manage Recodes…

6. In the Manage Recodes window, under the User column, type in aircraft as the old value and airplane as the new value. (See Figure 3.45.)

Figure 3.45 Managing Recodes

7. Click the button that adds this recode to the user list, and then select OK.

8. To stem the words, click on the red triangle at the top of the results and select Term Options > Stemming > Stem for Combining (Figure 3.46).

9. Examine the differences between the two word clouds.

Figure 3.46 Word Cloud After Recoding and Stemming

You will note that some of the words have moved (and changed size) in the new word cloud. You will also see dots at the end of some words. These mark terms that were stemmed and then combined under a common stem.
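
If it helps to see why recoding and stemming change the counts, the following Python sketch mimics the two steps on the terms discussed above. The `RECODES` mapping, the `SUFFIXES` tuple, and the `stem` and `normalize` functions are hypothetical names of our own, and the suffix-stripping rule is a deliberately crude stand-in. JMP's stemming rules are more sophisticated and are not reproduced here.

```python
from collections import Counter

# Recode chosen in the steps above: treat "aircraft" as "airplane".
RECODES = {"aircraft": "airplane"}

# Crude suffix list for illustration only, checked in this order.
SUFFIXES = ("'s", "ing", "ed", "s")


def stem(term):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if term.endswith(suffix) and len(term) - len(suffix) >= 3:
            return term[: -len(suffix)]
    return term


def normalize(term):
    """Apply the recode first, then reduce the result to its stem."""
    return stem(RECODES.get(term, term))


terms = ["pilot's", "pilot", "resulting", "resulted", "aircraft", "airplane"]
combined = Counter(normalize(t) for t in terms)
print(combined)
```

Six distinct terms collapse into three, each counted twice. This is why some words grow in the second word cloud: their counts now include every variant that was combined with them.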

The default organization of the word cloud is Ordered, with the most frequently occurring terms at the top left and diminishing in size as you move from left to right and top to bottom. You might prefer that the words be more randomly arranged in an oval (or cloud), with their size still representing the frequency of occurrence.

10. To change the layout of the Word Cloud, click the Word Cloud red triangle and select Layout > Centered (Figure 3.47).

Figure 3.47 Word Cloud with Centered Layout
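
In either layout, the size of a word reflects how often it occurs. The Python sketch below shows one simple way such a rule could work: it orders terms from most to least frequent, as in the Ordered layout, and scales font size linearly with the count. The `font_sizes` function, the point sizes, and the example counts are all illustrative assumptions of ours. They are not taken from JMP or from the Aircraft Incidents table.

```python
def font_sizes(counts, min_size=10.0, max_size=48.0):
    """Return (term, size) pairs, most frequent first, with linearly scaled sizes."""
    if not counts:
        return []
    # Sort by descending count; break ties alphabetically for a stable order.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    low = min(counts.values())
    high = max(counts.values())
    span = high - low
    result = []
    for term, n in ordered:
        # When every count is equal, draw every term at the maximum size.
        fraction = (n - low) / span if span else 1.0
        result.append((term, min_size + fraction * (max_size - min_size)))
    return result


# Illustrative counts only; these are not values from the sample data table.
example = {"pilot": 30, "failure": 20, "landing": 10}
layout = font_sizes(example)
for term, size in layout:
    print(f"{term}: {size:.1f} pt")
```

A Centered layout would keep the same sizes and change only where each word is placed.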

When exploring this unstructured text data with word clouds, it can be interesting to see whether the terms used are related to other columns in the data table. For example, you could run this analysis separately for each level of a categorical column such as Fatal. To do so, redo the analysis with Fatal assigned as a By variable and compare the resulting word clouds.
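
Conceptually, a By variable splits the rows into groups and repeats the term counting within each group. Here is a minimal Python sketch of that idea. The `counts_by_group` function is our own, and the rows and group labels are made-up placeholders rather than values from the Aircraft Incidents table.

```python
import re
from collections import Counter, defaultdict

# Placeholder rows of (group level, text); not taken from the sample data table.
rows = [
    ("Group A", "engine power loss"),
    ("Group A", "engine fire"),
    ("Group B", "fuel exhaustion"),
]


def counts_by_group(rows):
    """Build a separate term counter for each level of the grouping column."""
    groups = defaultdict(Counter)
    for level, text in rows:
        groups[level].update(re.findall(r"[a-z']+", text.lower()))
    return dict(groups)


by_group = counts_by_group(rows)
for level, counts in by_group.items():
    print(level, counts.most_common(2))
```

Each group's counter would feed its own word cloud, so terms that dominate one group but not another stand out when the clouds are compared side by side.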

3.8 Summary

This chapter presented a series of the most frequently used graphs and their step-by-step recipes. They were presented in a cookbook style so that each graph can be recognized by its picture, name, or definition and easily replicated with simple steps. These charts represent the first tier of commonly used graphs. You can find additional graphs and instructions on how to create them in the JMP documentation, including Essential Graphing. Go to Help > JMP Documentation Library > Essential Graphing for more information.
