QMB 6305 - Quantitative Methods for Business - Fall 1997
MINITAB handout
MINITAB Release 7.1 - Standard Version
These notes provide a simple introduction to the use of Minitab (the
Standard version). The commands not included in the Student version
are listed at the end of the handout. The topics include:
0. INTRODUCTION
1. HOW TO GET HELP
2. HOW TO INPUT AND TRANSFORM DATA
3. SAVING AND RETRIEVING DATA
4. REGRESSION (REGRESS)
5. RESIDUALS (REGRESS RESIDS)
6. PREDICTION (REGRESS PREDICT)
7. PLOTS (TSPLOT, PLOT)
8. STORE COMMANDS (STORE)
9. SENDING OUTPUT TO A FILE OR TO A PRINTER.
0. INTRODUCTION
To start Minitab, type the following at the DOS prompt
Minitab
and hit ENTER. Once you are in Minitab the prompt
MTB >
indicates that Minitab is waiting for a command.
1. HOW TO GET HELP
HELP. To obtain help at the Minitab prompt type the following
commands one at a time, and follow the directions to get additional
information.
MTB > help
MTB > help commands
MTB > help overview
2. HOW TO INPUT AND TRANSFORM DATA
Data can be input from the keyboard or read in from a file.
2.1. DATA FROM THE KEYBOARD
Here is a very simple example of a Minitab program:
READ C1 C2
40 18
36 32
14 10
END
LET C3 = C1 + C2
PRINT C1-C3
The first command, READ, says to put the data that follow into C1
and C2. The command END tells Minitab you are finished typing
data. At this point, the worksheet looks like:
C1 C2
------------------------------------------
40 18
36 32
14 10
------------------------------------------
The command LET C3 = C1 + C2 tells Minitab to add C1 and C2. After
this command, the worksheet looks like:
C1 C2 C3
-------------------------------------------
40 18 58
36 32 68
14 10 24
-------------------------------------------
The last command, PRINT C1-C3, says to print out the numbers in C1
through C3. Minitab responds with the following output:
ROW C1 C2 C3
1 40 18 58
2 36 32 68
3 14 10 24
2.2 DATA FROM A FILE (AN ASCII FILE)
The commands READ, SET, INSERT, WRITE, SAVE, RETRIEVE, STORE,
EXECUTE, PAPER, JOURNAL and OUTFILE may refer to files. For
example:
READ 'A:FILENAME' C1 C2
reads in the contents of a file in drive A: called FILENAME, and
stores the two columns of the file in columns C1 and C2 in
Minitab. The file name must be enclosed in single quotes.
For example:
READ 'A:TREES' C1-C3
The file name must have an extension. By default Minitab appends
the .DAT extension, so in the previous example you should have
a file called TREES.DAT in drive A:. If the extension is not DAT,
for instance PRN, then you should type the whole name in the READ
command. For example:
READ 'A:TREES.PRN' C1-C3
means that you have a file in drive A: called TREES.PRN with three
columns and these are being read into columns C1, C2 and C3.
2.3. DATA TRANSFORMATION
Some examples of data manipulation follow (LOGE takes natural
logarithms, EXPO exponentiates, and LAG shifts a column down one row):
MTB > let c6 = loge(c1)
MTB > let c7 = expo(c6)
MTB > lag c1 c10
The command PRINT allows you to see on the screen the different
variables defined.
MTB > print c1 c10
ROW EXAM 1 C10
1 89 *
2 56 89
3 78 56
4 88 78
5 94 88
6 87 94
7 96 87
8 72 96
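As a rough illustration of what LAG does, the following Python sketch (not Minitab code) shifts a column down one row and marks the first row as missing. The function name lag and the use of None in place of Minitab's * are assumptions made for this example only.

```python
def lag(column, k=1):
    """Shift values down k rows; the first k rows become missing (None)."""
    if k < 0:
        raise ValueError("k must be non-negative")
    k = min(k, len(column))
    return [None] * k + list(column[:len(column) - k])


exam1 = [89, 56, 78, 88, 94, 87, 96, 72]  # C1 from the PRINT output above
c10 = lag(exam1)

# Print the rows the way PRINT shows them, with * for the missing value.
for row, (x, lagged) in enumerate(zip(exam1, c10), start=1):
    print(row, x, "*" if lagged is None else lagged)
```

The printed rows should agree with the C10 column shown above.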
Simple statistics on the variables of interest are obtained using
the command DESCRIBE.
MTB > describe c1 c2
             N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
EXAM 1       8    82.50    87.50    82.50    13.31     4.71
EXAM 2       8    84.62    84.00    84.62     7.82     2.76
           MIN      MAX       Q1       Q3
EXAM 1   56.00    96.00    73.50    92.75
EXAM 2   75.00    99.00    77.75    90.00
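To see where some of the DESCRIBE figures come from, here is a small Python sketch (not Minitab code) that recomputes N, MEAN, MEDIAN, STDEV and SEMEAN for the two exam columns. The function name describe is an assumption for this example. TRMEAN, Q1 and Q3 are left out because Minitab's exact conventions for them are not stated in this handout.

```python
import math
import statistics

exam1 = [89, 56, 78, 88, 94, 87, 96, 72]   # C1 (EXAM 1)
exam2 = [83, 77, 91, 87, 99, 80, 85, 75]   # C2 (EXAM 2)


def describe(values):
    """Return N, mean, median, sample standard deviation and SE of the mean."""
    n = len(values)
    stdev = statistics.stdev(values)           # divides by n - 1
    return {
        "N": n,
        "MEAN": statistics.mean(values),
        "MEDIAN": statistics.median(values),
        "STDEV": stdev,
        "SEMEAN": stdev / math.sqrt(n),        # standard error of the mean
        "MIN": min(values),
        "MAX": max(values),
    }


for name, col in (("EXAM 1", exam1), ("EXAM 2", exam2)):
    d = describe(col)
    print(name, d["N"], round(d["MEAN"], 2), round(d["MEDIAN"], 2),
          round(d["STDEV"], 2), round(d["SEMEAN"], 2), d["MIN"], d["MAX"])
```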
2.4. HOW TO CREATE SPECIAL VARIABLES
2.4.1. DUMMY(BINARY) VARIABLES IN THE STANDARD VERSION
A dummy variable is a binary variable that takes 0-1 values.
The command to create these variables is INDICATOR (NOT available
in the student version). Example:
If a variable in C1, for example, has 4 different values (e.g., a
variable "quarter" with values 1, 2, 3, and 4) the command
MTB > INDIC C1 C2-C5
creates four binary variables the following way:
C1 C2 C3 C4 C5
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1
1 1 0 0 0
2 0 1 0 0
3 0 0 1 0
4 0 0 0 1
. . .
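The coding in the table above can be sketched in a few lines of Python (not Minitab code). The function name indicator is an assumption for this example; it simply builds one 0-1 column per distinct value, in increasing order, which is the pattern shown for C2-C5.

```python
def indicator(column):
    """Return one 0-1 list per distinct value, in increasing order of value."""
    levels = sorted(set(column))
    return {level: [1 if x == level else 0 for x in column] for level in levels}


quarter = [1, 2, 3, 4, 1, 2, 3, 4]         # C1 in the table above
dummies = indicator(quarter)

# One line per new column: the level it flags, then its 0-1 values.
for level, col in dummies.items():
    print(level, col)
```

Every row has exactly one 1 across the four columns, so in a regression with a constant only three of the four dummies can be used together.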
2.4.2. DUMMY(BINARY) VARIABLES IN THE STUDENT VERSION
Both the command CODE and the command SET allow you to create dummy
variables. The same set of variables as in the previous example can
be created with CODE:
MTB > code (1)1 (2)0 (3)0 (4)0 c1 c2
MTB > code (1)0 (2)1 (3)0 (4)0 c1 c3
MTB > code (1)0 (2)0 (3)1 (4)0 c1 c4
MTB > code (1)0 (2)0 (3)0 (4)1 c1 c5
In the student version the command SET could also be used. The same
set of variables can be created following these steps:
MTB > set c2
DATA> 1 3(0) 1 3(0) 1 3(0) (and so on for each variable)
DATA> end
The command SET is very general and allows you to create other special
variables, such as a "time" or "observation" variable. The commands
MTB > set c1
DATA> 1:15
DATA> end
create a variable in C1 that contains the values 1,2,3,...,15.
3. SAVING AND RETRIEVING DATA
3.1. SAVING THE DATA (SAVE)
SAVE [in 'FILENAME'] a copy of the worksheet.
Subcommands: PORTABLE LOTUS
After using SAVE, FILENAME will contain all data in the worksheet,
all stored constants, column names, matrices and missing data
codes. SAVE handles alpha columns automatically, just as it does
for numeric columns. If FILENAME is not specified, Minitab will
write the saved worksheet in the file called MINITAB.MTW. Notice,
the FILENAME must be enclosed in single quotes.
3.2. RETRIEVING DATA
A file created by SAVE can only be used by Minitab's RETRIEVE
command. You cannot use your computer's editor to modify or list
it. After the file is RETRIEVED, the worksheet, stored constants,
column names and matrices will be exactly as when they were SAVED.
4. REGRESSION (REGRESS)
REGRESS C on K predictors C,...,C
REGRESS C on K pred. C,...,C [st. res. in C [fits in C]]
Subcommands: NOCONSTANT  XPXINV   COOKD    VIF
             WEIGHTS     RMATRIX  DFITS    DW
             MSE         RESIDS   HI       PURE
             COEF        TRESIDS  PREDICT  XLOF
             TOLERANCE
Subcommands available in the Student Version include:
NOCONSTANT RESIDS PREDICT DW
Fits a regression equation to data. To fit an equation without a
constant (intercept) use the subcommand NOCONSTANT. To do a
weighted fit, use the subcommand WEIGHTS. If you give an
additional column, the standardized residuals will be stored. If
you give a second column the fits will be stored in it. See HELP
REGRESS RESIDS for a definition of standardized residual. To
control the amount of printed output, use the (main) command BRIEF.
Example of a simple linear regression:
MTB > retrieve 'd:\mtabsv\data\exam'
WORKSHEET SAVED 11/18/1987
Worksheet retrieved from file: d:\mtabsv\data\exam.MTW
MTB > regress c2 1 c1 residuals c3 predict c4 coef c5
The regression equation is
EXAM 2 = 56.2 + 0.344 EXAM 1
Predictor Coef Stdev t-ratio p
Constant 56.25 16.22 3.47 0.013
EXAM 1 0.3440 0.1944 1.77 0.127
s = 6.846 R-sq = 34.3% R-sq(adj) = 23.3%
Analysis of Variance
SOURCE DF SS MS F p
Regression 1 146.70 146.70 3.13 0.127
Error 6 281.18 46.86
Total 7 427.87
MTB > print c1-c5
ROW EXAM 1 EXAM 2 C3 C4 C5
1 89 83 -0.61499 86.8607 56.2490
2 56 77 0.39169 75.5103 0.3440
3 78 91 1.24896 83.0772
4 88 87 0.07654 86.5167
5 94 99 1.73642 88.5804
6 87 80 -0.97309 86.1728
7 96 85 -0.73075 89.2683
8 72 75 -0.99078 81.0135
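Note. In this example C3 receives the standardized residuals and C4
the fits, following the REGRESS syntax given above. The REGR command
with an additional column C5 (for coefficients) only works in the
Student version.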
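The main figures in the REGRESS output above can be checked by hand. The Python sketch below (not Minitab code) applies the usual least-squares formulas for a single predictor to the exam data. The function name simple_regression and the dictionary keys are assumptions made for this example.

```python
import math

exam1 = [89, 56, 78, 88, 94, 87, 96, 72]   # predictor (C1)
exam2 = [83, 77, 91, 87, 99, 80, 85, 75]   # response  (C2)


def simple_regression(x, y):
    """Least-squares fit of y = b0 + b1*x; returns the main REGRESS quantities."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx                           # slope
    b0 = ybar - b1 * xbar                    # constant
    fits = [b0 + b1 * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fits))
    sst = sum((yi - ybar) ** 2 for yi in y)
    mse = sse / (n - 2)                      # two coefficients estimated
    return {
        "b0": b0, "b1": b1, "fits": fits,
        "s": math.sqrt(mse),
        "r_sq": 1 - sse / sst,
        "f": (sst - sse) / mse,              # regression MS / error MS
    }


fit = simple_regression(exam1, exam2)
print(round(fit["b0"], 2), round(fit["b1"], 4))
print(round(fit["s"], 3), round(100 * fit["r_sq"], 1), round(fit["f"], 2))
```

The two printed lines should agree with the Coef column, and with s, R-sq and F, in the output above.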
5. RESIDUALS (REGRESS RESIDS)
The following subcommands all store diagnostics that help you find
unusual observations.
HI put into C (leverage or diagonal of the hat matrix)
RESIDS put into C (residual = observation - fit)
TRESIDS put into C (Studentized or Studentized deleted residual)
COOKD put into C (Cook's distance)
DFITS put into C (also spelled DFFITS)
In addition, you may store the standardized residuals on the
REGRESS line. Let n = number of observations and p = number of
coefficients. Let Yi = the i-th response and Yhati = the i-th
fitted value. HI stores the leverages. The leverage of the i-th
observation is the i-th diagonal element, hi, of the projection or
hat matrix X( INV(X'X) )X', where X is the n by p design matrix.
Notice, HI depends only on the predictors; it does not involve the
response Y. If hi is larger than 2p/n, then the i-th observation
has unusual predictor values and Minitab prints an X next to the
observation in the REGRESS output. Such an observation has a large
influence in determining the regression coefficients. Note that
Var(Yhati) = MSE*hi.
RESIDS stores the residuals. The residual for the i-th observation
is ei = (Yi - Yhati). If ei is large and positive or large and
negative, then Yi is unusual.
The standardized residuals are ei/stdev(ei), where ei is the
residual and stdev(ei) = SQRT( MSE - Var(Yhati) ).
TRESIDS stores the Studentized (deleted) residuals. These are like
standardized residuals, except the calculations are done with the i-th
observation omitted from the data set, so the i-th observation
cannot influence the calculations. We use this reduced data set to
fit the regression equation, find Yhat(i), calculate MSE(i), and
estimate the variance of Yhat(i). Let e(i) = Yi - Yhat(i). Then
Var( e(i) ) = MSE(i) + Var( Yhat(i) ) and the i-th Studentized residual
is e(i)/stdev(e(i)). Since these each have a t-distribution with
(n-p-1) degrees of freedom, we call them TRESIDUALS.
COOKD: Recall that leverage, hi, tells us if an observation has
unusual predictors, and a residual tells us if an observation has
an unusual Yi value. Both Cook's distance and DFITS combine these
into one overall measure. COOKD can be viewed as the distance
between the coefficients calculated with and without the i-th
observation. COOKD for the i-th observation is:
(1/p)*( hi/(1-hi) )*(standardized residual)**2
This is algebraically equivalent to:
( (ei)**2/(p*MSE) )*( hi/(1-hi)**2 )
This can be compared with an F-distribution on p and n-p degrees of
freedom to determine whether or not COOKD is large. DFITS for the
i-th observation is SQRT( hi/(1-hi) )*(Studentized residual).
Values greater than 2*SQRT(p/n) in absolute value deserve special
attention.
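The definitions in this section can be followed step by step on the exam data. The Python sketch below (not Minitab code) computes the leverage, residual, standardized residual, Studentized residual, COOKD and DFITS for a regression with one predictor, so p = 2. The function names and dictionary keys are assumptions for this example, and the leverage formula 1/n + (xi - xbar)**2/Sxx is the special case of the hat-matrix diagonal for a single predictor plus a constant. The Studentized residual is obtained by actually refitting without the i-th observation, as the text describes.

```python
import math

exam1 = [89, 56, 78, 88, 94, 87, 96, 72]   # predictor
exam2 = [83, 77, 91, 87, 99, 80, 85, 75]   # response
P = 2                                       # coefficients: constant and slope


def fit_line(x, y):
    """Return (b0, b1, mse, xbar, sxx) for the least-squares line."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return b0, b1, sse / (n - P), xbar, sxx


def diagnostics(x, y):
    """Per-observation diagnostics, plus the MSE of the full fit."""
    n = len(x)
    b0, b1, mse, xbar, sxx = fit_line(x, y)
    rows = []
    for i in range(n):
        h = 1 / n + (x[i] - xbar) ** 2 / sxx          # leverage (HI)
        e = y[i] - (b0 + b1 * x[i])                   # residual (RESIDS)
        std = e / math.sqrt(mse - mse * h)            # standardized residual
        # Refit without observation i, as in the TRESIDS description.
        xr, yr = x[:i] + x[i + 1:], y[:i] + y[i + 1:]
        a0, a1, mse_i, xbar_i, sxx_i = fit_line(xr, yr)
        e_del = y[i] - (a0 + a1 * x[i])               # e(i) = Yi - Yhat(i)
        var_fit = mse_i * (1 / (n - 1) + (x[i] - xbar_i) ** 2 / sxx_i)
        t = e_del / math.sqrt(mse_i + var_fit)        # Studentized residual
        cook = (1 / P) * (h / (1 - h)) * std ** 2     # COOKD
        dfits = math.sqrt(h / (1 - h)) * t            # DFITS
        rows.append({"hi": h, "resid": e, "std": std, "t": t,
                     "cookd": cook, "dfits": dfits})
    return rows, mse


rows, mse = diagnostics(exam1, exam2)
for i, r in enumerate(rows, start=1):
    flag = "X" if r["hi"] > 2 * P / len(rows) else " "
    print(i, round(r["hi"], 3), round(r["std"], 5), round(r["t"], 3),
          round(r["cookd"], 3), round(r["dfits"], 3), flag)
```

The standardized residuals printed here should agree with column C3 in the regression example of Section 4, and the second observation (EXAM 1 = 56) should be flagged because its leverage exceeds 2p/n = 0.5.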
6. PREDICTION (REGRESS PREDICT)
PREDICT for E,...,E
This subcommand computes estimates of fitted Y's for given values
of the predictor variables. It prints out a table that contains
the fitted Y values, standard deviation of the fitted Y values, a
95% confidence interval, and a 95% prediction interval. "E" may
be specified as a constant such as 68 or K3, or it may be a column
containing a list of predictor values. You must specify one value
for each predictor in the regression equation. Up to 10 PREDICT
subcommands can be used with one REGRESS command.
Example:
MTB > REGRESS C10 ON 4 C1-C4;
SUBC> PREDICT 61 195 36 78.
The prediction interval output of PREDICT assumes a weight of 1.
An adjustment must be made if the WEIGHTS subcommand is used with
values other than 1.
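For a regression with one predictor, the four quantities that PREDICT prints can be sketched as follows in Python (not Minitab code). This example reuses the exam data from Section 4 rather than the four-predictor command above. The predictor value 80 is an arbitrary illustration, the function name predict is an assumption, and the t value 2.447 is taken from a t table for 6 degrees of freedom rather than computed. All observations are given weight 1.

```python
import math

exam1 = [89, 56, 78, 88, 94, 87, 96, 72]   # predictor
exam2 = [83, 77, 91, 87, 99, 80, 85, 75]   # response
T_975_6DF = 2.447   # t table value for 95% intervals with n - 2 = 6 df


def predict(x, y, x0, t_value):
    """Fit, its standard deviation, and 95% C.I. / P.I. at predictor value x0."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    mse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    fit = b0 + b1 * x0
    sd_fit = math.sqrt(mse * (1 / n + (x0 - xbar) ** 2 / sxx))
    sd_pred = math.sqrt(mse + sd_fit ** 2)      # adds the error variance
    ci = (fit - t_value * sd_fit, fit + t_value * sd_fit)
    pi = (fit - t_value * sd_pred, fit + t_value * sd_pred)
    return fit, sd_fit, ci, pi


fit, sd_fit, ci, pi = predict(exam1, exam2, 80, T_975_6DF)
print(round(fit, 2), round(sd_fit, 2))
print(tuple(round(v, 2) for v in ci), tuple(round(v, 2) for v in pi))
```

The prediction interval is always wider than the confidence interval because it also allows for the scatter of a single new observation around the fitted line.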
7. PLOTS (TSPLOT, PLOT)
TSPLOT [period = K] time series data in C
Subcommands: INCREMENT START ORIGIN TSTART
TSPLOT (the TS is for time series) plots the column of data
(vertical axis) vs the integers 1, 2, 3, ... (horizontal axis).
Often the data are a series of observations made at equally spaced
intervals of time, e.g., monthly sales figures, or yearly growth
increments.
The period K may be any integer from 1 to 36. If you specify a
period, the first observation is plotted with 1, the second with 2,
until the end of the first period. Then the next observation is
plotted with 1, then 2, etc. If the period is longer than 9, then
0 is used for 10, A for 11,..., Z for 36. A count of the number of
missing observations (if any) is printed below the horizontal axis.
If the time series is too wide to fit across the screen, the plot
is printed in several pieces. The width of TSPLOT is controlled by
the command OW, not WIDTH, as in other plotting commands. Height
is controlled by the HEIGHT command.
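The plotting symbols used when a period is given follow a simple rule, sketched here in Python (not Minitab code). The function name tsplot_symbol is an assumption for this example; the mapping itself (1 to 9, then 0 for 10, then A for 11 through Z for 36) is the one described above.

```python
def tsplot_symbol(obs_number, period):
    """Plotting symbol for the obs_number-th observation (1-based)."""
    if not 1 <= period <= 36:
        raise ValueError("period must be an integer from 1 to 36")
    position = (obs_number - 1) % period + 1     # 1..period within the cycle
    if position <= 9:
        return str(position)
    if position == 10:
        return "0"
    return chr(ord("A") + position - 11)         # 11 -> A, ..., 36 -> Z


# Monthly data (period 12): symbols for the first 14 observations.
print("".join(tsplot_symbol(i, 12) for i in range(1, 15)))  # 1234567890AB12
```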
PLOT plots scatter diagrams.
Examples:
MTB > plot c2 c1
         -                                                    *
         -
     96.0+
         -
EXAM 2   -
         -                                *
         -
     88.0+
         -                                            *
         -                                                      *
         -                                             *
         -
     80.0+                                           *
         -
         -    *
         -                        *
         -
          ----+---------+---------+---------+---------+---------+--EXAM 1
            56.0      64.0      72.0      80.0      88.0      96.0
MTB > tsplot c2
         -                   5
     97.5+
         -
EXAM 2   -
         -
         -           3
     90.0+
         -
         -               4
         -                           7
         -
     82.5+   1
         -
         -                       6
         -
         -       2
     75.0+                               8
         +-------+-------+-------+-------+
         0       2       4       6       8
8. STORE COMMANDS (STORE)
The commands STORE and EXECUTE provide both a simple macro (or
stored command file) capability and a simple looping capability.
The commands ECHO and NOECHO control printed output. A special
feature, called the CK capability, makes macros and loops very
flexible.
STORE [in 'FILENAME'] the following Minitab commands
(type in Minitab commands here)
END
EXECUTE commands [in 'FILENAME'] [K times]
Executes commands stored in the specified file, or a file named
MINITAB.MTB if no file is named. Minitab command files (macros) may
be created from within Minitab using the STORE command, or with an
editor on your computer. If you create a macro with an editor, be
sure you save it in standard (ASCII) format, and name it with the
three-letter file name extension MTB. (Naming conventions may vary
with different computers.) If you create a macro from within
Minitab, the file will automatically be given the extension MTB
unless you specify otherwise.
The CK Capability
The integer part of a column number may be replaced by a stored
constant. Here is a simple example:
LET K1 = 5
PRINT C1-CK1
Since K1 = 5, this PRINTS C1 through C5. MK also works; for
example, MK1 would be equivalent to matrix M5.
Currently there is one restriction with STORE. If you use READ, SET,
or INSERT in a STORE file and type the data following the command,
then you must not use END at the end of the data. (If you do, Minitab
will take this END of data as the end of the stored instructions.)
9. SENDING OUTPUT TO A FILE OR TO A PRINTER.
9.1. SENDING OUTPUT TO A FILE.
The command:
MTB > outfile 'a:mywork'
will send a copy of the session that follows (everything that you
type and the results) to a file called "mywork.lis" in drive A
(you must have a diskette in drive A). Minitab appends the
extension ".lis" to the file name. If you want your data sent to
the file, you must PRINT it on the screen first.
9.2. SENDING OUTPUT TO THE PRINTER.
When you type:
MTB > paper
all the commands and output on your screen are also sent to the
printer until you type NOPAPER or end your Minitab session.
Note. The following commands are NOT in the Student version:
ALPHA       CPLOT         GHISTOGRAM  INVERT     RLINE
ANCOVA      CTABLE        GLPLOT      JOURNAL    ROOTOGRAM
ANOVA       DEFINE        GMPLOT      LC         RSMOOTH
BATCH       DIAGONAL      GOPTIONS    LVALUES    TRANSPOSE
BREG        DISCRIMINANT  GPLOT       MPOLISH    TSHARE
CENTER      EIGEN         GRID        MTSPLOT    UC
CONTOUR     GBOXPLOT      GTPLOT      NOJOURNAL
COVARIANCE  GDEFINE       INDICATOR   PCA