QMB 6305 - Quantitative Methods for Business - Fall 1997

MINITAB handout

MINITAB Release 7.1 - Standard Version

These notes provide a simple introduction to the use of Minitab (the Standard version). The commands not included in the student version are listed at the end of the handout. The topics include:
0. INTRODUCTION
1. HOW TO GET HELP
2. HOW TO INPUT AND TRANSFORM DATA
3. SAVING AND RETRIEVING DATA
4. REGRESSION (REGRESS)
5. RESIDUALS (REGRESS RESIDS)
6. PREDICTION (REGRESS PREDICT)
7. PLOTS (TSPLOT, PLOT)
8. STORE COMMANDS (STORE)
9. SENDING OUTPUT TO A FILE OR TO A PRINTER.

0. INTRODUCTION
To start Minitab, type Minitab at the DOS prompt and press ENTER. Once you are in Minitab, the prompt MTB > indicates that Minitab is waiting for a command.
1. HOW TO GET HELP
HELP. To obtain help, type the following commands at the Minitab prompt, one at a time, and follow the directions to get additional information.

    MTB > help
    MTB > help commands
    MTB > help overview
2. HOW TO INPUT AND TRANSFORM DATA
Data can be input from the keyboard or read in from a file.

2.1. DATA FROM THE KEYBOARD

Here is a very simple example of a Minitab program:

    READ C1 C2
    40 18
    36 32
    14 10
    END
    LET C3 = C1 + C2
    PRINT C1-C3

The first command, READ, says to put the data that follow into C1 and C2. The command END tells Minitab you are finished typing data. At this point, the worksheet looks like:

    C1   C2
    ------------------------------------------
    40   18
    36   32
    14   10
    ------------------------------------------

The command LET C3 = C1 + C2 tells Minitab to add C1 and C2 row by row and store the sums in C3. After this command, the worksheet looks like:

    C1   C2   C3
    -------------------------------------------
    40   18   58
    36   32   68
    14   10   24
    -------------------------------------------

The last command, PRINT C1-C3, says to print out the numbers in C1 through C3. Minitab responds with the following output:

    ROW   C1   C2   C3
      1   40   18   58
      2   36   32   68
      3   14   10   24

2.2. DATA FROM A FILE (AN ASCII FILE)

The commands READ, SET, INSERT, WRITE, SAVE, RETRIEVE, STORE, EXECUTE, PAPER, JOURNAL and OUTFILE may refer to files. For example:

    READ 'A:FILENAME' C1 C2

reads the contents of a file called FILENAME on drive A: and places the two columns of the file in columns C1 and C2 in Minitab. The file name must be enclosed in single quotes. For example:

    READ 'A:TREES' C1-C3

The file name must have an extension. By default Minitab appends the .DAT extension, so for the previous example you should have a file called TREES.DAT on drive A:. If the extension is something other than DAT, for instance PRN, you must type the whole name in the READ command. For example:

    READ 'A:TREES.PRN' C1-C3

means that you have a file on drive A: called TREES.PRN with three columns, and these are being read into columns C1, C2 and C3.

2.3. DATA TRANSFORMATION

Some examples of data manipulation follow:

    MTB > let c6 = loge(c1)
    MTB > let c7 = expo(c6)
    MTB > lag c1 c10

LOGE takes natural logarithms, EXPO exponentiates, and LAG shifts C1 down one row into C10, leaving the first row missing (*). The command PRINT displays the contents of the listed columns on the screen.
    MTB > print c1 c10

     ROW   EXAM 1    C10
       1       89      *
       2       56     89
       3       78     56
       4       88     78
       5       94     88
       6       87     94
       7       96     87
       8       72     96

Simple statistics on the variables of interest are obtained using the command DESCRIBE.

    MTB > describe c1 c2

                 N     MEAN   MEDIAN   TRMEAN    STDEV   SEMEAN
    EXAM 1       8    82.50    87.50    82.50    13.31     4.71
    EXAM 2       8    84.62    84.00    84.62     7.82     2.76

               MIN      MAX       Q1       Q3
    EXAM 1   56.00    96.00    73.50    92.75
    EXAM 2   75.00    99.00    77.75    90.00

2.4. HOW TO CREATE SPECIAL VARIABLES

2.4.1. DUMMY (BINARY) VARIABLES IN THE STANDARD VERSION

A dummy variable is a binary variable that takes the values 0 and 1. The command to create these variables is INDICATOR (NOT available in the student version). Example: if the variable in C1 has 4 different values (e.g., a variable "quarter" with values 1, 2, 3, and 4), the command

    MTB > INDIC C1 C2-C5

creates four binary variables in the following way:

    C1   C2   C3   C4   C5
     1    1    0    0    0
     2    0    1    0    0
     3    0    0    1    0
     4    0    0    0    1
     1    1    0    0    0
     2    0    1    0    0
     3    0    0    1    0
     4    0    0    0    1
     .    .    .

2.4.2. DUMMY (BINARY) VARIABLES IN THE STUDENT VERSION

Both the command CODE and the command SET can be used to create dummy variables. The same set of variables as in the previous example can be created with CODE:

    MTB > code (1)1 (2)0 (3)0 (4)0 c1 c2
    MTB > code (1)0 (2)1 (3)0 (4)0 c1 c3
    MTB > code (1)0 (2)0 (3)1 (4)0 c1 c4
    MTB > code (1)0 (2)0 (3)0 (4)1 c1 c5

In the student version the command SET can also be used. The same set of variables can be created by following these steps (and so on for each variable):

    MTB > set c2
    DATA> 1 3(0) 1 3(0) 1 3(0)
    DATA> end

The command SET is very general and can also create other special variables, such as a "time" or "observation" variable. The commands

    MTB > set c1
    DATA> 1:15
    DATA> end

create a variable in C1 that contains the values 1,2,3,...,15.
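For readers who want to check these transformations outside Minitab, the following is a minimal Python sketch of what LAG, INDICATOR and SET with a range produce. It is not Minitab code. The function names and the use of None for the missing value (*) are assumptions made for illustration, and the sketch assumes that the indicator columns follow the sorted order of the distinct values, as in the "quarter" example above.

```python
# Illustrative Python equivalents of LAG, INDICATOR and SET; not Minitab code.
MISSING = None  # stands in for Minitab's missing-value code (*)


def lag(column):
    """Shift a column down one row, as LAG C1 C10 does."""
    return [MISSING] + column[:-1]


def indicator(column):
    """Return one 0-1 column per distinct value (assumed sorted order)."""
    levels = sorted(set(column))
    return [[1 if value == level else 0 for value in column] for level in levels]


def set_range(first, last):
    """Consecutive integers, as SET with the data line 1:15 produces."""
    return list(range(first, last + 1))


exam1 = [89, 56, 78, 88, 94, 87, 96, 72]   # EXAM 1 from the handout
quarter = [1, 2, 3, 4, 1, 2, 3, 4]         # the "quarter" example

print(lag(exam1))
for dummy in indicator(quarter):
    print(dummy)
print(set_range(1, 15))
```

The lagged list can be compared row by row with the C10 column printed above.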
3. SAVING AND RETRIEVING DATA
3.1. SAVING THE DATA (SAVE)

    SAVE [in 'FILENAME'] a copy of the worksheet

    Subcommands: PORTABLE LOTUS

After using SAVE, FILENAME will contain all data in the worksheet, all stored constants, column names, matrices and missing data codes. SAVE handles alpha columns automatically, just as it does for numeric columns. If FILENAME is not specified, Minitab will write the saved worksheet to the file called MINITAB.MTW. Note that the FILENAME must be enclosed in single quotes.

3.2. RETRIEVING DATA

A file created by SAVE can only be read by Minitab's RETRIEVE command. You cannot use your computer's editor to modify or list it. After the file is retrieved, the worksheet, stored constants, column names and matrices will be exactly as they were when they were saved.
4. REGRESSION (REGRESS)
    REGRESS C on K predictors C,...,C
    REGRESS C on K pred. C,...,C [st. res. in C [fits in C]]

    Subcommands: NOCONSTANT  XPXINV   COOKD    VIF
                 WEIGHTS     RMATRIX  DFITS    DW
                 MSE         RESIDS   HI       PURE
                 COEF        TRESIDS  PREDICT  XLOF
                 TOLERANCE

    Subcommands available in the Student Version include:
                 NOCONSTANT  RESIDS   PREDICT  DW

Fits a regression equation to data. To fit an equation without a constant (intercept), use the subcommand NOCONSTANT. To do a weighted fit, use the subcommand WEIGHTS. If you give an additional column, the standardized residuals will be stored in it. If you give a second additional column, the fits will be stored in it. See HELP REGRESS RESIDS for a definition of standardized residual. To control the amount of printed output, use the (main) command BRIEF.

Example of a simple linear regression:

    MTB > retrieve 'd:\mtabsv\data\exam'
    WORKSHEET SAVED 11/18/1987

    Worksheet retrieved from file: d:\mtabsv\data\exam.MTW
    MTB > regress c2 1 c1 residuals c3 predict c4 coef c5

    The regression equation is
    EXAM 2 = 56.2 + 0.344 EXAM 1

    Predictor       Coef     Stdev   t-ratio       p
    Constant       56.25     16.22      3.47   0.013
    EXAM 1        0.3440    0.1944      1.77   0.127

    s = 6.846     R-sq = 34.3%     R-sq(adj) = 23.3%

    Analysis of Variance

    SOURCE       DF        SS        MS       F       p
    Regression    1    146.70    146.70    3.13   0.127
    Error         6    281.18     46.86
    Total         7    427.87

    MTB > print c1-c5

     ROW  EXAM 1  EXAM 2        C3       C4       C5
       1      89      83  -0.61499  86.8607  56.2490
       2      56      77   0.39169  75.5103   0.3440
       3      78      91   1.24896  83.0772
       4      88      87   0.07654  86.5167
       5      94      99   1.73642  88.5804
       6      87      80  -0.97309  86.1728
       7      96      85  -0.73075  89.2683
       8      72      75  -0.99078  81.0135

As the printout shows, and in line with the syntax above, C3 holds the standardized residuals, C4 the fits, and C5 the two coefficients (constant, then slope).

Note. The REGR command with an additional column C5 (for coefficients) only works in the student version.
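The fitted equation in the example can be reproduced by hand. The sketch below, in Python rather than Minitab, applies the usual least-squares formulas for one predictor to the EXAM 1 and EXAM 2 data printed above. The function name simple_regression is an assumption made for illustration; this is not how Minitab computes the fit internally.

```python
# Least-squares fit of EXAM 2 on EXAM 1, using the data printed in the handout.
exam1 = [89, 56, 78, 88, 94, 87, 96, 72]   # predictor
exam2 = [83, 77, 91, 87, 99, 80, 85, 75]   # response


def simple_regression(x, y):
    """Return (intercept, slope, s, r_squared) for a one-predictor fit."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fits = [intercept + slope * xi for xi in x]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fits))   # error sum of squares
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total sum of squares
    s = (sse / (n - 2)) ** 0.5                             # 2 coefficients estimated
    return intercept, slope, s, 1 - sse / sst


intercept, slope, s, r_sq = simple_regression(exam1, exam2)
print(f"EXAM 2 = {intercept:.1f} + {slope:.3f} EXAM 1")
print(f"s = {s:.3f}   R-sq = {100 * r_sq:.1f}%")
```

The printed values can be compared with the Coef column, s and R-sq in the REGRESS output.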
5. RESIDUALS (REGRESS RESIDS)
The following subcommands all store diagnostics that help you find unusual observations.

    HI      put into C   (leverage or diagonal of the hat matrix)
    RESIDS  put into C   (residual = observation - fit)
    TRESIDS put into C   (Studentized or Studentized deleted residual)
    COOKD   put into C   (Cook's distance)
    DFITS   put into C   (also spelled DFFITS)

In addition, you may store the standardized residuals on the REGRESS line.

Let n = number of observations and p = number of coefficients. Let Yi = the i-th response and Yhati = the i-th fitted value.

HI stores the leverages. The leverage of the i-th observation is the i-th diagonal element, hi, of the projection or hat matrix X( INV(X'X) )X', where X is the n by p design matrix. Notice that HI depends only on the predictors; it does not involve the response Y. If hi is larger than 2p/n, then the i-th observation has unusual predictor values and Minitab prints an X next to the observation in the REGRESS output. Such an observation can have a large influence on the regression coefficients. Note, Var(Yhati) = MSE*hi.

RESIDS stores the residuals. The residual for the i-th observation is ei = (Yi - Yhati). If ei is large and positive or large and negative, then Yi is unusual. The standardized residuals are ei/stdev(ei), where ei is the residual and stdev(ei) = SQRT( MSE - Var(Yhati) ) = SQRT( MSE*(1 - hi) ).

TRESIDS stores the Studentized (deleted) residuals. These are like the standardized residuals, except the calculations are done with the i-th observation omitted from the data set, so the i-th observation cannot influence the calculations. We use this reduced data set to fit the regression equation, find Yhat(i), calculate MSE(i), and estimate the variance of Yhat(i). Let e(i) = Yi - Yhat(i). Then var( e(i) ) = MSE(i) + Var( Yhat(i) ) and the i-th Studentized residual is e(i)/stdev(e(i)). Since these each have a t-distribution with (n-p-1) degrees of freedom, we call them TRESIDUALS.

COOKD: Recall that leverage, hi, tells us if an observation has unusual predictors, and a residual tells us if an observation has an unusual Yi value. Both Cook's distance and DFITS combine these into one overall measure. COOKD can be viewed as the distance between the coefficients calculated with and without the i-th observation. COOKD for the i-th observation is:

    (1/p)*( hi/(1-hi) )*(standardized residual)**2

This is algebraically equivalent to:

    ( (ei)**2/(p*MSE) )*( hi/(1-hi)**2 )

This can be compared with an F-distribution on p and n-p degrees of freedom to determine whether or not COOKD is large.

DFITS for the i-th observation is

    SQRT( hi/(1-hi) )*(Studentized residual)

Values greater than 2*SQRT(p/n) in absolute value deserve special attention.
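As a check on these definitions, the sketch below computes the diagnostics in Python for the EXAM data used in the regression example. Several choices here are assumptions made for illustration: the function name diagnostics, the one-predictor leverage formula hi = 1/n + (xi - xbar)**2/Sxx (a special case of the hat-matrix diagonal), and the use of the standard identity t = r*SQRT( (n-p-1)/(n-p-r**2) ) to obtain the Studentized deleted residual from the standardized residual r instead of refitting without each observation.

```python
# Regression diagnostics for EXAM 2 regressed on EXAM 1 (data from the handout).
exam1 = [89, 56, 78, 88, 94, 87, 96, 72]   # predictor
exam2 = [83, 77, 91, 87, 99, 80, 85, 75]   # response


def diagnostics(x, y):
    n, p = len(x), 2                        # p = 2: constant and slope
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in resid) / (n - p)
    # Leverage: hat-matrix diagonal, written out for a single predictor.
    hi = [1 / n + (xi - x_bar) ** 2 / sxx for xi in x]
    # Standardized residual: ei / SQRT(MSE * (1 - hi)).
    std_res = [e / (mse * (1 - h)) ** 0.5 for e, h in zip(resid, hi)]
    # Studentized deleted residual via the algebraic identity (no refitting).
    t_res = [r * ((n - p - 1) / (n - p - r ** 2)) ** 0.5 for r in std_res]
    cookd = [(1 / p) * (h / (1 - h)) * r ** 2 for h, r in zip(hi, std_res)]
    dfits = [(h / (1 - h)) ** 0.5 * t for h, t in zip(hi, t_res)]
    # Rows (numbered from 1) whose leverage exceeds 2p/n.
    flagged = [row for row, h in enumerate(hi, start=1) if h > 2 * p / n]
    return {"resid": resid, "mse": mse, "hi": hi, "std_res": std_res,
            "t_res": t_res, "cookd": cookd, "dfits": dfits, "flagged": flagged}


d = diagnostics(exam1, exam2)
for row in range(len(exam1)):
    print(row + 1, round(d["hi"][row], 4), round(d["std_res"][row], 5),
          round(d["cookd"][row], 4), round(d["dfits"][row], 4))
print("leverage above 2p/n in rows:", d["flagged"])
```

The standardized residuals can be compared with column C3 in the regression example, and the leverages sum to p, as the diagonal of a projection matrix should.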
6. PREDICTION (REGRESS PREDICT)
    PREDICT for E,...,E

This subcommand computes estimates of fitted Y's for given values of the predictor variables. It prints out a table that contains the fitted Y values, the standard deviation of the fitted Y values, a 95% confidence interval, and a 95% prediction interval. "E" may be specified as a constant such as 68 or K3, or it may be a column containing a list of predictor values. You must specify one value for each predictor in the regression equation. Up to 10 PREDICT subcommands can be used with one REGRESS command.

Example:

    MTB > REGRESS C10 ON 4 C1-C4;
    SUBC> PREDICT 61 195 36 78.

The prediction interval output of PREDICT assumes a weight of 1. An adjustment must be made if the WEIGHTS subcommand is used with values other than 1.
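To show how the four quantities in the PREDICT table relate to one another, here is a Python sketch for the one-predictor EXAM regression (unweighted, so every weight is 1). The predictor value 80 is a hypothetical input chosen for illustration, the function name predict is an assumption, and the t quantile 2.447 (97.5th percentile with 6 degrees of freedom) is taken from a standard t table rather than computed.

```python
# Fit, stdev of fit, 95% C.I. and 95% P.I. for one new predictor value.
exam1 = [89, 56, 78, 88, 94, 87, 96, 72]   # predictor
exam2 = [83, 77, 91, 87, 99, 80, 85, 75]   # response
T_975_6DF = 2.447                          # table value, n - p = 8 - 2 = 6 df


def predict(x, y, x0, t_crit):
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    mse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y)) / (n - 2)
    fit = intercept + slope * x0
    # Stdev of the fitted value: SQRT(MSE * leverage of the new point).
    sd_fit = (mse * (1 / n + (x0 - x_bar) ** 2 / sxx)) ** 0.5
    # A new observation adds its own error variance (MSE) to that of the fit.
    sd_pred = (mse + sd_fit ** 2) ** 0.5
    ci = (fit - t_crit * sd_fit, fit + t_crit * sd_fit)
    pi = (fit - t_crit * sd_pred, fit + t_crit * sd_pred)
    return fit, sd_fit, ci, pi


fit, sd_fit, ci, pi = predict(exam1, exam2, 80, T_975_6DF)   # 80 is hypothetical
print(f"Fit {fit:.2f}  Stdev.Fit {sd_fit:.2f}")
print(f"95% C.I. ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% P.I. ({pi[0]:.2f}, {pi[1]:.2f})")
```

The prediction interval is always wider than the confidence interval because it covers a single new observation rather than the mean response.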
7. PLOTS (TSPLOT, PLOT)
    TSPLOT [period = K] time series data in C

    Subcommands: INCREMENT START ORIGIN TSTART

TSPLOT (the TS is for time series) plots the column of data (vertical axis) vs the integers 1, 2, 3, ... (horizontal axis). Often the data are a series of observations made at equally spaced intervals of time, e.g., monthly sales figures, or yearly growth increments.

The period K may be any integer from 1 to 36. If you specify a period, the first observation is plotted with 1, the second with 2, and so on until the end of the first period. Then the next observation is plotted with 1, then 2, etc. If the period is longer than 9, then 0 is used for 10, A for 11,..., Z for 36.

A count of the number of missing observations (if any) is printed below the horizontal axis. If the time series is too wide to fit across the screen, the plot is printed in several pieces. Unlike other plotting commands, the width of TSPLOT is controlled by the command OW, not WIDTH. Height is controlled by the HEIGHT command.

PLOT plots scatter diagrams.

Examples:

    MTB > plot c2 c1

    [Character scatter plot: EXAM 2 on the vertical axis (ticks at 80.0, 88.0, 96.0) against EXAM 1 on the horizontal axis (ticks from 56.0 to 96.0); each of the 8 observations is shown as *.]

    MTB > tsplot c2

    [Character time series plot: EXAM 2 on the vertical axis (ticks at 75.0, 82.5, 90.0, 97.5) against observation number on the horizontal axis (ticks from 0 to 8); each point is labelled with its observation number, 1 to 8.]
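The labelling rule that TSPLOT follows when a period is given can be written out as a small function. This is a Python sketch of the rule as described above, not Minitab code, and the name tsplot_symbol is an assumption made for illustration.

```python
# Plotting symbol for a given observation when TSPLOT is used with period K.
def tsplot_symbol(observation, period):
    """Return the symbol for the observation-th value (counting from 1)."""
    if not 1 <= period <= 36:
        raise ValueError("period must be an integer from 1 to 36")
    position = (observation - 1) % period + 1   # position within the period
    if position <= 9:
        return str(position)                    # 1..9 plotted as digits
    if position == 10:
        return "0"                              # 10 plotted as 0
    return chr(ord("A") + position - 11)        # 11..36 plotted as A..Z


# Monthly data (period 12): two years of plotting symbols.
print("".join(tsplot_symbol(i, 12) for i in range(1, 25)))
```

With monthly data and a period of 12, the same symbol marks the same month in every year, which makes seasonal patterns easier to spot.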
8. STORE COMMANDS (STORE)
The commands STORE and EXECUTE provide both a simple macro (or stored command file) capability and a simple looping capability. The commands ECHO and NOECHO control printed output. A special feature, called the CK capability, makes macros and loops very flexible.

    STORE [in 'FILENAME'] the following Minitab commands
      (type in Minitab commands here)
    END

    EXECUTE commands [in 'FILENAME'] [K times]

EXECUTE executes the commands stored in the specified file, or in a file named MINITAB.MTB if no file is named.

Minitab command files (macros) may be created from within Minitab using the STORE command, or with an editor on your computer. If you create a macro with an editor, be sure you save it in standard (ASCII) format, and name it with the three-letter file name extension MTB. (Naming conventions may vary with different computers.) If you create a macro from within Minitab, the file will automatically be given the extension MTB unless you specify otherwise.

The CK Capability

The integer part of a column number may be replaced by a stored constant. Here is a simple example:

    LET K1 = 5
    PRINT C1-CK1

Since K1 = 5, this PRINTS C1 through C5. MK also works; for example, MK1 would be equivalent to matrix M5.

Currently there is one restriction with STORE. If you use READ, SET, or INSERT in a STORE file and type the data following the command, then you must not use END at the end of the data. (If you do, Minitab will take this END of data as the end of the stored instructions.)
9. SENDING OUTPUT TO A FILE OR TO A PRINTER.
9.1. SENDING OUTPUT TO A FILE.

The command:

    MTB > outfile 'a:mywork'

will send a copy of the session that follows (everything that you type and the results) to a file called "mywork.lis" in drive A (you must have a diskette in drive A). Minitab appends the extension ".lis" to the file. If you want your data sent to the file you must PRINT it on the screen first.

9.2. SENDING OUTPUT TO THE PRINTER.

When you type:

    MTB > paper

all the commands and output on your screen are also sent to the printer until you type NOPAPER or end your Minitab session.

Note. The following commands are NOT in the Student version:

    ALPHA        CPLOT          GHISTOGRAM   INVERT      RLINE
    ANCOVA       CTABLE         GLPLOT       JOURNAL     ROOTOGRAM
    ANOVA        DEFINE         GMPLOT       LC          RSMOOTH
    BATCH        DIAGONAL       GOPTIONS     LVALUES     TRANSPOSE
    BREG         DISCRIMINANT   GPLOT        MPOLISH     TSHARE
    CENTER       EIGEN          GRID         MTSPLOT     UC
    CONTOUR      GBOXPLOT       GTPLOT       NOJOURNAL
    COVARIANCE   GDEFINE        INDICATOR    PCA
