SYSTAT for the Macintosh: Crunch Numbers Without a Cray

© 1993 Kathleen G. Charters

Washington Apple Pi Journal, Vol. 15, no. 12, December 1993, pp. 28-32.

SYSTAT for Macintosh is a comprehensive, graphics-oriented statistics package. SYSTAT applications include (but are not limited to) biomedical, environmental, and social science research. This program requires Macintosh System 6.0.2 or higher, 2 MB of RAM, and a hard disk drive. It comes with two program disks (either the 68020+/coprocessor version or the 68000 non-coprocessor version); a data disk (with sample data, help files, and import drivers); four manuals (Getting Started, Data, Statistics, and Graphics); and a Quick Reference Card (overview of menus and windows).

SYSTAT stats menu.

Though SYSTAT is one of the most powerful tools for analysis and visualization of data ever created, it is surprisingly easy to use. You may work from menus using icons, write commands in the Command window, or mix and match. The icons are readily understood, but if you should need to use the reference manuals (several pounds of them), you will find clearly written explanations, complete with examples and references. The clarity and power are not limited to the interface, either. SYSTAT allows you to use QuickTime to create animated visualizations of the graphs you generate (each graph becomes a frame of the movie), a huge advance from the static character-based plots of the recent past.

SYSTAT is written by statisticians, and their professionalism is evident in the effort that has gone into this product. For example, in controversial areas, the authors’ approach is presented along with alternative schools of thought, and complete bibliographic references are provided.

SYSTAT graphing menu.

There is an excellent support system for SYSTAT users. SYSTAT, Inc. and Apple co-sponsor data analysis seminars. If you purchase SYSTAT when you register for a course, the registration fee is reduced. SYSTAT, Inc. produces SYSNET, a quarterly SYSTAT Network newsletter, and provides technical support by telephone (Monday through Friday, 9 a.m. to 6 p.m. Central Time) or through the SYSTAT Electronic Bulletin Board Service (available 24 hours a day, seven days a week). Several companion and supplemental products allow you to customize data analysis to meet your particular needs, no matter how exotic.

Getting Started

To get started I used the aptly named SYSTAT: Getting Started, Version 5.2 Edition manual. This gives a quick overview of the menus, windows and manuals, then provides tutorials for a test drive of the way the program works.

Error identification in data. Case 154 is a duplication of case 153. The four windows are: Data Editor worksheet, Command, View, and Analysis (the foremost, active screen).

I suspect this is more of a learning experience than the authors intended, as discrepancies occur between what the manual states will happen and what actually occurs on the screen. Instead of “Opening” a file as described in the manual, you are given the option of either “Edit” or “Use.” The command selected makes a big difference in which features are available. For example, scatterplot brushing tools are missing when the “Use” command opens the file, but these tools appear when the “Edit” command opens the file. Using the tutorial to work with a SYSTAT data file does encourage experimentation. For example, importing a file can be done, just not the way the manual describes. Eventually I figured out how to produce the desired results, though not always in the manner suggested by the tutorial.

Data

Data is entered into the Data Editor Worksheet, which has the appearance of a spreadsheet but is not one. It does not store formulas in individual cells (it will store the results of mathematical transformations), nor does it insert rows and columns (it will add new columns and rows after the last entry). Data is kept in files, not memory, so capacity is limited only by the availability of disk space. The worksheet format is easy to use, review, and edit. There are five ways to create a SYSTAT file, including typing the data in, transferring via the clipboard, or importing data (as ASCII text files or from Excel, map [cartographic], and DOS SYSTAT files). The DATA Command Procedure provides even more flexibility in creating files.

Once the SYSTAT file exists, it is easy to transform the data. The dialog box shows existing variables and math functions, so the options for transforming variables or deriving new variables are readily apparent. This information can also be used to recode data by specifying conditional (IF … THEN) transformations. A complete programming language, SYSTAT BASIC, is available should you desire to program even more complex transformations. The results of transformations are immediately seen on the Data Editor Worksheet, providing visual confirmation. Data may be sorted, ranked, or standardized from the Data menu, or the user may opt to write a DATA program for variations not available with the Data Editor.

Temporary subgroups may be created. Four grouping variables are provided in DATA, and others may be created using SYSTAT BASIC. Every statistical module is capable of creating temporary subgroups. Files may be rearranged or combined. This includes dropping or extracting variables, deleting cases (by creating a new file), or even transposing a file (numeric data only, maximum of 99 cases). Files may be merged horizontally or appended vertically.

Exporting data is done by using the clipboard, by saving a file as comma-delimited or tab-delimited text, or by using the PUT command to select subsets of cases. Data may be output to the screen or the printer, and the user has complete control over what subset of the data (e.g. variable names) is selected and how the data appears (e.g. the number of variables per line). In short, SYSTAT provides extensive data management capabilities. Data entry, manipulation, and output are point-and-click easy, augmented by sophisticated flexibility to create advanced applications.
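
For readers who have not met these operations before, the sketch below shows what conditional (IF … THEN) recoding, ranking, and standardizing do to a single variable. It is written in Python purely as an illustration and is not SYSTAT's command language or SYSTAT BASIC. The function names, the age cutoffs, and the sample values are all hypothetical.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert values to z-scores using the sample standard deviation."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def rank(values):
    """Rank values from 1 (smallest); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def recode_age(age):
    """Conditional recode into a hypothetical age-group code (1, 2, or 3)."""
    if age < 30:
        return 1
    if age < 50:
        return 2
    return 3


# Hypothetical sample data, not taken from the SYSTAT data disk.
ages = [25, 41, 41, 63]
age_group = [recode_age(a) for a in ages]
age_rank = rank(ages)
age_z = standardize(ages)
```

Each result is a new derived variable with one value per case, which is the same idea as the new column that appears on the worksheet after a transformation.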

Statistics

The documentation in SYSTAT: Statistics, Version 5.2 Edition is exceptionally well done. The authors take great care to cover not only the methodology of running an analysis, but also the appropriateness of the test. In the introduction the authors explain that statistics “formally summarize our observations of the world. As we all know, summaries can mislead or elucidate.” The focus of the text is to inform the user “how to use numbers to elucidate rather than to mislead.” This approach yields a wealth of information in readily understandable language supported by numerous examples.

Descriptive and inferential statistics are covered. A variety of cluster analysis methods, measures of correlation and similarity, principal components or factor analysis, and nonmetric multidimensional scaling using different algorithms are provided. The multivariate general linear hypothesis can estimate and test any univariate or multivariate general linear model, and covers methods of regression, analysis of variance (ANOVA), and multivariate models. Nonlinear modeling and nonparametric statistics are also supported. A wide variety of time series models, including smoothing, autoregressive integrated moving average (ARIMA), seasonal decomposition and adjustment, exponential smoothing, and Fourier analysis, can be used for modeling and forecasting. The t-tests include independent, paired, matched pairs, and one-sample tests. Frequency tables provide useful summaries. Significance tests, measures of association for two-way tables, or log-linear models may be applied to the tables. (See Fig. 1 for the Stats menu icons.)
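
To make the t-test varieties mentioned above concrete, here is a minimal sketch of the textbook formulas for the one-sample, paired, and independent (pooled-variance) t statistics. It is written in Python for illustration only and says nothing about how SYSTAT computes them internally. The function names and sample numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev, variance


def one_sample_t(sample, mu0):
    """t statistic for testing whether the sample mean differs from mu0."""
    n = len(sample)
    return (mean(sample) - mu0) / (stdev(sample) / sqrt(n))


def paired_t(before, after):
    """Paired t statistic: a one-sample test on the case-by-case differences."""
    diffs = [a - b for a, b in zip(after, before)]
    return one_sample_t(diffs, 0.0)


def independent_t(x, y):
    """Two-sample t statistic, assuming equal variances (pooled estimate)."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled * (1 / nx + 1 / ny))


# Hypothetical measurements used only to exercise the functions.
t_one = one_sample_t([1, 2, 3, 4, 5], 3)
t_paired = paired_t([1, 2, 3], [2, 4, 3])
t_indep = independent_t([1, 2, 3], [2, 3, 4])
```

The sketch stops at the test statistic. Turning it into a p-value requires the t distribution with the appropriate degrees of freedom, which a statistics package supplies for you.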

The SYSTAT: Statistics, Version 5.2 Edition manual covers each category in detail. The chapters provide an introduction, usage, computation, and examples. These chapters are more intelligible than most statistics textbooks. SYSTAT's point-and-click menu interface can be used in conjunction with the Command window, revealing command line equivalents to pointing and clicking. Appendix I: Command Reference explains command syntax and defines all the options and arguments, giving reasons for using the command interface and instructions for editing and submitting batch files. Examples in the manual use mouse instructions. Equivalent keyboard commands for those examples are provided in Appendix II. A wonderful 15-page section contains a comprehensive bibliography of classic and contemporary statistics works. The extensive index is carefully cross-referenced so even a novice can find what they are looking for.

Graphics

The most notable advantage of personal computer statistical computation over mainframe statistical computation lies in the ability to easily produce quantitative graphics. SYSTAT contains SYGRAPH, an extensive graphics program (which can be purchased separately). SYGRAPH was designed by an expert in graphics perception, so features shown in published re-

Density graph and cluster analysis of age.