SSS 2.0


Downloads, Installation and Running SSS: Serial version


The program can be run directly (on Windows, Mac, or Unix) or from within Matlab. It takes text files as input and produces summary text files as output. The serial code is self-contained and easy to install; it requires no libraries and has been optimized for speed. A script file interface lets SSS be run from a command window as well as from within Matlab. A Java graphical user interface (GUI) allows SSS to be run as a standalone program or, optionally, from within Matlab (version 7 and higher).


Download the SSS zip archive
This includes all the files, as follows:

-----------------
Executables:
  modelsearch.exe     - main SSS program
  modelsummary.exe    - helper application
  killprocess.exe     - helper used with Matlab (Windows only)
  modelsearch         - Mac OS X 10.5.2 (Intel) executable version of modelsearch
  modelsearch32       - 32-bit Linux executable version of modelsearch
  modelsummary32      - 32-bit Linux executable version of modelsummary
  modelsummary        - Mac OS X 10.5.2 (Intel) executable version of modelsummary
  modelsearch64       - 64-bit Linux executable version of modelsearch
  modelsummary64      - 64-bit Linux executable version of modelsummary
-----------------
Java files:
  sss.jar             - Java GUI program
  AbsoluteLayout.jar  - helper Java file
  dataframe.jar       - helper Java file (non-essential)
-----------------
Inputs:
  xdata.txt           - predictor data for examples
  ybinarydata.txt     - response data for binary regression example
  ylineardata.txt     - response data for linear regression example
  ysurvtimedata.txt   - response data for survival regression example
  relapsedata.txt     - observed (1) versus censored (0) data for survival example
  wdata.txt           - indicator data for observations to be used in analysis
  binary.setup.txt    - setup/input file for binary regression example
  linear.setup.txt    - setup/input file for linear regression example
  survival.setup.txt  - setup/input file for survival regression example
-----------------
Matlab:
  examples.m            - commands to load and run the three examples
  binarysummary.m       - commands to summarise, plot aspects of binary example
  linearsummary.m       - commands to summarise, plot aspects of linear example
  survivalsummary.m     - commands to summarise, plot aspects of survival example
  show.m, showtv.m      - Matlab utilities for graphs
  scattertv.m, km.m     - Matlab utilities for graphs
  std_rows.m, ranktrf.m - Matlab utilities
-----------------
R:
  examples.r          - commands to load and run the three examples
  binarysummary.r     - commands to summarise, plot aspects of binary example
  linearsummary.r     - commands to summarise, plot aspects of linear example
  show.r, showtv.r    - utilities for graphs
  scattertv.r         - utilities for graphs
-----------------



Installation and Running SSS

Installation

  1. Create a folder/subdirectory for SSS, such as c:\SSS on Windows or ~/SSS on Linux/Mac.
  2. Copy all the files listed above to that directory and, if necessary, change the file permissions so that the executables can be run.
  3. Add the full path of the directory to the system path.

Running

SSS can be run from the command line or in the GUI window. From the command line, the call is simply:

  • on Windows, modelsearch.exe setup.txt
  • on Unix, ./modelsearch64 setup.txt followed by ./modelsummary64 setup.txt (or the 32-bit versions)
where setup.txt is the required setup file for the particular analysis (described below).
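
For users who prefer to script these calls, the following is a minimal Python sketch of the same command lines. It is not part of the SSS distribution: the function names sss_commands and run_sss are hypothetical, and the Mac branch assumes that the unsuffixed modelsearch and modelsummary executables are called in the same order as the Linux ones, which the text above does not state explicitly.

```python
import platform
import subprocess
from typing import List, Optional


def sss_commands(setup_file: str, system: Optional[str] = None,
                 bits: int = 64) -> List[List[str]]:
    """Build the command lines quoted above for one platform.

    `system` follows platform.system(): "Windows", "Linux" or "Darwin".
    """
    system = system or platform.system()
    if system == "Windows":
        # The text gives a single call on Windows.
        return [["modelsearch.exe", setup_file]]
    if system == "Linux":
        # Search first, then summarise, using the 32- or 64-bit builds.
        return [[f"./modelsearch{bits}", setup_file],
                [f"./modelsummary{bits}", setup_file]]
    # Assumption: on Mac OS X the unsuffixed executables from the archive
    # are called in the same search-then-summarise order.
    return [["./modelsearch", setup_file], ["./modelsummary", setup_file]]


def run_sss(setup_file: str, system: Optional[str] = None, bits: int = 64) -> None:
    """Run each command in turn from the current directory; stop on the first failure."""
    for command in sss_commands(setup_file, system, bits):
        subprocess.run(command, check=True)
```

Calling run_sss("linear.setup.txt") from the SSS directory would then launch the search for the linear example. Keeping the command construction separate from the execution makes it easy to inspect what will be run before running it.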

The GUI is an easy method of setting up/changing the parameter input file and then running SSS. Here we describe running from within Matlab, and this is illustrated in the Matlab examples file examples.m -- we recommend that users start with those examples.

Startup and running from Matlab

  • start Matlab
  • load the Java GUI library and helpers. If the jar files are in the current directory, just type the commands
    javaaddpath 'sss.jar'
    javaaddpath 'AbsoluteLayout.jar'
    javaaddpath 'dataframe.jar'
    import dataframe.*
    If the jar files are in a different folder/directory, use the full path, e.g.
    javaaddpath 'c:\SSS\sss.jar'
    and so forth.

    Running SSS from command line in Matlab

    Many users will set up an input text file -- say, setup.txt -- either directly or with the GUI, and then run the SSS model search from within Matlab via

    !modelsearch.exe setup.txt

    Using the GUI

    1. start the GUI with commands
      mySSS=sss.Model; mySSS.start
    2. use the GUI to customise input/output parameter values in the setup file.
    3. use Save and Load to create or read setup text files.
    4. hit the Run button in the GUI to start the SSS model search.
    5. hit the Stop button in the GUI if you want to abandon a running SSS model search.



    Setup file of input parameters

    The parameter setup/input file (named setup.txt above) is a flat text file in which each line is a (name, value) pair for one of a predefined set of parameters. The order of parameters in the file is not important. A line can be commented out by adding # at the beginning. Each line is in the format

    ParameterName = Value

    where ParameterName is one of the names described below and Value is a string or a number, depending on the parameter. When a path is used as a parameter value, any spaces in the path are ignored, so SSS will NOT work with a path that has spaces in it.
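
As an illustration of this format, here is a minimal Python sketch of a reader for setup files. It is not the parser used by SSS itself: the function name parse_setup is hypothetical, and values are deliberately kept as strings because their type depends on the parameter.

```python
from typing import Dict


def parse_setup(text: str) -> Dict[str, str]:
    """Read 'ParameterName = Value' lines, skipping blanks and # comments."""
    params: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank line or commented-out parameter
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"expected 'ParameterName = Value', got: {raw!r}")
        params[name.strip()] = value.strip()
    return params


# Illustrative input only; the values are placeholders.
example = """\
# PMAX = 5
NOBSERVATIONS = 100
DATAFILE = xdata.txt
"""
print(parse_setup(example))
```

Because the order of parameters is not important, a dictionary keyed by parameter name is a natural fit.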

    -----------------
    Inputs:
      NOBSERVATIONS   n = total sample size
      NVARIABLES      p = total number of predictor variables
      DATAFILE        tab delimited predictor data (n rows, p columns)
      RESPONSEFILE    response variable (n values, tab delimited row or single column)
      WEIGHTSFILE     weight vector (n values of 0/1; 1 indicates samples to be used in model fit)
      CENSORFILE      0/1 indicators of right-censoring (0) versus observed (1) in case of survival data
    -----------------
    Output:
      OUTFILE         list of selected best models
      ITEROUT         details of models visited at each SSS iteration
      SUMMARYFILE     summary of models and parameter estimation
      DEBUGOUT        output debug information to standard output, or not
    -----------------
    Model/Search:
      MODTYPE         Model Type: 1=linear, 2=binary/logistic, 3=Weibull survival
      PSTART          Initial Model Size: number of predictors for model to start SSS
      PMAX            Maximum Model Size: maximum number of predictors in any model
      PRIORMEANP      Prior mean of number of included variables: this is the key sparsity control parameter (PRIORMEANP=v means that each variable is "in the model" with prior probability v/p)
      NBEST           Number of Best Models to be saved/recorded
      ITERS           Total Number of SSS iterations
      ONEVAR          indicator to include all 1-variable models in search
    -----------------
    Annealing*:
      (replace, innerAnneal1)   annealing parameter for variable replacement
      (delete, innerAnneal2)    annealing parameter for variable deletion
      (add, innerAnneal3)       annealing parameter for variable addition
      (outer, outerAnneal)      annealing parameter for second level model selection
    -----------------
    *Annealing parameters can generally be left at the suggested defaults in the example files here. See the paper for additional discussion.
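
The following Python sketch shows one way a setup file could be generated from the parameter names listed above, and how PRIORMEANP translates into a prior inclusion probability. It is an illustration under stated assumptions: the helper names are hypothetical, every numeric value and output file name in the example is a placeholder rather than a recommended setting, and the check for spaces treats any parameter whose name ends in FILE as a path.

```python
from typing import Dict, Union


def format_setup(params: Dict[str, Union[str, int, float]]) -> str:
    """Render parameters as 'ParameterName = Value' lines."""
    lines = []
    for name, value in params.items():
        # Assumption: parameters named ...FILE hold paths, which must not contain spaces.
        if name.endswith("FILE") and " " in str(value):
            raise ValueError(f"{name}: SSS does not work with paths containing spaces")
        lines.append(f"{name} = {value}")
    return "\n".join(lines) + "\n"


def prior_inclusion_probability(prior_mean_p: float, n_variables: int) -> float:
    """PRIORMEANP = v gives each of the p variables prior probability v/p."""
    return prior_mean_p / n_variables


# Placeholder values for a linear model; adjust to the data at hand.
example_params = {
    "NOBSERVATIONS": 100,
    "NVARIABLES": 50,
    "DATAFILE": "xdata.txt",
    "RESPONSEFILE": "ylineardata.txt",
    "MODTYPE": 1,       # 1 = linear
    "PSTART": 2,
    "PMAX": 10,
    "PRIORMEANP": 5,
    "NBEST": 10,
    "ITERS": 100,
    "SUMMARYFILE": "linear_summary.txt",  # hypothetical output name
}
print(format_setup(example_params))
print(prior_inclusion_probability(example_params["PRIORMEANP"],
                                  example_params["NVARIABLES"]))
```

The annealing parameters and DEBUGOUT are omitted here because their exact spelling and value format in the file are best copied from the example setup files in the archive.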



    Output files

    The key output file is the model summary file -- a flat text file summarising the posterior distributions within and across the top models. See the examples in the Matlab examples file examples.m and the three summary support .m script files for these examples. The SSS search explores linear models with standardised y and x data, so the output summaries for linear models relate to the standardised models with no intercept. In contrast, the binary and survival models include intercepts.

    The model summary text file output has the following information. Each row is one of the "top models", and rows are ordered by decreasing posterior probability. The number of columns is defined by the largest model, and entries are NA/NaN for models smaller than the largest. In each model/row the entries are as follows:

    Linear regression models:

    • element 1 - dimension of the model = number of predictors p for this model
    • element 2 - log posterior probability of this model (the "score")
    • elements 3:(2+p) - the indices of the p variables in this model
    • elements (3+p):(2+2p) - posterior mean of the regression parameter vector beta (no intercept)
    • elements (3+2p):(2+2p+p*p) - posterior variance matrix of beta in vectorised form (no intercept)
    • final two elements - (s,d), the residual SD estimate and the posterior degrees of freedom
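
The layout above can be unpacked as in the following Python sketch. It is an illustration, not code shipped with SSS: the function name parse_linear_row is hypothetical, fields are assumed to be separated by whitespace, and NA/NaN padding is dropped before the elements are located so that (s,d) are simply the last two remaining values. Note that the element numbers in the text are 1-based while Python slices are 0-based.

```python
from typing import Dict, List


def parse_linear_row(fields: List[str]) -> Dict[str, object]:
    """Unpack one summary row of a linear model into named pieces."""
    # Drop the NA/NaN padding used for models smaller than the largest.
    values = [float(f) for f in fields if f.strip().lower() not in ("na", "nan")]
    p = int(values[0])
    if len(values) != 4 + 2 * p + p * p:
        raise ValueError(f"row with p={p} has {len(values)} entries")
    flat = values[2 + 2 * p:2 + 2 * p + p * p]
    return {
        "p": p,
        "score": values[1],                                # log posterior probability
        "variables": [int(v) for v in values[2:2 + p]],    # predictor indices
        "beta": values[2 + p:2 + 2 * p],                   # posterior mean, no intercept
        "cov": [flat[i * p:(i + 1) * p] for i in range(p)],  # p x p posterior variance
        "s": values[-2],                                   # residual SD estimate
        "d": values[-1],                                   # posterior degrees of freedom
    }


# Made-up row for a 2-variable model, padded as in a file with larger models.
row = "2 -10.5 3 7 0.5 -0.2 0.04 0.01 0.01 0.09 1.2 48 NA NA".split()
model = parse_linear_row(row)
print(model["variables"], model["beta"], model["cov"])
```

Since a variance matrix is symmetric, reshaping the vectorised form by rows or by columns gives the same matrix.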

    Binary logistic regression models:

    • element 1 - dimension of the model = number of predictors p for this model
    • element 2 - log posterior probability of this model (the "score")
    • elements 3:(2+p) - the indices of the p variables in this model
    • elements (3+p):(3+2p) - posterior mode of the regression parameter vector beta (includes intercept)
    • elements (4+2p):(4+4p+p*p) - estimated posterior variance matrix of beta in vectorised form (includes intercept)
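
A corresponding Python sketch for binary rows is shown below; it also derives approximate posterior standard deviations from the diagonal of the variance matrix. The function name parse_binary_row is hypothetical, fields are assumed to be whitespace-separated with NA/NaN padding, and because the text does not say where the intercept sits within beta, the sketch leaves the coefficients unlabelled.

```python
import math
from typing import Dict, List


def parse_binary_row(fields: List[str]) -> Dict[str, object]:
    """Unpack one summary row of a binary logistic model."""
    values = [float(f) for f in fields if f.strip().lower() not in ("na", "nan")]
    p = int(values[0])
    k = p + 1  # beta includes an intercept
    if len(values) != 3 + 2 * p + k * k:  # equals 4 + 4p + p*p
        raise ValueError(f"row with p={p} has {len(values)} entries")
    flat = values[3 + 2 * p:]
    cov = [flat[i * k:(i + 1) * k] for i in range(k)]
    return {
        "p": p,
        "score": values[1],
        "variables": [int(v) for v in values[2:2 + p]],
        "beta": values[2 + p:3 + 2 * p],                 # posterior mode, k entries
        "cov": cov,                                      # k x k estimated variance
        "sd": [math.sqrt(cov[i][i]) for i in range(k)],  # from the diagonal
    }


# Made-up row for a 1-variable model.
binary_model = parse_binary_row("1 -20.0 5 -0.3 1.1 0.25 0.02 0.02 0.16".split())
print(binary_model["beta"], binary_model["sd"])
```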

    Survival (Weibull) regression models:

    • element 1 - dimension of the model = number of predictors p for this model
    • element 2 - log posterior probability of this model (the "score")
    • elements 3:(2+p) - the indices of the p variables in this model
    • element 3+p - the posterior mean of the Weibull index parameter alpha in this model
    • elements (4+p):(4+2p) - posterior mode of the regression parameter vector beta (includes intercept)
    • elements (5+2p):(8+6p+p*p) - estimated posterior variance matrix of (alpha,beta) (beta includes intercept) in vectorised form
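
For survival rows, the same approach applies with one extra scalar (alpha) and a larger variance matrix. The Python sketch below is again only an illustration: parse_survival_row is a hypothetical name, whitespace-separated fields with NA/NaN padding are assumed, and the variance matrix is taken to be ordered as (alpha, beta), following the wording above.

```python
from typing import Dict, List


def parse_survival_row(fields: List[str]) -> Dict[str, object]:
    """Unpack one summary row of a Weibull survival model."""
    values = [float(f) for f in fields if f.strip().lower() not in ("na", "nan")]
    p = int(values[0])
    k = p + 2  # alpha plus beta with intercept
    if len(values) != 4 + 2 * p + k * k:  # equals 8 + 6p + p*p
        raise ValueError(f"row with p={p} has {len(values)} entries")
    flat = values[4 + 2 * p:]
    return {
        "p": p,
        "score": values[1],
        "variables": [int(v) for v in values[2:2 + p]],
        "alpha": values[2 + p],                               # Weibull index parameter
        "beta": values[3 + p:4 + 2 * p],                      # posterior mode, p+1 entries
        "cov": [flat[i * k:(i + 1) * k] for i in range(k)],   # k x k for (alpha, beta)
    }


# Made-up row for a 1-variable model: 6 leading values plus a 3 x 3 matrix.
surv_row = "1 -35.2 4 1.3 0.2 -0.7 0.09 0.0 0.0 0.0 0.04 0.01 0.0 0.01 0.16".split()
surv_model = parse_survival_row(surv_row)
print(surv_model["alpha"], surv_model["beta"], surv_model["cov"])
```

The length check doubles as a quick test that a row really comes from a survival analysis, since the three model types have different row lengths for the same p.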