Loglet Lab for Windows (version 1.1) Tutorial


Jason Yung, Perrin S. Meyer, and Jesse H. Ausubel
Program for the Human Environment
The Rockefeller University
New York, NY
July 1998


Trends commonly accelerate, reach a maximum speed, and then slow as they approach some limit. After a sunflower seed germinates, the plant grows faster and faster because each cell produces two, then the two produce two and so on. But multiplication does not go on ad infinitum because the seed produces a sunflower, not Jack's beanstalk; a sunflower stops before it reaches 20 feet. The simplest description is the so-called logistic equation of three parameters: the time of germination, the relative rate of growth and the limit. Loglet Lab fits a single logistic equation to a series of observations through time. It also displays the deviations of the observations from the logistic equation, the result in linear form, and the rates of change.

Growth may slow and accelerate again. Imagine a drought or shortage of fertilizer halts the sunflower's growth until rain or fertilizer starts a new wave of growth toward the natural limit of sunflowers. Or imagine nuclear testing multiplies as equipment and war scares increase; then scares recede and testing slows, only to start a wave of testing toward another ceiling. As "Loglet" contracts "logistic" and "wavelet," Loglet Lab fits multiple logistic equations to such waves.

Competition causes a third variation of acceleration, slowing, and reaching a limit as the competing substitute increases its market share, and then declines as a new competitor becomes ascendant. Consider the rise of long playing (LP) records of music, their displacement by cassettes, and then the replacement of cassettes by compact discs (CDs). Loglet Lab computes the market shares of each competitor and fits their rise, leveling and fall with logistic equations.

With this brief introduction you are now ready to enjoy a tutorial that carries you through fitting a single logistic, multiple logistics and the substitution of one competitor by another. You can read about the mathematics in the online Help as well as in the accompanying "Logistics Primer."

We also append a list of known bugs to this tutorial.

INSTALLATION AND OTHER PRELIMINARIES

  1. Run A:\Setup.exe. This will run the InstallShield program.
  2. By default, Setup places files in C:\program files\ru-phe\loglet lab\. In particular, the installation will install the following files which are required to run Loglet Lab:

  3. Setup will also install or update the following files in the Windows system directory if necessary: If older copies of any of the above files already exist on your computer, registration of Vcfi5.ocx and Vcf15.ocx may fail; this is because the computer has to restart before it uses the newer version. If this happens, you can always enter the following commands using Run... in the Windows Start Menu to register the controls manually:

    Regsvr32 "C:\Program Files\RU-PHE\Loglet Lab\Vcf15.ocx"

  4. The installation should also have added Loglet Lab to the Start menu. You can use this to start Loglet Lab.
  5. The first thing that you will see is the splash page, featuring a nifty logo and a copyright notice. After 5 seconds, the splash page should go away, leaving you with a blank document:
  6. As you can see, there are two panes: the Data View, where the numerical data can be entered, viewed, and edited, and the Graph View, where the data are graphically represented. Near the top of the window is a toolbar, with buttons to execute the various commands in Loglet Lab. All of the toolbar buttons have equivalent commands in the menus at the top of the window.

    YOUR FIRST LOGLET, a single logistic equation fitted to three points

  7. Click the mouse in the first (upper left) cell of the Data View grid. This activates the grid; you should see a heavy black border around the first cell, and a scroll bar should appear on the right side of the frame.
  8. The first column holds the time, and the second column holds the corresponding values of y such as sales, height, or publications. The third column holds the mask, which is used to withhold points from the fitting process. Mathematically speaking, this corresponds to the "weight" of the data point in the fitting process; for the most part, we will use all of the points, so the default is 1. Enter the following times and y-values in the first two columns:

    Time    y
    1920    10
    1930    50
    1940    90

  9. Notice that as soon as you enter a new data point, it is plotted in the Graph View. Only rows with both an x- and a y-value will be plotted.
  10. To fit a logistic to these data, click on the Fit Logistic button . (Again, all of these commands are available on the Data menu.) This will bring up the "Logistic Fitting Wizard" dialog box where you will specify the number of logistic curves and the displacement of the logistic curve from zero.

  11. You want to fit a single logistic to these data, so use the default "Number of Logistics" of 1. You also should use the default "Initial Displacement" of 0. An example where it should not be zero is discussed later, in step 54. Click the "Next" button in the Wizard to proceed to the "Specify Logistic Parameters" dialog box.

  12. Unlike linear regression, fitting a nonlinear equation requires initial values for its coefficients. Using an iterative process, the parameters will converge from these values to final values. For the first attempt to fit a curve to these data, Loglet Lab will estimate values for each parameter; for subsequent fits, it will provide the most recent set of parameters. You can proceed with these values, or replace them with your own estimates. We recommend that you try Loglet Lab's estimates first and then use your own estimates based on the results of the first fit. For the three data points which you currently have, Loglet Lab's estimates will suffice.
  13. You can tell Loglet Lab not to change a particular parameter during the process by clicking its corresponding "Hold" box; an "X" will appear in the box. This is useful if external evidence supports a particular limit to the saturation level or growth time; for instance, the growth of a bacteria culture is limited by the size of the petri dish. However, for the purpose of fitting a curve to these three data, you will not need to hold any of the parameters.
  14. Because you are fitting a single logistic to these data, you do not have to enter anything in the other fields. (In fact, they should be deactivated by Loglet Lab.) When you are satisfied with Loglet Lab's estimates or have replaced them with ones you prefer, click OK.
  15. Voila! Loglet Lab graces your screen with an elegant logistic, and places the parameters of the fitted logistic in the upper-left corner. (If you hold any of the parameters constant in the fit, they will be annotated with an "(H)".)
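
If you would like to check the curve by hand, the following minimal Python sketch evaluates a three-parameter logistic. It assumes that "Growth time" means the time to rise from 10% to 90% of saturation (the definition given for dt in the substitution section), which leads to the constant ln(81). The function name and the parameter values are ours: Growth time=20, Saturation=100, and Midpoint=1930 were worked out by hand so that the curve passes through the three points entered above, and are not copied from Loglet Lab's output.

```python
import math

def logistic(t, growth_time, saturation, midpoint):
    """Three-parameter logistic curve.

    growth_time is assumed to be the time needed to rise from
    10% to 90% of saturation, hence the ln(81) constant.
    """
    rate = math.log(81) / growth_time
    return saturation / (1.0 + math.exp(-rate * (t - midpoint)))

# Hand-derived parameters that pass through the three tutorial points
# (an illustration, not Loglet Lab's reported fit).
params = dict(growth_time=20, saturation=100, midpoint=1930)

for year in (1920, 1930, 1940):
    print(year, round(logistic(year, **params), 6))
```

Because three points determine three parameters, the curve can pass through all of them; with real data the fit will leave residuals.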


    YOUR FIRST "REAL" LOGLET

  16. Once the glow of having fit your first logistic has subsided, you will want to fit real data. Select Open from the File menu. We will be delving into the "gallery" of examples that comes with Loglet Lab. Later you can browse the gallery for sets that illustrate the range of logistic processes.
  17. In the Open dialog box, double-click on sunflow.lgt to open the sunflower data set. On some computers, the ".lgt" will not appear.
  18. Because many phenomena rise and level off, it is not obvious that you are looking at the growth of a sunflower. To convey this, add a title and label the axes on the graph. Click on the Edit Labels button to bring up the "Edit Labels" dialog.

  19. In the Title field, enter "Growth of a Sunflower". For the X-axis, enter "time (days)" and for the Y-axis, enter "height (cm)".
  20. Fit a logistic to the growth of a real sunflower as you did for the fictitious three points in the preceding exercise. If all is well, a logistic with Growth time=50, Saturation=261, and Midpoint=34 will appear on your screen.

  21. A linear representation of the logistic and the data can provide a different interpretation of the model. This can be obtained by applying the Fisher-Pry transformation and plotting the transformed curve and data on a semi-logarithmic scale. The transform is y' = F/(1-F) where F is the ratio of y to the saturation level. This turns a logistic curve into a straight line which shows growth as the percentage of the saturation level. To apply the Fisher-Pry transformation, click on the Fisher-Pry button . Note that the y-axis has automatically changed to a logarithmic scale. To toggle back to the graph of absolute y versus absolute time, click on the Fisher-Pry button again.
  22. You will want to examine how fast as well as how far the sunflower has grown. The rate of change of a logistic function is bell-shaped; click the Bell Curves button to see the rise and fall of the rate of change. To toggle back to the graph of absolute y versus absolute time, click on the Bell Curves button again.

  23. You will also want to see how well this single logistic curve fits the data. Clicking on the Hide/Show Residuals button displays the differences between the actual values and those predicted by the logistic equation you have fitted. If the residuals are large or systematically distributed, better fits are likely to be attainable, or perhaps the growth is not logistic. (Section 5 of the Logistics Primer describes residuals in detail.)
  24. The first time you click on the Hide/Show Residuals button, Loglet Lab presents the error as a percentage of the actual value. The graph of percentage errors is scaled to fit the maximum error. A second click shows the absolute errors, the simple difference between actual and predicted values, which like the graph of percentage errors is scaled to fit the maximum error. A third click cycles back to the graph of absolute y versus absolute time.
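
The Fisher-Pry transform and the two kinds of residuals described in this section can be sketched in a few lines of Python. The fitted parameters are the ones quoted in step 20; the observations are made up for illustration and are NOT the contents of sunflow.lgt. The function names are ours, and the ln(81) constant again assumes that growth time is measured from 10% to 90% of saturation.

```python
import math

def logistic(t, growth_time, saturation, midpoint):
    rate = math.log(81) / growth_time
    return saturation / (1.0 + math.exp(-rate * (t - midpoint)))

def fisher_pry(y, saturation):
    """Return F/(1-F) with F = y/saturation; plot it on a log axis."""
    f = y / saturation
    return f / (1.0 - f)

# Fitted sunflower parameters quoted in step 20.
FIT = dict(growth_time=50, saturation=261, midpoint=34)

# Made-up (day, height in cm) observations, for illustration only.
observations = [(14, 36.0), (28, 100.0), (42, 172.0), (56, 230.0), (70, 250.0)]

for day, height in observations:
    predicted = logistic(day, **FIT)
    absolute_error = height - predicted               # second click
    percent_error = 100.0 * absolute_error / height   # first click: % of actual value
    transformed = fisher_pry(height, FIT["saturation"])
    print(day, round(math.log10(transformed), 3),
          round(percent_error, 1), round(absolute_error, 1))
```

On the transformed scale a perfectly logistic series falls on a straight line that crosses 1 (that is, F = 50%) at the midpoint.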

    A BI-LOGISTIC

  25. Now let's try a bi-logistic. Growth may slow and accelerate, as when a drought slows the sunflower's growth until rain feeds a new wave of growth toward the natural limit of sunflowers. Or imagine nuclear testing multiplies as equipment and war scares increase; then scares recede and testing slows, only to start a wave of testing toward another ceiling. To see this, you will now fit two logistic curves to the two waves of nuclear testing. Select Open from the File menu, double-click on Nukes.lgt, and the record of U.S. nuclear tests will appear in the data view.
  26. Click on the Fit Logistic button. This time you will want to fit two logistics, so replace the '1' with a '2'. Because no nuclear tests preceded 1945, leave 0 as the displacement. Click on the Next button to specify the parameters for each logistic wave.
  27. Because you are fitting two logistics for these data, you will have to specify six parameters, three for each logistic. Note that Loglet Lab has activated the fields in the "Logistic #2" box in the "Specify Logistic Parameters" dialog. Moreover, Loglet Lab has adjusted its estimates to accommodate two logistics.
  28. Click OK to accept the parameters.
  29. Boom! A nice bi-logistic has just been fitted to the two waves of nuclear testing.

  30. A multi-logistic can be decomposed into logistic pulses to show the discrete growth periods of each pulse. Moreover, there are several methods for decomposition. To look at the decomposition in absolute numbers, click on the Decompose into Components button . The total saturation is 975 nuclear tests, of which 698 can be attributed to the first wave of growth, and 277 to the second. By decomposing the logistic, we can see each component rise to its respective limit, and the time span of its effects.

  31. Clicking on the Fisher-Pry button displays the two linear transformations of the components.

  32. Clicking on the Bell Curves button displays the rate of change of each component. For your convenience, Loglet Lab uses a different marker to denote the individual components. In this example, the first component is represented by circles, and the second component is represented by diamonds. The y-axis label now reads "change in cumulative number of nuclear tests".
  33. Admittedly, the discrete rate of change tends to look noisy compared to the idealized model. Thus it may make sense to hide the data points and look at just the fitted curve. To do this, click the Hide/Show Data Points button .
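
To see what the decomposition and the bell curves compute, here is a rough Python sketch of a bi-logistic as the sum of two components. The saturations 698 and 277 come from step 30; the growth times and midpoints are hypothetical placeholders, since the fitted values are not listed in this tutorial. The function names are ours.

```python
import math

LN81 = math.log(81)

def logistic(t, growth_time, saturation, midpoint):
    return saturation / (1.0 + math.exp(-LN81 / growth_time * (t - midpoint)))

def logistic_rate(t, growth_time, saturation, midpoint):
    """Derivative of the logistic: a bell curve that peaks at the midpoint."""
    f = logistic(t, growth_time, 1.0, midpoint)
    return saturation * LN81 / growth_time * f * (1.0 - f)

# Saturations are from step 30; growth times and midpoints are
# HYPOTHETICAL placeholders chosen only to make the sketch run.
components = [
    dict(growth_time=10, saturation=698, midpoint=1963),
    dict(growth_time=15, saturation=277, midpoint=1982),
]

def bi_logistic(t):
    """Total curve: the sum of the two component logistics."""
    return sum(logistic(t, **c) for c in components)

for year in (1950, 1963, 1975, 1982, 2000):
    parts = [round(logistic(year, **c), 1) for c in components]
    rates = [round(logistic_rate(year, **c), 1) for c in components]
    print(year, round(bi_logistic(year), 1), parts, rates)
```

Each component's rate of change peaks at its own midpoint, where it equals saturation times ln(81) divided by four times the growth time.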


    PARAMETRIC BOOTSTRAPPING

  34. How precise is this fit? That is, how sensitive is this fit to sampling error? Using a technique called the Bootstrap method, we can compute a confidence interval for each of our parameters. For a detailed description of the bootstrap, consult Section 4.3 of the Primer, which is motivated by An Introduction to the Bootstrap by Bradley Efron and Robert J. Tibshirani.
  35. Put simply, the bootstrap works as follows. First, a curve is fit to the data as per the previous sections. Then a new set of data is synthesized based on the residuals from the initial fit, and a curve is fitted to this set. The process of synthesizing and refitting is repeated 200 times. This gives us 200 values for each parameter, from which the mean and standard deviation can be computed. To demonstrate the bootstrap method, reopen the sunflower data and click on the Bootstrap button.
  36. To start the bootstrap, you have to fit a curve to the actual data. Thus your first steps will be to fit a curve as above. Then Loglet Lab will execute the bootstrap. While the bootstrap is being executed, you will see a progress bar which shows how many iterations have been executed. (Even on our 200MHz Pentium Pro with 64MB RAM, this takes about 5-10 seconds, so be patient.)
  37. When Loglet Lab is finished running the bootstrap, it will print the 90% confidence interval (CI) for each parameter next to its respective value. A gray region will appear on the graph, showing the range of curves with Saturation values within its 90% CI. Clicking on the Bootstrap button will cycle through the different parameters to show how varying a particular parameter affects the confidence region of a fit. (The parameter being varied over its confidence interval is indicated in the lower right corner.)

  38. You can set the number of iterations and the level of confidence by selecting Bootstrap Options from the Data menu. You can also set the seed for the random number generator for comparing with other fits.
  39. Be careful when using the bootstrap method for multi-logistic curves. This is apparent when looking at the confidence region for Saturation, where error increases as time increases, even after the curve has reached 99% saturation. For such analyses, we recommend that you do an initial fit without the bootstrap, and then run the bootstrap with the parameters held for one logistic, i.e., vary the parameters for one logistic at a time.
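
The resample-and-refit loop described in step 35 can be sketched in Python. To keep the sketch short, it swaps Loglet Lab's nonlinear fit for a simpler technique: the saturation is held at 261 (as if its "Hold" box were checked), so the fit becomes a straight line through the Fisher-Pry transform. The observations are made up and are NOT the gallery data; the function names, the seed, and the percentile rule for the interval are our assumptions, not Loglet Lab's internals.

```python
import math
import random

LN81 = math.log(81)
SATURATION = 261  # held fixed, which makes the fit a straight line

def fit_line(ts, zs):
    """Ordinary least squares for z = intercept + slope * t."""
    t_mean = sum(ts) / len(ts)
    z_mean = sum(zs) / len(zs)
    sxx = sum((t - t_mean) ** 2 for t in ts)
    slope = sum((t - t_mean) * (z - z_mean) for t, z in zip(ts, zs)) / sxx
    return z_mean - slope * t_mean, slope

def to_params(intercept, slope):
    """Convert the Fisher-Pry line to (growth time, midpoint)."""
    return LN81 / slope, -intercept / slope

# Made-up (day, height in cm) observations, for illustration only.
days = [7, 14, 21, 28, 35, 42, 49, 56, 63, 70]
heights = [20, 38, 62, 101, 135, 175, 203, 229, 241, 250]

z = [math.log(h / (SATURATION - h)) for h in heights]  # ln(F/(1-F))
intercept, slope = fit_line(days, z)
fitted = [intercept + slope * t for t in days]
residuals = [zi - fi for zi, fi in zip(z, fitted)]

rng = random.Random(1998)  # fixed seed so that runs can be compared
replicates = []
for _ in range(200):
    # Synthesize a data set from the fit plus resampled residuals, then refit.
    synthetic = [fi + rng.choice(residuals) for fi in fitted]
    replicates.append(to_params(*fit_line(days, synthetic)))

def interval(values, level=0.90):
    """Percentile interval taken from the sorted bootstrap values."""
    ordered = sorted(values)
    lo = ordered[int((1 - level) / 2 * len(ordered))]
    hi = ordered[int((1 + level) / 2 * len(ordered)) - 1]
    return lo, hi

growth_time, midpoint = to_params(intercept, slope)
gt_lo, gt_hi = interval([r[0] for r in replicates])
mid_lo, mid_hi = interval([r[1] for r in replicates])
print("Growth time", round(growth_time, 1), (round(gt_lo, 1), round(gt_hi, 1)))
print("Midpoint", round(midpoint, 1), (round(mid_lo, 1), round(mid_hi, 1)))
```

Changing the seed changes the synthetic data sets and therefore the interval slightly, which is why a fixed seed helps when comparing fits.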

    EXCLUDING DATA POINTS

  40. Including the years around 1959, when there was an extraordinary 4-year leap and then a hiatus in testing, may obscure the fit. Often it is helpful to search first for fits using the "quiet" years of the data, when a process appears to be unfolding without much disturbance such as war or depression. Or, you might wish to exclude the years before or after 1971 to examine one of the waves more closely. Loglet Lab lets you exclude data points from the fitting process. First, click in the Data View to make sure it is active. This is signified by the scroll bars and a heavy border surrounding a cell.
  41. Scroll down until you can see 1969 through 1991 in the pane.
  42. Click on the "1971" cell. With the left button held down, drag the mouse down to the "936" cell, selecting the data to be excluded.

  43. Click on the Exclude Data button . (Note that double-clicking on the region has the same effect.) The following will tell you that the points you selected have been excluded:
  44. Try fitting a single logistic to the pre-1971 data by clicking on the Fit Logistic button.
  45. You must set the number of logistics to 1. Leave the displacement at 0 and click on the Next button.
  46. Because we have excluded the last 20 years from the fit, Loglet Lab filters them out and estimates appropriate parameters based on the data from the first 25 years. Click OK to fit the single logistic to the data.
  47. This should yield a single logistic with parameters: Growth time=18, Saturation=734, and Midpoint=1964.
  48. Now look at the Fisher-Pry transform for this fit. You should hit the Plot Data button each time you change views, because otherwise the excluded points may not be correctly marked. Notice that several points that were excluded are on the regression line; perhaps they should not have been excluded!
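
One way to picture exclusion is as a mask of 0s and 1s that multiplies each point's squared residual, in line with the "weight" described in step 8. The Python sketch below assumes that interpretation; the function names, the data values, and the stand-in straight-line model are all hypothetical.

```python
def build_mask(times, start, end):
    """Return 1 to keep a point in the fit and 0 to withhold it.

    Points whose time lies in [start, end] are withheld.
    """
    return [0 if start <= t <= end else 1 for t in times]

def masked_sse(times, values, mask, model):
    """Sum of squared residuals, counting only points whose mask is 1."""
    return sum(m * (y - model(t)) ** 2 for t, y, m in zip(times, values, mask))

# Hypothetical cumulative counts; the last two years are to be excluded.
times = [1968, 1969, 1970, 1971, 1972]
values = [600.0, 640.0, 680.0, 900.0, 950.0]
mask = build_mask(times, 1971, 1972)

def straight_line(t):
    """Stand-in model, used only to show the effect of the mask."""
    return 600.0 + 40.0 * (t - 1968)

print(mask)
print(masked_sse(times, values, mask, straight_line))
print(masked_sse(times, values, [1] * len(times), straight_line))
```

Excluded points contribute nothing to the error being minimized, so they cannot pull the fitted curve toward themselves, yet they can still be plotted for comparison.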

    IMPORTING DATA

  49. Because data are often stored on spreadsheets or other external files, Loglet Lab allows you to import files into the Data View. The next exercise imports data from Excel. In this exercise, you will also see a case where you will need to specify a nonzero displacement, as well as a loglet with three waves. The file elements.xls in the gallery documents the discovery of the elements. The data are years and the cumulative number of elements known in that year. The discoveries appear to have come in three waves, probably corresponding to new physical and chemical techniques and instruments.
  50. Select Import from the File menu to open the "Import Data" dialog box.

  51. At the bottom of this box is the Files of Type list box. Click on the down arrow on the right end of the box, and select "Excel 5 or 7 (*.xls)" from the menu that drops down from the box. This will reveal all Excel files in the directory.
  52. Select Elements.xls and click on the Open button. (You can also double-click on the filename to achieve the same effect.) Loglet Lab reads the Excel file and translates it into Data View.
  53. To fit three logistics to this data, enter 3 in "Number of Logistics".
  54. Fourteen elements--for instance, gold--were known before 1735, the first year of this series. Thus it makes sense to assume that growth of the number of elements started at 14. Enter 14 as your "Initial Displacement," and click Next.
  55. Unfortunately, for this data set, the estimates provided by Loglet Lab won't work very well. They are based on an assumption that all three wavelets are symmetric, which is inappropriate because the first wave of discoveries was much longer in terms of the length of time and the number of elements discovered. With the rapid advancement of science near the turn of the century, discoveries were more frequent in the subsequent waves. That said, using the following parameters for each logistic should give you a suitable fit:


                    Logistic #1    Logistic #2    Logistic #3
    Growth time     40             20             20
    Saturation      40             30             20
    Midpoint        1800           1890           1950

  56. Be sure to look at the decompositions for this logistic! Now you have seen Loglet Lab's ability to fit logistics, from a simple rise and leveling to a composite of several waves of growth.
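
A nonzero displacement simply shifts the whole curve upward. The Python sketch below adds the displacement of 14 from step 54 to three logistics built from the starting values in step 55. Keep in mind that these are the initial guesses you type into the dialog, not the fitted results, and that the function names and the ln(81) convention (growth time measured from 10% to 90%) are our assumptions.

```python
import math

def logistic(t, growth_time, saturation, midpoint):
    rate = math.log(81) / growth_time
    return saturation / (1.0 + math.exp(-rate * (t - midpoint)))

DISPLACEMENT = 14  # elements known before 1735 (step 54)

# (growth time, saturation, midpoint) starting values from step 55;
# these are initial guesses, NOT the parameters of the final fit.
START = [(40, 40, 1800), (20, 30, 1890), (20, 20, 1950)]

def loglet(t):
    """Displacement plus the sum of the three component logistics."""
    return DISPLACEMENT + sum(logistic(t, *p) for p in START)

for year in (1735, 1800, 1890, 1950, 2000):
    print(year, round(loglet(year), 1))
```

With these guesses the curve starts near 14 and levels off near 14 + 40 + 30 + 20 = 104.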

    REMOVING DATA

  57. When importing from spreadsheets, Loglet Lab will fill in blank spaces with zeros. This will catastrophically affect any attempt to fit a curve to the data. There are two ways to get around this. One is to interpolate values to replace the zeros, which must be done by hand; another is to use the Remove Data Points command to remove these rows. Should more data become available, you can always use the Insert Data Points command to insert more rows in your data.
  58. Using your mouse, select the cell(s) you wish to remove. (In our example, we only select the y-values because Loglet Lab can figure out which pair of values to remove; thus you can select cells in either or both columns and achieve the same results.)

  59. Click on the Remove Data Points button. The rows you selected will be deleted, and the rows below them will be shifted up.

  60. The Insert Data Points command inserts rows in the column (or rather, in a pair of columns). Using the mouse, select the range of rows where you wish to insert data. If you want to insert 4 rows starting at row 17, select rows 17 through 20.
  61. Click on the Insert Data Points button . The rows you just selected and all the rows below them will be shifted down, leaving blank rows in the selection area.
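
If you prefer to clean a series before importing it, the same two operations can be sketched in Python on a list of (time, value) rows. The function names and the sample rows are hypothetical. Note that this sketch drops every row whose value is zero, so it is only appropriate when a zero can only mean a blank.

```python
def remove_zero_rows(rows):
    """Drop (time, value) rows whose value is 0, i.e. imported blanks."""
    return [(t, y) for t, y in rows if y != 0]

def insert_blank_rows(rows, index, count):
    """Insert `count` empty rows before position `index` (0-based).

    Rows at and after `index` are shifted down.
    """
    return rows[:index] + [(None, None)] * count + rows[index:]

# Hypothetical rows; the zero stands for a blank spreadsheet cell.
rows = [(1900, 5), (1901, 0), (1902, 9)]
print(remove_zero_rows(rows))

# Inserting 4 rows starting at row 17 (index 16 when counting from 0).
table = [(1900 + i, i + 1) for i in range(20)]
longer = insert_blank_rows(table, 16, 4)
print(len(longer), longer[15:21])
```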

    LOGISTIC SUBSTITUTION

  62. You are now ready to see Loglet Lab's third capability: analyzing the rise and fall of competitors as one substitutes for another. We might analyze the 200-year displacement of packhorses with waves of wagons, canals, rails, trucks, and recently airplanes. Instead of this long history that no one has witnessed in a lifetime, you will analyze a history you may have witnessed: the rise of long playing (LP) records, their displacement by cassettes, and then the replacement of cassettes by compact discs (CDs). Loglet Lab computes the market shares of each competitor and fits their rise, leveling, and fall with logistic equations. (You can read about the mathematics of logistic substitution by clicking on Help and then Logistic Substitution in the Help Index. It is also discussed in detail in Section 8 of the Logistics Primer.)
  63. In addition to demonstrating logistic substitution, this exercise will show how to paste data into Loglet Lab's Data View. Run Excel, and open RecMedia.xls. This should be a 6x21 grid, the first pair of columns being the year and annual sales of vinyl LPs, the middle pair the year and sales of CDs, and the last pair for cassettes. Although the series of years in all three pairs are identical in this file, they need not be; giving each competitor its own series of years allows you to enter only the important years for that competitor.
  64. In Excel, select the six columns containing the three pairs of years and sales by clicking the first or upper left cell and dragging the mouse to the lower right corner of the data.
  65. While you are still in Excel, click on the Copy button or select Copy in the Edit menu.
  66. Go back to Loglet Lab, and open a new (empty) document.
  67. Click on the Number of Data Sets button . This will bring up the following dialog box:

  68. For "Number of Data Sets," enter 3.
  69. Click on the first cell in the Data View pane, and make sure a heavy border surrounds that cell.
  70. Click on the Paste Data button , or select Paste Data on the Edit menu.
  71. Click the Plot Data button to plot the sales in the Graph View for each of the three competitors.
  72. To see the substitution of one competitor for another you need to convert the sales of each into a percentage of the sales of all three. That is, you need to replot the millions of sales into market shares. Click on the Logistic Substitution button once to see the market shares, which have been plotted using the Fisher-Pry transform.

    Because we are using the Fisher-Pry transformation, the linear portions of each data series show the window in time or portion of its history that a logistic equation logically represents.
  73. Next you need to fit a logistic to each competitor's record. Because LPs only declined during this history, Loglet Lab fits a logistic to its decline during 1975 to 1985, which the linear fall of the Fisher-Pry transform shows was logistic. For cassettes, Loglet Lab will fit an equation to the logistic portion of its rise during 1977 to 1985. Finally, Loglet Lab will fit an equation to the logistic rise of CDs during 1988 to 1995. Clicking on the Logistic Substitution button again opens the following dialog.

  74. Moving the dialog allows you to see the graph and the periods or windows that you will specify. You will learn that Loglet Lab allows you to specify a different order of substitution than the order of the columns in Data View. Note the following regarding this data set:
  75. The first series, LPs, were already declining when this record began, and their Fisher-Pry transforms from 1975 to 1985 line up well, displaying their logistic behavior.
  76. The Fisher-Pry transforms of the third series, cassettes, rose linearly from 1977 to 1985, so the market shares must have grown logistically during that period.
  77. The Fisher-Pry transforms of the second series, CDs, rose linearly from 1988 to 1995, so their market share grew logistically during that period.
  78. Tell Loglet Lab the order of substitution by entering 1, 3 and 2 in "For Item #" boxes on the left side of the dialog box.
  79. To fit the logistic curves, Loglet Lab must know the windows or time intervals when a logistic equation is logical. We have already discussed these above and they are visible in the graph of Fisher-Pry transforms. For Item #1, LPs, enter "1975" and "1985" in the boxes to the right of "Fit a line between ____ and ____". For Item #3, cassettes, enter "1977" and "1985" after "Fit a line between ____ and ____". For Item #2, CDs, enter "1988" and "1995" after "Fit a line between ____ and ____".
  80. Make sure the order and intervals are correct. Then, click on the Go button.

  81. The Graph View will show the linear representations of the three logistic models. Three logistic equations now represent twenty years of rising and falling sales of three competitors. Behavior is often irregular when a market share is less than 5%; thus the model's line for the early years of the CD and that for the later years of the LP are not as accurate.
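
The two ingredients of this exercise, converting sales to market shares and fitting a line to the Fisher-Pry transform inside a window, can be sketched in Python as follows. The sales figures are made up and are NOT the values in RecMedia.xls; the function names are ours, and the sketch covers only the windowed line fit, not the full substitution model documented in the Help and the Primer.

```python
import math

# Made-up annual sales in millions of units, for illustration only.
years = [1975, 1977, 1979, 1981, 1983, 1985]
sales = {
    "LP":       [260, 250, 230, 200, 160, 120],
    "Cassette": [20, 40, 80, 130, 200, 280],
    "CD":       [0, 0, 0, 0, 1, 20],
}

def market_shares(sales):
    """Convert each competitor's sales into its fraction of the yearly total."""
    totals = [sum(column) for column in zip(*sales.values())]
    return {name: [s / total for s, total in zip(series, totals)]
            for name, series in sales.items()}

def fit_window(years, shares, start, end):
    """Least-squares line through ln(F/(1-F)) for years in [start, end]."""
    points = [(t, math.log(f / (1.0 - f)))
              for t, f in zip(years, shares)
              if start <= t <= end and 0.0 < f < 1.0]
    n = len(points)
    t_mean = sum(t for t, _ in points) / n
    z_mean = sum(z for _, z in points) / n
    slope = (sum((t - t_mean) * (z - z_mean) for t, z in points)
             / sum((t - t_mean) ** 2 for t, _ in points))
    return z_mean - slope * t_mean, slope

shares = market_shares(sales)
intercept, slope = fit_window(years, shares["LP"], 1975, 1985)
dt = math.log(81) / abs(slope)  # years to move between 90% and 10% share
tm = -intercept / slope         # year in which the share crosses 50%
print(round(slope, 3), round(dt, 1), round(tm, 1))
```

A falling competitor such as the LP gives a negative slope; a rising one gives a positive slope.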

    ANTICIPATING THE IMPACT OF NEW TECHNOLOGIES

  82. New technologies like the digital versatile disc (DVD) are poised to usurp the CD's domination of recording media. Loglet Lab can visualize the impact of a new technology on its competitors.
  83. First, you must tell Loglet Lab to accommodate the new, fourth technology. Click on the Number of Data Sets button and enter 4.
  84. As you did previously, click Logistic Substitution once to see the market shares, and click it again to get the dialog box. The parameters from the last fit should still be there, along with a new, fourth row. Since there are no data for the DVD, you will have to tell Loglet Lab to synthesize a fourth saturation curve for the new technology. For the new Item #4, click on the radio button "Use the parameters dt =". This allows you to tell Loglet Lab how well and quickly the new technology will compete. You specify its competitiveness by dt, the time for it to rise from 10% to 90% of market share; you specify how soon it will compete by tm, the midpoint of its rise. We expect DVDs to grow at about the same rate as CDs, so try dt = 15 and tm = 2002.
  85. Click Go. The rise of the new competitor and consequent decline of the third, CDs, will appear in Graph View.
  86. Naturally, we would like to look beyond 1995. To extrapolate the history, you can extend the x-axis. Click on the Extend Axes button . This opens the following dialog:

  87. For the minimum, enter 1970, and for the maximum, enter 2015. Hit OK to see the past and hypothetical future of recording sales in the United States:

  88. If you want, you can go back and try different values for dt and tm. You may also want to go back and try expanding the x-axis on other fitted data. In particular, you can expand the right-side to see how the curves level off as they approach limits.
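
To get a feel for what dt and tm mean before you experiment, the following Python sketch traces the hypothetical DVD share curve for dt = 15 and tm = 2002 over the extended axis from 1970 to 2015. The function name is ours, and the sketch shows only the newcomer's own logistic, not how Loglet Lab redistributes the remaining share among the older competitors.

```python
import math

def share_curve(year, dt, tm):
    """Market share of a newcomer rising from 10% to 90% in dt years.

    tm is the midpoint, the year in which the share reaches 50%.
    """
    return 1.0 / (1.0 + math.exp(-math.log(81) / dt * (year - tm)))

DT, TM = 15, 2002  # the values suggested in step 84

for year in range(1970, 2016, 5):
    dvd = share_curve(year, DT, TM)
    print(year, round(dvd, 3), round(1.0 - dvd, 3))  # newcomer, everyone else
```

With these values the newcomer reaches 10% of the market in mid-1994 (tm minus half of dt) and 90% in mid-2009 (tm plus half of dt).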

    MORE ON CUT-AND-PASTE

  89. You can copy the data from the Data View to the Clipboard and paste them into another spread sheet (e.g., Excel), word processor/text editor (e.g., Word, emacs), or some other application (e.g., SigmaPlot). For your convenience, you have access to the transformed data, decomposed data, fitted data points (i.e., the points Loglet Lab uses to plot the fitted curves), and the residuals.
  90. To get these points, select the view for which you wish to copy data. Then scroll the spreadsheet to find the columns which contain the data which you wish to copy, and select them with your mouse.
  91. Click on the Copy Data button or select Copy Data on the Edit menu. This will copy the data to the Clipboard. (For now, you cannot directly copy graphs onto the Clipboard, at least not using Loglet Lab.)

    PRINTING

  92. Finally, you can print any chart, by clicking on the Print button or by selecting Print in the File menu. You may want to use Print Preview to make sure the hard copy will suit your eyes.

KNOWN BUGS and other problems

  1. Applying exclusion to the generation of bell curves doesn't work quite properly.
  2. For now, Loglet Lab can plot and fit logistic curves for only one series at a time. That is, you are limited to one data series per plot. Thus the Fit Logistic command is disabled when there is more than one data series in the Data View. (Of course, if you have more than one data series in a document, you can try to apply the logistic substitution model.)
  3. Some commands are handled slightly differently depending on whether the Data View or the Graph View is active. A command may fail or misbehave because the wrong pane is active. Be sure to let us know if this happens. Bugs aside, all the commands are certain to work when the Data View is the active pane.
  4. Do not be alarmed if, in the course of using Loglet Lab, the final parameters turn out to be identical to the initial values. Loglet Lab is a 32-bit program, thus the final results are precise up to 16 digits, but they are rounded off to the nearest integer. On the other hand, because the fitting algorithm is iterative, it may be necessary to fit more than once before the parameters converge sufficiently.