Conference Schedule

8:15 – 9:00
Registration and Continental Breakfast

9:00 – 9:10
Introductory Remarks
Boris Iglewicz, Temple University

9:10 – 9:55
Considerations in the Design and Analysis of Studies with Clustered Data
Melissa Begg, Columbia University

9:55 – 10:40
Identification of Differentially Expressed Genes in a DNA Microarray Experiment with Little Replication
Javier Cabrera, Rutgers University

10:40 – 11:05
Break

11:05 – 11:50
Three Statistical Problems Considered by the FDA
Joseph Gastwirth, George Washington University

11:50 – 12:35
Statistical Challenges in Analyzing Mass Spectrometry Proteomic Data
Xihong Lin, Harvard University

12:35 – 1:40
Lunch

1:40 – 2:25
Use of Hadamard Matrices in the Analysis and Planning of Microarray Experiments
Damaraju Raghavarao, Temple University

2:25 – 3:10
The Impact of a Proper Weighting of Variance Function in Bioassay Applications
Charles Tan, Merck Research Laboratories

3:10 – 3:25
General Discussion


Melissa Begg, Columbia University

The past two decades have witnessed tremendous growth in the number of available methods for analyzing cluster-correlated data with continuous and discrete outcomes. These methods have been applied to a wide range of studies using clustered designs, due to widespread appreciation of the fact that clustered data cannot be analyzed using “ordinary” methods for independent data. But there remains a question as to whether common analytic practices make the best use of the richness of correlated data. For example, public health researchers rely on family (or sibling or twin) studies to evaluate the impact of a host of factors on health in infancy, childhood, and even adulthood. These study designs permit us to discriminate between factors operating at the family level and those operating at the level of the individual within a family. This “sifting” of effects might allow for better control of confounding factors, separation of environmental and genetic influences, and causal interpretation of effects. These advantages, however, may not always be realized. We will review strategies for model specification and their implications for inference in the analysis of family studies using regression models for clustered data.

Javier Cabrera, Rutgers University

A fundamental experiment in functional genomics research is the comparison of two groups of microarray data to determine which genes are expressed differentially between the two (e.g., diseased versus normal tissue). While these data could be simply analyzed gene by gene with a series of individual t tests, it should be possible to increase the power of the procedure substantially by borrowing strength across genes. We show how this can be realized via a model, which posits minimal distributional assumptions, and a conditional t suite of tests. We also provide a comparison with competitor procedures (SAM) both conceptually and via simulations.

Joseph Gastwirth, George Washington University

This talk will discuss three statistical topics that arose from important health issues faced by the FDA. Today almost everyone accepts the validity of random sampling; however, that was not true in the 1930s and 1940s. Thus, it is of interest to review the statistical data from an early case where the court accepted the scientific validity of sampling and examined the propriety of “pooling” several samples into one. In the 1980s the speaker served as a consultant to the Office of Management and Budget when the FDA proposed to regulate Reye syndrome. The statistical issues will be summarized and a proposal, based on more recent statistical literature, for using Bayesian methods to update the estimated relative risk will be presented. Finally, the suggested methodology for determining expiration dates of drugs will be examined from the perspective of positive predictive value and related methodology used in evaluating screening and diagnostic tests. This enables one to assess the various trade-offs involved.

Xihong Lin, Department of Biostatistics, School of Public Health, Harvard University

In high-throughput mass spectrometry (MS) proteomic experiments, we can simultaneously detect and quantify a large number of peptides/proteins. Such techniques have good potential for discovering new disease biomarkers. The resulting data (spectra) from such experiments are large and can be treated as finely sampled functions. Most existing MS analyses involve multiple ad hoc sequential methods for preprocessing the MS data, such as baseline subtraction, truncation, normalization, peak detection, and peak alignment. We will discuss challenges in analyzing MS proteomic data and propose a unified statistical framework for pre-processing and post-processing mass spectra using advanced nonparametric regression and functional data analysis techniques in conjunction with statistical learning methods. We stress that pre-processing is critical in the analysis of mass spectrometry proteomic data. We apply the methodology to a motivating data set obtained from a study of lung cancer patients whose serum samples were collected and processed using a surface-enhanced laser desorption/ionization time-of-flight (SELDI-TOF) mass spectrometry instrument.

Charles Tan, Merck Research Laboratories

In bioassay applications, a standard curve, typically a four-parameter logistic curve, is the anchor for each run. However, not much attention is usually paid to the weighting (or inverse variance) function, since it doesn’t seem to matter much. If the weight function is off, we may lose some efficiency, but we still have consistent estimates, and the parameter estimates differ very little in numerical terms. This presentation will show that the weighting function matters a lot in determining the Limit of Quantitation based on the standard curve. In choosing a proper variance function, one faces the problem of too many choices but no systematic method. This presentation will show that piece-wise power law weighting can accommodate a very wide range of bioassays and provide a systematic platform. There are even hints that the “knot” or “change point” of the piece-wise power law variance function tells us something about the underlying biochemistry that the standard curve (the mean function) itself does not. This presentation will also show that the proper variance function, once established, can be used to simplify the “extra-variability” criteria for duplicated wells or readings on the same sample.

Damaraju Raghavarao, Temple University

When replicated data are not available, we cannot determine expressed genes by standard statistical methods. However, using Hadamard matrices we can create artificial replications and identify expressed genes assuming some genes are unexpressed. This method is also applicable when a limited number of replicates are available. We demonstrate our method on ApoAI data. These matrices can also be used in a very complicated way to plan microarray experiments to study the significance of demographic factors of the subjects on the gene expressions.


Melissa Begg is a Professor of Clinical Biostatistics and her department’s Director of Academic Programs at the Mailman School of Public Health of Columbia University. She received her ScD degree in biostatistics from the Harvard School of Public Health in 1989. Her interests include clinical research education, analysis of clustered data, categorical data analysis, and statistical methods in oral health research and studies of mental health. She serves as director of the Clinical Research Methods and Patient Oriented Research tracks in the Master’s program at Columbia. In 2006 she received Columbia University’s Presidential Award for Outstanding Teaching, as well as the Award for Teaching Excellence from the graduating class of the Mailman School of Public Health.

Javier Cabrera is the director of the Institute of Biostatistics at Rutgers University. He received his Ph.D. from Princeton University and has lectured in statistics at Rutgers University, the National University of Singapore, and the Hong Kong University of Science and Technology. He is the author or co-author of many publications in the areas of data mining and functional genomics, including the book Exploration and Analysis of DNA Microarray and Protein Array Data.

Joseph Gastwirth is Professor of Economics and Statistics at the George Washington University. He is a Fellow of the American Statistical Association, the Institute of Mathematical Statistics and the American Association for the Advancement of Science and a member of the International Statistical Institute. In 1985 he received a Guggenheim Fellowship to continue his research on statistical problems arising in law and public policy. He is the author of “Statistical Reasoning in Law and Public Policy”, which discusses a wide variety of legal applications including the use of epidemiologic data in product liability cases and clinical trials in drug approval and misleading advertising cases. In the early 1980s he served as a consultant to the Office of Information and Regulatory Affairs in the Office of Management and Budget. Professor Gastwirth has written over one hundred peer-reviewed articles concerning both the theory and applications of statistical inference.

Xihong Lin is a Professor in the Department of Biostatistics at the Harvard School of Public Health. Her statistical research includes analysis of correlated and high dimensional data, nonparametric/semiparametric regression, longitudinal data, statistical learning methods, and a range of topics in observational studies and disease prevention studies. Her methodological research has been continuously funded by NIH for the last 10 years. Professor Lin is a Fellow of the American Statistical Association, and is the recipient of the 2002 Mortimer Spiegelman Award from the American Public Health Association and the 2002 Noether Young Researcher Award from the American Statistical Association. She was the coordinating editor of Biometrics between 2003 and 2005.

Damaraju Raghavarao received his Ph.D. degree from Bombay University, India, in 1961. After working as Professor & Head of the Department of Mathematics and Statistics in a Land Grant University in India, he came to the U.S.A. in 1972 and joined Temple University as Professor of Statistics in 1974. His primary research area of interest is experimental designs with applications to agriculture, pharmaceuticals and business. He has authored or coauthored 7 books and more than 100 refereed papers. He is a Fellow of the Institute of Mathematical Statistics and the American Statistical Association and an elected member of the International Statistical Institute.

Dr. Charles Tan, Associate Director, Scientific Staff, works in the non-clinical statistics area at Merck. He obtained his PhD from Temple University in 1998, and a paper he co-authored with Prof. Boris Iglewicz, based on the dissertation work, won the ASA W.J. Youden Award in Interlaboratory Testing in 2001. Dr. Tan has been active in serving the statistical community. He was the President of the Philadelphia Chapter of ASA and is the current Chair of the ASA Advisory Committee on Continuing Education. Dr. Tan has been working with bioassays for more than seven years and is active in connecting with other scientific disciplines. He is a member of the United States Pharmacopeia Statistical Expert Committee and a member of the Advisory Panel for USP Chapter 1125, Nucleic Acid-Based Techniques.


General – $80
Merck – $40
Bristol-Myers Squibb – $60
Wyeth – $60
Full time graduate students – $25

Registration includes: Continental Breakfast, Lunch, Break. Parking is free.

Registration: 8:15AM – 9:00AM
Meeting: 9:00AM – 3:30PM

Seating is limited. Please make checks payable to Temple University (Biostatistics) and send to:

Boris Iglewicz, Director,
Biostatistics Research Center,
Department of Statistics, Temple University,
1810 N. 13th Street,
Philadelphia, PA 19122-6083

Please include your name, the name of your company, and either your email address, fax #, or address. We must receive checks by Wednesday, October 18, 2005. We cannot accept cash or credit card payments.

For additional information, contact Boris Iglewicz, Director, by telephone at (215) 204-8637.


640 W. Germantown Pike, Plymouth Meeting PA 19462
(610) 834-8300

From Airport: Take 95 South to 476 North to the last exit, #20 (Germantown Pike West). Merge with Germantown Pike and follow for 3 lights. Make a right onto Hickory Rd. at the 3rd light. The hotel is the 3rd building on the left.

New York/New Jersey Turnpike: Take the New Jersey Turnpike to exit #6, which is the PA Turnpike. Go west to exit #333 (Norristown). Follow signs to Plymouth Rd. Go to the 1st light and make a left. Go to the next light and make a right onto Germantown Pike. Go to the second light and make a right on Hickory Rd. The hotel is the second driveway on the left.

Washington D.C., Wilmington, and Delaware: Take I-95 North to Route 476 North. Take Route 476 to the Germantown Pike West exit #20. Go to the third light, Hickory Rd., and make a right. The hotel is the 2nd driveway on the left.

Route 476: Take 476 to the Germantown Pike West exit #20. Go to the third light, Hickory Rd., and make a right. The hotel is the 2nd driveway on the left.

From downtown Philadelphia: Take I-76 West to Plymouth Meeting exit #331B (Route 476). Take Route 476 North to the Germantown Pike exit. Go to the third light, Hickory Rd., and make a right. The hotel is the 2nd driveway on the left.