These measures of variability can be calculated on:

- the differences between members for all possible pairs of measurements in the sample, or
- the difference between each observation in the sample and the sample mean, that is, $(x_i - \bar{x})^2$, the squared deviation from the mean.

Note that the variance is very close to the average of the squared deviations from the mean. The standard error of the mean reflects how unlike each other all possible means of samples of size n might be if each sample were randomly selected from the same population as the initial sample.

Basic concepts of statistical inference
The process of using a sample to make inferences about a population is perhaps the most vital aspect of epidemiologic research. The conceptual underpinnings for statistical inference are based on the process of taking a single random sample of a specific size from a population and using this sample to make judgments about the population as a whole. Typically, these judgments are made in terms of means, variances or other summarizing numbers. Summarizing numbers for the population are called parameters and are represented by Greek letters such as:

- μ = mean,
- σ = standard deviation and
- β = regression coefficient.

Estimates of these parameters obtained from a sample are represented by $\bar{x}$, s and b, respectively.

Using samples to understand populations

Random samples
The process of selecting a sample from a population is essential to statistical inference. The first step is to select a random sample, whereby each member of the population has an equal chance of being selected for the sample (see Chapter 3).

Example: calculating a sample mean
10 people are randomly selected from a population and their weights are measured in kilograms as 82. Of course, another random sample selected from this same population, with the weights measured for this new group, could lead to a different sample mean, say $\bar{x}$ = 68.
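The measures of variability described above can be sketched in a few lines of Python. This is a minimal illustration, not the book's own calculation: the ten weights are hypothetical values chosen for the example (the text's data are not reproduced here), and the function name `describe` is an assumption. The sketch also checks numerically that the pairwise-difference route and the deviation-from-the-mean route give the same variance.

```python
import math
from itertools import combinations


def describe(sample):
    """Return the sample mean, variance, standard deviation and standard error."""
    n = len(sample)
    mean = sum(sample) / n
    # Sum of squared deviations from the mean: sum of (x_i - mean)^2
    sum_sq_dev = sum((x - mean) ** 2 for x in sample)
    # Dividing by n - 1 rather than n is why the variance is "very close to",
    # not equal to, the average squared deviation.
    variance = sum_sq_dev / (n - 1)
    sd = math.sqrt(variance)
    sem = sd / math.sqrt(n)  # standard error of the mean
    return {"n": n, "mean": mean, "variance": variance, "sd": sd, "sem": sem}


def variance_from_pairs(sample):
    """Same variance, computed from the differences between all possible pairs."""
    n = len(sample)
    pair_sum = sum((a - b) ** 2 for a, b in combinations(sample, 2))
    return pair_sum / (n * (n - 1))


# Hypothetical weights in kilograms (illustrative only, not the data from the text)
weights = [82, 71, 64, 59, 75, 68, 62, 77, 73, 69]
summary = describe(weights)
pairwise_variance = variance_from_pairs(weights)

print(summary)
print(pairwise_variance)
```

With these illustrative values the two variance calculations agree, which is why either description of variability can be used.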
One of these sample means is not better than the other, but this does raise the question as to the value of an individual sample mean as an estimate of the population mean when it is so easy to take another sample and obtain a different value for $\bar{x}$. To put this into context, the value is derived from the process used to obtain the estimate. If this process were repeated a very large number of times, a very long list of sample means could be calculated (Box 4. How well a sample mean estimates the population mean can be assessed by examining the characteristics of this long list of sample means. If the mean of all these sample means, that is the mean of the means, is the same as the population mean, then the sample mean is an unbiased estimate of the population mean.

Box 4.
Clearly, it is also preferable that these sample means be very similar to each other, so that any one of them is likely to be close to the true value of the population mean. The standard deviation for this long list of sample means, a measure of how similar these sample means are to each other, is called the standard error of the mean. Note that the long list of sample means is not actually needed in order to estimate this standard error, as it can be calculated from a single sample's standard deviation.

Confidence intervals
Confidence intervals are among the most useful tools. In general, a confidence interval uses these concepts to create reasonable bounds for the population mean, based on information from a sample. They are easy to prepare and relatively easy to understand.

Calculating a confidence interval
To construct a confidence interval, a lower bound and an upper bound are calculated. For the sample of weights, with n = 10 and $\bar{x}$ = 67. Note that the shorter the interval the better, and that a larger sample will likely produce a shorter interval.
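The "long list of sample means" can be imitated by simulation. The sketch below is only an illustration under stated assumptions: the population is a hypothetical set of 10 000 normally distributed weights (the mean of 70 kg and spread of 12 kg are invented), and the number of repeated samples is arbitrary. It compares the mean of the sample means with the population mean, and the standard deviation of the sample means with the population standard deviation divided by the square root of n.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Hypothetical population of body weights in kg (mean and spread are assumptions)
population = [random.gauss(70, 12) for _ in range(10_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

n = 10               # size of each random sample
num_samples = 5_000  # length of the "long list" of sample means

# Draw many random samples of size n and record each sample mean
sample_means = [
    statistics.fmean(random.sample(population, n)) for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)  # should sit close to pop_mean
sd_of_means = statistics.stdev(sample_means)    # empirical standard error
theoretical_se = pop_sd / n ** 0.5              # population SD / sqrt(n)

print(f"population mean {pop_mean:.2f}, mean of means {mean_of_means:.2f}")
print(f"SD of sample means {sd_of_means:.2f}, SD / sqrt(n) {theoretical_se:.2f}")
```

In practice only one sample is available, so the standard error is estimated by replacing the population standard deviation with the sample standard deviation s.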
In fact, in this case, the sample mean is exactly in the middle of the interval, whereas the population mean, while likely to be included, is certainly not guaranteed to be within this interval.

... from the t distribution with n − 1 = 9 degrees of freedom. However, if the sample size is above n = 30, then the number 2. For very large samples, the number will be 1. ... available in most standard statistics texts and online statistics resources. ... including those from regression analyses and for odds ratios, among many others.

Interpreting a confidence interval
It is conceivable that there could be a long list of random samples taken from a population and that a confidence interval could be calculated based on the information ... The result would be a long list of confidence intervals and the expectation is that, if this ... Unfortunately, for a specific sample one does not know if the confidence interval obtained from the study sample is ...

Chapter 4. Basic biostatistics: concepts and tools

Interpreting measures outside the confidence interval
When interpreting confidence intervals, one needs to know what to do with measures that fall outside of the interval. One would expect that 95% of such confidence intervals would in fact contain the population mean. We can use a confidence interval to test a hypothesis, namely, the hypothesis that μ = 80. In this case, the hypothesis was tested and rejected based on the lower and upper limits of the confidence interval. In general, confidence intervals can be used in this way to test hypotheses; however, there is a more formal approach described in Box 4.

Hypothesis tests, p-values, statistical power
Hypothesis-testing is relatively straightforward. We need to make a careful statement of the statistical hypothesis to be tested, the p-value associated with this test and the statistical power the test has for “detecting” a difference of a specified magnitude.
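The interval calculation and the informal test of μ = 80 can be sketched as follows. This is an illustration under stated assumptions rather than the book's worked example: the weights are hypothetical, the function name is invented, and the multiplier 2.262 is the standard two-sided 95% critical value of the t distribution with 9 degrees of freedom (for very large samples the corresponding value is 1.96).

```python
import math
import statistics

# Two-sided 95% critical value of the t distribution with 9 degrees of freedom
T_CRIT_DF9 = 2.262


def confidence_interval(sample, t_crit):
    """Return (lower, upper) bounds: mean -/+ t_crit * standard error."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)  # standard error of the mean
    margin = t_crit * sem
    return mean - margin, mean + margin


# Hypothetical weights in kilograms (illustrative only, not the data from the text)
weights = [82, 71, 64, 59, 75, 68, 62, 77, 73, 69]
lower, upper = confidence_interval(weights, T_CRIT_DF9)

# Informal hypothesis test: reject mu = 80 if 80 falls outside the interval
hypothesised_mean = 80
reject = not (lower <= hypothesised_mean <= upper)

print(f"95% CI: ({lower:.2f}, {upper:.2f}); reject mu = 80: {reject}")
```

The sample mean sits exactly midway between the two bounds by construction, and a larger n shrinks the standard error and therefore the width of the interval.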
The p-value
In the above situation, the null hypothesis was rejected because the observed outcome was deemed to be too unlikely or rare under the assumption that the null hypothesis was true. The cut point for rarity in this circumstance was set when the α-level was set at α = 0. A more precise measure of the rarity of this observed outcome, again under the assumption that the null hypothesis is true, can be obtained readily. This area is called the p-value and it represents the probability that a value for the mean of a random sample from this population would be as far away as 67. That is, the observed outcome is so rare that it is difficult to believe that μ = 80 kg.

Statistical power
In the description of the two-sample t-test below, there is reference to the null hypothesis:

$H_0: \mu_1 = \mu_2$, $H_1: \mu_1 \neq \mu_2$

which examines the differences between the means of two populations. If these are two populations of body weights, then, in this context, clearly, the larger the difference between the two population means, the easier it will be to reject this null hypothesis using the sample means. In order to set up a statistical test for this question, we select two options for comparison:

- the null hypothesis: H0: μ = 80 kg, and
- the alternative: H1: μ ≠ 80 kg.

If H1 is selected, the usual statement is that the null hypothesis H0 has been rejected. Note that the alternative is expressed as H1: μ ≠ 80 kg rather than either μ > 80 or μ < 80. This implies that a two-tailed test is to be done rather than a one-tailed test, as would be the case if either of the other two alternatives were used. In general, a two-tailed test should be used for basic epidemiologic applications, as the conditions necessary for comfortable use of the one-tailed test in this context are rare.
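A two-tailed p-value for H0: μ = 80 kg can be sketched with the standard library alone. The sketch makes several assumptions: the weights are hypothetical, the function names are invented for illustration, and the tail area of the t distribution is obtained by numerically integrating its density with Simpson's rule rather than by a statistics package, which is what would normally be used.

```python
import math
import statistics


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat, df, steps=10_000):
    """Area in both tails beyond |t_stat|, via Simpson's rule (steps must be even)."""
    b = abs(t_stat)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * (h / 3) * total  # area between -|t| and +|t|
    return max(0.0, 1.0 - central_area)


def one_sample_t_test(sample, mu0):
    """Return the t statistic and two-tailed p-value for H0: mu = mu0."""
    n = len(sample)
    sem = statistics.stdev(sample) / math.sqrt(n)
    t_stat = (statistics.fmean(sample) - mu0) / sem
    return t_stat, two_sided_p_value(t_stat, n - 1)


# Hypothetical weights in kilograms (illustrative only, not the data from the text)
weights = [82, 71, 64, 59, 75, 68, 62, 77, 73, 69]
t_stat, p_value = one_sample_t_test(weights, mu0=80)

print(f"t = {t_stat:.3f}, two-tailed p = {p_value:.4f}")
```

A p-value below the chosen α-level leads to rejection of H0. Because the test is two-tailed, both tails beyond |t| are counted, which matches the alternative H1: μ ≠ 80 kg.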

**Diseases**

- Vascular disruption sequence
- Goldblatt Wallis syndrome
- Subvalvular aortic stenosis
- Epitheliopathy (APMPPE)
- Microcephaly lymphoedema chorioretinal dysplasia
- Fibrochondrogenesis

The result is usually an abnormal protein, or no protein at all. Alzheimer's disease, for example, is caused by this kind of splicing error. Until recently, researchers looked at genes, and the proteins they encode, one at a time. Now, they can look at how large numbers of genes and proteins act, as well as how they interact.

Molecular biologist Christine Guthrie of the University of California, San Francisco, wants to understand more fully the mechanism for splicing. Guthrie can identify which genes are required for splicing by finding abnormal yeast cells that mangle splicing. Without introns, cells wouldn't need to go through the splicing process and keep monitoring it to be sure it's working right. As it turns out, splicing also makes it possible for cells to create more proteins. This means that the results have real consequences for people.

Your first encounter with genetic analysis probably happened shortly after you were born, when a doctor or nurse took a drop of blood from the heel of your tiny foot. Those born with this disorder cannot metabolize the amino acid phenylalanine, which is present in many foods. ... Department of Health and Human Services, recommended a standard, national set of newborn tests for 29 conditions, ranging from ...

Done one gene at a time, using methods considered state-of-the-art just a ... used the results to identify genes that aren't transcribed correctly in people with the disease. He used a variation of the yeast ... how genes respond in diverse situations, researchers may be able to learn how to stop or jump-start genes on demand, change the course of a disease or prevent it from ever happening.

This is called translation, and its principal actors are the ribosome and amino acids. The ribosome also links each additional amino acid into a growing protein chain (see drawing, page 13). Some first-aid ointments contain the antibiotic neomycin, ... not proteins, performed the ribosome's job.
In 1999, he showed how different parts of a bacterial ribosome interact with one another and how the ribosome interacts with ... These studies provided near proof that the ...

... she found, the nucleotides do something else entirely: They help the growing protein slip off the ribosome once it's finished.

Noller, Green and hundreds of other scientists work with the ribosomes of bacteria. For example, antibiotics like erythromycin and neomycin work by attacking the ribosomes of bacteria, which are different enough from human ribosomes that our cells are not affected by these drugs. As researchers gain new information about bacterial translation, the knowledge may lead to more antibiotics for people. ... many bacteria have developed resistance to the current arsenal. It can be difficult to find those small, but critical, changes that may lead to resistance, so it is important to find completely new ways to block bacterial translation. ... strategy is to make random mutations to the genes in a bacterium that affect its ribosomes. Using clever molecular tricks, Green figured out a way to rescue some of the bacteria with defective ribosomes so they could grow.

... jobs for proteins is to control how embryos develop. Scientists discovered a hugely important set of proteins involved in development by studying mutations that cause bizarre malformations ... The most famous such abnormality is a fruit fly with a leg, rather than the usual antenna, ... Kaufman of Indiana University in Bloomington, the leg is perfectly normal; it's ... In this type of mutation and many others, something goes wrong with the genetic program that directs some of the cells in an embryo to follow developmental pathways, which are ... In the antenna-into-leg problem, it is as if the cells growing from the fly's head, ... Thinking about this odd situation taught scientists an important lesson: that the proteins ...
Scientists determined that several different genes, each with a common sequence, provide these anatomical identification card instructions. In the early 1980s, he and other researchers made a discovery that has been ... in Antennapedia but in the several genes next to it and in genes in many other organisms. ... genes of different organisms, it's a good clue that these genes do something so important and ... yeast to plants, frogs, worms, beetles, chickens, mice and people. For example, researchers have found that abnormalities in ...

This technology has changed the way many geneticists do their work by making it possible to observe the activity of ... Microarrays are used to get clues about which genes are expressed to control cell, tissue or organ function. ... fluorescence at each spot on the chip, revealing ... A computer analyzes the patterns of gene activity, providing a snapshot of a genome under two conditions.

What newborn tests does your area hospital routinely do?

But that view is no longer accurate. ... the bases adenine (A), cytosine (C), guanine (G) ... These discoveries reveal that it is truly a remarkable molecule and a multitalented actor in heredity. The riboswitch shown here bends into a special shape when it grips tightly onto a molecule called a metabolite (colored balls) that bacteria need to survive.

Another interesting aspect of editing is that ... react quickly to a changing environment. ... Utah School of Medicine in Salt Lake City studies one particular class of editors called adenosine ... Understanding the details of this process is an important area of medical research.

For example, Gregory Hannon of the Cold ... and practically break in half as the worm grows. Researchers investigating genes involved in plant growth noticed ... later, two geneticists studying development saw a similar thing happening in lab animals. ... awarded the Nobel Prize in physiology or medicine for their discovery. They have learned, for example, that the process is not limited ...
A good part of who we are is “written in our genes,” inherited from Mom and Dad. ... environmental influences, affect who we are and what type of illnesses we might get. Epigenetic means, literally, “upon” or “over” genetics. It describes a type of chemical reaction ...

These changes make genes either more or less likely to be ... Where we live, how much we exercise, what we eat: These ... Improper expression of growth-promoting genes, for example, can lead to cancer, birth defects or other health concerns.

Upon viewing chromatin up close, the researchers described it as “beads on a string,” an image still used today. The observation that a cell's gene-reading machinery tracks epigenetic markings led C. David Allis, who was then at the University of Virginia Health Sciences Center in Charlottesville and now works at the Rockefeller University in New York City, to coin a new phrase, the “histone code.” These markings help determine whether genes will ... development of medicines to correct such errors.

Thus, the grandsons of a man with a fragile X chromosome, who is not himself affected, have a 40 percent risk of retardation if they inherit the abnormal, or mutant, form of the gene. A single copy of the abnormal, or mutant, form of the Igf2 gene causes growth defects, but only if the abnormal gene variant is inherited from the father.

Battle of the Sexes
A process called imprinting, which occurs naturally in our cells, provides another example of epigenetics. In the case of Igf2, the father's copy of Igf2 is expressed, and the mother's copy remains silent (is not expressed) throughout the life of the offspring.

Exposure
An important aspect of case-control studies is the determination of the start and duration of exposure for cases and controls. In the case-control design, the exposure status of the cases is usually determined after the development of the disease (retrospective data) and usually by direct questioning of the affected person or a relative or friend (Box 3.1). The informant's answers may be influenced by knowledge about the hypothesis under investigation or the disease experience itself.

Thalidomide
A classic example of a case-control study was the discovery of the relationship between thalidomide and limb defects in babies born in the Federal Republic of Germany in 1959 and 1960. Of 46 mothers whose babies had malformations, 41 had been given thalidomide between the fourth and ninth weeks of pregnancy.

Exposure is sometimes determined by biochemical measurements. This problem can be avoided if exposure can be estimated from an established recording system.

Proportionately more people who had the disease (50 of 61 cases) reported prior meat consumption ...

| Disease (enteritis necroticans) | Prior meat consumption: Yes | Prior meat consumption: No | Total |
|---|---|---|---|
| Yes | 50 | 11 | 61 |
| No | 16 | 41 | 57 |
| Total | 66 | 52 | 118 |

The odds ratio is very similar to the risk ratio, particularly if a disease is rare. For the odds ratio to be a good approximation, the cases and controls must be representative of the general population with respect to exposure. However, because the incidence of disease is unknown, the absolute risk cannot be calculated. An odds ratio should be accompanied by the confidence interval observed around the point estimate (see Chapter 4).

Cohort studies
Cohort studies, also called follow-up or incidence studies, begin with a group of people who are free of disease, and who are classified into subgroups according to exposure to a potential cause of disease or outcome (Figure 3. Variables of interest are specified and measured and the whole cohort is followed up to see how the subsequent development of new cases of the disease (or other outcome) differs between the groups with and without exposure. Cohort studies have been called prospective studies, but this terminology is confusing and should be avoided.
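Returning to the case-control counts for enteritis necroticans (50 exposed and 11 unexposed cases, 16 exposed and 41 unexposed controls), the odds ratio and an approximate confidence interval can be sketched as below. The function name is hypothetical, and the interval uses the common log-based (Woolf) approximation with z = 1.96, which is one of several possible methods and is an assumption here rather than the method the text prescribes.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 case-control table.

    a: exposed cases       b: unexposed cases
    c: exposed controls    d: unexposed controls
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR), Woolf's approximation
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper


# Counts from the enteritis necroticans table: meat consumption by disease status
or_estimate, ci_low, ci_high = odds_ratio_with_ci(a=50, b=11, c=16, d=41)

print(f"odds ratio = {or_estimate:.1f}, 95% CI ({ci_low:.1f}, {ci_high:.1f})")
```

The interval is computed on the log scale because the sampling distribution of the odds ratio is skewed. The whole interval lies well above 1, which is consistent with an association between meat consumption and the disease in these data.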
As mentioned previously, the term “prospective” refers to the timing of data collection and not to the relationship between exposure and effect. Cohort studies provide the best information about the causation of disease and the most direct measurement of the risk of developing disease. Although conceptually simple, cohort studies are major undertakings and may require long periods of follow-up since disease may occur ... For example, the induction period for leukaemia or thyroid cancer caused by radiation (i. ... outcome) is many years and it is necessary to follow up study participants for a long time. Many exposures investigated are long-term in nature and accurate information about them requires data collection over long periods. ... relatively stable habits and information about past and current exposure can be collected at the time the cohort ...

Box 3.
An example of measuring effects over a long time period is the catastrophic poisoning of residents around a ... An intermediate chemical in the production process, methyl isocyanate, leaked from a tank and the fumes drifted into surrounding residential areas, exposing half a ... In addition, 120 000 people still suffer health effects caused by the crash and subsequent pollution. The acute effects were easily studied with a cross-sectional design.

As cohort studies start with exposed and unexposed people, the difficulty of measuring or finding existing data on individual exposures largely determines the feasibility of doing one of these studies. If the disease is rare in the exposed group as well as the unexposed group there may also be problems in obtaining a large enough study group.

Since cohort studies take healthy people as their starting-point, it is possible to examine a range of outcomes (in contrast to what can be achieved in case-control studies).

Box 3. Nurses' Health Study
Although cost is a major factor in large cohort studies,
methods have been developed to make them less expensive to run. In 1976, 121 700 married female nurses aged 30–55 years completed the initial Nurses' Health ... Every two years, self-administered questionnaires were sent to these nurses, who supplied information on their health behaviours and reproductive ... The initial cohort was enrolled with the objective of evaluating the health effects of oral contraceptive use. Investigators tested their methods on small subgroups of the larger cohort, and obtained information on disease outcomes from routine data ... relatively common cause of death, it is a rare occurrence in younger women, and so a large cohort is ...

For example, the Framingham study, a cohort study that began in 1948, has investigated the risk factors for a wide range of diseases, including cardiovascular and respiratory diseases and musculoskeletal disorders. Similar large-scale cohort studies have been started in China. ... histories, and major cardiovascular risk factors including blood pressure and body weight were obtained from a representative sample of 169 871 men and women 40 years of age and older in 1990. Such studies have provided strong evidence for a variety of cause-effect relationships for chronic diseases.

This type of investigation is called a historical cohort study, because all the exposure and effect (disease) data have been collected before the actual study begins. For example, records of military personnel exposure to radioactive fall-out at nuclear bomb testing sites have been used to examine the possible causal role of fall-out in the development of cancer over the past 30 years.

Nested case-control studies

Box 3. Nested case-control study of gastric cancer
To determine if infection with Helicobacter pylori was ... By 1991, 186 people in the original cohort had developed gastric cancer.
The investigators then did a nested case-control study by selecting the 186 people with gastric cancer as cases and another 186 cancer-free individuals from the same cohort as controls.

The nested case-control design makes cohort studies less expensive. The cases and controls are both chosen from a defined cohort, for which some information on exposures ...

Figure: Identifying cases and controls for a nested case-control study (diagram labels: Population; Disease, Cases; No disease, People without disease, Sample, Controls; Time, follow-up over many years).

Summary of epidemiological studies

Table 3. Applications of different observational study designs (a)

| Objective | Ecological | Cross-sectional | Case-control | Cohort |
|---|---|---|---|---|
| Investigation of rare disease | ++++ | – | +++++ | – |
| Investigation of rare cause | ++ | – | – | +++++ |
| Testing multiple effects of cause | + | ++ | – | +++++ |
| Study of multiple exposures and determinants | ++ | ++ | ++++ | +++ |
| Measurements of time relationship | ++ | – | + (b) | +++++ |
| Direct measurement of incidence | – | – | + (c) | +++++ |
| Investigation of long latent periods | – | – | +++ | – |

(a) +…+++++ indicates the general degree of suitability (there are exceptions); – not suitable.
(b) If prospective.

Experimental epidemiology
Intervention or experimentation involves attempting to change a variable in one or more groups of people. This could mean the elimination of a dietary factor thought to cause allergy, or testing a new treatment on a selected group of patients. The effects of an intervention are measured by comparing the outcome in the experimental group with that in a control group. Since the interventions are strictly determined by the study protocol, ethical considerations are of paramount importance in the design of these studies. For example, no patient should be denied appropriate treatment as a result of participation in an experiment, and the treatment being tested must be acceptable in the light of current knowledge. Informed consent from study participants is required in almost all circumstances.
An interventional study is usually designed as a randomized controlled trial, a field trial, or a community trial.

Randomized controlled trials
A randomized controlled trial is an epidemiological experiment designed to study the effects of a particular intervention, usually a treatment for a specific disease (clinical trial). Subjects in the study population are randomly allocated to intervention and control groups, and the results are assessed by comparing outcomes.
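Random allocation to intervention and control groups can be sketched as a simple shuffle. This is a minimal illustration of simple (complete) randomization only; the function name, subject identifiers and seed are hypothetical, and real trials often use blocked or stratified schemes with concealed allocation, which this sketch does not attempt.

```python
import random


def randomize(subject_ids, seed=None):
    """Randomly split subjects into intervention and control groups of equal size
    (the control group gets the extra subject if the total is odd)."""
    rng = random.Random(seed)  # separate generator so the allocation is reproducible
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"intervention": shuffled[:half], "control": shuffled[half:]}


# Twenty hypothetical subject identifiers, 1 to 20
groups = randomize(range(1, 21), seed=2024)

print("intervention:", sorted(groups["intervention"]))
print("control:     ", sorted(groups["control"]))
```

Because every ordering of the subjects is equally likely, each subject has the same chance of ending up in either group, so known and unknown characteristics tend to be balanced between the groups being compared.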