Professor Omar Hasan Kasule, Sr. MB ChB (MUK), MPH, DrPH (Harvard)


Data gives rise to information that in turn gives rise to knowledge. Knowledge leads to understanding. Understanding leads to wisdom. The primary purpose of statistical analysis is to derive information from data. The human mind cannot deal effectively with raw data. It can only deal with a representative summary of the raw data expressed as statistical parameters.


Presentation of statistical results must be concise and to the point, following the principle of parsimony first stated by the Prophet Muhammad and restated centuries later by William of Ockham. The Prophet said that the best of words is what is little but conveys the essence, ‘khayr al umuur ma qalla wa dalla’. Ockham’s razor is attributed to William of Ockham (1285-1349M), an English philosopher, who formulated the maxim ‘entia non sunt multiplicanda praeter necessitatem’, which translates as ‘the assumptions used to explain a phenomenon must not be multiplied beyond necessity’. This law was called the law of parsimony by Sir William Hamilton in 1853. Karl Pearson called it the canon of economy in 1892.



The results section presents the findings of the procedures described in the methods section. It should be brief and to the point. A distinction must be made between results and data. A result is summary information obtained from data analysis. Data is the actual numerical information, often presented in a summarized form. Each result is presented first, followed by the supporting data. Results of hypothesis-based studies should be in the past tense. Data of descriptive studies should be in the present tense.


Data are presented in the form of tables and diagrams (figures, bar diagrams, graphs, pie charts, maps, etc.). Presentation of numerical data in text should be kept to a minimum. Only results relevant to the research hypothesis should be presented. Both negative and positive results are presented. It is considered scientific fraud to present only those results that the author thinks favor a particular hypothesis.


The results section is written in chronological order. The most important results are presented before the least important. Magnitude of change should be presented as a summary statistic such as percentage change instead of presenting the raw data.
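As a minimal sketch of reporting magnitude of change as a summary statistic, the snippet below computes a percentage change from two means. The function name `percent_change` and the blood pressure values are hypothetical and were invented for illustration only.

```python
def percent_change(before: float, after: float) -> float:
    """Return the change from `before` to `after` as a percentage of `before`."""
    if before == 0:
        raise ValueError("percentage change is undefined when the baseline is zero")
    # Multiply before dividing to keep the arithmetic exact for round numbers.
    return (after - before) * 100.0 / before


# Hypothetical values, not taken from any study.
baseline_mean = 150.0   # mean systolic blood pressure before treatment, mmHg
followup_mean = 135.0   # mean systolic blood pressure after treatment, mmHg

change = percent_change(baseline_mean, followup_mean)
summary = (
    f"Mean systolic pressure changed by {change:+.1f}% "
    f"(from {baseline_mean:.0f} to {followup_mean:.0f} mmHg)."
)
print(summary)
```

The single percentage carries the result, and the two means in parentheses are the supporting data.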


The summary statistics normally used are the mean, the median, and the proportion. The mean should be presented as mean +/- standard deviation (SD) or standard error of the mean (SE), with units of measure indicated. The test statistics normally reported are the chi-square and the t statistics. Actual p values should be given instead of indicating p<0.05 or p>0.05. When specifying the sample size, the type of sample should be stated, for example ‘the sample was 20 rats’ instead of ‘the sample size was 20’. Emphasis can be put on some results and not others. Not all the data from the study need be reported. Citing data in the text takes less space but is more difficult to read. A topic sentence is used to give an overview. Important results are put first.
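The reporting conventions above can be sketched in a few lines of Python using only the standard library. The helper names `describe` and `chi_square_2x2`, the rat weights, and the 2x2 counts are all hypothetical. The p value uses the identity that, for a chi-square statistic x with 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)), and the chi-square is computed without a continuity correction, which is an assumption of this sketch.

```python
import math
import statistics


def describe(values, unit, subjects):
    """Format a sample as 'mean +/- SD unit', with SE and the type of sample."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)   # sample SD (n - 1 denominator)
    se = sd / math.sqrt(n)          # standard error of the mean
    return (
        f"{mean:.1f} +/- {sd:.1f} {unit} (mean +/- SD), "
        f"SE {se:.1f} {unit}, sample of {n} {subjects}"
    )


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and its p value for a 2x2 table.

    The table is laid out as [[a, b], [c, d]] and has 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p = math.erfc(math.sqrt(chi2 / 2))   # upper tail of chi-square with 1 df
    return chi2, p


# Hypothetical data, invented for illustration only.
weights_g = [250.0, 260.0, 240.0, 255.0, 245.0]
summary_line = describe(weights_g, "g", "rats")
print(summary_line)

chi2, p = chi_square_2x2(30, 20, 15, 35)
print(f"chi-square = {chi2:.2f}, df = 1, p = {p:.4f}")   # actual p value, not 'p<0.05'
```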


Figures used to present results must have a strong visual impact and must be simple. The following types of figures are used: line graph, scattergram, bar graph, histogram, and frequency polygon. The title of the figure should reflect its contents. The figure must be labeled correctly. Symbols must be defined. The names of variables and units of measurement must be labeled appropriately. Tables must be properly titled and column headings clearly indicated. Footnotes, subscripts, and superscripts can be used.



Several problems are found in the results section of papers. The reporting of results is sometimes selective, showing only favorable outcomes. Missing denominators and numbers that do not add up are common deficiencies. Tables may not be properly or completely labeled. Marginal totals (rows or columns) may not equal the sum of the respective cells. The column totals may not equal row totals. Row or column percentages may not add up to 100%. Numbers in the table may not reconcile with the text. There may be inconsistencies in rounding off, number of decimal places, and units of measurement. The author may fail to mention whether statistical tests are based on confidence intervals or p-values. The following pitfalls are common with the t-test: not stating the degrees of freedom, not stating the confidence intervals, use of the t-test for non-Gaussian data, and multiple comparisons. Some authors go on a ‘fishing’ expedition when they set out to analyze data without a prior hypothesis. Other common mistakes are improper multiple comparisons, failure to standardize for age, failure to take into account information loss due to censoring, using the wrong statistical formula, confusing continuous and discrete scales, and using mean +/- 2SD on non-normal data.
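Several of the arithmetic deficiencies listed above can be caught mechanically before a table is submitted. The following is a minimal sketch, not a complete validator. The function names `check_table` and `percentages_add_up`, the rounding tolerance of 0.5, and the example counts are assumptions made for illustration.

```python
def check_table(cells, row_totals, col_totals, grand_total):
    """Return a list of arithmetic problems found in a table of counts."""
    problems = []
    # Each reported row total must equal the sum of the cells in that row.
    for i, (row, reported) in enumerate(zip(cells, row_totals)):
        if sum(row) != reported:
            problems.append(f"row {i}: cells sum to {sum(row)}, reported {reported}")
    # Each reported column total must equal the sum of the cells in that column.
    for j, reported in enumerate(col_totals):
        col_sum = sum(row[j] for row in cells)
        if col_sum != reported:
            problems.append(f"column {j}: cells sum to {col_sum}, reported {reported}")
    # Row totals and column totals must agree with each other and the grand total.
    if sum(row_totals) != sum(col_totals):
        problems.append("row totals and column totals do not agree")
    if sum(row_totals) != grand_total:
        problems.append("row totals do not add up to the grand total")
    return problems


def percentages_add_up(percentages, tolerance=0.5):
    """True if the percentages sum to 100 within a rounding tolerance."""
    return abs(sum(percentages) - 100.0) <= tolerance


# Hypothetical table with a deliberate error in the second column total.
cells = [[30, 20], [15, 35]]
problems = check_table(cells, row_totals=[50, 50], col_totals=[45, 56], grand_total=100)
for problem in problems:
    print(problem)

print(percentages_add_up([33.3, 33.3, 33.3]))   # rounding only
print(percentages_add_up([60.0, 30.0, 5.0]))    # a category is missing
```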



Lying with statistics: Statistics are often abused. Benjamin Disraeli, the 19th century British Prime Minister, is credited by Mark Twain with the statement ‘There are three kinds of lies: lies, damned lies, and statistics’, to which Frederick Mosteller, Harvard Professor of Statistics, replied: ‘It is easy to lie with statistics but it is easier to lie without them’ (Demetri Kantarelis: ‘Essentials of Inferential Statistics’, McGraw-Hill NY 1996). Figures never lie but liars figure.


Fallacies of numerical reasoning: Information presented as numerical and scientific may be based on false or wrong numerical reasoning. Statistics can be abused by selection of a favorable rate and ignoring unfavorable ones. The author may 'play' either with the numerator or the denominator. The numerator and the denominator can be made wider or narrower to give false and misleading rates. The numerator may be given without a denominator or an inappropriate denominator may be used. It is misleading to make comparative statements without specifying a comparison group or using an inappropriate comparison...
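To illustrate how the choice of denominator alone changes a rate, the sketch below computes the same numerator against a wide and a narrow denominator. The function name `rate_per` and all counts are hypothetical and were invented for illustration.

```python
def rate_per(numerator, denominator, base=100_000):
    """Return the rate of `numerator` events per `base` units of `denominator`."""
    if denominator <= 0:
        raise ValueError("a rate needs a positive denominator")
    return numerator * base / denominator


# Hypothetical counts, not taken from any study.
cases = 40
whole_population = 100_000      # wide denominator: everyone in the district
population_at_risk = 20_000     # narrow denominator: only those who can develop the condition

rate_wide = rate_per(cases, whole_population)
rate_narrow = rate_per(cases, population_at_risk)

print(f"{rate_wide:.0f} per 100,000 using the whole population")
print(f"{rate_narrow:.0f} per 100,000 using the population at risk")
```

The two figures differ five-fold although the number of cases is identical, which is why a rate should always be reported together with the denominator it was computed from.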


  • Data  
  • Information
  • Knowledge
  • Understanding
  • Wisdom



  • William of Ockham (1285-1349M)
  • Prophet Muhammad



  • Continuous data: mean, median
  • Discrete data: proportion (%), rate, ratio



  • Continuous data: t, F
  • Discrete data: Chi square
  • Both: Regression & correlation



  • Risk Ratio
  • Odds Ratio



  • State actual value and not < or >
  • Interpretation of p value



  • Numerator/denominator problems
  • Inappropriate comparison
  • No comparison

Professor Omar Hasan Kasule Sr. February 2003