Test of Reliability

Reliability is an essential element of test quality. A measuring instrument is reliable if it provides consistent results. However, a reliable instrument need not be valid: a clock that runs nonstop is reliable, but it is not necessarily showing the correct time. Reliability concerns consistency, or the reproducibility of results: if a test is administered to the same subject on two occasions, the same conclusions should be reached both times. A test with poor reliability, by contrast, yields markedly different scores for the same examinee on repeated administrations of the same test.

If a test is valid, then it has to be reliable, but the converse is not true. Although reliability is not as valuable as validity, it is easier to assess. Reliability has two key aspects: stability and equivalence. The degree of stability can be determined by comparing the results of repeated measurements made with the same candidate and the same instrument. Equivalence concerns the amount of error introduced by different investigators, or by different samples of items, when the test is repeated. The best way to test for equivalence is for two investigators to compare their observations of the same events. Reliability can be improved in the following ways:

(i) By standardizing the measurement conditions so that external factors such as boredom, fatigue, etc. are minimized; this improves stability.

(ii) By providing detailed directions for measurement that can be applied consistently by trained and motivated persons, and by broadening the sample of items used; this improves equivalence.
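For illustration, test-retest reliability can be estimated as the correlation between two administrations of the same test to the same examinees. The sketch below (the scores are hypothetical, chosen only to illustrate the computation) computes the Pearson correlation coefficient in plain Python:

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores of five examinees on two administrations of the same test.
first_administration  = [72, 85, 60, 90, 78]
second_administration = [70, 88, 62, 91, 75]

# Values near 1 indicate high test-retest reliability (stability);
# values near 0 indicate poor reliability.
r = pearson_r(first_administration, second_administration)
```

A stable instrument, administered under standardized conditions, should produce a coefficient close to 1 for the same group of examinees.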


Tests of Sound Measurement

While evaluating a measurement tool, three major considerations must be taken into account: validity, reliability and practicality. A sound measurement should fulfill all of these tests.

Test of Validity

It is the most important criterion. It indicates the degree to which an instrument measures what it is supposed to measure. There are three types of validity: Content validity, Criterion-related validity, and Construct validity.

Content validity refers to the extent to which a measuring instrument adequately covers the topic under study. Its determination is largely judgmental and intuitive and cannot be expressed in numerical terms. It can also be assessed by a panel of judges who evaluate how well the instrument meets the relevant standards.

Criterion-related validity refers to our ability to predict some outcome or estimate the existence of a current condition. It reflects the success of measures used for empirical estimating purposes, and is expressed as the coefficient of correlation between the test scores and the scores on the criterion measure. The concerned criterion must possess the following characteristics:

  • Relevance: A criterion is relevant when it is defined in terms judged to be the proper measures of the attribute in question.
  • Unbiased: A criterion is unbiased when it gives each subject an equal opportunity to score well.
  • Reliability: A criterion is reliable when it is stable or reproducible.
  • Availability: The information specified by the criterion should be easily obtainable.

Construct validity is the most complex and abstract type. It is the extent to which test scores can be accounted for by the explanatory constructs of a sound theory. Determining it requires associating a set of other propositions with the results obtained from the measuring instrument. If the measurements correlate with those other propositions as predicted, it can be concluded that some degree of construct validity exists.

If the above criteria are met, we may conclude that our measuring instrument is valid and provides correct measurements; if not, we may have to seek further information and/or rely on judgment.


Criteria of Good Research

Although research works and studies differ in form and kind, they all meet on the common ground of the scientific method. Hence, scientific research is expected to satisfy the following criteria:

i.  The aim of the research should be clearly stated, using common concepts.

ii.  The procedures used in the research should be adequately described, in order to permit another researcher to repeat the research for further advancement, while maintaining the continuity of what has already been done.

iii.  The research's procedural design should be carefully planned to obtain results that are as objective as possible.

iv.  Any flaws in the procedural design should be frankly reported by the researcher so that their effects upon the findings can be correctly estimated.

v.  The data analysis should be adequate to reveal its significance.

vi.  The methods used during the analysis should be appropriate.

vii.  The reliability and validity of the concerned data should be checked carefully.

viii. The conclusions should be confined to those justified by the data of the research and limited to those for which the data provide an adequate basis.

ix.  Greater confidence in the research is warranted if the researcher is experienced and has a good reputation in the field.


In other words, the qualities of good research can be stated as follows:

1)  Systematic - Research is structured into specified steps that are followed in a specified sequence, according to a well-defined set of rules. The systematic character of research does not rule out creative thinking, but it does discourage reliance on guessing and intuition to arrive at conclusions.

2)  Logical - Research is guided by the rules of logical reasoning, and the logical processes of induction and deduction are essential to it. Induction is the process of reasoning from a part to the whole, while deduction is the process of reasoning from a premise to a conclusion that follows from that premise. Logical reasoning also makes research more meaningful in the context of decision making.

3)  Empirical - Research is basically related to one or more aspects of a real situation. It deals with concrete data, which provide a basis for the external validity of research results.

4)  Replicable - Research results should allow verification through replication of the study, thereby building a sound basis for decisions.