Item analysis can be treated under three heads:
I         Item selection
II      Item difficulty
III     Item validity
I.                   ITEM SELECTION
The choice of an item depends, in the first instance, upon the judgement of competent persons as to its suitability for the purpose of the test. Certain types of items have proved to be generally useful in intelligence examinations. Problems in mental arithmetic, vocabulary, analogies and number series completion, for example, are found over and over again. So also are items requiring generalization, interpretation and the ability to see relations. The validity of the items in most tests of educational achievement depends, as a first step, upon the consensus of teachers and educators as to the adequacy of the material included. Courses of study, grade requirements and curricula from various parts of the country are carefully culled over by test makers in order to determine what content should be included in the various subject fields.
Items chosen for aptitude tests and for tests in special fields, and items used in personal data sheets and in interest and attitude tests, are selected in the same manner. Such questions represent a consensus of experts as to the most relevant problems in the area sampled.
II.                ITEM DIFFICULTY
The difficulty of an item may be determined in several ways:
1.     By the judgement of competent people who rank the items in order of difficulty.
2.     By how quickly the item can be solved.
3.     By the number of examinees in the group who get the item right.
The first two procedures are usually a first step when the items are for use in special aptitude tests, in performance tests, and in areas where qualitative distinctions and opinions must serve as criteria.
The proportion (p) passing an item is an index of item difficulty. If 90% of a standard group pass an item, it is easy; if only 10% pass, the item is hard.
When ‘p’ = the proportion passing an item and ‘q’ = the proportion failing, it can be shown that the SD of the item is $\sqrt{pq}$ and its variance ($\sigma^2$) is $pq$.
When p = .50 and q = .50, the item variance is .25. This is the maximum variance which an item can have. Hence an item with a difficulty index of .50 (p = .50) brings out more individual differences than a harder or easier item. In general, as ‘p’ drops below .50 or goes above .50, the variance of the item steadily decreases. Thus an item passed by 60% has a variance of .24, and an item passed by 90% and failed by 10% has a variance of .09.
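The variance figures quoted above can be checked with a few lines of Python. This is only an illustrative sketch: the function names `item_variance` and `item_sd` are chosen here for convenience and do not come from the text.

```python
def item_variance(p):
    """Variance of a pass/fail item, where p is the proportion passing."""
    q = 1.0 - p  # proportion failing
    return p * q


def item_sd(p):
    """Standard deviation of the item, the square root of pq."""
    return item_variance(p) ** 0.5


# Tabulate a few difficulty levels, including those mentioned in the text.
for p in (0.10, 0.25, 0.50, 0.60, 0.90):
    print(f"p = {p:.2f}  variance = {item_variance(p):.4f}  SD = {item_sd(p):.4f}")
```

The table printed by the loop shows the variance rising toward p = .50 and falling away symmetrically on either side.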
In item selection, not only must the individual item difficulty be considered, but the intercorrelations of the items as well. For a test of only 50 items, for example, there would be 50 x 49/2 or 1225 tetrachoric r's to compute. If the items of a test all correlate +1.00, then a single item will do the work of all.
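The count of inter-item correlations is simply the number of distinct pairs of items, n(n - 1)/2. A short sketch (the helper name `n_item_pairs` is hypothetical) confirms the figure for a 50-item test:

```python
from math import comb


def n_item_pairs(n_items):
    """Number of distinct item pairs, i.e. correlations to compute."""
    return comb(n_items, 2)


print(n_item_pairs(50))  # 1225
```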
In the absence of precise knowledge concerning item correlations, it is impossible to say exactly what is the best distribution of item difficulties. There is agreement among test makers, however, that (1) for the sharpest discrimination among examinees, items should be around 50% in difficulty; that (2) when a certain proportion of the group (the upper 25%, for example) is to be separated from the remainder (the lower 75%), difficulty indices should lie close to the cutting point. Finally, (3) when item correlations are high (as is true in most educational achievement tests) and the range of talent is wide, difficulty indices may range from high to low. The normal curve can be taken as a guide in the selection of difficulty indices. Thus 50% of the items might have difficulty indices between .25 and .75, 25% indices larger than .75, and 25% indices smaller than .25. An item passed by 0% or 100% has no differentiating value, of course, but such items may be included in a test solely for the psychological effect. Difficulty indices within narrower ranges may, of course, be taken from the normal curve.
It is important to try to estimate the number of examinees who get the right answer through correct knowledge or correct reasoning, and to rule out answers which are based upon guesswork. In correcting for chance success we assume that (1) wrong answers are due to absence of knowledge and that (2) to one who does not know, all of the options are equally attractive. Under these assumptions it is reasonable to expect that some of those who really did not know the right answer selected it by chance. A formula for correcting the difficulty index of an item for chance success is the following:
$$P_c = \frac{R - \frac{W}{K - 1}}{N - HR}$$
in which
Pc      = the percent who actually know the right answer
R        = the number who get the right answer
W       = the number who get the wrong answer
N        = the number of examinees in the sample
HR    = the number of examinees who do not reach the item
K        = the number of options or choices
To illustrate, suppose that a sample of 300 examinees take a test of 100 items, each item having 5 options. Suppose further that 150 answer item #48 correctly, that 120 answer it incorrectly, and that 30 do not reach the item and hence do not attempt it in the time limit. Instead of a difficulty index of .50, item #48 has a corrected difficulty index of .44. Thus
$$P_c = \frac{150 - \frac{120}{5 - 1}}{300 - 30} = \frac{120}{270} = .44$$
The corrected value of the difficulty index is, to be sure, an approximation, but it probably gives a more nearly true measure than does the experimentally obtained percentage.
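The worked example can be reproduced in code. The sketch below assumes the usual correction-for-guessing form, Pc = (R - W/(K - 1)) / (N - HR), which yields the corrected index of .44 quoted in the text; the function and parameter names are illustrative only.

```python
def corrected_difficulty(right, wrong, n_examinees, not_reached, n_options):
    """Difficulty index corrected for chance success.

    Assumes wrong answers reflect absence of knowledge and that all
    options are equally attractive to an examinee who does not know.
    """
    if n_options < 2:
        raise ValueError("an item needs at least two options")
    attempted = n_examinees - not_reached
    if attempted <= 0:
        raise ValueError("no examinee reached the item")
    # Each wrong answer implies 1/(K - 1) lucky right answers on average.
    return (right - wrong / (n_options - 1)) / attempted


raw = 150 / 300
corrected = corrected_difficulty(right=150, wrong=120, n_examinees=300,
                                 not_reached=30, n_options=5)
print(round(raw, 2), round(corrected, 2))  # 0.5 0.44
```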
III.                ITEM VALIDITY
The validity index (discriminative power) of an item is determined by the extent to which the given item discriminates among examinees who differ sharply in the function measured by the test as a whole. A number of methods have been devised for determining the discriminative power of an item, but biserial correlation is usually regarded as the standard procedure in item analysis. Biserial ‘r’ gives the correlation of an item with total score on the test, or with scores on some independent criterion. The adequacy of other methods is judged by the degree to which they yield results which approximate those obtained by biserial correlation.
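As a rough illustration of the biserial correlation between an item and total score, the sketch below uses the standard computing formula r_bis = ((M_p - M_q) / SD) x (pq / y), where M_p and M_q are the mean total scores of those passing and failing the item, SD is the standard deviation of all total scores, and y is the ordinate of the unit normal curve at the point dividing p from q. The function name and the sample data are hypothetical.

```python
from statistics import NormalDist, fmean, pstdev


def biserial_r(item_passed, total_scores):
    """Biserial correlation of a pass/fail item with total test score."""
    if len(item_passed) != len(total_scores):
        raise ValueError("inputs must have the same length")
    passed = [s for ok, s in zip(item_passed, total_scores) if ok]
    failed = [s for ok, s in zip(item_passed, total_scores) if not ok]
    if not passed or not failed:
        raise ValueError("item must be passed by some and failed by some")
    sd = pstdev(total_scores)
    if sd == 0:
        raise ValueError("total scores have no variance")
    p = len(passed) / len(total_scores)
    q = 1.0 - p
    # Ordinate of the unit normal curve at the point cutting off proportion p.
    unit_normal = NormalDist()
    y = unit_normal.pdf(unit_normal.inv_cdf(p))
    return (fmean(passed) - fmean(failed)) / sd * (p * q / y)


# Hypothetical data: 1 = passed the item, 0 = failed.
flags = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]
scores = [48, 45, 44, 41, 40, 37, 35, 33, 30, 27]
print(round(biserial_r(flags, scores), 3))
```

Because the formula assumes a normally distributed trait underlying the pass/fail split, values computed on small or markedly non-normal samples can fall outside the range -1 to +1.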
One method of determining validity indices, much favoured by test makers, sets up extreme groups in computing the validity of an item. The procedure is as follows:
1.     Arrange the test papers in order of size of test score, putting the paper with the highest score on top.
2.     Assume that we have 200 examinees. Count off the top 27% of the papers and the bottom 27%. This puts 54 papers in the first pile and 54 in the second.
3.     Lay aside the middle 92 papers. These are used simply to mark off the two end groups.
4.     Tally the number in the top group who pass each item on the test, and the number in the bottom group who pass each item. Convert these numbers into percentages.
5.     Correct these percentages for chance success.
6.     Enter the table of biserial r with the percentages of success in the two groups, and read the biserial r from the intersecting column and row in the body of the table.
7.     Average the two percentages to find the difficulty index of the item.
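The extreme-groups procedure can be sketched in Python as below. Two departures from the steps above should be noted: the correction for chance is omitted, and since the biserial r table is not reproduced here, the code reports the simple difference between the upper and lower proportions passing as a stand-in discrimination index, which is not the tabled biserial r. The function name, dictionary keys and sample data are all hypothetical.

```python
def extreme_group_indices(responses, fraction=0.27):
    """Upper/lower-group statistics for each item.

    responses: one list per examinee of 0/1 item scores.
    Returns one dict per item with the proportions passing in the
    upper and lower groups, their average (difficulty index) and
    their difference (a simple discrimination index).
    """
    n_examinees = len(responses)
    group_size = round(n_examinees * fraction)
    if group_size < 1:
        raise ValueError("sample too small for the chosen fraction")
    # Highest total score first, as when stacking the papers.
    ranked = sorted(responses, key=sum, reverse=True)
    upper, lower = ranked[:group_size], ranked[-group_size:]
    results = []
    for i in range(len(responses[0])):
        p_upper = sum(paper[i] for paper in upper) / group_size
        p_lower = sum(paper[i] for paper in lower) / group_size
        results.append({
            "p_upper": p_upper,
            "p_lower": p_lower,
            "difficulty": (p_upper + p_lower) / 2,
            "difference": p_upper - p_lower,
        })
    return results


# Hypothetical sample: 200 examinees, 3 items of decreasing difficulty.
sample = [[int(j < 100), int(j < 150), 1] for j in range(200)]
for number, stats in enumerate(extreme_group_indices(sample), start=1):
    print(number, stats)
```

With 200 examinees the default fraction gives 54 papers in each end group, matching step 2.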
The validity of a completed test should always be computed on a new sample, i.e., one different from that used in the item analysis. This process is called cross-validation.
The validity of a test, when computed from the standardization sample, tends to be spuriously high.

The effect of chance factors upon validity can be shown in the following way. Suppose that the items on an aptitude test, specially designed for retail salesmen, have been so selected as to yield satisfactory validity indices in terms of the top and bottom 27% of a standard sample of sales personnel. Many irrelevant factors are likely to be present in this group, some of which will be correlated with scores on the test. Such factors will often be correlated with responses to the items in one extreme group more often than in the other. The validity coefficient of the final test will therefore be lower in new groups than in the original standardization group. Validity correlations tend always to be spuriously high in the standardization group, thus making cross-validation necessary.

