SPM+
(Standard Progressive Matrices) |
A nonverbal test of ability (NOT a test of nonverbal ability), e.g. making meaning out of confusion, handling complexity.
Structure:
|
MHV
(Mill Hill Vocabulary)
|
A verbal measure of general ability. Designed for use with the SPM+.
Structure:
|
SPM+/MHV |
= Brief nonverbal and verbal screening measures of general ability.
Provide a fairer assessment of:
When used in combination, SPM+ and MHV can measure the difference between verbal and nonverbal aspects of general ability. |
Other forms of Raven’s Progressive Matrices | Coloured Progressive Matrices (CPM):
Advanced Progressive Matrices (APM):
|
Other forms of Raven’s Vocabulary
|
Crichton Vocabulary Scale (CVS):
|
Raven’s
Educational UK: SPM+/MHV (2008 revision) |
Significant changes from previous editions:
|
DEVELOPMENT OF THE RAVEN’S EDUCATIONAL UK: SPM+/MHV | |
Historical background | Raven’s Progressive Matrices and Vocabulary Scales were created to assess g (the general factor proposed by Spearman, 1927) and its two components, eductive ability and reproductive ability, in a simple and unambiguous way. |
Eductive ability | Making meaning out of confusion, developing new insights, thinking beyond the obvious, forming new constructs in order to handle complex problems. Mostly nonverbal. |
Reproductive ability | Mastering, recalling, and reproducing material (explicit, verbalised knowledge). Mostly verbal. |
THE RELATIONSHIP AND DISTINCTIONS BETWEEN EDUCTIVE ABILITY, G, INTELLIGENCE, GENERAL INTELLIGENCE AND OTHER BASIC CONCEPTS | |
Eductive Ability and g | Too often too much explanatory power is given to the broad concept of g, making it a less useful idea, while the term eductive ability suggests a more limited and concrete domain of utility. |
Eductive Ability and “Intelligence” |
The Raven’s Progressive Matrices were never intended to be used on their own as a measure of g or general intelligence, but it has often been demonstrated that the Matrices are one of the best single measures of g.
Correlations between Raven’s and full-length intelligence measures tend to range between 0.6 and 0.8 for the Matrices and between 0.8 and 0.95 for the MHV, suggesting that full-length intelligence tests are primarily tests of reproductive ability. |
General Intelligence and g | Spearman never intended that g would be synonymous with all the abilities required for intelligent behaviour, nor that g or intelligence could be interchangeable with ability.
General intelligence requires many abilities (e.g. making meaning out of confusion) but also requires judgement and specialist information. |
DEVELOPMENT OF THE PROGRESSIVE MATRICES | |
Raven’s Progressive Matrices
|
Raven’s Progressive Matrices have been in use for more than 70 years. The first series were based on a test used by Spearman.
1938 – Standard Progressive Matrices (SPM)
1941 – Advanced Progressive Matrices (APM)
1947 – Coloured Progressive Matrices (CPM)
Each version has been revised since its initial publication, incl. shortening and re-sequencing (the APM) and developing parallel versions (SPM and CPM).
The SPM+: One of the most significant developments of the Progressive Matrices. Developed as a consequence of evidence of the Flynn effect, i.e. that eductive ability had been increasing in the general population, producing a ceiling effect. The new developments therefore included more difficult items. Latest standardisation in 2008, completed in the UK with 926 children aged 7-18 years. |
DEVELOPMENT OF THE VOCABULARY SCALES | |
The MHV | The development of MHV began in the late 1930s. The words were sampled from a dictionary and arranged in increasing order of difficulty.
Many forms of the MHV are or have been available, e.g. MHV all open definitions, MHV short form, MHV multiple choice only. Latest standardisation in 2008, completed in the UK with 926 children aged 7-18 years. |
STANDARDISATIONS | |
SPM+ | The SPM+ has been standardised numerous times on different populations. The majority of standardisations have been completed on the Classic Form, from which the SPM+ has been developed.
1930s-1960s: Standardised in Ipswich. Extensive collection of adult norms during the Second World War, plus standardisation along with the MHV on schoolchildren. Additional accumulation of data on older adults in the 1940s. Check-up on accuracy of norms in the 1950s and 1960s.
1970s: Testing of 3700 Irish schoolchildren aged 6-12. A large-scale German standardisation. Testing of 3500 British schoolchildren (incl. special schools) aged 6-16.
1980s-1990s: Further data collected on various age groups in various countries, incl. USA, New Zealand, Australia, China, Switzerland, France and Great Britain, Belgium, USA and China (adults). Data for hard-of-hearing young people were also collected.
2002/2003: A nationally representative sample of 2755 Romanians aged 6-80 were administered the SPM+ in their homes. |
MHV | The MHV has been standardised numerous times on various populations.
1940s-1960s: Standardised on 5600 schoolchildren in Essex. Both the SPM+ and the MHV were administered as group tests to 1047 post office engineers, 1145 postal workers and 920 male employees of a photographic company, all aged 16-65.
1970s onwards: Revision of the MHV (1977) to reflect changes in item difficulty and the Flynn effect. 3500 young people aged 6-16 were tested in the UK. The standardisation showed no sex differences. Standardised on adults in Scotland (1992) and also in the USA and Belgium. |
CONTEMPORARY ISSUES FOR THE RAVEN’S PROGRESSIVE MATRICES AND VOCABULARY SCALES | |
The Flynn Effect |
Large rises in mean scores since initial publication (also seen in other psychometric tests, e.g. WISC). Much work on the Flynn Effect has come from analysis of Raven’s Progressive Matrices.
Flynn showed that, on average, IQ scores increased by 0.3 IQ points every year and had been doing so throughout most of the 20th century. He argues that the rise is due to the increasing influence of scientific ways of thinking. The Flynn Effect appears to be universal, with similar results being reported in over 14 countries. |
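The arithmetic behind the reported rate can be sketched as follows. This is only an illustration: it assumes the rise is constant and linear at 0.3 IQ points per year, and the function names are hypothetical helpers, not anything from the Raven's manual.

```python
# Rate reported in the text: about 0.3 IQ points per year.
# Treating it as constant over time is a simplifying assumption.
FLYNN_RATE_PER_YEAR = 0.3


def expected_mean_gain(years: float, rate_per_year: float = FLYNN_RATE_PER_YEAR) -> float:
    """Cumulative rise in mean IQ after `years`, assuming a constant linear rate."""
    return years * rate_per_year


def years_to_gain(points: float, rate_per_year: float = FLYNN_RATE_PER_YEAR) -> float:
    """Number of years needed for the mean to rise by `points` at the given rate."""
    return points / rate_per_year


if __name__ == "__main__":
    for years in (10, 30, 50):
        print(f"{years} years -> about {expected_mean_gain(years):.1f} IQ points")
```

Under these assumptions, norms collected 50 years earlier would lag the current mean by roughly 15 points, which is consistent with the ceiling effect that motivated the harder SPM+ items.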
EVIDENCE OF RELIABILITY | |
Group vs. Individual Administration of the SPM+ | Sample size = 105, ages 7-18
Setup: 2 groups. Group A was first administered the SPM+ in a group setting and then, after a time delay, individually. Group B was first administered the SPM+ individually and then, after a time delay, in a group setting.
Results: Higher raw scores were achieved when the SPM+ was administered individually first and in the group setting afterwards (Group B). This setup was also used to calculate the test-retest reliability (r = 0.833). |
SPM+/SPM-C Equating Study
|
Sample size = 109, ages 7-18
Setup: 2 groups. Group A first does the SPM+ and, after a time delay, the SPM-C, while Group B first does the SPM-C and, after a time delay, the SPM+.
Results: A previous administration of SPM+ or SPM-C has no effect on subsequent performance on SPM-C or SPM+. |
SPM+/SPM-P Equating Study | Sample size = 91, ages 8-18
Setup: As above in the SPM+/SPM-C equating study.
Results: Suggest that a previous administration of SPM+ or SPM-P should not affect subsequent performance on either SPM-P or SPM+. |
SPM+ Reliability | Split-half reliability: r = 0.936, n = 924
Test-retest reliability: r = 0.833, n = 105 |
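A split-half coefficient is usually obtained by correlating scores on two halves of the test and then stepping the result up to full test length with the Spearman-Brown formula. The sketch below shows that general technique on made-up toy data. The source does not say how the halves were formed or whether this correction was applied, so the odd/even split, the function names and the data are all assumptions for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_brown(r_half):
    """Step a half-test correlation up to an estimate for the full-length test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(item_scores):
    """item_scores: one list of 0/1 item scores per test taker.

    Splits items into odd-numbered (1st, 3rd, ...) and even-numbered
    (2nd, 4th, ...) halves, correlates the half totals, then applies
    the Spearman-Brown correction.
    """
    odd_totals = [sum(row[0::2]) for row in item_scores]
    even_totals = [sum(row[1::2]) for row in item_scores]
    return spearman_brown(pearson_r(odd_totals, even_totals))


# Toy data (invented for illustration): 6 test takers, 8 items each.
toy_scores = [
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 0, 0],
    [1, 1, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
]

toy_reliability = split_half_reliability(toy_scores)

if __name__ == "__main__":
    print(f"Toy split-half reliability: {toy_reliability:.3f}")
```

The test-retest figure, by contrast, is a plain correlation between two administrations, so `pearson_r` alone would be the relevant step there.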
SPM+ Standard Error of Measurement and Confidence Intervals | SEM = 3.79 (standardised scores)
95% confidence interval ≈ ±7 (standardised scores) |
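The reported SEM is consistent with the classical test theory formula SEM = SD × √(1 − r). The sketch below reproduces the figures under two assumptions that the source does not state: that standardised scores have a standard deviation of 15, and that the split-half reliability (0.936) is the coefficient used. The function names are hypothetical.

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory SEM: SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)


def confidence_half_width(sem: float, z: float = 1.96) -> float:
    """Half-width of a confidence band around an observed score (z = 1.96 for 95%)."""
    return z * sem


# Assumed values: SD of 15 for standardised scores, split-half r = 0.936.
spm_sem = standard_error_of_measurement(sd=15.0, reliability=0.936)
spm_half_width = confidence_half_width(spm_sem)

if __name__ == "__main__":
    print(f"SEM = {spm_sem:.2f}")
    print(f"95% band = observed score +/- {spm_half_width:.1f}")
```

With these inputs the SEM comes out at about 3.79 and the 95% band at about ±7.4 standardised score points, in line with the rounded value of 7 given here.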
MHV Reliability
|
Sample size = 171, ages 7-18
Setup: Group A is first administered Form 1 and, after a time delay, Form 2. Group B does it the other way around.
Results: Form 1 and Form 2 are parallel, and a previous administration of the MHV has little effect on the score obtained on a retest. These data also provided the test-retest reliability (r = 0.916) and the parallel forms reliability (r = 0.929). |
MHV Standard Error of Measurement and Confidence Intervals | SEM = 3.99
95% confidence interval ≈ ±8 |
Gender differences for the SPM+ and MHV | No significant sex differences have been found for either the SPM+ or the MHV, nor has any significant sex × age interaction been found. |
EVIDENCE OF VALIDITY | |
Different forms of validity
|
Content validity = whether the measure adequately represents relevant aspects of the construct being measured.
Construct validity = whether the test measures the construct it claims to measure. Evidence of construct validity often comes from factor analysis, expert reviews, multitrait-multimethod studies, clinical investigations etc.
Criterion-related validity = whether scores are shown to be related to some external criterion, e.g. performance on another measure. |
SPM+ | (Evidence of SPM+ validity also comes from SPM-C data as the two are similar in form and content.)
Content validity: Item analysis shows that the properties of SPM+ are relatively stable. SPM-C has face validity in cross-cultural settings, i.e. its form is not culturally biased, and it appears to people from various cultures to be a measure of the basic ability to reason.
Construct validity: Developmental changes between age groups can be observed, as mean raw scores of children in adjacent age bands increase incrementally, thereby providing evidence of age-related validity. Evidence from Item Characteristic Curves shows that:
Raven’s is generally described as one of the best measures of g and fluid intelligence, with evidence of this coming from factor analytic studies and cross-cultural studies, all revealing high g loadings. However, some factor analytic studies also suggest that Raven’s measures other factors in addition to g. In particular, there is evidence of a spatial component.
Criterion-related validity: Concurrent validation of the SPM+ and the SPM-C shows a pooled correlation between the two forms of 0.797. Concurrent validation between SPM+ and SPM-P shows a pooled correlation of 0.830. In general, the concurrent and predictive validity of the SPM-C varies with age, possibly sex, homogeneity of the sample etc. Reliable correlations have been shown between the SPM-C and the Stanford-Binet and Wechsler scales, with correlations generally being lower for Verbal IQ/performance. Correlations between the SPM-C and performance on achievement and scholastic aptitude tests have generally been lower and more variable than correlations with intelligence tests. Overall, correlations (concurrent validity) between the SPM-C and measures of verbal and language abilities are lower than those with measures of maths and science skills. |