Customise the evaluation framework

Customise the evaluation framework to include measures important to your evaluation

Now you can tailor the evaluation framework to suit your specific STEM education initiative.

You can customise the evaluation to align with its purpose. Think about what evidence you will need to collect to answer your key evaluation question. You can place more measures under particular categories of the evaluation framework to focus on the areas your evaluation is most interested in:

  • If your evaluation is focused on seeing whether you could improve processes or implementation, your evaluation might have more measures under the design and implementation categories.
  • If your evaluation is focused on seeing whether the initiative has been effective, more measures and effort should be dedicated to the outcomes category. However, it is still important to cover design and implementation to check they were done well; if they weren't, they can become a bottleneck that limits impact. For example, an initiative that was poorly targeted or not delivered on time is less likely to succeed.

Your choice of measures customises the framework to be unique for your STEM education initiative.

Here is an evaluation framework with common measures for STEM education initiatives to start you off.

The following sections describe tips and tricks for choosing your measures under each evaluation framework category.

Choosing your key evaluation question

Take the common key evaluation question for all STEM education initiatives and customise it for your initiative:

  • Did [initiative name] achieve [intended initiative objective(s)]?

Ideally, this is easy because the objectives your initiative is trying to achieve will already be clear when you start the evaluation. If not:

  • Look at what was written down about the initiative when it started. For example, any planning documents.
  • If they are not documented, ask stakeholders what they think the intended objective(s) were. If stakeholders have a clear and shared understanding, use these objectives in the evaluation question.
  • If the objectives are still unclear, you may have to define them yourself. Use the guidance from this Toolkit’s sections about how to develop STEM education initiatives to determine the objective(s) before you begin the evaluation (even if this was not done before the initiative began).

Choosing your design measures

Design measures consider whether decisions made about the initiative set it up for success.

For design measures, do:

  • Consider all the key decisions that were made that influenced how an initiative was designed.
  • Consider decisions made by different groups (e.g. a school, a funder).
  • Ask whether there was an appropriate process or justification for key aspects of the design.

But don’t:

  • Go overboard investigating why every aspect of the initiative was set up the way it was. As long as your measures capture the key design aspects, this should be enough to identify strengths or concerns. You can always investigate further if a measure reveals something you want to dig into.

Choosing your implementation measures

Implementation measures look at whether the initiative’s rollout matched expectations.

For implementation measures, do:

  • Use the generic measures that are frequently used across most evaluations: e.g. on time, on budget.
  • Include measures that go beyond these and are specific to your initiative. For example, if a set of resources were designed to be taught over five lessons, were they taught over five lessons?
  • Accept that sometimes things don’t go according to plan for good reason. These measures are not ‘pass / fail’. If something was implemented differently from how it was designed, you need to record the reason for this. When you look into it you might find that it makes sense.

But don’t:

  • Turn the evaluation into an audit. Ensure your measures are aimed at understanding whether implementation limited and / or enabled impact (i.e. answering the overall evaluation question). In other words, keep your eye on the main game.

Choosing your outputs measures

Outputs measures capture what the initiative has produced or delivered, as well as information on who (e.g. students / teachers) received what (e.g. mentoring / professional learning).

For outputs measures, do:

  • Include the types of measures that are relevant for your initiative and adjust these for what your initiative produced (e.g. things produced, number of people reached, or time spent).
  • Include multiple measures, such as the number of people the initiative reached and the time they spent. On its own, the number of people or the time spent might not be meaningful enough to help you understand potential impact. For example, a professional learning initiative could reach 50 teachers, but there is a significant difference between those teachers spending 2 hours on the professional learning and spending 20 hours (a small worked example follows below).

But don’t:

  • Exclude measures on demographics, as without these measures we are unable to tell whether the initiative ended up serving the students it intended to target.
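
To make the reach-and-time point concrete, here is a minimal sketch, in Python, of how you might combine the 'people reached' and 'time spent' output measures into a single exposure figure. The figures and the course names are hypothetical, drawn only from the 50-teacher example above; a spreadsheet works just as well.

    # Minimal sketch: combining 'people reached' and 'time spent' output measures.
    # The figures below are hypothetical, based on the 50-teacher example above.

    def exposure_hours(people_reached: int, hours_per_person: float) -> float:
        """Total participant-hours delivered by the initiative."""
        return people_reached * hours_per_person

    # Two hypothetical versions of the same professional learning initiative.
    short_course = exposure_hours(people_reached=50, hours_per_person=2)    # 100 teacher-hours
    long_course = exposure_hours(people_reached=50, hours_per_person=20)    # 1,000 teacher-hours

    print(f"Short course exposure: {short_course} teacher-hours")
    print(f"Long course exposure: {long_course} teacher-hours")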

Choosing your outcomes measures

Outcomes measures capture the initiative’s impacts, effects and consequences: the changes it has brought about, rather than what it has delivered. Identifying these measures is the most complex but also the most important step in customising your evaluation framework. They should be specific to your initiative and what it was trying to achieve.

There are four standard steps to follow:

  1. Determine whether the initiative is aiming to improve engagement, achievement, or both.
  2. Consider the standard direct measures for each of these outcomes (e.g. attentiveness in class for engagement, or test results for achievement). However, also consider how you would know that students are more engaged or are achieving more than they would have without the initiative, and construct measures using these ideas.
  3. Determine whether your direct measures provide sufficient information to make conclusions about the initiative’s impact.
  4. If you need to include inferred measures, imagine the outcome you would ideally like to measure (e.g. students’ ability to apply communication and critical thinking skills) and consider what would be the best predictor of this outcome (e.g. teacher feedback about whether students are asking tougher questions in class).

For outcomes measures, do:

  • Think about students. Frame your outcomes so they are focused on student outcomes. Even if the initiative is about equipment or teacher professional learning, consider why it exists: is it to improve teaching to better engage students, to help with student learning (achievement), or both?
  • Choose several measures. Focus on prioritising direct measures if you can, but also include some measures that rely on inferences. This is so both types of measures can be compared to provide a deeper picture of the initiative’s impact.
  • Identify when measures are inferences, as this will be important to whoever makes a decision based on your evaluation (one way to record this is sketched below).

But don’t:

  • Measure teacher engagement only. This would have a specific focus on teachers rather than thinking about the ultimate goal — improving student outcomes through teachers. However, you might use this as a measure to make an inference, such as “improved teacher engagement leads to more engaged students”.
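
If it helps to keep track of which measures are direct and which rely on inferences, here is a minimal sketch, in Python, of one way to record that alongside each outcome measure. The measure wording is taken from the example framework later on this page; the structure itself, and the way each measure is classified, are purely illustrative rather than a required format.

    # Minimal sketch: recording outcome measures and flagging which are inferred.
    # Measure wording comes from the example framework; classifications are illustrative only.
    from dataclasses import dataclass

    @dataclass
    class OutcomeMeasure:
        question: str
        outcome: str        # "engagement", "achievement", or "both"
        is_inferred: bool   # True for proxy measures, False for direct measures

    measures = [
        OutcomeMeasure("Are students more engaged and attentive in class?", "engagement", False),
        OutcomeMeasure("Are students' results improving?", "achievement", False),
        OutcomeMeasure("Do teachers feel more confident since undertaking professional development?",
                       "both", True),
    ]

    # Make the inferred measures easy to see for whoever reads the evaluation.
    for m in measures:
        label = "inferred (proxy)" if m.is_inferred else "direct"
        print(f"[{label}] {m.outcome}: {m.question}")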

The following evaluation framework is designed to be a starting guide for you to use. It contains common measures for STEM education initiatives.

Key evaluation question: Did the initiative achieve its intended outcomes?

Design: Does the initiative's design set it up for success?

Potential measures for design

Were decisions on the following backed by evidence / information?

  • What problem should the initiative aim to solve / what objective should it achieve?
  • Who should be the focus of the initiative?
  • What type of initiative should be used?
  • How should the initiative be implemented?
    • Was the rollout designed to get the best uptake / deliver the best quality initiative?
  • Should the initiative reach out to partners?
  • Did the resources / activities align with the curriculum?

Implementation: How has the initiative been implemented in practice?

Potential measures for rollout

  • Did the initiative try to engage the target population?
  • Was the rollout carried out in a way that got the best uptake / delivered the best quality initiative?
  • Did the initiative deliver on its intentions to:
    • Develop x resources
    • Host x visits
    • Deliver the number and type of experts or providers planned (e.g. PhD science students or experienced teacher professional learning mentors)
    • Deliver professional learning as part of the initiative

Generic program measures

Did the initiative:

  • Get developed and delivered on time
  • Get developed and delivered on budget
  • Comply with appropriate probity / process
  • Establish appropriate governance structures
  • Complete monitoring / reporting
  • Communicate and engage with stakeholders (e.g. other year level or subject teachers, parents).

Outputs: What has the initiative produced or delivered?

Potential measures for things produced

How many of the following did the initiative produce, e.g.:

  • Teaching / student resources
  • Products or resources delivered to schools
  • Experts for visiting schools
  • Experts for coaching / mentoring teachers
  • Events or expos held or visited
  • Teacher professional learning sessions
  • Lessons observed / led by experts or other teachers

Potential measures for people reached

How many of the following did the initiative reach, e.g.:

  • Student attendees
  • Teacher attendees
  • Parent attendees
  • Page visits on a website

Potential measures for time spent

How much time was spent doing initiative activities, e.g.:

  • Student hours in activity / using resources / equipment
  • Teacher hours spent in professional learning
  • Hours spent mentoring other teachers or observing their lessons

Potential measures about who received the initiative

Who received the initiative in practice?

For students, by:

  • Student year level
  • Student ability
  • Demographics, e.g. SES, gender, ethnicity, geographic location

For teachers, by:

  • Year level they teach
  • Subjects they teach
  • Years of experience teaching (in general, and for specific subjects)
  • In-field / out-of-field experience

Outcomes: What impacts or consequences did the initiative have for students?

Direct measures of engagement and achievement

Direct measures of engagement

  • Are students more engaged and attentive in class?
    • Particularly students in the target population?
  • Are enrolments in STEM subjects increasing?
    • In the short term?
    • In the long term?

Direct measures of achievement

  • Are students’ results improving?
  • Are students’ skills improving (e.g. through critical thinking skills test)?
  • Do teachers / employers have positive feedback on achievement / performance of students?
  • Is students’ depth of understanding in their subjects improving?
  • Are students retaining concepts over time?

Proxies to measure engagement and achievement

Potential behaviours to use as proxies

Are students or teachers demonstrating best practice behaviours or actions, e.g.:

  • Teachers integrating technology in the classroom?
  • Teachers adapting material / resources / skills from professional learning for their classes’ abilities?
  • Teachers enabling inquiry-based learning?
  • Students asking and answering questions that demonstrate a deeper understanding and interest?

Potential beliefs to use as proxies

Are students or teachers demonstrating beliefs that indicate engagement or achievement outcomes, e.g.:

  • Do teachers feel more confident since undertaking professional development?
  • Are students less anxious about certain topics or subjects?
  • Do teachers believe students’ proficiency has developed?
  • Do students have aspirations to take STEM subjects later in their education, or to pursue STEM careers?

You can make your evaluation framework as large and complex or as short and simple as you want. The real examples below show how the evaluation framework can be adapted.

Here is a template for an evaluation framework which you can use and populate. The template also has a list of different measures that you might want to use.

Case study

The first example is an evaluation framework based on the STEMship initiative. STEMship is a STEM-based vocational education and training (VET) program for senior secondary students in the Hunter region of New South Wales. STEMship offers students TAFE NSW units of competency, industry visits and targeted work placements.

STEMship uses a relatively small number of measures to evaluate its outcomes. This is fine because the measures (apprenticeships, VET courses and jobs) still tell the evaluator whether the initiative improved the talent pool and bridged the gap between school and university programs. If you do not have the time or resources to run a large evaluation, you can have a smaller number of measures. But your measures must help you answer whether the initiative achieved its objective.

Key evaluation question: Did the STEMship pilot create a highly skilled, job-ready talent pool and bridge the gap between secondary school and university integrated STEM programs?

Design: Does the initiative’s design set it up for success?

Measures for design

Were decisions on the following backed by evidence / information?

  • What problem was the initiative aiming to solve?
  • Who should be the focus of the initiative?

Implementation: How has the initiative been implemented in practice?

Measures for rollout

  • Did the initiative try to engage the target population?
  • Did the initiative deliver on its intentions to:
    • Provide industry site tours
    • Provide x places for students
    • Involve regional organisations

Generic program measures

Did the initiative:

  • Get developed and delivered on time?
  • Get developed and delivered on budget?
  • Communicate and engage with stakeholders?

Outputs: What has the initiative produced or delivered?

Measures of things produced

  • Number of students who completed STEMship and gained the qualification
  • Number of work placements that took place

Measures of people reached

  • Number of student applications to take part in the course
  • Number of students who started STEMship

Measures about who received the initiative

Who received the initiative in practice, by:

  • Student year level
  • Student ability
  • Demographics, e.g. SES, gender, ethnicity, geographic location

Outcomes: What impacts or consequences did the initiative have for students?

Direct measures

Direct measures of achievement

  • How many students gained apprenticeships following STEMship?
  • How many students continued into VET courses following STEMship?
  • How many students gained full-time employment following STEMship?

Proxies to measure engagement and achievement

Beliefs used as proxies

  • Did students give positive feedback on the training?
  • Did students give positive feedback on their work placement?
  • Are students aware of possible employment pathways in the Hunter region?