Big data and machine learning make statistics knowledge more important than ever

Dr Jody Muelaner

It's going to be more and more important for engineers to understand statistics (Credit: Shutterstock)

Engineers need a strong foundation in mathematics.

In fact, one definition of engineering is the “application of mathematics and science to real-world problems”. Most mechanical engineers have a good understanding of pure mathematics, especially calculus. These are the skills required for the traditional work of modelling structures, dynamics, fluid flow and heat transfer.

However, mechanical engineers are increasingly being expected to understand risks, improve quality in manufacturing systems and make technical business decisions. 

These are areas where statistics is much more important. Added to this, understanding big data and utilising machine learning both require advanced statistical methods.

Aid to decision making

Making quantitative business and financial decisions often involves statistics. Examples include using expected costs and returns to make financial decisions involving uncertainty and risk. Quality engineers have long understood the importance of statistics in predicting defect rates and production capability. As industry moves to a more scientific approach, based on uncertainty and not just variation, a deeper understanding of statistics is required. 
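As a rough illustration of using expected costs to decide under uncertainty, the Python sketch below weights each possible cost by its probability and compares two options. The function name, the two options and every figure are hypothetical, invented purely for this example.

```python
# Hypothetical figures for illustration only; none come from the article.
def expected_cost(outcomes):
    """Return the probability-weighted cost of (probability, cost) pairs."""
    total_probability = sum(p for p, _ in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * cost for p, cost in outcomes)

# Option A: cheaper process with a 5% chance of a 20,000 rework bill.
option_a = [(0.95, 1_000.0), (0.05, 21_000.0)]
# Option B: dearer process with a 1% chance of the same rework bill.
option_b = [(0.99, 1_500.0), (0.01, 21_500.0)]

cost_a = expected_cost(option_a)
cost_b = expected_cost(option_b)
print(f"Expected cost A: {cost_a:.0f}, expected cost B: {cost_b:.0f}")
```

With these made-up numbers the dearer process has the lower expected cost, because the reduced chance of rework outweighs the higher base price. A real decision would also consider the spread of outcomes, not just the average.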

Increasingly, numerical methods are used to solve statistical questions. This usually means some form of Monte Carlo simulation, in which random number generators are used to simulate random effects. Simulating an event many times means that the variation can be measured, much as it would be if real samples were taken during an experiment.

In everyday life, the most common random number generators are dice, so dice are often used to represent Monte Carlo simulations. In real simulations, random number generators are software functions within a program. These digital random number generators can produce values from a given probability distribution, with millions of values created in under a second. This enables many complex effects to be simulated far more quickly than actual experiments could be performed.
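A minimal sketch of the idea, assuming a hypothetical assembly of three stacked parts whose lengths vary normally: the software random number generator draws each part length from its distribution, and repeating the draw many times lets the variation in the total be measured. The nominal lengths, standard deviations and function name are invented for the example.

```python
import random
import statistics

def simulate_stack(n_trials, seed=0):
    """Simulate the total length of three stacked parts, n_trials times."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    # (nominal length in mm, standard deviation in mm): hypothetical values
    parts = [(10.0, 0.02), (25.0, 0.05), (5.0, 0.01)]
    return [sum(rng.gauss(mu, sigma) for mu, sigma in parts)
            for _ in range(n_trials)]

totals = simulate_stack(50_000)
mean_total = statistics.fmean(totals)
std_total = statistics.stdev(totals)
print(f"mean = {mean_total:.4f} mm, standard deviation = {std_total:.4f} mm")
```

For independent normal variations the result can be checked by hand, since the standard deviations combine as the root sum of squares (about 0.055 mm here). The simulation approach becomes worthwhile when the model is too complex for such a formula.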

Modern machine learning, or artificial intelligence, is predominantly based on deep learning, a type of artificial neural network with more than three layers. Neural networks recognise patterns in data. Simple patterns in data with only a few dimensions can be identified using regression, often thought of as fitting a curve to two-dimensional data. Machine learning is able to identify much more complex patterns in data with many variables, but the statistical theory has a great deal in common with other forms of regression. 
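The simplest case mentioned above, fitting a straight line to two-dimensional data, can be sketched in a few lines of Python using ordinary least squares. The load and deflection readings are hypothetical, as is the function name.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared and cross deviations about the means
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical measurements: load (kN) against deflection (mm)
loads = [1.0, 2.0, 3.0, 4.0, 5.0]
deflections = [0.52, 0.98, 1.55, 2.01, 2.49]
slope, intercept = fit_line(loads, deflections)
print(f"deflection = {slope:.3f} * load + {intercept:.3f}")
```

A neural network is trained on the same principle of adjusting parameters to reduce the error between prediction and data, only with far more parameters and a more flexible model than a straight line.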

In deep-learning networks, each layer of nodes takes the previous layer’s output as its input and is set to train on specific features. The deeper layers perform more complex feature recognition, aggregating and recombining features from previous layers. 
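The layer-by-layer flow can be sketched as follows. This is only the forward pass of a tiny, untrained network, with hypothetical weights and a tanh activation chosen for the example; it does not show how training sets the weights.

```python
import math

def dense_layer(inputs, weights, biases):
    """One fully connected layer with a tanh activation."""
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, layers):
    """Feed inputs through each layer in turn."""
    activations = inputs
    for weights, biases in layers:
        # Each layer takes the previous layer's output as its input
        activations = dense_layer(activations, weights, biases)
    return activations

# Hypothetical, untrained weights: 3 inputs -> 2 hidden nodes -> 1 output
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
output = forward([1.0, 2.0, 3.0], layers)
print(output)
```

A practical deep-learning model would have many more layers and nodes, and would normally be built with a dedicated library instead of plain Python lists.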

Deep learning is showing huge potential in applications such as the optimisation of products and manufacturing processes, and predictive maintenance. The Internet of Things is making huge datasets available to engineers, and often the only way to make any sense of them is to use machine learning. At the same time, growing computer power is making machine learning increasingly affordable to deploy.

Adept at algebra

Understanding and using machine-learning algorithms, and especially deep-learning algorithms, requires a good knowledge of linear algebra. Many engineers are already familiar with the basics of linear algebra, working with matrices and vectors to model physical systems. However, for deep learning, more advanced linear algebra such as singular value decomposition and principal components analysis must be combined with probability, information theory and optimisation theory. 
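To give a flavour of principal components analysis, the sketch below finds the direction of greatest variation in some two-dimensional data. It uses power iteration on the covariance matrix, a simple stand-in chosen for this example, where production code would more likely call a library's singular value decomposition. The sensor readings and function name are hypothetical.

```python
import math

def first_principal_component(points, iterations=50):
    """Estimate the leading principal direction of 2D points by power iteration."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sample covariance matrix of the centred data
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    vx, vy = 1.0, 0.0  # arbitrary starting direction
    for _ in range(iterations):
        # Multiply by the covariance matrix, then renormalise
        wx = sxx * vx + sxy * vy
        wy = sxy * vx + syy * vy
        norm = math.hypot(wx, wy)
        vx, vy = wx / norm, wy / norm
    return vx, vy

# Hypothetical paired sensor readings that rise and fall together
readings = [(2.0, 1.9), (4.0, 4.1), (6.0, 5.9), (8.0, 8.1)]
direction = first_principal_component(readings)
print(direction)
```

Because the two readings move almost in step, the direction found lies close to the 45-degree line, which suggests that a single combined variable captures nearly all of the variation in the pair.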

Statistics is vital for quality improvement, for the evaluation of risks and for Industry 4.0 technologies such as big data and deep learning. For all of these reasons, anyone wanting to prepare themselves for engineering in the future should take their statistical skills very seriously.


Content published by Professional Engineering does not necessarily represent the views of the Institution of Mechanical Engineers. 

