IEM’s Hu utilizing machine learning to predict China’s imports and exports
Monday, August 26, 2024
Media Contact: Tanner Holubar | Communications Specialist | 405-744-2065 | tanner.holubar@okstate.edu
College of Engineering, Architecture and Technology researchers are set to begin a three-year project to measure and forecast China’s agricultural production, stocks and imports using machine learning.
The project is a partnership between Oklahoma State University, Iowa State University and Cornell University, funded by the United States Department of Agriculture, and aims to understand the factors that affect China’s imports and exports.
Dr. Guiping Hu — professor and head of the School of Industrial Engineering and Management and Donald & Cathey Humphreys Chair — is helping lead this project as one of the principal investigators.
The project will focus on studying data and then using machine learning to try to understand the factors that impact China’s imports and exports. Another goal is to make policy recommendations and share insights with farmers and other stakeholders.
Machine learning is a branch of artificial intelligence that uses data and algorithms to imitate the way humans learn, allowing a program to gradually improve its accuracy.
“Using the data that's available through the importing and exporting control system, the research goal is to understand what some of the factors are that impact the importing and exporting relationship,” Hu said.
Hu said she pursues interdisciplinary research when possible, and this project brings economists and engineers together.
Hu studied the imports and exports of mining and textile industries using machine learning while at ISU, which served as a basis for the new project studying China’s agricultural economy.
Traditional economic analysis models can produce this type of forecast. This project will instead use machine learning and data analytics tools to try to achieve the same result.
Machine learning methods can improve the accuracy of models and predictions compared to traditional methods. They can also provide better insight while the models are being designed.
The project will use high-quality data from a structured database that has been rigorously collected and vetted. Some kinds of information, such as policy details or anecdotal information, may be missing from the database.
Hu said those types of information will be structured and incorporated so that the analysis is better informed.
The dataset was collected with the help of nonprofit organizations and government agencies involved in international trade.
The team will look at historical data to study how things have played out over time, and then use machine learning to consider factors that could result in a different outcome.
There are three levels of analysis for this project: descriptive, predictive and prescriptive. Descriptive analysis uses historical data to understand what occurred and how the data relate to trends.
Predictive analysis is taking historical data, looking at factors that changed and using the understanding of the relationship between the data and trends to create a prediction about what could happen in the future.
Prescriptive analysis takes what is known about the past and what could happen in the future, and looks at what could be changed to improve the outcome. Hu notes this would include asking a lot of what-if questions that could be related to policy making, policy recommendations and scenario analysis.
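As a rough illustration of how the three levels differ, the sketch below applies each one to a tiny, made-up series of yearly import volumes. Everything in it is an assumption for illustration: the numbers are not real trade data, the straight-line trend stands in for the machine learning models the team may use, and the scenario multipliers are hypothetical.

```python
from statistics import mean

# Hypothetical yearly import volumes (arbitrary units); not real trade data.
years = [2019, 2020, 2021, 2022, 2023]
imports = [80.0, 90.0, 100.0, 110.0, 120.0]


def describe(values):
    """Descriptive: summarize what already happened."""
    changes = [b - a for a, b in zip(values, values[1:])]
    return {"mean": mean(values), "avg_yearly_change": mean(changes)}


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return y_bar - slope * x_bar, slope


def predict(xs, ys, year):
    """Predictive: extend the historical trend to a future year."""
    intercept, slope = fit_line(xs, ys)
    return intercept + slope * year


def prescribe(xs, ys, year, scenarios):
    """Prescriptive: compare what-if scenarios against the baseline forecast.

    Each scenario is a hypothetical multiplier applied to the forecast.
    """
    baseline = predict(xs, ys, year)
    return {name: baseline * factor for name, factor in scenarios.items()}


summary = describe(imports)
forecast = predict(years, imports, 2024)
what_if = prescribe(years, imports, 2024,
                    {"no_change": 1.0, "hypothetical_tariff": 0.9})
```

Here `summary` describes the past, `forecast` projects the trend one year ahead, and `what_if` shows how the projection would shift under each assumed scenario.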
The first phase of the project will focus on descriptive analysis of the data. The team will focus on collecting data and understanding some of the factors that might not be represented in the data.
“Even with all those comprehensive data analyses or retrieving of those types of data, there might still be missing data,” Hu said. “For example, some of the policies and specific USDA regulations, it may not be directly representative. So, we may need to do what is called data engineering.”
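One common form of data engineering is turning information that sits outside the database, such as the dates a policy was in effect, into a column a model can use. The sketch below is only an assumed example of that idea, not the team’s method; the records, the policy name and the `add_policy_flags` helper are all hypothetical.

```python
from datetime import date

# Hypothetical monthly trade records; not real data.
trade_records = [
    {"month": date(2020, 1, 1), "import_volume": 10.0},
    {"month": date(2020, 2, 1), "import_volume": 8.0},
    {"month": date(2020, 3, 1), "import_volume": 7.5},
    {"month": date(2020, 4, 1), "import_volume": 9.5},
]

# Hypothetical policy information that the structured database lacks.
policy_events = [
    {"name": "hypothetical_tariff",
     "start": date(2020, 2, 1),
     "end": date(2020, 3, 1)},
]


def add_policy_flags(records, events):
    """Return copies of the records with a 0/1 column per policy event.

    The flag is 1 when the record's month falls inside the period in
    which the policy was in effect, and 0 otherwise.
    """
    enriched = []
    for record in records:
        row = dict(record)  # copy so the original records are unchanged
        for event in events:
            in_effect = event["start"] <= record["month"] <= event["end"]
            row[event["name"]] = int(in_effect)
        enriched.append(row)
    return enriched


enriched_records = add_policy_flags(trade_records, policy_events)
```

Each enriched record keeps its original fields and gains a flag showing whether the assumed policy was active that month, which a model could then use as an input.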
The second phase will focus on descriptive and predictive analysis, looking at historical trends in the data. The third phase will be both predictive and prescriptive, and will include analyzing what-if scenarios and making policy recommendations.
A major goal of the research is to make policy recommendations based on the team’s findings.
“We are interested in addressing real-world problems and using data and some of the industrial engineering tools to generate insight so that the stakeholders can benefit from the research and analyses,” Hu said.