Relational Database Synthetic Data Generation

The objective of this project is to define a specification language for relational databases and to implement a compiler that converts specifications into populated databases. We are currently developing the language so that a user can describe a business process that generates the data used to populate a relational database. A specification will describe each business process as a sequence of activity steps, together with the distribution of the time needed to complete the process and the resources it requires.
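To make the idea concrete, here is a minimal sketch of how a business process specification might be represented in Python. The specification language itself is still under development, so everything here is an assumption made for illustration: the `Step` and `Process` classes, their field names, the example process, and the choice of exponentially distributed step durations.

```python
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Step:
    """One activity step in a business process (hypothetical structure)."""
    name: str
    mean_minutes: float         # mean of an assumed exponential duration
    resources: Tuple[str, ...]  # names of the resources the step requires

    def sample_duration(self, rng: random.Random) -> float:
        # Exponential durations are an illustrative choice only.
        return rng.expovariate(1.0 / self.mean_minutes)


@dataclass(frozen=True)
class Process:
    """A business process as an ordered sequence of steps."""
    name: str
    steps: Tuple[Step, ...]

    def sample_total_duration(self, rng: random.Random) -> float:
        # Time to complete the process, ignoring any waiting for resources.
        return sum(step.sample_duration(rng) for step in self.steps)

    def required_resources(self) -> List[str]:
        # Distinct resources needed across all steps, in sorted order.
        return sorted({r for step in self.steps for r in step.resources})


# Hypothetical example process, not taken from the project.
order_fulfillment = Process(
    name="order_fulfillment",
    steps=(
        Step("receive_order", 5.0, ("clerk",)),
        Step("pick_items", 20.0, ("picker", "forklift")),
        Step("ship_order", 10.0, ("clerk",)),
    ),
)
```

A call such as `order_fulfillment.sample_total_duration(random.Random(0))` draws one completion time for the whole process. A real specification language would likely need richer constructs (branching, other distributions, resource capacities) than this sketch shows.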

Once the business processes are defined, we will map them to a data model so that a discrete event simulation of the processes can generate the data that populates the database.
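The sketch below illustrates, under simplifying assumptions, how a discrete event simulation could populate a relational database. It uses only the Python standard library (`heapq` for the event queue, `sqlite3` for an in-memory database). The two-table schema, the step list, the 15-minute mean interarrival time, and the exponential durations are all hypothetical, and resource contention is deliberately ignored.

```python
import heapq
import random
import sqlite3

# Hypothetical process: (step name, mean duration in minutes).
STEPS = [("receive_order", 5.0), ("pick_items", 20.0), ("ship_order", 10.0)]


def simulate(num_orders: int, seed: int = 0) -> sqlite3.Connection:
    """Simulate orders flowing through STEPS and record them in SQLite."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    # Assumed data model: one row per order, one row per completed step.
    conn.execute(
        "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, arrival_time REAL)"
    )
    conn.execute(
        "CREATE TABLE order_events ("
        " order_id INTEGER REFERENCES orders(order_id),"
        " step TEXT, start_time REAL, end_time REAL)"
    )

    # Event queue entries: (time, order_id, index of the step to start).
    events = []
    clock = 0.0
    for order_id in range(1, num_orders + 1):
        clock += rng.expovariate(1.0 / 15.0)  # assumed mean interarrival time
        conn.execute("INSERT INTO orders VALUES (?, ?)", (order_id, clock))
        heapq.heappush(events, (clock, order_id, 0))

    # Process events in time order; each step schedules the next one.
    while events:
        now, order_id, step_idx = heapq.heappop(events)
        step, mean = STEPS[step_idx]
        end = now + rng.expovariate(1.0 / mean)
        conn.execute(
            "INSERT INTO order_events VALUES (?, ?, ?, ?)",
            (order_id, step, now, end),
        )
        if step_idx + 1 < len(STEPS):
            heapq.heappush(events, (end, order_id, step_idx + 1))

    conn.commit()
    return conn
```

Calling `simulate(10)` returns a connection whose `orders` table holds 10 rows and whose `order_events` table holds one row per order per step. Modeling limited resources would require queueing steps until a resource is free, which this sketch omits.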

The objectives for the spring 2025 semester are:

  • Write a paper describing the relational database specification language and the approach to database generation
  • Develop software to take as inputs a data model and a business process specification and generate a realistic populated database (stretch goal)

We need 3-4 students for this project. Required background: completion of ISE-558, strong Python skills, and completion of ISE-580 (or a background in discrete event simulation).