MockDataGen


Description

MockDataGen is a Python command-line tool that generates synthetic datasets for Master Data Management (MDM) testing and data integration projects.

The generated data serves as a realistic test bed for record matching, data cleansing, and standardization workflows. Testing how systems handle inconsistencies and merge similar records (the MDM use case) helps improve data quality, validate integrations, and refine entity resolution processes. The tool is especially useful for evaluating record matching algorithms and validating data governance strategies in enterprise environments.
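To make the "similar records with inconsistencies" idea concrete, here is a minimal sketch of how a cluster of near-duplicate records could be produced for a matching test. It is not MockDataGen's actual implementation: the field names, the sample values, and the helper functions `swap_adjacent`, `make_variant`, and `generate_cluster` are all hypothetical and chosen only for illustration.

```python
import random

# Hypothetical field names and values; MockDataGen's real schema may differ.
BASE_RECORD = {"first_name": "Katherine", "last_name": "Johnson", "city": "New York"}


def swap_adjacent(text, rng):
    """Return text with two adjacent characters swapped (a simple typo)."""
    if len(text) < 2:
        return text
    i = rng.randrange(len(text) - 1)
    return text[:i] + text[i + 1] + text[i] + text[i + 2:]


def make_variant(record, rng):
    """Copy a record and perturb one randomly chosen field."""
    variant = dict(record)
    field = rng.choice(sorted(variant))
    style = rng.choice(["typo", "upper", "padded"])
    if style == "typo":
        variant[field] = swap_adjacent(variant[field], rng)
    elif style == "upper":
        variant[field] = variant[field].upper()
    else:
        # Stray whitespace is a common inconsistency in source systems.
        variant[field] = f"  {variant[field]} "
    return variant


def generate_cluster(record, size, seed=0):
    """Return the clean record followed by size - 1 perturbed copies."""
    rng = random.Random(seed)  # seeded so test data is reproducible
    return [record] + [make_variant(record, rng) for _ in range(size - 1)]


if __name__ == "__main__":
    for row in generate_cluster(BASE_RECORD, size=4):
        print(row)
```

A matching or cleansing workflow under test would be expected to group every record in such a cluster back into a single entity. Seeding the generator keeps the test data stable between runs.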

Usage

Help options:

mockdatagen --help

Generate 10 records without printing them to the screen:

mockdatagen --number 10 --print N

[Figure: sample output screen]

High-Level Conceptual Data Flow Diagram

[Figure: The Idea!!]

Release history

  • Version 1.0.0, 6/14/2025: run CLI commands such as mockdatagen --number 10 --print N
  • Version 1.1.0, 6/15/2025: added unit test cases, Pylint quality checks, and GitHub Actions
  • pypi.org Web Portal URL

    ======================= Developer Notes ======================

    Dynamically update the Pylint and coverage badges via shell script

    pylint_badge.sh

    Tree

    tree /F /A > tree_output.txt

    Local Pylint run

    uv run pylint myapp > pylint_report.txt || true
