We invite submissions for the workshop "SynthAI@SIGMOD 2026: Workshop on Synthetic Data Generation and Management for Building AI Systems", to be held at SIGMOD 2026. For topics of interest, please see the workshop homepage.
Important Dates (All deadlines are AoE):
- Submission Deadline: February 15, 2026 (AoE)
- Notification of Acceptance: March 15, 2026 (AoE)
- Camera-Ready Submission: April 1, 2026 (AoE)
Deadlines are strict and will not be extended under any circumstances. All deadlines follow the Anywhere on Earth (AoE) timezone.
Submission Information:
Submissions should be made via the submission platform (link will be provided soon). All submissions will undergo a rigorous double-anonymous peer-review process by our program committee of experts from academia and industry.
Papers must be submitted as a single PDF file:
- Long Papers: at most 8 pages (main text)
- Short Papers: between 4 and 6 pages (main text)
- References and appendices are not included in the page limit, but the main text must be self-contained. Reviewers are not required to read beyond the main text.
Please use the standard ACM SIGMOD template for submissions. Submissions exceeding the page limit will be desk rejected.
Anonymity
The workshop follows a double-anonymous review process. Submissions must be anonymized by removing author names, affiliations, and acknowledgments. Prior work should be cited in the third person. Identifying information, including in supplementary materials, must be omitted.
Dual Submission and Archival Policy
Submissions under review at other venues will be considered, provided they do not breach any dual-submission or anonymity policies of those venues. Accepted papers will be published in the ACM Digital Library as part of the SIGMOD Workshop Proceedings. Selected high-quality papers may be invited for extended versions in relevant journals.
Topics of Interest
We welcome submissions on (but not limited to): architectures and systems for scalable synthetic data generation, LLM-based data synthesis, evaluation and benchmarking, responsible data generation (privacy, fairness, bias), applications in data-scarce domains, storage and management of synthetic datasets, quality assessment frameworks, and synthetic data for training and benchmarking.