In large-scale artificial intelligence (AI), data preparation is a vital but often overlooked stage. BulkDaPa, a novel framework, addresses this need by offering scalable data processing tailored to very large datasets. By leveraging sophisticated methods, BulkDaPa improves the entire data preparation pipeline, enabling AI de