Stratified Train-Test Split

Data Preparation & Feature Engineering DS practice problem on Onlearn.

Difficulty: medium.

Topics: Understanding Stratified Sampling in Data Splitting, Stratification Constraints, Label Distribution Preservation, Random State Seeding, Array Index Mapping, Proportional Allocation, Statistical Sampling, Supervised Learning, Data Preprocessing, Model Evaluation, Probability Theory, Train-Test Splitting, Class Imbalance Handling, Randomized Algorithms, Data Partitioning, Sampling Bias Mitigation.

Implement a function 'stratified_split' that takes a list of labels 'y' and a float test size (the fraction of samples to assign to the test set). The function should return the indices of the training set and the indices of the test set such that the proportion of each class label in both sets is approximately equal to its proportion in the original input.
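
One possible approach, offered as a minimal sketch rather than a reference solution: group the indices by label, shuffle each group, and move a proportional share of each group into the test set. The parameter names 'test_size' and 'seed', the round-to-nearest allocation per class, and the choice to return sorted index lists are all assumptions; the problem statement does not specify them.

```python
import random
from collections import defaultdict


def stratified_split(y, test_size, seed=None):
    """Return (train_indices, test_indices) with per-class proportions preserved.

    Assumptions (not given in the problem statement): `seed` controls the
    shuffle, and each class contributes round(count * test_size) test samples.
    """
    rng = random.Random(seed)  # local generator, so global random state is untouched

    # Map each label to the positions where it occurs in y.
    indices_by_label = defaultdict(list)
    for index, label in enumerate(y):
        indices_by_label[label].append(index)

    train_indices, test_indices = [], []
    for indices in indices_by_label.values():
        rng.shuffle(indices)
        # Proportional allocation; very small classes may round down to zero.
        n_test = round(len(indices) * test_size)
        test_indices.extend(indices[:n_test])
        train_indices.extend(indices[n_test:])

    return sorted(train_indices), sorted(test_indices)


# Example: 10 samples of class "a" and 5 of class "b", with a 20% test split.
labels = ["a"] * 10 + ["b"] * 5
train, test = stratified_split(labels, test_size=0.2, seed=0)
```

With these inputs the test set receives two "a" indices and one "b" index, so both sets keep the original 2:1 class ratio. Which specific indices are chosen depends on the seed.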