The NIPBL expression level in each sample was measured with 3 NIPBL probes, each against 2 endogenous control probes, and each measurement was performed twice, giving 12 measurements per sample (3 NIPBL probes × 2 endogenous control probes × 2 replicates). These were reduced to a single value per sample as follows. First, all pairs of replicate measurements were fitted with a linear regression model, and four pairs with high residuals (greater than 3 standard deviations) were identified as outliers. Each pair of replicate measurements was averaged to give a single value, except for the outlier pairs, which were treated as missing values. The missing values were imputed from all the other pairs, and each imputed value was compared with the 2 measurements of the corresponding outlier pair; the measurement closer to the imputed value was kept and the other was discarded. This step assumes that one measurement of each outlier pair is closer to the true value than the other, and that the true value can be estimated from the linear regression model. The number of measurements per sample was thereby reduced to 6 (3 NIPBL probes × 2 endogenous control probes). Second, for each NIPBL probe, the measurements based on the 2 control probes were treated as pairs and the same procedure was applied, which identified and imputed 1 outlier pair. This yielded one measurement per sample for each NIPBL probe. Finally, because probes #1 and #3 had the strongest correlation with each other, we used their average as the final measurement of NIPBL expression level in each sample.
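The replicate-pair step can be illustrated with a minimal sketch. It is not the code used in the study. It assumes an ordinary least-squares fit of the second replicate on the first, an absolute-residual threshold of 3 sample standard deviations, and an imputed value supplied from elsewhere, since the imputation method is not specified here. The function names (`fit_line`, `flag_outlier_pairs`, `average_pairs`, `keep_closer`) are hypothetical.

```python
import math
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def flag_outlier_pairs(rep1, rep2, n_sd=3.0):
    """Flag pairs whose regression residual exceeds n_sd standard deviations.

    Assumption: the threshold is applied to the absolute residual, using the
    sample standard deviation of all residuals.
    """
    slope, intercept = fit_line(rep1, rep2)
    residuals = [y - (slope * x + intercept) for x, y in zip(rep1, rep2)]
    cutoff = n_sd * stdev(residuals)
    return [abs(r) > cutoff for r in residuals]


def average_pairs(rep1, rep2, outliers):
    """Average each replicate pair; outlier pairs become missing (NaN)."""
    return [
        math.nan if is_out else (a + b) / 2.0
        for a, b, is_out in zip(rep1, rep2, outliers)
    ]


def keep_closer(a, b, imputed):
    """Of the two measurements in an outlier pair, keep the one closer to
    the imputed value (the imputation itself is not sketched here)."""
    return a if abs(a - imputed) <= abs(b - imputed) else b
```

In this sketch, `flag_outlier_pairs` and `average_pairs` would be applied first to the replicate pairs and then, in the same way, to the pairs formed by the 2 control probes, with `keep_closer` resolving each flagged pair once an imputed value is available.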
Unrelated controls were compared with 2 unaffected siblings of probands; no significant difference was found (p = 0.74), so they were combined as the control group for all analyses. Similarly, there was no significant difference between unrelated controls and mutation-negative samples, so these were combined into one control group for the analysis of mutation types. Furthermore, the single patient with an SMC3 mutation and the 4 patients with SMC1A mutations were pooled into one group.
We first performed ANOVA, which confirmed significant differences in NIPBL expression across sample groups when samples were classified by disease severity, mutated gene, and mutation type. Post hoc analysis using Student's t-test was then performed to compare pairs of sample groups.
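A minimal sketch of this two-stage comparison, assuming SciPy is available, is shown below. The function name `anova_then_pairwise`, the `alpha` threshold of 0.05, and the use of a dictionary keyed by group label are illustrative assumptions. No multiple-testing correction is applied, because none is described here.

```python
from itertools import combinations

from scipy import stats


def anova_then_pairwise(groups, alpha=0.05):
    """One-way ANOVA across all groups, then pairwise Student's t-tests.

    groups: dict mapping a group label to a list of expression values.
    Returns (ANOVA p-value, {(label_a, label_b): pairwise p-value}).
    Pairwise tests are run only if the ANOVA is significant at alpha
    (an assumption of this sketch).
    """
    _, p_anova = stats.f_oneway(*groups.values())
    pairwise = {}
    if p_anova < alpha:
        for (name_a, a), (name_b, b) in combinations(groups.items(), 2):
            # ttest_ind defaults to equal variances, i.e. Student's t-test
            _, p = stats.ttest_ind(a, b)
            pairwise[(name_a, name_b)] = p
    return p_anova, pairwise
```

The function would be called once per classification (disease severity, mutated gene, mutation type), each time with the samples grouped accordingly.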