Hi Nik,

Thanks for your contribution and for providing an alternative solution for the readers.

I agree that DISTINCT can be a performance killer on big datasets (although that is not the case here).

However, I have to completely disagree with the performance comparison between the IN() and EXISTS() operators.

As you can read in the tutorial below:


Both IN() and EXISTS() generate exactly the same execution plan and the same number of scan/read operations.

This is because the optimizer either keeps IN() as written when the dataset is small enough, or rewrites it as EXISTS() or as a JOIN when that yields an improvement.
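
If you want to check this on your own setup, here is a minimal sketch of how the two forms can be compared. It assumes Python's built-in sqlite3 module and two hypothetical tables (customers and orders) that I made up for illustration; they are not the ones from the article. SQLite's planner is also not necessarily the engine you are using, so the plans it prints may differ from each other and from what your database reports. The point is only the method: run both queries, confirm they return the same rows, and read the plans side by side.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 3);
""")

# Customers with at least one order, written with IN().
IN_QUERY = """
    SELECT c.id FROM customers c
    WHERE c.id IN (SELECT o.customer_id FROM orders o)
    ORDER BY c.id
"""

# The same question, written with a correlated EXISTS().
EXISTS_QUERY = """
    SELECT c.id FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id)
    ORDER BY c.id
"""


def run(sql):
    """Return the selected ids as a plain list."""
    return [row[0] for row in conn.execute(sql)]


def plan(sql):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


in_rows = run(IN_QUERY)
exists_rows = run(EXISTS_QUERY)
print("IN rows:    ", in_rows)
print("EXISTS rows:", exists_rows)

# Plan text is engine- and version-specific, so it is printed, not asserted.
for label, sql in (("IN", IN_QUERY), ("EXISTS", EXISTS_QUERY)):
    print(label, "plan:", plan(sql))
```

On other engines the equivalent step is to prefix each query with that engine's own plan command and compare the output and the read counts.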

Let me know if you have any comments :D

Snr BI Engineer @Wise | 🏆 Among Top Writers In Data Engineering 💻 Follow & Contact Me 🤝 https://www.linkedin.com/in/anbento4