Mixture of Experts explained, or rather, re-explained. We are in the fine-grained era of Mixture of Experts, and it will get even more interesting as we expand on it.
This video was sponsored by Brilliant
Check out my newsletter:
https://mail.bycloud.ai
Mixtral 8x7B paper
[Paper] https://arxiv.org/abs/2401.04088
Sparse MoE (2017)
[Paper] https://arxiv.org/abs/1701.06538
Adaptive mixtures of local experts (1991)
[Paper] https://direct.mit.edu/neco/article-abstract/3/1/79/5560/Adaptive-Mixtures-of-Local-Experts?redirectedFrom=fulltext
GShard
[Paper] https://arxiv.org/pdf/2006.16668
Branch-Train-MiX
[Paper] https://arxiv.org/pdf/2403.07816
DeepSeek MoE
[Paper] https://arxiv.org/abs/2401.06066
MoWE (from the meme at 7:51)
[Paper] https://arxiv.org/abs/2311.10768
Mixture of a million experts
[Paper] https://web3.arxiv.org/abs/2407.04153
This video is supported by the kind sponsors and YouTube members:
Andrew Lescelius, alex j, Chris LeDoux, Alex Maurice, Miguilim, Deagan, FiFaŁ, Robert Zawiasa, Daddy Wen, Tony Jimenez, Panther Modern, Jake Disco, Demilson Quintao, Shuhong Chen, Hongbo Men, happi nyuu nyaa, Carol Lo, Mose Sakashita, Miguel, Bandera, Gennaro Schiano, gunwoo, Ravid Freedman, Mert Seftali, Mrityunjay, Richár d yfi, Timo Steiner, Henrik G Sundt, projectAnthony, Brigham Hall, Kyle Hudson, Kalila, Jef Come, Jvari Williams, Tien Tien, BIll Mangrum, owned, Janne Kytölä, SO, Richárd Nagyfi, Hector, Drexon, Claxvii 177th, Inferencer, Michael Brenner, Accusative, Oleg Wock, FantomBloth
[Discord] https://discord.gg/NhJZGtH
[Twitter] https://twitter.com/bycloudai
[Patreon] https://www.patreon.com/bycloud
[Music] Massobeats – Daydream
[Profile and banner graphics] https://twitter.com/pygm7
[Video Editor] @Askejm